From: Little, Timothy (TLittle ThomasGlobal.com)
Date: Thu Nov 20 2008 - 15:30:03 CST
We are using MySQL 5.0.22 on CentOS/Red Hat Linux. The table and database character sets are all utf8.
We have a database supporting numerous languages, and full-text search works beautifully with most of them.
Chinese and Japanese are giving us problems, though, and there seems to be no reason why they should, since we are taking measures to help the database see word breaks.
When we insert the Chinese and Japanese passages, they have spaces (the normal ASCII space, hex 20 / decimal 32) between each word (verified). So if you have two words like {APPLE}{DRUM}, we store {APPLE}, then a space, then {DRUM}. If you can display UTF-8, you can look at this sample: 三坐标测量机 固定架
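One way the separator could be double-checked on the server side is to dump the stored bytes. This is only a sketch: it assumes the connection character set is also utf8 and reuses the `category_attributes` table and `value` column from the query in this post.

```sql
-- Sketch: show the raw bytes of rows that contain the second sample word.
-- Assumes the client connection uses utf8, so the literal is sent as UTF-8.
SELECT value, HEX(value) AS value_hex
FROM category_attributes
WHERE value LIKE '%固定架%';
```

In the hex output, an ASCII space shows up as the single byte `20`, whereas an ideographic space (U+3000) would show up as `E38080`.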
When we try to match either {APPLE} or {DRUM} individually (in practice, 三坐标测量机 or 固定架), MySQL fails to find a match against anything, even though it clearly should find those.
MySQL is only finding matches for Japanese and Chinese on exact full-string matches, which is clearly less than ideal.
I have already changed the ft_min_word_len setting to 1, to no avail.
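For reference, a sketch of how that setting could be confirmed and the index refreshed. It assumes the table is MyISAM (the only engine with FULLTEXT support in MySQL 5.0) and that `ft_min_word_len` was set in the `[mysqld]` section of the option file, since it is read at server startup rather than changed at runtime.

```sql
-- Sketch: confirm the value the running server is actually using.
SHOW VARIABLES LIKE 'ft_min_word_len';

-- Existing FULLTEXT indexes keep the old word-length limit until rebuilt.
REPAIR TABLE category_attributes QUICK;
```

If the first statement still reports the default of 4, the server has not picked up the change, and the rebuild would have no effect on short words.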
What is going wrong, and how do I fix this?
Here is my sample query (selecting for ONE word):
select *
from category_attributes
where match ( value ) against ( '三坐标测量机' ) > 0
When I replace the word with 固定架, it still doesn't match anything, even though there is a row containing merely the two words separated by a single space:
三坐标测量机 固定架
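A minimal reproduction along these lines might help narrow the problem down. The table name `ft_cjk_test`, the index name, and the filler rows are hypothetical and used only for illustration. The extra rows are there because natural-language mode ignores words that occur in 50% or more of the rows, which can hide matches in very small test tables. Boolean mode does not apply that threshold.

```sql
-- Sketch: hypothetical scratch table mirroring the setup described above.
CREATE TABLE ft_cjk_test (
  id    INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
  value TEXT,
  FULLTEXT KEY ft_value (value)
) ENGINE=MyISAM DEFAULT CHARSET=utf8;

INSERT INTO ft_cjk_test (value) VALUES
  ('三坐标测量机 固定架'),   -- two words separated by an ASCII space
  ('apple drum'),
  ('unrelated filler row'),
  ('another filler row');

-- Natural-language mode, as in the original query.
SELECT id, value FROM ft_cjk_test
WHERE MATCH (value) AGAINST ('固定架');

-- Boolean mode, for comparison (no 50% threshold).
SELECT id, value FROM ft_cjk_test
WHERE MATCH (value) AGAINST ('固定架' IN BOOLEAN MODE);
```

Comparing the two result sets with the ASCII row (`apple` or `drum`) against the Chinese row would show whether the failure is specific to the CJK text.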
Tim...
--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql