Maybe I miss some settings (please direct me to RTFM), but
UTF8-database with utf8 general-ci + utf8 (russian) texts give me bad and unexpected results - nothing found, nothing filtered - when I try to find any text (which exist in texts) in russian
More exactly - search-result window will show almost all (pure english content excluded) records instead of requested
Example (just one) - tried to search word =search&serendipity[searchTerm]=%D0%B2%D0%BE%D0%B7%D1%80%D0%B0%D1%81%D1%82&serendipity[searchButton]=%3E]возраст from this entry
English term
=search&serendipity[searchTerm]=idiot&serendipity[searchButton]=%3E]idiot found and filtered correctly
QuickSearch and 8bit terms
-
LazyBadger
- Regular
- Posts: 176
- Joined: Mon Aug 25, 2008 12:25 pm
- Location: Russia
- Contact:
QuickSearch and 8bit terms
Quis custodiet ipsos custodes?
-
garvinhicking
- Core Developer
- Posts: 30022
- Joined: Tue Sep 16, 2003 9:45 pm
- Location: Cologne, Germany
- Contact:
Re: QuickSearch and 8bit terms
Hi!
Which MySQL version are you using? If you have phpmyadmin, try to execute this query:
and see if that returns proper entries? Maybe the UTF-8 characters are not properly en/decoded, you could edit the include/functions_entries.inc.php file and in the serendipity_searchEntries() function replace this:
with:
to see what the actuall sql query is...
HTH,
Garvin
Which MySQL version are you using? If you have phpmyadmin, try to execute this query:
Code: Select all
SELECT * FROM serendipity_entries WHERE MATCH(title,body,extended) AGAINST('Russioan characters here')
Code: Select all
$search =& serendipity_db_query($querystring);
Code: Select all
$search =& serendipity_db_query($querystring);
echo "QUERY: " . $querystring;
HTH,
Garvin
# Garvin Hicking (s9y Developer)
# Did I help you? Consider making me happy: http://wishes.garv.in/
# or use my PayPal account "paypal {at} supergarv (dot) de"
# My "other" hobby: http://flickr.garv.in/
# Did I help you? Consider making me happy: http://wishes.garv.in/
# or use my PayPal account "paypal {at} supergarv (dot) de"
# My "other" hobby: http://flickr.garv.in/
-
LazyBadger
- Regular
- Posts: 176
- Joined: Mon Aug 25, 2008 12:25 pm
- Location: Russia
- Contact:
Re: QuickSearch and 8bit terms
Server version: 5.1.49garvinhicking wrote: Which MySQL version are you using?
0 stringsIf you have phpmyadmin, try to execute this query
At least it seems so: on UTF8 PHPAdmin page all russian content are shown in unreadable formMaybe the UTF-8 characters are not properly en/decoded
Query 1 - "разница"echo "QUERY: " . $querystring;
Code: Select all
QUERY: SELECT ep_sticky.value AS orderkey, ep_cache_extended.value AS ep_cache_extended, ep_cache_body.value AS ep_cache_body, e.id, e.authorid, a.realname AS author, e.allow_comments, e.moderate_comments, a.email, e.timestamp, e.comments, e.title, e.body, e.extended, e.trackbacks, e.exflag, e.isdraft, e.last_modified, a.username AS loginname FROM s_entries e LEFT JOIN s_authors a ON e.authorid = a.authorid LEFT JOIN s_entrycat ec ON e.id = ec.entryid LEFT OUTER JOIN s_entryproperties ep_cache_extended ON (e.id = ep_cache_extended.entryid AND ep_cache_extended.property = 'ep_cache_extended') LEFT OUTER JOIN s_entryproperties ep_cache_body ON (e.id = ep_cache_body.entryid AND ep_cache_body.property = 'ep_cache_body') LEFT OUTER JOIN s_entryproperties ep_access ON (e.id = ep_access.entryid AND ep_access.property = 'ep_access') LEFT JOIN s_entryproperties ep_sticky ON (e.id = ep_sticky.entryid AND ep_sticky.property = 'ep_is_sticky') WHERE (MATCH(title,body,extended) AGAINST('разница')) AND isdraft = 'false' AND timestamp <= 1315845900 AND (ep_access.property IS NULL OR ep_access.value = 'public') GROUP BY e.id ORDER BY timestamp DESC LIMIT 15QUERY: SELECT ep_sticky.value AS orderkey, ep_cache_extended.value AS ep_cache_extended, ep_cache_body.value AS ep_cache_body, e.id, e.authorid, a.realname AS author, e.allow_comments, e.moderate_comments, a.email, e.timestamp, e.comments, e.title, e.body, e.extended, e.trackbacks, e.exflag, e.isdraft, e.last_modified, a.username AS loginname FROM s_entries e LEFT JOIN s_authors a ON e.authorid = a.authorid LEFT JOIN s_entrycat ec ON e.id = ec.entryid LEFT OUTER JOIN s_entryproperties ep_cache_extended ON (e.id = ep_cache_extended.entryid AND ep_cache_extended.property = 'ep_cache_extended') LEFT OUTER JOIN s_entryproperties ep_cache_body ON (e.id = ep_cache_body.entryid AND ep_cache_body.property = 'ep_cache_body') LEFT OUTER JOIN s_entryproperties ep_access ON (e.id = ep_access.entryid AND ep_access.property = 'ep_access') LEFT JOIN s_entryproperties ep_sticky ON (e.id = ep_sticky.entryid AND ep_sticky.property = 'ep_is_sticky') WHERE (MATCH(title,body,extended) AGAINST('разница*' IN BOOLEAN MODE)) AND isdraft = 'false' AND timestamp <= 1315845900 AND (ep_access.property IS NULL OR ep_access.value = 'public') GROUP BY e.id ORDER BY timestamp DESC LIMIT 15- nothing found for разница (must be at least 1 hit)
- empty search-result window (now)
Query 2 - "возраст"
Code: Select all
QUERY: SELECT ep_sticky.value AS orderkey, ep_cache_extended.value AS ep_cache_extended, ep_cache_body.value AS ep_cache_body, e.id, e.authorid, a.realname AS author, e.allow_comments, e.moderate_comments, a.email, e.timestamp, e.comments, e.title, e.body, e.extended, e.trackbacks, e.exflag, e.isdraft, e.last_modified, a.username AS loginname FROM s_entries e LEFT JOIN s_authors a ON e.authorid = a.authorid LEFT JOIN s_entrycat ec ON e.id = ec.entryid LEFT OUTER JOIN s_entryproperties ep_cache_extended ON (e.id = ep_cache_extended.entryid AND ep_cache_extended.property = 'ep_cache_extended') LEFT OUTER JOIN s_entryproperties ep_cache_body ON (e.id = ep_cache_body.entryid AND ep_cache_body.property = 'ep_cache_body') LEFT OUTER JOIN s_entryproperties ep_access ON (e.id = ep_access.entryid AND ep_access.property = 'ep_access') LEFT JOIN s_entryproperties ep_sticky ON (e.id = ep_sticky.entryid AND ep_sticky.property = 'ep_is_sticky') WHERE (MATCH(title,body,extended) AGAINST('возраст')) AND isdraft = 'false' AND timestamp <= 1315845900 AND (ep_access.property IS NULL OR ep_access.value = 'public') GROUP BY e.id ORDER BY timestamp DESC LIMIT 15- total records: 72 (with unwanted entries \no string inside\ and without expected)
Quis custodiet ipsos custodes?
-
LazyBadger
- Regular
- Posts: 176
- Joined: Mon Aug 25, 2008 12:25 pm
- Location: Russia
- Contact:
Re: QuickSearch and 8bit terms
Notes
1. Table
ENGINE=MyISAM DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci
2. Serendipity used without "SET NAMES"
1. Table
ENGINE=MyISAM DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci
2. Serendipity used without "SET NAMES"
Quis custodiet ipsos custodes?
-
LazyBadger
- Regular
- Posts: 176
- Joined: Mon Aug 25, 2008 12:25 pm
- Location: Russia
- Contact:
Re: QuickSearch and 8bit terms
Well, Garvin, we can close this issue
Second blog with (almost) the same settings works with 8bit russian text, single difference is SET NAMES ON for good blog.
I suppose, this value have to be documented in user's doc as mandatory for non-English UTF8 blogs
Second blog with (almost) the same settings works with 8bit russian text, single difference is SET NAMES ON for good blog.
I suppose, this value have to be documented in user's doc as mandatory for non-English UTF8 blogs
Quis custodiet ipsos custodes?
-
garvinhicking
- Core Developer
- Posts: 30022
- Joined: Tue Sep 16, 2003 9:45 pm
- Location: Cologne, Germany
- Contact:
Re: QuickSearch and 8bit terms
Hi!
Yes, definitely SET NAMES is the default and should always be used to properly utilize UTF-8.
Users should only change that if they have charset problems (usually fmor an upgraded blog or oder MySQL version) - we AFAIK also state that this setting should only be changed with care...
regards
Garvin
Yes, definitely SET NAMES is the default and should always be used to properly utilize UTF-8.
Users should only change that if they have charset problems (usually fmor an upgraded blog or oder MySQL version) - we AFAIK also state that this setting should only be changed with care...
regards
Garvin
# Garvin Hicking (s9y Developer)
# Did I help you? Consider making me happy: http://wishes.garv.in/
# or use my PayPal account "paypal {at} supergarv (dot) de"
# My "other" hobby: http://flickr.garv.in/
# Did I help you? Consider making me happy: http://wishes.garv.in/
# or use my PayPal account "paypal {at} supergarv (dot) de"
# My "other" hobby: http://flickr.garv.in/