Quicksearch doesn't work with Russian language

granit

Post by granit »

Quicksearch doesn't work with Russian. The whole blog is in UTF-8, so it looks like a problem with Unicode in general. I first saw this problem in version 0.8; I have now upgraded to 0.9.1 and it is still unresolved.
garvinhicking
Core Developer
Posts: 30022
Joined: Tue Sep 16, 2003 9:45 pm
Location: Cologne, Germany
Contact:

Re: Quicksearch doesn't work with Russian language

Post by garvinhicking »

The problem is not Serendipity; it's your SQL database that has trouble.

You should try to configure your SQL server so that it uses UTF-8 tables, and then configure your blog to also use a UTF-8 language (but Russian already uses this, AFAIK).

Regards,
Garvin
# Garvin Hicking (s9y Developer)
# Did I help you? Consider making me happy: http://wishes.garv.in/
# or use my PayPal account "paypal {at} supergarv (dot) de"
# My "other" hobby: http://flickr.garv.in/
Guest

Post by Guest »

OK
I slightly misunderstood the part about "configure your mysql server". I have normal shared hosting without root access; I only have a shell and phpMyAdmin. Could you give me a clue how to convert the tables to UTF-8? A shell command? Pleeeeease :roll: ;)
Guest

Post by Guest »

So you are using a MySQL database, right? Which version of that?

You need to have at least MySQL 4.1 to support UTF-8 databases properly. Then phpMyAdmin offers you to change the charset of tables and databases! Please have a look at the phpMyAdmin documentation or forum to see how exactly to achieve that; but it's not very hard. :)
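If the phpMyAdmin interface is unclear, the same change can usually be typed into its SQL tab. This is only a minimal sketch: it assumes the default `serendipity_` table prefix, `blogdb` is a placeholder database name, and a shared host may not grant the ALTER privilege. Take a backup first, because converting text that is already mis-encoded can make it worse.

```sql
-- Show the charset and collation the table currently uses.
SHOW CREATE TABLE serendipity_entries;

-- Convert the table and all of its text columns to UTF-8
-- (CONVERT TO is available from MySQL 4.1.2 on).
ALTER TABLE serendipity_entries
    CONVERT TO CHARACTER SET utf8 COLLATE utf8_unicode_ci;

-- Set the default for tables created later in this database.
-- `blogdb` is a placeholder name, not one taken from this thread.
ALTER DATABASE blogdb
    DEFAULT CHARACTER SET utf8 COLLATE utf8_unicode_ci;
```

The ALTER TABLE statement would have to be repeated for each table of the blog.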

Best regards,
Garvin
granit

Post by granit »

Yes, I use MySQL 4.1.15.
I checked the charset of the tables in phpMyAdmin: utf8_unicode_ci. Quicksearch still doesn't work.

When I used WordPress, the search worked without any tricks like "configure the server", so in my opinion this is not only a problem of the SQL server but of Serendipity as well. Once again, what conditions do I have to meet (blog charset, server charset etc.) to make the search work right?
garvinhicking

Post by garvinhicking »

But you did set your serendipity config to "UTF-8", right?

Did you check the columns of your tables, if they also all are utf8_unicode_ci?

Are you using the mysqli or the mysql library (PHP4 or PHP5)? If you are not using mysqli, you must ask your sysadmin whether the default connection charset of MySQL is UTF-8.
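One way to check these points yourself is from the SQL tab of phpMyAdmin. The statements below are a sketch that assumes the default `serendipity_` table prefix. Keep in mind that phpMyAdmin opens its own connection, so the values it reports may differ from what the blog's PHP connection gets.

```sql
-- Collation of every column; title, body and extended
-- should all report a utf8 collation.
SHOW FULL COLUMNS FROM serendipity_entries;

-- Character sets in effect for the current connection
-- (client, connection, results, database, server).
SHOW VARIABLES LIKE 'character_set%';

-- Tell the server to treat this connection as UTF-8
-- (MySQL 4.1 and later).
SET NAMES utf8;
```

If `character_set_client` or `character_set_connection` shows latin1 while the tables are utf8, text sent by the blog may be converted a second time on its way into the database.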

Setting up a working UTF-8 MySQL is a bit difficult and hard to check...

Best regards,
Garvin
granit

Post by granit »

garvinhicking wrote:But you did set your serendipity config to "UTF-8", right?
Yes.
garvinhicking wrote:Did you check the columns of your tables, if they also all are utf8_unicode_ci?
Yes, all of them are utf8_unicode_ci.

And the main page of phpMyAdmin shows: MySQL charset: UTF-8 Unicode (utf8)
garvinhicking wrote:Are you using mysqli or mysql library (PHP4 or PHP5?)
PHP4 is turned on, but I have the possibility to turn on PHP5 (run PHP as CGI only),

and I have "run PHP as CGI" turned on by default.

So, what is the problem?
garvinhicking

Post by garvinhicking »

Maybe you could set PHP5 active then. Then enter the s9y config, and use "mysqli" as your database extension.

Maybe this helps... if not, I'm a bit out of clues because I have little knowledge of foreign/Russian special characters...

Best regards,
Garvin
granit

Post by granit »

I did it (turned on PHP5 and mysqli), but it doesn't help. There are three encodings for the Russian language: win1251, koi8-r and ibm866 for DOS (no longer used). But my blog is in Unicode, and I think all three Russian encodings are covered by Unicode. I think it is a problem with Unicode, not with the Russian language as such. And I think it's a problem with the implementation of the search mechanism (maybe not, but then how does WordPress work normally on my host?). Generally speaking, it's a bug in Serendipity.
garvinhicking

Post by garvinhicking »

WordPress, AFAIK, does not use MySQL's fulltext search capability; it only uses a LIKE search, which does not involve MySQL's character set conversions or word/string boundaries.

What you could try to do is to debug the query that s9y is sending.

Open your include/functions_entries.inc.php file and search for the serendipity_searchEntries() function. There you'll see how a $querystring variable is built. Before it is executed via serendipity_db_query, try to echo that variable:

Code: Select all

echo $querystring;
Then you can see it in the browser and maybe try it in phpMyAdmin, or tell me how the query looks?

Regards,
Garvin
granit

Post by granit »

OK, I did it.

In include/functions_entries.inc.php I put in 'echo $querystring;':


Code: Select all

$querystring = "SELECT $distinct
                            e.id,
                            e.authorid,
                            a.realname AS author,
                            a.email,
                            ec.categoryid,
                            e.timestamp,
                            e.comments,
                            e.title,
                            e.body,
                            e.extended,
                            e.trackbacks,
                            e.exflag
                    {$serendipity['fullCountQuery']}
                    $group
                  ORDER BY  timestamp DESC
                    $limit";
    echo $querystring; // debug output added here
    $search = serendipity_db_query($querystring);

and this is the result:

Code: Select all

SELECT e.id, e.authorid, a.realname AS author, a.email, ec.categoryid,
       e.timestamp, e.comments, e.title, e.body, e.extended,
       e.trackbacks, e.exflag
  FROM serendipity_entries e
  LEFT JOIN serendipity_authors a ON e.authorid = a.authorid
  LEFT JOIN serendipity_entrycat ec ON e.id = ec.entryid
  LEFT JOIN serendipity_category c ON ec.categoryid = c.categoryid
  LEFT JOIN serendipity_authorgroups AS acl_a ON acl_a.authorid = 1
  LEFT JOIN serendipity_access AS acl_acc
         ON ( acl_acc.artifact_mode = 'read'
              AND acl_acc.artifact_type = 'category'
              AND acl_acc.artifact_id = c.categoryid )
 WHERE MATCH(title,body,extended) AGAINST('бегемот')
   AND isdraft = 'false'
   AND timestamp <= 1136647395
   AND ( c.categoryid IS NULL
         OR ( acl_acc.groupid = acl_a.groupid OR acl_acc.groupid = 0)
         OR ( acl_acc.artifact_id IS NULL ) )
 GROUP BY e.id
 ORDER BY timestamp DESC
 LIMIT 15
granit

Post by granit »

I did some analysis: it takes the Unicode text and treats it as a sequence of bytes rather than letters, then UTF-8-encodes every byte above 128 again, and we get a real mess. I do not know where in the chain the failure is (maybe in the PHP settings, in MySQL or in Serendipity), but on another server, my friend's, the search does not work either. I believe more and more that this is a Serendipity problem.
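A rough way to test this double-encoding theory is to look at the stored bytes in the SQL tab of phpMyAdmin. The query is a sketch that assumes the default `serendipity_` table prefix. In UTF-8 each Cyrillic letter takes two bytes, so 'б' should show up as D0B1 in the hex dump; if it has been encoded twice it typically appears as C390C2B1, four bytes for one letter.

```sql
-- Dump the raw bytes of the most recent titles.
SELECT id,
       title,
       HEX(title)         AS title_bytes,
       LENGTH(title)      AS byte_length,
       CHAR_LENGTH(title) AS char_length
  FROM serendipity_entries
 ORDER BY timestamp DESC
 LIMIT 5;
```

For correctly stored text, `char_length` should match the number of letters actually typed in the title; with double encoding it comes out roughly doubled for Cyrillic text.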
garvinhicking

Post by garvinhicking »

And if you execute that piece of SQL via phpMyAdmin, does it yield results?

If only I could get along with Russian characters; I am quite sure it's a MySQL FULLTEXT issue with the character set or the MySQL version.
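Two things that have nothing to do with the character set can also make MySQL's natural-language FULLTEXT search return nothing, and they may be worth ruling out; whether either applies to this blog is only a guess. Words shorter than `ft_min_word_len` (4 by default) are not indexed, and a word that occurs in half or more of the rows is ignored, which happens easily on a blog with only a few entries. Boolean mode does not apply the 50% rule, so comparing the two searches can narrow it down. The sketch assumes the default `serendipity_` table prefix and reuses the search term from the query posted above.

```sql
-- Minimum word length that the FULLTEXT index stores.
SHOW VARIABLES LIKE 'ft_min_word_len';

-- Natural-language search, the form used in the posted query.
SELECT id, title
  FROM serendipity_entries
 WHERE MATCH(title,body,extended) AGAINST('бегемот');

-- Boolean-mode search: no 50% threshold is applied.
SELECT id, title
  FROM serendipity_entries
 WHERE MATCH(title,body,extended) AGAINST('бегемот' IN BOOLEAN MODE);
```

If the second search finds entries and the first does not, the 50% rule is the more likely cause than the encoding.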

You could most probably fix up your problem by changing this code in the include/functions_entries.inc.php file:

Code: Select all

MATCH(title,body,extended) AGAINST('$term')
replace that with:

Code: Select all

(title LIKE '%$term%' OR body LIKE '%$term%' OR extended LIKE '%$term%')
This then uses the same technique as WP, but it performs a lot slower...
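When the LIKE conditions are dropped into the existing WHERE clause they should be wrapped in parentheses, because AND binds more tightly than OR and the draft and timestamp filters would otherwise apply only to the last condition. Here is a simplified sketch of the resulting query, using the search term and timestamp from the query posted above and leaving out the joins and access checks:

```sql
SELECT e.id, e.title
  FROM serendipity_entries e
 WHERE (   e.title    LIKE '%бегемот%'
        OR e.body     LIKE '%бегемот%'
        OR e.extended LIKE '%бегемот%')
   AND e.isdraft = 'false'
   AND e.timestamp <= 1136647395
 ORDER BY e.timestamp DESC
 LIMIT 15;
```

A pattern with a leading % cannot use an index, so every entry is scanned, which is why this is slower than the FULLTEXT search.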

Best regards,
Garvin
granit

Post by granit »

Thanks a lot, the search is working again. Too bad you don't know the Cyrillic charsets; I'd gladly help the dev team with intricate encodings, but I'm not a PHP coder. I love SE, and I don't like WP at all. Thanks for wasting your time on me :)
garvinhicking

Post by garvinhicking »

If only I knew how to properly enter Cyrillic characters! :-D

However, if the current search helped you, I am much relieved. I don't consider this wasted time at all; it just nags at me not to know the real reason why the MySQL fulltext search didn't work.

Maybe when your provider upgrades his MySQL to 5.0 you might want to try it again, as MySQL 5 has been improved a lot on multilingual issues...

Best regards,
Garvin