Spartacus: Character encoding bug in package_event.xml

Posted: Thu Aug 27, 2009 9:15 am
by VladaAjgl
Hi,
I've noticed behavior in the plugin listing and installation that I believe is incorrect.

How to reproduce the bug:
- Set the language to Czech (cs - utf-8)
- admin section -> plugin configuration -> install new sidebar plugins
- Look at, for example, the Recaptcha plugin (serendipity_event_recaptcha). Important: you must not already have this plugin on your server. Its description looks like this:

Code:

Pøi vkládání komentáøù používá systém kryptogramù Recaptcha (je tøeba pøedem zažádat o pøístupový klíè)
- Click to install it; the plugin is downloaded from the online repository to your server.
- admin section -> plugin configuration -> install new sidebar plugins
- Look at the Recaptcha plugin again. Now the description's encoding is correct:

Code:

Při vkládání komentářů používá systém kryptogramů Recaptcha (je třeba předem zažádat o přístupový klíč)
My hints and explanation:
- The file package_event.xml, which contains info about plugins not yet downloaded to the server, is probably created in a local, non-UTF-8 encoding.
- When text in a local encoding (for Czech, that is win-1250) is shown in a UTF-8 page, you see squares and other weird symbols.
- Spartacus gives priority to language files stored locally (on the server). So if a language file already exists, Spartacus displays its content, and the plugin description is correct.
- This behavior has another consequence. If Spartacus finds a locally stored English file but no Czech file (for example), it displays the English description regardless of the Czech description in package_event.xml. So when you download and install a plugin and then uninstall it, its files remain on your server. If somebody translates this plugin tomorrow, you will never see the translated description, even though it exists. In other words, a separate function for upgrading a plugin's language files would be nice:-)
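
The lookup order described in the last two points can be sketched as follows. This is only an illustration of the behavior observed above, written in Python, not the actual Spartacus code; the function name `pick_description` and both dictionaries are hypothetical.

```python
from typing import Dict, Optional


def pick_description(lang: str,
                     local_files: Dict[str, str],
                     xml_descriptions: Dict[str, str]) -> Optional[str]:
    """Return the plugin description in the order the post describes.

    local_files      -- hypothetical: descriptions from language files on the server
    xml_descriptions -- hypothetical: descriptions listed in package_event.xml
    """
    # 1. A local language file for the chosen language wins.
    if lang in local_files:
        return local_files[lang]
    # 2. Otherwise a local English file wins, even if the XML has a translation.
    if "en" in local_files:
        return local_files["en"]
    # 3. Only without any local file is the XML description used.
    return xml_descriptions.get(lang, xml_descriptions.get("en"))


xml = {"en": "Uses Recaptcha", "cs": "Používá Recaptcha"}

# Plugin never downloaded: the Czech text from the XML is shown.
never_installed = pick_description("cs", {}, xml)

# Plugin once installed with only an English file: the translation is shadowed.
leftover_english = pick_description("cs", {"en": "Uses Recaptcha"}, xml)
```

In the second call the Czech description exists in the XML but is never reached, which is the situation the last point complains about.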

Regards
Vlad

Re: Spartacus: Character encoding bug in package_event.xml

Posted: Thu Aug 27, 2009 10:01 am
by garvinhicking
Hi!

Hm, inside Spartacus the generation of those files is done with the emerge_spartacus.php and emerge.sh scripts.

For German, the file does contain proper UTF-8, so I wonder why it might not work for Czech. Sadly I can't really read your characters and don't really use the Win-1250 charset, so I have limited means to see how/when the error occurs. Maybe you have an idea what the script does wrong?

The root of it would be that the build-process uses the "local" version of a language, and then later encodes it to UTF-8. Maybe what would help is to instead always load up the UTF-8 version of a language.

I've just made an important update to the freetag plugin, so I don't want to mess up the build process right now, but in a few days, I could simply try to load the UTF8 version and see if it generates the proper files?

Regards,
Garvin

Re: Spartacus: Character encoding bug in package_event.xml

Posted: Thu Aug 27, 2009 12:38 pm
by VladaAjgl
Hi Garvin,
garvinhicking wrote: Maybe you have an idea what the script does wrong?
Yes, I have:-) I took a quick look at emerge_spartacus.php.
garvinhicking wrote: The root of it would be that the build-process uses the "local" version of a language, and then later encodes it to UTF-8. Maybe what would help is to instead always load up the UTF-8 version of a language.
The whole trouble is that the script uses the PHP function utf8_encode. This function expects its parameter to be in ISO-8859-1 encoding. See http://cz2.php.net/manual/en/function.utf8-encode.php. Since the German local files are in this encoding, it works. But the Czech files are in win-1250, and when utf8_encode treats them as ISO-8859-1, it produces wrong characters.
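
The mismatch can be reproduced in a few lines. The sketch below is in Python rather than PHP and only imitates what utf8_encode does (reading every input byte as ISO-8859-1); the helper names `php_utf8_encode` and `convert_from` are made up for this illustration and are not part of emerge_spartacus.php.

```python
def php_utf8_encode(raw: bytes) -> bytes:
    """Imitate PHP's utf8_encode: read every byte as ISO-8859-1, emit UTF-8."""
    return raw.decode("latin-1").encode("utf-8")


def convert_from(raw: bytes, source_encoding: str) -> bytes:
    """Convert to UTF-8 using the encoding the bytes were really written in."""
    return raw.decode(source_encoding).encode("utf-8")


original = "Při vkládání komentářů"

# The Czech language file as stored on disk in win-1250.
stored = original.encode("cp1250")

# What ends up in the XML when the bytes are assumed to be ISO-8859-1.
garbled = php_utf8_encode(stored).decode("utf-8")

# What ends up in the XML when the real source encoding is used.
correct = convert_from(stored, "cp1250").decode("utf-8")

print(garbled)  # Pøi vkládání komentáøù
print(correct)  # Při vkládání komentářů
```

The byte 0xF8 is "ř" in win-1250 but "ø" in ISO-8859-1, and 0xF9 is "ů" versus "ù", which matches the garbled description in the first post. Loading the UTF-8 language files directly avoids the conversion step altogether.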
garvinhicking wrote: I could simply try to load the UTF8 version and see if it generates the proper files?
This should work pretty well.

Bye
Vlad

Re: Spartacus: Character encoding bug in package_event.xml

Posted: Sun Nov 22, 2009 9:31 am
by VladaAjgl
Hi Garvin,
I see there is no progress yet. I modified the emerge_spartacus.php script to use the UTF-8 language files when generating the .xml files. This should work correctly. Could you try it?
Regards,
Vlad

Re: Spartacus: Character encoding bug in package_event.xml

Posted: Sun Nov 22, 2009 12:33 pm
by garvinhicking
Hi!

Sorry, I completely lost sight of this. I'll try your patch and put it on my to-do list. Sadly, I'll need another 2 weeks until I can get started on this!

Best regards,
Garvin

Re: Spartacus: Character encoding bug in package_event.xml

Posted: Sun Nov 29, 2009 9:47 pm
by garvinhicking
Hi!

Just used this file. Let's see what changes it brings for tomorrow.

Regards,
Garvin

Re: Spartacus: Character encoding bug in package_event.xml

Posted: Fri Dec 11, 2009 6:38 pm
by VladaAjgl
Hi!
Great! Now it works perfectly on my side. At least for the Czech language:-)
Regards Vlad