Special characters in the days of week

fapo
Posts: 4
Joined: Sat Sep 08, 2007 1:41 pm
Location: Italy

Special characters in the days of week

Post by fapo »

I found a small problem with the days of the week.
In my language (Italian) many day names end with an accented "i" (ì), and in Serendipity 1.2 those letters are replaced with "?".

To fix the problem I modified the file include/functions.inc.php, line 206, changing:

Code: Select all

    return main_mb('ucfirst', main_strftime($cache[$format], (int)$time, $useOffset, $useDate));
to:

Code: Select all

    return htmlentities(main_mb('ucfirst', main_strftime($cache[$format], (int)$time, $useOffset, $useDate)));
Now it works well. To be safe, I tested with Firefox 2.0.0.6 (Linux), Opera 9.20 (Linux), IE 6.0 (Windows), and IE 7.0 (Windows).

I'd ask a maintainer to consider the modification. No credit needed.

Fabio.
garvinhicking
Core Developer
Posts: 30022
Joined: Tue Sep 16, 2003 9:45 pm
Location: Cologne, Germany
Contact:

Re: Special characters in the days of week

Post by garvinhicking »

Hi!

Your problem is that you are using a system locale that does not match your Serendipity character set. If you're using UTF-8, you need to make sure your server has the Italian UTF-8 locale installed.

Using your patch would yield invalid HTML entities in all UTF-8 powered blogs with umlauts that use the proper locale.
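
As a minimal sketch of the mismatch described here (not code from Serendipity itself): htmlentities() produces a valid entity only when the charset it is told to assume matches the bytes it receives. The variable names are illustrative, and the charset is passed explicitly because the default differs between PHP versions.

```php
<?php
// "Lunedì" as raw bytes in two encodings. Byte escapes are used so the
// result does not depend on how this file itself is saved.
$utf8  = "Luned\xC3\xAC"; // UTF-8: ì is the two bytes C3 AC
$latin = "Luned\xEC";     // ISO-8859-1: ì is the single byte EC

// Declared charset matches the bytes: a valid entity comes out.
$okLatin = htmlentities($latin, ENT_QUOTES, 'ISO-8859-1');
$okUtf8  = htmlentities($utf8, ENT_QUOTES, 'UTF-8');

// UTF-8 bytes read as ISO-8859-1: each byte is treated as its own character.
$broken = htmlentities($utf8, ENT_QUOTES, 'ISO-8859-1');

echo $okLatin, "\n"; // Luned&igrave;
echo $okUtf8, "\n";  // Luned&igrave;
echo $broken, "\n";  // Luned&Atilde;&not;
```

The third case is the one at issue for a blog whose locale already returns UTF-8 day names.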

Thanks a lot though for discussing this here!

Regards,
Garvin
# Garvin Hicking (s9y Developer)
# Did I help you? Consider making me happy: http://wishes.garv.in/
# or use my PayPal account "paypal {at} supergarv (dot) de"
# My "other" hobby: http://flickr.garv.in/

Re: Special characters in the days of week

Post by fapo »

Thanks for your answer.
garvinhicking wrote:Hi!

Your problem is that you are using a system locale that does not match your Serendipity character set. If you're using UTF-8, you need to make sure your server has the Italian UTF-8 locale installed.
I use a free hosting service (helloweb.eu): in the control panel I can only specify my country, not a charset.
The server reports that it accepts both the ISO-8859-1 and UTF-8 charsets.

(The service is well known in Italy, and other blogs/websites hosted there do not have this kind of problem, probably thanks to charset compatibility.)

From searching Google, I see the problem is widespread among Italian bloggers. So far I have found only 3 websites displaying the accented "i" correctly (2 run by developers and 1 political). The rest have either left it as it is or switched to a theme that abbreviates the days of the week.

(You can search www.google.it for the keywords "Amministrazione di Serendipity", click "Torna al weblog", and look for "Lunedì", "Martedì".)
garvinhicking wrote: Using your patch would yield invalid HTML entities in all UTF-8 powered blogs with umlauts that use the proper locale.
Bear in mind that my patch is very crude, because I'm not a PHP expert and I have no time at the moment to implement something better. And my knowledge of charsets is very limited... :roll:

I tried running htmlentities("Lunedì"); on Helloweb.eu, and the charset is indeed different: the output displays wrongly (on Windows and Ubuntu Linux, where the browser defaults to ISO-8859-15).
The result is "LunedÃ¬" instead of "Lunedì", but in Serendipity it displays correctly anyway with the modification (because of the right UTF-8 headers?).

(The first link shows a garbled "Lunedì" because I let the browser pick the charset: ISO-8859-15.)
http://fapo.helloweb.eu/studiovideo/2007/tests/data.php
http://fapo.helloweb.eu/studiovideo/200 ... hpinfo.php
(my brand new blog, patched)
http://www.studiovideo.tk/
(another blog with the problem; it is on paid hosting, I think on GoDaddy servers or an affiliate)
http://www.frikkysound.com/blog/

Another thing that puzzles me: the problem does not appear in article or comment bodies, only in the days of the week...

I have no time these days to do more, but I'm also thinking about a special option in the Serendipity panel that would offer a non-standard but simple workaround for anyone...

Cheers,
Fabio.

Re: Special characters in the days of week

Post by garvinhicking »

Hi!
fapo wrote:I use a free hosting service (helloweb.eu): in the control panel I can only specify my country, not a charset.
Maybe you can ask your provider whether they provide UTF-8 charset locales, like de_DE.UTF-8 and de_DE.ISO-8859-1.
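
One way to check what the server offers before asking the provider is a sketch like the following. It assumes a PHP environment; the function name is hypothetical, and the Italian locale names are the usual Linux spellings, which may differ on other systems. setlocale() accepts a list of candidates and returns false when none is installed.

```php
<?php
/**
 * Return the first locale name from $candidates that the server accepts
 * for date formatting, or false if none of them is installed.
 * (Hypothetical helper, not part of Serendipity.)
 */
function firstAvailableTimeLocale(array $candidates)
{
    $previous = setlocale(LC_TIME, '0');         // '0' only queries the current setting
    $chosen   = setlocale(LC_TIME, $candidates); // tries each name in order
    if ($previous !== false) {
        setlocale(LC_TIME, $previous);           // put the old setting back
    }
    return $chosen;
}

// Assumed locale names; adjust to whatever the host actually uses.
$chosen = firstAvailableTimeLocale(['it_IT.UTF-8', 'it_IT.utf8']);
echo $chosen === false
    ? "No Italian UTF-8 locale installed\n"
    : "Available: $chosen\n";
```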
fapo wrote:(The service is well known in Italy, and other blogs/websites hosted there do not have this kind of problem, probably thanks to charset compatibility.)
Maybe other PHP applications do not use the locale system, but Serendipity does. :-)
fapo wrote:The result is "LunedÃ¬" instead of "Lunedì", but in Serendipity it displays correctly anyway with the modification (because of the right UTF-8 headers?).
It might display correctly, but try to validate it with a W3 validator; it will result in a validation error.

So we can't really fix this with encoding. There is only one way to handle it, and that would be to install the matching locale on your server. The only other option is to not use locales for dates, which would mean translating each weekday and month name for every s9y language we have. So I'd really rely on the locale system; providers can usually install missing locales in a matter of seconds.
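
To give an idea of what the "no locales" option would involve, here is a rough sketch for a single language. The constant and function names are hypothetical, the names are stored as UTF-8 in the source file, and a real implementation would need one such table per supported language.

```php
<?php
// Hand-maintained Italian names, indexed the way gmdate() reports them:
// 'w' gives 0 (Sunday) to 6 (Saturday), 'n' gives 1 to 12.
const GIORNI = ['Domenica', 'Lunedì', 'Martedì', 'Mercoledì', 'Giovedì', 'Venerdì', 'Sabato'];
const MESI = [
    1 => 'gennaio', 'febbraio', 'marzo', 'aprile', 'maggio', 'giugno',
    'luglio', 'agosto', 'settembre', 'ottobre', 'novembre', 'dicembre',
];

// Format a timestamp (interpreted as UTC) without touching the system locale.
function italianDate(int $timestamp): string
{
    $weekday = GIORNI[(int) gmdate('w', $timestamp)];
    $month   = MESI[(int) gmdate('n', $timestamp)];

    return sprintf(
        '%s, %d %s %d',
        $weekday,
        (int) gmdate('j', $timestamp),
        $month,
        (int) gmdate('Y', $timestamp)
    );
}

echo italianDate(gmmktime(12, 0, 0, 9, 7, 2007)), "\n"; // Venerdì, 7 settembre 2007
```

The output no longer depends on installed locales, at the cost of maintaining the tables by hand.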

Best regards,
Garvin

Re: Special characters in the days of week

Post by fapo »

garvinhicking wrote:It might display correctly, but try to validate it with a W3 validator; it will result in a validation error.
Really? I tried it, but there is no error about that:

http://validator.w3.org/check?uri=http% ... &verbose=1
(many errors are due to my articles)

and line 33, surprisingly, validates as correct (&igrave;):
"<h3 class="main_date">Venerd&igrave;, 7 settembre 2007</h3>"

I have not fully understood why you say that UTF-8 won't accept &igrave;: in Italy web developers use this entity very often (like &uuml; in Dutch).
Maybe I'm missing something: in "Lunedì", the resulting "i" character (after htmlentities()) won't be a single (or double) 8-bit character but the string "&igrave;", right? :confused:
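
A small sketch of the point raised here, assuming the day name arrives as ISO-8859-1 bytes (variable names are illustrative): once encoded, the entity is plain ASCII, so the same string decodes to the right character under either page charset.

```php
<?php
// ISO-8859-1 input, with the matching charset declared explicitly.
$entity = htmlentities("Luned\xEC", ENT_QUOTES, 'ISO-8859-1');

// The result contains only 7-bit characters.
$isAscii = preg_match('/^[\x00-\x7F]*$/', $entity) === 1;

// Decoding the same entity for two different target charsets.
$asUtf8  = html_entity_decode($entity, ENT_QUOTES, 'UTF-8');      // ends in bytes C3 AC
$asLatin = html_entity_decode($entity, ENT_QUOTES, 'ISO-8859-1'); // ends in byte EC

echo $entity, "\n";           // Luned&igrave;
echo bin2hex($asUtf8), "\n";  // 4c756e6564c3ac
echo bin2hex($asLatin), "\n"; // 4c756e6564ec
```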
garvinhicking wrote:So we can't really fix this with encoding. There is only one way to handle it, and that would be to install the matching locale on your server. The only other option is to not use locales for dates, which would mean translating each weekday and month name for every s9y language we have. So I'd really rely on the locale system; providers can usually install missing locales in a matter of seconds.

Best regards,
Garvin
Understood; in fact I have too many doubts to insist on my point of view.
I'll try to raise this with helloweb.eu. Thank you so much for your patience and availability.

Cheers,
Fabio

Re: Special characters in the days of week

Post by garvinhicking »

Hi!

True, my new tests showed that all validators are fine with having an "&auml;" in a document that is XHTML 1.1 and UTF-8. I was pretty sure that those umlaut entities were no longer allowed when using XHTML, and that one had to use the proper 2-byte UTF-8 characters.

So maybe your patch really is the best way to deal with ISO vs. UTF-8 locales. I must make more thorough checks, but if it works fine for you, please use your patch. I'll try to run some complete tests in the future and decide how it can be put to use in Serendipity! Thanks a lot!
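
For comparison only, a different technique from the patch discussed in this thread: instead of entity-encoding the locale's output, the bytes can be converted from the locale's charset to the blog's charset. This is a sketch under assumptions; the function name and its parameters are hypothetical, and it requires PHP's mbstring extension.

```php
<?php
/**
 * Convert text produced under the locale's charset into the blog's charset.
 * (Hypothetical helper; both charset names must be supplied by the caller.)
 */
function toBlogCharset(string $text, string $localeCharset, string $blogCharset): string
{
    // Nothing to do when both sides already agree.
    if (strcasecmp($localeCharset, $blogCharset) === 0) {
        return $text;
    }
    return mb_convert_encoding($text, $blogCharset, $localeCharset);
}

// An ISO-8859-1 day name, as an ISO locale would return it, shown on a UTF-8 blog.
$converted = toBlogCharset("Luned\xEC", 'ISO-8859-1', 'UTF-8');
echo bin2hex($converted), "\n"; // 4c756e6564c3ac
```

Unlike the entity approach, this only works if the locale's charset is known, which is exactly the information missing on hosts like the one described above.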

Best regards,
Garvin