Languages other than en-US or Latin

I've found that Beaumont is usable with Japanese in Unicode, but there are some annoying problems. In order of technical complexity (my guess), they are:

1. URLs look strange. They end up looking like:

http://xxx.com/page.php?tag=----

This is not very good. I don't mind having to enter an English title to use for the URL, if necessary...

2. Files with Japanese names get broken filenames. In many cases the Japanese part is simply removed, so sometimes I end up with names like .jpg. That's right, files with Japanese names still use the .jpg extension for JPEG files. (Did you know that?)

3. Tags. Tags are comma- or space-separated, so this should be relatively easy, but Japanese phrases are all but ignored.

4. Search. I bet this is a bit tricky because word boundaries are not obvious, but it would be very nice to be able to use search in multiple languages.

rewrite rules
by root / at 21:25 on January 24, 2007

If you turn on URL rewriting (and install the supplied .htaccess file), URLs will be generated as /p/tag

Re: rewrite rules
by Ryuji / at 07:27 on January 25, 2007

Yes, I used URL rewriting initially, as recommended in siteframe.ini, but I realized that the edit page gets confused, probably because my page titles are often all-Japanese phrases, so the page names become -, --, ---, ----, etc. I would greatly appreciate some minimal level of non-English support in future versions...
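To illustrate the kind of minimal support meant here, below is a rough sketch (in Python, not Beaumont's actual code) of a slug function that keeps non-ASCII letters instead of turning them into "-", and falls back to a caller-supplied name when nothing usable is left. The names make_slug and slug_url and the fallback argument are made up for this example; the /p/ prefix is taken from the rewrite-rule URLs mentioned above.

```python
import re
import unicodedata
from urllib.parse import quote


def make_slug(title: str, fallback: str) -> str:
    """Build a URL slug that keeps Unicode letters and digits.

    `fallback` is a hypothetical caller-supplied name (for example one
    derived from the page id) used when the title has no usable characters.
    """
    # NFKC folds full-width and half-width variants into one form.
    text = unicodedata.normalize("NFKC", title).strip().lower()
    # For str patterns, \w is Unicode-aware in Python 3, so kana and kanji
    # are kept; every run of other characters becomes one hyphen.
    text = re.sub(r"[^\w]+", "-", text).strip("-")
    return text or fallback


def slug_url(title: str, fallback: str) -> str:
    # Percent-encode the slug so the URL itself stays ASCII on the wire.
    return "/p/" + quote(make_slug(title, fallback))
```

With something like this, two different all-Japanese titles would no longer collapse into the same run of hyphens, which is my guess at what confuses the edit page. Whether percent-encoded URLs are acceptable is a separate design choice; an optional English slug field would be the other obvious route.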

others
by root / at 21:27 on January 24, 2007

2) Yes, it attempts to replace all non-ASCII characters with "-".
3) Hm, I'm just using the PHP split() function, so maybe that's not multi-language aware.
4) Yeah, at this time, the search table is defined as UTF-8, and it uses MySQL's text search function. I don't know what it would take to get it to work with Japanese.
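On point 3, here is a small sketch of what Unicode-aware tag splitting could look like. It is written in Python purely for illustration and is not the code Beaumont uses; split_tags and TAG_SEPARATORS are hypothetical names. It assumes the tags go missing at the splitting step, which may not be the real cause (the non-ASCII replacement from point 2 could equally be involved). In PHP the rough equivalent would be preg_split() with the /u modifier rather than split().

```python
import re

# Separators: ASCII comma, any Unicode whitespace (this includes the
# full-width space U+3000), the ideographic comma and the full-width comma.
TAG_SEPARATORS = re.compile(r"[,\s、，]+")


def split_tags(raw: str) -> list[str]:
    """Split a tag string on commas or whitespace, keeping non-ASCII tags."""
    # Empty strings appear when the input starts or ends with a separator.
    return [tag for tag in TAG_SEPARATORS.split(raw) if tag]
```

Accepting the Japanese comma and the full-width space as separators is an extra assumption on my part, since Japanese input methods tend to produce those instead of their ASCII counterparts.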

The summary in the RSS feed also gets corrupted sometimes
by Ryuji / at 17:00 on January 28, 2007

The lead text followed by "..." is not very predictable when the text is in Japanese. Sometimes the last character is broken. More often there is no text at all and the summary is just "...". Is this easy to fix?
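I don't know how the summary is actually produced, but both symptoms would fit a guess like this: the text is cut at a byte offset (which can split a multi-byte UTF-8 character), and the cut then backs up to the last space (which Japanese text usually doesn't have, leaving nothing). A sketch of the alternative, in Python for illustration only, with hypothetical function names:

```python
def truncate_chars(text: str, max_chars: int, ellipsis: str = "...") -> str:
    """Cut on a character boundary; does not need a space to cut at."""
    # Collapse runs of whitespace (including newlines) into single spaces.
    text = " ".join(text.split())
    if len(text) <= max_chars:
        return text
    return text[:max_chars].rstrip() + ellipsis


def truncate_utf8_bytes(text: str, max_bytes: int) -> str:
    """Cut to a byte budget without leaving half a character at the end."""
    raw = text.encode("utf-8")[:max_bytes]
    # errors="ignore" drops the incomplete trailing byte sequence, if any.
    return raw.decode("utf-8", errors="ignore")
```

The first function is what a feed summary probably wants; the second is for cases where a hard byte limit exists, such as a fixed-size database column.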