This is part of The Pile, a partial archive of some open source mailing lists and newsgroups.
Subject: multilanguage site
From: "Francesco Pasqualini" <f.pasqualini@cpsinformatica.it>
Date: Tue, 29 Aug 2000 14:58:13 +0200

Can someone suggest the best way to build a multilanguage web site
(English, French, ...)?
I'm using Apache + mod_perl + Apache::ASP (for applications).

Could XML/XSL with AxKit be useful?
Are there any examples or guidelines?

===
Subject: Re: multilanguage site
From: Matt Sergeant <matt@sergeant.org>
Date: Tue, 29 Aug 2000 14:15:41 +0100 (BST)

On Tue, 29 Aug 2000, Francesco Pasqualini wrote:

> Can someone suggest the best way to build a multilanguage web site
> (English, French, ...)?
> I'm using Apache + mod_perl + Apache::ASP (for applications).
>
> Could XML/XSL with AxKit be useful?
> Are there any examples or guidelines?

This month's Web Techniques is all about this (albeit in a
framework-independent manner). I suggest you try as hard as you can to
get a copy, as it covers way more than I could possibly type here.

Also look up content negotiation in the Apache docs.

===
Subject: RE: multilanguage site
From: Jerrad Pierce <Jerrad.Pierce@networkengines.com>
Date: Tue, 29 Aug 2000 09:24:36 -0400

Try this:
http://webtechniques.com/archives/2000/09/yunker/

and perhaps this:
http://webtechniques.com/archives/2000/09/lagon/

===
Subject: Re: multilanguage site
From: David Hodgkinson <daveh@hodgkinson.org>
Date: 29 Aug 2000 14:29:34 +0100

"Francesco Pasqualini" <f.pasqualini@cpsinformatica.it> writes:

> Can someone suggest the best way to build a multilanguage web site
> (English, French, ...)?
> I'm using Apache + mod_perl + Apache::ASP (for applications).
>
> Could XML/XSL with AxKit be useful?
> Are there any examples or guidelines?

I'm interested in this too :-)

The Deep Purple site just went vaguely multilingual, but I'm doing
this with straight Apache MultiViews (which _are_ honoured by SSI,
which is nice) and I can see this becoming a huge headache.

I'd like to do it with the Template Toolkit if at all possible.
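For readers unfamiliar with the content negotiation mentioned above, here is a minimal Python sketch of the underlying idea: read the browser's Accept-Language header and pick the best language the site actually has. It is a simplified illustration, not Apache's actual negotiation algorithm, and the function names `parse_accept_language` and `choose_language` are made up for this example.

```python
def parse_accept_language(header):
    """Return the language tags of an Accept-Language header, best first.

    Tags with q=0 are dropped because the client explicitly rejects them.
    """
    prefs = []
    for index, part in enumerate(header.split(",")):
        piece = part.strip()
        if not piece:
            continue
        tag, _, params = piece.partition(";")
        quality = 1.0  # a missing q-value means full preference
        params = params.strip()
        if params.startswith("q="):
            try:
                quality = float(params[2:])
            except ValueError:
                quality = 0.0  # treat a malformed q-value as "not wanted"
        # Sort by descending quality, then by original position in the header.
        prefs.append((-quality, index, tag.strip().lower()))
    prefs.sort()
    return [tag for neg_quality, _, tag in prefs if neg_quality < 0]


def choose_language(header, available, default="en"):
    """Pick the first acceptable language that the site provides."""
    for tag in parse_accept_language(header):
        if tag == "*":
            return default
        if tag in available:
            return tag
        primary = tag.split("-")[0]  # fall back from "fr-ca" to "fr"
        if primary in available:
            return primary
    return default


if __name__ == "__main__":
    site_languages = {"en", "fr"}  # hypothetical set of translated variants
    print(choose_language("fr-CA,fr;q=0.8,en;q=0.5", site_languages))
```

With MultiViews, Apache makes a comparable choice among files such as index.html.en and index.html.fr, so a site using it does not need code like this; the sketch only shows what the server is deciding.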
===
Subject: Re: multilanguage site
From: Joshua Chamas <joshua@chamas.com>
Date: Tue, 29 Aug 2000 13:10:46 -0700

Francesco Pasqualini wrote:
>
> Can someone suggest the best way to build a multilanguage web site
> (English, French, ...)?
> I'm using Apache + mod_perl + Apache::ASP (for applications).
>
> Could XML/XSL with AxKit be useful?
> Are there any examples or guidelines?
>

The approach used by Paul at RedHat seems to have been
to wrap internationalized messages with <tag>message</tag>,
where <tag> is an XMLSub that does a lookup at runtime
in a message catalog for the right message, based on the
language the client is set to. I'm sure it's much more
complicated than that, but that was the gist of it.

===
Subject: Re: multilanguage site
From: Paul Lindner <plindner@redhat.com>
Date: Tue, 29 Aug 2000 13:18:26 -0700

On Tue, Aug 29, 2000 at 01:10:46PM -0700, Joshua Chamas wrote:
> Francesco Pasqualini wrote:
> >
> > Can someone suggest the best way to build a multilanguage web site
> > (English, French, ...)?
> > I'm using Apache + mod_perl + Apache::ASP (for applications).
> >
> > Could XML/XSL with AxKit be useful?
> > Are there any examples or guidelines?
> >
>
> The approach used by Paul at RedHat seems to have been
> to wrap internationalized messages with <tag>message</tag>,
> where <tag> is an XMLSub that does a lookup at runtime
> in a message catalog for the right message, based on the
> language the client is set to. I'm sure it's much more
> complicated than that, but that was the gist of it.

Yeah, it's more complicated than that. :-)

Basically there are four tools that we use, based on a hacked version
of Locale::PGetText and the standard .po file format provided by GNU
gettext. The tools are:

  XText      - extracts <msg>xxx</msg> text and Apps::gettext() strings
               into messages.po

  ... then we cp messages.po to messages.<LANGCODE>.po and convert

  MsgProcess - processes messages.<LANGCODE>.po into messages.db

  msgmerge   - standard GNU gettext stuff.
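As a rough illustration of the message-catalog idea in this exchange, the Python sketch below extracts <msg>...</msg> strings from a template, loads translations from .po-style text, and substitutes them at render time. It is not the Red Hat tooling: the function names are hypothetical, the catalog is a plain dict rather than a messages.db file, and the parser only handles simple single-line msgid/msgstr pairs without escape sequences.

```python
import re

# Matches a single-line msgid immediately followed by a single-line msgstr.
PO_ENTRY = re.compile(r'msgid\s+"(.*)"\s*\nmsgstr\s+"(.*)"')

# Matches text wrapped in the <msg> marker tag used in the templates.
MSG_TAG = re.compile(r"<msg>(.*?)</msg>", re.DOTALL)


def extract_messages(template):
    """List every translatable string marked with <msg>...</msg>."""
    return MSG_TAG.findall(template)


def load_catalog(po_text):
    """Build a msgid -> msgstr dict, skipping empty ids and untranslated entries."""
    return {msgid: msgstr
            for msgid, msgstr in PO_ENTRY.findall(po_text)
            if msgid and msgstr}


def render(template, catalog):
    """Replace each <msg> block with its translation, or the original text."""
    return MSG_TAG.sub(lambda m: catalog.get(m.group(1), m.group(1)), template)


if __name__ == "__main__":
    # Hypothetical contents of a French catalog file.
    french_po = 'msgid "Welcome"\nmsgstr "Bienvenue"\n\nmsgid "Search"\nmsgstr "Rechercher"\n'
    page = "<h1><msg>Welcome</msg></h1><p><msg>Contact us</msg></p>"
    print(extract_messages(page))
    print(render(page, load_catalog(french_po)))
```

Falling back to the original string when no translation exists means a partially translated site still renders, which is the usual gettext behaviour.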
At runtime the code dynamically looks up the message text in the local
messages.db file.

Let me know if anyone is interested in this stuff. It's a bit rough at
this point but works quite well for us.

===
Subject: Re: multilanguage site
From: "Eric L. Brine" <ebrine@home.com>
Date: Fri, 01 Sep 2000 23:18:13 -0400

> As far as I can tell there's no way in html to indicate to the browser
> that a chunk of content is in some other encoding other than what was
> specified in the headers or meta tag. There's no <span charset=...>
> attribute or anything like that. This seems to make truly multilingual
> pages really awkward.

> You basically must use an encoding like UTF-8 which can reach the
> entire unicode character set or else you cannot mix languages.

Not quite. To display characters not in the current character set, use
"&...;" encodings, such as "&eacute;" and "&#9999;" (where 9999 is the
decimal Unicode code point).

===
Subject: Re: multilanguage site
From: Matt Sergeant <matt@sergeant.org>
Date: Sat, 2 Sep 2000 08:50:34 +0100 (BST)

On 1 Sep 2000, Greg Stark wrote:

>
> > >> Can someone suggest the best way to build a multilanguage web site
> > >> (English, French, ...)?
> > >> I'm using Apache + mod_perl + Apache::ASP (for applications).
>
> I'm really interested in what other people are doing here. We've just released
> our first cut at i18n and it's going fairly well. But so far we haven't dealt
> with the big bugaboo, character encoding.
>
> One major problem I anticipate is what to do when individual include files are
> not available in the local language. For iso-8859-1 encoded languages that's
> not a major hurdle as we can simply use the english text until it's
> translated. But for other encodings does it make sense to include english
> text?
>
> If we use UTF-8 all the ascii characters would display properly, but do most
> browsers support UTF-8 now? Or do people still use BIG5, EUC, etc?

My experience has been really good.
With 4.x+ browsers UTF-8 displays just fine, with the obvious caveat
that you have to be using the right fonts. Generally the people you are
displaying to have the right fonts (otherwise they wouldn't be able to
use their computers!).

My only problems were two things:

1. Title bars in Linux just displayed junk. This was probably both an
   encoding/window manager issue and a font issue.

2. People don't want their content in UTF-8 - they want it in the
   character set they are used to, like ISO-8859-2. So I added support
   in AxKit for alternate output encodings.

Of course, being XML, AxKit handles different character sets in included
files just fine - everything is UTF-8 to AxKit.

> As far as I can tell there's no way in html to indicate to the browser that a
> chunk of content is in some other encoding other than what was specified in
> the headers or meta tag. There's no <span charset=...> attribute or anything
> like that.

Yes, there is.

> This seems to make truly multilingual pages really awkward. You
> basically must use an encoding like UTF-8 which can reach the entire unicode
> character set or else you cannot mix languages.

Or use AxKit ;-)

===
Subject: Re: multilanguage site
From: "Eric L. Brine" <ebrine@home.com>
Date: Sat, 02 Sep 2000 13:08:05 -0400

> > As far as I can tell there's no way in html to indicate to the
> > browser that a chunk of content is in some other encoding other
> > than what was specified in the headers or meta tag. There's no
> > <span charset=...> attribute or anything like that.
>
> Yes, there is.

None exists in the standard, as seen below, and I don't see anything in
CSS either.
<!ELEMENT SPAN - - (%inline;)*         -- generic language/style container -->
<!ATTLIST SPAN
  %attrs;                              -- %coreattrs, %i18n, %events --
  %reserved;                           -- reserved for possible future use --
  >

<!ENTITY % attrs "%coreattrs; %i18n; %events;">

<!ENTITY % coreattrs
 "id          ID             #IMPLIED  -- document-wide unique id --
  class       CDATA          #IMPLIED  -- space-separated list of classes --
  style       %StyleSheet;   #IMPLIED  -- associated style info --
  title       %Text;         #IMPLIED  -- advisory title --"
  >

<!ENTITY % i18n
 "lang        %LanguageCode; #IMPLIED  -- language code --
  dir         (ltr|rtl)      #IMPLIED  -- direction for weak/neutral text --"
  >

<!ENTITY % events
 "onclick     %Script;       #IMPLIED  -- a pointer button was clicked --
  ondblclick  %Script;       #IMPLIED  -- a pointer button was double clicked--
  onmousedown %Script;       #IMPLIED  -- a pointer button was pressed down --
  onmouseup   %Script;       #IMPLIED  -- a pointer button was released --
  onmouseover %Script;       #IMPLIED  -- a pointer was moved onto --
  onmousemove %Script;       #IMPLIED  -- a pointer was moved within --
  onmouseout  %Script;       #IMPLIED  -- a pointer was moved away --
  onkeypress  %Script;       #IMPLIED  -- a key was pressed and released --
  onkeydown   %Script;       #IMPLIED  -- a key was pressed down --
  onkeyup     %Script;       #IMPLIED  -- a key was released --"
  >

===
Subject: Re: multilanguage site
From: Matt Sergeant <matt@sergeant.org>
Date: Sun, 3 Sep 2000 07:41:46 +0100 (BST)

On Sat, 2 Sep 2000, Eric L. Brine wrote:

>
> > > As far as I can tell there's no way in html to indicate to the
> > > browser that a chunk of content is in some other encoding other
> > > than what was specified in the headers or meta tag. There's no
> > > <span charset=...> attribute or anything like that.
> >
> > Yes, there is.
>
> None exists in the standard, as seen below, and I don't see anything in
> CSS either.

My bad. I was confusing it with the HTML form's accept-charset attribute.
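The "&...;" character references discussed earlier in the thread can be generated mechanically. The short Python sketch below shows one way to do it, using only the standard library; the helper name `to_ascii_html` is made up for this example. It converts every non-ASCII character to a decimal numeric reference so the page bytes stay pure ASCII regardless of the declared charset.

```python
import html


def to_ascii_html(text):
    """Replace each non-ASCII character with a decimal reference such as &#233;.

    This does not escape '&', '<' or '>'; run html.escape() on plain text
    first if it may contain markup characters.
    """
    return text.encode("ascii", errors="xmlcharrefreplace").decode("ascii")


if __name__ == "__main__":
    sample = "caf\u00e9 \u270f"  # "e" with acute accent, then the pencil sign
    encoded = to_ascii_html(sample)
    print(encoded)
    # html.unescape() reverses both named and numeric references.
    print(html.unescape(encoded) == sample)
```

Whether the browser can then display the character still depends on its Unicode support and installed fonts, which is a separate question from how the page is encoded.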
===
Subject: Re: multilanguage site
From: Ričardas Čepas <rch@richard.eu.org>
Date: Sun, 3 Sep 2000 06:27:38 +0200

On Fri Sep 1 23:18:13 2000 -0400 Eric L. Brine wrote:

>
> > You basically must use an encoding like UTF-8 which can reach the
> > entire unicode character set or else you cannot mix languages.
>
> Not quite. To display characters not in the current character set, use
> "&...;" encodings, such as "&eacute;" and "&#9999;" (where 9999 is the
> decimal Unicode code point).
>

This would require a Unicode-capable browser anyway. What's more,
Netscape v4 doesn't show these escapes unless you set the encoding to
UTF-8.

===
Subject: Re: [OT] multilanguage site
From: "G.W. Haywood" <ged@www.jubileegroup.co.uk>
Date: Sun, 3 Sep 2000 08:49:12 +0100 (BST)

Hi all,

On Sun, 3 Sep 2000, [UTF-8] Ri