svlug-sort_locale_sort_order_interactions

This is part of The Pile, a partial archive of some open source mailing lists and newsgroups.



Date: Fri, 18 Oct 2002 12:33:19 -0400
From: George Georgalis <georgw@galis.org>
To: svlug@lists.svlug.org
Subject: [svlug] sort not -d

I am running sort on a set of email addresses, but it's ignoring
the non-alphanumeric characters and returning this

amanda
A.Maso
amie.g
amy.be

when I expected this

A.Maso
amanda
amie.g
amy.be

I am not using the -d switch, and I even tried recompiling sort from
debian/main/t/textutils/textutils_2.0.orig.tar.gz
thinking there was something wrong with my distro's sort.

I get the same results with the new package. The man/info pages didn't
give me any clues. Is there some way I can fix this? I'm looking at
src/sort.c now.

--===============5965081979081539==
Content-Type: message/rfc822
MIME-Version: 1.0

Date: Fri, 18 Oct 2002 10:04:28 -0700
From: Romain Kang <romain@kzsu.stanford.edu>
To: George Georgalis <georgw@galis.org>
Cc: svlug@lists.svlug.org
Subject: [svlug] Re: sort not -d
Message-ID: <20021018170428.GA62020@kzsu.stanford.edu>
In-Reply-To: <20021018123319.F16606@trot>
References: <20021018123319.F16606@trot>
Content-Type: text/plain; charset=us-ascii
MIME-Version: 1.0
Precedence: list
Message: 4

Hmm, the sort works fine for me.  Perhaps you have a funky locale
somewhere in your environment?   The Red Hat sort man page says:

       ***  WARNING  ***  The locale specified by the environment
       affects sort order.  Set LC_ALL=C to get  the  traditional
       sort order that uses native byte values.
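
A minimal sketch of what that warning means in practice, assuming a
POSIX shell and the four sample lines from the original question. The
variable names `names` and `sorted_c` are only illustrative. Setting
LC_ALL=C on the command line affects that one sort invocation and
leaves the rest of the session alone:

```shell
# Sample input taken from the original question.
names='amanda
A.Maso
amie.g
amy.be'

# Force byte-value collation for this single command only.
# In the C locale, uppercase 'A' (0x41) sorts before lowercase 'a' (0x61),
# and '.' is compared as an ordinary byte instead of being skipped.
sorted_c=$(printf '%s\n' "$names" | LC_ALL=C sort)

printf '%s\n' "$sorted_c"
```

With LC_ALL=C this should print A.Maso first, followed by amanda,
amie.g and amy.be. What an en_US locale prints instead depends on the
locale data installed on the machine, so no output is shown for it.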


--===============5965081979081539==
Content-Type: message/rfc822
MIME-Version: 1.0

Date: Fri, 18 Oct 2002 13:08:46 -0400
From: George Georgalis <georgw@galis.org>
To: svlug@lists.svlug.org
Subject: [svlug] Re: sort not -d
Message-ID: <20021018130846.I16606@trot>
In-Reply-To: <200210181645.AA20530@proxima.ucsd.edu.UCSD.EDU>;
	from cdl@proxima.ucsd.edu on Fri, Oct 18, 2002 at 09:45:46AM -0700
References: <200210181645.AA20530@proxima.ucsd.edu.UCSD.EDU>
Content-Type: text/plain; charset=us-ascii
MIME-Version: 1.0
Precedence: list
Message: 5

On Fri, Oct 18, 2002 at 09:45:46AM -0700, Carl Lowenstein wrote:
>> Date: Fri, 18 Oct 2002 12:33:19 -0400
>> From: George Georgalis <georgw@galis.org>
>> To: kplug-list@kernel-panic.org
>> Subject: sort not -d
>>
>> I am running sort on a set of email addresses, but it's ignoring
>> the non-alphanumeric characters and returning this
>>
>> amanda
>> A.Maso
>> amie.g
>> amy.be
>>
>> when I expected this
>>
>> A.Maso
>> amanda
>> amie.g
>> amy.be
>>
>> I am not using the -d switch, and I even tried recompiling sort from
>> debian/main/t/textutils/textutils_2.0.orig.tar.gz
>> thinking there was something wrong with my distro's sort.
>>
>> I get the same results with the new package. The man/info pages didn't
>> give me any clues. Is there some way I can fix this? I'm looking at
>> src/sort.c now.
>
>You have been bitten by an Internationalization (I18N) bug.  Somewhere
>there is a locale environment variable (LANG, LC_ALL, or LC_COLLATE)
>which is set to en_US.  Unfortunately, the collation sequence for en_US
>is not what most of us expect.  Set LC_COLLATE to C.  I think the
>problem is buried deep inside libc.
>
>I don't have my hands on a new-enough Linux machine at this instant to
>give the exact incantation, but it has gone around a couple of times on
>this mailing list.
>
>OK, I found it by grep on my miscellaneous saved mail.
>
>	LC_COLLATE=en_US	# default setting, undesirable
>	LC_COLLATE=C
>	LC_COLLATE=POSIX	# two different ways to get what you want


export LC_COLLATE=C

Works perfectly, thanks Carl.
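
One caveat worth sketching: LC_COLLATE only decides the sort order when
LC_ALL is unset or empty, because POSIX gives LC_ALL precedence over
the per-category variables, and those in turn take precedence over
LANG. The helper below is hypothetical (`effective_collate` is not a
standard command); it just reports which variable would win in the
current environment:

```shell
# Hypothetical helper: report which variable decides the collation order.
# Precedence follows POSIX: LC_ALL first, then LC_COLLATE, then LANG.
effective_collate() {
    if [ -n "${LC_ALL:-}" ]; then
        printf 'LC_ALL=%s\n' "$LC_ALL"
    elif [ -n "${LC_COLLATE:-}" ]; then
        printf 'LC_COLLATE=%s\n' "$LC_COLLATE"
    elif [ -n "${LANG:-}" ]; then
        printf 'LANG=%s\n' "$LANG"
    else
        # Nothing is set; the system falls back to its default locale.
        printf 'unset\n'
    fi
}

effective_collate
```

If this reports LC_ALL, then `export LC_COLLATE=C` alone will not
change what sort does; either unset LC_ALL or set LC_ALL=C instead.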



--===============5965081979081539==
Content-Type: message/rfc822
MIME-Version: 1.0

Date: Fri, 18 Oct 2002 13:21:25 -0400
From: George Georgalis <georgw@galis.org>
To: svlug@lists.svlug.org
Subject: [svlug] Re: sort not -d
Message-ID: <20021018132125.K16606@trot>
In-Reply-To: <20021018170428.GA62020@kzsu.stanford.edu>;
	from romain@kzsu.stanford.edu on Fri, Oct 18, 2002 at 10:04:28AM -0700
References: <20021018123319.F16606@trot>
	<20021018170428.GA62020@kzsu.stanford.edu>
Content-Type: text/plain; charset=us-ascii
MIME-Version: 1.0
Precedence: list
Message: 6

On Fri, Oct 18, 2002 at 10:04:28AM -0700, Romain Kang wrote:
>Hmm, the sort works fine for me.  Perhaps you have a funky locale
>somewhere in your environment?   The Red Hat sort man page says:
>
>       ***  WARNING  ***  The locale specified by the environment
>       affects sort order.  Set LC_ALL=C to get  the  traditional
>       sort order that uses native byte values.
>

odd, I didn't see that :)

man 1 locale is interesting too.


