If you look at the pattern of dedicated bits in UTF-8, you might wonder whether you could keep going. You can, and that was clearly the original idea: the same pattern extends to 5- and 6-byte sequences, which cover code points up to 31 bits.
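
To make the "keep going" idea concrete, here is a rough sketch in Python of the bit pattern carried out to six bytes. The function name `encode_extended_utf8` is made up for illustration, and this is only a sketch of the original 31-bit scheme, not how Perl (or any real library) implements it. Modern strict UTF-8 stops at 4 bytes and U+10FFFF.

```python
def encode_extended_utf8(cp: int) -> bytes:
    """Encode an integer code point using the 1-to-6-byte UTF-8 bit pattern.

    Hypothetical helper for illustration only: it follows the original
    31-bit scheme and does not reject surrogates or values above U+10FFFF.
    """
    if cp < 0 or cp > 0x7FFFFFFF:
        raise ValueError("code point outside the 31-bit range")
    if cp < 0x80:
        # 7 bits or fewer: a single byte, 0xxxxxxx
        return bytes([cp])

    # (exclusive upper limit, sequence length, lead-byte prefix bits)
    patterns = (
        (0x800, 2, 0xC0),       # 110xxxxx, 11 payload bits
        (0x10000, 3, 0xE0),     # 1110xxxx, 16 payload bits
        (0x200000, 4, 0xF0),    # 11110xxx, 21 payload bits
        (0x4000000, 5, 0xF8),   # 111110xx, 26 payload bits
        (0x80000000, 6, 0xFC),  # 1111110x, 31 payload bits
    )
    for limit, length, prefix in patterns:
        if cp < limit:
            break

    out = []
    # Peel off 6 bits at a time for the continuation bytes (10xxxxxx),
    # working from the low end of the code point.
    for _ in range(length - 1):
        out.append(0x80 | (cp & 0x3F))
        cp >>= 6
    # Whatever is left fits in the lead byte next to its prefix bits.
    out.append(prefix | cp)
    return bytes(reversed(out))


if __name__ == "__main__":
    for cp in (0xE9, 0x10FFFF, 0x200000, 0x7FFFFFFF):
        print(f"{cp:#x} -> {encode_extended_utf8(cp).hex(' ')}")
```

For anything up to U+10FFFF this should agree with Python's own `str.encode("utf-8")`. Beyond that, the 5- and 6-byte forms are where the 22- to 31-bit values end up.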

It's very interesting that with Perl you can still use the 21- to 31-bit range of characters internally (and you can save them to files, provided you use Perl's "lax" form of UTF-8 encoding, the one spelled "utf8" rather than the strict "UTF-8"). It's hard to think of a good reason to do this, though.

(Tom Christiansen has an interesting suggestion that he included in the 4th Camel book: an idiom where you mark which portion of a string you've handled by shifting its code points way up, and then back down when you're done...)

doom@kzsu.stanford.edu