[ProgSoc] Unicode glyph "meta" information?

Bryn Davies curious.jp at gmail.com
Sun Sep 7 23:10:25 EST 2008


On Sun, Sep 7, 2008 at 10:20 PM, Rob Howard <rhoward at progsoc.org> wrote:
> In any case, when I receive an email that contains unicode (Japanese in
> this case), the glyphs end up being rendered with something that almost
> exactly matches the reading for the characters. For example, an email
> contains a "アツィラ" (a-te-i-ra) in the message body, which is rendered
> in mutt as "ATiR".

 Hi Rob,

 That is interesting. At first I thought your program might be reading
the "name" property, but it wouldn't make much sense to write T when
you could write TSU instead. There's a list of Unicode properties at
<http://unicode.org/cldr/utility/properties.jsp>, and I wondered whether
this might come from the Simple_Uppercase_Mapping property or something
similar. However, the entry for ツ at
<http://unicode.org/cldr/utility/character.jsp> shows nothing but the
default values for those properties.
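
 If you want to check the same thing locally, here is a minimal sketch
in Python (my choice of language, nothing to do with how mutt works). It
only uses the standard unicodedata module; the describe() helper is a
made-up name for illustration. It prints each character's code point,
its "name" property, and the result of uppercasing it.

```python
import unicodedata

def describe(text):
    """Return (char, code point, name property, uppercased form) per character."""
    rows = []
    for ch in text:
        rows.append((
            ch,
            f"U+{ord(ch):04X}",
            unicodedata.name(ch, "<unnamed>"),  # the Unicode "name" property
            ch.upper(),                         # applies any uppercase mapping
        ))
    return rows

# The string from the original message.
for row in describe("アツィラ"):
    print(*row)
```

 If the uppercased form in the last column is identical to the original
character, then the case mapping is not what produces the Latin letters.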

 Is it just using a mapping file? I know the Perl people have
something like this (tied to the files at ftp://ftp.unicode.org/MAPPINGS/),
but perhaps the mapping is system-wide.
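
 For what it's worth, reading a mapping file of that sort is not much
work. The sketch below assumes the layout I believe most of the files
under MAPPINGS use: a source code in hex, the Unicode code point in hex,
then a "#" comment. The parse_mapping() name and the sample rows are made
up for illustration, and I have no idea whether mutt reads anything like
this.

```python
def parse_mapping(lines):
    """Parse '0xNN <tab> 0xNNNN <tab> # comment' rows into {source code: code point}."""
    table = {}
    for line in lines:
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue                          # blank or comment-only line
        fields = line.split()
        if len(fields) < 2:
            continue                          # source code with no mapping given
        table[int(fields[0], 16)] = int(fields[1], 16)
    return table

# Illustrative rows only, written by hand in the assumed format.
SAMPLE = """\
# source	Unicode	# name
0x41\t0x0041\t# LATIN CAPITAL LETTER A
0xB1\t0xFF71\t# HALFWIDTH KATAKANA LETTER A
""".splitlines()

for source, codepoint in sorted(parse_mapping(SAMPLE).items()):
    print(f"0x{source:02X} -> U+{codepoint:04X}")
```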

 B.

-- 
http://progsoc.org/~curious/
餓鬼も人数 。

