2016-10-29


It is so heart-warming to read people who simply have the right
view of the world around them:

     There are several reasons why our library does not fol-
     low the ANSI design for wide and multi-byte characters.
     The ANSI model was designed by  a  committee,  untried,
     almost  as an afterthought, whereas we wanted to design
     as we built. (We made several major changes to the  in-
     terface  as  we  became  familiar with the problems in-
     volved.) We disagree with ANSI C’s handling of  invalid
     multi-byte  sequences.  Also, the ANSI C library is in-
     complete: although it contains  some  crucial  routines
     for  handling wide and multi-byte characters, there are
     some serious omissions. For example, our  software  can
     exploit the fact that UTF preserves ASCII characters in
     the byte stream. We could remove that assumption by re-
     placing  all  calls  to  strchr with utfrune and so on.
     (Because of the weaker properties of the original  UTF,
     we  have actually done so.) ANSI C cannot: the standard
     says nothing about the representation, so portable code
     should   never  call  strchr,  yet  there  is  no  ANSI
     equivalent to utfrune. ANSI  C  simultaneously  invali-
     dates strchr and offers no replacement.

     Finally, ANSI did nothing to integrate wide  characters
     into  the  I/O  system: it gives no method for printing
     wide characters. We therefore  needed  to  invent  some
     things  and  decided  to invent everything. In the end,
     some of our entry points do correspond closely to  ANSI
     routines  --  for example chartorune and runetochar are
     similar to mbtowc and wctomb -- but  Plan  9’s  library
     defines more functionality, enough to write real appli-
     cations comfortably.  [0]
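
To make the strchr point concrete, here is a minimal sketch of what
a utfrune-like search might look like for today's UTF-8. It is not
Plan 9's code: the names utf8_encode and utf8_find_rune are made up
for illustration, and the input is assumed to be valid,
NUL-terminated UTF-8. ASCII code points can go straight to strchr,
because UTF-8 never reuses ASCII byte values inside a multi-byte
sequence. Everything else is encoded first and searched for as a
byte sequence.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: encode code point cp as UTF-8 into buf,
 * which must hold at least 5 bytes. The result is NUL-terminated.
 * Returns the encoded length, or 0 if cp is not encodable. */
static size_t utf8_encode(unsigned long cp, char *buf)
{
    size_t n;

    if (cp < 0x80) {
        buf[0] = (char)cp;
        n = 1;
    } else if (cp < 0x800) {
        buf[0] = (char)(0xC0 | (cp >> 6));
        buf[1] = (char)(0x80 | (cp & 0x3F));
        n = 2;
    } else if (cp < 0x10000) {
        if (cp >= 0xD800 && cp <= 0xDFFF)
            return 0;               /* surrogates are not characters */
        buf[0] = (char)(0xE0 | (cp >> 12));
        buf[1] = (char)(0x80 | ((cp >> 6) & 0x3F));
        buf[2] = (char)(0x80 | (cp & 0x3F));
        n = 3;
    } else if (cp <= 0x10FFFF) {
        buf[0] = (char)(0xF0 | (cp >> 18));
        buf[1] = (char)(0x80 | ((cp >> 12) & 0x3F));
        buf[2] = (char)(0x80 | ((cp >> 6) & 0x3F));
        buf[3] = (char)(0x80 | (cp & 0x3F));
        n = 4;
    } else {
        return 0;                   /* beyond the Unicode range */
    }
    buf[n] = '\0';
    return n;
}

/* Hypothetical stand-in for utfrune: return a pointer to the first
 * occurrence of code point cp in the UTF-8 string s, or NULL. */
const char *utf8_find_rune(const char *s, unsigned long cp)
{
    char needle[5];

    /* ASCII bytes never occur inside a multi-byte sequence,
     * so plain strchr is safe for them. */
    if (cp < 0x80)
        return strchr(s, (int)cp);

    if (utf8_encode(cp, needle) == 0)
        return NULL;

    /* A lead byte is never a continuation byte, so in valid UTF-8
     * a match of the whole sequence starts on a character boundary. */
    return strstr(s, needle);
}
```

A call like utf8_find_rune(s, 0x20AC) then looks for the Euro sign,
which strchr cannot do, since that character occupies three bytes.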


[0]  http://doc.cat-v.org/plan_9/4th_edition/papers/utf


http://marmaro.de/lue/        markus schnalke <meillo@marmaro.de>