2.2 Overview of surfaces
For various
practical reasons, the codes making up a text
written in a particular charset cannot always be
written to a file one after another without causing
problems. For example, 8-bit codes cannot be
written on a 7-bit medium, variable-length codes
need some kind of envelope, and newlines require
special treatment. In such cases we have to apply
surfaces to the stream of codes. Surfaces are
techniques used to fit the charset within those
practical constraints. Moreover, the same surface
may be useful for many unrelated charsets, and
several surfaces can be used at once over a single
charset.
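The idea can be sketched in a few lines of Python.
This is only an illustration, not how Recode is
implemented: Python codecs stand in for pure
charsets, and the helper names crlf_surface and
base64_surface are hypothetical.

```python
import base64

def crlf_surface(data: bytes) -> bytes:
    """Hypothetical helper: write each newline (LF) as a CR-LF pair."""
    return data.replace(b"\n", b"\r\n")

def base64_surface(data: bytes) -> bytes:
    """Hypothetical helper: wrap 8-bit codes in 7-bit-safe Base64."""
    return base64.b64encode(data)

# Codes of two unrelated single-byte charsets, with no surface applied.
latin1_codes = "déjà vu\n".encode("latin-1")
koi8_codes = "да\n".encode("koi8-r")

# The same surface is useful over both charsets.
latin1_b64 = base64_surface(latin1_codes)
koi8_b64 = base64_surface(koi8_codes)

# Several surfaces can be stacked over a single charset:
# first the newline convention, then the 7-bit envelope.
stacked = base64_surface(crlf_surface(latin1_codes))
```

Neither helper needs to know which charset produced
the bytes it receives, which is what lets one
surface serve many charsets. Note that the simple
byte replacement in crlf_surface is only safe for
single-byte charsets such as the two used here.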
Recode therefore has
machinery to describe the combination of a charset
with the surfaces used over it in a file. We use
the expression pure charset to
refer to a charset free of any surface, that
is, the conceptual association between integer
codes and character intents.
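One way to picture such a description is as a
charset name paired with an ordered list of
surfaces. The sketch below is an assumption made
for illustration only: the Encoding class and the
SURFACES table are hypothetical and are not part
of the Recode library.

```python
import base64
from dataclasses import dataclass

# Hypothetical surface table: name -> (apply, remove).
SURFACES = {
    "crlf": (lambda b: b.replace(b"\n", b"\r\n"),
             lambda b: b.replace(b"\r\n", b"\n")),
    "base64": (base64.b64encode, base64.b64decode),
}

@dataclass(frozen=True)
class Encoding:
    charset: str          # Python codec name standing in for a pure charset
    surfaces: tuple = ()  # surface names, applied left to right on write

    def write(self, text: str) -> bytes:
        data = text.encode(self.charset)
        for name in self.surfaces:
            data = SURFACES[name][0](data)
        return data

    def read(self, data: bytes) -> str:
        # Surfaces are removed in the reverse order of application.
        for name in reversed(self.surfaces):
            data = SURFACES[name][1](data)
        return data.decode(self.charset)

pure = Encoding("latin-1")                         # no surface at all
wrapped = Encoding("latin-1", ("crlf", "base64"))  # charset plus two surfaces
```

Here pure corresponds to a pure charset, while
wrapped describes the same charset as it would
appear in a file once both surfaces are applied.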
It is not
always clear whether a given transformation yields
a charset or a surface, especially for
transformations that are only meaningful over a
single charset. The Recode library is not overly
picky about identifying surfaces as such: when it
is practical to treat a specialised surface as if
it were a charset, it does so.