Free recode package


2.2 Overview of surfaces

For various practical reasons, the codes making up a text written in a particular charset cannot always be written to a file one after another without creating problems or breaking other things. For example, 8-bit codes cannot be written on a 7-bit medium, variable-length codes need some kind of envelope, and newlines require special treatment. We therefore sometimes have to apply surfaces to a stream of codes; surfaces are tricks used to fit the charset into those practical constraints. Moreover, the same surface may be useful for many unrelated charsets, and several surfaces can be used at once over a single charset.
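The distinction can be illustrated outside Recode itself. The following is a minimal sketch using only the Python standard library, not Recode's own API; the sample text and variable names are assumptions made for illustration. It takes one text, encodes it in a charset, and then applies a surface so the result fits a practical constraint.

```python
import base64
import quopri

text = "café\n"

# Charset step: map each character to an integer code (here ISO-8859-1).
codes = text.encode("latin-1")            # contains the 8-bit code 0xE9

# Surface: Quoted-Printable makes the 8-bit code safe on a 7-bit medium.
qp = quopri.encodestring(codes)

# Surface: CR-LF line endings give newlines their special treatment.
crlf = codes.replace(b"\n", b"\r\n")

# The same kind of surface works over an unrelated charset (UTF-8 here),
# which uses variable-length codes.
utf8_b64 = base64.b64encode(text.encode("utf-8"))

# Removing a surface gives back the original stream of codes.
assert quopri.decodestring(qp) == codes
assert base64.b64decode(utf8_b64).decode("utf-8") == text
```

In each case the charset decides which integer codes stand for the characters, while the surface only changes how those codes are laid out in the file.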

So Recode has machinery to describe the combination of a charset with the surfaces used over it in a file. We use the expression pure charset to refer to a charset free of any surface, that is, the conceptual association between integer codes and character intents.
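One way to picture such a combination is as a charset name plus an ordered list of surfaces. The sketch below is a hypothetical model written for this explanation; it is not how the Recode library represents things internally, and the names `Encoding`, `CRLF` and `BASE64` are invented here.

```python
import base64
from dataclasses import dataclass, field
from typing import Callable, List, Tuple

# A surface is modelled as a pair of byte transformations: (apply, remove).
Surface = Tuple[Callable[[bytes], bytes], Callable[[bytes], bytes]]

CRLF: Surface = (
    lambda data: data.replace(b"\n", b"\r\n"),
    lambda data: data.replace(b"\r\n", b"\n"),
)
BASE64: Surface = (base64.b64encode, base64.b64decode)


@dataclass
class Encoding:
    """A charset together with the surfaces applied over it (hypothetical)."""
    charset: str
    surfaces: List[Surface] = field(default_factory=list)

    def write(self, text: str) -> bytes:
        data = text.encode(self.charset)       # pure charset: characters to codes
        for apply, _ in self.surfaces:         # then each surface, in order
            data = apply(data)
        return data

    def read(self, data: bytes) -> str:
        for _, remove in reversed(self.surfaces):  # undo surfaces, last one first
            data = remove(data)
        return data.decode(self.charset)


pure = Encoding("latin-1")                     # no surface: a pure charset
wrapped = Encoding("latin-1", [CRLF, BASE64])  # two surfaces over one charset

stored = wrapped.write("café\n")
assert wrapped.read(stored) == "café\n"
```

An `Encoding` with an empty surface list corresponds to what the text calls a pure charset.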

It is not always clear whether a given transformation yields a charset or a surface, especially for transformations that are only meaningful over a single charset. The Recode library is not overly picky about identifying surfaces as such: when it is practical to treat a specialised surface as if it were a charset, it does so.