8.3 ASCII 7-bits, BS to overstrike
This charset is available in Recode under the name ASCII-BS, with BS
as an acceptable alias.
The file is straight ASCII, seven bits only. According to the
definition of ASCII, diacritics are applied by a sequence of three
characters: the letter, one BS, the diacritic mark. We deviate slightly
from this by exchanging the diacritic mark and the letter so that, on a
screen device, the diacritic disappears and leaves the letter alone. At
recognition time, both orders are accepted.
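The three-character convention can be sketched in Python as follows. This is only an illustration, not Recode's implementation: the function names and the MARKS table are hypothetical, and only the apostrophe-for-acute pairing is taken from the example output shown in this section; the other marks are assumptions.

```python
import unicodedata

BS = "\b"  # ASCII backspace, code 8

# Hypothetical table of ASCII marks and the Unicode combining characters
# they stand for. Only the apostrophe (acute) is grounded in this
# section's example; the other entries are assumed for illustration.
MARKS = {
    "'": "\u0301",  # acute accent
    "`": "\u0300",  # grave accent (assumed)
    "^": "\u0302",  # circumflex (assumed)
}


def encode(ch):
    """Return mark, BS, letter for an accented letter, else ch unchanged."""
    decomposed = unicodedata.normalize("NFD", ch)
    if len(decomposed) == 2:
        base, combining = decomposed
        for mark, comb in MARKS.items():
            if comb == combining and base.isascii():
                # Mark first, so a screen device ends up showing the letter.
                return mark + BS + base
    return ch


def decode_triplet(seq):
    """Accept either order: mark-BS-letter or letter-BS-mark."""
    if len(seq) != 3 or seq[1] != BS:
        raise ValueError("not an overstrike sequence")
    first, last = seq[0], seq[2]
    if first in MARKS and last.isalpha():
        mark, letter = first, last
    elif last in MARKS and first.isalpha():
        mark, letter = last, first
    else:
        raise ValueError("unknown mark or letter")
    return unicodedata.normalize("NFC", letter + MARKS[mark])
```

Under these assumptions, encoding an e with an acute accent yields the codes 39, 8, 101, and decoding accepts that sequence as well as the reversed 101, 8, 39.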
The French quotes are coded by the sequences < BS " or " BS < for the
opening quote, and > BS " or " BS > for the closing quote. This
artificial convention was inherited in straight ASCII-BS from habits
around Bang-Bang entry, and is not well known. We decided to stick to
it so that the ASCII-BS charset does not lose French quotes.
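As a rough illustration of the four quote sequences, the following Python sketch maps each of them to a guillemet. The names QUOTES and decode_quotes are hypothetical, and choosing U+00AB and U+00BB as the targets is an assumption about what "French quotes" denotes here; plain string replacement is used for brevity and ignores ambiguous overlapping input.

```python
BS = "\b"  # ASCII backspace, code 8

# Both orders are listed for each quote, as described in the text.
# Target code points are assumed: U+00AB (opening), U+00BB (closing).
QUOTES = {
    "<" + BS + '"': "\u00ab",
    '"' + BS + "<": "\u00ab",
    ">" + BS + '"': "\u00bb",
    '"' + BS + ">": "\u00bb",
}


def decode_quotes(text):
    """Replace each three-character quote sequence with its guillemet."""
    for seq, quote in QUOTES.items():
        text = text.replace(seq, quote)
    return text
```

For instance, decode_quotes('<\b"Bonjour>\b"') would give the word Bonjour enclosed in an opening and a closing guillemet.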
The ASCII-BS charset is independent of ASCII, and different. The
following examples demonstrate this, knowing in advance that ‘!2’ is
the Bang-Bang way of representing an e with an acute accent. Compare:
% echo \!2 | recode -v bang..l1/d
Request: Bang-Bang..ISO-8859-1/Decimal-1
233, 10
with:
% echo \!2 | recode -v bang..bs/d
Request: Bang-Bang..ISO-8859-1..ASCII-BS/Decimal-1
39, 8, 101, 10
In the first case, the e with an acute accent is merely transmitted by
the Latin-1..ASCII mapping, which has no special recoding rule for it.
In the Latin-1..ASCII-BS case, the acute accent is applied over the e
with a backspace: diacriticised characters have special rules. For the
ASCII-BS charset, reversibility is still possible, but there might be
difficult cases.
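To read the two decimal dumps side by side, a small Python helper can turn each code into a readable label. The helper name describe and the control-code labels are hypothetical; the code lists are copied from the two outputs above.

```python
# Decimal codes copied from the two recode outputs in this section.
latin1_dump = [233, 10]
ascii_bs_dump = [39, 8, 101, 10]


def describe(codes):
    """Label control codes by name; show other codes as Latin-1 characters."""
    names = {8: "BS", 10: "LF"}
    return [names.get(c, bytes([c]).decode("latin-1")) for c in codes]
```

Here describe(latin1_dump) lists a single accented e followed by LF, while describe(ascii_bs_dump) lists an apostrophe, BS, a plain e, and LF.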