1 Quick Tutorial
So,
really, you just are in a hurry to use Recode, and
do not feel like studying this manual? Even reading
this paragraph slows you down? We might have a
problem, as you will have to do some guess work,
and might not become very proficient unless you
have a very solid intuition....
Let me use
here, as a quick tutorial, an actual reply of mine
to a Recode user, who writes:
My situation is this—I occasionally get email
with special characters in it. Sometimes this
mail is from a user using IBM software and
sometimes it is a user using Mac software. I
myself am on a SPARC Solaris machine.
Your
situation is similar to mine, except that I
often receive email needing recoding, that
is, much more than occasionally! The usual
recodings I do are Mac to Latin-1 , IBM
page codes to Latin-1, Easy-French to
Latin-1, remove Quoted-Printable,
remove Base64. These are so frequent that I made
myself a few two-keystroke Emacs commands to filter
the Emacs region. This is very convenient for me. I
also resort to many other email conversions, yet
more rarely than the frequent cases above.
It seems like this should be doable
using Recode. However, when I try something like
‘recode mac
macfile.txt’ I get nothing out—no
error, no output, nothing.
Note: For the following discussion
to be true, you should have something like
‘export
LANG=fr_FR.ISO-8859-1’ in your
environment, the important bit here being the
specification of an preferred charset.
Presuming
you are using some recent version of Recode, the
command:
recode mac macfile.txt
is a request for recoding
macfile.txt
over itself, overwriting the original, from
Macintosh usual character code and Macintosh end of
lines, to Latin-1 and Unix end of
lines. This is overwrite mode. If you want to use
Recode as a filter, which is probably what you
need, rather do:
recode mac
and give your Macintosh file as standard
input, you'll get the Latin-1
file on standard output. The above command is an
abbreviation for any of:
recode mac..
recode mac..l1
recode mac..Latin-1
recode mac/CR..Latin-1/
recode Macintosh..ISO_8859-1
recode Macintosh/CR..ISO_8859-1/
That is, a
CR surface, encoding newlines with
ASCII <CR>, is first to be removed (this is a
default surface for ‘mac’), then the Macintosh
charset is converted to Latin-1 and no
surface is added to the result (there is no default
surface for ‘l1’). If you want
‘mac’ code
converted, but you know that newlines are already
coded the Unix way, just do:
recode mac/
the slash then overriding the default
surface with empty, that is, none. Here are other
easy recipes:
recode pc to filter IBM-PC code and CR-LF (default) to Latin-1
recode pc/ to filter IBM-PC code to Latin-1
recode 850 to filter code page 850 and CR-LF (default) to Latin-1
recode 850/ to filter code page 850 to Latin-1
recode /qp to remove quoted printable
The last
one is indeed equivalent to any of:
recode /qp..
recode l1/qp..l1/
recode ISO_8859-1/Quoted-Printable..ISO_8859-1/
Here are
some reverse recipes:
recode ..mac to filter Latin-1 to Macintosh code and CR (default)
recode ..mac/ to filter Latin-1 to Macintosh code
recode ..pc to filter Latin-1 to IBM-PC code and CR-LF (default)
recode ..pc/ to filter Latin-1 to IBM-PC code
recode ..850 to filter Latin-1 to code page 850 and CR-LF (default)
recode ..850/ to filter Latin-1 to code page 850
recode ../qp to force quoted printable
In all the
above calls, replace ‘recode’ by ‘recode -f’ if you want to
proceed despite recoding errors. If you do not use
‘-f’ and
there is an error, the recoding output will be
interrupted after first error in filter mode, or
the file will not be replaced by a recoded copy in
overwrite mode.
You may use
‘recode -l’
to get a list of available charsets and surfaces,
and ‘recode
--help’ to get a quick summary of
options. The above output is meant for those having
already read this manual, so let me dare a
suggestion: why could not you find a few more
minutes in your schedule to peek further down,
right into the following chapters!
|