The MIME specification provides a mechanism to label the character set used in a plain text message. Specifically, a "charset=" parameter can be specified as part of the Content-type: header line. Various character set names are defined in MIME, including US-ASCII (the default), ISO-8859-1, ISO-8859-2, and so on. Additional character set names will undoubtedly be added to the list in the future.
Most existing systems and user agents, however, do not provide any
mechanism for generating these character set labels. In particular,
plain text messages sent from VMS MAIL are not properly labelled. The
charset7, charset8, and charsetesc channel keywords provide a per-channel
mechanism to specify character set names to be inserted into message
headers. Each keyword requires a single argument giving the character
set name. The names are not checked for validity. Note, however, that
character set conversion can only be done on character sets specified
in the PMDF character set definition file, charsets.txt,
found in the PMDF table directory (i.e.,
PMDF_TABLE:charsets.txt on OpenVMS,
/pmdf/table/charsets.txt on UNIX, or
C:\pmdf\table\charsets.txt on NT). The names defined in
this file should be used if possible.
The charset7 character set name is used if the message
contains only seven bit characters; the
charset8 name is
used if eight bit data is found in the message; the charsetesc name
is used if a message containing only seven bit data happens to
contain the escape character. If the appropriate keyword is not
specified, no character set name is inserted into the Content-type:
header line.
These character set specifications never override existing labels; that is, they have no effect if a message already has a character set label or is of a type other than text.
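The selection rules above can be sketched in a few lines of Python. This is only an illustration of the behaviour described here, not PMDF's actual implementation; the function name pick_charset_label and its parameters are hypothetical, and the check for "text" types is simplified to a prefix test on the content type.

```python
from typing import Optional

ESC = 0x1B  # ASCII escape character


def pick_charset_label(body: bytes,
                       existing_charset: Optional[str] = None,
                       content_type: str = "text/plain",
                       charset7: Optional[str] = None,
                       charset8: Optional[str] = None,
                       charsetesc: Optional[str] = None) -> Optional[str]:
    """Return the character set name to insert, or None to insert nothing."""
    # Existing labels are never overridden; non-text messages are left alone.
    if existing_charset is not None:
        return None
    if not content_type.lower().startswith("text/"):
        return None
    # Eight bit data present: use the charset8 name (None if not specified).
    if any(byte >= 0x80 for byte in body):
        return charset8
    # Seven bit data containing the escape character: use the charsetesc name.
    if ESC in body:
        return charsetesc
    # Plain seven bit data: use the charset7 name.
    return charset7
```

For example, with charset7 US-ASCII and charset8 ISO-8859-1 and no charsetesc, a seven bit message that contains the escape character gets no label at all, since the appropriate keyword was not specified.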
It is usually appropriate to label the PMDF local channel as follows:
l ... charset7 US-ASCII charset8 ISO-8859-1 ...
official-host-name
OpenVMS systems actually use the DEC Multinational Character Set (DEC-MCS). This character set is very close to ISO-8859-1, however, so this labelling will work well enough in most cases. If absolute accuracy is an issue, the local channel can be marked as using DEC-MCS:
l ... charset7 US-ASCII charset8 DEC-MCS ...
official-host-name
The charsetesc keyword tends to be particularly useful on
channels that receive unlabelled messages using Japanese or Korean
character sets that contain the escape character.
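To see why such messages need a keyword of their own, the following short Python sketch encodes a Japanese string in ISO-2022-JP, chosen here only as one example of an escape-based encoding. It uses Python's standard iso2022_jp codec and has nothing to do with PMDF itself. The encoded text contains no eight bit data, so charset8 would not apply, but it does contain the escape character.

```python
ESC = b"\x1b"  # ASCII escape character

# Encode a short Japanese string using Python's built-in ISO-2022-JP codec.
encoded = "日本語".encode("iso2022_jp")

# ISO-2022-JP switches character sets with escape sequences and
# uses only seven bit bytes for the text itself.
has_eight_bit = any(byte >= 0x80 for byte in encoded)
has_escape = ESC in encoded

print(has_eight_bit, has_escape)  # prints: False True
```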