typedef | EncaAnalyser |
struct | EncaEncoding |
#define | ENCA_CS_UNKNOWN |
enum | EncaSurface |
enum | EncaCharsetFlags |
enum | EncaNameStyle |
enum | EncaErrno |
#define | ENCA_NOT_A_CHAR |
typedef struct _EncaEncoding EncaEncoding;
Encoding, i.e. charset and surface.
This is what enca_analyse() and enca_analyse_const() return.
The charset field is an opaque numerical charset identifier that has no meaning outside the Enca library. You will probably want to use it only as an argument to enca_charset_name(). Its meaning is guaranteed to stay fixed only for the duration of program execution; a change in its interpretation (e.g. due to the addition of new charsets) is not considered an API change.
The surface field is a combination of EncaSurface flags. You may want to ignore it completely; in that case, use enca_set_interpreted_surfaces() to disable weird surfaces.
#define ENCA_CS_UNKNOWN (-1)
Unknown character set id.
Use enca_charset_is_known() to check for an unknown charset instead of a direct comparison.
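The advice above can be sketched in a self-contained way, without linking against the library. Everything prefixed with Demo or demo_ below is hypothetical: DemoEncoding is an assumed mirror of the charset/surface pair, and demo_charset_is_known() only stands in for enca_charset_is_known(). Real code would include enca.h and pass the charset to enca_charset_name() rather than printing the opaque number.

```c
#include <stdio.h>

/* Stand-in for ENCA_CS_UNKNOWN, whose value (-1) is given above. */
#define DEMO_CS_UNKNOWN (-1)

/* Hypothetical mirror of EncaEncoding: charset id plus surface flags. */
typedef struct {
    int charset;
    unsigned int surface;
} DemoEncoding;

/* Stand-in for enca_charset_is_known(): hides the comparison with the
 * unknown id so callers never depend on its numeric value. */
static int demo_charset_is_known(int charset)
{
    return charset != DEMO_CS_UNKNOWN;
}

/* Writes a short description of the result into buf and returns buf.
 * The opaque id is printed only for illustration; it carries no
 * meaning outside the library. */
static const char *demo_describe(DemoEncoding enc, char *buf, size_t size)
{
    if (!demo_charset_is_known(enc.charset))
        snprintf(buf, size, "unknown charset");
    else
        snprintf(buf, size, "charset id %d", enc.charset);
    return buf;
}
```

Routing every check through one helper keeps callers independent of the numeric value of the unknown id, which is the reason the text recommends against direct comparison.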
Surface flags.
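ENCA_SURFACE_EOL_CR | End-of-lines are represented with CR's. |
ENCA_SURFACE_EOL_LF | End-of-lines are represented with LF's. |
ENCA_SURFACE_EOL_CRLF | End-of-lines are represented with CRLF's. |
ENCA_SURFACE_EOL_MIX | Several end-of-line types, mixed. |
ENCA_SURFACE_EOL_BIN | End-of-line concept not applicable (binary data). |
ENCA_SURFACE_MASK_EOL | Mask for end-of-line surfaces. |
ENCA_SURFACE_PERM_21 | Odd and even bytes swapped. |
ENCA_SURFACE_PERM_4321 | Reversed byte sequence in 4byte words. |
ENCA_SURFACE_PERM_MIX | Chunks with both endianness, concatenated. |
ENCA_SURFACE_MASK_PERM | Mask for permutation surfaces. |
ENCA_SURFACE_QP | Quoted printables. |
ENCA_SURFACE_REMOVE | Recode `remove' surface. |
ENCA_SURFACE_UNKNOWN | Unknown surface. |
ENCA_SURFACE_MASK_ALL | Mask for all bits, without ENCA_SURFACE_UNKNOWN. |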
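Because the surface value is a combination of flags, a group mask is what isolates one aspect of it. The sketch below shows the idea for the end-of-line group. The DEMO_ constants are stand-ins with assumed bit positions, chosen only for this illustration; real code would use the EncaSurface values from enca.h.

```c
/* Illustrative stand-ins for a few surface flags. The bit positions
 * are assumptions for this sketch, not values taken from enca.h. */
enum {
    DEMO_EOL_CR   = 1 << 0,
    DEMO_EOL_LF   = 1 << 1,
    DEMO_EOL_CRLF = 1 << 2,
    DEMO_EOL_MIX  = 1 << 3,
    DEMO_EOL_BIN  = 1 << 4,
    DEMO_MASK_EOL = DEMO_EOL_CR | DEMO_EOL_LF | DEMO_EOL_CRLF
                  | DEMO_EOL_MIX | DEMO_EOL_BIN,
    DEMO_PERM_21  = 1 << 5   /* odd and even bytes swapped */
};

/* Keeps only the end-of-line bits of a surface value and names them.
 * Bits outside the mask (permutations and so on) are ignored. */
static const char *demo_eol_name(unsigned int surface)
{
    switch (surface & DEMO_MASK_EOL) {
    case DEMO_EOL_CR:   return "CR";
    case DEMO_EOL_LF:   return "LF";
    case DEMO_EOL_CRLF: return "CRLF";
    case DEMO_EOL_MIX:  return "mixed";
    case DEMO_EOL_BIN:  return "binary";
    default:            return "unspecified";
    }
}
```

With these stand-ins, demo_eol_name(DEMO_EOL_CRLF | DEMO_PERM_21) yields "CRLF": the permutation bit is masked out before the comparison.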
Charset properties.
Flags ENCA_CHARSET_7BIT, ENCA_CHARSET_8BIT, ENCA_CHARSET_16BIT, and ENCA_CHARSET_32BIT tell how many bits a `fundamental piece' consists of.
This is different from bits per character; e.g. UTF-8 consists of 8bit pieces (bytes), but a character can be composed of 1 to 6 of them.
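To make the piece-versus-character distinction concrete, the following sketch derives the number of bytes in a UTF-8 sequence from its first byte, using the original 1-to-6-byte definition that the text refers to. The function name demo_utf8_sequence_length() is hypothetical and is not part of Enca. Note that the current UTF-8 standard (RFC 3629) limits sequences to 4 bytes.

```c
/* Returns the number of bytes in a UTF-8 sequence starting with `lead`
 * under the original 1-to-6-byte definition, or 0 if `lead` is a
 * continuation byte or an invalid lead byte (0xFE, 0xFF). */
static int demo_utf8_sequence_length(unsigned char lead)
{
    if (lead < 0x80) return 1;   /* 0xxxxxxx */
    if (lead < 0xC0) return 0;   /* 10xxxxxx: continuation byte */
    if (lead < 0xE0) return 2;   /* 110xxxxx */
    if (lead < 0xF0) return 3;   /* 1110xxxx */
    if (lead < 0xF8) return 4;   /* 11110xxx */
    if (lead < 0xFC) return 5;   /* 111110xx */
    if (lead < 0xFE) return 6;   /* 1111110x */
    return 0;                    /* 0xFE and 0xFF never occur */
}
```

Every piece here is 8 bits wide, yet the character length varies with the lead byte, which is why the bit-width flags say nothing about bits per character.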
ENCA_CHARSET_7BIT | Characters are represented with 7bit characters. |
ENCA_CHARSET_8BIT | Characters are represented with bytes. |
ENCA_CHARSET_16BIT | Characters are represented with 2byte words. |
ENCA_CHARSET_32BIT | Characters are represented with 4byte words. |
ENCA_CHARSET_FIXED | One character consists of one fundamental piece. |
ENCA_CHARSET_VARIABLE | One character consists of variable number of fundamental pieces. |
ENCA_CHARSET_BINARY | Charset is binary from ASCII viewpoint. |
ENCA_CHARSET_REGULAR | Language dependent (8bit) charset. |
ENCA_CHARSET_MULTIBYTE | Multibyte charset. |
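The properties listed above combine a piece-size flag with a fixed-or-variable flag. As a rough sketch of how a caller might read such a combination, the fragment below uses DEMO_ stand-ins with assumed bit positions; the helper names are hypothetical, and real code would test the EncaCharsetFlags values from enca.h instead.

```c
/* Illustrative stand-ins for the size and composition properties.
 * Bit positions are assumptions for this sketch only. */
enum {
    DEMO_7BIT     = 1 << 0,
    DEMO_8BIT     = 1 << 1,
    DEMO_16BIT    = 1 << 2,
    DEMO_32BIT    = 1 << 3,
    DEMO_FIXED    = 1 << 4,  /* one character is one fundamental piece */
    DEMO_VARIABLE = 1 << 5   /* one character is a variable number */
};

/* Bits in one fundamental piece, or 0 if no size flag is set. */
static int demo_piece_bits(unsigned int flags)
{
    if (flags & DEMO_7BIT)  return 7;
    if (flags & DEMO_8BIT)  return 8;
    if (flags & DEMO_16BIT) return 16;
    if (flags & DEMO_32BIT) return 32;
    return 0;
}

/* Nonzero when the piece size alone gives the size of a character. */
static int demo_piece_is_character(unsigned int flags)
{
    return (flags & DEMO_FIXED) && !(flags & DEMO_VARIABLE);
}
```

For a UTF-8-like combination (DEMO_8BIT | DEMO_VARIABLE), demo_piece_bits() gives 8 while demo_piece_is_character() is zero; for a fixed 16bit combination (DEMO_16BIT | DEMO_FIXED), the piece is the character.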
Charset naming styles and conventions.
Error codes.