A report on Character encoding

Punched tape with the word "Wikipedia" encoded in ASCII. Presence and absence of a hole represents 1 and 0, respectively; for example, "W" is encoded as "1010111".
Hollerith 80-column punch card with EBCDIC character set

Character encoding is the process of assigning numbers to graphical characters, especially the written characters of human language, so that they can be stored, transmitted, and transformed using digital computers.
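
As a minimal illustration of assigning numbers to characters, the following Python sketch converts each character of "Wikipedia" to its 7-bit ASCII pattern. The helper name `to_ascii_bits` is an assumption made for this example and does not come from any standard.

```python
def to_ascii_bits(text: str) -> list[str]:
    """Return the 7-bit ASCII pattern for each character in text."""
    data = text.encode("ascii")  # raises UnicodeEncodeError for non-ASCII input
    return [format(byte, "07b") for byte in data]


bits = to_ascii_bits("Wikipedia")
print(bits[0])    # 1010111, the pattern for "W" (decimal 87)
print(len(bits))  # 9, one pattern per character
```

Each 1 in a pattern corresponds to a hole on the punched tape and each 0 to the absence of one.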

67 related topics with Alpha

Windows-1251

3 links

Windows-1251 is an 8-bit character encoding designed to cover languages that use the Cyrillic script, such as Russian, Ukrainian, Belarusian, Bulgarian, Serbian, and Macedonian.
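
A short sketch of what "8-bit" means in practice, assuming Python's built-in `cp1251` codec as the implementation of Windows-1251: every Cyrillic letter becomes exactly one byte, whereas UTF-8 needs two. The sample word is chosen only for illustration.

```python
text = "Привет"

cp1251_bytes = text.encode("cp1251")  # Python's codec name for Windows-1251
utf8_bytes = text.encode("utf-8")

print(len(text), len(cp1251_bytes), len(utf8_bytes))  # 6 6 12
print(cp1251_bytes.hex(" "))                          # one byte per letter

# Decoding with the same codec restores the original text.
round_trip = cp1251_bytes.decode("cp1251")
print(round_trip == text)  # True
```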

Binary Ordered Compression for Unicode

1 link

A MIME-compatible Unicode compression scheme.

This Unicode encoding is designed to be useful for compressing short strings, and maintains code point order.

GB 2312

4 links

GB/T 2312-1980 is a key official character set of the People's Republic of China, used for Simplified Chinese characters.
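
The sketch below shows one common way the character set is used in practice, assuming Python's built-in `gb2312` codec (an EUC-CN style byte encoding of the set). The helper `fits_gb2312` is a hypothetical name introduced for this example.

```python
def fits_gb2312(text: str) -> bool:
    """Report whether text can be encoded with Python's gb2312 codec."""
    try:
        text.encode("gb2312")
        return True
    except UnicodeEncodeError:
        return False


text = "中文"
encoded = text.encode("gb2312")

print(len(text), len(encoded))            # 2 4: two bytes per Chinese character
print(encoded.decode("gb2312") == text)   # True
print(fits_gb2312("A"), fits_gb2312("中"))
```

A check like `fits_gb2312` can be useful before writing text to a legacy system, since characters outside the set cannot be represented at all.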

KS X 1001

2 links

Various CJK encodings, including four based on KS X 1001, supported by Mozilla Firefox as of 2004. (This support has been reduced in later versions to avoid certain cross site scripting attacks.)
Diagram of Johab encoding as stipulated by KS X 1001
Layout of EBCDIC-based Johab variant when in double-byte state

KS X 1001, "Code for Information Interchange (Hangul and Hanja)", formerly called KS C 5601, is a South Korean coded character set standard to represent hangul and hanja characters on a computer.
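
As a rough illustration, Python's built-in `euc_kr` codec can be used to see how hangul and hanja from this character set sit alongside ASCII in one byte stream. EUC-KR is only one of the encodings built on KS X 1001, and the sample string is an assumption made for the example.

```python
text = "한글 漢字"  # two hangul syllables, a space, two hanja

encoded = text.encode("euc_kr")

# ASCII characters stay single-byte; hangul and hanja take two bytes each.
for ch in text:
    print(repr(ch), len(ch.encode("euc_kr")))

decoded = encoded.decode("euc_kr")
print(decoded == text)  # True
```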

ISO/IEC 8859-8

2 links

ISO/IEC 8859-8, Information technology — 8-bit single-byte coded graphic character sets — Part 8: Latin/Hebrew alphabet, is part of the ISO/IEC 8859 series of ASCII-based standard character encodings.

Baudot code

1 link

An early version from Baudot's 1888 US patent, listing A through Z, and ∗ (Erasure)
An early "piano" Baudot keyboard
British variant of ITA2
Paper tape with holes representing the "Baudot–Murray Code". Note the fully punched columns of "Delete/Letters select" codes at start of the message (on the right) which were used to cut the band easily between distinct messages. The message then starts with a figure shift control followed by a carriage return.
Keyboard of a teleprinter using the Baudot code (US variant), with FIGS and LTRS shift keys
Weather teleprinter encoding
Table of ITA2 codes (expressed as hexadecimal numbers)
A four-row teletype keyboard with Roman and Cyrillic letters.

The Baudot code is an early character encoding for telegraphy invented by Émile Baudot in the 1870s.

ISO/IEC 8859-2

2 links

ISO/IEC 8859-2:1999, Information technology — 8-bit single-byte coded graphic character sets — Part 2: Latin alphabet No. 2, is part of the ISO/IEC 8859 series of ASCII-based standard character encodings. Its first edition was published in 1987.

ISO/IEC 8859-4

2 links

ISO/IEC 8859-4:1998, Information technology — 8-bit single-byte coded graphic character sets — Part 4: Latin alphabet No. 4, is part of the ISO/IEC 8859 series of ASCII-based standard character encodings. Its first edition was published in 1988.

MIME

0 links

Multipurpose Internet Mail Extensions (MIME) is an Internet standard that extends the format of email messages to support text in character sets other than ASCII, as well as attachments of audio, video, images, and application programs.
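
A minimal sketch of these extensions using Python's standard `email` package: a non-ASCII subject, a UTF-8 body, and a binary attachment are combined into one message. The addresses, file name, and message text are placeholders invented for the example.

```python
from email import message_from_bytes, policy
from email.message import EmailMessage

msg = EmailMessage()
msg["From"] = "sender@example.com"      # placeholder address
msg["To"] = "recipient@example.com"     # placeholder address
msg["Subject"] = "Привет"               # non-ASCII header text

# Body text in a character set other than ASCII, transferred as base64.
msg.set_content("Здравствуйте!", charset="utf-8", cte="base64")

# Arbitrary binary data attached as a separate MIME part.
msg.add_attachment(b"\x00\x01\x02", maintype="application",
                   subtype="octet-stream", filename="data.bin")

raw = msg.as_bytes()  # serialized form suitable for transport
parsed = message_from_bytes(raw, policy=policy.default)

print(parsed.get_content_type())
print(parsed["Subject"])
print(parsed.get_body().get_content().strip())
```

Parsing the serialized bytes back shows that the subject and body text survive the trip, even though the wire format declares its character set explicitly rather than assuming ASCII.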

ISO/IEC 8859-3

2 links

ISO/IEC 8859-3:1999, Information technology — 8-bit single-byte coded graphic character sets — Part 3: Latin alphabet No. 3, is part of the ISO/IEC 8859 series of ASCII-based standard character encodings. Its first edition was published in 1988.
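
To illustrate what "ASCII-based" and "single-byte" imply for the ISO/IEC 8859 parts, the sketch below decodes the same two bytes with Python's built-in `iso8859_2`, `iso8859_3`, and `iso8859_4` codecs. The helper `decode_with_each` and the sample bytes are assumptions made for this example.

```python
def decode_with_each(raw: bytes, codecs: list[str]) -> dict[str, str]:
    """Decode the same bytes under several codecs and collect the results."""
    return {name: raw.decode(name) for name in codecs}


PARTS = ["iso8859_2", "iso8859_3", "iso8859_4"]
raw = bytes([0x41, 0xA1])  # "A" from the ASCII half, plus one upper-half byte

results = decode_with_each(raw, PARTS)
for name, text in results.items():
    print(name, text, [hex(ord(ch)) for ch in text])
```

The first byte decodes to "A" under every part because the lower half is shared with ASCII, while the meaning of the second byte depends on which part is assumed. This is why the encoding must be known before such bytes can be interpreted.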