File format

A file format is a standard way that information is encoded for storage in a computer file.

Computer file

There are different types of computer files, designed for different purposes.

Proprietary format

File formats may be either proprietary or free and may be either unpublished or open.
A proprietary format is a file format of a company, organization, or individual in which data is ordered and stored according to a particular encoding scheme. The scheme is designed by the company or organization to be secret, so that decoding and interpreting the stored data is easily accomplished only with particular software or hardware that the company itself has developed.


Ogg
Other file formats, however, are designed for storage of several different types of data: the Ogg format can act as a container for different types of multimedia including any combination of audio and video, with or without text (such as subtitles), and metadata.
Ogg has come to stand for the file format, as part of the larger multimedia project.


Exchangeable image file format
For example, most image files store information about image format, size, resolution and color space, and optionally authoring information such as who made the image, when and where it was made, what camera model and photographic settings were used (Exif), and so on.
Exchangeable image file format (officially Exif, according to JEIDA/JEITA/CIPA specifications) is a standard that specifies the formats for images, sound, and ancillary tags used by digital cameras (including smartphones), scanners and other systems handling image and sound files recorded by digital cameras.

Filename extension

This portion of the filename is known as the filename extension.
It is more common, especially in binary files, for the file itself to contain internal metadata describing its contents.
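As a minimal sketch of the extension-based approach, the snippet below pulls the extension out of a filename with Python's standard library. The helper name `extension_of` is hypothetical, and only the final dotted suffix is considered.

```python
from pathlib import PurePath


def extension_of(filename: str) -> str:
    """Return the last extension without its dot, lowercased ('' if none)."""
    return PurePath(filename).suffix.lstrip(".").lower()


if __name__ == "__main__":
    for name in ("report.PDF", "archive.tar.gz", "README"):
        print(name, "->", repr(extension_of(name)))
```

Because the result depends only on the name, it says nothing about what the file actually contains.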

Digital container format

A container or wrapper format is a metafile format whose specification describes how different elements of data and metadata coexist in a computer file.
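To illustrate how a container can hold several kinds of data side by side, here is a sketch that walks a deliberately simplified chunk layout: a 4-byte ASCII identifier, a 4-byte big-endian length, then the payload. This layout and the function name `iter_chunks` are assumptions made for the example; real container formats add padding, nesting, and checksums.

```python
import struct


def iter_chunks(data: bytes):
    """Yield (identifier, payload) pairs from a simplified chunked blob."""
    offset = 0
    while offset + 8 <= len(data):
        # Each chunk starts with a 4-byte id and a 4-byte big-endian size.
        chunk_id, size = struct.unpack(">4sI", data[offset:offset + 8])
        payload = data[offset + 8:offset + 8 + size]
        if len(payload) < size:
            raise ValueError("truncated chunk")
        yield chunk_id.decode("ascii"), payload
        offset += 8 + size
```

A reader that does not understand a given identifier can skip that chunk by its length, which is what lets data and metadata coexist in one file.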


Extensible Markup Language
HTML files, for example, might begin with the string <html> (which is not case sensitive), or an appropriate document type definition that starts with <!DOCTYPE HTML>, or, for XHTML, the XML identifier, which begins with <?xml.
Extensible Markup Language (XML) is a markup language that defines a set of rules for encoding documents in a format that is both human-readable and machine-readable.
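The leading strings mentioned above can be checked with a few lines of code. The sketch below is an assumption-laden illustration, not a full content sniffer: the function name `sniff_markup` is hypothetical, it inspects only the first bytes, and it handles a UTF-8 byte-order mark but no other encodings.

```python
UTF8_BOM = b"\xef\xbb\xbf"


def sniff_markup(data: bytes) -> str:
    """Guess 'xml', 'html', or 'unknown' from the start of the data."""
    if data.startswith(UTF8_BOM):
        data = data[len(UTF8_BOM):]
    # Lowercasing makes the <html> and DOCTYPE checks case-insensitive.
    head = data.lstrip()[:64].lower()
    if head.startswith(b"<?xml"):
        return "xml"
    if head.startswith(b"<!doctype html") or head.startswith(b"<html"):
        return "html"
    return "unknown"
```

An XHTML document that begins with the XML identifier is reported as `xml` here; telling the two apart would require looking further into the document.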

Amiga Hunk

Another operating system using magic numbers is AmigaOS, where magic numbers were called "Magic Cookies" and were adopted as a standard system to recognize executables in Hunk executable file format and also to let single programs, tools and utilities deal automatically with their saved data files, or any other kind of file types when saving and loading data.
Hunk is the executable file format used by tools and programs of the Amiga operating system, which is based on the Motorola 68000 CPU and other processors of the same family.
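A magic-number check of this kind can be sketched as follows. The constant `0x000003F3` is the value commonly documented for the header hunk of an Amiga load file; it does not come from the text above, so treat it as an assumption. The function name is hypothetical.

```python
import struct

# Assumed magic cookie for the header hunk of an Amiga executable.
HUNK_HEADER = 0x000003F3


def looks_like_hunk_executable(data: bytes) -> bool:
    """Return True if the first four bytes match the assumed magic cookie."""
    if len(data) < 4:
        return False
    # The 68000 family is big-endian, so read the value as big-endian.
    (magic,) = struct.unpack(">I", data[:4])
    return magic == HUNK_HEADER
```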

Plain text

One artifact of this approach is that the system can easily be tricked into treating a file as a different format simply by renaming it—an HTML file can, for instance, be easily treated as plain text by renaming it from filename.html to filename.txt.
The purpose of using plain text today is primarily independence from programs that require their very own special encoding or formatting or file format.


Another method was the FourCC method, originating in OSType on Macintosh, later adapted by Interchange File Format (IFF) and derivatives.
A FourCC ("four-character code") is a sequence of four bytes (typically ASCII) used to uniquely identify data formats.
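Since a FourCC is just four bytes, it can be treated either as text or as a 32-bit integer. The sketch below converts between the two; the big-endian byte order and both function names are assumptions chosen for illustration.

```python
def fourcc_to_int(code: str) -> int:
    """Pack a four-character ASCII code into a 32-bit integer (big-endian)."""
    raw = code.encode("ascii")
    if len(raw) != 4:
        raise ValueError("a FourCC must be exactly four bytes")
    return int.from_bytes(raw, "big")


def int_to_fourcc(value: int) -> str:
    """Unpack a 32-bit integer back into its four-character code."""
    return value.to_bytes(4, "big").decode("ascii")


if __name__ == "__main__":
    print(hex(fourcc_to_int("FORM")))
```

Comparing the integer form is a single machine-word comparison, which is part of the appeal of fixed four-byte identifiers.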


The ".exe" would be hidden and an unsuspecting user would see "Holiday photo.jpg", which would appear to be a JPEG image, usually unable to harm the machine.
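A simple defensive check for this trick might look like the sketch below. The two extension sets are short illustrative lists, not a complete policy, and the function name is hypothetical.

```python
from pathlib import PurePath

# Illustrative lists only; a real policy would be far longer.
EXECUTABLE_EXTENSIONS = {".exe", ".com", ".bat", ".scr"}
HARMLESS_LOOKING_EXTENSIONS = {".jpg", ".jpeg", ".png", ".txt", ".pdf"}


def has_misleading_double_extension(filename: str) -> bool:
    """Flag names like 'photo.jpg.exe' whose real extension is executable."""
    suffixes = [s.lower() for s in PurePath(filename).suffixes]
    return (
        len(suffixes) >= 2
        and suffixes[-1] in EXECUTABLE_EXTENSIONS
        and suffixes[-2] in HARMLESS_LOOKING_EXTENSIONS
    )
```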


Portable Document Format
Some common and standard types use a domain called public (e.g. public.png for a Portable Network Graphics image), while other domains can be used for third-party types (e.g. com.adobe.pdf for Portable Document Format).
The Portable Document Format (PDF) (redundantly: PDF format) is a file format developed by Adobe in the 1990s to present documents, including text formatting and images, in a manner independent of application software, hardware, and operating systems.


PRONOM
The PRONOM Persistent Unique Identifier (PUID) is an extensible scheme of persistent, unique and unambiguous identifiers for file formats, which has been developed by The National Archives of the UK as part of its PRONOM technical registry service.
PRONOM was the first and remains, to date, the only operational public file format registry in the world, although the "Magic File" repository of the File Command has served this role in a less formal capacity for two decades.

File system

In the original FAT file system, file names were limited to an eight-character identifier and a three-character extension, known as an 8.3 filename.
On macOS, the file type can come from the type code, stored in the file's metadata, or from the filename extension.
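The 8.3 length limits described above can be checked with a short function. This sketch tests lengths only; it does not enforce FAT's rules about permitted characters or case, and the name `fits_8_3` is hypothetical.

```python
def fits_8_3(filename: str) -> bool:
    """Return True if the name fits an 8-character base and 3-character extension."""
    base, _dot, ext = filename.partition(".")
    if "." in ext:
        # More than one dot cannot be expressed as a single 8.3 name.
        return False
    return 1 <= len(base) <= 8 and len(ext) <= 3
```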

Comma-separated values

For example, word processors such as troff, Script, and Scribe, and database export files such as CSV.
The use of the comma as a field separator is the source of the name for this file format.
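As a small illustration of comma-separated fields, the sketch below parses CSV text with Python's standard `csv` module. The sample data and the helper name `parse_csv` are made up for the example.

```python
import csv
import io


def parse_csv(text: str) -> list[list[str]]:
    """Split CSV text into rows of fields, honouring quoted commas."""
    return list(csv.reader(io.StringIO(text)))


if __name__ == "__main__":
    sample = 'name,format\nreport,PDF\n"Smith, J.",CSV\n'
    for row in parse_csv(sample):
        print(row)
```

Quoting is what allows a field such as `"Smith, J."` to contain the separator itself.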

Header (computing)

The metadata contained in a file header are usually stored at the start of the file, but might be present in other areas too, often including the end, depending on the file format or the type of data contained.
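As one concrete example of header metadata at the start of a file, the sketch below reads the image dimensions from a PNG header. PNG is chosen here only as a well-documented illustration: an 8-byte signature is followed by an IHDR chunk whose first two fields are width and height. The function name `read_png_size` is hypothetical.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def read_png_size(data: bytes) -> tuple[int, int]:
    """Return (width, height) from the IHDR chunk at the start of a PNG."""
    if len(data) < 24 or not data.startswith(PNG_SIGNATURE):
        raise ValueError("not a PNG header")
    # After the signature: 4-byte chunk length, then 4-byte chunk type.
    _length, chunk_type = struct.unpack(">I4s", data[8:16])
    if chunk_type != b"IHDR":
        raise ValueError("first chunk is not IHDR")
    width, height = struct.unpack(">II", data[16:24])
    return width, height
```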

Amiga support and maintenance software

This system was then enhanced with the Amiga standard Datatype recognition system.
Modern Amiga-like operating systems such as AmigaOS 4.0 and MorphOS can also handle MIME types.


JavaScript Object Notation
JavaScript Object Notation (JSON) is an open-standard file format that uses human-readable text to transmit data objects consisting of attribute–value pairs and array data types (or any other serializable value).
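A short round trip shows attribute–value pairs and arrays in practice. The record contents and the helper name `round_trip` are invented for illustration; the calls are from Python's standard `json` module.

```python
import json


def round_trip(record: dict) -> dict:
    """Serialize a record to JSON text and parse it back."""
    text = json.dumps(record)
    return json.loads(text)


if __name__ == "__main__":
    record = {"name": "report.pdf", "size": 1024, "tags": ["draft", "scan"]}
    print(json.dumps(record, indent=2))
    print(round_trip(record) == record)
```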

Audio file format

An audio file format is a file format for storing digital audio data on a computer system.


RISC OS
RISC OS uses a similar system, consisting of a 12-bit number which can be looked up in a table of descriptions—e.g. the hexadecimal number FF5 is "aliased" to PoScript, representing a PostScript file.
To determine file type, the OS uses metadata instead of file extensions.
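The table lookup can be sketched as a dictionary keyed by the 12-bit number. Only the FF5 entry is taken from the text above; the fallback formatting and the function name `describe_filetype` are assumptions.

```python
# Only the FF5 entry comes from the text; a real table has many more.
FILETYPE_NAMES = {0xFF5: "PoScript"}


def describe_filetype(filetype: int) -> str:
    """Return the alias for a 12-bit file type, or its hex form if unknown."""
    if not 0 <= filetype <= 0xFFF:
        raise ValueError("file type must fit in 12 bits")
    return FILETYPE_NAMES.get(filetype, f"&{filetype:03X}")
```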

Comparison of executable file formats

This is a comparison of binary executable file formats which, once loaded by a suitable executable loader, can be directly executed by the CPU rather than being interpreted by software.

Digital preservation

Although not yet widely used outside of UK government and some digital preservation programmes, the PUID scheme does provide greater granularity than most alternative schemes.
This may include conversion of resources from one file format to another (e.g., conversion of Microsoft Word to PDF or OpenDocument) or from one operating system to another (e.g., Windows to Linux) so the resource remains fully accessible and functional.

Data conversion

Data conversion is the conversion of computer data from one format to another.
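A minimal sketch of converting data from one format to another, here from CSV text to JSON text, using only Python's standard library. The function name `csv_to_json` is hypothetical, and the first CSV row is assumed to hold the column names.

```python
import csv
import io
import json


def csv_to_json(csv_text: str) -> str:
    """Convert CSV text with a header row into a JSON array of objects."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    return json.dumps(rows)
```

Every value comes out as a string, because CSV carries no type information; that kind of loss or ambiguity is typical when converting between formats.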

List of archive formats

This is a list of file formats used by archivers and compressors used to create archive files.

List of file signatures

This is a list of file signatures, data used to identify or verify the content of a file.