A text file is a computer file that stores a typed document as a series of characters (letters, digits, punctuation, and spacing), usually without visual formatting information. The content may be a personal note or list, a journal or newspaper article, a book, or any other text that can be rendered accurately in typewritten form. Text files are similar to word processing files in that the content of both is primarily textual; they differ in that text files usually do not record information such as character style and size, pagination, or other details that would specify the appearance of a finished document. Some computer operating systems make a basic distinction between a text file, which is intended to be translated directly into human-readable text, and a binary file, which is interpreted directly by the computer.
In most of the schemes used for encoding text, each character is assigned a numeric value, and the text is then written as a string of binary numbers. One such scheme, the American Standard Code for Information Interchange (ASCII), became a widely used standard early in the history of computing, despite its poor support for languages other than English. The ISO 8859 family of codes provides much better support for languages based on the Latin alphabet and similar alphabets, but it cannot encode the characters of East Asian languages such as Japanese, which has led to a proliferation of incompatible standards.
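As a rough illustration of the paragraph above, the following Python sketch encodes a short word under ASCII and under ISO 8859-1. The sample strings and variable names are chosen only for this example; the codec names are the ones Python's standard library uses.

```python
text = "Café"

# Each character has a numeric value.
code_points = [ord(ch) for ch in text]

# ASCII covers only the values 0-127, so the accented letter cannot be encoded.
try:
    text.encode("ascii")
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False

# ISO 8859-1 stores each of these four characters in a single byte.
latin1_bytes = text.encode("iso-8859-1")

# The same bytes written out as a string of binary numbers.
binary_string = " ".join(f"{b:08b}" for b in latin1_bytes)

# Japanese characters fall outside ISO 8859-1 entirely.
try:
    "日本語".encode("iso-8859-1")
    japanese_ok = True
except UnicodeEncodeError:
    japanese_ok = False

print(code_points)    # [67, 97, 102, 233]
print(binary_string)  # 01000011 01100001 01100110 11101001
print(ascii_ok, japanese_ok)  # False False
```

The accented letter fits in ISO 8859-1 but not in ASCII, while the Japanese text fits in neither, which is the gap the later standards set out to close.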
More recently, the Unicode Consortium has been developing an encoding system called Unicode that has the goal of assigning a unique number to every character used in every language on earth. This allows a single code to be used for every language, and allows texts from multiple languages to appear in a single file. The first 256 values of Unicode match ISO 8859-1, whose first 128 values in turn match ASCII. Using Unicode can have advantages even in English-speaking countries, as text encoded using older schemes may display minor inconsistencies when moved from system to system.
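The sketch below, again in Python, illustrates these points under one assumption not stated above: it uses UTF-8, one of several ways of writing Unicode values as bytes. The sample strings are made up for the example.

```python
# Text from several languages can share one Unicode-encoded file.
mixed = "English, Ελληνικά, 日本語"
utf8_bytes = mixed.encode("utf-8")
round_trip = utf8_bytes.decode("utf-8")

# The first 256 Unicode values coincide with ISO 8859-1.
latin1_matches = all(
    bytes([n]).decode("iso-8859-1") == chr(n) for n in range(256)
)

# Pure ASCII text produces identical bytes under ASCII and UTF-8.
ascii_same = "plain text".encode("ascii") == "plain text".encode("utf-8")

# Reading bytes with the wrong scheme yields the kind of inconsistency
# seen when files move between systems.
garbled = "café".encode("utf-8").decode("iso-8859-1")

print(round_trip == mixed)  # True
print(latin1_matches)       # True
print(ascii_same)           # True
print(garbled)              # cafÃ©
```

The last line shows why mismatched encodings matter: the file's bytes are unchanged, but a reader that assumes a different scheme displays the accented letter as two unrelated characters.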
Advantages of text files include small size and versatility. Because they are often kilobytes or megabytes smaller than the same content stored in other formats, they can be exchanged quickly and in large numbers via email or disk. Most can be opened on computers running diverse operating systems, using very basic software. The primary disadvantage is the lack of formatting. A text file may be a poor choice for representing a document that contains images or that relies on design elements to communicate its meaning, such as a file containing tabular data, mathematical formulas, or concrete poetry.
Text files are generally intended to be read and edited by humans, but not all of them contain content that is primarily for human consumption. Most programming code is stored in a text file prior to being compiled, that is, translated into a machine-readable binary file. Files may also contain machine-readable textual tags that give formatting information in addition to plain text. For instance, a Hypertext Markup Language (HTML) file can be opened as a plain text file in a text editor, or displayed as a formatted web page after being interpreted by a web browser. Similar schemes include LaTeX, used for laying out scientific papers, and Extensible Markup Language (XML), used for structuring data.
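To make the HTML case concrete, here is a minimal Python sketch that treats the same string two ways: as plain text, tags included, and as markup whose tags are interpreted and dropped. The sample markup and the `TextExtractor` class are hypothetical, written only for this illustration; a real browser does far more than this.

```python
from html.parser import HTMLParser

# What a text editor would show: the tags are ordinary characters.
source = "<p>Text files can carry <b>markup</b> tags.</p>"


class TextExtractor(HTMLParser):
    """Collects only the text between tags (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        # Called by the parser for each run of text outside a tag.
        self.parts.append(data)


extractor = TextExtractor()
extractor.feed(source)
extractor.close()
visible_text = "".join(extractor.parts)

print(source)        # <p>Text files can carry <b>markup</b> tags.</p>
print(visible_text)  # Text files can carry markup tags.
```

Both views come from the same characters in the file; only the program reading them decides whether the tags are content or instructions.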