Written Languages

From Just Solve the File Format Problem

Revision as of 01:21, 29 October 2012


Writing dates back to approximately the 4th millennium BC, and marks the boundary between "prehistoric" and "historic" times.

Written language generally consists of a set of symbols (alphabetic, ideographic, or other) that represents an underlying language (usually derived from a spoken language). These symbols are in turn given a physical or electronic representation, either as marks on a medium (such as paper) or as digitally encoded characters via a Character Encoding. Written language on physical media can also be digitized as graphics. The process of converting an image of written text into digitized characters (for further processing or indexing) is known as Optical Character Recognition (OCR).
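As a rough illustration of the step from abstract symbols to digitally encoded characters, the sketch below serializes one short string under three encodings that ship with Python. The sample word and the helper name `encoded_forms` are chosen here for illustration only.

```python
# The same abstract text, serialized under different character encodings.
# The sample word and the helper name are illustrative assumptions.
text = "n\u00e9e"  # "née": the middle letter is outside ASCII


def encoded_forms(s):
    """Return the bytes of s under a few common encodings."""
    return {name: s.encode(name) for name in ("utf-8", "utf-16-le", "latin-1")}


forms = encoded_forms(text)
for name, data in forms.items():
    # Show the byte count and a hex dump for each encoding.
    print(f"{name:10} {len(data)} bytes  {data.hex(' ')}")
```

The three byte sequences differ, yet each decodes back to the same three characters, which is the sense in which the encoding is a representation layer below the writing system.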

One should keep in mind the distinction between the abstract characters of a writing system and the specific "glyphs" that may represent them visually; the latter can vary by font style and exist in a variety of printed and handwritten versions. Just what is a "separate character" versus a stylistic variation on one can be a somewhat arbitrary distinction; the letters "i" and "j" were at one point considered variations on a single letter of the Latin alphabet, while there continues to be controversy over which characters in the Chinese, Japanese, and Korean writing systems should be considered distinct.
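One place where the character-versus-variant question shows up in practice is Unicode compatibility normalization, which treats certain code points as presentation variants of others. The sketch below uses Python's standard `unicodedata` module; the helper name `fold_compat` is hypothetical, and NFKC is only one convention for drawing the line, not a definitive answer to it.

```python
import unicodedata


def fold_compat(s):
    """Fold compatibility variants to their base characters (NFKC)."""
    return unicodedata.normalize("NFKC", s)


# A ligature and a fullwidth letter are encoded as their own code points,
# but NFKC treats them as stylistic variants of ordinary letters.
for ch in ("\ufb01", "\uff21"):
    print(unicodedata.name(ch), "->", fold_compat(ch))

# By contrast, "i" and "j" are distinct characters and are left alone.
print(fold_compat("i") == fold_compat("j"))
```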

While writing is, in its basic form, a visual medium, various non-visual representations also exist, such as the tactile code of Braille and the auditory Morse code. These are different "file formats" for the purpose of this site, but they map onto underlying writing systems which are common to varied representations of the same language.
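The mapping from a non-visual code onto an underlying writing system can be sketched with International Morse code for the 26 basic Latin letters. The function names `to_morse` and `from_morse` are made up for this example, and the conventions used here (one space between letters, " / " between words) are assumptions of the sketch rather than part of any file format.

```python
# International Morse code for the basic Latin letters.
MORSE = {
    "A": ".-", "B": "-...", "C": "-.-.", "D": "-..", "E": ".",
    "F": "..-.", "G": "--.", "H": "....", "I": "..", "J": ".---",
    "K": "-.-", "L": ".-..", "M": "--", "N": "-.", "O": "---",
    "P": ".--.", "Q": "--.-", "R": ".-.", "S": "...", "T": "-",
    "U": "..-", "V": "...-", "W": ".--", "X": "-..-", "Y": "-.--",
    "Z": "--..",
}
LETTERS = {code: letter for letter, code in MORSE.items()}


def to_morse(text):
    """Encode Latin letters; words are separated by ' / '."""
    words = text.upper().split()
    return " / ".join(" ".join(MORSE[ch] for ch in word) for word in words)


def from_morse(signal):
    """Decode a string produced by to_morse back to Latin letters."""
    words = signal.split(" / ")
    return " ".join("".join(LETTERS[c] for c in word.split()) for word in words)


print(to_morse("SOS"))
print(from_morse(to_morse("Morse code")))
```

The round trip recovers the letters but not their case, a small reminder that the code maps onto the alphabet rather than onto any particular visual form of it.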

Writing direction also varies across writing systems: text may run left to right, right to left, or sometimes top to bottom.
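For horizontal scripts, Unicode records a directional class for each character, and Python exposes it through `unicodedata.bidirectional`. The sketch below guesses the base direction of a string from its first strongly directional character. This is a simplified heuristic, the function name `base_direction` is hypothetical, and vertical layout is not captured by this property at all.

```python
import unicodedata


def base_direction(text):
    """Guess 'ltr' or 'rtl' from the first strongly directional character.

    Returns None when the text has no strong character (digits only, say).
    """
    for ch in text:
        cls = unicodedata.bidirectional(ch)
        if cls == "L":
            return "ltr"
        if cls in ("R", "AL"):  # right-to-left, including Arabic letters
            return "rtl"
    return None


print(base_direction("Latin"))    # Latin letters
print(base_direction("\u05d0"))   # Hebrew letter alef
print(base_direction("\u0627"))   # Arabic letter alef
print(base_direction("123"))      # digits carry no strong direction
```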


Alphabetic systems

  • Arabic alphabet
  • Cyrillic alphabet (Russian, etc.)
  • Greek alphabet
  • Hebrew alphabet
  • Latin alphabet (English, French, Italian, etc.)

Other systems

  • Chinese script
  • Japanese script
  • Korean script

Other things expressed in writing

In addition to symbols representing the sounds or words of a human language, written communication sometimes uses other symbols and systems to express things like numbers or dates. These systems can vary culturally or be internationally standardized, sometimes independently of the language the document is written in.

  • Hindu-Arabic numerals (used nearly universally in Western culture)
  • Date and time formats
  • Mathematical notation
  • Roman numerals
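To make the items above concrete, the sketch below writes one value in several notations: a year in Roman numerals, and a single calendar date in an internationally standardized form next to two regional orderings. The helper `to_roman` is written for this example and handles only 1 to 3999 in the common subtractive style; the date is the revision date shown on this page.

```python
from datetime import date

# Value/symbol pairs in descending order, including subtractive forms.
ROMAN = [
    (1000, "M"), (900, "CM"), (500, "D"), (400, "CD"),
    (100, "C"), (90, "XC"), (50, "L"), (40, "XL"),
    (10, "X"), (9, "IX"), (5, "V"), (4, "IV"), (1, "I"),
]


def to_roman(n):
    """Convert an integer from 1 to 3999 to a Roman numeral."""
    if not 0 < n < 4000:
        raise ValueError("supported range is 1 to 3999")
    parts = []
    for value, symbol in ROMAN:
        count, n = divmod(n, value)
        parts.append(symbol * count)
    return "".join(parts)


d = date(2012, 10, 29)
iso = d.isoformat()            # year-month-day, as in ISO 8601
dmy = d.strftime("%d/%m/%Y")   # day first
mdy = d.strftime("%m/%d/%Y")   # month first

print(to_roman(d.year), iso, dmy, mdy)
```

The day-first and month-first forms are indistinguishable for dates such as 03/04/2012, which is one reason a standardized form is useful regardless of the document's language.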

