Written Languages
Revisions compared: Dan Tobias (→Links and references) and Havoc Crow (17 intermediate revisions by one user not shown)
Latest revision as of 10:00, 24 July 2025
Writing dates back to approximately the 4th millennium BC, and marks the boundary between "prehistoric" and "historic" times.
Written language generally consists of a set of symbols (alphabetic, ideographic, or other) that represents an underlying language (usually derived from a spoken language). These symbols are in turn given a physical representation as marks on a medium (such as paper), or an electronic one as digitally encoded characters via a character encoding. Physical-media written language can also be digitized as graphics. The process of converting an image of written text into digitized characters (for further processing or indexing) is known as optical character recognition (OCR).
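As a small illustration of how the same abstract characters can receive different digital representations, the following Python sketch encodes one string under three character encodings. The function name `encode_variants` and the choice of encodings are assumptions made for this example only.

```python
def encode_variants(text: str) -> dict[str, bytes]:
    """Encode the same abstract characters under several character encodings."""
    # Illustrative selection; many other encodings exist.
    encodings = ["utf-8", "utf-16-be", "latin-1"]
    return {name: text.encode(name) for name in encodings}


if __name__ == "__main__":
    # The letter "é" (U+00E9) yields a different byte sequence in each encoding.
    for name, data in encode_variants("\u00e9").items():
        print(f"{name:10} {data.hex(' ')}")
```

The character is the same in every case; only the byte-level representation differs, which is why a file's encoding has to be known (or guessed) before its text can be read correctly.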
One should keep in mind the distinction between the abstract characters of a writing system and the specific "glyphs" that may represent them visually; the latter can vary by font style and exist in a variety of printed and handwritten versions. Just what is a "separate character" versus a stylistic variation on one can be a somewhat arbitrary distinction; the letters "i" and "j" were at one point considered variations on a single letter of the Latin alphabet, while there continues to be controversy over which characters in the Chinese, Japanese, and Korean writing systems should be considered distinct.
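Unicode is one standard that has had to draw this line in practice: some stylistic variants, such as the "ﬁ" ligature or fullwidth Latin letters, have their own code points but carry compatibility mappings back to ordinary characters. The sketch below, using Python's standard `unicodedata` module, shows one possible way to compare text while ignoring such variants. The helper name `same_abstract_text` is hypothetical, and NFKC normalization reflects only Unicode's particular answer to the question, not a general rule.

```python
import unicodedata


def same_abstract_text(a: str, b: str) -> bool:
    """Compare two strings after NFKC normalization.

    NFKC folds compatibility variants (ligatures, fullwidth forms, etc.)
    into their base characters before comparing.
    """
    return unicodedata.normalize("NFKC", a) == unicodedata.normalize("NFKC", b)


if __name__ == "__main__":
    print(same_abstract_text("\ufb01", "fi"))  # "fi" ligature vs. the letters f + i
    print(same_abstract_text("\uff21", "A"))   # fullwidth A vs. ordinary A
    print(same_abstract_text("i", "j"))        # distinct letters today
```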
While writing is, in its basic form, a visual medium, various non-visual representations also exist, such as the tactile code of Braille and the auditory Morse code. These are different "file formats" for the purpose of this site, but they map onto underlying writing systems which are common to varied representations of the same language.
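To make the idea of several representations mapping onto one writing system concrete, here is a minimal Python sketch that renders the same Latin letters as Morse code and as Braille cells. Only the letters a, b, and c are included to keep the tables short; the function names are assumptions for this example, and the Braille output uses the Unicode Braille Patterns block (starting at U+2800), where dot n corresponds to bit n-1.

```python
# International Morse code for a few Latin letters (illustrative subset).
MORSE = {"a": ".-", "b": "-...", "c": "-.-."}

# Raised dot numbers (1-6) for the same letters in Braille.
BRAILLE_DOTS = {"a": (1,), "b": (1, 2), "c": (1, 4)}


def to_morse(text: str) -> str:
    """Render text as Morse code, separating letters with spaces."""
    return " ".join(MORSE[ch] for ch in text.lower())


def to_braille(text: str) -> str:
    """Render text as Unicode Braille pattern characters."""
    cells = []
    for ch in text.lower():
        # Each dot n sets bit n-1 in the offset from U+2800.
        offset = sum(1 << (dot - 1) for dot in BRAILLE_DOTS[ch])
        cells.append(chr(0x2800 + offset))
    return "".join(cells)


if __name__ == "__main__":
    print(to_morse("cab"))
    print(to_braille("cab"))
```

Both functions start from the same sequence of letters, which is the sense in which Morse and Braille are distinct formats layered over a shared underlying writing system.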
Writing direction also varies across writing systems: text may run left to right, right to left, or sometimes top to bottom.
Alphabetic systems
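In digital text, horizontal direction is usually inferred from the characters themselves. The following Python sketch guesses the dominant direction of a string from the Unicode bidirectional class of each character, as reported by the standard `unicodedata` module. The function name `dominant_direction` and the simple counting heuristic are assumptions for illustration; this is not the full Unicode Bidirectional Algorithm, and it says nothing about top-to-bottom layout, which is not captured by these character classes.

```python
import unicodedata


def dominant_direction(text: str) -> str:
    """Guess whether text is mostly left-to-right or right-to-left.

    Counts characters with a strong bidirectional class:
    "L" (left-to-right) versus "R" or "AL" (right-to-left).
    """
    ltr = rtl = 0
    for ch in text:
        bidi_class = unicodedata.bidirectional(ch)
        if bidi_class == "L":
            ltr += 1
        elif bidi_class in ("R", "AL"):
            rtl += 1
    if ltr == rtl:
        return "neutral"
    return "ltr" if ltr > rtl else "rtl"


if __name__ == "__main__":
    print(dominant_direction("hello"))                     # Latin letters
    print(dominant_direction("\u05e9\u05dc\u05d5\u05dd"))  # Hebrew letters
    print(dominant_direction("123"))                       # digits only
```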
- Arabic alphabet
- Cyrillic alphabet (Russian, etc.)
- Greek alphabet
- Hangul (Korean)
- Hebrew alphabet
- Latin alphabet (English, French, Italian, etc.)
- Phoenician alphabet (ancestor of most modern alphabets)
- International Phonetic Alphabet, not actually used as the alphabet of any language, but used to transcribe pronunciations in all languages
Non-alphabetic systems
Miscellaneous
- CAVE Language
- Gregg shorthand
- Izografia
- Stenotype
Other things expressed in writing
In addition to symbols representing the sounds or words of a human language, written communication also uses other systems to express things like numbers or dates. These systems can vary culturally or be internationally standardized, sometimes independently of the language the document is written in.
- Date and time formats
- Musical notation
  - Drum tablature
  - Guitar tablatures
  - (See also Audio and Music)
- Numeric and counting systems
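As an example of a written convention that varies independently of language, the sketch below formats a single calendar date in three common ways using Python's standard `datetime` module. The function name `date_variants` and the labels are assumptions chosen for this illustration; the internationally standardized form shown is ISO 8601.

```python
from datetime import date


def date_variants(d: date) -> dict[str, str]:
    """Format one date under several written conventions."""
    return {
        "ISO 8601": d.isoformat(),                  # year-month-day
        "month/day/year": d.strftime("%m/%d/%Y"),   # common in the United States
        "day.month.year": d.strftime("%d.%m.%Y"),   # common in parts of Europe
    }


if __name__ == "__main__":
    for label, text in date_variants(date(2014, 2, 14)).items():
        print(f"{label:15} {text}")
```

A string such as 02/03/2014 is ambiguous between the second and third conventions unless the convention is known, which is one reason standardized formats exist.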
Specific formalized types of written (or drawn) matter
Links and references
- Writing: Wikipedia
- 7 Ancient Writing Systems That Haven’t Been Deciphered Yet (Mental Floss)
- Breakthrough in world's oldest undeciphered writing (BBC)
- Writing systems of the world
- Rosetta Project
- The Art of Onfim: Medieval Novgorod Through the Eyes of a Child
- Forensic Stylometry: The Science That Uncovered J.K. Rowling’s Literary Hocus-Pocus
- Restoring the forgotten Javanese script through Wikimedia
- Turkey repeals ban on letters Q, W, and X
- Dixie State University bans Greek letters from organization names
- The Unread: The Mystery of the Voynich Manuscript
- Breakthrough over 600-year-old mystery manuscript
- Codex Seraphinianus
- Another weird codex, allegedly found by the trash
- Anatomy of a spambot
- Ancient times table hidden in Chinese bamboo strips
- The Indian grammar begun: or, An essay to bring the Indian language into rules, for the help of such as desire to learn the same, for the furtherance of the Gospel among them. (1666)
- Discussion of how to label atomic waste to be understandable by distant-future generations
- How to decipher a 4,000-year-old tax return
- N-gram Counts and Language Models from the Common Crawl
- Early Modern OCR Project
- Ocular Historical Document Recognition System
- The Secret Codes That Cartel Bosses Use to Send Handwritten Orders from Prison


