WordPerfect

From Just Solve the File Format Problem
Revision as of 23:59, 1 December 2012

File Format
Name WordPerfect
Ontology
Extension(s) .wpd, .wp, .wp4, .wp5, .wp6, .wp7

WordPerfect is a word processor that was extremely popular in the 1980s and 1990s. It was first developed on a Data General computer at Brigham Young University in 1979 and was later ported to many different operating systems; its PC/MS-DOS version was the most popular. Currently, only the Windows version is being developed and maintained, though WordPerfect never achieved the dominance on that platform that it had on DOS.


Introduction

WordPerfect is the name of both the word processing application and its file format.

Printer definitions

WordPerfect uses so-called "printer definitions" for "pretty printing".

Detecting WordPerfect files

The "signature bytes" at the beginning of a WordPerfect file are (hex) FF 57 50 43: a flag byte with value 255 followed by the ASCII characters "WPC".
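
The signature check above can be sketched in Python as follows. The names `WPC_SIGNATURE` and `looks_like_wordperfect` are made up for this illustration; the only fact taken from the text is the four-byte signature itself.

```python
# Hypothetical helper: checks only the four signature bytes described above.
WPC_SIGNATURE = bytes([0xFF, 0x57, 0x50, 0x43])  # 0xFF followed by "WPC"


def looks_like_wordperfect(data: bytes) -> bool:
    """Return True if data starts with the WordPerfect signature bytes."""
    return data[:4] == WPC_SIGNATURE
```

To test a file on disk, read its first four bytes in binary mode and pass them to the function. A match only indicates that the signature is present, not that the rest of the file is well-formed.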

Extracting plain-text content

If you only need the plain text of a WordPerfect document and are not interested in the formatting and other features, extraction is fairly simple: have the program skip the parts that are not text. Reading the bytes of the file in order, the following pseudocode decides what to do with each one (values are the decimal values of the characters/bytes; the rules are checked in order, so the first match applies):

For each character c, if its value is:
  #128, #160: treat as space ' '
  #169..#171, #173, #174: treat as dash '-'
  #192..#236: skip ahead and ignore all characters until another occurrence of character c is found; resume at the character following it
  #0..#31, #129..#159, #161..#168, #172, #175..#191, #237..#255: ignore (control characters)
  else: treat as regular text character
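
A minimal Python sketch of the pseudocode above is given here for illustration. The function name `extract_text` is hypothetical, and two choices are assumptions not stated in the text: the input is the whole file as a `bytes` object, and if a byte in the range 192..236 has no matching closing byte, the rest of the input is discarded. Because the rules ignore values 0..31, line breaks are not preserved.

```python
def extract_text(data: bytes) -> str:
    """Apply the byte-classification rules above and return the plain text."""
    out = []
    i = 0
    while i < len(data):
        c = data[i]
        if c in (128, 160):
            out.append(" ")                      # treated as a space
        elif c in (169, 170, 171, 173, 174):
            out.append("-")                      # treated as a dash
        elif 192 <= c <= 236:
            # Skip to the next occurrence of the same byte value.
            end = data.find(bytes([c]), i + 1)
            if end == -1:
                break                            # assumption: drop an unterminated run
            i = end                              # the closing byte is skipped below
        elif c < 32 or c >= 129:
            pass                                 # remaining control codes are ignored
        else:
            out.append(chr(c))                   # 32..127: regular text character
        i += 1
    return "".join(out)
```

For example, `extract_text(b"Hello\x80World")` gives `"Hello World"`, and a run such as `b"A\xc3xyz\xc3B"` collapses to `"AB"` because everything between the two 195 bytes is skipped.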

Developer utilities

References
