Common Log Format

From Just Solve the File Format Problem

The Common Log Format is a standardized log format that a number of web servers use to record accesses to websites. It is the format used by default in Apache. The default filename is access_log, with no extension. A server may be set to "auto-rotate" old log files, adding an extension that identifies past logs by number relative to the present ("001" for the previous log, "002" for the one before, and so on), by date, or by some other scheme. You can also configure the server to use a filename of your choice (for example, a separate log file for each virtual domain you host), and some people choose names with an extension such as .log so that operating systems or software that identify file types by extension handle the files more easily.

In Apache, the format is defined by the following format string, given to the LogFormat directive in the httpd.conf file:

"%h %l %u %t \"%r\" %>s %b"

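Each log entry consists of the following fields, separated by spaces. The timestamp and request fields contain spaces themselves, which is why they are delimited by square brackets and double quotes: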

  • Hostname or IP address of the client accessing the site. If a proxy server sits between the end user and the server, the proxy's address may be logged here instead of the actual client's address.
  • RFC 1413 identity of the client. Apache notes this as unreliable, and it is usually blank, which the file represents with a hyphen (-).
  • Username of the user accessing the document. This is a hyphen (-) for public web sites that have no user access controls.
  • Timestamp string surrounded by square brackets, e.g. [12/Dec/2012:12:12:12 -0500]
  • HTTP request surrounded by double quotes, e.g., "GET /stuff.html HTTP/1.1"
  • HTTP status code, e.g., 200 for a successful access or 404 for not found.
  • Number of bytes transferred for the requested object, not counting the HTTP response headers. A hyphen (-) is logged when no bytes are sent.
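
The field list above can be turned into a small parser. The following is a minimal Python sketch, not a reference implementation: the names CLF_PATTERN and parse_clf_line are hypothetical, the sample line uses a made-up address, and mapping a hyphen in the size field to 0 is a choice made here for convenience. It assumes each entry is on one line and that month names are in English.

```python
import re
from datetime import datetime

# One named group per field of "%h %l %u %t \"%r\" %>s %b".
# Hypothetical pattern for illustration; unusual entries (for example,
# usernames containing spaces) are not handled.
CLF_PATTERN = re.compile(
    r'^(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] "(?P<request>.*)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-)$'
)

def parse_clf_line(line):
    """Return a dict of the seven fields, or None if the line does not match."""
    match = CLF_PATTERN.match(line.strip())
    if match is None:
        return None
    entry = match.groupdict()
    # A hyphen marks an identity or username field with no value.
    for key in ("ident", "user"):
        if entry[key] == "-":
            entry[key] = None
    # %b matches abbreviated month names in the current locale (English assumed).
    entry["time"] = datetime.strptime(entry["time"], "%d/%b/%Y:%H:%M:%S %z")
    entry["status"] = int(entry["status"])
    # Assumption: treat a hyphen in the size field as zero bytes.
    entry["size"] = 0 if entry["size"] == "-" else int(entry["size"])
    return entry

sample = '192.0.2.1 - - [12/Dec/2012:12:12:12 -0500] "GET /stuff.html HTTP/1.1" 200 2326'
entry = parse_clf_line(sample)
print(entry["host"], entry["status"], entry["size"])  # prints: 192.0.2.1 200 2326
```

Returning None for lines that do not match lets a caller count or skip malformed entries instead of stopping on the first one.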

Note that this format does not include the user agent string or referrer; you need to use the Combined Log Format to include these fields.
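
When a log file's origin is unknown, it can help to check which of the two formats a line uses. The Python sketch below assumes that a Combined Log Format entry is a Common Log Format entry followed by two additional double-quoted fields (referrer, then user agent). The names LOG_LINE and classify_line are hypothetical, and quoted fields that contain escaped double quotes are not handled.

```python
import re

# Assumed layout: the seven Common Log Format fields, optionally followed by
# two quoted strings (referrer, then user agent).
LOG_LINE = re.compile(
    r'^\S+ \S+ \S+ \[[^\]]+\] ".*?" \d{3} (?:\d+|-)'
    r'(?: "(?P<referrer>[^"]*)" "(?P<user_agent>[^"]*)")?$'
)

def classify_line(line):
    """Return "common", "combined", or None for an unrecognized line."""
    match = LOG_LINE.match(line.strip())
    if match is None:
        return None
    # The optional trailing group only participates for combined entries.
    return "combined" if match.group("user_agent") is not None else "common"

common = '192.0.2.1 - - [12/Dec/2012:12:12:12 -0500] "GET /stuff.html HTTP/1.1" 200 2326'
combined = common + ' "http://example.com/" "ExampleBrowser/1.0"'
print(classify_line(common), classify_line(combined))
```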

