Combined Log Format

From Just Solve the File Format Problem
Revision as of 00:15, 12 February 2020 by Dan Tobias (Talk | contribs)


The Combined Log Format is a standardized log format used by a number of web servers to keep track of accesses to websites. It is one of the formats available in Apache, and is similar to the Common Log Format except for the addition of two more fields, the referer and the user agent.

The default filename is access_log, which has no extension. Some servers are set to "auto-rotate" old log files, adding an extension that identifies past logs by number relative to the present ("001" for the previous one, "002" for the one before, and so on), by a specific date, or by some other scheme. You can also configure the server to use a filename of your choice (e.g., if you host multiple virtual domains, you may want a different log file for each), and some people choose names with an extension (e.g., .log) to make the files easier to handle in operating systems or software that identify file types by extension.

In Apache, the format is defined by the following format string, given to the LogFormat directive in the httpd.conf file:

"%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-agent}i\""

This consists of the following fields, separated by spaces (the timestamp and the quoted fields may themselves contain spaces):

  • Hostname or IP address of the client accessing the site. If a proxy server sits between the end user and the server, the proxy's address may be logged here instead of the actual client's.
  • RFC 1413 identity of client; this is noted by Apache as unreliable, and is usually blank (represented by a hyphen (-) in the file).
  • Username of user accessing document; will be a hyphen (-) for public web sites that have no user access controls.
  • Timestamp string surrounded by square brackets, e.g. [12/Dec/2012:12:12:12 -0500]
  • HTTP request surrounded by double quotes, e.g., "GET /stuff.html HTTP/1.1"
  • HTTP status code: 200 for successful access, 404 for not-found, and other codes.
  • Number of bytes sent to the client in the response, excluding headers; logged as a hyphen (-) when no bytes were sent.
  • Referer: URL of the page the user came from to reach your site, if the client sent it to the server (surrounded by double quotes)
  • User agent string sent by the client (surrounded by double quotes). It can be used to identify which browser was used, but it can be misleading, since a client can send any string it likes.
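
The field list above can be turned into a line parser. What follows is a minimal sketch in Python, not a reference implementation. The names COMBINED_RE, parse_line, and SAMPLE are made up for illustration, and the sample line is invented. It assumes the host, identity, and username fields contain no spaces, that double quotes inside quoted fields are escaped with a backslash, and that month names are in English (strptime's %b depends on the locale).

```python
import re
from datetime import datetime

# One named group per field, in the order listed above.
# Assumption: host, ident and user contain no spaces; quotes inside
# the quoted fields are backslash-escaped.
COMBINED_RE = re.compile(
    r'(?P<host>\S+) '
    r'(?P<ident>\S+) '
    r'(?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] '
    r'"(?P<request>(?:[^"\\]|\\.)*)" '
    r'(?P<status>\d{3}) '
    r'(?P<size>\d+|-) '
    r'"(?P<referer>(?:[^"\\]|\\.)*)" '
    r'"(?P<agent>(?:[^"\\]|\\.)*)"'
)


def parse_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    m = COMBINED_RE.match(line.strip())
    if m is None:
        return None
    entry = m.groupdict()
    # Timestamp looks like 12/Dec/2012:12:12:12 -0500
    entry["time"] = datetime.strptime(entry["time"], "%d/%b/%Y:%H:%M:%S %z")
    entry["status"] = int(entry["status"])
    # A hyphen in the size field means no bytes were sent.
    entry["size"] = 0 if entry["size"] == "-" else int(entry["size"])
    return entry


# Invented sample line, for illustration only.
SAMPLE = (
    '192.0.2.1 - - [12/Dec/2012:12:12:12 -0500] '
    '"GET /stuff.html HTTP/1.1" 200 2326 '
    '"http://example.com/start.html" "Mozilla/5.0 (X11; Linux x86_64)"'
)

if __name__ == "__main__":
    print(parse_line(SAMPLE))
```

Returning None for lines that do not match lets a caller skip or count malformed entries instead of aborting partway through a large log file.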
