Just Solve the File Format Problem:Community portal


 * Please add your signature by typing ~~~~ if you add or reply

Open issues
Below is a list of "issues" which would ordinarily be in a ticketing system of some kind, but are here on the Wiki instead, because that's how we roll. As things are resolved, they will be moved to the Discussion page. If there's an appeal or an issue, the conversation can continue there - this page will be for open issues.

Use of case in URLs / links. I went through all the electronic format types pages and tried to normalise all the pages where I could. There was a mix of link structures; I've tried to get them all (apart from animation - I've been at it all day!) so they are file extension - file type name. I notice that we have a mix of upper and lower case file extensions throughout. This means we may have 2 links which should point to the same URL (e.g. mix and MIX). Is this a known issue with the current layout? --JaygattusoNLNZ (talk) 01:32, 20 November 2012 (UTC)
 * Since you're linking both the extension and the name, does that mean that there are supposed to be separate articles for each? I don't know if there's really a need for "mainspace" articles by extension, since there are already categories for that purpose; you can browse them through Category:File formats by extension. Dan Tobias (talk) 02:12, 20 November 2012 (UTC)
 * I just copied the most common model that I found on the formats pages. The problem is, if you don't homogenize the method, the linking/crosslinking doesn't work properly. All instances of .doc (for example) should point to the same resource page / disambiguation page. If someone has linked only to the format name in one place (e.g. MS Word (.doc)), and someone else to the extension (MS Word - doc), we can't make sure they point to the same place. The problem occurs because format names and extensions are used interchangeably. You raise an interesting question about the relationship between the ext and the format name. I would argue they are not equal (1:1), nor (1:many) / (many:1), so it makes sense to protect both aspects as definable things - the extension because that's what's most commonly searched for and referred to by users, and 'format name' because it's more accurate. How is the Category:File formats by extension populated? --JaygattusoNLNZ (talk) 18:31, 20 November 2012 (UTC)
 * The categories are inserted when you use the ext template in the infobox. My preference is to have articles by actual format name and use multiple navigation aids (menus, cats, etc.) to get to them. Dan Tobias (talk) 01:38, 21 November 2012 (UTC)
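
As a rough illustration of the case problem raised in this thread (e.g. mix and MIX ending up as separate link targets), here is a minimal Python sketch of how a list of extension links could be checked for entries that differ only by case or a leading dot. The helper names (canonical_extension, find_case_conflicts) and the sample list are hypothetical, made up for this example; this is not how the wiki or its ext template actually works.

```python
from collections import defaultdict


def canonical_extension(ext: str) -> str:
    """Return a case-insensitive key for an extension (hypothetical helper).

    Strips surrounding whitespace and leading dots, then lowercases,
    so that ".MIX", "mix" and "Mix" all map to "mix".
    """
    return ext.strip().lstrip(".").lower()


def find_case_conflicts(targets):
    """Report link targets that differ only by case or a leading dot.

    Returns a dict mapping each canonical key to the sorted list of
    distinct spellings found, for keys with more than one spelling.
    """
    groups = defaultdict(set)
    for target in targets:
        groups[canonical_extension(target)].add(target)
    return {key: sorted(spellings)
            for key, spellings in groups.items()
            if len(spellings) > 1}


if __name__ == "__main__":
    # Assumed sample of link targets collected from an index page.
    links = ["mix", "MIX", "doc", ".doc", "gif"]
    for key, spellings in find_case_conflicts(links).items():
        print(key, "->", spellings)
```

A check like this only flags candidates; whether the conflicting spellings should become redirects, a disambiguation page, or a single article is still the judgment call discussed here.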

Article naming convention
As mentioned above, there's some dispute over whether to name articles after the full name of a format or its file extension. If using full names, you then get into issues of whether to use the full technical name or a shorter one that's more popularly used, which in some cases is even the same as the extension (GIF, for instance). And you also get into tricky issues of capitalization: all-caps like an acronym, all-lowercase as filenames are often written (though this is OS-dependent; some, like MS-DOS, use all-uppercase), or mixed case (proper names capitalized)? And then there's the disambiguation issue of how to name articles on different things that have the same name, which happens sometimes even with long official names, but even more often with short acronyms and file extensions. But there's also yet another issue of which things get separate articles and which are combined, like formats that have had many different versions, etc.

Currently you have things like CI and CT, recently-created articles that represent two different file types within the data of one type of music tracker. The spec document they link to is the same one, which documents all the file types used in that tracker. Unless there's going to be really a lot to say about each of the specific file types, my own preference would be to have one article called CyberTracker that discusses all the formats used by the program in question, with subheaders within the article for the different file types, and all the extensions listed in the infobox (and hence in associated categories). If any other indices by extension are built up, they'd also have entries for both CI and CT. For instance, when I documented Softdisk Family Tree, I covered all the various file formats in one article, though there are several versions and multiple files for each. Dan Tobias (talk) 13:39, 21 November 2012 (UTC)
 * I realise I'm as guilty of this as anyone, having used both forms at some point (e.g. Surprise! Adlib Tracker v2.0 and CI). Indeed, the two articles - CI and CT - you refer to were created by me. I guess in general I would favour using a descriptive page name rather than simply the file extension - that seems to be something that's being taken care of by infoboxes and categories.


 * On the issue of what gets a separate page and what doesn't, I guess that just comes down to individual discretion. There will be instances where a format has undergone a number of minor revisions over time or has a number of minor variants (e.g. the variant forms of Chaos Music Composer's CMC) where it would make sense to keep them all to a single page, while a major revision would necessitate a multi-page approach (e.g. the shift with Capella from the binary CAP to the XML-based CapXML format).


 * However, I'm not sure I agree with CI and CT having a single CyberTracker page. While both link to the same spec document and both are used by the same program, they are different formats serving different purposes. I think in general we should try and distinguish between program and file format - S3M doesn't belong on the ScreamTracker page, although each should link to the other. Halftheisland (talk) 14:04, 21 November 2012 (UTC)


 * Since the purpose of the wiki is to document file formats, I think it's good that as many formats as possible are listed in the category pages and that you can browse these pages for format extensions. Sometimes it might be better to link multiple extensions to the same article (e.g. a specific application), but not always. I think it is difficult to come up with a strict rule for this (but maybe recommendations and, even better, good examples). --PN (talk) 15:08, 21 November 2012 (UTC)


 * It's a judgment call, certainly. It depends on how the files are typically encountered, distributed, used, etc., and how they're thought of by people who use them; if a bunch of file types related to a particular program are usually found together as part of a larger data set, they most likely belong together in one article (with subsections to describe the function of the particular files), but if they're distinct entities with their own particular treatment (like separate areas of file trading sites for enthusiasts) they should have separate articles, though more descriptive names like "CyberTracker instrument file" might be better than a cryptic and likely ambiguous CI. Dan Tobias (talk) 15:46, 21 November 2012 (UTC)
 * And then, somebody has also used a robot to create pages in a separate namespace devoted to file extensions, like cin. That's yet another navigational system for getting to information by extension, though those pages oddly don't actually have direct links to the normal pages here about those file formats. Dan Tobias (talk) 15:56, 21 November 2012 (UTC)
 * Yes, that was me with Bender the bot. Still experimenting with it and working on creating a list of all pages in relation to extensions. Maurice.de.rooij (talk) 15:22, 22 November 2012 (UTC)
 * What I'd like to avoid is the messy format somebody did to a few index pages like Compression, where each line has separately hyperlinked format names and extensions (not always in a consistent order) where often one or the other is a redlink, or one redirects to the other, or one is just a disambiguation page, making a somewhat confusing hodgepodge. Dan Tobias (talk) 16:22, 21 November 2012 (UTC)
 * I've started rearranging the Compression page to be a little less messy. Dan Tobias (talk) 16:56, 22 November 2012 (UTC)