
Meaningful filenames

421 bytes added, 10 February
This article is about naming files in a meaningful way. Naturally, files should have unique names so we don't end up with several files with the same or very similar names. This article takes file naming one step further by looking at how the file name itself can carry useful information about the file.
= Why meaningful filenames =
** HTML, the language of webpages, uses tags like ''<span style="normalText">Example</span>''. Here the metadata describes the style of the text, i.e. ''Example'' is ''normalText''.
** EXIF ([ Wikipedia's ''EXIF'' entry]) is a way of storing metadata in image files, such as when the photo was taken and what type of camera was used.
* Database systems (Gramps is a database system for genealogy) can store a huge amount of data about data. They are very efficient at this job and very powerful.
** Google Search uses a database to remember what web pages are about, and tells you when you ask.
So why not use one of those options?
* EXIF is great, but only for some types of files (it is not supported in JPEG 2000, PNG, or GIF), and there are lots of different systems for different types of files. People are working hard to improve this situation all the time.
* HTML is great if you can store all your information as HTML files, but HTML files cannot contain other files; they just point to them. So we'd basically end up making a website about our files.
* A database is what we already use when we use Gramps. The Gramps database stores lots of information about the files and records it tracks, but Gramps does not store the actual file inside the database. If the connection between Gramps and the data it is describing is broken, then the files are just files. They contain no more information than they did when you first ''imported'' them into Gramps.
This system of ''meaningful filenames'' has the following aims:
* Preserving enough metadata to give the file's content context without Gramps
* Creating file names normal people can understand so they can see what the file is about without Gramps
* Creating file names which a computer can process easily, so files can be batch processed and metadata can be read directly from the file name without possible confusion
* Creating a system simple enough to use all the time for every file
To be understandable we need to be able to use full words where appropriate.
To be computer readable we need to separate the parts in a way which a script can easily recognise and, more importantly, in a way which would never occur in real language. So it would be no good to mark a ''name'' section with the word ''name'' if the word ''name'' can also appear somewhere in the file name where it is not meant to be a marker.
To be simple enough to remember, the system should not be too complicated. After all, Gramps is meant to store the real information; this is just a supplement.
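The marker problem described above can be shown with a small Python sketch. The function name <code>split_sections</code>, the sample file names and the double-hyphen marker are all assumptions made for illustration; this is not the naming scheme itself.

```python
def split_sections(stem, marker):
    """Split a file name (without its extension) into sections at each marker."""
    return stem.split(marker)

# Using an ordinary word as the marker: it also matches inside "nameless",
# so the script cuts the name in the wrong place.
ambiguous = split_sections("name-mary-nameless-street", "name")

# Using a marker that does not occur in real language (hypothetical: "--"):
# the text is split only where a marker was intended.
clean = split_sections("mary--nameless-street", "--")

print(ambiguous)  # ['', '-mary-', 'less-street']
print(clean)      # ['mary', 'nameless-street']
```

The word marker breaks ''nameless'' in two, while the double hyphen leaves it intact.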
== What's in a name? ==
It would be nice if we could have files called
Marriage of Mary Angus Jones and Matthew Williams, 2nd Dec 1923 (William Angus is to Mary's right).jpg
But this meets only one of the criteria above, that of ''understandable filenames''. How can a computer know who got married, what their surnames are, and so on? And anyway, because of the limitations of ''Portable Filenames'', we can't have file names like that. We have to drop the reliance on capitalisation, drop the spaces, drop the comma and drop the brackets. To be computer readable we need to separate the sections with a system of markers to indicate where the surname, event name etc. are.
So what sections do we want to be able to identify? Here's a basic list that should be enough for most situations. Remember that Gramps stores the more complex information; we're just trying to give a useful structure to our files.
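As a rough check, assuming that ''Portable Filenames'' here means the POSIX portable filename character set (letters, digits, period, underscore and hyphen, with no leading hyphen), the following Python sketch tests whether a name qualifies. The function name <code>is_portable</code> and the second sample name are hypothetical. The check is case-blind, so it does not cover the point about capitalisation.

```python
import re

# POSIX portable filename character set: A-Z, a-z, 0-9, period, underscore, hyphen.
PORTABLE = re.compile(r"[A-Za-z0-9._-]+")

def is_portable(filename):
    """Return True if filename uses only portable characters and does not start with a hyphen."""
    return PORTABLE.fullmatch(filename) is not None and not filename.startswith("-")

long_name = ("Marriage of Mary Angus Jones and Matthew Williams, "
             "2nd Dec 1923 (William Angus is to Mary's right).jpg")

print(is_portable(long_name))                  # False: spaces, comma, brackets, apostrophe
print(is_portable("marriage_1923-12-02.jpg"))  # True (hypothetical name, not the scheme itself)
```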
* Surname
* Firstname
== Source events ==
The GEDCOM 5.5 standard defines so few events as to be useless. The Gramps XML schema defines no events, as these can be made by the user. This all seems fair enough since events are highly culture based. The situations where I think a set of events should be defined are those which will be connected with source records. GEDCOM has a reasonable group of those, but they are heavily based in Western Christian culture. The solution must be language and culture dependent. Here's my list:
'''marriage''' is for an actual marriage event and all the associated documentation, including possible divorce and separation documentation.
This could be parsed (by Gramps?) as the description:
'''Event:''' Marriage
This could be parsed (by Gramps?) as the description:
'''Source:''' Census
= Gramps ID based =
{{man note|This is another attempt by [[User:Duncan|Duncan Lithgow]] to find a good system. It is not finished so feel free to add comments and correct any obvious mistakes.}}
Here are the records we'll use as examples. They involve Mary Agnes Williams (daughter of John Williams and Anna Matthews). She married Anders Sørensen (son of Anders Sørensen and Anna ?) and they had a daughter Anna Sorensen; note the spelling change.
== Record types ==
The record types tell us what the record is about. Gramps IDs use the first character to denote the type of item the ID refers to. Sticking to something already thought out, and taking the ones most relevant to stored records, these can be used as the following tags for record types:
* I-- Individual
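A script could read the record type from the first character of a tag, as in this minimal Python sketch. The names <code>RECORD_TYPES</code> and <code>record_type</code> are hypothetical, the table holds only the tag shown above, and <code>I0042</code> is a made-up ID used purely for illustration.

```python
# Only the tag listed above is included; any further tags would be added the same way.
RECORD_TYPES = {"I": "Individual"}

def record_type(tag):
    """Look up the record type from the first character of a tag or ID."""
    return RECORD_TYPES.get(tag[:1].upper(), "unknown")

print(record_type("I--"))    # Individual
print(record_type("I0042"))  # Individual (illustrative ID, not a real record)
print(record_type("X--"))    # unknown
```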
Obviously this method is limited in scope to some kinds of sources, and doesn't make sense for naming photos or documents that aren't part of a larger source (e.g. an ID card).
A really good article on why and how to implement this kind of naming system is [ Hierarchical Sources] by Tony Proctor.
= See also =
* [[Organise your records]]
* Gramps User maillist thread: [ Organizing media files]
= External links =
* [ File Naming / Organization Methods?] from [ What do I know?]
