GEPS 015: Repository Research Support
This is a work in progress; I hope to develop the content over the next couple of weeks - rjt
1. Plan a research visit to a Repository
Aunt Martha (AM) is planning a visit to the UK National Archives at Kew, London. She wants to produce a report from GRAMPS that lists all of the Sources held in the National Archives that might be of interest to her. She also wants the report to list all the people in her database who could potentially appear in those Sources, leaving out anyone who already has the Source attached. She would also like the report to include a table with a row for each person and a column for each piece of information that the Source should contain, so that she can print it out and fill it in while she is at Kew.
2. Import Sources for a Repository
AM has discovered that Ancestry is a Repository that is a good place to find genealogy sources relevant to her database. She wants to import the information about the Sources that are held by Ancestry so that she can start her research. She clicks on the 'Import Repository' button, gets a list of all of the Repositories that are in the online GRAMPS Repository database, and imports Ancestry. This populates the Sources that are contained in Ancestry and adds Ancestry to her list of Repositories.
3. Create Research Plan for a Person
AM sits down at her computer. She has an hour to spare and wants to progress her research. She selects her Grandfather, Frank, in the Person View. GRAMPS shows her a research plan for Frank that lists all the Sources that Frank might be found in and the type of information that can be found in each Source. It also shows her the Repositories that contain those Sources so that she can immediately start to look for the information.
The first report is a research plan for a Repository. It shows all the Sources that are held in that Repository, provided there are people who might appear in them. For each Source it shows the candidate people and provides a template for recording the information found in the Source.
The second report is an individual research plan. It shows all the Sources that the person might be listed in. These would be filtered so that Sources already listed for the individual are excluded (or may be listed at the end).
There are two primary queries:
* get_all_people_that_might_appear_in_source(source)
* get_all_sources_that_a_person_might_appear_in(person)
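As a rough sketch of how the two queries mirror each other, both can be written as a filter over one shared matching predicate. The extra `people`, `sources` and `might_appear` parameters are assumptions made for illustration; the proposal only names the functions and their first argument.

```python
from typing import Callable, Iterable, List, TypeVar

P = TypeVar("P")  # whatever object represents a person
S = TypeVar("S")  # whatever object represents a source

# Hypothetical predicate: True if the person could plausibly appear in the source.
Matcher = Callable[[P, S], bool]


def get_all_people_that_might_appear_in_source(
    source: S, people: Iterable[P], might_appear: Matcher
) -> List[P]:
    """Return the people for whom the matcher accepts this source."""
    return [person for person in people if might_appear(person, source)]


def get_all_sources_that_a_person_might_appear_in(
    person: P, sources: Iterable[S], might_appear: Matcher
) -> List[S]:
    """Return the sources for which the matcher accepts this person."""
    return [source for source in sources if might_appear(person, source)]
```

Keeping the matcher separate means date matching and place matching can be developed and combined without changing either query.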
Both of these queries require some way of matching a Person to a Source. There are two types of source meta-data that could be used for doing this matching:
* dates - if a Source had a start and end date, we could match these against the result of probably_alive
* places - places are trickier. What we want to ask is: is this person likely to have lived somewhere in the region covered by this source? An initial implementation might associate a Place object with a Source and match if the Person has any Place or Address references that match all of the fields that are set in the Source's Place. So for a Census the Source's Place would say England, and if any of the Addresses on the Person also had England it would be a match. A Source might need to have multiple Places and the matching algorithm might need to be rather fuzzy. A 'default place' might be needed to cover all the People that have no Address or Place references.
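The initial place-matching rule described above could look something like the following minimal sketch. `PlaceFields` is a hypothetical, simplified record invented for this example (it is not the GRAMPS Place class), and the case-insensitive exact comparison is an assumption; a real implementation would probably need fuzzier matching.

```python
from dataclasses import dataclass, fields
from typing import Iterable, Optional


@dataclass(frozen=True)
class PlaceFields:
    """Hypothetical simplified place record; None means 'not set'."""
    country: Optional[str] = None
    county: Optional[str] = None
    city: Optional[str] = None


def place_matches(source_place: PlaceFields, person_place: PlaceFields) -> bool:
    """True if every field set on the Source's place equals the person's field."""
    for f in fields(source_place):
        wanted = getattr(source_place, f.name)
        if wanted is None:
            continue  # unset fields on the Source's place match anything
        found = getattr(person_place, f.name)
        if found is None or found.casefold() != wanted.casefold():
            return False
    return True


def person_matches_source_place(
    source_place: PlaceFields,
    person_places: Iterable[PlaceFields],
    default_place: Optional[PlaceFields] = None,
) -> bool:
    """Match if any of the person's places fits; use the default if they have none."""
    places = list(person_places)
    if not places and default_place is not None:
        places = [default_place]
    return any(place_matches(source_place, place) for place in places)
```

With a Source place of `PlaceFields(country="England")`, a person with an address in Leeds, England would match, while a person with no places at all would match only if a default place covering England were supplied.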
I think that the date matching is clearly simpler than the place matching and should be the initial target.
To be able to produce these reports it is going to be necessary to record additional information in the database. Most of this additional information is recorded against Sources but some will also be needed against people and possibly against Repositories.
- It must be possible to exclude Sources altogether. Many Sources are not related to documentary evidence and you would not want them cluttering up the reports.
- It must be possible to exclude Sources for a Person. If you know that you have already checked a Source for an individual, you want to exclude that Source from showing up for them the next time you run the reports.
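The two exclusion rules could be applied as a final filter on the candidate Sources for a person, roughly as sketched here. The record classes and the field names `exclude_from_research`, `attached_sources` and `checked_sources` are hypothetical names chosen for this example, not existing GRAMPS attributes.

```python
from dataclasses import dataclass, field
from typing import Iterable, List, Set


@dataclass
class SourceRecord:
    """Hypothetical stand-in for a Source."""
    handle: str
    title: str
    exclude_from_research: bool = False  # assumed flag: never show in reports


@dataclass
class PersonRecord:
    """Hypothetical stand-in for a Person."""
    handle: str
    name: str
    attached_sources: Set[str] = field(default_factory=set)  # already cited
    checked_sources: Set[str] = field(default_factory=set)   # searched, nothing found


def sources_still_to_check(
    person: PersonRecord, candidates: Iterable[SourceRecord]
) -> List[SourceRecord]:
    """Drop globally excluded Sources and those already attached or checked."""
    return [
        source
        for source in candidates
        if not source.exclude_from_research
        and source.handle not in person.attached_sources
        and source.handle not in person.checked_sources
    ]
```

Keeping "attached" and "checked" as separate sets would let a report list already-attached Sources at the end while still hiding those that were searched without success.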
It should be possible to make a start on this by storing the Source meta-data in the key/value data of the Source. This will allow an initial proof-of-concept of the reports without touching the database schema.
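For a proof of concept, the research meta-data might be read out of the Source's key/value data along these lines. The key names (`Research-Start-Year`, `Research-End-Year`, `Research-Exclude`) are invented for this sketch, and the data is modelled as a plain mapping of strings rather than through any particular GRAMPS accessor.

```python
from typing import Mapping, NamedTuple, Optional

# Hypothetical key names; nothing in the proposal fixes these.
KEY_START = "Research-Start-Year"
KEY_END = "Research-End-Year"
KEY_EXCLUDE = "Research-Exclude"


class ResearchMeta(NamedTuple):
    start_year: Optional[int]
    end_year: Optional[int]
    excluded: bool


def _parse_year(value: Optional[str]) -> Optional[int]:
    """Return the year as an int, or None if missing or not a plain number."""
    if value is None:
        return None
    try:
        return int(value.strip())
    except ValueError:
        return None


def read_research_meta(data: Mapping[str, str]) -> ResearchMeta:
    """Extract research meta-data from a Source's key/value pairs."""
    excluded = data.get(KEY_EXCLUDE, "").strip().lower() in ("yes", "true", "1")
    return ResearchMeta(
        start_year=_parse_year(data.get(KEY_START)),
        end_year=_parse_year(data.get(KEY_END)),
        excluded=excluded,
    )
```

Unparseable or missing values come back as `None`, so a report can treat such a Source as having an open-ended range instead of failing.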
A more general version of probably_alive would be needed that can take a range and decide if the Person might be alive during that period.
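One way to picture the range-based check is as an overlap test between the period the Source covers and the earliest and latest years the person could have been alive. The sketch below is a simplification that works on whole years and known birth or death years only; the function name, the parameters and the 110-year maximum lifespan are all assumptions made for this example, and it does not attempt any estimation from relatives' dates.

```python
from typing import Optional

MAX_AGE = 110  # assumed upper bound on a lifespan, in years


def might_be_alive_during(
    birth_year: Optional[int],
    death_year: Optional[int],
    start_year: int,
    end_year: int,
    max_age: int = MAX_AGE,
) -> bool:
    """True if the person could have been alive at some point in the range."""
    if start_year > end_year:
        raise ValueError("start_year must not be after end_year")
    if birth_year is None and death_year is None:
        return True  # nothing known, so the person cannot be ruled out
    # Fill in the unknown end of the lifespan using the maximum age.
    earliest = birth_year if birth_year is not None else death_year - max_age
    latest = death_year if death_year is not None else birth_year + max_age
    # Two closed intervals overlap if each starts before the other ends.
    return earliest <= end_year and latest >= start_year
```

A single census year is just a range whose start and end are equal, so the same check would cover both point-in-time and long-running Sources.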
Jerome's RepositoriesReport is an example of what can be done at the moment. This GEP seeks to develop this idea further.