Difference between revisions of "GEPS 015: Repository Research Support"


Revision as of 14:58, 7 December 2019

This is a work in progress; I hope to develop the content over the next couple of weeks. - rjt

User Stories

1. Plan the research visit to a Repository

Aunt Martha (AM) is planning a visit to one of her Gramps Repositories: the UK National Archives at Kew, London.

She produces a report from Gramps that lists all of the Sources held in the National Archives that might be of interest to her.

The report also lists all the people in her Tree who could potentially appear in those Sources. People to whom she has already attached the Source are either hidden or marked via a stylesheet (with font options such as single or double strikethrough, color, or typeface).

She also generates a detailed checklist report, which includes a table with a row for each person and a column for each piece of information that the Source should contain. Again, a Person who has already been cited has a check in the column, and the row can be hidden or styled.

She prints out these reports for use while she is at Kew.

2. Import Sources for a Repository

AM has discovered that Ancestry is a Repository holding many genealogy sources relevant to her database. She wants to import the information about the sources that are held by Ancestry so that she can start her research.

She clicks on the 'Import Repository' button, gets a list of all of the Repositories that are in the online Gramps Repository database, and imports Ancestry. This populates the Sources that are contained in Ancestry and adds Ancestry to her list of Repositories.

3. Create Research Plan for a Person

AM sits down at her computer. She has an hour to spare and wants to progress her research.

She selects her Grandfather (Frank) in the Person View. Gramps shows her a research plan for Frank that lists all the Sources that supposedly mention Frank (but have not yet been fully validated to support an event) and the type of information that can be found in each Source.

(Some of these sources had been logged in Gramps because they had been cited in Secondary Sources. But AM wants to capture the originals.)

It also shows her the Repositories that contain those Sources so that she can immediately start to look for the information. It allows her to strike through Sources that have already been checked, have already been cited (and validated), or have already been acquired/collected into her private Repository.

Reports

The first report is a research plan for a Repository. It shows all the sources held in that repository for which there are people who might appear in them. For each source it shows the candidate people and provides a template for recording the information found in the source.

Research plan mock.jpg

The second report is an individual research plan. It shows all the sources that the person might be listed in. These would be filtered so that sources already listed for the individual are excluded (or possibly listed at the end).

Individual plan mock.jpg

Implementation Issues

Supporting queries

There are two primary queries:

* get_all_people_that_might_appear_in_source(source)
* get_all_sources_that_a_person_might_appear_in(person)

Both of these queries require some way of matching a Person to a Source. There are two types of source meta-data that could be used for doing this matching:

  1. dates - if a Source had a start and end date, we could match this range against the result of probably_alive.
  2. places - places are trickier. What we want to ask is: is this person likely to have lived somewhere in the region covered by this Source? An initial implementation might associate a Place object with a Source and match if the Person has any Place or Address references that match all of the fields that are set in the Source's Place. So for a Census the Source's Place would say England, and if any of the Addresses on the Person also had England it would be a match. A Source might need to have multiple Places, and the matching algorithm might need to be rather fuzzy. A 'default place' might be needed to cover all the People that have no Address or Place references.
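As a rough illustration of the "match all of the fields that are set" rule, the sketch below uses a flat, hypothetical PlaceFields record rather than the real Gramps Place class. The field names, the case-insensitive comparison, and the default_place parameter are all assumptions made for the example.

```python
from dataclasses import dataclass, fields
from typing import Iterable, Optional


@dataclass(frozen=True)
class PlaceFields:
    """Hypothetical flat place record; NOT the Gramps Place class."""
    country: Optional[str] = None
    county: Optional[str] = None
    city: Optional[str] = None


def place_matches(source_place: PlaceFields, person_place: PlaceFields) -> bool:
    """True if every field set on the Source's place equals the person's field."""
    for f in fields(PlaceFields):
        wanted = getattr(source_place, f.name)
        if wanted is None:
            continue  # unset fields on the Source do not constrain the match
        have = getattr(person_place, f.name)
        if have is None or have.casefold() != wanted.casefold():
            return False
    return True


def person_matches_source(
    source_places: Iterable[PlaceFields],
    person_places: Iterable[PlaceFields],
    default_place: Optional[PlaceFields] = None,
) -> bool:
    """True if any of the person's places matches any of the Source's places.

    If the person has no places at all, the optional default place is used.
    """
    candidates = list(person_places)
    if not candidates and default_place is not None:
        candidates = [default_place]
    return any(
        place_matches(sp, pp) for sp in source_places for pp in candidates
    )
```

With a census Source whose place only sets country="England", a Person with an address in Leeds, England would match, and a Person with no places would match only if the chosen default place is in England. A fuzzier matcher could replace place_matches without changing the outer function.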

I think that the date matching is clearly simpler than the place matching and should be the initial target.
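A minimal sketch of date-only matching for the two queries follows. It assumes each Source carries a start and end date and that each Person has already been reduced to an estimated span of dates during which they might have been alive. The SourceSpan and PersonSpan records are hypothetical stand-ins, not Gramps objects, and the extra list parameter on each query is an assumption so the example runs without a database.

```python
from dataclasses import dataclass
from datetime import date
from typing import Iterable, List


@dataclass(frozen=True)
class SourceSpan:
    """Hypothetical record: the period a Source covers."""
    title: str
    start: date
    end: date


@dataclass(frozen=True)
class PersonSpan:
    """Hypothetical record: the period a Person might have been alive."""
    name: str
    earliest: date
    latest: date


def overlaps(source: SourceSpan, person: PersonSpan) -> bool:
    """True if the two closed date ranges share at least one day."""
    return person.earliest <= source.end and source.start <= person.latest


def get_all_people_that_might_appear_in_source(
    source: SourceSpan, people: Iterable[PersonSpan]
) -> List[PersonSpan]:
    return [p for p in people if overlaps(source, p)]


def get_all_sources_that_a_person_might_appear_in(
    person: PersonSpan, sources: Iterable[SourceSpan]
) -> List[SourceSpan]:
    return [s for s in sources if overlaps(s, person)]
```

Both queries reduce to the same interval-overlap test, which is one reason date matching is a convenient first target.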

The other aspect of the meta-data is to record what information a Source contains. This could be as simple as a list of titles (e.g. Birthday, Name, Sex etc.) or it could be more sophisticated: a list of Event templates. The Event templates could then be checked against the Person, so that the Source only matches the Person if they do not already have an Event of that type referencing that Source. It might even be possible to right-click the Source on a Probable Sources tab on the Person and select Populate Events to create the empty events on the Person, already set up to reference the Source.
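The Event template check could look something like the sketch below. It assumes a template is just a list of event type names and that a Person's events can be flattened to (event type, source id) pairs. Both representations are hypothetical simplifications, not the Gramps data model.

```python
from typing import Iterable, List, Tuple


def missing_event_types(
    source_id: str,
    template_types: Iterable[str],
    person_events: Iterable[Tuple[str, str]],
) -> List[str]:
    """Return the template event types the Person lacks for this Source.

    person_events holds (event_type, source_id) pairs, one per citation.
    """
    covered = {etype for etype, sid in person_events if sid == source_id}
    return [t for t in template_types if t not in covered]


def source_still_relevant(
    source_id: str,
    template_types: Iterable[str],
    person_events: Iterable[Tuple[str, str]],
) -> bool:
    """A Source matches only while some templated event is still uncited."""
    return bool(missing_event_types(source_id, template_types, person_events))
```

The list returned by missing_event_types is also what a Populate Events action would need in order to know which empty events to create.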

To be able to produce these reports it is going to be necessary to record additional information in the database. Most of this additional information is recorded against Sources but some will also be needed against people and possibly against Repositories.

Important issues

  1. It must be possible to exclude Sources altogether. Many Sources are not related to documentary evidence and you would not want them cluttering up the reports.
  2. It must be possible to exclude Sources for a Person. If you know that you have checked a Source for an Individual, you want to exclude that Source from showing up the next time you run the reports.
  3. It must be possible to list Persons who have already been completely researched in the Source. Recognizing previously explored rabbit holes is necessary to avoid redundant research.
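The three requirements above can be sketched as simple filters over candidate (person, source) pairs. The set-based storage and the function names are assumptions for illustration only; where this information would actually be recorded is left open by this proposal.

```python
from typing import Iterable, List, Set, Tuple

Pair = Tuple[str, str]  # (person id, source id); hypothetical identifiers


def filter_candidates(
    pairs: Iterable[Pair],
    excluded_sources: Set[str],
    checked_pairs: Set[Pair],
) -> List[Pair]:
    """Drop globally excluded Sources and already-checked person/source pairs."""
    return [
        (person, source)
        for person, source in pairs
        if source not in excluded_sources
        and (person, source) not in checked_pairs
    ]


def completed_people(source: str, checked_pairs: Set[Pair]) -> List[str]:
    """List the people already fully researched in the given Source."""
    return sorted(person for person, src in checked_pairs if src == source)
```

The same checked_pairs data serves both the second and third requirements: it hides finished work from the plan and can be listed on request to show which rabbit holes have already been explored.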

Implementation Plan

It should be possible to make a start on this by storing the Source meta-data in the key/value data of the Source. This will allow an initial proof-of-concept of the reports without touching the database schema.

For instance:

key             value
_r_:start_date  1841-06-06
_r_:end_date    1841-06-07
_r_:include     True
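Reading these keys back could be sketched as follows. The example assumes the Source's key/value data has already been copied into a plain dict of strings, which is a simplification and not how Gramps exposes it. The function name is hypothetical, and dates are assumed to be in ISO format as in the table above.

```python
from datetime import date
from typing import Any, Dict

PREFIX = "_r_:"  # key prefix taken from the example table


def parse_research_metadata(attributes: Dict[str, str]) -> Dict[str, Any]:
    """Extract the research meta-data from a Source's key/value pairs."""
    meta: Dict[str, Any] = {}
    for key, value in attributes.items():
        if not key.startswith(PREFIX):
            continue  # ignore unrelated Source data
        name = key[len(PREFIX):]
        if name in ("start_date", "end_date"):
            meta[name] = date.fromisoformat(value)
        elif name == "include":
            meta[name] = value.strip().lower() == "true"
        else:
            meta[name] = value  # keep unknown research keys as raw strings
    return meta
```

Keeping the parsing in one place would make it easier to move the same fields into the database schema later, once the proof-of-concept has settled which ones are needed.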

A more general version of probably_alive would be needed that can take a range and decide if the Person might be alive during that period.


Testing

Future Possibilities

  • Extract some information from archives into Gramps[1]

Related Work

RepositoriesReport is an example of what can be done at the moment. This GEP seeks to develop this idea further.

See also