Difference between revisions of "GEPS 009: Import Export Merge"

It would be helpful if substantial contributors to this page were to indicate who they are. Simplest is to append four "~"'s whilst in the editing mode! It would enable simpler direct communication between all interested parties. Of course, minor edits do not need to have the contributor's name, which is optional anyway. [[User:OldAl|Al]] 21:03, 29 August 2008 (EDT)

Revision as of 09:17, 5 November 2008

This page is for discussion of a proposed implementation of merging old and new data in GRAMPS, both during import and as an independent merge process. As this action is closely related to import and export, this section has been named "Import Export Merge".

Import Export Merge

Current State

Officially, GRAMPS import does not merge existing data with new data being imported. (The Spreadsheet/CSV import does do a type of merge, but let's leave that aside for the moment; it is discussed in a section of the Gramps Manual.) However, the standard GRAMPS import will duplicate some data (such as events, but not people) if you import a GEDCOM file twice. This proposal aims to fix this bug by letting the user merge imported data with existing data intelligently, either interactively or automatically.

This same process can be used to interactively merge two objects in GRAMPS by the user. For example, a user may realize that two person entries are really the same person, and so should be combined.

Current Related Files

  1. Import
    1. src/GrampsDbUtils/importdbdir.py
    2. src/GrampsDbUtils/gedcomimport.glade
  2. CSV Import
    1. src/plugins/ImportCSV.py
  3. Merging
    1. src/Merge/*.py
    2. src/plugins/merge.glade

Exporting

Currently, exporting to GEDCOM and CSV covers only part of the information. Though GEDCOM is the "standard" lingua franca of genealogy, it is inherently limited, particularly because of the various extant versions of GEDCOM. CSV has the advantage that it can be imported into any current spreadsheet, particularly OpenOffice.org. What are the limitations of CSV exports in Gramps?

CSV export/import is limited to the main objects in GRAMPS. It was not designed as a general purpose import/export but rather an alternative input/output tool.

Merging

This is not a trivial task, though probably not an impossible one. Because of the extent and complexity of the task, a separate thread is desirable. A reference to this thread is here. Some informed discussion is clearly desirable. Whilst there are a number of people with some knowledge of Python who are eager to contribute to a "real program", they first need to be able to find their way around the various files of Gramps. It would help to extract the names of the most pertinent files from the general list and collect them in a sublist here for orientation. Also, the aims of merging should first be defined in unambiguous terms.

Import can be divided into three cases:

Fresh Data Import

This is probably the simplest and safest option: delete the current Gramps database (archive it first!) and import all data.
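
The "archive first" step could be sketched as follows. This is only an illustration: `archive_database` is a hypothetical helper, not an existing Gramps function, and it assumes the database lives in a single directory on disk, which may not match how a given Gramps version stores its data.

```python
import shutil
import time
from pathlib import Path


def archive_database(db_dir, archive_root):
    """Copy db_dir into a timestamped folder under archive_root.

    Hypothetical helper: returns the path of the archived copy.
    """
    db_dir = Path(db_dir)
    stamp = time.strftime("%Y%m%d-%H%M%S")
    target = Path(archive_root) / f"{db_dir.name}-{stamp}"
    # copytree raises an error if the target already exists,
    # so an earlier archive is never silently overwritten.
    shutil.copytree(db_dir, target)
    return target
```

Only after the copy has succeeded would the original database be deleted and the fresh import started.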

Append Import

Simply append all import data to the existing data base. The editing task would be left to the user. This option should be relatively easy to implement.

Merge Import

Leave some editing of the data to the program. Whilst manual intervention by the user would inevitably be required, some of it could be achieved in the program.
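
One possible division of labour between the program and the user is sketched below. It is a toy model under stated assumptions, not a description of Gramps internals: records are plain dictionaries, the record keys (such as "I1") are assumed to identify the same object in both data sets, and `merge_import` is a hypothetical name. The program fills gaps by itself and reports every real disagreement for manual intervention.

```python
def merge_import(existing, incoming):
    """Merge incoming records into existing ones.

    Both arguments map a record key to a dict of fields.
    Returns (merged, conflicts); conflicts is a list of
    (key, field, old_value, new_value) tuples for the user to resolve.
    """
    merged = {key: dict(fields) for key, fields in existing.items()}
    conflicts = []
    for key, fields in incoming.items():
        if key not in merged:
            merged[key] = dict(fields)  # unknown record: plain append
            continue
        target = merged[key]
        for name, value in fields.items():
            old = target.get(name)
            if old in (None, ""):
                target[name] = value  # fill a gap automatically
            elif value not in (None, "") and value != old:
                # Both sides have a value and they differ: keep the
                # existing value and leave the decision to the user.
                conflicts.append((key, name, old, value))
    return merged, conflicts


existing = {"I1": {"name": "John Smith", "birth": ""}}
incoming = {
    "I1": {"name": "John Smyth", "birth": "1850"},
    "I2": {"name": "Mary Smith"},
}
merged, conflicts = merge_import(existing, incoming)
```

In this example the missing birth date of "I1" is filled in automatically, "I2" is appended, and the differing spellings of the name are reported as a conflict rather than overwritten.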

Merge Two Objects

This is a topic that was initially overlooked. For further information see merging pages.
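
As a rough illustration of what merging two person entries involves, here is a minimal sketch. The `Person` class and `merge_people` function are hypothetical and far simpler than the real Gramps objects; the rule that the primary entry's name wins while the other name is kept as an alternate is an assumption made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Person:
    name: str
    alternate_names: list = field(default_factory=list)
    events: list = field(default_factory=list)


def merge_people(primary, secondary):
    """Return a new Person combining both entries; primary's name wins."""
    alternates = list(primary.alternate_names)
    for name in [secondary.name, *secondary.alternate_names]:
        if name != primary.name and name not in alternates:
            alternates.append(name)  # keep the other spelling as an alternate
    events = list(primary.events)
    for event in secondary.events:
        if event not in events:  # skip exact duplicates
            events.append(event)
    return Person(primary.name, alternates, events)


a = Person("John Smith", events=[("Birth", "1850")])
b = Person("J. Smith", events=[("Birth", "1850"), ("Death", "1910")])
combined = merge_people(a, b)
```

Here `combined` keeps the name "John Smith", records "J. Smith" as an alternate name, and holds the birth event only once.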

Comments

The above text is a raw outline only. The writer is not really familiar with Gramps and offered to the Coordinator to open a wiki page only because everybody else seemed reluctant to do so. There is no doubt that this is just the bare bones of the task and a very small step towards a potential programming task, which can only proceed if there is input from other people who are interested in the topic and willing to discuss it in the wiki style. There is some hope that such a discussion may take place, as there has been a considerable exchange of thoughts and information on the developers' mailing list.

Julio patch set

Julio wrote custom merge code against the 2.2.x branch. The patch is available here: http://cage.ugent.be/~bm/downloads/merge.patch. This code must be reviewed for 3.1.x and integrated if needed.

UID, GUID and _UID, what is needed in GRAMPS?

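The discussion of UID fits with the merge problem. There are some unofficial standards for UID that we should perhaps follow:

* mail list discussion on ldsoss: http://lists.ldsoss.org/pipermail/ldsoss/2006-May/002298.html
* word document with the basics: http://www.familysearchdevnet.org/downloads/gedcom/FS-TT1001.doc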
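
To show why a UID helps with merging, here is a small sketch. It is not taken from either reference above: the `uid` field name, the 32 hex digit format, and the helper names are assumptions for illustration only. The idea is that an object receives a UID once, keeps it through export and re-import, and can then be recognised as "the same object" regardless of edits to its other fields.

```python
import uuid


def ensure_uid(record):
    """Give the record a UID if it has none; an existing UID is never changed."""
    if not record.get("uid"):
        # 32 hex digits from a random UUID; the exact format is an assumption.
        record["uid"] = uuid.uuid4().hex.upper()
    return record["uid"]


def match_by_uid(existing, incoming):
    """Split incoming records into (matched pairs, unmatched records) by UID."""
    by_uid = {rec["uid"]: rec for rec in existing if rec.get("uid")}
    pairs, unmatched = [], []
    for rec in incoming:
        old = by_uid.get(rec.get("uid"))
        if old is not None:
            pairs.append((old, rec))  # same object seen before: merge candidate
        else:
            unmatched.append(rec)  # no known UID: treat as new data
    return pairs, unmatched
```

Matched pairs would go through the merge step, while unmatched records would simply be appended.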