Gramps Performance

8,446 bytes added, 23:02, 17 November 2023
{{man tip|The advice on this page was for older versions of Gramps so may not work for you. Please update as needed.}}{{stub}}
Comparison of performance on large datasets between different Gramps versions.
==Performance tests==
It is important that Gramps performs well on datasets in the 10k to 30k people range. A good benchmark is to test Gramps on a dataset in the 100k range, and to keep track of performance with every new version.
Furthermore, this page can serve as proof to users that the present version of Gramps is '''not slow'''. From version 2.2.5 onwards, special attention will be given to performance, so that it does not deteriorate due to changes.
If you want to work with a large database, read [[Tips for large databases]].
==General setup==
To be fair, comparisons should be made on equal hardware and on the same datasets. The optimal representation may be chosen, so for Gramps, tests are done in the native database format, called GRDB, or in the Gramps XML format.
Should somebody want to publish results of commercial software under Windows, this is allowed, but it should be fair: same hardware and dataset, so test on a dual-boot machine, and use the internal format of the program.
=== Genealogical datasets ===
{{man warn|Warning|Private datasets will not be shared under any circumstances. <br><br>Free datasets are given under the following copyright: use for testing of genealogical programs only, no publication, no sharing. They have been created with free information on the net where the posting author explicitly stated their dataset may be re-distributed freely. <br><br> However, should you feel certain data is misplaced, or that the original posting author did not have the right to distribute the data, please contact us to remove any information as necessary.}}
'''FAQ'''
* ''My computer hangs upon open, eating memory?'' These are LARGE datasets, so do NOT open them directly. For Gramps, open them as follows: create a new Family Tree. Open it, go to the import menu, and import the dataset.
* ''What is tar.bz?'' This is a compression format. You must uncompress the file before importing it.
* ''Can you provide the GEDCOM?'' No. Offering a GEDCOM sample would tend to attract excessive traffic to this site that is not related to Gramps. If you must have GEDCOM, you could install Gramps, import the dataset, and then choose "Export to GEDCOM".
* ''What is in these files?'' See summary at the bottom of this page.
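
The uncompress step from the FAQ can also be scripted. The following is a minimal sketch using Python's standard <code>tarfile</code> module; the function name <code>extract_dataset</code> and the output directory are assumptions made for illustration, not part of Gramps.

```python
import tarfile
from pathlib import Path


def extract_dataset(archive_path, dest_dir):
    """Unpack a .tar.gz or .tar.bz2 archive and return the extracted file paths."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    # "r:*" lets tarfile detect gzip or bzip2 compression automatically.
    with tarfile.open(archive_path, "r:*") as archive:
        names = [member.name for member in archive.getmembers() if member.isfile()]
        archive.extractall(dest)
    return [dest / name for name in names]
```

For example, <code>extract_dataset("testdb120000.gramps.tar.gz", "unpacked")</code> should leave the Gramps XML file in the <code>unpacked</code> directory, ready to be imported into an empty Family Tree. Only unpack archives from a source you trust.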
 
{| {{prettytable}}
|-
!Test Code
!File Name
!Download size
!People
!Size (MB)
!License
|-
|<!-- Code -->[[#Summary_of_database_test_d01|d01]]
|<!-- File Name -->Doug's test GEDCOM
|<!-- Download size --> -
|<!-- People --> 100993
|<!-- Size (MB) -->32MB
|<!-- License -->Private
|-
|<!-- Code -->[[#Summary_of_database_test_d02|d02]]
|<!-- File Name --><strike>testdb80000</strike>
|<!-- Download size --> 11.2MB
|<!-- People --> 82688
|<!-- Size (MB) -->70MB
|<!-- License -->Testing only, no sharing, no publication <br>{{man menu|*** NOTE: THIS FILE IS MISSING. <br>IF ANYONE HAS A COPY, PLEASE CONTACT webmaster@gramps-project.org ***}}
|-
|<!-- Code -->[[#Summary_of_database_test_d03|d03]]
|<!-- File Name -->[http://www.gramps-project.org/files/stresstestdata/testdb120000.gramps.tar.gz testdb120000]
|<!-- Download size --> 14.8MB
|<!-- People --> 124032
|<!-- Size (MB) -->88MB
|<!-- License -->Testing only, no sharing, no publication
|-
|<!-- Code -->[[#Summary_of_database_test_d03|d03_alternate]]
|<!-- File Name -->[http://www.gramps-project.org/files/stresstestdata/test_2011-09-07.gramps.tar.bz2 test_2011-09-07.gramps]
|<!-- Download size --> 11.9MB
|<!-- People --> 124032
|<!-- Size (MB) -->88.4MB
|<!-- License -->Testing only, no sharing, no publication (d03 for Gramps 3.3.x)
|-
|<!-- Code -->[[#Summary_of_database_test_d04|d04]]
|<!-- File Name -->Jean-Raymond's test GEDCOM [http://forum.geneanet.org/index.php?topic=389170.0 French forum]
|<!-- Download size --> -
|<!-- People --> 52699
|<!-- Size (MB) -->13.6MB
|<!-- License -->Private
|-
|<!-- Code -->[[#Summary_of_database_test_d05|d05]]
|<!-- File Name -->[http://www.gramps-project.org/files/stresstestdata/places.gramps places.gramps]
|<!-- Download size --> 2.5MB
|<!-- People --> 65598 place objects
|<!-- Size (MB) -->15.3MB
|<!-- License -->Testing only, no sharing, no publication
|-
|<!-- Code -->[[#Summary_of_database_test_d05|d06]] (same as d05, but gramps42 format)
|<!-- File Name -->[[Media:Places-2.gramps.zip]]
|<!-- Download size --> 2.8MB
|<!-- People --> 65598 place objects (expanded)
|<!-- Size (MB) -->22MB
|<!-- License -->Testing only, no sharing, no publication
|-
|<!-- Test Code -->
|<!-- File Name -->
|<!-- Download size -->
|<!-- People -->
|<!-- Size (MB) -->
|<!-- License -->
|}
{| {{prettytable}}
|-
!Hardware Code
!Processor
!clock
!RAM
!Storage<!--Type eg: HDD or SSD-->
!OS
!User
|-
|H01 || Pentium 4 || 2.66 GHz || 512 MB || HDD || Linux || ?
|-
|H02 || ? || 1.7 GHz || 512 MB || HDD || Linux || ?
|-
|H03 || AMD Athlon64 X2 || 2x2.1 GHz || 1 GB || HDD || Kubuntu 6.06 || ?
|-
|H04 || Intel Centrino Duo || 2x1.66 GHz || 2 GB || HDD || Ubuntu 9.04 || [[User:Duncan]]
|-
|H05 || Intel Centrino Duo || 2x1.66 GHz || 2 GB || HDD || Ubuntu 8.10 || [[User:Duncan]]
|-
|H06 || AMD Phenom 9500 || Quad Core 2.2 GHz || 3 GB || HDD || Windows Vista || Jean-Raymond Floquet
|-
|H07 || Pentium 4 || 2.80 GHz || 512 MB * || HDD || Ubuntu 9.04 || [[User:Romjerome]]
|-
|H08 || Intel Celeron Dual Core || 2.60 GHz || 2 GB || HDD || Ubuntu 10.04 || [[User:Romjerome]]
|-
|H09 || Intel i5-2520M || 2.50 GHz || 8 GB || SSD || Ubuntu 14.04.3 || [[User:Sam888]]
|}
(*) + 80MB of swap used on import
=== Tests table legend ===
{| {{prettytable}}
|-
!Test Code !! Test Description
|-
|T01 || Time to import GEDCOM/GRAMPS in empty native file format (GRDB)
|-
|T01_a || Time to import GEDCOM/GRAMPS XML in empty native file format (BSDDB)
|-
|T02 || Size native file format (GRDB)
|-
|T03 || Time to open native file format (GRDB) for clean/non-clean start on people view (*)
|-
|T04 || Time to open edit person dialog
|-
|T07 || Sort on date in event view
|-
|T08 || Overall editing responsiveness
|}
(*) A clean start means the computer was restarted (so the Python modules must also be loaded and started). Non-clean means you have opened Gramps with the .grdb file before and open it again: parts will still be in memory, as will Python, so access will be faster.
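
As a rough illustration of how the clean/non-clean pairs in the tables can be recorded, here is a minimal timing sketch. The <code>action</code> callable is a hypothetical stand-in for whatever operation is being measured. Note that a first run inside an already running process only approximates a true clean start, which requires a reboot.

```python
import time


def time_call(action, repeats=2):
    """Run `action` several times and return the elapsed seconds of each run."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        action()
        timings.append(time.perf_counter() - start)
    return timings


def format_clean_nonclean(timings):
    """Format the first (cold) and last (warm) run as 'clean/non-clean' seconds."""
    return f"{timings[0]:.0f}s/{timings[-1]:.0f}s"
```

The first element of the returned list corresponds to the colder run, the last one to a warm run where data is already cached in memory.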
=== Performance results ===
{{man warn|General remark|Tests are done with the Gramps preference '''transactions enabled''', unless indicated otherwise with '''notrans'''. Enabling transactions gives a performance boost. ''For safety: only change this setting on an empty database -- you are warned!''}}
{| {{prettytable}}
|-
!Hardware Code !! Gramps !! data !! T01 !! T02
|-
|H03 ||bgcolor="#ffa0a0"| 2.2.4 notrans || d01 (xml)||bgcolor="#ffa0a0"| 2h || 542.6MB (v11)
|-
|H03 ||bgcolor="#a0ffa0"| 2.2.4 || d01 (xml)||bgcolor="#a0ffa0"| 24 min || 544.5MB
|-
|H03 ||bgcolor="#a0ffa0"| 2.2.4 || d02 (xml)||bgcolor="#a0ffa0"| 20 min || 323MB
|-
|H03 ||bgcolor="#a0ffa0"| 2.2.4 || d03 (xml)||bgcolor="#a0ffa0"| 25 min || 527MB
|-
|H03 ||bgcolor="#a0ffa0"| 2.2.6 || d02 (xml)||bgcolor="#a0ffa0"| 15min || 332MB
|-
|H03 ||bgcolor="#a0ffa0"| 2.2.6 || d03 (xml)||bgcolor="#a0ffa0"| 23min || 528MB (v12)
|-
|H04 ||bgcolor="#ffa0a0"| 2.2.10 (trans?)|| d03 (xml)||bgcolor="#ffa0a0"| 1h:56min || ?
|-
|H05 ||bgcolor="#ffa0a0"| 3.0.4 || d03 (xml) ||bgcolor="#ffa0a0"| 1h:56min || ?
|-
|H06 ||bgcolor="#a0ffa0"| 3.1.2|| d04 (gedcom) ||bgcolor="#a0ffa0"| 8min || 937MB
|-
|H07 ||bgcolor="#a0a0a0"| 3.1.90 - 2009-7-20 (trans?)|| d03 (xml)||bgcolor="#a0a0a0"| 2h:44min || 2GB *
|-
|H08 ||bgcolor="#ffa0a0"| 3.3.0 (+ DB upgrade v13 ? + v14 + v15)|| d03 (xml)||bgcolor="#ffa0a0"| 1h:47min || 547MB (v15)
|-
|H08 ||bgcolor="#ffa0a0"| 3.3.0 || d03_alternate (xml)||bgcolor="#ffa0a0"| 1h:46min!|| 543MB (v15)
|}
 
(*) 1520MB log files - 480MB tables
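
The import times in the table are written in mixed notations ("24 min", "15min", "1h:56min"). To compare them, they can be normalised to seconds and turned into an import rate. The sketch below is only an illustration; the helper names <code>to_seconds</code> and <code>people_per_minute</code> are made up for this page. Entries without any unit, such as "?", raise an error, and a bare trailing number without a unit is ignored.

```python
import re

# Seconds per unit; "min" must be tried before "m" in the pattern below.
_UNIT_SECONDS = {"h": 3600, "min": 60, "m": 60, "s": 1}
_PART = re.compile(r"(\d+)\s*(h|min|m|s)")


def to_seconds(text):
    """Convert durations such as '1h:56min', '24 min' or '2m37s' to seconds."""
    parts = _PART.findall(text)
    if not parts:
        raise ValueError(f"no duration found in {text!r}")
    return sum(int(value) * _UNIT_SECONDS[unit] for value, unit in parts)


def people_per_minute(people, duration):
    """Import throughput: number of people divided by the import time in minutes."""
    return people / (to_seconds(duration) / 60)
```

For dataset d03 (124032 people), <code>people_per_minute(124032, "23min")</code> gives roughly 5,400 people per minute, while <code>people_per_minute(124032, "1h:47min")</code> gives roughly 1,160. Keep in mind that these two results were measured on different hardware, so the rates are not directly comparable.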
{| {{prettytable}}
|-
!Hardware Code !! data !! Gramps !! T03 !! T04 !! T05 !! T06 !! T07 !! T08 !! result
|-
|H02 || d01 ||bgcolor="#ffa0a0"| 2.2.4 || T03 = 4m17s || T04 = ? || T05 = ?/? || T06 = ? || T07 = ? || T08 = ||bgcolor="#ffa0a0"|
|-
|H03 || d03 ||bgcolor="#ffa0a0"| 2.2.4 || T03 = 2m37s/4m3s || T04 = 3s|| T05 = 43s/23s || T06 = 1m23s/12s || T07 = 20s || T08 = ||bgcolor="#ffa0a0"| very bad
|-
|H03 || d01 ||bgcolor="#ffa0a0"| 2.2.4|| T03 = 2m22s/2m || T04 = 3s|| T05 = 33s || T06 = 1m9s/10s || T07 = 18s || T08 = ||bgcolor="#ffa0a0"| very bad
|-
|H02 || d01 || 2.2.5 || T03 = 12s || T04 = ? || T05 = ?/? || T06 = ? || T07 = ? || T08 = ||
|-
|H03 || d03 ||bgcolor="#a0ffa0"| 2.2.6 || T03 = /17s || T04 = 1s || T05 = 20s/18s ||T06 = ?/9s || T07 = 21s || T08 = ||bgcolor="#a0ffa0"| Excellent
|-
|H03 || d02 ||bgcolor="#a0ffa0"| 2.2.6 || T03 = ?/24s || T04 = 1s || T05 = 17s/13s || T06 = ?/11s || T07 = 17s || T08 = ||bgcolor="#a0ffa0"| Excellent
|-
|H05 || d03 ||bgcolor="#e0ffe0"| 2.2.10 || T03 = 1m15s/16s || T04 = 1s || T05 = 16s/13s || T06 = 11s/1s || T07 = 26s || T08 = || bgcolor="#e0ffe0"|good after loading each view once
|-
|H06 || d04 ||bgcolor="#e0ffe0"| 3.1.2 || T03 = 1m30/? || T04 = 10s || T05 = ?/? || T06 = ? ||T07 = 19s || T08 = 11s ||bgcolor="#e0ffe0"| not bad
|-
|H07 || d03 ||bgcolor="#a0a0a0"| 3.1.90 2009-7-20 || [http://www.gramps-project.org/bugs/view.php?id=2686 Cannot allocate memory (also python-2.6)] || T04 = \ || T05 = \ || T06 = \ || T07 = \ || T08 = ||bgcolor="#a0a0a0"| size limitation on 3.0.x, 3.1.x and trunk ...
|-
|H08 || d03 ||bgcolor="#a0ffa0"| 3.3.0 || T03 = 16s (flat) / 30s (tree) || T04 = 1s || T05 = 1s/1s || T06 = 15s || T07 = 30s || T08 = 1s ||bgcolor="#a0ffa0"| Excellent
|-
|? || db || version || T03 = ?/? || T04 = ? || T05 = ?/? || T06 = ? || T07 = ? || T08 = || description
|}
== Dataset summaries ==
For every test dataset, create a summary with the [[Gramps_4.2_Wiki_Manual_-_Reports_-_part_6#Database_Summary_Report|Database Summary Report]]:
=== Database Summary Reports ===
==== Summary of database test d01 ====
Number of individuals: 100993
Males: 53046
Unique surnames: 15308
==== Summary of database test d02 ====
Number of individuals: 82688
Males: 44736
Unique surnames: 13957
==== Summary of database test d03 ====
Number of individuals: 124032
Males: 67104
Number of families: 48384
Unique surnames: 20695
 
==== Summary of database test d04 ====
Number of individuals: 52699
Males: 26420
Females: 26279
Individuals with incomplete names: 2
Individuals missing birth dates: 16427
Disconnected individuals: 0
Number of families: 24604
Unique surnames: 5822
 
==== Summary of database test d05 ====
Number of individuals: 2132
Number of families: 749
Number of events: 4981
Number of places: '''65598'''
Number of sources: 9
Number of media paths: 7
Number of repositories: 5
Number of notes: 1509
 
== User Stories ==
Running the tests can be slow, so here are some user testimonies about Gramps performance.
=== Robert 2012-10, version 3.3.1 ===
I work with a database of 141,000+ names currently without difficulty
(Gramps 3.3.1-1 on Fedora 16).
Initial start is fairly slow though.
First time to load each view is slow, but subsequent visits to views are
almost immediate.
Initial view load times:
* people 11 to 12 secs
* relationship abt 7 secs
* family 3 to 4 secs
* events 7 to 8 secs
* places 3 to 4 secs
* notes 11 to 12 secs
* ancestry view abt 1 sec or less
* Media abt 2 secs (although I only have about 1000 media in database)
* Repositories almost immediate
* sources about 1 sec - (time selecting a source varies according to the number of references for that source - my worst case is a civil registry which has about twice as many references as people in my database).
 
=== JohnBoyTheGreat 2019-12, version 5.1.1 ===
Import tested with the [https://mhss.sk.ca/FH/GRANDMA.shtml GRANDMA Mennonite database] of 1.4 million people, as reported by a user on Reddit: https://www.reddit.com/r/gramps/comments/dzevcl/database_size_limit_for_gramps/fb6hdbj/
 
''
FOLLOWUP...
 
It's been a few weeks since I asked whether anyone had attempted to use GRAMPS with the GRANDMA Mennonite database of 1.4 million people.
 
Based upon the suggestion above, I tried to load the Catalog of Life database...it took several days and it seemed to be working, but it eventually locked up GRAMPS. However, it seemed to be working. But, since I was only loading the Catalog of Life to test it, I decided not to waste time trying again.
 
In the meantime, I had ordered the GRANDMA database, but the organization selling it somehow gave me a bad download code and I couldn't reach them for several days (holiday and weekend). It was more than a week later before I was able to try loading the GRANDMA database into GRAMPS.
 
The result was SUCCESS!
 
It took about three days to load the GRANDMA database into GRAMPS, after giving GRAMPS a realtime priority in Windows. Setting GRAMPS to realtime sped up the loading considerably. My computer is rather fast compared to many, so anyone who wants to do the same thing should consider that it could take a week or longer to load up a huge database like GRANDMA.
 
At this point I have the GRANDMA database loaded into GRAMPS and running okay. It's really slow to switch to various functions, but it works okay once you get to each part of the GRAMPS program. One difficulty I ran into was scrolling through the 1.4 million records. It moves through the records quickly, but there are so many that you can't just use your cursor to pull to the surname you want to explore. Instead, I have to move the slider to a surname as close as possible, then scroll repeatedly until I find it. That can be 10-20 spins of the mouse wheel, so the process can be exhausting when you are looking for many different names.
 
My next step is going to be to copy the individual profiles which I need to a second family tree database that will be much, much smaller.
 
CONCLUSION: GRAMPS can handle 1.4 million records in a database. It's slow and takes several days to load, but it works.
''
== Possible Future Optimizations ==
One can fine-tune some things to obtain better results. An overview:

See if Gramps can pass this:
* [http://www.tamurajones.net/TheConfuciusChallenge.xhtml The Confucius Challenge]
** [http://www.tamurajones.net/ConfuciusCascade.xhtml Confucius Cascade] a real-world test consisting of increasingly gigantic GEDCOMs, tough time limits.
** [http://www.tamurajones.net/ConfuciusCup2008.xhtml Confucius Cup 2008]
** [http://www.tamurajones.net/TwoHugeGEDCOMFiles.xhtml Two Huge GEDCOM Files]
** [http://www.tamurajones.net/GedFan0.4.0.0.xhtml GedFan] - creates GEDCOM files, so-called fan files, which are used to test genealogy applications, and thus determine the capacity of those applications, expressed as a fan value.
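
For a quick local stress test without downloading anything, a synthetic GEDCOM can be generated in a similar spirit. The sketch below is '''not''' GedFan and does not claim to reproduce its output: it assumes that a useful test file is simply a complete ancestor tree of a given number of generations, uses made-up names, and writes only a minimal header that strict validators may reject.

```python
def fan_gedcom(generations):
    """Return GEDCOM text for a complete ancestor tree with `generations` levels."""
    if generations < 1:
        raise ValueError("generations must be at least 1")
    total = 2 ** generations - 1  # individuals, numbered 1..total (Ahnentafel)
    lines = [
        "0 HEAD",
        "1 SOUR FANSKETCH",  # hypothetical source identifier
        "1 GEDC",
        "2 VERS 5.5.1",
        "2 FORM LINEAGE-LINKED",
        "1 CHAR UTF-8",
    ]
    for n in range(1, total + 1):
        lines.append(f"0 @I{n}@ INDI")
        lines.append(f"1 NAME Person{n} /Fan/")
        # Person 1 is arbitrarily male; after that, even numbers are fathers.
        lines.append("1 SEX M" if n == 1 or n % 2 == 0 else "1 SEX F")
        if 2 * n + 1 <= total:
            lines.append(f"1 FAMC @F{n}@")  # family in which n is the child
        if n > 1:
            lines.append(f"1 FAMS @F{n // 2}@")  # family in which n is a parent
    for n in range(1, total // 2 + 1):
        lines += [
            f"0 @F{n}@ FAM",
            f"1 HUSB @I{2 * n}@",
            f"1 WIFE @I{2 * n + 1}@",
            f"1 CHIL @I{n}@",
        ]
    lines.append("0 TRLR")
    return "\n".join(lines) + "\n"
```

Each extra generation roughly doubles the file: <code>fan_gedcom(17)</code> describes 131071 individuals, which is in the same range as dataset d03. Write the returned text to a <code>.ged</code> file and import it into an empty Family Tree.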
 
==See also==
* [[GEPS 016: Enhancing Gramps Processing Speed]]
* [[Plugins_Command_Line#Generate_Testcases_for_Persons_and_Families|testcasegenerator]] Command Line option
[[Category:Developers/General]]
[[Category:Documentation|Performance]]