Comparison of performance on large datasets between different Gramps versions
It is important that Gramps performs well on datasets in the 10k to 30k range. A good benchmark is to test Gramps on a 100k range dataset, and keep track of performance with every new version.
Furthermore, this page can serve as proof to users that the present version of Gramps is not slow. From version 2.2.5 onwards, special attention will be given to performance, so that it does not deteriorate due to changes.
If you want to work with a large database, read Tips for large databases.
To be fair, comparisons should be made on identical hardware and with the same datasets. The optimal representation may be chosen, so Gramps is tested with its native database format, the GRAMPS GRDB format, or with the GRAMPS XML format.
Should somebody want to publish results of commercial software under Windows, this is allowed, but it should be fair: use the same hardware and dataset (so test on a dual-boot machine) and the internal format of that program.
A table with datasets is given below. Pay attention to the copyright.
The second table lists hardware configurations. Add your machine to this list if you run some tests and want to add them to this article.
The third table gives the test results, which are somewhat subjective. Please do not run other software while doing the tests.
The Test Results
- My computer hangs upon open, eating memory? These are LARGE datasets, so do NOT open them directly. In Gramps, open them as follows: create a new Family Tree, open it, then go to the Import menu and import the dataset (a command-line sketch is given after this list).
- What is tar.bz? It is a compressed archive format; you must uncompress the file before importing it.
- Can you provide the GEDCOM? No. Offering GEDCOM samples would tend to attract excessive traffic to this site not related to Gramps. If you must have GEDCOM, you can install Gramps, import the dataset, and then choose "Export to GEDCOM".
- What is in these files? See summary at the bottom of this page.
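Where scripting the setup is more convenient than the GUI, the steps above can also be driven from the command line. The sketch below is a minimal example, assuming the gramps CLI is on the PATH and supports the -C (create family tree), -i (import), -O (open) and -e (export) options; the archive and file names are placeholders.

```python
import subprocess
import tarfile

# 1. Uncompress the downloaded archive first (here assumed to be a .tar.bz2).
with tarfile.open("large-dataset.tar.bz2", "r:bz2") as archive:
    archive.extractall("dataset")

# 2. Create a new, empty family tree and import the dataset into it,
#    instead of opening the file directly.
subprocess.run(
    ["gramps", "-C", "Benchmark", "-i", "dataset/large-dataset.gramps"],
    check=True,
)

# 3. Optionally re-export the tree as GEDCOM (see the FAQ entry above).
subprocess.run(
    ["gramps", "-O", "Benchmark", "-e", "benchmark.ged"],
    check=True,
)
```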
|Test Code||File Name||Download size||People||Size(MB)||License|
|d01||Doug's test GEDCOM||-||100993||32MB||Private|
|d02||(file name unknown)||11.2MB||82688||70MB||Testing only, no sharing, no publication|
*** NOTE: THIS FILE IS MISSING. IF ANYONE HAS A COPY, PLEASE CONTACT [email protected] ***
|d03||testdb120000||14.8MB||124032||88 MB||Testing only, no sharing, no publication|
|d03_alternate||test_2011-09-07.gramps||11.9MB||124032||88.4MB||Testing only, no sharing, no publication (d03 for Gramps 3.3.x)|
|d04||Jean-Raymond's test GEDCOM (French forum)||-||52699||13.6MB||Private|
|d05||places.gramps||2.5MB||65598 place objects||15.3MB||Testing only, no sharing, no publication|
|d06 (same as d05, but gramps42 format)||Media:Places-2.gramps.zip||2.8MB||65598 place objects (expanded)||22MB||Testing only, no sharing, no publication|
|Hardware Code||CPU||Clock Speed||RAM||Storage||OS||Tester|
|H01||Pentium 4||2.66 GHz||512 MB||HDD||Linux||?|
|H02||?||1.7 GHz||512 MB||HDD||Linux||?|
|H03||AMD Athlon64 X2||2x2.1 GHz||1 GB||HDD||Kubuntu 6.06||?|
|H04||Intel Centrino Duo||2x1.66 GHz||2 GB||HDD||Ubuntu 9.04||User:Duncan|
|H05||Intel Centrino Duo||2x1.66 GHz||2 GB||HDD||Ubuntu 8.10||User:Duncan|
|H06||AMD Phenom 9500||Quad Core 2.2 GHz||3GB||HDD||Windows Vista||Jean-Raymond Floquet|
|H07||Intel Pentium 4||2.80 GHz||512 MB *||HDD||Ubuntu 9.04||User:Romjerome|
|H08||Intel Celeron Dual Core||2.60 GHz||2 GB||HDD||Ubuntu 10.04||User:Romjerome|
|H09||Intel i5-2520M||2.50 GHz||8 GB||SSD||Ubuntu 14.04.3||User:Sam888|
(*) + 80MB of swap used on import
Tests table legend
|Test Code||Test Description|
|T01||Time to import GEDCOM/GRAMPS in empty native file format (GRDB)|
|T01_a||Time to import GEDCOM/GRAMPS XML in empty native file format (BSDDB)|
|T02||Size native file format (GRDB)|
|T03||Time to open native file format (GRDB) for clean/non-clean start on people view (*)|
|T04||Time to open edit person dialog|
|T05||Time to delete/undelete person|
|T06||Open event view clean/after T03 (*)|
|T07||Sort on date in event view|
|T08||Overall editing responsiveness|
(*) A clean start means the computer was restarted (so Python modules must also be loaded and started). Non-clean means you have opened Gramps with the .grdb file before and are opening it again: parts will still be in memory, including Python itself, so access will be faster.
Tests are done with transactions enabled in the Gramps preferences, unless indicated otherwise with "notrans". This gives a performance boost. For safety, only change this setting on an empty database -- you have been warned!
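For T01 and T02 the measurements can be scripted rather than timed by hand. The following is a rough sketch, not an official harness: it assumes the gramps CLI (with -C, -i and -y) and the default family-tree directory ~/.gramps/grampsdb; the tree name and dataset path are placeholders, and runs should be repeated on an otherwise idle machine.

```python
import pathlib
import subprocess
import time

DATASET = "dataset/large-dataset.gramps"                  # placeholder path
TREE = "Benchmark-T01"                                    # placeholder name
GRAMPSDB = pathlib.Path.home() / ".gramps" / "grampsdb"   # assumed default location

# T01: wall-clock time for a non-interactive import into a fresh tree.
start = time.perf_counter()
subprocess.run(["gramps", "-C", TREE, "-i", DATASET, "-y"], check=True)
print(f"T01 (import time): {time.perf_counter() - start:.1f} s")

# T02: on-disk size of the most recently modified tree directory.
newest = max((p for p in GRAMPSDB.iterdir() if p.is_dir()),
             key=lambda p: p.stat().st_mtime)
size_mb = sum(f.stat().st_size for f in newest.rglob("*") if f.is_file()) / 2**20
print(f"T02 (database size): {size_mb:.0f} MB")
```

T03 to T08 involve opening views and dialogs in the GUI, so they are still best timed by hand.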
|Hardware||Gramps Version||Dataset||Import Time||Database Size|
|H03||2.2.4 notrans||d01 (xml)||2h||542.6MB (v11)|
|H03||2.2.4||d01 (xml)||24 min||544.5MB|
|H03||2.2.4||d02 (xml)||20 min||323MB|
|H03||2.2.4||d03 (xml)||25 min||527MB|
|H03||2.2.6||d03 (xml)||23min||528MB (v12)|
|H04||2.2.10 (trans?)||d03 (xml)||1h:56min||?|
|H07||3.1.90 - 2009-7-20 (trans?)||d03 (xml)||2h:44min||2GB *|
|H08||3.3.0 (+ DB upgrade v13 ? + v14 + v15)||d03 (xml)||1h:47min||547MB (v15)|
|H08||3.3.0||d03_alternate (xml)||1h:46min!||543MB (v15)|
(*) 1520MB log files - 480MB tables
|Hardware||Dataset||Version||T03||T04||T05||T06||T07||T08||Comments|
|H02||d01||2.2.4||T03 = 4m17s||T04 = ?||T05 = ?/?||T06 = ?||T07 = ?||T08 =|
|H03||d03||2.2.4||T03 = 2m37s/4m3s||T04 = 3s||T05 = 43s/23s||T06 = 1m23s/12s||T07 = 20s||T08 =||very bad|
|H03||d01||2.2.4||T03 = 2m22s/2m||T04 = 3s||T05 = 33s||T06 = 1m9s/10s||T07 = 18s||T08 =||very bad|
|H02||d01||2.2.5||T03 = 12s||T04 = ?||T05 = ?/?||T06 = ?||T07 = ?||T08 =|
|H03||d03||2.2.6||T03 = /17s||T04 = 1s||T05 = 20s/18s||T06 = ?/9s||T07 = 21s||T08 =||Excellent|
|H03||d02||2.2.6||T03 = ?/24s||T04 = 1s||T05 = 17s/13s||T06 = ?/11s||T07 = 17s||T08 =||Excellent|
|H05||d03||2.2.10||T03 = 1m15s/16s||T04 = 1s||T05 = 16s/13s||T06 = 11s/1s||T07 = 26s||T08 =||good after loading each view once|
|H06||d04||3.1.2||T03 = 1m30/?||T04 = 10s||T05 = ?/?||T06 = ?||T07 = 19s||T08 = 11s||not bad|
|H07||d03||3.1.90 2009-7-20||Cannot allocate memory (also python-2.6)||T04 = \||T05 = \||T06 = \||T07 = \||T08 =||size limitation on 3.0.x, 3.1.x and trunk ...|
|H08||d03||3.3.0||T03 = 16s (flat) / 30s (tree)||T04 = 1s||T05 = 1s/1s||T06 = 15s||T07 = 30s||T08 = 1s||Excellent|
|?||db||version||T03 = ?/?||T04 = ?||T05 = ?/?||T06 = ?||T07 = ?||T08 =||description| (template row: copy and fill in to add your own results)
For every test dataset, create a Database Summary Report:
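The report can also be generated non-interactively. The sketch below is an assumption-laden example: it presumes the gramps CLI report action, that the Database Summary Report is registered under the name "summary", and that the off/of option keys select the output format and file; running the action without -p should list the report names your version actually provides.

```python
import subprocess

# Generate the Database Summary Report for an already imported tree.
# (Tree name, report name and option keys are assumptions, see above.)
subprocess.run(
    ["gramps", "-O", "Benchmark", "-a", "report",
     "-p", "name=summary,off=txt,of=summary_benchmark.txt"],
    check=True,
)
```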
Database Summary Reports
Summary of database test d01
Number of individuals: 100993; Males: 53046; Females: 47947; Individuals with incomplete names: 324; Individuals missing birth dates: 42726; Disconnected individuals: 19; Number of families: 36554; Unique surnames: 15308
Summary of database test d02
Number of individuals: 82688; Males: 44736; Females: 37952; Individuals with incomplete names: 17120; Individuals missing birth dates: 31528; Disconnected individuals: 880; Number of families: 32256; Unique surnames: 13957
Summary of database test d03
Number of individuals: 124032; Males: 67104; Females: 56928; Individuals with incomplete names: 25680; Individuals missing birth dates: 47292; Disconnected individuals: 1320; Number of families: 48384; Unique surnames: 20695
Summary of database test d04
Number of individuals: 52699; Males: 26420; Females: 26279; Individuals with incomplete names: 2; Individuals missing birth dates: 16427; Disconnected individuals: 0; Number of families: 24604; Unique surnames: 5822
Summary of database test d05
Number of individuals: 2132; Number of families: 749; Number of events: 4981; Number of places: 65598; Number of sources: 9; Number of media paths: 7; Number of repositories: 5; Number of notes: 1509
Running the tests can be slow, so here are some user testimonies about Gramps performance.
Robert 2012-10, version 3.3.1
I work with a database of 141,000+ names currently without difficulty (Gramps 3.3.1-1 on Fedora 16). Initial start-up is fairly slow though. The first load of each view is slow, but subsequent visits to views are almost immediate. Initial view load times:
- people 11 to 12 secs
- relationship about 7 secs
- family 3 to 4 secs
- events 7 to 8 secs
- places 3 to 4 secs
- notes 11 to 12 secs
- ancestry view about 1 sec or less
- Media about 2 secs (although I only have about 1000 media objects in the database)
- Repositories almost immediate
- sources about 1 sec (time selecting a source varies according to the number of references for that source; my worst case is a civil registry which has about twice as many references as there are people in my database)
JohnBoyTheGreat 2019-12, version 5.1.1
Import tested with the GRANDMA Mennonite database of 1.4 million people, as reported by a user on Reddit: https://www.reddit.com/r/gramps/comments/dzevcl/database_size_limit_for_gramps/fb6hdbj/
It's been a few weeks since I asked whether anyone had attempted to use GRAMPS with the GRANDMA Mennonite database of 1.4 million people.
Based upon the suggestion above, I tried to load the Catalog of Life database. It took several days and seemed to be working, but it eventually locked up GRAMPS. Since I was only loading the Catalog of Life to test it, I decided not to waste time trying again.
In the meantime, I had ordered the GRANDMA database, but the organization selling it somehow gave me a bad download code and I couldn't reach them for several days (holiday and weekend). It was more than a week later before I was able to try loading the GRANDMA database into GRAMPS.
The result was SUCCESS!
It took about three days to load the GRANDMA database into GRAMPS, after giving GRAMPS a realtime priority in Windows. Setting GRAMPS to realtime sped up the loading considerably. My computer is rather fast compared to many, so anyone who wants to do the same thing should consider that it could take a week or longer to load up a huge database like GRANDMA.
At this point I have the GRANDMA database loaded into GRAMPS and running okay. It's really slow to switch to various functions, but it works okay once you get to each part of the GRAMPS program. One difficulty I ran into was scrolling through the 1.4 million records. It moves through the records quickly, but there are so many that you can't just use your cursor to pull to the surname you want to explore. Instead, I have to move the slider to a surname as close as possible, then scroll repeatedly until I find it. That can be 10-20 spins of the mouse wheel, so the process can be exhausting when you are looking for many different names.
My next step is going to be to copy the individual profiles which I need to a second family tree database that will be much, much smaller.
CONCLUSION: GRAMPS can handle 1.4 million records in a database. It's slow and takes several days to load, but it works.
Possible Future Optimizations
One can fine-tune some things to obtain better results. An overview:
See if Gramps can pass these:
- The Confucius Challenge
- Confucius Cascade - a real-world test consisting of increasingly gigantic GEDCOMs and tough time limits.
- Confucius Cup 2008
- Two Huge GEDCOM Files
- GedFan - creates GEDCOM files, so-called fan files, which are used to test genealogy applications and thus determine the capacity of those applications, expressed as a fan value (a minimal sketch of a similar generator is given below).
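To illustrate the idea (this is not GedFan itself), the sketch below writes a minimal GEDCOM 5.5.1 file containing a complete ancestor tree of a chosen number of generations, so a fan value of f yields 2**f - 1 individuals; the names and numbering conventions here are invented and may differ from GedFan's.

```python
# Rough sketch of a GedFan-style "fan file" generator: a complete ancestor
# tree of `generations` generations (heap-style numbering: person i has
# parents 2*i and 2*i+1), written as a minimal GEDCOM 5.5.1 file.

def fan_gedcom(generations: int) -> str:
    lines = [
        "0 HEAD",
        "1 SOUR fan_sketch",
        "1 GEDC",
        "2 VERS 5.5.1",
        "2 FORM LINEAGE-LINKED",
        "1 CHAR UTF-8",
    ]
    n_people = 2 ** generations - 1        # complete binary ancestor tree
    fams = []                              # (father_id, mother_id, child_id)

    for i in range(1, n_people + 1):
        lines += [
            f"0 @I{i}@ INDI",
            f"1 NAME Person /Fan{i}/",
            f"1 SEX {'M' if i % 2 == 0 else 'F'}",  # fathers are even ids
        ]
        father, mother = 2 * i, 2 * i + 1
        if mother <= n_people:
            fams.append((father, mother, i))
            lines.append(f"1 FAMC @F{i}@")  # child in the family keyed by i
        if i > 1:
            lines.append(f"1 FAMS @F{i // 2}@")  # spouse in their child's family

    for father, mother, child in fams:
        lines += [
            f"0 @F{child}@ FAM",
            f"1 HUSB @I{father}@",
            f"1 WIFE @I{mother}@",
            f"1 CHIL @I{child}@",
        ]

    lines.append("0 TRLR")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Fan value 15 -> 32767 individuals; adjust to probe larger sizes.
    with open("fan15.ged", "w", encoding="utf-8") as fh:
        fh.write(fan_gedcom(15))
```

The resulting file can then be imported with the command-line sketch shown earlier, to see at which fan value import times become impractical.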