Difference between revisions of "Addon:Lxml Gramplet"

m
m (Link POSIX to Glossary)
(40 intermediate revisions by 3 users not shown)
Line 1: Line 1:
''lxml gramplet'' is an experimental [[Gramplets|gramplet]] working under POSIX platform(s), which reads, writes (''not the original one; safe read only state''), transforms our [[GRAMPS_XML|Gramps XML]] file on the fly without an import into our database (Gramps session).
+
{{Third-party plugin}}
  
==Dependencies and file format==
+
{{man label|lxml Gramplet}} is an experimental [[Gramplets|gramplet]] working under [[Gramps_Glossary#posix|POSIX]] platform(s), which reads, writes (''not the original one; safe read only state''), transforms content of our [[Gramps XML]] file on the fly without an import into our database (Gramps session).
  
* [http://lxml.de/ lxml] is a Pythonic binding for the C libraries [http://xmlsoft.org/ libxml2] and [http://xmlsoft.org/XSLT/ libxslt]. It is known for good performances by using C-level ([http://www.cython.org/ Cython]).
+
Includes the {{man label|etree Gramplet}} for testing the ''Python ElementTree module''(etree) with Gramps XML.
* [[GRAMPS_XML|Gramps XML]] file format is robust and well [[GRAMPS_XML#Gramps_XML_Resources|documented]].
 
  
 
==Goals==
 
==Goals==
Line 10: Line 9:
 
The idea of this experimental '''lxml gramplet''' is to provide a way for using basic lxml features with Gramps XML files.  
 
The idea of this experimental '''lxml gramplet''' is to provide a way for using basic lxml features with Gramps XML files.  
  
''XPath'', ''Xslt'', ''RelaxNG validation'', can be used and done by lxml, which provides an [http://lxml.de/compatibility.html API very close] to [http://docs.python.org/library/xml.etree.elementtree.html etree ElementTree module] from python 2.5 and later.
+
''XPath'', ''Xslt'', ''XML dump'', ''RelaxNG and XSD validations'', can be used and done by lxml, which provides an [http://lxml.de/compatibility.html API very close] to [http://docs.python.org/3/library/xml.etree.elementtree.html etree ElementTree module] from python 2.5 and later.
  
The experimental '''lxml gramplet''' aims to use these lxml features[1] by parsing a Gramps XML file generated by Gramps 3.3.x and to generate an output sample, using ''open'' [http://www.w3.org/ W3C] standards ([http://www.w3.org/standards/xml/ XML], [http://www.w3.org/standards/webdesign/ Web design], [http://www.w3.org/standards/webofservices/ Web services], etc ...).
+
The experimental '''lxml gramplet''' aims to use these lxml features[1] by parsing a Gramps XML file generated by Gramps 3.4.x (or 3.3.x) and to generate an output sample, using ''open'' [http://www.w3.org/ W3C] standards ([http://www.w3.org/standards/xml/ XML], [http://www.w3.org/standards/webdesign/ Web design], [http://www.w3.org/standards/webofservices/ Web services], etc ...).
  
  
 
[1] see also [http://lxml.de/objectify.html lxml.objectify]
 
[1] see also [http://lxml.de/objectify.html lxml.objectify]
 +
 +
==Usage==
 +
 +
* You can get a copy of this simple ''draft'' from Addon repository:
 +
 +
https://github.com/gramps-project/addons-source/tree/master/lxml
 +
 +
Currently, this addon quickly explores multiple ways. Feel free to modify for your own use.
 +
 +
For '''testing only''', by design these [https://docs.python.org/3/library/xml.html#xml-vulnerabilities actions are not for production].
 +
 +
== Prerequisites ==
 +
Before this can be used you will need the following prerequisites installed:
 +
 +
* [http://lxml.de/ lxml] is a Pythonic binding for the C libraries [http://xmlsoft.org/ libxml2]
 +
* and [http://xmlsoft.org/XSLT/ libxslt].
 +
 +
Both are known for good speed performances by using C-level ([http://www.cython.org/ Cython]).
 +
 +
===Gramps XML file format===
 +
* [[Gramps XML]] file format is robust and well [[Gramps XML#Gramps_XML_Resources|documented]].
 +
  
 
==Screenshots==
 
==Screenshots==
Line 63: Line 84:
 
[[File:lxml_hardcoded_list.png|600px|thumb|left|Hardcoded list (gramps translations)]]
 
[[File:lxml_hardcoded_list.png|600px|thumb|left|Hardcoded list (gramps translations)]]
  
<br clear="all"/>
+
{{-}}
  
==Test it==
+
==Further development==
  
* You can get a copy of this simple ''draft'' from Addon repository:
+
===Bibliography gramplet ?===
  
http://gramps-addons.svn.sourceforge.net/viewvc/gramps-addons/trunk/contrib/lxml
+
* [http://www.giuspen.com/cherrytree CherryTree] is an hierarchical note taking application, featuring rich text and syntax highlighting, storing all the data (including images) in a single '''''xml file''''' with extension ''.ctd'', which has planned to also implement an integration with [http://www.zotero.org/ zotero] content.
  
* You can also [http://gramps-addons.svn.sourceforge.net/viewvc/gramps-addons/trunk/download/lxml.addon.tgz download and install it] as 3.4 addon.
+
* [http://zim-wiki.org/index.html Zim] is a graphical text editor used to maintain a collection of wiki pages. All pages you create in zim are saved as plain text files with wiki formatting. This means that you can access your content with any other editor or file manager without being dependent on zim. You can even have your pages in a revision control system like CVS or use a Makefile to compile your notes into a webpage. Any images you add are just image files which are linked from the text files. This means that [http://zim-wiki.org/index.html zim] can call your standard programs to edit images. When you embed an image in a page the context menu for the image will offer to open it with whatever image manipulation programs you have installed. After editing you just reload the page to see the result. See also [http://zim-wiki.org/extras.html third party contributions].
 
 
Currently, this addon quickly explores multiple ways. Feel free to modify for your own use.
 
  
==Go further==
+
===Collaborative indexes===
  
===Bibliography gramplet ?===
+
* Tiny Tafel [http://en.wikipedia.org/wiki/Tiny_Tafel]
  
* [http://www.giuspen.com/cherrytree CherryTree] is an hierarchical note taking application, featuring rich text and syntax highlighting, storing all the data (including images) in a single '''''xml file''''' with extension ''.ctd'', which has planned to also implement an integration with [http://www.zotero.org/ zotero] content.
+
* [[GENDEX]]
  
* [http://zim-wiki.org/index.html Zim] is a graphical text editor used to maintain a collection of wiki pages. All pages you create in zim are saved as plain text files with wiki formatting. This means that you can access your content with any other editor or file manager without being dependent on zim. You can even have your pages in a revision control system like CVS or use a Makefile to compile your notes into a webpage. Any images you add are just image files which are linked from the text files. This means that [http://zim-wiki.org/index.html zim] can call your standard programs to edit images. When you embed an image in a page the context menu for the image will offer to open it with whatever image manipulation programs you have installed. After editing you just reload the page to see the result. See also [http://zim-wiki.org/extras.html third party contributions].
+
* [http://scrapy.org/ Scrapy] is a fast high-level screen scraping and web crawling framework, used to crawl websites and extract structured data from their pages. It can be used for a wide range of purposes, from data mining to monitoring and automated testing. It should support [[Gramps XML]], [[Gramps_4.0_Wiki_Manual_-_Manage_Family_Trees:_CSV_Import_and_Export|Gramps CSV]] and [[GEPS_009:_Import_Export_Merge|Gramps JSON]].
  
 
===Clients library for FamilySearch API===
 
===Clients library for FamilySearch API===
Line 126: Line 145:
 
===Database compare and merge===
 
===Database compare and merge===
  
* GrampsCompare.py, a python script for comparing data in 2 Gramps xml files.
+
* GrampsCompare.py, a python script for comparing data in 2 Gramps XML files.
 +
 
 +
source: [http://sourceforge.net/mailarchive/message.php?msg_id=28173190 Archive (Oct 02, 2011) on gramps-devel mailing list]
 +
 
 +
* [[Import_Merge_Tool|ImportMerge]] tool
 +
 
 +
* [[ImportGramplet]]
  
source: [http://sourceforge.net/mailarchive/message.php?msg_id=28173190 Archive (Oct 02, 2011) on gramps-devel mailing list]  
+
* [http://svn.code.sf.net/p/gramps-addons/code/trunk/contrib/Differences/ Differences report]
  
 
===Database backend===
 
===Database backend===
Line 138: Line 163:
 
* Gramps Exhibit and experimental phase for [http://members.tele2.nl/m.d.nauta/typeless_data_entry/typeless_data_entry.html typeless data entry].
 
* Gramps Exhibit and experimental phase for [http://members.tele2.nl/m.d.nauta/typeless_data_entry/typeless_data_entry.html typeless data entry].
  
* [http://akara.info/ Akara] is a platform for developing data services available on the Web, using [http://en.wikipedia.org/wiki/Representational_state_transfer REST] architecture. Akara is open source software written in Python and C. eg, [http://recollection.zepheira.com/ Recollection project] for the Library of Congress. See the [http://recollection.zepheira.com/about/userguide/ user guide] or screencasts (''shockwave flash'') [http://outreach.zepheira.com/public/loc/recollection/video/recollection-augmentation.swf], [http://outreach.zepheira.com/public/loc/recollection/video/recollection-intro.swf].
+
* [http://akara.info/ Akara] is a platform for developing data services available on the Web, using [http://en.wikipedia.org/wiki/Representational_state_transfer REST] architecture. Akara is open source software written in Python and C. eg, [http://recollection.zepheira.com/ Recollection project] for the Library of Congress. See the [http://recollection.zepheira.com/about/userguide/ user guide] or screencasts (''shockwave flash'') [http://outreach.zepheira.com/public/loc/recollection/video/recollection-augmentation.swf], [http://outreach.zepheira.com/public/loc/recollection/video/recollection-intro.swf], [https://www.youtube.com/watch?v=m-TD4jTWn3U].
 +
 
 +
* [[#Collaborative indexes|Scrapy]]
 +
 
 +
===Environment===
 +
 
 +
* [[Linux_Genealogy_CD#Ways_to_go_.3F|Genealogical ''user'' tablet]] could also provide a portable environment.
 +
 
 +
* A simple reader with a crossplatform lib: [http://en.wikipedia.org/wiki/QML qml], [http://qt.nokia.com/ qt4], [http://www.gtk.org/ gtk3], [http://kivy.org kivy], [http://pyjs.org/ pyjamas], [[#HTML_class|html5]]; for generating native apps.
  
 
===Faceted classification===
 
===Faceted classification===
  
A [http://en.wikipedia.org/wiki/Faceted_classification faceted classification], [http://unesdoc.unesco.org/Ulis/cgi-bin/ulis.pl?catno=133325&set=4B1BA8F9_1_463&database=ged&gp=0&mode=e&lin=1&ll=f system] proposed by [http://en.wikipedia.org/wiki/S._R._Ranganathan Shiyali Ramamrita Ranganathan] with the theory "[http://en.wikipedia.org/wiki/Five_laws_of_library_science five laws in library science]". See also [http://en.wikipedia.org/wiki/Folksonomy Folksonomy].
+
* A [http://en.wikipedia.org/wiki/Faceted_classification faceted classification], [http://unesdoc.unesco.org/Ulis/cgi-bin/ulis.pl?catno=133325&set=4B1BA8F9_1_463&database=ged&gp=0&mode=e&lin=1&ll=f system] proposed by [http://en.wikipedia.org/wiki/S._R._Ranganathan Shiyali Ramamrita Ranganathan] with the theory "[http://en.wikipedia.org/wiki/Five_laws_of_library_science five laws in library science]". See also [http://en.wikipedia.org/wiki/Folksonomy Folksonomy].
 +
 
 +
* [http://pythonhosted.org/Whoosh/ python-whoosh] can provide a simple way for [http://pythonhosted.org/Whoosh/facets.html generating facets in python].
  
 
===HTML class===
 
===HTML class===
Line 150: Line 185:
  
 
* Gtk3
 
* Gtk3
GTK+3 provides an HTML backend that allows GTK applications to run natively within an HTML5 web navigator.
+
GTK+3 provides an [http://git.gnome.org/browse/gtk+/log/?h=broadway HTML backend] that allows GTK applications to run natively within an HTML5 web navigator.
  
See [http://people.gnome.org/%7Ealexl/broadway-screencast.ogg sample1], [http://youtu.be/AO-qca9ddqg sample2].
+
See [http://people.gnome.org/%7Ealexl/broadway-screencast.ogg sample1], [http://youtu.be/AO-qca9ddqg sample2], [http://www.youtube.com/watch?v=hhMFD3ZCrIc sample3].
  
 
===Interface===
 
===Interface===
Line 164: Line 199:
 
===Performances===
 
===Performances===
  
See [[GRAMPS_Performance|Gramps performances]] for comparison on large datasets between different Gramps versions.  
+
See [[Gramps_Performance|Gramps performances]] for comparison on large datasets between different Gramps versions.  
  
 
===Web applications===
 
===Web applications===
  
* [[GEPS_013:_GRAMPS_Webapp|GEPS 013]] describes a web-based application that runs in your browser, and requires a server. A prototype is now on-line at http://gramps-connect.org/ which is running trunk on a sample database (id=admin1, password=gramps).
+
* [[GEPS_013:_Gramps_Webapp|GEPS 013]] describes a web-based application that runs in your browser, and requires a server. A prototype is now on-line at http://gramps-connect.org/ which is running trunk on a sample database (id=admin1, password=gramps).
  
 
* [[DenominoViso]] plugin for GRAMPS is a third party plugin that creates an interactive graphical representation of a family tree. DenominoViso creates a grapical webpage in SVG/XHTML/javascript.
 
* [[DenominoViso]] plugin for GRAMPS is a third party plugin that creates an interactive graphical representation of a family tree. DenominoViso creates a grapical webpage in SVG/XHTML/javascript.
  
 
* [[Gramps-tweet]], an Addon mashup between Gramps and Twitter.
 
* [[Gramps-tweet]], an Addon mashup between Gramps and Twitter.
 +
 +
* [[#Collaborative indexes|Scrapy]]
 +
 +
* [http://www.newsblur.com/ NewsBlur], etc ...
  
 
===XQuery===
 
===XQuery===
  
:"Or something close to SQL like XQuery so you can do querys on gramps xml database similar to SQL Query. It can works even in internet browser thru plugins. XML is quite self-explanatory. [http://www.zorba-xquery.com Zorba] provide python bindings for XQuery."
+
:"Or something close to SQL like XQuery so you can do querys on Gramps XML database similar to SQL Query. It can works even in internet browser thru plugins. XML is quite self-explanatory. [http://www.zorba-xquery.com Zorba] provide python bindings for XQuery."
  
source: [http://sourceforge.net/mailarchive/message.php?msg_id=23856194 Archive (Oct 28, 2009) on gramps-user mailing list]
+
;source: [http://sourceforge.net/p/gramps/mailman/message/23856194/ Archive (Oct 28, 2009) on gramps-user mailing list]
  
  

Revision as of 16:36, 30 January 2020


Please use carefully on data that is backed up, and help make it better by reporting any comments or problems to the author, or issues to the bug tracker.
Unless otherwise stated on this page, you can download this addon by following these instructions.
Please note that some Addons have prerequisites that need to be installed before they can be used.
This Addon/Plugin system is controlled by the Plugin Manager.


lxml Gramplet is an experimental gramplet for POSIX platforms. It reads and transforms the content of a Gramps XML file on the fly, without importing it into the database (Gramps session). It can also write output, but never to the original file, which stays in a safe read-only state.

The addon also includes the etree Gramplet, for testing the Python ElementTree module (etree) with Gramps XML.
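As a rough illustration of what reading Gramps XML with the standard-library ElementTree module can look like, here is a minimal sketch. The sample document is a hypothetical, heavily simplified stand-in for a real export, the namespace URI in it is an assumption (it changes with the Gramps version), and `list_people` is a name invented for this example, not part of the addon.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified sample. A real Gramps XML export also has a
# header and many more record types; the namespace URI is an assumption.
SAMPLE = """<database xmlns="http://gramps-project.org/xml/1.5.0/">
  <people>
    <person handle="_a1" id="I0001">
      <gender>M</gender>
      <name type="Birth Name">
        <first>John</first>
        <surname>Smith</surname>
      </name>
    </person>
    <person handle="_a2" id="I0002">
      <gender>F</gender>
      <name type="Birth Name">
        <first>Anna</first>
        <surname>Garner</surname>
      </name>
    </person>
  </people>
</database>
"""


def list_people(xml_text):
    """Return (Gramps ID, display name) pairs without touching any database."""
    root = ET.fromstring(xml_text)
    # ElementTree stores qualified tags as "{uri}local"; reading the URI from
    # the root avoids hardcoding one Gramps XML version (assumes a namespace).
    ns = {"g": root.tag[1:root.tag.index("}")]}
    people = []
    for person in root.findall("g:people/g:person", ns):
        first = person.findtext("g:name/g:first", default="", namespaces=ns)
        surname = person.findtext("g:name/g:surname", default="", namespaces=ns)
        people.append((person.get("id"), f"{first} {surname}".strip()))
    return people


if __name__ == "__main__":
    for gramps_id, name in list_people(SAMPLE):
        print(gramps_id, name)
```

The file is only parsed in memory, which matches the read-only intent described above.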

Goals

The idea of this experimental lxml gramplet is to provide a way to use basic lxml features with Gramps XML files.

XPath, XSLT, XML dump, and RelaxNG and XSD validation are all available through lxml, which provides an API very close to the ElementTree (etree) module shipped with Python 2.5 and later.
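Because the two APIs are so close, the same lookup code can run on either library. The sketch below assumes nothing beyond that: it tries lxml first and falls back to the standard library. The sample data, the namespace URI and the function name `surnames` are illustrative assumptions.

```python
try:
    from lxml import etree  # assumption: lxml is installed
except ImportError:
    import xml.etree.ElementTree as etree  # standard-library fallback

# Assumed namespace URI; it differs between Gramps XML versions.
NS = {"g": "http://gramps-project.org/xml/1.5.0/"}

# Hypothetical, simplified sample data (bytes, so both parsers accept it).
SAMPLE = b"""<database xmlns="http://gramps-project.org/xml/1.5.0/">
  <people>
    <person handle="_a1" id="I0001">
      <name type="Birth Name"><first>John</first><surname>Smith</surname></name>
    </person>
    <person handle="_a2" id="I0002">
      <name type="Birth Name"><first>Anna</first><surname>Garner</surname></name>
    </person>
  </people>
</database>
"""


def surnames(xml_bytes):
    """Collect surname texts in document order, on either library."""
    root = etree.fromstring(xml_bytes)
    # findall() with a prefix map is part of the shared ElementTree API.
    return [el.text for el in root.findall(".//g:surname", namespaces=NS)]


if __name__ == "__main__":
    print(surnames(SAMPLE))
```

Only the shared subset is used here. Full XPath expressions, XSLT and RelaxNG validation are lxml features and have no equivalent in the fallback.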

The experimental lxml gramplet aims to use these lxml features[1] by parsing a Gramps XML file generated by Gramps 3.4.x (or 3.3.x) and generating sample output based on open W3C standards (XML, Web design, Web services, etc.).


[1] see also lxml.objectify

Usage

  • You can get a copy of this simple draft from the Addon repository:

https://github.com/gramps-project/addons-source/tree/master/lxml

This addon currently explores several approaches in quick succession. Feel free to modify it for your own use.

For testing only: by design, these actions are not for production.

Prerequisites

Before this can be used you will need the following prerequisites installed:

  • lxml, a Pythonic binding for the C libraries libxml2
  • and libxslt.

Both are known for good speed because the work is done at the C level (through Cython).

Gramps XML file format

  • The Gramps XML file format is robust and well documented.


Screenshots

  1. Titles, labels and the footer are translated (written in Python code).
  2. Full separation of presentation and content for the generation.


  • Local output with custom XML data in buffer and XSLT transformation
Dynamic output


  • Local output without stylesheet
Dynamic output without stylesheet


  • View via HTML view
Within Gramps


  • Pseudo dynamic code generation (xml + xslt = html file)
Dynamic code generation


  • Action on surname (sort, remove duplicates)
Sorted surnames list


  • Action on place title (sort, enable cross search on place fields)
Sorted places list


  • Hardcoded list written in python and translated by Gramps into our locale (if translation exists)
Hardcoded list (gramps translations)
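The surname action listed above (sort, remove duplicates) can be sketched with the standard library alone. This is not the gramplet's own code: the sample data, its namespace URI and the function name `sorted_unique_surnames` are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified sample with a duplicated surname.
SAMPLE = """<database xmlns="http://gramps-project.org/xml/1.5.0/">
  <people>
    <person handle="_a1" id="I0001">
      <name type="Birth Name"><first>John</first><surname>Smith</surname></name>
    </person>
    <person handle="_a2" id="I0002">
      <name type="Birth Name"><first>Anna</first><surname>Garner</surname></name>
    </person>
    <person handle="_a3" id="I0003">
      <name type="Birth Name"><first>Mary</first><surname>Smith</surname></name>
    </person>
  </people>
</database>
"""


def sorted_unique_surnames(xml_text):
    """Return each non-empty surname once, sorted case-insensitively."""
    root = ET.fromstring(xml_text)
    ns_uri = root.tag[1:root.tag.index("}")]  # assumes a namespaced root
    # A set drops the duplicates; empty <surname/> elements are skipped.
    found = {
        el.text.strip()
        for el in root.iter("{%s}surname" % ns_uri)
        if el.text and el.text.strip()
    }
    return sorted(found, key=str.casefold)


if __name__ == "__main__":
    print(sorted_unique_surnames(SAMPLE))
```

The same pattern would presumably apply to place titles, with the element name swapped for whatever the file format uses for them.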


Further development

Bibliography gramplet ?

  • CherryTree is a hierarchical note-taking application featuring rich text and syntax highlighting. It stores all data (including images) in a single XML file with the extension .ctd, and integration with Zotero content is planned.
  • Zim is a graphical text editor used to maintain a collection of wiki pages. All pages you create in zim are saved as plain text files with wiki formatting. This means that you can access your content with any other editor or file manager without being dependent on zim. You can even have your pages in a revision control system like CVS or use a Makefile to compile your notes into a webpage. Any images you add are just image files which are linked from the text files. This means that zim can call your standard programs to edit images. When you embed an image in a page the context menu for the image will offer to open it with whatever image manipulation programs you have installed. After editing you just reload the page to see the result. See also third party contributions.

Collaborative indexes

  • Scrapy is a fast high-level screen scraping and web crawling framework, used to crawl websites and extract structured data from their pages. It can be used for a wide range of purposes, from data mining to monitoring and automated testing. It should support Gramps XML, Gramps CSV and Gramps JSON.

Clients library for FamilySearch API

Serialization for C client library or Objective C Client library is done in conjunction with libxml2.

Comments on DB API Idea

I was basically approaching it from the leave gen.lib alone and
implement a "fully blown" SimpleAccess-esque solution.

At the moment I basically have a 'DB' object which represents an open
database. This at the moment is populated from a Gramps XML file. This
is then basically stored as lxml.objectify objects. Internally a graph
structure is built to represent the linking inside the database (so
relationships and ref. integrity is made easier).

'DBItem' objects consist of the 'node' data, the basic save/delete
etc... Deleting an event automatically removes all other references to
it (which has caught me out previously).

class Person(DBItem):
    DBTYPE = 'person'

Basically registers an object that 'wraps' a basic DBItem, but
containing useful attributes/methods. So for a person, we can write
attributes such as .birth, .mother, .families etc... etc... It can also
over-ride how it should be saved/retrieved etc...

I chose this approach because it keeps the process incremental. We can
still access the 'raw' data in a DBItem for the stuff I'm not caring
about at the moment, but someone can write a 'Place' class later for
instance.

The DB itself is an xpath queryable object (adds a bit of flexibility
for selections that don't have convenient attributes as of yet).

I'll see if I can get the code example out this week.

Anyway, does this seem a reasonable approach? 

source: Archive (Dec 07, 2009) on gramps-devel mailing list
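The message quoted above never published its code, so the following is only a guess at the shape of the idea: a generic `DBItem`, typed wrappers registered through `DBTYPE`, and deletion that also drops references. It uses the standard-library ElementTree in place of lxml.objectify, and every name other than `DBItem`, `Person` and `DBTYPE` (for example `DB.select`, `DB.delete`, `by_handle`) is hypothetical, as are the sample data and namespace URI.

```python
import xml.etree.ElementTree as ET

NS = {"g": "http://gramps-project.org/xml/1.5.0/"}  # assumed namespace URI

# Hypothetical, simplified sample: one person referencing one event.
SAMPLE = """<database xmlns="http://gramps-project.org/xml/1.5.0/">
  <events>
    <event handle="_e1" id="E0001"><type>Birth</type></event>
  </events>
  <people>
    <person handle="_p1" id="I0001">
      <name type="Birth Name"><first>John</first><surname>Smith</surname></name>
      <eventref hlink="_e1" role="Primary"/>
    </person>
  </people>
</database>
"""


class DBItem:
    """Generic wrapper: the raw node stays reachable for unwrapped data."""
    DBTYPE = None
    _registry = {}

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        if cls.DBTYPE:  # setting DBTYPE registers the wrapper class
            DBItem._registry[cls.DBTYPE] = cls

    def __init__(self, db, node):
        self.db = db
        self.node = node


class Person(DBItem):
    DBTYPE = 'person'

    @property
    def surname(self):
        return self.node.findtext("g:name/g:surname", default="", namespaces=NS)

    @property
    def events(self):
        refs = self.node.findall("g:eventref", NS)
        return [self.db.by_handle[ref.get("hlink")] for ref in refs]


class DB:
    """An open 'database' populated from Gramps XML text."""

    def __init__(self, xml_text):
        self.root = ET.fromstring(xml_text)
        self.by_handle = {}
        for node in self.root.iter():
            handle = node.get("handle")
            if handle:
                kind = node.tag.split("}")[-1]
                wrapper = DBItem._registry.get(kind, DBItem)
                self.by_handle[handle] = wrapper(self, node)

    def select(self, path):
        """Query raw nodes (ElementTree's XPath subset only)."""
        return self.root.findall(path, NS)

    def delete(self, handle):
        """Remove a record and every element that links to it."""
        item = self.by_handle.pop(handle)
        for parent in list(self.root.iter()):
            for child in list(parent):
                if child is item.node or child.get("hlink") == handle:
                    parent.remove(child)


if __name__ == "__main__":
    db = DB(SAMPLE)
    person = db.by_handle["_p1"]
    print(person.surname, len(person.events))
    db.delete("_e1")
    print(len(person.events))
```

Record types without a dedicated class fall back to a plain `DBItem`, which is one way to read the "incremental" point in the message: a `Place` wrapper could be added later without changing the loader.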

Database compare and merge

  • GrampsCompare.py, a python script for comparing data in 2 Gramps XML files.

source: Archive (Oct 02, 2011) on gramps-devel mailing list
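GrampsCompare.py itself is not reproduced here. As a rough idea of what comparing two Gramps XML files can involve, the sketch below matches records by type and Gramps ID and reports what was added, removed or changed. The function names, the matching rule and the sample data are all assumptions, not a description of the actual script.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified samples (assumed namespace URI).
OLD = """<database xmlns="http://gramps-project.org/xml/1.5.0/">
  <people>
    <person handle="_a1" id="I0001"><name><surname>Smith</surname></name></person>
    <person handle="_a2" id="I0002"><name><surname>Garner</surname></name></person>
    <person handle="_a4" id="I0004"><name><surname>Jones</surname></name></person>
  </people>
</database>
"""

NEW = """<database xmlns="http://gramps-project.org/xml/1.5.0/">
  <people>
    <person handle="_a1" id="I0001">
      <name><surname>Smith</surname></name>
    </person>
    <person handle="_a2" id="I0002"><name><surname>Gardner</surname></name></person>
    <person handle="_a3" id="I0003"><name><surname>Brown</surname></name></person>
  </people>
</database>
"""


def signature(node):
    """Whitespace-insensitive, hashable summary of an element subtree."""
    text = (node.text or "").strip()
    attrs = tuple(sorted(node.attrib.items()))
    return (node.tag, attrs, text, tuple(signature(child) for child in node))


def records_by_id(xml_text):
    """Map (record type, Gramps ID) to the record's signature."""
    root = ET.fromstring(xml_text)
    records = {}
    for node in root.iter():
        if node.get("handle") and node.get("id"):
            kind = node.tag.split("}")[-1]
            records[(kind, node.get("id"))] = signature(node)
    return records


def compare(old_xml, new_xml):
    """Report records added, removed or changed between two files."""
    old, new = records_by_id(old_xml), records_by_id(new_xml)
    return {
        "added": sorted(new.keys() - old.keys()),
        "removed": sorted(old.keys() - new.keys()),
        "changed": sorted(k for k in old.keys() & new.keys() if old[k] != new[k]),
    }


if __name__ == "__main__":
    for kind, keys in compare(OLD, NEW).items():
        print(kind, keys)
```

Matching on Gramps ID is a simplification: IDs can be renumbered, so a real comparison may prefer handles or content-based matching.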

Database backend

Data transfer

  • Akara is a platform for developing data services available on the Web, using REST architecture. Akara is open source software written in Python and C; one example of its use is the Recollection project for the Library of Congress. See the user guide or screencasts (shockwave flash) [2], [3], [4].

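Environment

  • A genealogical user tablet could also provide a portable environment.
  • A simple reader built with a cross-platform library (qml, qt4, gtk3, kivy, pyjamas, html5), for generating native apps.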

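Faceted classification

  • A faceted classification system proposed by Shiyali Ramamrita Ranganathan, with the theory "five laws in library science". See also Folksonomy.
  • python-whoosh can provide a simple way of generating facets in Python.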

HTML class

  • Gramps

Libhtml is an HTML/XML class for Gramps, see API.

  • Gtk3

GTK+3 provides an HTML backend that allows GTK applications to run natively within an HTML5 web navigator.

See sample1, sample2, sample3.

Interface

Performances

See Gramps performances for comparison on large datasets between different Gramps versions.

Web applications

  • GEPS 013 describes a web-based application that runs in your browser, and requires a server. A prototype is now on-line at http://gramps-connect.org/ which is running trunk on a sample database (id=admin1, password=gramps).
  • DenominoViso plugin for Gramps is a third party plugin that creates an interactive graphical representation of a family tree. DenominoViso creates a graphical webpage in SVG/XHTML/JavaScript.

XQuery

"Or something close to SQL like XQuery so you can do querys on Gramps XML database similar to SQL Query. It can works even in internet browser thru plugins. XML is quite self-explanatory. Zorba provide python bindings for XQuery."
source: Archive (Oct 28, 2009) on gramps-user mailing list