Difference between revisions of "Generate XML"

From Gramps
m (Why doesn't GRAMPS just use a .gz extension?: Clarify pattern/content)
(Why doesn't GRAMPS just use a .gz extension?)

Revision as of 03:39, 7 July 2007


GRAMPS and XML

GRAMPS is capable of importing and exporting an XML file that contains all the information in the database. This file is useful for transferring data from one machine to another or for XML processing.

Generating XML

The easiest way to generate an XML file is to export the data. This can be done from the File->Export menu, which generates a file with a .gramps extension. This file is usually a gzip'd XML file, although depending on some system settings it may be an uncompressed XML file.

GRAMPS compresses the file because XML files can become rather large. For large databases, this file could grow to tens or hundreds of megabytes in size. Fortunately, XML files compress nicely, usually producing a fairly small file.

How do I tell if the XML file is compressed?

The easiest way is to run the file command on it.

 $ file data.gramps

If the file is compressed, you should see a result similar to:

 data.gramps: gzip compressed data, from Unix, last modified: Sun Jun 17 22:36:04 2007

If it is uncompressed, you should see a result similar to:

 data.gramps: XML 1.0 document text
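
The same check can be made from a script. The following is a minimal sketch, assuming only that a gzip'd file starts with the two-byte gzip magic number (0x1f 0x8b); the function name is_gzip_compressed is hypothetical and is not part of GRAMPS.

```python
GZIP_MAGIC = b"\x1f\x8b"  # the first two bytes of every gzip stream


def is_gzip_compressed(path):
    """Return True if the file at `path` starts with the gzip magic number."""
    with open(path, "rb") as handle:
        return handle.read(2) == GZIP_MAGIC
```

For example, is_gzip_compressed("data.gramps") would be expected to return True for a compressed export and False for an uncompressed one.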

How do I uncompress the file?

If the file is compressed, you can uncompress it using the gunzip command.

 $ gunzip < data.gramps > data.xml

This example creates an uncompressed data.xml file from the compressed data.gramps file.

You must use the I/O redirection operators (">" and "<"), since gunzip, when given a file name directly, expects it to have a recognized suffix such as .gz.
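
If you prefer to do this from a script, Python's standard gzip module reads a compressed file regardless of its extension. This is only a sketch of one possible approach; the function name uncompress_gramps is hypothetical, and the file names are the ones used in the example above.

```python
import gzip
import shutil


def uncompress_gramps(source, target):
    """Write the uncompressed contents of the gzip'd file `source` to `target`."""
    # gzip.open inspects the file contents, not the name, so the
    # .gramps extension is not a problem here.
    with gzip.open(source, "rb") as compressed, open(target, "wb") as plain:
        shutil.copyfileobj(compressed, plain)
```

Calling uncompress_gramps("data.gramps", "data.xml") should produce the same data.xml file as the gunzip command. If the source file is not actually compressed, gzip raises an error when reading it.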

Why doesn't GRAMPS just use a .gz extension?

GRAMPS uses the Shared Mime System defined by the Free Desktop project and used by all major desktops, including KDE and GNOME. GRAMPS relies on the MIME type identified by the Shared Mime System to determine a file's type.

The Shared Mime System can identify a file's type either by its file extension or by looking at the contents of a small section of the file. The first problem is that the filename or extension pattern usually has higher priority than the contents: if the file is named something.jpg, then it is likely to be a JPEG image, not text. The second problem is that, if we looked at the contents, we would not be able to tell the difference between a gzip'd GRAMPS XML file and any other gzip'd file. If we looked at the uncompressed data, we would not be able to tell the difference between a GRAMPS XML file and other XML files. For these reasons, we must rely on the .gramps extension.

If the GRAMPS XML file had a .gz extension added to its name, the Shared Mime System would tell us that the file's type is application/x-gzip instead of the expected application/x-gramps-xml. Unfortunately, it cannot tell us that it is a gzip'd GRAMPS XML file, so we would not be able to tell if this was a valid file. Even worse, the MIME type application/x-gzip would be associated with another application (such as File Roller or Ark) instead of GRAMPS.
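
The effect of extension-based matching can be illustrated with a toy model. This is not the Shared Mime System itself: the GLOB_RULES table and the guess_mime_type function are hypothetical, and the real database is much larger and also supports content rules. The sketch only shows why a name ending in .gz would stop being recognized as a GRAMPS file.

```python
import fnmatch

# Hypothetical, heavily simplified pattern table for illustration only.
GLOB_RULES = [
    ("*.gramps", "application/x-gramps-xml"),
    ("*.gz", "application/x-gzip"),
    ("*.jpg", "image/jpeg"),
]


def guess_mime_type(filename):
    """Return the MIME type of the first pattern matching `filename`, or None."""
    for pattern, mime_type in GLOB_RULES:
        if fnmatch.fnmatch(filename, pattern):
            return mime_type
    return None
```

In this model, guess_mime_type("data.gramps") yields application/x-gramps-xml, while guess_mime_type("data.gramps.gz") yields application/x-gzip, because only the final extension matches a pattern.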

GRAMPS is not alone in facing this problem. For example, the OpenDocument format used by OpenOffice, KWord and AbiWord is actually a collection of files in a zip archive. If you run unzip on an OpenDocument file, you will see something like:

 $ unzip test.odt
 Archive:  test.odt
   inflating: mimetype                
   inflating: meta.xml                
   inflating: settings.xml            
   inflating: META-INF/manifest.xml   
   inflating: styles.xml              
   inflating: content.xml
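
The same listing can be obtained without extracting anything, using Python's standard zipfile module. This is a minimal sketch; the function name list_archive_members is hypothetical, and test.odt is the file name from the example above.

```python
import zipfile


def list_archive_members(path):
    """Return the names of the files stored in the zip archive at `path`."""
    # zipfile identifies the archive by its contents, so the .odt
    # extension does not matter.
    with zipfile.ZipFile(path) as archive:
        return archive.namelist()
```

Calling list_archive_members("test.odt") should return names such as mimetype, meta.xml and content.xml, matching the unzip output.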