Gramps-Connect: Developer Introduction

This has been abandoned

Please see https://github.com/gramps-project/web-api for future attempts

Gramps developers are actively working on a version of Gramps that runs on a web server, and displays one's family tree information dynamically (as opposed to Gramps Narrated Web Report). This web application project is called Gramps-Connect. We have a working demonstration of this code available at http://gramps-connect.org. It is written in Django, a Python web framework.

This page gives an introduction to Django for those who are already familiar with Gramps data and Python. For more details on the specific Gramps Django interface, please see GEPS 013: Gramps Webapp.

Basic Django Structure

The motivation for this project, and help getting familiar with it, is provided in GEPS 013: Gramps Webapp. This page dives into Django from a getting-started perspective.

When you install and set up Django, all of the basic files are created for you with default settings. Django's online documentation and the Google group are excellent references. What follows is an outline of Django's file structure, some important basics, and specific examples of ways to work with the Gramps dataset to create webpages.

In working with Gramps' Django files, there are four places where you may find yourself editing (three files and the templates directory):

  1. gramps/webapp/urls.py
  2. gramps/webapp/grampsdb/views.py
  3. gramps/webapp/grampsdb/forms.py
  4. data/templates/

The template files have .html extensions and are found in the templates directory.

You will also find yourself referencing the gramps/webapp/grampsdb/models.py file, which defines the database structure. This file models the Gramps data as it exists in the desktop Gramps application and will not require alteration unless the database structure is changed. To access data from the database, you will often reference the models.py file and refer to the tables and fields it defines.

We will now go through each of these files, exploring the use and format of each.

urls.py

urls.py is the file in which you define all of the website's urls and the corresponding code that will run in response to each particular url request. The file simply defines the variable urlpatterns:

   urlpatterns = patterns("", (r'^example/$', example_page),)

The above example maps the url www.mysite.com/example/ to the function example_page. With Django's default settings, a request that omits the trailing '/' is redirected to the version that has it, so both forms of the url reach the same function. Note that urlpatterns is a list, so you can add more urls to it as such:

   urlpatterns += patterns("", 
     (r'^$', main_page),
     (r'^login/$', login_page),
   )

Urls are written as regular expressions so that you may define groups of urls that you want to map to the same function. In the above example r'^$' matches an empty string, indicating that the domain's base url is mapped to the function main_page. Django performs pattern matching, starting with the first defined url and working down the list until there is a match, at which point the corresponding code is run.

You may also capture some of the url to pass along to your code for processing. For example, with one line we defined the url for viewing each individual person record in the database as being www.mysite.com/person/xxxxx/ where xxxxx is the record's primary key:

   urlpatterns += patterns("", (r'^person/(\d+)/$',person_detail), )

The \d+ in the regular expression matches one or more digits and the parentheses indicate to Django to capture that portion of the url and pass it along as a string value to the function to which you've mapped the url (e.g., person_detail).
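
The first-match-wins lookup and the capture of part of the url can be sketched with Python's re module alone. This is only an illustration of the idea, not Django's resolver: the resolve helper and the simplified view functions (which take no request object) are assumptions made for the example.

```python
import re

# Hypothetical stand-ins for view functions; real views take a request object.
def main_page():
    return "main"

def person_detail(ref):
    return "person " + ref

# Patterns are tried in order, top to bottom; the first match wins.
URLPATTERNS = [
    (r'^$', main_page),
    (r'^person/(\d+)/$', person_detail),
]

def resolve(path):
    """Return (view, captured_groups) for the first matching pattern, or None."""
    for pattern, view in URLPATTERNS:
        match = re.search(pattern, path)
        if match:
            # Captured groups are always strings, even when they hold digits.
            return view, match.groups()
    return None

view, args = resolve('person/42/')
print(view(*args))
```

Note that the path is matched without its leading '/', which is why the patterns above start with ^person/ rather than ^/person/.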

views.py

The functions that you map to your urls will exist in the views.py file. Each view function must take an http request object as its first parameter and must return an http response object. The following view function redirects the user to another page.

    from django.http import HttpResponseRedirect

    def example_page(request):
        # do some processing here
        return HttpResponseRedirect('/gohere/')

Upon redirecting to another url, Django will process it as it does any request, using urls.py to map the url to a view function. Redirecting is useful, but more often you will want to define some data and pass it along to a web page for display. For this case Django provides the shortcut function render_to_response, which takes care of the repetitive context-handling and rendering steps so that you can pass your data to a template in a single call.

from django.shortcuts import render_to_response

def welcome_page(request):
    # perform any processing you need here 
    mydata = "the data I want to display"
    return render_to_response('welcome.html', {'MyData':mydata})

The above example passes data to the template welcome.html, which defines the page that will be displayed. The template name is the first parameter of render_to_response, and the second parameter is a dictionary mapping template variables to values. You can pass any number of variables, and you will access each one in the template by its dictionary key (MyData in this example).

To access data from the database, you will need to include models.py. The tables it defines are used like python classes. The following example is a person_detail view function that uses the captured portion of the url (see the last example above under urls.py) to find the relevant person record. The captured url data is passed to the function as the second parameter, and it doesn't matter what name you use here.

    from django.http import Http404
    from django.shortcuts import render_to_response
    from webapp.grampsdb.models import *

    def person_detail(request, ref):
        try: # check for valid input
            i = int(ref)
        except ValueError:
            raise Http404('Invalid record number.')
        p = Person.objects.get(id=i)
        # filter() does not raise when nothing matches; it returns an
        # empty dataset, so no try/except is needed here
        n = Name.objects.filter(person=i, private=0)
        # work with data and pass information to template
        return render_to_response('person_detail.html', {'Names':n, 'Handle': p.handle})

The captured data from a url is always passed as a string, so the function above (which expects an integer primary key) checks to be sure that it can be converted to an integer and raises an http error if it cannot. The table of people is named Person in models.py and names are in the table Name. To access their data, refer to them directly as objects. Every table has an id field, which is an automatically-created primary key. The example above assigns to the variable p a reference to the person record where id equals the captured information from the url (the requested record's primary key). Field names are accessed like properties: p.handle is the value of the field handle on that person record, and in the example it is passed to the template as the variable Handle. The example above also grabs name data by filtering the Name table for records where the person field equals the person id we're examining and where private is false. Filtering returns a dataset. A __unicode__ method may be written for each table in models.py to define how a record is displayed as text. The models.py for Gramps Connect includes such definitions, so that when you refer to the name record, useful fields like first and last name are displayed.
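
As a rough illustration of the last point, here is a plain-Python stand-in for a record with a text representation. The Name class below is hypothetical and is not the Gramps model; the "surname, first name" format is also an assumption. On Python 2, Django looked for __unicode__; on Python 3 the equivalent hook is __str__, which is what this sketch uses.

```python
class Name:
    """Hypothetical plain-Python stand-in for a Name record (not the real model)."""

    def __init__(self, first_name, surname):
        self.first_name = first_name
        self.surname = surname

    def __str__(self):
        # Controls what appears when the record itself is displayed as text.
        return "%s, %s" % (self.surname, self.first_name)

names = [Name("John", "Smith"), Name("Jane", "Doe")]
rendered = [str(n) for n in names]
print(rendered)
```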

Templates

A template defines the html, but it is more than just an html file. Django processes the file before passing it along as a response, and so there are a number of features that allow you to do much more. Template tags sit between sets of {% %} and are processed by Django as the html is rendered, giving you some control in defining at runtime what will be displayed. Variables passed to a template are identified within {{ }}, and there are a number of filters that you can apply to the data using pipes. There is a full list of filters and tags on the Django website.

Template inheritance is a huge time saver. With the extends tag at the top of the file, a template inherits everything from its parent. The parent template defines blocks that may be overridden by the child. Each block should be given a relevant and unique name. The advantage to using inheritance is that shared information exists in one location so that a site-wide change can be handled by changing one file. Since you can define default values for a block in the parent, it pays to use them liberally. Child templates can leave a block undefined so that the default values as defined in the parent will apply.

The following is a simple version of the file base.html, which is used as the base for all of the site's pages. Note that indentations in the template files have no effect -- they were added to make reading and editing easier.

<html>
<head>
{% comment %}
Here you might include all of your header information, style sheet
references, javascript, etc. This is a multiline comment.
{% endcomment %}
<title>{% block windowtitle %}GRAMPS Connect{% endblock %}</title>
</head>
<body>
    <h1>{% block pagetitle %}{% endblock %}</h1>
    {% block content %}{% endblock %}
    {# This is a single line comment #}
</body>
</html>

base.html above has three blocks, one of which has default information that will be used if a child does not supply anything to replace it. Here is a simple child template for a person details screen receiving the data defined in the last view function example:

    {% extends "base.html" %}
    {% block windowtitle %}Person Details{% endblock %}
    {% block pagetitle %}Person # {{Handle}}{% endblock %}

    {% block content %}
        Names for person #{{Handle}}:
        <ul>
           {% for nm in Names %}
               <li>{{nm}}</li>
           {% endfor %}
        </ul>
    {% endblock %}

The resulting html generated by Django adds the block information from the child to the html in base.html and replaces the variables with the data that the view function passed to the template. Django loops through the Names dataset, as per the for loop tag in the template, and the result is html displaying an unordered list of the names in the dataset.
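
The way a child's blocks replace the parent's defaults can be modelled in a few lines of plain Python. This is a toy model of the override rule only, not Django's template engine; the resolve_blocks helper and the sample block contents are assumptions made for the example.

```python
# Toy model of block inheritance: NOT Django's template engine.
BASE_BLOCKS = {
    "windowtitle": "GRAMPS Connect",  # default used when the child is silent
    "pagetitle": "",
    "content": "",
}

def resolve_blocks(parent_blocks, child_blocks):
    """Return the block contents that would be rendered.

    Only blocks that the parent defines are rendered; a child value
    replaces the parent's default, otherwise the default applies.
    """
    return {name: child_blocks.get(name, default)
            for name, default in parent_blocks.items()}

child = {
    "pagetitle": "Person # I0001",
    "content": "<ul><li>Smith, John</li></ul>",
    "sidebar": "not defined in the parent, so never rendered",
}
page = resolve_blocks(BASE_BLOCKS, child)
print(page["windowtitle"], "/", page["pagetitle"])
```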

forms.py

Sometimes you want website users to be able to enter and edit data. This requires a form, and Django has a system for generating and manipulating form data. Forms are defined in the forms module and view functions may use these definitions to create forms to pass along to a template. A form outlines the fields for the screen:

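from django import forms

class ContactForm(forms.Form):
    name = forms.CharField(max_length=100, required=False)
    email = forms.CharField()
    when = forms.DateTimeField()
    # forms has no TextField; a multi-line text box is a CharField
    # rendered with the Textarea widget
    message = forms.CharField(widget=forms.Textarea)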

Most of the forms you will need to define will be based on data from the database. To save time, Django can define a form for you based on a model definition. Be sure to include models.py to access those definitions.

from webapp.grampsdb.models import *

class PersonForm(forms.ModelForm):
    class Meta:
        model = Person

This form will include all fields in the Person table, as defined in models.py. To include only certain fields, add fields=('fieldname', 'field2') to the class Meta; to leave certain fields out, add exclude=('fieldname', 'field2') instead. Be aware of the table fields' requirements before deciding which fields to include or exclude. The form will throw an error upon validating the data if required fields are left blank.

You can override or add to the default initialization by adding to the form class:

class PersonForm(forms.ModelForm):
    def __init__(self, *args, **kwargs):
        # do some processing here
        # the next line calls the default init for the class
        super(PersonForm, self).__init__(*args, **kwargs)

You may also want to override the clean functions, which run when the data is checked for validity and before the form data is saved to the database. There is a clean function for the form, and there are clean functions for each field. This can be useful for times when the data requires some special processing before it is saved to the database. The form's clean function must return cleaned_data, and each field's clean function must return the cleaned value for that field, even if nothing was changed.

class PersonForm(forms.ModelForm):
    def clean(self):
        # do some processing
        # the next line calls the default clean function
        super(PersonForm, self).clean()
        return self.cleaned_data

    def clean_handle(self):
        # do some processing
        return self.cleaned_data['handle']
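
The order in which these hooks run can be sketched with a miniature plain-Python class. ToyForm is hypothetical and greatly simplified: it uses ValueError where Django uses its own ValidationError, and it skips the per-field type conversion that Django performs before the clean_<fieldname> hook is called.

```python
class ToyForm:
    """Hypothetical miniature of the validation flow; not Django's Form class."""

    def __init__(self, data):
        self.data = data
        self.cleaned_data = {}
        self.errors = {}

    def clean_handle(self):
        # Field-level hook: returns the cleaned value for this one field.
        value = self.cleaned_data["handle"].strip()
        if not value:
            raise ValueError("Handle is required.")
        return value

    def clean(self):
        # Form-level hook: runs after the field hooks and returns cleaned_data.
        return self.cleaned_data

    def is_valid(self):
        for name, raw in self.data.items():
            self.cleaned_data[name] = raw
            hook = getattr(self, "clean_" + name, None)
            if hook is None:
                continue
            try:
                self.cleaned_data[name] = hook()
            except ValueError as exc:
                # A failed field is recorded and dropped from cleaned_data.
                self.errors[name] = str(exc)
                del self.cleaned_data[name]
        self.cleaned_data = self.clean()
        return not self.errors

good = ToyForm({"handle": "  I0001 ", "gender": "M"})
bad = ToyForm({"handle": "   "})
print(good.is_valid(), bad.is_valid())
```

The sketch also shows why the return values matter: whatever a hook returns is what ends up in cleaned_data.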

Sometimes you need the ability to add forms to the screen for more than one record in a dataset. Django has a built-in abstraction layer called formsets to create forms for datasets.

from django.forms.models import BaseModelFormSet

class NameForm(forms.ModelForm):
    class Meta:
        model = Name

class NameFormset(BaseModelFormSet):
    def __init__(self, *args, **kwargs):
        self.form = NameForm
        super(NameFormset, self).__init__(*args, **kwargs)

An inline formset is a formset designed to help work with related records. It isn't created with a class like a formset, but it does automatically handle the foreign key fields between related tables. At minimum, you need to supply the related table and the table with the records you want to edit. The example below also demonstrates how to limit the fields and how to specify that it use a previously-defined form (where you may have defined special clean functions, etc.).

from django.forms.models import inlineformset_factory

NameInlineFormSet = inlineformset_factory(Person, Name,
                              fields=('prefix','first_name', 'surname'),
                              form=NameForm)

Using Forms in your View Function

To use a form that you have defined, import the forms.py file in views.py and use it in your view function like any Python class. To create a form that displays a specific record from the database, pass the argument instance, which accepts a reference to a record.

from webapp.grampsdb.forms import *

def person_detail(request, ref):
    # displays the person record on the screen in an html form for editing 
    try: # check for valid input
        i = int(ref)
    except ValueError:
        raise Http404('Invalid record number.')
   
    p = Person.objects.get(id=i)
    psnform = PersonForm(instance=p)
    return render_to_response('person_detail.html', {'PForm':psnform})

To create a blank form for data entry of a new record, do not pass the form an instance:

def add_new_person(request):
    # displays a blank person record on the screen in an html form for adding
    psnform = PersonForm()
    return render_to_response('person_detail.html', {'PForm':psnform})

The same view function can accept the POST data that is returned when a website user edits the form and clicks the submit button. If there is POST data, it can be passed to a form object -- again with an instance argument referring to the person record being edited. Calling is_valid() on the form will run the clean functions and return True if the data can be saved. Calling save() or save(commit=False) will also cause the clean functions to run if they haven't already. The clean functions, when they come across errors, fill error properties for each field and for each form that has an error. This information is available to the template for display, so the view function does not need to do anything further about failed validation than to pass the form along to the template. Note: since our person details screen now needs to POST to the same url to save changes to the current record, the view function now passes the requested url to the template for use in the form's submission.

def person_detail(request, ref):
    # displays the person record on the screen in an html form for editing 
    try: # check for valid input
        i = int(ref)
    except ValueError:
        raise Http404('Invalid record number.')
    p = Person.objects.get(id=i)
    if request.method == 'POST':    # form submitted with changes
        # create a form with the POST data
        psnform = PersonForm(request.POST, instance=p)
        if psnform.is_valid():      # test validation rules
            psnform.save()
            # redirect after successful POST
            return HttpResponseRedirect('/success/') 
    else:     # request to view record
        psnform = PersonForm(instance=p)
    return render_to_response('person_detail.html',{'PForm':psnform, 'URL':request.path})

The add_new view function is handled in the same way, except that there is no instance to pass.

def add_new_person(request):
    # displays a blank person record on the screen in an html form for adding
    if request.method == 'POST':    # form submitted with data
        # create a form with the POST data
        psnform = PersonForm(request.POST)
        if psnform.is_valid():      # test validation rules
            psnform.save()
            # redirect after successful POST
            return HttpResponseRedirect('/success/') 
    else:     # request for a blank form
        psnform = PersonForm()
    return render_to_response('person_detail.html', {'PForm':psnform, 'URL':request.path})

There will be times when you'll want to use more than one form on the web page. To help keep forms and their POST data straight, assign each a unique prefix when you create them. When you pass an instance to an inline form, you pass the related record. In the example below, the requested person record is displayed along with all names related to that person. Also note that instead of redirecting to a success page after a successful data save, the user is directed back to the same data view/entry page. A screen message passed to the template can communicate that the record changes were successfully saved and can help with error messaging.

def person_detail(request, ref):
    try: # check for valid input
        i = int(ref)
    except ValueError:
        raise Http404('Invalid record number.')
    p = Person.objects.get(id=i)
    if request.method == 'POST':    # form submitted with changes
        # create forms with the POST data
        psnform = PersonForm(request.POST, instance=p, prefix='person')
        nmformset = NameInlineFormSet(request.POST, instance=p, prefix='names')
        if psnform.is_valid() and nmformset.is_valid():      # test validation rules
            psnform.save()
            nmformset.save()
            # successful POST
            return render_to_response('person_detail.html', {'PForm':psnform, 'NFormset':nmformset,
                                        'ScreenMsg':'Data was saved successfully'})
        scrn_msg = 'Please correct the errors listed below'
    else:     # request to view record
        psnform = PersonForm(instance=p, prefix='person')
        nmformset = NameInlineFormSet(instance=p, prefix='names')
        scrn_msg = ''
    return render_to_response('person_detail.html', 
                              {'PForm':psnform, 'NFormset':nmformset,
                               'ScreenMsg': scrn_msg, 'URL':request.path})

Displaying Forms in Templates

When a form is passed to a template, all of its fields and properties are accessible. Django produces the html for the form input fields so that all you need to provide are the form tags and submit button. The simplest way to handle a form in a template is to let Django do most of the work:

<form action={{URL}} method="post">
    {{PForm}}
    <input type="submit" value="Save" />
</form>

You can specify .as_p (i.e. {{PForm.as_p}} in the example above) to format in paragraphs, .as_ul for an unordered list, and .as_table for a table. The following example loops through the form's fields to display each field (formatted as an input), the field's label, the field's help text where applicable, and the field's errors where not empty. The div refers to a custom css style designed for displaying errors.

<form action={{URL}}  method="post">
    <table>
    {% for field in PForm %}
    <tr>
        <td>{{field.label_tag}}:</td>
        <td>{{field}}</td>
        <td>{{field.help_text}}</td>
        <td><div id="errmsg">{{field.errors}}</div></td>
    </tr>
    {% endfor %}
    </table>
    <input type="submit" value="Save" />
</form>

Alternatively, you can choose to break out the components that you want to display and place them on the screen wherever you like (within the html form tags). You can refer to each field directly by name, and to each field's label, errors, and help text. Recall that the Django form was based on the database table as designed in models.py, so the field names will be the same as those in the table. Referring directly to form.fieldname.data provides the data in raw form (not formatted as an input box), which can be useful for fields that you want to display but do not want users to edit -- just remember to add a hidden input for the field so that it is included in the POST. Any field designed into the Django form that is not included in the POST data is assumed to be blank when the record is saved. Note the formatting for the hidden field: its name and id include the prefix assigned to the form in the view function. This mimics the format that Django uses when it creates the input field html, assuring that Django will be able to map the POST data correctly to the database field.

    
<form action={{URL}}  method="post">
    <ul>
        <li>{{PForm.handle}} {{PForm.handle.errors}}</li>
        <li>{{PForm.gender_type}} {{PForm.gender_type.errors}}</li>
        <li>Last changed on {{PForm.last_changed.data}}</li>
    </ul>
    <input type="hidden" name="person-last_changed" id="id_person-last_changed" />
    <input type="submit" value="Save" />
</form>
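
The naming convention used for the hidden input above can be written out in plain Python. The helpers html_name and html_id are hypothetical names chosen for this sketch; the patterns they produce (prefix-fieldname for a form, prefix-index-fieldname for a form inside a formset, and id_ in front of the name for the element id) follow the examples on this page and Django's default settings.

```python
def html_name(prefix, field_name, index=None):
    """Build the POST key for a field on a prefixed form.

    index is the position of the form inside a formset, or None for a
    standalone form.
    """
    parts = [prefix]
    if index is not None:
        parts.append(str(index))
    parts.append(field_name)
    return "-".join(parts)

def html_id(name):
    # Django's default id pattern puts 'id_' in front of the input name.
    return "id_%s" % name

name = html_name("person", "last_changed")
print(name, html_id(name))
print(html_name("names", "surname", index=0))
```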

Formsets are just a bit more complicated, since the formset consists of multiple forms. As with forms, formsets can handle themselves:

<form action={{URL}}  method="post">
    {{NFormset}}
    <input type="submit" value="Save" />
</form>

Formsets include a management form that keeps track of the number of forms in the set and other management information, so if you decide to break a formset down for display, you must include the management form, otherwise the POST will cause data validation errors. Within the formset, you can refer to each form just like any other form. As a whole:

<form action={{URL}}  method="post">
    {{ NFormset.management_form }}
    {% for form in NFormset.forms %}
        {{form}}
    {% endfor %}
    <input type="submit" value="Save" />
</form>

Or you can break down each form as you like, either looping through the fields or referring to each one directly.

<form action={{URL}}  method="post" >
    {{ NFormset.management_form }}
    {% for form in NFormset.forms %}
       {% for field in form %}
           {{field}} <div id="errmsg"> {{field.errors}}</div>
       {% endfor %}
    {% endfor %}
    <input type="submit" value="Save" />
</form>
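
To see why leaving out the management form breaks the POST, it helps to look at the keys it contributes. The sketch below is plain Python, not Django code; missing_management_keys is a hypothetical helper, and it checks only the two counters (TOTAL_FORMS and INITIAL_FORMS) that a formset needs in order to know how many forms were submitted.

```python
# The two counters a formset reads from the management form.
REQUIRED_MANAGEMENT_KEYS = ("TOTAL_FORMS", "INITIAL_FORMS")

def missing_management_keys(post_data, prefix):
    """Return the management-form keys that are absent from the POST data."""
    wanted = [prefix + "-" + key for key in REQUIRED_MANAGEMENT_KEYS]
    return [key for key in wanted if key not in post_data]

complete = {
    "names-TOTAL_FORMS": "1",
    "names-INITIAL_FORMS": "1",
    "names-0-surname": "Smith",
}
broken = {"names-0-surname": "Smith"}  # management form left out of the template

print(missing_management_keys(complete, "names"))
print(missing_management_keys(broken, "names"))
```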
 

The example below uses a table and loops to create a tabular display of fields in the formset. Note the extra loop through the hidden fields in each form of the formset, which isn't necessary on a simple dataset but is a good way to avoid missing POST data.

<form action={{URL}}  method="post">
    {{ NFormset.management_form }}
    <table>
        {% for form in NFormset.forms %}
            {% if forloop.first %}
                <tr> {# header row - field names #}
                {% for field in form.visible_fields %} 
                    <th>{{field.label_tag}}</th>
                {% endfor %}
                </tr>
            {% endif %}
            <tr><td> {# hidden fields for the form #}
            {% for hidden in form.hidden_fields %}{{hidden }}{% endfor %}
            </td></tr>
            <tr> {# visible fields for the form #}
            {% for field in form.visible_fields %} 
                <td>{{field}}</td>
            {% endfor %}
            </tr>
        {% endfor %}
    </table>
    <input type="submit" value="Save" />
</form>

Having used prefixes in our view function when creating multiple Django forms to pass to the template, we can place them all inside one set of html form tags for submission together in one POST transaction.

    <form action={{URL}}  method="post">
        {{PForm}}
        {{NFormset}}
        <input type="submit" value="Save" />
    </form>

A Note on Error-Checking Forms

When it came to working with the name formset, we wanted to ensure that there was at least one name for the person and that one and only one name was marked as preferred. Since the view function uses an inline formset, which is not defined as a class, error-checking became problematic. After much trial and error, the solution was to write an error-checking function outside the form classes (but it made sense to store it in the forms.py file). Since cleaned_data is only populated after running the clean functions and only if the data is clean, the data is accessed in a slightly different manner. A string is returned with an error message (or an empty string if all passed).

    def cleanPreferred(fmst):
        # tests for >= 1 name record and 1 Preferred
        ctPref = 0
        ctName = 0
        for i in range (0,fmst.total_form_count()):
            form = fmst.forms[i]
            try: # when preferred is false, its value is not in the form data
                if form.data[fmst.prefix + '-' + str(i) + '-preferred'] == 'on':
                    val = 1
                else:
                    val = 0
            except KeyError:
                val = 0
            ctPref += val
            ctName += len(form.data[fmst.prefix + '-' + str(i) + '-surname'])
            ctName += len(form.data[fmst.prefix + '-' + str(i) + '-first_name'])

        if ctName < 1:
            return "Error: Each person must have at least one name."
        elif ctPref != 1:
            return "Error: Exactly one name may be the preferred name."
        else:
            return ""

This error-checking function is called in the view function just prior to checking is_valid on the name formset, and the result is included with validation testing. The return value is also passed to the template for display (a blank is passed when there is no POST).

    [snip]
    NamesetError = cleanPreferred(nmformset) # check for clean nameset
    if psnform.is_valid() and nmformset.is_valid() and NamesetError == "":
        # test validation rules
        # save forms, etc.
    [snip]
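
The same two rules can be exercised without Django by running them against a plain dictionary shaped like the POST data. This is a sketch under that assumption: check_preferred is a hypothetical variant of cleanPreferred that takes the data, the formset prefix, and the number of forms as arguments instead of a formset object.

```python
def check_preferred(post_data, prefix, total_forms):
    """Return an error string, or "" when the name forms pass both rules."""
    preferred_count = 0
    name_chars = 0
    for i in range(total_forms):
        base = "%s-%d-" % (prefix, i)
        # An unchecked checkbox is simply absent from the POST data.
        if post_data.get(base + "preferred") == "on":
            preferred_count += 1
        name_chars += len(post_data.get(base + "surname", ""))
        name_chars += len(post_data.get(base + "first_name", ""))
    if name_chars < 1:
        return "Error: Each person must have at least one name."
    if preferred_count != 1:
        return "Error: Exactly one name may be the preferred name."
    return ""

ok = {
    "names-0-surname": "Smith", "names-0-first_name": "John",
    "names-0-preferred": "on",
    "names-1-surname": "Smyth", "names-1-first_name": "Jon",
}
print(repr(check_preferred(ok, "names", 2)))
```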

A Little Javascript

One of the javascript functions added to the display screen handles changing the display icon for the private fields so that the lock and unlock image files from the Gramps application can be used in place of the default checkbox used by Django for Boolean fields. First, though, the child template needs to handle loading the appropriate image depending on the field's current value.

   <div id="imgPPrivate" 
        style="display: {% if PForm.private.data %}block{% else %}none{% endif %}">
    <img onclick="clickPrivate('id_person-private','imgPPrivate','imgPNotPrivate');" 
         src="/images/gramps-lock.png" alt="Private" />
   </div>

   <div id="imgPNotPrivate" 
        style="display: {% if PForm.private.data %}none{% else %}block{% endif %}">
    <img onclick="clickPrivate('id_person-private','imgPPrivate','imgPNotPrivate');" 
         src="/images/gramps-unlock.png" alt="Not Private" />
   </div>

The onclick actions call the javascript function clickPrivate, which is written into the base.html file. So that this function can be used for any number of private fields on the screen, this function is written so that it will take the id of the page element calling it.

{# clickPrivate enables the use of the lock & unlock #}
{# image files in place of a checkbox #}

function clickPrivate(field,imgTrue,imgFalse){
    var p = document.getElementById(field);
    var ip = document.getElementById(imgTrue);
    var inp = document.getElementById(imgFalse);

    if (ip.style.display == "block") {
        inp.style.display = "block";
        ip.style.display ="none";
        p.checked = "";
    } else {
        ip.style.display = "block";
        inp.style.display ="none";
        p.checked = "checked";
    }
}


Privacy

Currently, these are developer notes.

One of the most important aspects of an on-line version of Gramps is the protection of sensitive data. There are two categories of sensitive data:

  1. data marked private
  2. people currently alive

Gramps core has developed a layered proxy wrapper system for protection of this sensitive data for reports. Gramps-Connect can use this, but only in limited fashion as we wish to talk directly to the databases in certain places:

  1. running a search query
  2. editing data using Django Forms

Gramps-Connect uses name-replacement to protect living people, and skips over private data. This strategy is how many genealogy sites work, and it is also technically the easiest and fastest to use. Living people are determined by using the function Utils.probably_alive. This function is recursive and generally expensive in the number of queries that must be executed to determine whether a person lacks a death event. Thus, it is advantageous not to run that query until you know you want to show some details of that person. Private records, on the other hand, are easy to identify because each has a field that directly stores this value. Running a query to skip over private records can be initiated easily:

                object_list = Family.objects \
                    .filter((Q(gramps_id__icontains=search) |
                             Q(family_rel_type__name__icontains=search) |
                             Q(father__name__surname__istartswith=search) |
                             Q(mother__name__surname__istartswith=search)) &
                            Q(private=False) &
                            Q(father__private=False) &
                            Q(mother__private=False) 
                            ) \
                    .order_by("gramps_id")

It would make sense for the result of probably_alive to be cached and stored on the person record once it is determined. If that were done, living people could be excluded in top-level queries, just as private data are.
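
The benefit of caching can be illustrated with in-memory memoization from the standard library. This swaps in functools.lru_cache rather than a stored field on the person record, so it is only a sketch of the idea: probably_alive_cached, HAS_DEATH_EVENT, and CALLS are hypothetical names, and the stand-in check is far simpler than the real Utils.probably_alive.

```python
from functools import lru_cache

CALLS = []  # records each expensive evaluation, for demonstration only

# Hypothetical data: person id -> True if a death event is recorded.
HAS_DEATH_EVENT = {1: True, 2: False}

@lru_cache(maxsize=None)
def probably_alive_cached(person_id):
    """Stand-in for an expensive check; the real function runs many queries."""
    CALLS.append(person_id)
    return not HAS_DEATH_EVENT[person_id]

first = probably_alive_cached(2)
second = probably_alive_cached(2)  # served from the cache, no second evaluation
print(first, second, CALLS)
```

A cache held in memory would need to be invalidated whenever a person's events change, which is one reason a stored field updated on save may be the better fit for the approach described above.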