Welcome to CommCareHQ’s documentation!¶
Contents:
Reporting¶
A report is a logical grouping of indicators with common config options (filters etc.).
The way reports are produced in CommCare is still evolving, so there are a number of different frameworks and methods for generating reports. Some of these are legacy frameworks and should not be used for any future reports.
Recommended approaches for building reports¶
TODO: SQL reports, Elastic reports, Custom case lists / details,
Things to keep in mind:
- report API
- Fluff
- Ctable
- sqlagg
- couchdbkit-aggregate (legacy)
Example Custom Report Scaffolding¶
class MyBasicReport(GenericTabularReport, CustomProjectReport):
    name = "My Basic Report"
    slug = "my_basic_report"
    fields = ('corehq.apps.reports.filters.dates.DatespanFilter',)

    @property
    def headers(self):
        return DataTablesHeader(DataTablesColumn("Col A"),
                                DataTablesColumnGroup(
                                    "Group 1",
                                    DataTablesColumn("Col B"),
                                    DataTablesColumn("Col C")),
                                DataTablesColumn("Col D"))

    @property
    def rows(self):
        return [
            ['Row 1', 2, 3, 4],
            ['Row 2', 3, 2, 1]
        ]
Hooking up reports to CommCare HQ¶
Custom reports can be configured in code or in the database. To configure custom reports in code, follow these steps.
First, add the app to HQ_APPS in settings.py. The app must have an __init__.py and a models.py for Django to recognize it as an app.
Next, in settings.py, add a mapping from your domain(s) to the root of your custom reports module in the DOMAIN_MODULE_MAP variable.
Finally, add a mapping to your custom reports in the __init__.py of your custom reports submodule:
from myproject import reports

CUSTOM_REPORTS = (
    ('Custom Reports', (
        reports.MyCustomReport,
        reports.AnotherCustomReport,
    )),
)
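As a rough sketch of the first two steps, the relevant parts of settings.py might look like the fragment below. The app name `myproject` and the domain name `mydomain` are placeholders, and the shapes used here (a tuple of app names for HQ_APPS, a dict from domain name to module root for DOMAIN_MODULE_MAP) are assumptions based on the description above, so check them against your own settings.py.

```python
# settings.py (fragment); all names below are hypothetical

# Assumed to be a sequence of Django app names.
HQ_APPS = (
    # ... existing apps ...
    'myproject',
)

# Assumed to map each domain name to the root module that holds
# that domain's custom reports.
DOMAIN_MODULE_MAP = {
    'mydomain': 'myproject',
}
```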
Reporting on data stored in SQL¶
As described above, there are various ways of getting reporting data into an SQL database. From there we can query the data in a number of ways.
Extending the SqlData class¶
The SqlData class allows you to define how to query the data in a declarative manner by breaking down a query into a number of components.
class corehq.apps.reports.sqlreport.SqlData(config=None)
- columns
  Returns a list of Column objects. These are used to make up the from portion of the SQL query.
- filter_values
  Return a dict mapping the filter keys to actual values, e.g. {"enddate": date(2013, 1, 1)}
- filters
  Returns a list of filter statements. Filters are instances of sqlagg.filters.SqlFilter. See the sqlagg.filters module for a list of standard filters.
  e.g. [EQ('date', 'enddate')]
- group_by
  Returns a list of 'group by' column names.
- keys
  The list of report keys (e.g. users) or None to just display all the data returned from the query. Each value in this list should be a list of the same dimension as the 'group_by' list. If group_by is None then keys must also be None.
  These allow you to specify which rows you expect in the output data. Its main use is to add rows for keys that don't exist in the data.
  e.g.
      group_by = ['region', 'sub_region']
      keys = [['region1', 'sub1'], ['region1', 'sub2'] ... ]
- table_name = None
  The name of the table to run the query against.
This approach means you don’t write any raw SQL. It also allows you to easily include or exclude columns, format column values and combine values from different query columns into a single report column (e.g. calculate percentages).
In cases where some columns may have different filter values e.g. males vs females, sqlagg will handle executing the different queries and combining the results.
This class also implements the corehq.apps.reports.api.ReportDataSource.
See Report API and sqlagg for more info.
e.g.
from numbers import Number  # used by calc_percentage


class DemoReport(SqlTabularReport, CustomProjectReport):
    name = "SQL Demo"
    slug = "sql_demo"
    fields = ('corehq.apps.reports.filters.dates.DatespanFilter',)

    # The columns to include in the 'group by' clause
    group_by = ["user"]

    # The table to run the query against
    table_name = "user_report_data"

    @property
    def filters(self):
        return [
            BETWEEN('date', 'startdate', 'enddate'),
        ]

    @property
    def filter_values(self):
        return {
            "startdate": self.datespan.startdate_param_utc,
            "enddate": self.datespan.enddate_param_utc,
            "male": 'M',
            "female": 'F',
        }

    @property
    def keys(self):
        # would normally be loaded from couch
        return [["user1"], ["user2"], ['user3']]

    @property
    def columns(self):
        return [
            DatabaseColumn("Location", SimpleColumn("user_id"), format_fn=self.username),
            DatabaseColumn("Males", CountColumn("gender"), filters=self.filters+[EQ('gender', 'male')]),
            DatabaseColumn("Females", CountColumn("gender"), filters=self.filters+[EQ('gender', 'female')]),
            AggregateColumn(
                "C as percent of D",
                self.calc_percentage,
                [SumColumn("indicator_c"), SumColumn("indicator_d")],
                format_fn=self.format_percent)
        ]

    _usernames = {"user1": "Location1", "user2": "Location2", 'user3': "Location3"}  # normally loaded from couch

    def username(self, key):
        return self._usernames[key]

    def calc_percentage(self, num, denom):
        if isinstance(num, Number) and isinstance(denom, Number):
            if denom != 0:
                return num * 100 / denom
            else:
                return 0
        else:
            return None

    def format_percent(self, value):
        return format_datatables_data("%d%%" % value, value)
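The role of keys (adding rows for expected keys that the query did not return) can be illustrated without a database. The following is only a sketch of the idea, not how SqlData or sqlagg implement it; `fill_missing_keys` and the sample data are made up for illustration.

```python
def fill_missing_keys(rows, keys, empty_value=None):
    """Return one output row per expected key, in the order given by `keys`.

    `rows` maps a tuple of group_by values to the data for that group.
    Keys with no matching row get `empty_value` instead of being dropped.
    """
    return [(tuple(key), rows.get(tuple(key), empty_value)) for key in keys]


# group_by = ['region', 'sub_region']
keys = [['region1', 'sub1'], ['region1', 'sub2']]
# Pretend the query only returned data for one of the two expected keys.
query_rows = {('region1', 'sub1'): {'cases': 12}}

result = fill_missing_keys(query_rows, keys)
```

Each entry in keys has the same dimension as group_by, which is why the lookup uses the whole key as a tuple.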
Using the sqlalchemy API directly¶
TODO
Report API¶
Part of the evolution of the reporting frameworks has been the development of a report API. This is essentially just a change in the architecture of reports to separate the data from the display. The data can be produced in various formats, but the most common is a list of dicts.
e.g.
data = [
    {
        'slug1': 'abc',
        'slug2': 2
    },
    {
        'slug1': 'def',
        'slug2': 1
    },
    # ...
]
This is implemented by creating a report data source class that extends corehq.apps.reports.api.ReportDataSource and overriding the get_data() function.
class corehq.apps.reports.api.ReportDataSource(config=None)
These data sources can then be used independently of the CommCare reporting user interface and can also be reused for multiple use cases such as displaying the data in the CommCare UI as a table, displaying it in a map, making it available via HTTP etc.
An extension of this base data source class is the corehq.apps.reports.sqlreport.SqlData class, which simplifies creating data sources that get data by running an SQL query. See the section on SQL reporting for more info.
e.g.
from datetime import date


class CustomReportDataSource(ReportDataSource):
    def get_data(self):
        startdate = self.config['start']
        enddate = self.config['end']
        ...
        return data


config = {'start': date(2013, 1, 1), 'end': date(2013, 5, 1)}
ds = CustomReportDataSource(config)
data = ds.get_data()
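For a version that runs outside of CommCare HQ, here is a minimal sketch that fills in the elided body. The `ReportDataSource` class below is a stand-in with an assumed shape (it only stores `config`), and `VISITS` and `VisitCountDataSource` are hypothetical names used for illustration.

```python
from datetime import date


class ReportDataSource:
    """Stand-in for corehq.apps.reports.api.ReportDataSource (assumed shape)."""

    def __init__(self, config=None):
        self.config = config or {}

    def get_data(self):
        raise NotImplementedError


# Hypothetical in-memory "database" of visits.
VISITS = [
    {'user': 'user1', 'date': date(2013, 2, 10)},
    {'user': 'user2', 'date': date(2013, 3, 5)},
    {'user': 'user1', 'date': date(2013, 6, 1)},  # outside the range used below
]


class VisitCountDataSource(ReportDataSource):
    def get_data(self):
        start, end = self.config['start'], self.config['end']
        counts = {}
        for visit in VISITS:
            if start <= visit['date'] <= end:
                counts[visit['user']] = counts.get(visit['user'], 0) + 1
        # One dict per row, keyed by column slug.
        return [{'user': user, 'visits': n} for user, n in sorted(counts.items())]


ds = VisitCountDataSource({'start': date(2013, 1, 1), 'end': date(2013, 5, 1)})
data = ds.get_data()
```

Because the data source knows nothing about display, the same `data` could feed a table, a map, or an HTTP response.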
Adding dynamic reports¶
Domains support dynamic reports. Currently the only kind of dynamic report is the maps report. There is currently no documentation on how to use maps reports; however, you can look at the drew or aaharsneha domains on prod for examples.
How pillow/fluff work¶
Note: This should be rewritten, I wrote it when I was first trying to understand how fluff works.
A Pillow provides the ability to listen to a database, and on changes, the class BasicPillow calls change_transform and passes it the changed doc dict. This method can process the dict and transform it, or not. The result is then passed to the method change_transport, which must be implemented in any subclass of BasicPillow. This method is responsible for acting upon the changes.
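The transform-then-transport flow can be sketched in a few lines. This is a simplified stand-in, not the real BasicPillow: the `process_change` driver method and the `NameIndexPillow` subclass are hypothetical, and listening to an actual database is left out.

```python
class BasicPillow:
    """Stand-in for the real BasicPillow; only the two hooks are modelled."""

    def change_transform(self, doc_dict):
        # Default: pass the changed doc through untouched.
        return doc_dict

    def change_transport(self, doc_dict):
        raise NotImplementedError("subclasses must act on the change")

    def process_change(self, doc_dict):
        # Hypothetical driver method: transform first, then transport.
        self.change_transport(self.change_transform(doc_dict))


class NameIndexPillow(BasicPillow):
    def __init__(self):
        self.index = {}

    def change_transform(self, doc_dict):
        # Keep only the fields this pillow cares about.
        return {'_id': doc_dict['_id'], 'name': doc_dict.get('name', '').lower()}

    def change_transport(self, doc_dict):
        # "Acting upon the change" here just means updating an in-memory index.
        self.index[doc_dict['_id']] = doc_dict['name']


pillow = NameIndexPillow()
pillow.process_change({'_id': 'abc', 'name': 'Alice', 'age': 4})
```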
In fluff’s case, it stores an indicator document with some data calculated from a particular type of doc. When a relevant doc is updated, the calculations are performed. The diff between the old and new indicator docs is calculated, and sent to the db to update the indicator doc.
fluff’s Calculator object auto-detects all methods that are decorated by subclasses of base_emitter and stores them in a _fluff_emitters array. This is used by the calculate method to return a dict of emitter slugs mapped to the result of the emitter function (called with the newly updated doc) coerced to a list.
To rephrase: fluff emitters accept a doc and return a generator where each element corresponds to a contribution to the indicator.
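The auto-detection of emitters can be sketched as follows. This is an illustration of the pattern only: the `emitter` decorator, the `_is_emitter` marker and `VisitCalculator` are made-up names, and fluff's own decorators and bookkeeping differ in detail.

```python
def emitter(fn):
    """Mark a method as an emitter (stand-in for fluff's emitter decorators)."""
    fn._is_emitter = True
    return fn


class Calculator:
    def __init__(self):
        # Auto-detect every method that was marked by the decorator.
        cls = type(self)
        self._emitters = sorted(
            name for name in dir(cls)
            if getattr(getattr(cls, name), '_is_emitter', False)
        )

    def calculate(self, doc):
        # Map each emitter slug to the list of values it yields for this doc.
        return {slug: list(getattr(self, slug)(doc)) for slug in self._emitters}


class VisitCalculator(Calculator):
    @emitter
    def all_visits(self, doc):
        for visit in doc['visits']:
            yield visit['date']

    @emitter
    def home_visits(self, doc):
        for visit in doc['visits']:
            if visit['type'] == 'home':
                yield visit['date']


doc = {'visits': [
    {'date': '2013-01-05', 'type': 'home'},
    {'date': '2013-02-11', 'type': 'clinic'},
]}
result = VisitCalculator().calculate(doc)
```

Each yielded value is one contribution to the indicator, so a doc with two visits contributes twice to `all_visits` and once to `home_visits`.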
API¶
Bulk User Resource¶
bulk_user
v0.5
This resource is used to get basic user data in bulk, fast. This is especially useful if you need to get, say, the name and phone number of every user in your domain for a widget.
Currently the default fields returned are:
id
email
username
first_name
last_name
phone_numbers
Supported Parameters:¶
- q - query string
- limit - maximum number of results returned
- offset - use with limit to paginate results
- fields - restrict the fields returned to a specified set
Example query string:
?q=foo&fields=username&fields=first_name&fields=last_name&limit=100&offset=200
This will return the first and last names and usernames for users matching the query “foo”. This request is for the third page of results (200-300)
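A query string like the one above can be built with the standard library. The helper name `bulk_user_query` and its 1-based `page` argument are illustrative choices, not part of the API.

```python
from urllib.parse import urlencode


def bulk_user_query(q, fields, limit, page):
    """Build the query string for one page of bulk_user results.

    `page` is 1-based; the offset is derived from it.
    """
    params = [('q', q)]
    params += [('fields', field) for field in fields]  # repeat `fields` once per field
    params += [('limit', limit), ('offset', (page - 1) * limit)]
    return '?' + urlencode(params)


query = bulk_user_query('foo', ['username', 'first_name', 'last_name'], limit=100, page=3)
```

Passing a list of pairs to `urlencode` keeps the parameters in order and allows `fields` to appear more than once.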
Reporting: Maps in HQ¶
What is the “Maps Report”?¶
We now have map-based reports in HQ. The “maps report” is not really a report, in the sense that it does not query or calculate any data on its own. Rather, it’s a generic front-end visualization tool that consumes data from some other place... other places such as another (tabular) report, or case/form data (work in progress).
To create a map-based report, you must configure the map report template with specific parameters. These are:
- data_source – the backend data source which will power the report (required)
- display_config – customizations to the display/behavior of the map itself (optional, but suggested for anything other than quick prototyping)
There are two options for how this configuration actually takes place:
- via a domain’s “dynamic reports” (see Adding dynamic reports), where you can create specific configurations of a generic report for a domain
- subclass the map report to provide/generate the config parameters. You should not need to subclass any code functionality. This is useful for making a more permanent map configuration, and when the configuration needs to be dynamically generated based on other data or domain config (e.g., for CommTrack)
Orientation¶
Abstractly, the map report consumes a table of data from some source. Each row of the table is a geographical feature (point or region). One column is identified as containing the geographical data for the feature. All other columns are arbitrary attributes of that feature that can be visualized on the map. Another column may indicate the name of the feature.
The map report contains, obviously, a map. Features are displayed on the map, and may be styled in a number of ways based on feature attributes. The map also contains a legend generated for the current styling. Below the map is a table showing the raw data. Clicking on a feature or its corresponding row in the table will open a detail popup. The columns shown in the table and the detail popup can be customized.
Attribute data is generally treated as either being numeric data or enumerated data (i.e., belonging to a number of discrete categories).
Strings are inherently treated as enum data.
Numeric data can be treated as enum data by specifying thresholds: numbers will be mapped to enum ‘buckets’ between consecutive thresholds (e.g., thresholds of 10, 20 will create enum categories: < 10, 10-20, > 20).
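The bucketing can be sketched as below. The naming convention (a bucket is named after its lower threshold, and values below the lowest threshold go to "-") follows the description of thresholds under Display Configuration; the function `bucket_for` itself is hypothetical and is not the map report's actual code.

```python
from bisect import bisect_right


def bucket_for(value, thresholds):
    """Return the enum value of the bucket that `value` falls into.

    Each bucket runs from one threshold up to, but not including, the next,
    and is named after its lower threshold; values below the lowest
    threshold go to the special bucket "-".
    """
    thresholds = sorted(thresholds)
    index = bisect_right(thresholds, value)
    return '-' if index == 0 else thresholds[index - 1]


buckets = [bucket_for(v, [10, 20]) for v in (3, 10, 19.5, 20, 250)]
```

With thresholds of 10 and 20 this yields three buckets, "-", 10 and 20, which correspond to the categories < 10, 10-20 and > 20.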
Styling¶
Different aspects of a feature’s marker on the map can be styled based on its attributes. Currently supported visualizations (you may see these referred to in the code as “display axes” or “display dimensions”) are:
- varying the size (numeric data only)
- varying the color/intensity (numeric data (color scale) or enum data (fixed color palette))
- selecting an icon (enum data only)
Size and color may be used concurrently, so one attribute could vary size while another varies the color... this is useful when the size represents an absolute magnitude (e.g., # of pregnancies) while the color represents a ratio (% with complications). Region features (as opposed to point features) only support varying color.
A particular configuration of visualizations (which attributes are mapped to which display axes, and associated styling like scaling, colors, icons, thresholds, etc.) is called a metric. A map report can be configured with many different metrics. The user selects one metric at a time for viewing. Metrics may not correspond to table columns one-to-one, as a single column may be visualized multiple ways, or in combination with other columns, or not at all (shown in detail popup only). If no metrics are specified, they will be auto-generated from best guesses based on the available columns and data feeding the report.
There are several sample reports that comprehensively demo the potential styling options:
Data Sources¶
Set this config on the data_source property. It should be a dict with the following properties:
- geo_column – the column in the returned data that contains the geo point (default: "geo")
- adapter – which data adapter to use (one of the choices below)
- extra arguments specific to each data adapter
Note that any report filters in the map report are passed on verbatim to the backing data source.
One column of the data returned by the data source must be the geodata (in geo_column). For point features, this can be in the format of a geopoint xform question (e.g., 42.366 -71.104). The geodata format for region features is outside the scope of the document.
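To make the point format concrete, the sketch below parses a geopoint string into latitude and longitude. It assumes the value is space-separated with latitude first, and that any further components (such as altitude or accuracy) can be ignored; `parse_geopoint` is a hypothetical helper, not part of the map report.

```python
def parse_geopoint(value):
    """Split a geopoint string into (latitude, longitude) floats.

    Any further space-separated components are ignored.
    """
    parts = value.split()
    if len(parts) < 2:
        raise ValueError("expected at least 'lat lon', got %r" % value)
    lat, lon = float(parts[0]), float(parts[1])
    if not (-90 <= lat <= 90 and -180 <= lon <= 180):
        raise ValueError("coordinates out of range: %r" % value)
    return lat, lon


point = parse_geopoint("42.366 -71.104")
```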
report¶
Retrieve data from a ReportDataSource (the abstract data provider of Simon’s new reporting framework – see Report API)
Parameters:
- report – fully qualified name of ReportDataSource class
- report_params – dict of static config parameters for the ReportDataSource (optional)
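Putting the pieces together, a data_source config for the report adapter might look like this sketch. The class path and the contents of report_params are placeholders; only the key names come from the descriptions above.

```python
# Hypothetical data_source config for the `report` adapter.
data_source = {
    'adapter': 'report',
    'geo_column': 'geo',  # the default, shown here for clarity
    # Placeholder path; use the fully qualified name of your own class.
    'report': 'myproject.reports.MyReportDataSource',
    'report_params': {'region': 'north'},  # made-up static parameter
}
```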
legacyreport¶
Retrieve data from a GenericTabularReport which has not yet been refactored to use Simon’s new framework. Not ideal and should only be used for backwards compatibility. Tabular reports tend to return pre-formatted data, while the maps report works best with raw data (for example, it won’t know 4% or 30 mg are numeric data, and will instead treat them as text enum values). See Raw vs. Formatted Data.
Parameters:
- report – fully qualified name of tabular report view class (descends from GenericTabularReport)
- report_params – dict of static config parameters for the ReportDataSource (optional)
case¶
Pull case data similar to the Case List. (In the current implementation, you must use the same report filters as on the regular Case List report)
Parameters:
- geo_fetch – a mapping of case types to directives of how to pull geo data for a case of that type. Supported directives:
  - name of case property containing the geopoint data
  - "link:xxx" where xxx is the case type of a linked case; the adapter will then search that linked case for geo-data based on the directive of the linked case type (not supported yet)
  In the absence of any directive, the adapter will first search any linked Location record (not supported yet), then try the gps case property.
csv and geojson¶
Retrieve static data from a csv or geojson file on the server (only useful for testing/demo– this powers the demo reports, for example).
Display Configuration¶
Set this config on the display_config property. It should be a dict with the following properties:
(Whenever ‘column’ is mentioned, it refers to a column slug as returned by the data adapter)
All properties are optional. The map will attempt sensible defaults.
- name_column – column containing the name of the row; used as the header of the detail popup
- column_titles – a mapping of columns to display titles for each column
- detail_columns – a list of columns to display in the detail popup
- table_columns – a list of columns to display in the data table below the map
- enum_captions – display captions for enumerated values. A dict where each key is a column and each value is another dict mapping enum values to display captions. These enum values reflect the results of any transformations from metrics (including _other, _null, and -).
- numeric_format – a mapping of columns to functions that apply the appropriate numerical formatting for that column. Expressed as the body of a function that returns the formatted value (return statement required!). The unformatted value is passed to the function as the variable x.
- detail_template – an underscore.js template to format the content of the detail popup
- metrics – define visualization metrics (see Styling). An array of metrics, where each metric is a dict like so:
  - auto – column. Auto-generate a metric for this column with no additional manual input. Uses heuristics to determine best presentation format.
  OR
  - title – metric title in sidebar (optional)
  AND one of the following for each visualization property you want to control
  - size (static) – set the size of the marker (radius in pixels)
  - size (dynamic) – vary the size of the marker dynamically. A dict in the format:
    - column – column whose data to vary by
    - baseline – value that should correspond to a marker radius of 10px
    - min – min marker radius (optional)
    - max – max marker radius (optional)
  - color (static) – set the marker color (css color value)
  - color (dynamic) – vary the color of the marker dynamically. A dict in the format:
    - column – column whose data to vary by
    - categories – for enumerated data; a mapping of enum values to css color values. Mapping key may also be one of these magic values:
      - _other: a catch-all for any value not specified
      - _null: matches rows whose value is blank; if absent, such rows will be hidden
    - colorstops – for numeric data. Creates a sliding color scale. An array of colorstops, each of the format [<value>, <css color>].
    - thresholds – (optional) a helper to convert numerical data into enum data via “buckets”. Specify a list of thresholds. Each bucket comprises a range from one threshold up to but not including the next threshold. Values are mapped to the bucket whose range they lie in. The “name” (i.e., enum value) of a bucket is its lower threshold. Values below the lowest threshold are mapped to a special bucket called "-".
  - icon (static) – set the marker icon (image url)
  - icon (dynamic) – vary the icon of the marker dynamically. A dict in the format:
    - column – column whose data to vary by
    - categories – as in color, a mapping of enum values to icon urls
    - thresholds – as in color
  size and color may be combined (such as one column controlling size while another controls the color). icon must be used on its own.
For date columns, any relevant number in the above config (thresholds, colorstops, etc.) may be replaced with a date (in ISO format).
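As an illustration of how these properties fit together, here is a sketch of a display_config written as a Python dict. Every column slug, caption and color is made up; only the property names come from the list above, and the numeric_format value is assumed to be a JavaScript function body.

```python
# Hypothetical display_config; every column slug below is made up.
display_config = {
    'name_column': 'clinic_name',
    'column_titles': {
        'pregnancies': '# of pregnancies',
        'pct_complications': '% with complications',
    },
    'detail_columns': ['pregnancies', 'pct_complications', 'status'],
    'table_columns': ['clinic_name', 'pregnancies', 'pct_complications'],
    'enum_captions': {
        'status': {'open': 'Open', '_other': 'Other', '_null': 'Unknown'},
    },
    # Assumed to be a JavaScript function body; the raw value arrives as `x`.
    'numeric_format': {'pct_complications': "return x + '%';"},
    'metrics': [
        # Let the map guess a presentation for this column.
        {'auto': 'pregnancies'},
        # Size and color combined: magnitude as size, ratio as color.
        {
            'title': 'Pregnancies and complications',
            'size': {'column': 'pregnancies', 'baseline': 50, 'min': 5, 'max': 30},
            'color': {'column': 'pct_complications',
                      'colorstops': [[0, 'green'], [50, 'red']]},
        },
        # Enum data with a fixed palette and a catch-all category.
        {
            'title': 'Status',
            'color': {'column': 'status',
                      'categories': {'open': 'blue', '_other': 'gray'}},
        },
    ],
}
```

The second metric mirrors the case described under Styling, where size shows an absolute magnitude while color shows a ratio.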
Raw vs. Formatted Data¶
Consider the difference between raw and formatted data.
Numbers may be formatted for readability (12,345,678, 62.5%, 27 units); enums may be converted to human-friendly captions; null values may be represented as -- or n/a.
The maps report works best when it has the raw data and can perform these conversions itself.
The main reason is so that it may generate useful legends, which requires the ability to appropriately format values that may never appear in the report data itself.
There are three scenarios of how a data source may provide data:
- (worst) only provide formatted data
  Maps report cannot distinguish numbers from strings from nulls. Data visualizations will not be useful.
- (sub-optimal) provide both raw and formatted data (most likely via the legacyreport adapter)
  Formatted data will be shown to the user, but maps report will not know how to format data for display in legends, nor will it know all possible values for an enum field – only those that appear in the data.
- (best) provide raw data, and explicitly define enum lists and formatting functions in the report config
UI Helpers¶
There are a few useful UI helpers in our codebase which you should be aware of. Save time and create consistency.
Paginated CRUD View¶
Use corehq.apps.hqwebapp.views.CRUDPaginatedViewMixin with a TemplateView subclass (ideally one that also subclasses corehq.apps.hqwebapp.views.BasePageView or BaseSectionPageView) to have a paginated list of objects which you can create, update, or delete.
The Basic Paginated View¶
In its very basic form (a simple paginated view) it should look like:
class PuppiesCRUDView(BaseSectionPageView, CRUDPaginatedViewMixin):
    # your template should extend style/bootstrap2/base_paginated_crud.html
    template_name = 'puppyapp/paginated_puppies.html'

    # all the user-visible text
    limit_text = "puppies per page"
    empty_notification = "you have no puppies"
    loading_message = "loading_puppies"

    # required properties you must implement:

    @property
    def parameters(self):
        """
        Specify a GET or POST from an HttpRequest object.
        """
        # Usually, something like (request.method is upper case):
        return self.request.POST if self.request.method == 'POST' else self.request.GET

    @property
    def total(self):
        # How many documents are you paginating through?
        return Puppy.get_total()

    @property
    def column_names(self):
        # What will your row be displaying?
        return [
            "Name",
            "Breed",
            "Age",
        ]

    @property
    def paginated_list(self):
        """
        This should return a list (or generator object) of data formatted as follows:
        [
            {
                'itemData': {
                    'id': <id of item>,
                    <json dict of item data for the knockout model to use>
                },
                'template': <knockout template id>
            }
        ]
        """
        for puppy in Puppy.get_all():
            yield {
                'itemData': {
                    'id': puppy._id,
                    'name': puppy.name,
                    'breed': puppy.breed,
                    'age': puppy.age,
                },
                'template': 'base-puppy-template',
            }

    def post(self, *args, **kwargs):
        return self.paginated_crud_response
The template should use knockout templates to render the data you pass back to the view. Each template will have access to everything inside of itemData. Here’s an example:
{% extends 'style/bootstrap2/base_paginated_crud.html' %}
{% block pagination_templates %}
<script type="text/html" id="base-puppy-template">
<td data-bind="text: name"></td>
<td data-bind="text: breed"></td>
<td data-bind="text: age"></td>
</script>
{% endblock %}
Allowing Creation in your Paginated View¶
If you want to create data with your paginated view, you must implement the following:
class PuppiesCRUDView(BaseSectionPageView, CRUDPaginatedViewMixin):
    ...

    def get_create_form(self, is_blank=False):
        if self.request.method == 'POST' and not is_blank:
            return CreatePuppyForm(self.request.POST)
        return CreatePuppyForm()

    def get_create_item_data(self, create_form):
        new_puppy = create_form.get_new_puppy()
        return {
            'newItem': {
                'id': new_puppy._id,
                'name': new_puppy.name,
                'breed': new_puppy.breed,
                'age': new_puppy.age,
            },
            # you could use base-puppy-template here, but you might want to add an update button to the
            # base template.
            'template': 'new-puppy-template',
        }
The form returned in get_create_form() should make use of crispy forms.
from django import forms
from django.utils.safestring import mark_safe
from crispy_forms.helper import FormHelper
from crispy_forms.layout import Layout
from crispy_forms.bootstrap import StrictButton, InlineField


class CreatePuppyForm(forms.Form):
    name = forms.CharField()
    breed = forms.CharField()
    dob = forms.DateField()

    def __init__(self, *args, **kwargs):
        super(CreatePuppyForm, self).__init__(*args, **kwargs)
        self.helper = FormHelper()
        self.helper.form_style = 'inline'
        self.helper.form_show_labels = False
        self.helper.layout = Layout(
            InlineField('name'),
            InlineField('breed'),
            InlineField('dob'),
            StrictButton(
                mark_safe('<i class="icon-plus"></i> %s' % "Create Puppy"),
                css_class='btn-success',
                type='submit'
            )
        )

    def get_new_puppy(self):
        # return new Puppy
        return Puppy.create(self.cleaned_data)
Allowing Updating in your Paginated View¶
If you want to update data with your paginated view, you must implement the following:
class PuppiesCRUDView(BaseSectionPageView, CRUDPaginatedViewMixin):
    ...

    def get_update_form(self, initial_data=None):
        if self.request.method == 'POST' and self.action == 'update':
            return UpdatePuppyForm(self.request.POST)
        return UpdatePuppyForm(initial=initial_data)

    @property
    def paginated_list(self):
        for puppy in Puppy.get_all():
            yield {
                'itemData': {
                    'id': puppy._id,
                    # ...
                    # make sure you add in this line, so you can use the form in your template:
                    'updateForm': self.get_update_form_response(
                        self.get_update_form(puppy.initial_form_data)
                    ),
                },
                'template': 'base-puppy-template',
            }

    @property
    def column_names(self):
        return [
            # ...
            # if you're adding another column to your template, be sure to give it a name here...
            _('Action'),
        ]

    def get_updated_item_data(self, update_form):
        updated_puppy = update_form.update_puppy()
        return {
            'itemData': {
                'id': updated_puppy._id,
                'name': updated_puppy.name,
                'breed': updated_puppy.breed,
                'age': updated_puppy.age,
            },
            'template': 'base-puppy-template',
        }
The UpdatePuppyForm should look something like:
from crispy_forms.bootstrap import FormActions, StrictButton
from crispy_forms.layout import HTML, Div, Field, Layout


class UpdatePuppyForm(CreatePuppyForm):
    item_id = forms.CharField(widget=forms.HiddenInput())

    def __init__(self, *args, **kwargs):
        super(UpdatePuppyForm, self).__init__(*args, **kwargs)
        self.helper.form_style = 'default'
        self.helper.form_show_labels = True
        self.helper.layout = Layout(
            Div(
                Field('item_id'),
                Field('name'),
                Field('breed'),
                Field('dob'),
                css_class='modal-body'
            ),
            FormActions(
                StrictButton(
                    "Update Puppy",
                    css_class='btn-primary',
                    type='submit',
                ),
                HTML('<button type="button" class="btn" data-dismiss="modal">Cancel</button>'),
                css_class='modal-footer'
            )
        )

    def update_puppy(self):
        return Puppy.update_puppy(self.cleaned_data)
You should add the following to your base-puppy-template knockout template:
<script type="text/html" id="base-puppy-template">
...
<td> <!-- actions -->
<button type="button"
data-toggle="modal"
data-bind="
attr: {
'data-target': '#update-puppy-' + id
}
"
class="btn btn-primary">
Update Puppy
</button>
<div class="modal hide fade"
data-bind="
attr: {
id: 'update-puppy-' + id
}
">
<div class="modal-header">
<button type="button" class="close" data-dismiss="modal" aria-hidden="true">×</button>
<h3>
Update puppy <strong data-bind="text: name"></strong>:
</h3>
</div>
<div data-bind="html: updateForm"></div>
</div>
</td>
</script>
Allowing Deleting in your Paginated View¶
If you want to delete data with your paginated view, you should implement something like the following:
class PuppiesCRUDView(BaseSectionPageView, CRUDPaginatedViewMixin):
    ...

    def get_deleted_item_data(self, item_id):
        deleted_puppy = Puppy.get(item_id)
        deleted_puppy.delete()
        return {
            'itemData': {
                'id': deleted_puppy._id,
                # ...
            },
            'template': 'deleted-puppy-template',  # don't forget to implement this!
        }
You should add the following to your base-puppy-template knockout template:
<script type="text/html" id="base-puppy-template">
...
<td> <!-- actions -->
...
<button type="button"
data-toggle="modal"
data-bind="
attr: {
'data-target': '#delete-puppy-' + id
}
"
class="btn btn-danger">
<i class="icon-remove"></i> Delete Puppy
</button>
<div class="modal hide fade"
data-bind="
attr: {
id: 'delete-puppy-' + id
}
">
<div class="modal-header">
<button type="button" class="close" data-dismiss="modal" aria-hidden="true">×</button>
<h3>
Delete puppy <strong data-bind="text: name"></strong>?
</h3>
</div>
<div class="modal-body">
<p>
Yes, delete the puppy named <strong data-bind="text: name"></strong>.
</p>
</div>
<div class="modal-footer">
<button type="button"
class="btn"
data-dismiss="modal">
Cancel
</button>
<button type="button"
class="btn btn-danger delete-item-confirm"
data-loading-text="Deleting Puppy...">
<i class="icon-remove"></i> Delete Puppy
</button>
</div>
</div>
</td>
</script>
Refreshing The Whole List Based on Update¶
If you want to do something that affects an item’s position in the list (generally, moving it to the top), this is the feature you want.
You implement the following method (note that a return is not expected):
class PuppiesCRUDView(BaseSectionPageView, CRUDPaginatedViewMixin):
    ...

    def refresh_item(self, item_id):
        # refresh the item here
        puppy = Puppy.get(item_id)
        puppy.make_default()
        puppy.save()
Add a button like this to your template:
<button type="button"
class="btn refresh-list-confirm"
data-loading-text="Making Default...">
Make Default Puppy
</button>
Now go on and make some CRUD paginated views!
Using Class-Based Views in CommCare HQ¶
We should move away from function-based views in django and use class-based views instead. The goal of this section is to point out the infrastructure we’ve already set up to keep the UI standardized.
The Base Classes¶
There are two styles of pages in CommCare HQ. One is centered (e.g. registration, org settings or the list of projects). The other is two-column, with the left gray column acting as navigation and the right column displaying the primary content (pages under major sections like reports).
A Basic (Centered) Page¶
To get started, subclass BasePageView in corehq.apps.hqwebapp.views. BasePageView is a subclass of django’s TemplateView.
class MyCenteredPage(BasePageView):
    urlname = 'my_centered_page'
    page_title = "My Centered Page"
    template_name = 'path/to/template.html'

    @property
    def page_url(self):
        # often this looks like:
        return reverse(self.urlname)

    @property
    def page_context(self):
        # You want to do as little logic as possible here.
        # Better to divvy up logical parts of your view in other instance methods or properties
        # to keep things clean.
        # You can also do stuff in the get() and post() methods.
        return {
            'some_property': self.compute_my_property(),
            'my_form': self.centered_form,
        }
- urlname
  This is what django urls uses to identify your page
- page_title
  This text will show up in the <title> tag of your template. It will also show up in the primary heading of your template.
  If you want to use a property in that title that would only be available after your page is instantiated, you should override:
      @property
      def page_name(self):
          return mark_safe("This is a page for <strong>%s</strong>" % self.kitten.name)
  page_name will not show up in the <title> tags, as you can include html in this name.
- template_name
  Your template should extend style/bootstrap2/base_page.html
  It might look something like:
      {% extends 'style/bootstrap2/base_page.html' %}

      {% block js %}{{ block.super }}
          {# some javascript imports #}
      {% endblock %}

      {% block js-inline %}{{ block.super }}
          {# some inline javascript #}
      {% endblock %}

      {% block page_content %}
          My page content! Woo!
      {% endblock %}

      {% block modals %}{{ block.super }}
          {# a great place to put modals #}
      {% endblock %}
A Section (Two-Column) Page¶
To get started, subclass BaseSectionPageView in corehq.apps.hqwebapp.views. You should implement all the things described in the minimal setup for A Basic (Centered) Page in addition to:
class MySectionPage(BaseSectionPageView):
    ...  # everything from BasePageView
    section_name = "Data"
    template_name = 'my_app/path/to/template.html'

    @property
    def section_url(self):
        return reverse('my_section_default')
Note
Domain Views
If your view uses domain, you should subclass BaseDomainView. This inserts the domain name into the main_context and adds the login_and_domain_required permission. It also implements page_url to assume the basic reverse for a page in a project: reverse(self.urlname, args=[self.domain])
- section_name
  This shows up as the root name on the section breadcrumbs.
- template_name
  Your template should extend style/bootstrap2/base_section.html
  It might look something like:
      {% extends 'style/bootstrap2/base_section.html' %}

      {% block js %}{{ block.super }}
          {# some javascript imports #}
      {% endblock %}

      {% block js-inline %}{{ block.super }}
          {# some inline javascript #}
      {% endblock %}

      {% block main_column %}
          My page content! Woo!
      {% endblock %}

      {% block modals %}{{ block.super }}
          {# a great place to put modals #}
      {% endblock %}
Note
Organizing Section Templates
Currently, the practice is to extend style/bootstrap2/base_section.html in a base template for your section (e.g. users/base_template.html) and your section page will then extend its section’s base template.
Adding to Urlpatterns¶
Your urlpatterns should look something like:
urlpatterns = patterns(
    'corehq.apps.my_app.views',
    ...,
    url(r'^my/page/path/$', MyCenteredPage.as_view(), name=MyCenteredPage.urlname),
)
Hierarchy¶
If you have a hierarchy of pages, you can implement the following in your class:
class MyCenteredPage(BasePageView):
    ...

    @property
    def parent_pages(self):
        # This will show up in breadcrumbs as MyParentPage > MyNextPage > MyCenteredPage
        return [
            {
                'title': MyParentPage.page_title,
                'url': reverse(MyParentPage.urlname),
            },
            {
                'title': MyNextPage.page_title,
                'url': reverse(MyNextPage.urlname),
            },
        ]
If you have a hierarchy of pages, it might be wise to implement a BaseParentPageView or Base<InsertSectionName>View that extends the main_context property. That way all of the pages in that section have access to the section’s context. All page-specific context should go in page_context.
class BaseKittenSectionView(BaseSectionPageView):

    @property
    def main_context(self):
        # super() must name this class, otherwise the lookup fails
        main_context = super(BaseKittenSectionView, self).main_context
        main_context.update({
            'kitten': self.kitten,
        })
        return main_context
Permissions¶
To add permissions decorators to a class-based view, you need to decorate the dispatch instance method.
from django.utils.decorators import method_decorator


class MySectionPage(BaseSectionPageView):
    ...

    @method_decorator(can_edit)
    def dispatch(self, request, *args, **kwargs):
        return super(MySectionPage, self).dispatch(request, *args, **kwargs)
GETs and POSTs (and other http methods)¶
Depending on the type of request, you might want to do different things.
class MySectionPage(BaseSectionPageView):
    ...

    def get(self, request, *args, **kwargs):
        # do stuff related to GET here...
        return super(MySectionPage, self).get(request, *args, **kwargs)

    def post(self, request, *args, **kwargs):
        # do stuff related to post here...
        return self.get(request, *args, **kwargs)  # or any other HttpResponse object
Limiting HTTP Methods¶
If you want to limit the HTTP request types to just GET or POST, you just have to override the http_method_names class property:
class MySectionPage(BaseSectionPageView):
    ...
    http_method_names = ['post']
Note
Other Allowed Methods
put, delete, head, options, and trace are all allowed methods by default.
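To make the effect of http_method_names concrete, here is a minimal sketch of the dispatch idea. MiniView and PostOnlyView are hypothetical stand-ins written for this illustration, not Django's or HQ's actual classes: the request method is lower-cased, checked against the allowed names, and anything else is answered with a 405.

```python
class MiniView(object):
    """Toy stand-in for a class-based view's method dispatch (not Django itself)."""

    # mirrors the defaults named in the note above
    http_method_names = ['get', 'post', 'put', 'delete', 'head', 'options', 'trace']

    def dispatch(self, method):
        name = method.lower()
        handler = None
        if name in self.http_method_names:
            handler = getattr(self, name, None)
        if handler is None:
            return 405  # Method Not Allowed
        return handler()


class PostOnlyView(MiniView):
    # restricting the list makes every other method a 405,
    # even though a get() handler is defined
    http_method_names = ['post']

    def get(self):
        return 200

    def post(self):
        return 200


view = PostOnlyView()
get_status = view.dispatch('GET')
post_status = view.dispatch('POST')
```

Here get_status is 405 and post_status is 200, which is the behaviour you get from overriding http_method_names on a real view.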
Playing nice with Cloudant/CouchDB¶
We have a lot of views:
$ find . -path *_design*/map.js | wc -l
159
Things to know about views:
- Every time you create or update a doc, each map function is run on it and the btree for the view is updated based on the change in what the maps emit for that doc. Deleting a doc causes the btree to be updated as well.
- Every time you update a view, all views in the design doc need to be run, from scratch, in their entirety, on every single doc in the database, regardless of doc_type.
Things to know about our Cloudant cluster:
- It’s slow. You have to wait in line just to say “hi”. Want to fetch a single doc? So does everyone else. Get in line, I’ll be with you in just 1000ms.
- That’s pretty much it.
Takeaways:
- Don’t save docs! If nothing changed in the doc, just don’t save it. Couchdb isn’t smart enough to realize that nothing changed, so saving it incurs most of the overhead of saving a doc that actually changed.
- Don’t make http requests! If you need a bunch of docs by id, get them all in one request or a few large requests using dimagi.utils.couch.database.iter_docs.
- Don’t make http requests! If you want to save a bunch of docs, save them all at once (after excluding the ones that haven’t changed and don’t need to be saved!) using MyClass.get_db().bulk_save(docs). If you’re writing application code that touches a number of related docs in a number of different places, and you want to bulk save them, you can use dimagi.utils.couch.bulk.CouchTransaction. Note that this isn’t good for saving thousands of documents, because it doesn’t do any chunking.
- Don’t save too many docs in too short a time! To give the views time to catch up, rate-limit your saves if going through hundreds of thousands of docs. One way to do this is to save N docs and then make a tiny request to the view you think will be slowest to update, and then repeat.
- Use different databases! All forms and cases save to the main database, but there is a _meta database we have just added for new doc or migrated doc types. When you use a different database you create two advantages: a) Documents you save don’t contribute to the view indexing load of all of the views in the main database. b) Views you add don’t have to run on all forms and cases.
- Split views! When a single view changes, the entire design doc has to reindex. If you make a new view, it’s much better to make a new design doc for it than to put it in with some other big, possibly expensive views. We use the couchapps folder/app for this.
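The first few takeaways can be combined into one small helper. The following is only a sketch under stated assumptions: save_changed_docs and chunked are hypothetical names, bulk_save is any callable that accepts a list of docs (standing in for something like MyClass.get_db().bulk_save), and a plain sleep replaces the "tiny request to the slowest view" trick as the rate limit.

```python
import time


def chunked(items, size):
    """Yield successive lists of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def save_changed_docs(docs, originals, bulk_save, chunk_size=100, pause=0.0):
    """Save only docs that differ from their stored version, in chunks.

    `originals` maps doc id -> the doc as it was last read from the database.
    `bulk_save` is a hypothetical stand-in for a real bulk save call.
    Returns the number of docs actually saved.
    """
    # Don't save docs! Skip anything that has not changed.
    changed = [doc for doc in docs if originals.get(doc['_id']) != doc]
    saved = 0
    for chunk in chunked(changed, chunk_size):
        bulk_save(chunk)  # one request per chunk instead of one per doc
        saved += len(chunk)
        if pause:
            time.sleep(pause)  # crude rate limit so view indexing can catch up
    return saved
```

For example, with three docs of which one is unchanged and chunk_size=1, bulk_save is called twice and the function returns 2.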
Forms in HQ¶
Best practice principles:
- Use as little hardcoded HTML as possible.
- Submit and validate forms asynchronously to your class-based-view’s post method.
- Protect forms against CSRF
- Be consistent with style across HQ. We are currently using Bootstrap 2.3’s horizontal forms across HQ.
- Use django.forms.
- Use crispy forms <http://django-crispy-forms.readthedocs.org/en/latest/> for field layout.
Making forms CSRF safe¶
HQ is protected against cross-site request forgery attacks: if a POST/PUT/DELETE request doesn’t pass a csrf token to the corresponding View, the View will reject the request with a 403 response. All HTML forms and AJAX calls that make such requests should contain a csrf token to succeed. Making a form or AJAX code pass a csrf token is easy, and the Django docs give detailed instructions on how to do so. Here we list examples of HQ code that does that:
- If crispy form is used to render HTML form, csrf token is included automagically
- For raw HTML form, use {% csrf_token %} tag in the form HTML, see tag_csrf_example.
- If request is made via AJAX, it will be automagically protected by ajax_csrf_setup.js (which is included in base bootstrap template) as long as your template is inherited from the base template. (ajax_csrf_setup.js overrides $.ajaxSettings.beforeSend to accomplish this)
- If an AJAX call needs to override beforeSend itself, then the super $.ajaxSettings.beforeSend should be explicitly called to pass csrf token. See ajax_csrf_example
- If the request is made via an AngularJS controller, the angular app needs to be configured to send the csrf token. See angular_csrf_example
- If an HTML form is created in Javascript using raw nodes, a csrf-token node should be added to that form. See js_csrf_example_1 and js_csrf_example_2
- If an inline form is generated outside of a RequestContext using render_to_string or its cousins, use the csrf_inline custom tag. See inline_csrf_example
- If a View needs to be exempted from the csrf check (for whatever reason, say for an API), use the csrf_exempt decorator. See csrf_exempt_example
- For any other special unusual case refer to Django docs. Essentially, either the HTTP request needs to have a csrf-token or the corresponding View should be exempted from CSRF check.
An Example Complex Asynchronous Form With Partial Fields¶
We create the following base form, subclassing django.forms.Form:
import json

from django import forms
from django.utils.translation import ugettext as _
from crispy_forms.bootstrap import FormActions, StrictButton
from crispy_forms.helper import FormHelper
from crispy_forms import layout as crispy


class PersonForm(forms.Form):
    first_name = forms.CharField()
    last_name = forms.CharField()
    pets = forms.CharField(widget=forms.HiddenInput)

    def __init__(self, *args, **kwargs):
        # assumption for this example: the view may pass cancel_url as a keyword argument
        cancel_url = kwargs.pop('cancel_url', '#')
        super(PersonForm, self).__init__(*args, **kwargs)
        self.helper = FormHelper()
        self.helper.layout = crispy.Layout(
            # all kwargs passed to crispy.Field turn into that tag's attributes and underscores
            # become hyphens. so data_bind="value: name" gets inserted as data-bind="value: name"
            crispy.Field('first_name', data_bind="value: first_name"),
            crispy.Field('last_name', data_bind="value: last_name"),
            crispy.Div(
                data_bind="template: {name: 'pet-form-template', foreach: pets}, "
                          "visible: isPetVisible"
            ),
            # form actions creates the gray box around the submit / cancel buttons
            FormActions(
                StrictButton(
                    _("Update Information"),
                    css_class="btn-primary",
                    type="submit",
                ),
                # todo: add a cancel 'button' class!
                crispy.HTML('<a href="%s" class="btn">Cancel</a>' % cancel_url),
                # alternatively, the following works if you capture the name="cancel"'s event in js:
                crispy.Button('cancel', 'Cancel'),
            ),
        )

    @property
    def current_values(self):
        # iterate over this form's own fields (the form has no person_form attribute)
        values = dict((name, self[name].value()) for name in self.fields)
        # here's where you would make sure events outputs the right thing
        # in this case, a list so it gets converted to an ObservableArray for the knockout model
        return values

    def clean_first_name(self):
        first_name = self.cleaned_data['first_name']
        # validate
        return first_name

    def clean_last_name(self):
        last_name = self.cleaned_data['last_name']
        # validate
        return last_name

    def clean_pets(self):
        # since we could have any number of pets we tell knockout to store it as json in a hidden field
        pets = json.loads(self.cleaned_data['pets'])
        # validate pets
        # suggestion:
        errors = []
        for pet in pets:
            pet_form = PetForm(pet)
            pet_form.is_valid()
            errors.append(pet_form.errors)
        # raise errors as necessary
        return pets


class PetForm(forms.Form):
    nickname = forms.CharField()

    def __init__(self, *args, **kwargs):
        super(PetForm, self).__init__(*args, **kwargs)
        self.helper = FormHelper()
        # since we're using this form to 'nest' inside of PersonForm, we want to prevent
        # crispy forms from auto-including a form tag:
        self.helper.form_tag = False
        self.helper.layout = crispy.Layout(
            crispy.Field('nickname', data_bind="value: nickname"),
        )
The view will look something like:
import json

from django.http import Http404, HttpResponse


class PersonFormView(BaseSectionPageView):
    # see documentation on ClassBasedViews for use of BaseSectionPageView
    template_name = 'people/person_form.html'
    allowed_post_actions = [
        'person_update',
        'select2_field_update',  # an example of another action you might consider
    ]

    @property
    @memoized
    def person_form(self):
        if self.request.method == 'POST':
            return PersonForm(self.request.POST, initial={})
        return PersonForm(initial={})

    @property
    def page_context(self):
        return {
            'form': self.person_form,
            'pet_form': PetForm(),
        }

    @property
    def post_action(self):
        return self.request.POST.get('action')

    def post(self, *args, **kwargs):
        if self.post_action in self.allowed_post_actions:
            # looks up e.g. self.person_update_response for action 'person_update'
            return HttpResponse(json.dumps(getattr(self, '%s_response' % self.post_action)))
        # NOTE: doing the entire form asynchronously means that you have to explicitly handle the display of
        # errors for each field. Ideally we should subclass crispy.Field to something like KnockoutField
        # where we'd add something in the template for errors.
        raise Http404()

    @property
    def person_update_response(self):
        if self.person_form.is_valid():
            return {
                'data': self.person_form.current_values,
            }
        return {
            'errors': self.person_form.errors.as_json(),
            # note errors looks like:
            # {'field_name': [{'message': "msg", 'code': "invalid"}, {'message': "msg", 'code': "required"}]}
        }
The template people/person_form.html:
{% extends 'people/base_template.html' %}
{% load hq_shared_tags %}
{% load i18n %}
{% load crispy_forms_tags %}
{% block js %}{{ block.super }}
<script src="{% static 'people/ko/form.person.js' %}"></script>
{% endblock %}
{% block js-inline %}{{ block.super }}
<script>
var personFormModel = new PersonFormModel(
{{ form.current_values|JSON }}
);
$('#person-form').koApplyBindings(personFormModel);
personFormModel.init();
</script>
{% endblock %}
{% block main_column %}
<div id="person-form">
<form class="form form-horizontal" method="post">
{% crispy form %}
</form>
</div>
<script type="text/html" id="pet-form-template">
{% crispy pet_form %}
</script>
{% endblock %}
Your knockout code in form.person.js:
var PersonFormModel = function (initial) {
'use strict';
var self = this;
self.first_name = ko.observable(initial.first_name);
self.last_name = ko.observable(initial.last_name);
self.petObjects = ko.observableArray();
self.pets = ko.computed(function () {
return JSON.stringify(_.map(self.petObjects(), function (pet) {
return pet.asJSON();
}));
});
self.init = function () {
var pets = $.parseJSON(initial.pets || '[]');
self.petObjects(_.map(pets, function (initial_data) {
return new Pet(initial_data);
}));
};
};
var Pet = function (initial) {
'use strict';
var self = this;
self.nickname = ko.observable(initial.nickname);
self.asJSON = ko.computed(function () {
return {
nickname: self.nickname()
}
});
};
That should hopefully get you 90% there. For an example on HQ see corehq.apps.reminders.views.CreateScheduledReminderView <https://github.com/dimagi/commcare-hq/blob/master/corehq/apps/reminders/views.py#L486>
HQ Management Commands¶
This is a list of useful management commands. They can be run using $ python manage.py <command> or $ ./manage.py <command>.
For more information on a specific command, run
$ ./manage.py <command> --help
- bootstrap
- Bootstrap a domain and user who owns it. Usage: $ ./manage.py bootstrap [options] <domain> <email> <password>
- bootstrap_app
- Bootstrap an app in an existing domain. Usage: $ ./manage.py bootstrap_app [options] <domain_name> <app_name>
- clean_pyc
- Removes all python bytecode (.pyc) compiled files from the project.
- copy_domain
- Copies the contents of a domain to another database. Usage: $ ./manage.py copy_domain [options] <sourcedb> <domain>
- ptop_fast_reindex_fluff
- Fast reindex of fluff docs. Usage: $ ./manage.py ptop_fast_reindex_fluff [options] <domain> <pillow_class>
- run_ptop
- Run the pillowtop management command to scan all _changes feeds
- runserver
- Starts a lightweight web server for development which outputs additional debug information.
--werkzeug
Tells Django to use the Werkzeug interactive debugger.
- syncdb
- Create the database tables for all apps in INSTALLED_APPS whose tables haven’t already been created, except those which use migrations.
--migrate
Tells South to also perform migrations after the sync.
- test
- Runs the test suite for the specified applications, or the entire site if no apps are specified. Usage: $ ./manage.py test [options] [appname ...]
CommTrack¶
What happens during a CommTrack submission?¶
This is the life-cycle of an incoming stock report via sms.
1. SMS is received and relevant info therein is parsed out
2. The parsed sms is converted to an HQ-compatible xform submission. This includes:
   - stock/requisition info (i.e., just the data provided in the sms)
   - location to which this message applies (provided in message or associated with sending user)
   - standard HQ submission meta-data (submit time, user, etc.)
   Notably missing: anything that updates cases
3. The submission is not submitted yet, but rather processed further on the server. This includes:
   - looking up the product sub-cases that actually store stock/consumption values. (step (2) looked up the location ID; each supply point is a case associated with that location, and actual stock data is stored in a sub-case – one for each product – of the supply point case)
   - applying the stock actions for each product in the correct order (a stock report can include multiple actions; these must be applied in a consistent order or else unpredictable stock levels may result)
   - computing updated stock levels and consumption (using somewhat complex business and reconciliation logic)
   - dumping the result in case blocks (added to the submission) that will update the new values in HQ’s database
   - post-processing also makes some changes elsewhere in the instance, namely:
     - also added are ‘inferred’ transactions (if my stock was 20, is now 10, and I had receipts of 15, my inferred consumption was 25). This is needed to compute consumption rate later. Conversely, if a deployment tracks consumption instead of receipts, receipts are inferred this way.
     - transactions are annotated with the order in which they were processed
   Note that normally CommCare generates its own case blocks in the forms it submits.
4. The updated submission is submitted to HQ like a normal form
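The ‘inferred’ transaction arithmetic described above can be written down directly. This is a minimal sketch of just that balance equation; the function names are hypothetical, and the real server-side logic (described above as somewhat complex business and reconciliation logic) does considerably more.

```python
def infer_consumption(previous_stock, current_stock, receipts):
    """Stock that must have been consumed between two stock reports.

    Balance equation: previous + receipts - consumption = current
    """
    return previous_stock + receipts - current_stock


def infer_receipts(previous_stock, current_stock, consumption):
    """The converse case: a deployment that tracks consumption instead of receipts."""
    return current_stock - previous_stock + consumption


# the example from the text: stock was 20, is now 10, receipts were 15
inferred = infer_consumption(previous_stock=20, current_stock=10, receipts=15)
```

With the numbers from the text, inferred is 25, and feeding that consumption back into infer_receipts returns the original 15 receipts.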
Submitting a stock report via CommCare¶
CommTrack-enabled CommCare submits xforms, but those xforms do not go through the post-processing step in (3) above. Therefore these forms must generate their own case blocks and mimic the end result that commtrack expects. This is severely lacking as we have not replicated the full logic from the server in these xforms (unsure if that’s even possible, nor do we like the prospect of maintaining the same logic in two places), nor can these forms generate the inferred transactions. As such, the capabilities of the mobile app are greatly restricted and cannot support features like computing consumption.
This must be fixed and it’s really not worth even discussing much else about using a mobile app until it is.
CloudCare¶
Overview¶
The goal of this section is to give an overview of the CloudCare system for developers who are new to CloudCare. It should allow one’s first foray into the system to be as painless as possible by giving him or her a high level overview of the system.
Backbone¶
On the frontend, CloudCare is a single page backbone.js app. The app, module, form, and case selection parts of the interface are rendered by backbone while the representation of the form itself is controlled by touchforms (described below).
When a user navigates CloudCare, the browser is not making full page reload requests to our Django server. Instead, javascript is used to modify the contents of the page and change the url in the address bar. Whenever a user directly enters a CloudCare url like /a/<domain>/cloudcare/apps/<urlPath> into the browser, the cloudcare_main view is called. This page loads the backbone app and perhaps bootstraps it with the currently selected app and case.
The Backbone Views¶
The backbone app consists of several Backbone.View subclasses. What follows is a brief description of several of the most important classes used in the CloudCare backbone app.
cloudCare.AppListView
- Renders the list of apps in the current domain on the left hand side of the page.
cloudCare.ModuleListView
- Renders the list of modules in the currently selected app on the left hand side of the page.
cloudCare.FormListView
- Renders the list of forms in the currently selected module on the left hand side of the page.
cloudCare.CaseMainView
- Renders the list of cases for the selected form. Note that this list is populated asynchronously.
cloudCare.CaseDetailsView
- Renders the table displaying the currently selected case’s properties.
cloudCare.AppView
- AppView holds the module and form list views. It is also responsible for inserting the form html into the DOM. This html is constructed using JSON returned by the touchforms process and several js libs found in the /touchforms/formplayer/static/formplayer/script/ directory. This is kicked off by the AppView’s _playForm method. AppView also inserts cloudCare.CaseMainViews as necessary.
cloudCare.AppMainView
- AppMainView (not to be confused with AppView) holds all of the other views and is the entry point for the application. Most of the application’s event handling is set up inside AppMainView’s initialize method. The AppMainView has a router. Event handlers are set on this router to modify the state of the backbone application when the browser’s back button is used, or when the user enters a link to a certain part of the app (like a particular form) directly.
Touchforms¶
The backbone app is not responsible for processing the XForm. This is done instead by our XForms player, touchforms. Touchforms runs as a separate process on our servers, and sends JSON to the backbone application representing the structure of the XForm. Touchforms is written in Jython, and serves as a wrapper around the JavaRosa core that powers our mobile applications.
Offline Cloudcare¶
What is it?¶
First of all, the “offline” part is a misnomer. This does not let you use CloudCare completely offline. We need a new name.
Normal CloudCare requires a round-trip request to the HQ touchforms daemon every time you answer/change a question in a form. This is how it can handle validation logic and conditional questions with the exact same behavior as on the phone. On high-latency or unreliable internet this is a major drag.
“Offline” CloudCare fixes this by running a local instance of the touchforms daemon. CloudCare (in the browser) communicates with this daemon for all matters of maintaining the xform session state. However, CloudCare still talks directly to HQ for other CloudCare operations, such as initial launch of a form, submitting the completed form, and everything outside a form session (case list/select, etc.). Also, the local daemon itself will call out to HQ as needed by the form, such as querying against the casedb. So you still need internet!
How does it work?¶
The touchforms daemon (i.e., the standard JavaRosa/CommCare core with a Jython wrapper) is packaged up as a standalone jar that can be run from pure Java. This requires bundling the Jython runtime. This jar is then served as a “Java Web Start” (aka JNLP) application (same as how you download and run WebEx).
When CloudCare is in offline mode, it will prompt you to download the app; once you do, the app will auto-launch. CloudCare will poll the local port the app should be running on, and once it’s ready, will then initialize the form session and direct all touchforms queries to the local instance rather than HQ.
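The "poll the local port until it is ready" step is a common pattern. CloudCare does this from the browser; the following Python sketch only illustrates the idea, and the function name, timeout and interval values are assumptions made for this example rather than anything taken from the CloudCare code.

```python
import socket
import time


def wait_for_local_port(port, host='127.0.0.1', timeout=30.0, interval=0.5):
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # a successful connect means something is listening on the port
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # nothing listening yet: wait a little and try again
            time.sleep(interval)
    return False
```

A caller would only start directing requests to the local daemon after this returns True, and fall back to the remote server (or report an error) if it returns False.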
The app download should persist in a local application cache, so it will not have to be downloaded each time. The initial download is somewhat beefy (14MB) primarily due to the inclusion of the Jython runtime. It is possible we may be able to trim this down by removing unused stuff. When started, the app will automatically check for updates (though there may be a delay before the updates take effect). When updating, only the components that changed need to be re-downloaded (so unless we upgrade Jython, the big part of the download is a one-time cost).
When running, the daemon creates an icon in the systray. This is also where you terminate it.
How do I get it?¶
Offline mode for CloudCare is currently hidden until we better decide how to integrate it, and give it some minimal testing. To access:
- Go to the main CloudCare page, but don’t open any forms
- Open the chrome dev console (F12 or ctrl+shift+J)
- Type enableOffline() in the console
- Note the new ‘Use Offline CloudCare’ checkbox on the left
Internationalization¶
This page contains the most common techniques needed for managing CommCare HQ localization strings. For more comprehensive information, consult the Django Docs translations page or this helpful blog post.
Tagging strings in views¶
TL;DR: ugettext should be used in code that will be run per-request. ugettext_lazy should be used in code that is run at module import.
The management command makemessages pulls out strings marked for translation so they can be translated via transifex. All three ugettext functions mark strings for translation. The actual translation is performed separately. This is where the ugettext functions differ.
- ugettext: The function immediately returns the translation for the currently selected language.
- ugettext_lazy: The function converts the string to a translation “promise” object. This is later coerced to a string when rendering a template or otherwise forcing the promise.
- ugettext_noop: This function only marks a string as a translation string; it does not have any other effect; that is, it always returns the string itself. This should be considered an advanced tool and generally avoided. It could be useful if you need access to both the translated and untranslated strings.
The most common case is just wrapping text with ugettext.
from django.contrib import messages
from django.utils.translation import ugettext as _


def my_view(request):
    messages.success(request, _("Welcome!"))
Typically when code is run as a result of a module being imported, there is not yet a user whose locale can be used for translations, so it must be delayed. This is where ugettext_lazy comes in. It will mark a string for translation, but delay the actual translation as long as possible.
from django.utils.translation import ugettext_lazy


class MyAccountSettingsView(BaseMyAccountView):
    urlname = 'my_account_settings'
    page_title = ugettext_lazy("My Information")
    template_name = 'settings/edit_my_account.html'
When variables are needed in the middle of translated strings, interpolation can be used as normal. However, named variables should be used to ensure that the translator has enough context.
message = _("User '{user}' has successfully been {action}.").format(
    user=user.raw_username,
    action=_("Un-Archived") if user.is_active else _("Archived"),
)
This ends up in the translations file as:
msgid "User '{user}' has successfully been {action}."
Using ugettext_lazy¶
The ugettext_lazy method will work in the majority of translation situations. It flags the string for translation but does not translate it until it is rendered for display. If the string needs to be immediately used or manipulated by other methods, this might not work.
When using the value immediately, there is no reason to do lazy translation.
return HttpResponse(ugettext("An error was encountered."))
It is easy to forget to translate form field names, as Django normally builds nice looking text for you. When writing forms, make sure to specify labels with a translation flagged value. These will need to be done with ugettext_lazy.
class BaseUserInfoForm(forms.Form):
    first_name = forms.CharField(label=ugettext_lazy('First Name'), max_length=50, required=False)
    last_name = forms.CharField(label=ugettext_lazy('Last Name'), max_length=50, required=False)
ugettext_lazy, a cautionary tale¶
ugettext_lazy does not return a string. This can cause complications.
When using methods to manipulate a string, lazy translated strings will not work properly. Translate eagerly with ugettext in that case:
group_name = ugettext("mobile workers")
return group_name.upper()
Converting ugettext_lazy objects to json will crash. You should use dimagi.utils.web.json_handler to properly coerce it to a string.
>>> import json
>>> from django.utils.translation import ugettext_lazy
>>> json.dumps({"message": ugettext_lazy("Hello!")})
TypeError: <django.utils.functional.__proxy__ object at 0x7fb50766f3d0> is not JSON serializable
>>> from dimagi.utils.web import json_handler
>>> json.dumps({"message": ugettext_lazy("Hello!")}, default=json_handler)
'{"message": "Hello!"}'
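To see why a default= handler fixes this, here is a self-contained sketch that needs neither Django nor HQ. LazyText and lazy_aware_default are hypothetical stand-ins invented for this illustration; they are not the implementation of ugettext_lazy or of dimagi.utils.web.json_handler, they only mimic the shape of the problem: an object that is not a string until something forces it.

```python
import json


class LazyText(object):
    """Toy stand-in for a lazy translation: resolves only when coerced to str."""

    def __init__(self, func, *args):
        self._func = func
        self._args = args

    def __str__(self):
        # the "translation" happens here, as late as possible
        return self._func(*self._args)


def lazy_aware_default(obj):
    """`default=` hook for json.dumps that forces lazy objects to plain strings."""
    if isinstance(obj, LazyText):
        return str(obj)
    raise TypeError('%r is not JSON serializable' % (obj,))


greeting = LazyText(str.upper, 'hello!')
payload = json.dumps({'message': greeting}, default=lazy_aware_default)
```

Without default=, json.dumps raises TypeError on the LazyText object, just as it does on a real lazy translation. With the handler, the object is forced to a string first and payload is valid JSON.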
Tagging strings in template files¶
There are two ways translations get tagged in templates.
For simple and short plain text strings, use the trans template tag.
{% trans "Welcome to CommCare HQ" %}
More complex strings (requiring interpolation, variable usage or those that span multiple lines) can make use of the blocktrans tag.
If you need to access a variable from the page context:
{% blocktrans %}This string will have {{ value }} inside.{% endblocktrans %}
If you need to make use of an expression in the translation:
{% blocktrans with amount=article.price %}
That will cost $ {{ amount }}.
{% endblocktrans %}
This same syntax can also be used with template filters:
{% blocktrans with myvar=value|filter %}
This will have {{ myvar }} inside.
{% endblocktrans %}
In general, you want to avoid including HTML in translations. This will make it easier for the translator to understand and manipulate the text. However, you can’t always break up the string in a way that gives the translator enough context to accurately do the translation. In that case, HTML inside the translation tags will still be accepted.
{% blocktrans %}
Manage Mobile Workers <small>for CommCare Mobile and
CommCare HQ Reports</small>
{% endblocktrans %}
Text passed as constant strings to template block tag also needs to be translated. This is most often the case in CommCare with forms.
{% bootstrap_fieldset form _("Specify New Password") %}
Keeping translations up to date¶
Once a string has been added to the code, we can update the .po file by running makemessages.
To do this for all languages:
$ django-admin.py makemessages --all
It will be quicker for testing during development to only build one language:
$ django-admin.py makemessages -l fra
After this command has run, your .po files will be up to date. To have content in this file show up on the website you still need to compile the strings.
$ django-admin.py compilemessages
You may notice at this point that not every tagged string with an associated translation in the .po file shows up translated. That could be because Django made a guess on the translated value and marked the string as fuzzy. Any string marked fuzzy will not be displayed, and the flag is an indication to the translator to double check the entry.
Example:
#: corehq/__init__.py:103
#, fuzzy
msgid "Export Data"
msgstr "Exporter des cas"
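When many strings are affected it can help to list the fuzzy entries in one go. The following is a rough sketch, not an HQ tool: find_fuzzy_msgids is a hypothetical helper that scans the text of a .po file for "#, fuzzy" flag comments, and it only handles single-line msgid entries like the example above.

```python
def find_fuzzy_msgids(po_text):
    """Return the msgid of every entry flagged fuzzy in the given .po text."""
    fuzzy_ids = []
    is_fuzzy = False
    for line in po_text.splitlines():
        line = line.strip()
        if line.startswith('#,'):
            # flag comments look like "#, fuzzy" or "#, fuzzy, python-format"
            flags = [flag.strip() for flag in line[2:].split(',')]
            is_fuzzy = 'fuzzy' in flags
        elif line.startswith('msgid '):
            if is_fuzzy:
                fuzzy_ids.append(line[len('msgid '):].strip('"'))
            # the flag applies to one entry only
            is_fuzzy = False
    return fuzzy_ids


sample = '''#: corehq/__init__.py:103
#, fuzzy
msgid "Export Data"
msgstr "Exporter des cas"
'''
fuzzy = find_fuzzy_msgids(sample)
```

For the entry shown above, fuzzy is ['Export Data']. Once a translator has confirmed or corrected the msgstr and the "#, fuzzy" line is removed, the entry no longer appears in the list.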
Profiling¶
Practical guide to profiling a slow view or function¶
This will walk through one way to profile slow code using the @profile decorator.
At a high level this is the process:
- Find the function that is slow
- Add a decorator to save a raw profile file that will collect information about function calls and timing
- Use libraries to analyze the raw profile file and spit out more useful information
- Inspect the output of that information and look for anomalies
- Make a change, observe the updated load times and repeat the process as necessary
Finding the slow function¶
This is usually pretty straightforward. The easiest thing to do is typically use the top-level entry point for a view call. In this example we are investigating the performance of commtrack location download, so the relevant function would be commtrack.views.location_export:
@login_and_domain_required
def location_export(request, domain):
    response = HttpResponse(mimetype=Format.from_format('xlsx').mimetype)
    response['Content-Disposition'] = 'attachment; filename="locations.xlsx"'
    dump_locations(response, domain)
    return response
Getting a profile dump¶
To get a profile dump, simply add the following decoration to the function:
from dimagi.utils.decorators.profile import profile


@login_and_domain_required
@profile('locations_download.prof')
def location_export(request, domain):
    response = HttpResponse(mimetype=Format.from_format('xlsx').mimetype)
    response['Content-Disposition'] = 'attachment; filename="locations.xlsx"'
    dump_locations(response, domain)
    return response
Now each time you load the page a raw dump file will be created with a timestamp of when it was run. These are created in /tmp/ by default, however you can change it by adding a value to your settings.py like so:
PROFILE_LOG_BASE = "/home/czue/profiling/"
Note that the files created are huge; this code should only be run locally.
Creating a more useful output from the dump file¶
The raw profile files are not human readable, and you need to use something like hotshot to make them useful. A script that will generate what is typically sufficient information to analyze these can be found in the commcarehq-scripts repository. You can read the source of that script to generate your own analysis, or just use it directly as follows:
$ ./reusable/convert_profile.py /path/to/profile_dump.prof
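If you would rather generate your own analysis, the standard library can do most of the work. The sketch below is not the convert_profile.py script, and it assumes the dump is in the format written by cProfile (a dump written by hotshot would need to be loaded differently). profile_call and summarize_profile are hypothetical helper names; the first writes a raw dump for a single call, the second turns a dump into a report ordered by cumulative time and call count, followed by the callers of each function.

```python
import cProfile
import io
import pstats


def profile_call(func, path, *args, **kwargs):
    """Run func(*args, **kwargs) under cProfile and write the raw dump to `path`."""
    profiler = cProfile.Profile()
    try:
        return profiler.runcall(func, *args, **kwargs)
    finally:
        profiler.dump_stats(path)


def summarize_profile(path, limit=200):
    """Return a text report for a raw profile dump (written for Python 3)."""
    out = io.StringIO()
    stats = pstats.Stats(path, stream=out)
    stats.sort_stats('cumulative', 'calls')
    stats.print_stats(limit)    # first section: calls ordered by cumulative time
    stats.print_callers(limit)  # second section: who called each function
    return out.getvalue()
```

A typical use is print(summarize_profile('/path/to/profile_dump.prof')). Lower the limit if you only care about the most expensive calls.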
Reading the output of the analysis file¶
The analysis file is broken into two sections. The first section is an ordered breakdown of calls by the cumulative time spent in those functions. It also shows the number of calls and average time per call.
The second section is harder to read, and shows the callers to each function.
This analysis will focus on the first section. The second section is useful when you determine a huge amount of time is being spent in a function but it’s not clear where that function is getting called.
Here is a sample start to that file:
loading profile stats for locations_download/commtrack-location-20140822T205905.prof
361742 function calls (355960 primitive calls) in 8.838 seconds
Ordered by: cumulative time, call count
List reduced from 840 to 200 due to restriction <200>
ncalls tottime percall cumtime percall filename:lineno(function)
1 0.000 0.000 8.838 8.838 /home/czue/src/commcare-hq/corehq/apps/locations/views.py:336(location_export)
1 0.011 0.011 8.838 8.838 /home/czue/src/commcare-hq/corehq/apps/locations/util.py:248(dump_locations)
194 0.001 0.000 8.128 0.042 /home/czue/src/commcare-hq/corehq/apps/locations/models.py:136(parent)
190 0.002 0.000 8.121 0.043 /home/czue/src/commcare-hq/corehq/apps/cachehq/mixins.py:35(get)
190 0.003 0.000 8.021 0.042 submodules/dimagi-utils-src/dimagi/utils/couch/cache/cache_core/api.py:65(cached_open_doc)
190 0.013 0.000 7.882 0.041 /home/czue/.virtualenvs/commcare-hq/local/lib/python2.7/site-packages/couchdbkit/client.py:362(open_doc)
396 0.003 0.000 7.762 0.020 /home/czue/.virtualenvs/commcare-hq/local/lib/python2.7/site-packages/http_parser/_socketio.py:56(readinto)
396 7.757 0.020 7.757 0.020 /home/czue/.virtualenvs/commcare-hq/local/lib/python2.7/site-packages/http_parser/_socketio.py:24(<lambda>)
196 0.001 0.000 7.414 0.038 /home/czue/.virtualenvs/commcare-hq/local/lib/python2.7/site-packages/couchdbkit/resource.py:40(json_body)
196 0.011 0.000 7.402 0.038 /home/czue/.virtualenvs/commcare-hq/local/lib/python2.7/site-packages/restkit/wrappers.py:270(body_string)
590 0.019 0.000 7.356 0.012 /home/czue/.virtualenvs/commcare-hq/local/lib/python2.7/site-packages/http_parser/reader.py:19(readinto)
198 0.002 0.000 0.618 0.003 /home/czue/.virtualenvs/commcare-hq/local/lib/python2.7/site-packages/couchdbkit/resource.py:69(request)
196 0.001 0.000 0.616 0.003 /home/czue/.virtualenvs/commcare-hq/local/lib/python2.7/site-packages/restkit/resource.py:105(get)
198 0.004 0.000 0.615 0.003 /home/czue/.virtualenvs/commcare-hq/local/lib/python2.7/site-packages/restkit/resource.py:164(request)
198 0.002 0.000 0.605 0.003 /home/czue/.virtualenvs/commcare-hq/local/lib/python2.7/site-packages/restkit/client.py:415(request)
198 0.003 0.000 0.596 0.003 /home/czue/.virtualenvs/commcare-hq/local/lib/python2.7/site-packages/restkit/client.py:293(perform)
198 0.005 0.000 0.537 0.003 /home/czue/.virtualenvs/commcare-hq/local/lib/python2.7/site-packages/restkit/client.py:456(get_response)
396 0.001 0.000 0.492 0.001 /home/czue/.virtualenvs/commcare-hq/local/lib/python2.7/site-packages/http_parser/http.py:135(headers)
790 0.002 0.000 0.452 0.001 /home/czue/.virtualenvs/commcare-hq/local/lib/python2.7/site-packages/http_parser/http.py:50(_check_headers_complete)
198 0.015 0.000 0.450 0.002 /home/czue/.virtualenvs/commcare-hq/local/lib/python2.7/site-packages/http_parser/http.py:191(__next__)
1159/1117 0.043 0.000 0.396 0.000 /home/czue/.virtualenvs/commcare-hq/local/lib/python2.7/site-packages/jsonobject/base.py:559(__init__)
13691 0.041 0.000 0.227 0.000 /home/czue/.virtualenvs/commcare-hq/local/lib/python2.7/site-packages/jsonobject/base.py:660(__setitem__)
103 0.005 0.000 0.219 0.002 /home/czue/src/commcare-hq/corehq/apps/locations/util.py:65(location_custom_properties)
103 0.000 0.000 0.201 0.002 /home/czue/src/commcare-hq/corehq/apps/locations/models.py:70(<genexpr>)
333/303 0.001 0.000 0.190 0.001 /home/czue/.virtualenvs/commcare-hq/local/lib/python2.7/site-packages/jsonobject/base.py:615(wrap)
289 0.002 0.000 0.185 0.001 /home/czue/src/commcare-hq/corehq/apps/locations/models.py:31(__init__)
6 0.000 0.000 0.176 0.029 /home/czue/.virtualenvs/commcare-hq/local/lib/python2.7/site-packages/couchdbkit/client.py:1024(_fetch_if_needed)
The most important thing to look at is the cumtime (cumulative time) column. In this example we can see that the vast majority of the time (over 8 of the 8.8 total seconds) is spent in the cached_open_doc function (and the library calls listed below it are likely called by that function). This would be the first place to start when looking at improving performance. The first few questions that would be useful to ask include:
- Can we optimize the function?
- Can we reduce calls to that function?
- In the case where that function is hitting a database or a disk, can the code be rewritten to load things in bulk?
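To make the last question concrete, here is a small sketch contrasting one-at-a-time lookups with a single bulk lookup. FakeDocStore and both helper functions are hypothetical stand-ins written for this example, not HQ or couchdbkit code; the point is only the difference in the number of round trips.

```python
class FakeDocStore:
    """Hypothetical stand-in for a document database; counts round trips."""

    def __init__(self, docs):
        self._docs = docs
        self.requests = 0

    def get(self, doc_id):
        self.requests += 1  # one round trip per document
        return self._docs[doc_id]

    def get_many(self, doc_ids):
        self.requests += 1  # one round trip for the whole batch
        return [self._docs[doc_id] for doc_id in doc_ids]


def parents_one_by_one(store, locations):
    # Similar in spirit to the profile above: one fetch per location.
    return [store.get(loc['parent_id']) for loc in locations]


def parents_in_bulk(store, locations):
    # Collect the ids first, then fetch them all at once.
    return store.get_many([loc['parent_id'] for loc in locations])


if __name__ == '__main__':
    docs = {'p%d' % i: {'name': 'Parent %d' % i} for i in range(190)}
    locations = [{'parent_id': 'p%d' % i} for i in range(190)]

    slow = FakeDocStore(docs)
    parents_one_by_one(slow, locations)
    fast = FakeDocStore(docs)
    parents_in_bulk(fast, locations)
    print(slow.requests, fast.requests)
```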
In this practical example, the function is clearly meant to already be caching (based on the name alone) so it’s possible that the results would be different if caching was enabled and the cache was hot. It would be good to make sure we test with those two parameters true as well. This can be done by changing your localsettings file and setting the following two variables:
COUCH_CACHE_DOCS = True
COUCH_CACHE_VIEWS = True
Reloading the page twice (the first time to prime the cache and the second time to profile with a hot cache) will then produce a vastly different output:
loading profile stats for locations_download/commtrack-location-20140822T211654.prof
303361 function calls (297602 primitive calls) in 0.484 seconds
Ordered by: cumulative time, call count
List reduced from 741 to 200 due to restriction <200>
ncalls tottime percall cumtime percall filename:lineno(function)
1 0.000 0.000 0.484 0.484 /home/czue/src/commcare-hq/corehq/apps/locations/views.py:336(location_export)
1 0.004 0.004 0.484 0.484 /home/czue/src/commcare-hq/corehq/apps/locations/util.py:248(dump_locations)
1159/1117 0.017 0.000 0.160 0.000 /home/czue/.virtualenvs/commcare-hq/local/lib/python2.7/site-packages/jsonobject/base.py:559(__init__)
4 0.000 0.000 0.128 0.032 /home/czue/src/commcare-hq/corehq/apps/locations/models.py:62(filter_by_type)
4 0.000 0.000 0.128 0.032 /home/czue/.virtualenvs/commcare-hq/local/lib/python2.7/site-packages/couchdbkit/client.py:986(all)
103 0.000 0.000 0.128 0.001 /home/czue/.virtualenvs/commcare-hq/local/lib/python2.7/site-packages/couchdbkit/client.py:946(iterator)
4 0.000 0.000 0.128 0.032 /home/czue/.virtualenvs/commcare-hq/local/lib/python2.7/site-packages/couchdbkit/client.py:1024(_fetch_if_needed)
4 0.000 0.000 0.128 0.032 /home/czue/.virtualenvs/commcare-hq/local/lib/python2.7/site-packages/couchdbkit/client.py:995(fetch)
9 0.000 0.000 0.124 0.014 /home/czue/.virtualenvs/commcare-hq/local/lib/python2.7/site-packages/http_parser/_socketio.py:56(readinto)
9 0.124 0.014 0.124 0.014 /home/czue/.virtualenvs/commcare-hq/local/lib/python2.7/site-packages/http_parser/_socketio.py:24(<lambda>)
4 0.000 0.000 0.114 0.029 /home/czue/.virtualenvs/commcare-hq/local/lib/python2.7/site-packages/couchdbkit/resource.py:40(json_body)
4 0.000 0.000 0.114 0.029 /home/czue/.virtualenvs/commcare-hq/local/lib/python2.7/site-packages/restkit/wrappers.py:270(body_string)
13 0.000 0.000 0.114 0.009 /home/czue/.virtualenvs/commcare-hq/local/lib/python2.7/site-packages/http_parser/reader.py:19(readinto)
103 0.000 0.000 0.112 0.001 /home/czue/src/commcare-hq/corehq/apps/locations/models.py:70(<genexpr>)
13691 0.018 0.000 0.094 0.000 /home/czue/.virtualenvs/commcare-hq/local/lib/python2.7/site-packages/jsonobject/base.py:660(__setitem__)
103 0.002 0.000 0.091 0.001 /home/czue/src/commcare-hq/corehq/apps/locations/util.py:65(location_custom_properties)
194 0.000 0.000 0.078 0.000 /home/czue/src/commcare-hq/corehq/apps/locations/models.py:136(parent)
190 0.000 0.000 0.076 0.000 /home/czue/src/commcare-hq/corehq/apps/cachehq/mixins.py:35(get)
103 0.000 0.000 0.075 0.001 submodules/dimagi-utils-src/dimagi/utils/couch/database.py:50(iter_docs)
4 0.000 0.000 0.075 0.019 submodules/dimagi-utils-src/dimagi/utils/couch/bulk.py:81(get_docs)
4 0.000 0.000 0.073 0.018 /home/czue/.virtualenvs/commcare-hq/local/lib/python2.7/site-packages/requests/api.py:80(post)
Yikes! It looks like this is already quite fast with a hot cache! And there don’t appear to be any obvious candidates for further optimization. If it is still a problem it may be an indication that we need to prime the cache better, or increase the amount of data we are testing with locally to see more interesting results.
Aggregating data from multiple runs¶
In some cases it is useful to run a function a number of times and aggregate the profile data. To do this follow the steps above to create a set of ‘.prof’ files (one for each run of the function) then use the ‘gather_profile_stats.py’ script included with django (lib/python2.7/site-packages/django/bin/profiling/gather_profile_stats.py) to aggregate the data.
This will produce a ‘.agg.prof’ file which can be analysed with the prof.py script.
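If the Django script is not available in your environment, the same kind of aggregation can be sketched with the standard library's pstats module. This is an illustration rather than what gather_profile_stats.py does internally; it assumes the '.prof' files are pstats-compatible, and aggregate_profiles and work are hypothetical names.

```python
import cProfile
import glob
import os
import pstats
import tempfile


def aggregate_profiles(paths):
    """Merge several pstats-compatible dumps into a single Stats object."""
    if not paths:
        raise ValueError('no profile files given')
    stats = pstats.Stats(paths[0])
    for path in paths[1:]:
        stats.add(path)  # accumulates call counts and timings
    return stats


def work():
    # Placeholder for the function being profiled.
    return sum(i * i for i in range(1000))


if __name__ == '__main__':
    tmp = tempfile.mkdtemp()
    for run in range(3):
        cProfile.runctx('work()', globals(), locals(),
                        os.path.join(tmp, 'run%d.prof' % run))
    combined = aggregate_profiles(sorted(glob.glob(os.path.join(tmp, '*.prof'))))
    combined.dump_stats(os.path.join(tmp, 'combined.agg.prof'))
    combined.sort_stats('cumulative').print_stats(5)
```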
Line profiling¶
In addition to the above methods of profiling, it is possible to do line profiling of code, which attaches profile data to individual lines of code as opposed to function names.
The easiest way to do this is to use the line_profile decorator.
Example output:
File: demo.py
Function: demo_follow at line 67
Total time: 1.00391 s
Line # Hits Time Per Hit % Time Line Contents
==============================================================
67 def demo_follow():
68 1 34 34.0 0.0 r = random.randint(5, 10)
69 11 81 7.4 0.0 for i in xrange(0, r):
70 10 1003800 100380.0 100.0 time.sleep(0.1)
File: demo.py
Function: demo_profiler at line 72
Total time: 1.80702 s
Line # Hits Time Per Hit % Time Line Contents
==============================================================
72 @line_profile(follow=[demo_follow])
73 def demo_profiler():
74 1 17 17.0 0.0 r = random.randint(5, 10)
75 9 66 7.3 0.0 for i in xrange(0, r):
76 8 802921 100365.1 44.4 time.sleep(0.1)
77
78 1 1004013 1004013.0 55.6 demo_follow()
More details here:
Additional references¶
Memory profiling¶
Refer to these resources which provide good information on memory profiling:
- Memory usage graphs with ps
- while true; do ps -C python -o etimes=,pid=,%mem=,vsz= >> mem.txt; sleep 1; done
ElasticSearch¶
Indexes¶
We have indexes for each of the following doc types:
- Applications - hqapps
- Cases - hqcases
- Domains - hqdomains
- Forms - xforms
- Groups - hqgroups
- Users - hqusers
- Report Cases - report_cases
- Report Forms - report_xforms
- SMS logs - smslogs
- TrialConnect SMS logs - tc_smslogs
The Report cases and forms indexes are only configured to run for a few domains, and they store additional mappings allowing you to query on form and case properties (not just metadata).
Each index has a corresponding mapping file in corehq/pillows/mappings/. Each mapping has a hash that reflects the current state of the mapping. This is appended to the index name so the index is called something like xforms_1cce1f049a1b4d864c9c25dc42648a45. Each type of index has an alias with the short name, so you should normally be querying just xforms, not the fully specified index+hash.
Whenever the mapping is changed, this hash should be updated. That will trigger the creation of a new index on deploy (by the $ ./manage.py ptop_preindex command). Once the new index is finished, the alias is flipped ($ ./manage.py ptop_es_manage --flip_all_aliases) to point to the new index, allowing for a relatively seamless transition.
Keeping indexes up-to-date¶
Pillowtop looks at the changes feed from couch and listens for any relevant new/changed docs. In order to have your changes appear in elasticsearch, pillowtop must be running:
$ ./manage.py run_ptop --all
You can also run a once-off reindex for a specific index:
$ ./manage.py ptop_fast_reindex_users
Changing a mapping or adding data¶
If you’re adding additional data to elasticsearch, you’ll need to modify that index’s mapping file in order to be able to query on that new data.
Adding data to an index¶
Each pillow has a change_transform method which you can override to perform additional transformations or lookups on the data. If, for example, you wanted to store username in addition to user_id on cases in elastic, you’d add username to corehq.pillows.mappings.case_mapping, then modify corehq.pillows.case.CasePillow.change_transform to do the appropriate lookup. It accepts a doc_dict for the case doc and is expected to return a doc_dict, so just add the username to that.
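As a rough sketch of that username example: the CasePillow class below is a stub standing in for corehq.pillows.case.CasePillow so the snippet runs on its own, and USERNAMES_BY_ID is a hypothetical lookup table where real code would query the user database.

```python
class CasePillow:
    """Stub standing in for corehq.pillows.case.CasePillow (not the real class)."""

    def change_transform(self, doc_dict):
        return doc_dict


# Hypothetical lookup table; real code would look the user up instead.
USERNAMES_BY_ID = {'abc123': 'midwife1'}


class UsernameCasePillow(CasePillow):

    def change_transform(self, doc_dict):
        doc_dict = super().change_transform(doc_dict)
        # Work on a copy so the caller's dict is left untouched.
        doc_dict = dict(doc_dict)
        # Unknown or missing user ids end up with username None.
        doc_dict['username'] = USERNAMES_BY_ID.get(doc_dict.get('user_id'))
        return doc_dict


if __name__ == '__main__':
    case_doc = {'_id': 'case1', 'user_id': 'abc123'}
    print(UsernameCasePillow().change_transform(case_doc))
```

Remember that the new username field also has to be present in the mapping, otherwise you will not be able to query on it.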
Building the new index¶
Once you’ve made the change, you’ll need to build a new index which uses that new mapping, so you’ll have to update the hash at the top of the file. This can just be a random alphanumeric string. This will trigger a preindex as outlined in the Indexes section.
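One quick way to produce such a string is shown below. It is only a convenience sketch; new_mapping_hash is a hypothetical helper, and any other random alphanumeric string would do just as well.

```python
import uuid


def new_mapping_hash():
    """Return a random 32-character lowercase hex string."""
    return uuid.uuid4().hex


if __name__ == '__main__':
    # Paste the printed value over the old hash at the top of the mapping file.
    print(new_mapping_hash())
```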
How to un-bork your broken indexes¶
Sometimes things get in a weird state and (locally!) it’s easiest to just blow away the index and start over.
- Delete the affected index. The easiest way to do this is with elasticsearch-head. You can delete multiple affected indices with curl -X DELETE http://localhost:9200/*. The * can be replaced with any wildcard pattern to delete only the matching indices, similar to shell globbing.
- Run $ ./manage.py ptop_preindex && ./manage.py ptop_es_manage --flip_all_aliases.
- Try again.
ESQuery¶
ESQuery is a library for building elasticsearch queries in a friendly, more readable manner.
Basic usage¶
There should be a file and subclass of ESQuery for each index we have.
Each method returns a new object, so you can chain calls together like SQLAlchemy. Here’s an example usage:
q = (FormsES()
     .domain(self.domain)
     .xmlns(self.xmlns)
     .submitted(gte=self.datespan.startdate_param,
                lt=self.datespan.enddate_param)
     .fields(['xmlns', 'domain', 'app_id'])
     .sort('received_on', desc=False)
     .size(self.pagination.count)
     .start(self.pagination.start)
     .terms_aggregation('babies.count', 'babies_saved'))
result = q.run()
total_docs = result.total
hits = result.hits
Generally useful filters and queries should be abstracted away for re-use, but you can always add your own like so:
q.filter({"some_arbitrary_filter": {...}})
q.set_query({"fancy_query": {...}})
For debugging or more helpful error messages, you can use query.dumps() and query.pprint(), both of which use json.dumps() and are suitable for pasting into ES Head, Marvel, or a similar tool.
Filtering¶
Filters are implemented as standalone functions, so they can be composed and nested: q.OR(web_users(), mobile_users()).
Filters can be passed to the query.filter method: q.filter(web_users())
There is some syntactic sugar that lets you skip this boilerplate and just call the filter as if it were a method on the query class: q.web_users()
In order to be available for this shorthand, filters are added to the builtin_filters property of the main query class. I know that’s a bit confusing, but it seemed like the best way to make filters available in both contexts.
Generic filters applicable to all indices are available in corehq.apps.es.filters. (But most/all can also be accessed as a query method, if appropriate)
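The two calling styles can be illustrated with a toy query class. This is a sketch of the idea only, not the ESQuery implementation: MiniQuery is hypothetical, and the doc_type values used by web_users and mobile_users are assumptions made for the example.

```python
def term(field, value):
    return {'term': {field: value}}


def OR(*filters):
    return {'or': list(filters)}


def web_users():
    # The doc_type value is an assumption for illustration.
    return term('doc_type', 'WebUser')


def mobile_users():
    # The doc_type value is an assumption for illustration.
    return term('doc_type', 'CommCareUser')


class MiniQuery:
    """Toy query builder; each call returns a new object."""

    # Standalone filter functions exposed as methods via __getattr__.
    builtin_filters = [term, OR, web_users, mobile_users]

    def __init__(self, filters=None):
        self._filters = list(filters or [])

    def filter(self, filter_):
        return MiniQuery(self._filters + [filter_])

    def __getattr__(self, name):
        # Only reached when normal attribute lookup fails.
        for fn in self.builtin_filters:
            if fn.__name__ == name:
                return lambda *args, **kwargs: self.filter(fn(*args, **kwargs))
        raise AttributeError(name)

    @property
    def filters(self):
        return list(self._filters)


if __name__ == '__main__':
    q = MiniQuery()
    print(q.filter(web_users()).filters)            # explicit form
    print(q.web_users().filters)                    # shorthand, same result
    print(q.OR(web_users(), mobile_users()).filters)
```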
Filtering Specific Indices¶
There is a file for each elasticsearch index (if not, feel free to add one). This file provides filters specific to that index, as well as an appropriately-directed ESQuery subclass with references to these filters.
These index-specific query classes also have default filters to exclude things like inactive users or deleted docs. These things should nearly always be excluded, but if necessary, you can remove these with remove_default_filters.
Language¶
- es_query - the entire query, filters, query, pagination, facets
- filters - a list of the individual filters
- query - the query, used for searching, not filtering
- field - a field on the document. User docs have a ‘domain’ field.
- lt/gt - less/greater than
- lte/gte - less/greater than or equal to
class corehq.apps.es.es_query.ESQuery(index=None)
This query builder only outputs the following query structure:
{
  "query": {
    "filtered": {
      "filter": {
        "and": [
          <filters>
        ]
      },
      "query": <query>
    }
  },
  <size, sort, other params>
}
- builtin_filters: A list of callables that return filters. These will all be available as instance methods, so you can do self.term(field, value) instead of self.filter(filters.term(field, value))
- fields(fields): Restrict the fields returned from elasticsearch. Deprecated. Use source instead.
- filter(filter): Add the passed-in filter to the query. All filtering goes through this class.
- filters: Return a list of the filters used in this query, suitable if you want to reproduce a query with additional filtering.
- scroll(): Run the query against the scroll api. Returns an iterator yielding each document that matches the query.
- search_string_query(search_string, default_fields=None): Accepts a user-defined search string
- set_query(query): Add a query. Most stuff we want is better done with filters, but if you actually want Levenshtein distance or prefix querying...
class corehq.apps.es.es_query.ESQuerySet(raw, query)
The object returned from ESQuery.run
- ESQuerySet.raw is the raw response from elasticsearch
- ESQuerySet.query is the ESQuery object
- doc_ids: Return just the docs ids from the response.
- hits: Return the docs from the response.
- total: Return the total number of docs matching the query.
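To make the output structure concrete, here is a small sketch that assembles the same shape from a list of filters. build_raw_query is a hypothetical helper, not part of ESQuery, and the match_all default for the query part is an assumption.

```python
import json


def build_raw_query(filters, query=None, **params):
    """Assemble the filtered-query structure described above."""
    raw = {
        'query': {
            'filtered': {
                'filter': {'and': list(filters)},
                # Assumption: match everything when no query is given.
                'query': query if query is not None else {'match_all': {}},
            }
        }
    }
    raw.update(params)  # size, sort and other top-level params
    return raw


if __name__ == '__main__':
    raw = build_raw_query(
        [{'term': {'domain': 'testproject'}}],
        size=10,
    )
    print(json.dumps(raw, indent=2))
```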
Available Filters¶
The following filters are available on any ESQuery instance - you can chain any of these on your query.
Note also that the term filter accepts either a list or a single element. Simple filters which match against a field are based on this filter, so those will also accept lists. That means you can do form_query.xmlns(XMLNS1) or form_query.xmlns([XMLNS1, XMLNS2, ...]).
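One way such a filter could handle both cases is sketched below. It is an illustration under assumptions rather than the code in corehq.apps.es.filters: it relies on elasticsearch's separate term and terms filters, and the field name used by xmlns is a guess.

```python
def term(field, value):
    """Match a single value, or any of several values if given a collection."""
    if isinstance(value, (list, tuple, set)):
        return {'terms': {field: list(value)}}
    return {'term': {field: value}}


def xmlns(value):
    # Simple field filter built on term, so it accepts lists too.
    # The field name 'xmlns' is an assumption for illustration.
    return term('xmlns', value)


if __name__ == '__main__':
    print(xmlns('http://example.com/form1'))
    print(xmlns(['http://example.com/form1', 'http://example.com/form2']))
```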
Contributing: Additions to this file should be added to the builtin_filters method on either ESQuery or HQESQuery, as appropriate (is it an HQ thing?).
- corehq.apps.es.filters.date_range(field, gt=None, gte=None, lt=None, lte=None): Range filter that accepts datetime objects as arguments
- corehq.apps.es.filters.empty(field): Only return docs with a missing or null value for field
- corehq.apps.es.filters.missing(field, exist=True, null=True): Only return docs missing a value for field
- corehq.apps.es.filters.nested(path, filter_): Query nested documents which normally can’t be queried directly
- corehq.apps.es.filters.non_null(field): Only return docs with a real, non-null value for field
Available Queries¶
Queries are used for actual searching - things like relevancy scores, Levenshtein distance, and partial matches.
View the elasticsearch documentation to see what other options are available, and put ‘em here if you end up using any of ‘em.
- corehq.apps.es.queries.search_string_query(search_string, default_fields=None): Allows users to use advanced query syntax, but if search_string does not use the ES query string syntax, default to doing an infix search for each term. (This may later change to some kind of fuzzy matching). This is also available via the main ESQuery class.
Aggregate Queries¶
Aggregations are a replacement for Facets
Here is an example used to calculate how many new pregnancy cases each user has opened in a certain date range.
res = (CaseES()
       .domain(self.domain)
       .case_type('pregnancy')
       .date_range('opened_on', gte=startdate, lte=enddate)
       .aggregation(TermsAggregation('by_user', 'opened_by'))
       .size(0)
       .run())  # run() returns the result object
buckets = res.aggregations.by_user.buckets
buckets.user1.doc_count
There’s a bit of magic happening here - you can access the raw json data from this aggregation via res.aggregation('by_user') if you’d prefer to skip it.
The res object has an aggregations property, which returns a namedtuple pointing to the wrapped aggregation results. The name provided at instantiation is used here (by_user in this example).
The wrapped aggregation_result object has a result property containing the aggregation data, as well as utilities for parsing that data into something more useful. For example, the TermsAggregation result also has a counts_by_bucket method that returns a {bucket: count} dictionary, which is normally what you want.
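To show what that kind of parsing amounts to, here is a sketch that turns the raw JSON of a terms aggregation into a {bucket: count} dictionary. The function below is a standalone illustration, not the method on the wrapped result; the sample data follows elasticsearch's terms aggregation response shape (a buckets list of key and doc_count pairs).

```python
def counts_by_bucket(raw_terms_result):
    """Convert a raw terms aggregation result into {bucket_key: doc_count}."""
    return {
        bucket['key']: bucket['doc_count']
        for bucket in raw_terms_result.get('buckets', [])
    }


if __name__ == '__main__':
    # Made-up sample in the shape elasticsearch returns for a terms aggregation.
    raw = {
        'buckets': [
            {'key': 'user1', 'doc_count': 12},
            {'key': 'user2', 'doc_count': 3},
        ]
    }
    print(counts_by_bucket(raw))
```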
As of this writing, there’s not much else developed, but it’s pretty easy to add support for other aggregation types and more results processing
class corehq.apps.es.aggregations.AggregationRange
Note that a range includes the “start” value and excludes the “end” value. i.e. start <= X < end
Parameters:
- start – range start
- end – range end
- key – optional key name for the range
class corehq.apps.es.aggregations.DateHistogram(name, datefield, interval, timezone=None)
Aggregate by date range. This can answer questions like “how many forms were created each day?”.
This class can be instantiated by the ESQuery.date_histogram method.
Parameters:
- name – what do you want to call this aggregation
- datefield – the document’s date field to look at
- interval – the date interval to use: “year”, “quarter”, “month”, “week”, “day”, “hour”, “minute”, “second”
- timezone – do bucketing using this time zone instead of UTC
class corehq.apps.es.aggregations.FilterAggregation(name, filter)
Bucket aggregation that creates a single bucket for the specified filter
Parameters:
- name – aggregation name
- filter – filter body
class corehq.apps.es.aggregations.FiltersAggregation(name, filters=None)
Bucket aggregation that creates a bucket for each filter specified using the filter name.
Parameters:
- name – aggregation name
class corehq.apps.es.aggregations.RangeAggregation(name, field, ranges=None, keyed=True)
Bucket aggregation that creates one bucket for each range
Parameters:
- name – the aggregation name
- field – the field to perform the range aggregations on
- ranges – list of AggregationRange objects
- keyed – set to True to have the results returned by key instead of as a list (see RangeResult.normalized_buckets)
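The start <= X < end rule for ranges is easy to get wrong at the boundaries, so here is a plain-Python sketch of the bucketing semantics. Range and bucket_counts are hypothetical names for this illustration and do not use the aggregation classes above; a start or end of None is treated as open-ended, which is an assumption.

```python
from collections import namedtuple

# Hypothetical stand-in for an aggregation range: start inclusive, end exclusive.
Range = namedtuple('Range', ['start', 'end', 'key'])


def bucket_counts(values, ranges):
    """Count how many values fall into each range (start <= X < end)."""
    counts = {r.key: 0 for r in ranges}
    for value in values:
        for r in ranges:
            above = r.start is None or r.start <= value
            below = r.end is None or value < r.end
            if above and below:
                counts[r.key] += 1
    return counts


if __name__ == '__main__':
    ranges = [
        Range(0, 5, 'low'),
        Range(5, 10, 'mid'),
        Range(10, None, 'high'),
    ]
    # 5 lands in 'mid' and 10 lands in 'high' because ends are exclusive.
    print(bucket_counts([0, 4, 5, 9, 10, 25], ranges))
```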
AppES¶
UserES¶
Here’s an example adapted from the case list report - it gets a list of the ids of all unknown users, web users, and demo users on a domain.
from corehq.apps.es import users as user_es

user_filters = [
    user_es.unknown_users(),
    user_es.web_users(),
    user_es.demo_users(),
]
query = (user_es.UserES()
         .domain(self.domain)
         .OR(*user_filters)
         .show_inactive()
         .fields([]))
owner_ids = query.run().doc_ids
class corehq.apps.es.users.UserES(index=None)
- builtin_filters
- default_filters = {'active': {'term': {'is_active': True}}, 'not_deleted': {'term': {'base_doc': 'couchuser'}}}
- index = 'users'
corehq.apps.es.users.admin_users()
Return only AdminUsers. Admin users are mock users created from xform submissions with unknown user ids whose username is “admin”.
CaseES¶
Here’s an example getting pregnancy cases that are either still open or were closed after May 1st.
import datetime

from corehq.apps.es import cases as case_es

q = (case_es.CaseES()
     .domain('testproject')
     .case_type('pregnancy')
     .OR(case_es.is_closed(False),
         case_es.closed_range(gte=datetime.date(2015, 5, 1))))
FormES¶
DomainES¶
Here’s an example generating a histogram of domain creations (that’s a type of faceted query), filtered by a provided list of domains and a report date range.
from corehq.apps.es import DomainES

domains_after_date = (DomainES()
                      .in_domains(domains)
                      .created(gte=datespan.startdate, lte=datespan.enddate)
                      .date_histogram('date', 'date_created', interval)
                      .size(0))
histo_data = domains_after_date.run().aggregations.date.buckets_list
DHIS2 Integration¶
Requirements¶
Users can use CommCareHQ to manually associate information on the DHIS2 instance (ex. reporting organizations, user identifiers) with mobile workers created in CommCareHQ.
- Custom user data in CommCareHQ will be used to specify a custom field on each mobile worker called dhis2_organization_unit_id. This will contain the ID of the DHIS2 organization unit that the mobile worker (midwife) is associated with and submits data for. The organization unit ID will be included as a case property on each newly created case.
CommCareHQ will be set up to register child entities in DHIS2 and enroll them in the Pediatric Nutrition Assessment and Underlying Risk Assessment programs when new children are registered through CommCareHQ. This will be done through the DHIS2 API and may occur as soon as data is received by CommCareHQ from the mobile, or be configured to run on a regular basis.
- When a new child_gmp case is registered on CommCare HQ, we will use the DHIS2 trackedEntity API to generate a new Child entity. We will also register that new entity in a new Pediatric Nutrition Assessment program. The new Child entity will be updated with the attribute cchq_case_id that contains the case ID of the CommCareHQ case.
- The Pediatric Nutrition Assessment program will be updated with First Name, Last Name, Date of Birth, Gender, Name of Mother/Guardian, Mobile Number of the Mother and Address attributes.
- For children of the appropriate risk level (conditions to be decided) we will also enroll that Child entity in the Underlying Risk Assessment program.
- The entity will be registered for the organization unit specified by the dhis2_organization_unit_id case property.
- If a CommCareHQ case does not have a dhis2_organization_unit_id
- The corresponding CommCareHQ case will be updated with the IDs of the registered entity and each program that entity was registered in. This will be used for later data submissions to DHIS2.
CommCareHQ will be configured to use the DHIS2 API and download a list of registered children on a regular basis. This will be used to create new child cases for nutrition tracking in CommCare or to associate already registered child cases on the mobile with the DHIS2 child entities and the Pediatric Nutrition Assessment and Underlying Risk Assessment programs.
- Custom code in HQ will run on a periodic basis (nightly) to poll the DHIS2 trackedEntity API and get a list of all registered Child entities
- For all child entities without a provided cchq_case_id attribute, a new child_gmp case will be registered on CommCareHQ. It will be assigned to a mobile worker with the appropriate dhis2_organization_unit_id corresponding to the organization of the tracked entity in DHIS2.
- The registered child_gmp will be updated with additional case properties to indicate its corresponding DHIS2 entities and programs.
- Once the case has been registered in CommCareHQ, the DHIS2 tracked entity will be updated with the corresponding cchq_case_id.
CommCareHQ will use the DHIS2 API to send received nutrition data to DHIS2 as an event that is associated with the correct entity, program, DHIS2 user and organization.
- On a periodic basis, CommCareHQ will submit an appropriate event using the DHIS2 events API (Multiple Event with Registration) for any unprocessed Growth Monitoring forms.
- The event will only be submitted if the corresponding case has a DHIS2 mapping (case properties that indicate the DHIS2 tracked entity instance and programs for the case). If the case is not yet mapped to DHIS2, the Growth Monitoring form will not be processed and could be processed in the future (if the case is later associated with a DHIS2 entity).
- The event will contain the program ID associated with the case, the tracked entity ID and the program stage (Nutrition Assessment). It will also contain the recorded height and weight as well as mobile-calculated BMI and Age at time of visit.
Implementation¶
DHIS2 integration code is found in the corehq/apps/dhis2/ directory.
Analyzing Test Coverage¶
This page goes over some basic ways to analyze code coverage locally.
Using coverage.py¶
First thing is to install the coverage.py library:
$ pip install coverage
Now you can run your tests through the coverage.py program:
$ coverage run manage.py test commtrack
This will create a binary commcare-hq/.coverage file (that is already ignored by our .gitignore) which contains all the magic bits about what happened during the test run.
You can be as specific or generic as you’d like with what selection of tests you run through this. This tool will track which lines of code in the app have been hit during execution of the tests you run. If you’re only looking to analyze (and hopefully increase) coverage in a specific model or utils file, it might be helpful to cut down on how many tests you’re running.
Make an HTML view of the data¶
The simplest (and probably fastest) way to view this data is to build an HTML view of the code base with the coverage data:
$ coverage html
This will build a commcare-hq/coverage-report/ directory with a ton of HTML files in it. The important one is commcare-hq/coverage-report/index.html.
View the result in Vim¶
Install coveragepy.vim (https://github.com/alfredodeza/coveragepy.vim) however you personally like to install plugins. This plugin is old and out of date (but seems to be the only reasonable option), so I personally think the HTML version is better.
Then run :Coveragepy report in Vim to build the report (this is kind of slow).
You can then use :Coveragepy hide and :Coveragepy show to add/remove the view from your current buffer.
Advanced App Features¶
See corehq.apps.app_manager.suite_xml.SuiteGenerator and corehq.apps.app_manager.xform.XForm for code.
Child Modules¶
In principle, child modules are very simple. Making one module a child of another simply changes the menu elements in the suite.xml file. For example, in the XML below module m1 is a child of module m0 and so it has its root attribute set to the ID of its parent.
<menu id="m0">
    <text>
        <locale id="modules.m0"/>
    </text>
    <command id="m0-f0"/>
</menu>
<menu id="m1" root="m0">
    <text>
        <locale id="modules.m1"/>
    </text>
    <command id="m1-f0"/>
</menu>
Session Variables¶
This is all good and well until we take into account the way the Session works on the mobile which “prioritizes the most relevant piece of information to be determined by the user at any given time”.
This means that if all the forms in a module require the same case (actually just the same session IDs) then the user will be asked to select the case before selecting the form. This is why when you build a module where all forms require a case the case selection happens before the form selection.
From here on we will assume that all forms in a module have the same case management and hence require the same session variables.
When we add a child module into the mix we need to make sure that the session variables for the child module forms match those of the parent in two ways: matching session variable names and adding in any missing variables.
Matching session variable names¶
For example, consider the session variables for these two modules:
module A:
    case_id: load mother case
module B child of module A:
    case_id_mother: load mother case
    case_id_child: load child case
You can see that they are both loading a mother case but are using different session variable names.
To fix this we need to adjust the variable name in the child module forms, otherwise the user will be asked to select the mother case again:
case_id_mother -> case_id
module B final:
    case_id: load mother case
    case_id_child: load child case
Inserting missing variables¶
In this case imagine our two modules look like this:
module A:
    case_id: load patient case
    case_id_new_visit: id for new visit case ( uuid() )
module B child of module A:
    case_id: load patient case
    case_id_child: load child case
Here we can see that both modules load the patient case and that the session IDs match so we don’t have to change anything there.
The problem here is that forms in the parent module also add a case_id_new_visit variable to the session which the child module forms do not. So we need to add it in:
module B final:
    case_id: load patient case
    case_id_new_visit: id for new visit case ( uuid() )
    case_id_child: load child case
Note that we can only do this for session variables that are automatically computed and hence do not require user input.
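The two adjustments described in this section can be sketched as a single function. This is an illustration of the logic only, not the code in SuiteGenerator; Datum, its fields, and align_child_with_parent are hypothetical names, and session variables are matched here by comparing a simple "source" label.

```python
from collections import namedtuple

# Hypothetical model of a session variable:
#   name     - session variable name, e.g. 'case_id'
#   source   - what it loads, e.g. 'mother case'
#   computed - True if no user input is needed, e.g. uuid()
Datum = namedtuple('Datum', ['name', 'source', 'computed'])


def align_child_with_parent(parent, child):
    """Return the child's session variables adjusted to match the parent's."""
    parent_by_source = {d.source: d for d in parent}

    # Step 1: rename child variables that load the same thing as the parent.
    renamed = [
        d._replace(name=parent_by_source[d.source].name)
        if d.source in parent_by_source else d
        for d in child
    ]
    renamed_by_name = {d.name: d for d in renamed}

    # Step 2: walk the parent's variables in order, inserting any that the
    # child lacks, but only if they are computed automatically.
    result = []
    for d in parent:
        if d.name in renamed_by_name:
            result.append(renamed_by_name[d.name])
        elif d.computed:
            result.append(d)

    # Finally, keep the variables that only the child has.
    parent_names = {d.name for d in parent}
    result.extend(d for d in renamed if d.name not in parent_names)
    return result


if __name__ == '__main__':
    module_a = [
        Datum('case_id', 'patient case', False),
        Datum('case_id_new_visit', 'uuid()', True),
    ]
    module_b = [
        Datum('case_id', 'patient case', False),
        Datum('case_id_child', 'child case', False),
    ]
    for datum in align_child_with_parent(module_a, module_b):
        print(datum.name, '-', datum.source)
```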
Shadow Modules¶
A shadow module is a module that piggybacks on another module’s commands (the “source” module). The shadow module has its own name, case list configuration, and case detail configuration, but it uses the same forms as its source module.
This is primarily for clinical workflows, where the case detail is a list of patients and the clinic wishes to be able to view differently-filtered queues of patients that ultimately use the same set of forms.
Shadow modules are behind the feature flag Shadow Modules.
Scope¶
The shadow module has its own independent:
- Name
- Menu mode (display module & forms, or forms only)
- Media (icon, audio)
- Case list configuration (including sorting and filtering)
- Case detail configuration
The shadow module inherits from its source:
- case type
- commands (which forms the module leads to)
- end of form behavior
Limitations¶
A shadow module can neither be a parent module nor have a parent module.
A shadow module’s source can be a parent module (the shadow will include a copy of the children), or have a parent module (the shadow will appear as a child of that same parent).
Shadow modules are designed to be used with case modules. They may behave unpredictably if given an advanced module, reporting module, or careplan module as a source.
Shadow modules do not necessarily behave well when the source module uses custom case tiles. If you experience problems, make sure the shadow module’s case tile configuration exactly matches the source module’s.
Entries¶
A shadow module duplicates all of its source module’s entries. In the example below, m1 is a shadow of m0, which has one form. This results in two unique entries, one for each module, which share several properties: the form and the locale id are identical, and only the command id differs.
<entry>
<form>
http://openrosa.org/formdesigner/86A707AF-3A76-4B36-95AD-FF1EBFDD58D8
</form>
<command id="m0-f0">
<text>
<locale id="forms.m0f0"/>
</text>
</command>
</entry>
<entry>
<form>
http://openrosa.org/formdesigner/86A707AF-3A76-4B36-95AD-FF1EBFDD58D8
</form>
<command id="m1-f0">
<text>
<locale id="forms.m0f0"/>
</text>
</command>
</entry>
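The duplication can be sketched with Python's standard xml.etree.ElementTree module. This is a simplified illustration rather than HQ's suite generator: the build_entry helper is a hypothetical name, and it assumes each entry only contains the form and command elements shown above.

```python
import xml.etree.ElementTree as ET

XMLNS = "http://openrosa.org/formdesigner/86A707AF-3A76-4B36-95AD-FF1EBFDD58D8"


def build_entry(module_id, form_index, xmlns, locale_id):
    """Build one <entry> element (hypothetical helper for illustration)."""
    entry = ET.Element("entry")
    ET.SubElement(entry, "form").text = xmlns
    # The command id is the only part that is unique to each module.
    command = ET.SubElement(entry, "command", {"id": f"{module_id}-f{form_index}"})
    text = ET.SubElement(command, "text")
    ET.SubElement(text, "locale", {"id": locale_id})
    return entry


# m0 is the source module and m1 its shadow: same form, same locale id.
entries = [build_entry(module_id, 0, XMLNS, "forms.m0f0") for module_id in ("m0", "m1")]
for entry in entries:
    print(ET.tostring(entry, encoding="unicode"))
```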
Menu structure¶
In the simplest case, shadow module menus look exactly like other module menus. In the example below, m1 is a shadow of m0. The two modules have their own, unique menu elements.
<menu id="m0">
<text>
<locale id="modules.m0"/>
</text>
<command id="m0-f0"/>
</menu>
<menu id="m1">
<text>
<locale id="modules.m1"/>
</text>
<command id="m1-f0"/>
</menu>
Menus get more complex when shadow modules are mixed with parent/child modules. In the following example, m0 is a basic module, m1 is a child of m0, and m2 is a shadow of m0. All three modules have put_in_root=false (see Child Modules > Menu structure above). The shadow module has its own menu and also a copy of the child module’s menu. This copy of the child module’s menu is given the id m1.m2 to distinguish it from m1, the original child module menu.
<menu id="m0">
<text>
<locale id="modules.m0"/>
</text>
<command id="m0-f0"/>
</menu>
<menu root="m0" id="m1">
<text>
<locale id="modules.m1"/>
</text>
<command id="m1-f0"/>
</menu>
<menu root="m2" id="m1.m2">
<text>
<locale id="modules.m1"/>
</text>
<command id="m1-f0"/>
</menu>
<menu id="m2">
<text>
<locale id="modules.m2"/>
</text>
<command id="m2-f0"/>
</menu>
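The naming rule for the copied child menu can be sketched as follows. This is an assumption-laden illustration, not HQ's suite generation code: the build_menus helper and the dictionary keys are hypothetical, every module is assumed to have exactly one form, and put_in_root is assumed to be false throughout.

```python
def build_menus(modules):
    """Return menu descriptions for a list of module dicts.

    Each module dict has an "id" and optionally a "parent" (for child
    modules) or a "source" (for shadow modules).
    """
    menus = []
    for module in modules:
        source_id = module.get("source")
        if source_id is not None:
            # A shadow gets a copy of each child menu of its source. The copy
            # is rooted at the shadow and given the id <child>.<shadow>.
            for child in modules:
                if child.get("parent") == source_id:
                    menus.append({
                        "id": f"{child['id']}.{module['id']}",
                        "root": module["id"],
                        "locale": f"modules.{child['id']}",
                        "command": f"{child['id']}-f0",
                    })
        # Every module, shadow or not, also has its own menu.
        menus.append({
            "id": module["id"],
            "root": module.get("parent"),
            "locale": f"modules.{module['id']}",
            "command": f"{module['id']}-f0",
        })
    return menus


menus = build_menus([
    {"id": "m0"},
    {"id": "m1", "parent": "m0"},
    {"id": "m2", "source": "m0"},
])
```

For the three modules above, this produces the four menus from the example: m0, m1 (rooted at m0), m1.m2 (rooted at m2) and m2.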
Tips for documenting¶
Documenting¶
Documentation is awesome. You should write it. Here’s how.
All the CommCareHQ docs are stored in a docs/ folder in the root of the repo.
To add a new doc, make an appropriately-named rst file in the docs/ directory.
For the doc to appear in the table of contents, add it to the toctree list in index.rst.
Sooner or later we’ll probably want to organize the docs into sub-directories, that’s fine, you can link to specific locations like so: `Installation <intro/install>`.
For a more complete working set of documentation, check out Django’s docs directory. This is used to build docs.djangoproject.com.
Index¶
- Sphinx is used to build the documentation.
- Writing Documentation - Some general tips for writing documentation
- reStructuredText is used for markup.
- Editors with RestructuredText support
Sphinx¶
Sphinx builds the documentation and extends the functionality of rst a bit for stuff like pointing to other files and modules.
To build a local copy of the docs (useful for testing changes), navigate to the docs/ directory and run make html.
Open <path_to_commcare-hq>/docs/_build/html/index.html in your browser and you should have access to the docs for your current version (I bookmarked it on my machine).
Writing Documentation¶
For some great references, check out Jacob Kaplan-Moss’s series Writing Great Documentation and this blog post by Steve Losh. Here are some takeaways:
Use short sentences and paragraphs
Break your documentation into sections to avoid text walls
Avoid making assumptions about your reader’s background knowledge
Consider three types of documentation:
- Tutorials - quick introduction to the basics
- Topical Guides - comprehensive overview of the project; everything but the dirty details
- Reference Material - complete reference for the API
One aspect that Kaplan-Moss doesn’t mention explicitly (other than advising us to “Omit fluff” in his Technical style piece) but is clear from both his documentation series and the Django documentation, is what not to write. It’s an important aspect of the readability of any written work, but has other implications when it comes to technical writing.
Antoine de Saint Exupéry wrote, “... perfection is attained not when there is nothing more to add, but when there is nothing more to remove.”
Keep things short and take stuff out where possible. This helps get your point across and, maybe more importantly for documentation, means there is less to change when the codebase changes.
Think of it as an extension of the DRY principle.
reStructuredText¶
reStructuredText is a markup language that is commonly used for Python documentation. You can view the source of this document or any other to get an idea of how to do stuff (this document has hidden comments). Here are some useful links for more detail:
Editors¶
While you can use any text editor for editing RestructuredText documents, I find these particularly useful:
- PyCharm (or other JetBrains IDE, like IntelliJ), which has great syntax highlighting and linting.
- Sublime Text, which has a useful plugin for hard-wrapping lines called Sublime Wrap Plus. Hard-wrapped lines make documentation easy to read in a console, or editor that doesn’t soft-wrap lines (i.e. most code editors).
- Vim has a command gq to reflow a block of text (:help gq). It uses the value of textwidth to wrap (:setl tw=75). Also check out :help autoformat. Syntastic has a rst linter. To make a line a header, just yypVr= (or whatever symbol you want).
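If your editor has neither feature, the same two operations (hard-wrapping a paragraph and underlining a title to make a header) can be scripted with Python's standard textwrap module. The hard_wrap and rst_header helpers below are hypothetical names for this sketch, and the width of 75 simply mirrors the textwidth value mentioned above.

```python
import textwrap


def hard_wrap(paragraph, width=75):
    """Reflow a paragraph so that no line is longer than `width` characters."""
    return textwrap.fill(paragraph, width=width)


def rst_header(title, symbol="="):
    """Return the title followed by an underline of the same length."""
    return f"{title}\n{symbol * len(title)}"


print(rst_header("Sphinx"))
print(hard_wrap(
    "Sphinx builds the documentation and extends the functionality of rst "
    "a bit for stuff like pointing to other files and modules."
))
```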
Examples¶
Some basic examples adapted from 2 Scoops of Django:
Section Header¶
Sections are explained well here
emphasis (bold/strong)
italics
Simple link: http://commcarehq.org
Inline link: CommCareHQ
Fancier Link: CommCareHQ
- An enumerated list item
- Second item
- First bullet
- Second bullet
- Indented Bullet
- Note carriage return and indents
Literal code block:
def like():
    print("I like Ice Cream")

for i in range(10):
    like()
Python colored code block (requires pygments):
# You need to "pip install pygments" to make this work.
for i in range(10):
    like()
JavaScript colored code block:
console.log("Don't use alert()");