You are viewing an old version of this page. View the current version.

Compare with Current View Page History

« Previous Version 57 Next »

This feature requires a CommCare Software Plan

This feature (API Access) is only available to CommCare users with a Standard Plan or higher. For more details, see the CommCare Software Plan page.

Summary

commcare-export is a tool designed to let you export your data from CommCare HQ in a simple, configurable, and reusable way. commcare-export will automatically pull your form and case data from CommCare HQ and save it locally, typically to either an excel spreadsheet or to a local database. You can configure it to pull data on a schedule to automatically keep your data up to date, or run it as a one-time data dump. Here are the basic steps:

  1. Configure your export files (excel or JSON)
  2. Run export
  3. (Optionally) configure export to run automatically
  4. Use data

The rest of this page describes how to use commcare-export in more detail.

Who should use this tool

  • You are looking to export data from CommCareHQ into a SQL database or Excel Workbook.
  • You are moderately technical, comfortable running command-line programs on your platform of choice. (If you are a programmer you will be better served by going directly to the README on Github)

Prerequisites and Conventions

 

If you are looking to export into a SQL database, you must have this database set up, and you should know how to provide a URL for it.

 

  • For Windows instructions, if something is typed in monospace font starting with a > sign, this is something for you to type into the command prompt, accessible via Start > Run > cmd. For example, the following indicates that you should type "this is a command" into the Command Prompt.
        > this is a command
  • For non-Windows instructions, if something is typed in monospace font starting with a $ sign, this is something for you to type into a command prompt. For example, the following indicates that you should type "this is a command" into a terminal.
        $ this is a command

A. Set Up CommCare-Export

This section guides you through getting your system ready to use the commcare-export command-line tool.

1. Install Python

On Windows:

Visit the Python for Windows page and download a package appropriate for your system. The following instructions assume that you have select Python 2.7, but this only affects the location of Python on your hard drive. This tool works with all recent version of Pythons. If you use a different version (i.e. Python 3.2) then you will need to replace each instance of "python27" with your version number (i.e. "python32")

On Mac OSX:

Some versions of Mac OSX come with outdated versions of Python. To be safe, you should upgrade; this is very easy. The best way to install software on a Mac is via Homebrew. To install Homebrew and then a modern version of Python, open a terminal and run the following two commands. The first installs Homebrew, interactively telling you each thing it is going to do before it does it. The second installs Python and any prerequisites. (BTW, Mac does not have apt-get)

        $ ruby -e "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/master/install)"
        $ brew install python

After that, usage of Homebrew is brew install <SOME PACKAGE>.

 

On Linux:

If you are on a Linux-based operating system, you probably have an automatic package manager available. Lucky you! You probably already have Python installed or know how to install it, but if not, then you will need to invoke the package manager for your operating system. Here are some possible commands.

        $ apt-get install python
        $ yum install python
        $ emerge python

 

2. Install pip (a program that automatically installs Python programs)

On Windows:

Install pip for windows (https://sites.google.com/site/pydatalog/python/pip-for-windows)

On Linux:

$ sudo easy_install pip

On Mac OSX python setup tools and pip have been pre-installed by Homebrew.

3. Install commcare-export

On Windows:

    (with pip)           > c:\python27\scripts\pip install commcare-export
Additional Notes

If you have installed an earlier version of the commcare-export tool and need to upgrade, try using:

    > c:\python27\scripts\easy_install --upgrade commcare-export

NOTE: if you "tab to complete" in the command line you will see something more like this:

    > c:\python27\scripts\easyinstall.exe --upgrade commcare-export

On Mac OSX and Linux:

    $ sudo pip install commcare-export
Additional Notes

If you have installed an earlier version of the commcare-export tool and need to upgrade, try using:

    $ sudo pip install -U commcare-export

If you get an error "ValueError: Invalid column index 0" you should run this command to downgrade on of the python libraries:

$ pip install -U openpyxl==1.8.6

 

In case you have any problem setting up python on Mac OSX, reference here: http://docs.python-guide.org/en/latest/starting/install/osx/

 

B. Getting started with commcare-export

This section describes how to run the examples that come with commcare-export against a sample project from the CommCareHQ exchange. After working through it, you should have some familiarity with the, its configuration, and how it is used.

1. Set up some demonstration data in CommCareHQ

  1. Visit the CommCareHQ Exchange and add the Simple CommCare Demo/Tutorial app to a new project space.
  2. Select the 'Deploy' tab, select 'Make a New Version', star the build.
  3. Visit CloudCare and fill out a bunch of forms.

2. Run the examples and see the output in the Terminal

Download the examples demo-registrations.xlsx, demo-deliveries.xlsx and demo-pregnancy-cases.xlsx attached to this page. In the following, you will need to adjust the path to the file to be wherever you downloaded them and fill in the name of your project - the bits in bold. The following commands will run the examples in such a way that the "tables" will be simply printed.

Many different fields are available for output in this way in addition to what is shown in these examples. The fields can be determined form the JSON output as they follow the JSON structure down to each property. For instance, based on the form output here: https://www.commcarehq.org/a/YOUR_PROJECT/api/v0.5/form/ - you would be able to use form.case.@caseid or form.meta.username.

YOUR_PROJECT is the name that you gave to the project space for your application. The name of your project is listed on the first page, Dashboard, when you sign into CommCareHQ. You can also see your project space name in any url while working in your : www.commcarehq.org/a/yourprojectname

On Windows:

    > C:\python27\scripts\commcare-export --output-format markdown  --project YOUR_PROJECT --query demo-registrations.xlsx

    > C:\python27\scripts\commcare-export --output-format markdown  --project YOUR_PROJECT --query demo-deliveries.xlsx

    > C:\python27\scripts\commcare-export --output-format markdown  --project YOUR_PROJECT --query demo-pregnancy-cases.xlsx


On Mac OSX / Linux:

    $ commcare-export --output-format markdown --project YOUR_PROJECT --query demo-registrations.xlsx

    $ commcare-export --output-format markdown --project YOUR_PROJECT --query demo-deliveries.xlsx

    $ commcare-export --output-format markdown --project YOUR_PROJECT --query demo-pregnancy-cases.xlsx

Tips:
  • 'YOUR_PROJECT' refers to the name of your project space, not the entire domain URL.  For instance, the URL would be https://www.commcarehq.org/a/YOUR_PROJECT. For the tool, simply type in YOUR_PROJECT. 
  • If you are using a domain that isn't stored on www.commcarehq.org, then you will need to use the "--commcare-hq YOUR_URL" flag in the cmd prompt.
  • In this example, the output will just be printed in the Command tool. If you want to save the data to a file or database, continue to the next section.

3. Run the examples and save the output to an Excel Workbook

Nearly the same command will save the output to an Excel Workbook.

On Windows:

    > C:\python27\scripts\commcare-export --output-format xlsx --output registrations-export-results.xlsx --project YOUR_PROJECT --query demo-registrations.xlsx

    (try modifying the command above to save the results of each of the examples to a file)


On Mac OSX / Linux:

    (try coming up with this command yourself based on reading the Windows example above!)

 

Tips:
  • If you are using windows, the command line for any path reference to folder names with spaces in the titles need to be included in quotes.  For exampe, the folder 'My Data' would need to be referenced by "C:\Users\YOU\My Data" (quotes included in the command). If you start typing the filepath and when you are almost done, hit the "tab" key, the file name will automatically populate with quotations.
  • If you are not going to save data to a SQL data base, you can skip ahead to Excel Configuration Details so learn how to set up the export for your project space.

4. Save the output of an example query to a SQL database

By choosing "--output-format sql" and "--output URL" you can save the output to a SQL database. First, let us try it with SQLite, a simple database that is almost certainly already installed and ready to be used on any modern computer, so you do not have to have anything set up to run these examples. In the following, when you see a lite that starts with "sqlite>" it is mimicking what it is like to interact with the database, so you should type in the bits that come after that.

In these commands, the term sqlite:///registrations.db is a standard URL format for databases. If, for example, you have a PostgreSQL database hosted on db.example.com named registrations that you would like to save into instead, the URL might look like postgresql://db.example.com/registrations (you may need to include username and password in the URL, but fully explaining the format is beyond the scope of this tutorial).

On Windows:

    > C:\python27\scripts\commcare-export --output-format sql --output sqlite:///registrations.db --project YOUR_PROJECT --query demo-registrations.xlsx

        (you will not see any output)

    > sqlite3 registrations.db

    sqlite> .schema

    (you will see the definition of the table that has been created)

    sqlite> SELECT * FROM "Registrations"

    (you will see the rows of the table)


On Mac OSX / Linux:

    $ commcare-export --output-format sql --output sqlite:///registrations.db --project YOUR_PROJECT --query demo-registrations.xlsx

        (you will not see any output)

    $ sqlite3 registrations.db

    sqlite> .schema

    (you will see the definition of the table that has been created)

    sqlite> SELECT * FROM "Registrations"

    (you will see the rows of the table)


Note that for use with a SQL database, a field named id is required.

 

5. Synchronize recent changes periodically

One expected way to use this tool is to run a weekly/daily/hourly export of just those forms and cases which have changed. To set this up you will need knowledge of how to set up scheduled tasks, so this tutorial will just show a single example command and trust that you are comfortable adjusting it to suit your needs. For SOME_RECENT_DATE, any typical date format should be understood.

$ commcare-export --output-format sql --output YOUR_DATABASE_URL --since SOME_RECENT_DATE (the rest of the command as above)

NOTE: It is always safe to run this for the same time period twice, or overlapping time periods, as the command always inserts new data, but simply updates existing rows.

C. Excel Configuration Details

This section describes in detail the Excel configuration for commcare-export. You should open one (or many) of the examples and compare them to the explanations below.

One Configuration Sheet Per Output Table:

Each sheet of the configuration workbook is the configuration for an output table. The table will have the same name as the sheet. The sheet consist of a series of columns with names that are special to commcare-export. These column headers are required but also freeing - you may have any other columns you like as long as they do not have one of the reserved titles.

Columns of a Configuration Sheet

  1. Data Source: (required) This column specifies whether you want to export from the form API or the case API. In most cases, you will simply put one of those two words in the first cell of the column. (Advanced Usage: You may specify a deep JSONPath into the data in order to iterate over repeats. If these terms don't sound familiar, then you probably don't need them!)
  2. Filter Name: (optional) This column specifies the names of filters to use for forms and cases. For a full list of parameters that can be used to filter the response see:
    1. List Forms API
    2. List Cases API
  3. Filter Value: (optional) This column must be the same length as Filter Name as the values are paired with names in order to filter the results you request from CommCareHQ. See API documentation listed above for details or acceptable values and formats.
  4. Field: Each row of this column specifies the name of one output field. Note that for use with a SQL database, a field named id is required.
  5. Source Field: Each row of this column specifies where the data of the corresponding Field should be extracted from. This must be the same length as Field, naturally!
  6. Map Via: This column allows you to convert the raw value to another form (only relevant if you're exporting to SQL). See section E. Type Support below for details
  7. Include Referenced Items (advanced!): Each item in this column is a referenced id for which the entire object will be included. For example, when fetching from the case API, if you put xforms into this column, then instead of just getting the xform_ids, you will get the entire forms in a property called xforms. For cases, you can include: child_cases, parent_cases, xforms. For forms, you can include: cases.

Here is an example:

D. How To Reference Fields in Forms and Cases

In general, the name of form and case fields matches those that you wrote when you created your application, with predictable modifications. You should open an example Excel configuration file right now to compare what you see there with the explanations below.

Top-Level Form Fields: A field named FIELD in a form is referenced by form.FIELD. For example, the field age is referenced by form.age in an Excel config file.

Top-Level Case Fields: A field name FIELD in a case is referenced by properties.FIELD. For example, the field delivery_date is referenced by properties.delivery_date in an Excel config file.

Nested Fields: A field nested more deeply within a form or case may be referenced by separating each level of nesting with a period. For example in demo-deliveries.xlsx you will see the nested field form.delivery_information.delivery_type which will pull out the value the delivery_type question inside the group delivery_information. This will always match the structure of your application. As of the time this document was written, you can find the right name by looking at the column name in a regular form/case export from CommCareHQ.

(Power Users / Programmers Only) To find out exactly what fields are available, you must view the raw output of the CommCareHQ Case List API and Form List API. You can then reference these fields using an arbitrary JSONPath into the structure.

 

E. Type Support

This is only relevant if you're exporting to SQL.

By default the export tool only supports exporting strings, integer values and boolean values. In order to support dates as well as conversion of other types the following functions can be used in the Map Via column:

Function nameDescription
str2bool

Convert a string value to a boolean. The following values result in a True output:

'true', 't', '1' (case doesn't matter)

If the value is already a boolean then it will be output without change. All other values result in a False output.

bool2intThis supports converting boolean values and booleans represented as strings to the integers 1 or 0.
str2numConvert a string representation of a number to a number.
str2dateConvert a string representation of a date or timestamp to a date or timestamp.

You can also use the command line option "--strict-types" which will prevent SQL column types from being changed. This may result in errors if incompatible data is received from CommCare HQ.

F. Functions

The following are functions which can be used to change the value of a column. They can be added to the Map Via column in the Excel query.

Function nameDescriptionExample usageSince version
count-selectedCount the number of options selected for a multi-select value.
Valuea b c d
Map Viacount-selected
Output4
0.12.4
selected-at

Get the option at the given position for a multi-select value. Note: the position index is zero-indexed.

This means that to choose the first item, you need to enter 0, the second item, 1, etc.

Valuered green blue
Map Viaselected-at(2)
Outputblue
0.12.4
selectedChecks to see if a value was selected from a multi-select value. Outputs 'True' or 'False'.
Valuerash fever
Map Viaselected(fever)
OutputTrue
0.12.4

Example usage

Data SourceFilter NameFilter ValueFieldSource FieldMap Via
formxmlns.exacthttp://openrosa.org/formdesigner/abcdefFirst optionmuti_question1selected-at(0)
   Second optionmuti_question1selected-at(1)
   Num optionsmuti_question1count-selected
   Has Fevermuti_question1selected(fever)

Tips & Tricks

See Data Export Tool: Tips and Tricks

Troubleshooting

  • The SQL table or Excel sheet does not get created
    • The export tool will only create the SQL table or Excel sheet if there is data to export. If there are no forms or cases matching the filters then it will do nothing.

 

 

  • No labels