Welcome to Widukind’s documentation!

Summary

Version:0.5.0
Updated:January 03, 2016
Licence:BSD ?
Repository:https://github.com/Widukind/dlstats
Tickets:https://github.com/Widukind/dlstats/issues
Doc:http://widukind-dlstats.readthedocs.org/en/latest/

Contents:

What is dlstats?

This Python module retrieves time series from the major statistical offices (Eurostat, World Bank, European statistical offices, ...) and records those series in a MongoDB database through pymongo. MongoDB allows the data to be downloaded through a REST interface. Widukind provides a graphical client and a set of functions for Matlab, R, Excel and pandas.

API

Fetchers Commons

Fetcher
DlstatsCollection
Providers
Categories
Datasets
Series

Changed in version 0.3.0: Removed inheritance from DlstatsCollection

CodeDict

dlstats.fetchers.BEA

dlstats.fetchers.bis

dlstats.fetchers.bis.extract_zip_file()
dlstats.fetchers.bis.csv_dict()
dlstats.fetchers.bis.local_read_csv()

All Fetchers

dlstats.fetchers.ecb
dlstats.fetchers.esri
dlstats.fetchers.eurostat
dlstats.fetchers.IMF
dlstats.fetchers.oecd
dlstats.fetchers.world_bank

Commands

Environment

Environment variables can be used to set the values of command-line options.

All of the application's variables start with DLSTATS_.

Example:

$ DLSTATS_DEBUG=True dlstats fetchers run -v -S -f BIS

# Or:

$ export DLSTATS_DEBUG=True
$ dlstats fetchers run -v -S -f BIS

# Is the same as:

$ dlstats fetchers run --debug -v -S -f BIS
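
To illustrate the prefix convention described above, here is a minimal Python sketch that collects every variable starting with DLSTATS_ and derives an option name from it. The helper name options_from_env and the lower-casing rule are assumptions made for this example; dlstats may read its environment differently.

```python
import os

PREFIX = "DLSTATS_"


def options_from_env(environ=None):
    """Return {option_name: raw_value} for every DLSTATS_* variable."""
    environ = os.environ if environ is None else environ
    options = {}
    for name, value in environ.items():
        if name.startswith(PREFIX):
            # Hypothetical naming rule: DLSTATS_DEBUG -> "debug"
            options[name[len(PREFIX):].lower()] = value
    return options


# Variables without the prefix are ignored.
sample = {"DLSTATS_DEBUG": "True", "HOME": "/home/user"}
print(options_from_env(sample))
```

Values stay as raw strings here; converting "True" to a boolean is left to whatever parses the options.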

dlstats.client

$ dlstats --help

Usage: dlstats [OPTIONS] COMMAND [ARGS]...

Options:
  --version  Show the version and exit.
  --help     Show this message and exit.

Commands:
  fetchers  Fetchers commands.
  mongo     MongoDB commands.

dlstats fetchers

$ dlstats fetchers --help

Usage: dlstats fetchers [OPTIONS] COMMAND [ARGS]...

  Fetchers commands.

Options:
  --help  Show this message and exit.

Commands:
  datasets  Show datasets list
  list      Show fetchers list
  report    Fetchers report
  run       Run Fetcher - All datasets or selected...

fetchers list

$ dlstats fetchers list

----------------------------------------------------
BIS
INSEE
BEA
IMF
EUROSTAT
WB
----------------------------------------------------

fetchers datasets

$ dlstats fetchers datasets

Usage: dlstats fetchers datasets [OPTIONS]

  Show datasets list

Options:
  -f, --fetcher [INSEE|BIS|BEA|IMF|WB|EUROSTAT]
                                  Fetcher choice  [required]
  --help                          Show this message and exit.

fetchers report

$ dlstats fetchers report --help

Usage: dlstats fetchers report [OPTIONS]

  Fetchers report

Options:
  --mongo-url TEXT  URL for MongoDB connection.  [default:
                    mongodb://127.0.0.1:27017/widukind]
  --help            Show this message and exit.

Example

$ dlstats fetchers report
-----------------------------------------------------------------------------------------
MongoDB: mongodb://127.0.0.1:27017/widukind :
-----------------------------------------------------------------------------------------
Provider             | Dataset                        | Series     | last Update
-----------------------------------------------------------------------------------------
WorldBank            | GEM                            |       9346 | 2015-09-15 21:38:18
Eurostat             | demo_pjanbroad                 |        834 | 2015-04-23 00:00:00
Eurostat             | gov_10a_taxag                  |      94512 | 2015-07-01 00:00:00
Eurostat             | gov_10q_ggnfa                  |      19218 | 2015-07-01 00:00:00
Eurostat             | namq_10_a10_e                  |      24265 | 2015-09-18 00:00:00
Eurostat             | namq_gdp_p                     |      11956 | 2015-04-13 00:00:00
INSEE                | 1427                           |         37 | 1900-01-01 00:00:00
INSEE                | 158                            |        393 | 1900-01-01 00:00:00
IMF                  | WEO                            |      10936 | 2015-04-01 00:00:00
BIS                  | CNFS                           |        938 | 2015-09-16 09:34:20
BIS                  | DSRP                           |         66 | 2015-09-16 08:47:38
-----------------------------------------------------------------------------------------

fetchers run

$ dlstats fetchers run --help

Usage: dlstats fetchers run [OPTIONS]

  Run Fetcher - All datasets or selected dataset

Options:
  -v, --verbose                   Enables verbose mode.
  -S, --silent                    Suppress confirm
  -D, --debug
  --mongo-url TEXT                URL for MongoDB connection.  [default:
                                  mongodb://127.0.0.1:27017/widukind]
  -f, --fetcher [EUROSTAT|BEA|BIS|IMF|INSEE|WB]
                                  Fetcher choice  [required]
  -d, --dataset TEXT              Run selected dataset only
  --help                          Show this message and exit.

dlstats mongo

$ dlstats mongo --help

Usage: dlstats mongo [OPTIONS] COMMAND [ARGS]...

  MongoDB commands.

Options:
  --help  Show this message and exit.

Commands:
  check          Verify connection
  check-schemas  Check datas in DB with schemas
  clean          Delete MongoDB collections
  reindex        Reindex collections

mongo check

$ dlstats mongo check --help

Usage: dlstats mongo check [OPTIONS]

  Verify connection

Options:
  -v, --verbose     Enables verbose mode.
  --pretty          Pretty display.
  --mongo-url TEXT  URL for MongoDB connection.  [default:
                    mongodb://127.0.0.1:27017/widukind]
  --help            Show this message and exit.

Example:

$ dlstats mongo check
------------------------------------------------------
Connection OK
------------------------------------------------------
pymongo version : 3.1
-------------------- Server Infos --------------------
{'allocator': 'system',
 'bits': 64,
 'compilerFlags': '/TP /nologo /EHsc /W3 /wd4355 /wd4800 /wd4267 /wd4244 /Z7 '
                  '/errorReport:none /O2 /Oy- /MT /GL',
 'debug': False,
 'gitVersion': '05bebf9ab15511a71bfbded684bb226014c0a553',
 'javascriptEngine': 'V8',
 'loaderFlags': '/nologo /LTCG /DEBUG /LARGEADDRESSAWARE '
                '/NODEFAULTLIB:MSVCPRT',
 'maxBsonObjectSize': 16777216,
 'ok': 1.0,
 'sysInfo': 'windows sys.getwindowsversion(major=6, minor=1, build=7601, '
            "platform=2, service_pack='Service Pack 1') "
            'BOOST_LIB_VERSION=1_49',
 'version': '2.4.14',
 'versionArray': [2, 4, 14, 0]}
-------------------- Host Infos ----------------------
{'extra': {'pageSize': 4096},
 'ok': 1.0,
 'os': {'name': 'Microsoft Windows 7',
        'type': 'Windows',
        'version': '6.1 SP1 (build 7601)'},
 'system': {'cpuAddrSize': 64,
            'cpuArch': 'x86_64',
            'currentTime': datetime.datetime(2015, 11, 5, 7, 9, 6, 766000),
            'hostname': 'admin-VAIO',
            'memSizeMB': 6125,
            'numCores': 4,
            'numaEnabled': False}}
------------------------------------------------------

mongo check-schemas

$ dlstats mongo check-schemas --help

Usage: dlstats mongo check-schemas [OPTIONS]

  Check datas in DB with schemas

Options:
  -v, --verbose             Enables verbose mode.
  -S, --silent              Suppress confirm
  -D, --debug
  --mongo-url TEXT          URL for MongoDB connection.  [default:
                            mongodb://127.0.0.1:27017/widukind]
  -M, --max-errors INTEGER  [default: 0]
  --help                    Show this message and exit.

Example:

$ dlstats mongo check-schemas --max-errors 5 --silent
Attention, opération très longue
check series...
Max error attempt. Skip test !
check categories...
Max error attempt. Skip test !
check datasets...
Max error attempt. Skip test !
check providers...
-------------------------------------------------------------------
Collection           | Count      | Verified   | Errors     | Time
series               |     315032 |       9826 |          5 | 10.488
categories           |       6875 |       1200 |          5 | 0.335
datasets             |         23 |          9 |          5 | 0.012
providers            |          5 |          5 |          0 | 0.001
-------------------------------------------------------------------
time elapsed : 10.841 seconds

mongo clean

Warning

Dangerous operation!

$ dlstats mongo clean --help

Usage: dlstats mongo clean [OPTIONS]

  Delete MongoDB collections

Options:
  -v, --verbose     Enables verbose mode.
  -S, --silent      Suppress confirm
  -D, --debug
  --mongo-url TEXT  URL for MongoDB connection.  [default:
                    mongodb://127.0.0.1:27017/widukind]
  --help            Show this message and exit.

mongo reindex

Warning

All write operations are blocked while the reindex is running!

$ dlstats mongo reindex --help

Usage: dlstats mongo reindex [OPTIONS]

  Reindex collections

Options:
  -v, --verbose     Enables verbose mode.
  -S, --silent      Suppress confirm
  -D, --debug
  --mongo-url TEXT  URL for MongoDB connection.  [default:
                    mongodb://127.0.0.1:27017/widukind]
  --help            Show this message and exit.

Configuration

Medium

dlstats is configured by editing an INI file named dlstats. On a UNIX platform, for example, the user-specific configuration is found in $HOME/.dlstats and the system-wide configuration in /etc. If the user running dlstats has a personal configuration file, the system-wide configuration is ignored.
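
The lookup rule can be sketched with the standard configparser module. This is only an illustration under stated assumptions: the function name load_config and the exact system path /etc/dlstats are hypothetical, and the real lookup code in dlstats may differ.

```python
import configparser
import os


def load_config(user_path=None, system_path="/etc/dlstats"):
    """Read the user file if it exists, otherwise the system-wide file."""
    if user_path is None:
        user_path = os.path.join(os.path.expanduser("~"), ".dlstats")
    parser = configparser.ConfigParser()
    for path in (user_path, system_path):
        if os.path.isfile(path):
            parser.read(path)
            break  # the first file found wins; the other one is ignored
    return parser


config = load_config()
print(config.sections())
```

Stopping at the first file found, rather than merging both, mirrors the statement that a personal file makes the system-wide one irrelevant.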

Structure

The INI file is divided into sections, whose names are enclosed in square brackets.

MongoDB

These options are passed to the MongoClient instance used by dlstats and follow the pymongo API. Please refer to the pymongo documentation [1] for more information.

[1] http://api.mongodb.org/python/
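
As a hedged sketch of how such a section could become keyword arguments for MongoClient: the option names host and port are real MongoClient parameters, but their presence in the dlstats INI file, the sample text, and the helper mongo_options are assumptions made for illustration.

```python
import configparser

# Hypothetical content of the [MongoDB] section.
SAMPLE = """
[MongoDB]
host = 127.0.0.1
port = 27017
"""


def mongo_options(text):
    """Turn the [MongoDB] section into a dict of keyword arguments."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    options = dict(parser["MongoDB"])
    if "port" in options:
        options["port"] = int(options["port"])  # INI values are strings
    return options


print(mongo_options(SAMPLE))
# With pymongo installed, the dict could be passed on as
# pymongo.MongoClient(**mongo_options(SAMPLE)).
```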

Database

Specification

dlstats stores information from various statistical providers. The main goal is to keep up-to-date time series that are useful to the economist as well as their historical revisions.

Structure

The database structure is described in BSON [1].

Journal

In addition to MongoDB's internal journaling, we keep a record of every operation that affects the database. The method field stores the name of the dlstats method that was called.

journal : {
              _id : MongoID,
              method : str,
              arguments : []
             }
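
One way to picture the journal is a decorator that records each call before running it. The sketch below is an assumption-laden illustration: the in-memory list stands in for the journal collection, and journaled and upsert_dataset are hypothetical names, not part of dlstats.

```python
import functools

journal = []  # stand-in for the MongoDB journal collection


def journaled(func):
    """Record the method name and its arguments, then call the method."""
    @functools.wraps(func)
    def wrapper(*args):
        journal.append({"method": func.__name__, "arguments": list(args)})
        return func(*args)
    return wrapper


@journaled
def upsert_dataset(provider, dataset_code):  # hypothetical operation
    return (provider, dataset_code)


upsert_dataset("Eurostat", "namq_gdp_p")
print(journal)
```

In a real database the _id field would be assigned by MongoDB when the entry is inserted.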
Categories
Generic schema

Time series are organized in a tree of categories. Each node stores references to its children, which provides a simple and efficient solution to tree storage [2].

categories : {
              _id : MongoID,
              _id_journal : MongoID,
              name : str,
              children_id : [MongoID],
              series_id : [MongoID]
             }
[1] http://www.bsonspec.org
[2] http://docs.mongodb.org/manual/tutorial/model-tree-structures/
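
To show how child references support traversal, the sketch below walks a small category tree held in a plain dictionary and gathers the series ids under a node. The sample data and the function collect_series are invented for the example; a real client would fetch each node from the categories collection instead.

```python
def collect_series(categories, root_id):
    """Return every series id reachable from root_id, depth first."""
    found = []
    stack = [root_id]
    while stack:
        node = categories[stack.pop()]
        found.extend(node["series_id"])
        # Push children in reverse so they are visited in stored order.
        stack.extend(reversed(node["children_id"]))
    return found


# Illustrative documents keyed by _id (strings instead of ObjectIds).
categories = {
    "root": {"name": "Economy", "children_id": ["gdp", "prices"], "series_id": []},
    "gdp": {"name": "GDP", "children_id": [], "series_id": ["s1", "s2"]},
    "prices": {"name": "Prices", "children_id": [], "series_id": ["s3"]},
}

print(collect_series(categories, "root"))
```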
Metadata

The metadata differs across statistical providers. We add the corresponding fields when needed.

Eurostat

For Eurostat, we add a number of URLs for accessing the raw tsv, dft or sdmx files, as well as a flowRef field identifying the dataflow [3]. We call codes the nomenclature of attributes that atomically defines a time series. These codes are provided only for exploring the database; within the program, a time series is identified by its unique id. A document from the codes collection references all the series related to that code. Consequently, it is possible to query for time series using a set of constraints on codes: at the application level, the client intersects the series_id sets to keep only the relevant time series. We keep a pointer to the time series for better performance.

categories : {
              _id : MongoID,
              _id_journal : MongoID,
              name : str,
              children_id : [MongoID],
              url_tsv : str,
              url_dft : str,
              url_sdmx : str,
              flowRef : str,
              codes : {
                       _id_journal : MongoID,
                       name : str,
                       values : {
                                 key : str,
                                 description : str,
                                 series_id : [MongoID]
                                }
                      }
             }
[3] http://epp.eurostat.ec.europa.eu/portal/page/portal/sdmx_web_services/getting_started/rest_sdmx_2.1
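
The client-side step can be sketched as follows, reading the combination of series_id sets as a set intersection, which is an interpretation of the text rather than confirmed dlstats behaviour. The nested-dictionary layout and the name series_matching are assumptions made for this example.

```python
def series_matching(codes, constraints):
    """Return the series ids that satisfy every (code name, key) constraint.

    codes:       {code_name: {key: set_of_series_ids}}
    constraints: {code_name: key}
    """
    sets = [codes[name][key] for name, key in constraints.items()]
    return set.intersection(*sets) if sets else set()


# Illustrative data: integers stand in for series ObjectIds.
codes = {
    "geo": {"FR": {1, 2, 3}, "DE": {4}},
    "freq": {"Q": {2, 3, 4}, "A": {1}},
}

print(series_matching(codes, {"geo": "FR", "freq": "Q"}))
```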
Time series

The values are stored in a list. The position field of each entry in the revisions subdocument is an index into that list.

series : {
          _id : MongoID,
          _id_journal : MongoID,
          name : str,
          start_date : timestamp,
          end_date : timestamp,
          release_dates : [timestamp],
          values : [float64],
          frequency : str,
          revisions : {
                       value : float64,
                       position : int,
                       release_date : timestamp
                      },
          codes : {
                   name : str,
                   value : str
                  },
          categories_id : MongoID
         }
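
The relation between position and the values list can be sketched like this. It assumes that revisions holds a list of entries and that each entry keeps the value that was replaced; both points are assumptions for the sake of the example, as is the function name revise.

```python
from datetime import datetime


def revise(series, position, new_value, release_date):
    """Replace values[position], keeping the previous value as a revision."""
    series["revisions"].append({
        "value": series["values"][position],
        "position": position,
        "release_date": series["release_dates"][position],
    })
    series["values"][position] = new_value
    series["release_dates"][position] = release_date


first_release = datetime(2015, 1, 1)
series = {
    "values": [1.0, 2.0, 3.0],
    "release_dates": [first_release] * 3,
    "revisions": [],
}
revise(series, 1, 2.5, datetime(2015, 4, 1))
print(series["values"], series["revisions"])
```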

Implementation

MongoDB
Pros
  • simple (from a developer perspective)
  • large number of drivers
  • no ORM headache
  • painless sharding
  • very large user base
  • decent documentation
Cons
  • immature (mongodb 1.x was scary, 2.x is stable)
  • complex configuration, a lot of fine-tuning required
  • slow map/reduce
Impact on the structure

Growing documents impact performance and should be avoided. Preallocation can alleviate the issue. Alternatively, setting the padding to a higher value may help but comes with a memory cost.

A large number of keys is costly because MongoDB collections, unlike Python dictionaries, are not indexed with hash tables: if a collection has a large number of keys, MongoDB has to perform a large number of comparisons to execute a query. If read performance becomes a problem, normalization should improve the results.

HDF5

Better than all the other solutions as long as everything is loaded in RAM. Unfit for our purposes.

Cassandra
Pros
  • supported by the Apache Software Foundation
  • excellent write performance

MongoDB Schemas

categories

Fields
Unique Constraint
Fields:provider + categoryCode
id
required:Yes
unique:Yes
type:ObjectID
comments:Unique ID
name
required:Yes ???
unique:No ???
type:String
default value:null ???
comments:???
  • Examples:
    • Catches by fishing area - historical data (1950-1999)
    • Soil erosion by water by NUTS 3 regions (data source: JRC)
    • Enterprises in high-tech sectors by NACE Rev.2 activity
    • Enterprises in high-tech sectors by NACE Rev.1.1 activity
    • Business statistics
    • High-technology trade
    • Data on employment at national level
categoryCode
required:Yes ???
unique:No
type:String
default value:null ???
comments:???
  • Examples:
    • fish_ca_h
    • aei_pr_soiler
    • htec_eco_ent2
    • htec_eco_ent
    • htec_sti_pat
    • ipr_dfa_cres
provider
required:Yes
unique:No
type:String
comments:Name of Provider
  • Examples:
    • WorldBank
    • Eurostat
    • INSEE
    • IMF
children
required:No
unique:No
type:Array of bson.objectid.ObjectId or null
default value:[None]
comments:???
docHref
required:No
unique:No
type:String
default value:null
comments:Not used
lastUpdate
required:No
unique:No
type:ISODate / datetime
default value:null
comments:???
exposed
required:No ???
unique:No
type:Bool
default value:false
comments:???
Examples
{
    "_id": ObjectId('559d6f819f8f0807a98ee821'),
    "provider": "WorldBank",
    "docHref": null,
    "lastUpdate": null,
    "children": null,
    "categoryCode": "GEM",
    "exposed": false,
    "name": "GEM"
},
{
    "_id": ObjectId('559e40c29f8f081123ecd8f8'),
    "docHref": null,
    "categoryCode": "WEO",
    "provider": "IMF",
    "exposed": false,
    "name": "WEO",
    "lastUpdate": null,
    "children": null
},
{
    "_id": ObjectId('559d6fc69f8f0807a98f0c2f'),
    "lastUpdate": null,
    "categoryCode": "ei_bcs_cs",
    "exposed": false,
    "children": [
        ObjectId('560287d79f8f0857111ce31d'),
        ObjectId('560287d79f8f0857111ce31e')
    ],
    "provider": "Eurostat",
    "docHref": null,
    "name": "Consumer surveys (source: DG ECFIN)"
}
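
The unique constraint on provider + categoryCode can be checked outside the database with a few lines of Python. This is an illustrative sketch only: duplicate_keys is a hypothetical helper, and in MongoDB itself the constraint would normally be enforced by a unique compound index.

```python
from collections import Counter


def duplicate_keys(documents, fields=("provider", "categoryCode")):
    """Return the field combinations that occur more than once."""
    counts = Counter(tuple(doc[f] for f in fields) for doc in documents)
    return sorted(key for key, n in counts.items() if n > 1)


# Reduced versions of the example documents above.
docs = [
    {"provider": "WorldBank", "categoryCode": "GEM"},
    {"provider": "IMF", "categoryCode": "WEO"},
    {"provider": "Eurostat", "categoryCode": "ei_bcs_cs"},
]

print(duplicate_keys(docs))
```

The same helper works for the other collections by passing their constraint fields, for example ("provider", "datasetCode").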

providers

Fields
Unique Constraint
Fields:name
id
required:Yes
unique:Yes
type:ObjectID
comments:Unique ID
name
required:Yes
unique:Yes
type:String
comments:Name of Provider
  • Examples:
    • WorldBank
    • Eurostat
    • INSEE
    • IMF
website
required:Yes ???
unique:No
type:String
comments:URL of Provider Site
Examples
{
    "_id": ObjectId('559d6f81bc00a4d38e44ed74'),
    "website": "http://www.worldbank.org/",
    "name": "WorldBank"
},
{
    "_id": ObjectId('559d6fc6bc00a4d38e44ed76'),
    "website": "http://ec.europa.eu/eurostat",
    "name": "Eurostat"
}

datasets

Fields
Unique Constraint
Fields:provider + datasetCode
id
required:Yes
unique:Yes
type:ObjectID
comments:Unique ID
provider
required:Yes
unique:No
type:String
comments:Name of Provider
  • Examples:
    • WorldBank
    • Eurostat
    • INSEE
    • IMF
datasetCode
required:Yes ???
unique:No ???
type:String
comments:???
  • Examples:
    • demo_pjanbroad
    • GEM
    • 158
    • 1427
    • 1430
    • WEO
    • namq_gdp_c
    • namq_gdp_k
    • namq_gdp_p
    • nama_gdp_c
    • nama_gdp_k
    • nama_gdp_p
    • namq_10_a10
    • namq_10_an6
    • lfsi_act_q
    • gov_10a_taxag
    • gov_10q_ggdebt
    • gov_10q_ggnfa
    • namq_10_a10_e
    • irt_st_q
    • namq_10_gdp
name
required:Yes ???
unique:Yes ???
type:String
default value:null ???
comments:???
  • Examples:
    • Population on 1 January by broad age group and sex
    • Global Economic Monitor
    • Harmonised consumer price index - Base 2005 - French series by product according to the European classification
    • Producer price indices of French industry for all markets (base 2010) - Main aggregates
    • Producer price indices of French industry for the French market (base 2010) - Basic price - Main aggregates
    • World Economic Outlook
    • GDP and main components - Current prices
    • GDP and main components - volumes
    • GDP and main components - Price indices
    • Gross value added and income A*10 industry breakdowns
    • Gross fixed capital formation with AN_F6 asset breakdowns
    • Population, activity and inactivity - quarterly data
    • Main national accounts tax aggregates
    • Quarterly government debt
    • Quarterly non-financial accounts for general government
    • Employment A*10 industry breakdowns
    • Money market interest rates - quarterly data
    • GDP and main components (output, expenditure and income)
lastUpdate
required:No ???
unique:No
type:ISODate / datetime
default value:null
comments:???
dimensionList
required:Yes
unique:No
type:dlstats.fetchers._commons.CodeDict (list of OrderedDict)
default value:CodeDict()
comments:???
attributeList
required:No
unique:No
type:dlstats.fetchers._commons.CodeDict (list of OrderedDict)
default value:CodeDict()
comments:???
notes
required:No
unique:No
type:String
default value:empty string
comments:???
Examples
{
    "_id": ObjectId('56016d84fab819e7b143892a'),
    "dimensionList": {
        "geo": [
            [
                "EU28",
                "European Union (28 countries)"
            ],
            [
                "EU27",
                "European Union (27 countries)"
            ],
            [
                "EA19",
                "Euro area (19 countries)"
            ],
            [
                "EA18",
                "Euro area (18 countries)"
            ],
            [
                "BE",
                "Belgium"
            ],
            [
                "BG",
                "Bulgaria"
            ],
            [
                "CZ",
                "Czech Republic"
            ],
            [
                "DK",
                "Denmark"
            ],
            [
                "DE",
                "Germany (until 1990 former territory of the FRG)"
            ],
            [
                "DE_TOT",
                "Germany (including former GDR)"
            ],
            [
                "EE",
                "Estonia"
            ],
            [
                "IE",
                "Ireland"
            ],
            [
                "EL",
                "Greece"
            ],
            [
                "ES",
                "Spain"
            ],
            [
                "FR",
                "France"
            ],
            [
                "FX",
                "France (metropolitan)"
            ],
            [
                "HR",
                "Croatia"
            ],
            [
                "IT",
                "Italy"
            ],
            [
                "CY",
                "Cyprus"
            ],
            [
                "LV",
                "Latvia"
            ],
            [
                "LT",
                "Lithuania"
            ],
            [
                "LU",
                "Luxembourg"
            ],
            [
                "HU",
                "Hungary"
            ],
            [
                "MT",
                "Malta"
            ],
            [
                "NL",
                "Netherlands"
            ],
            [
                "AT",
                "Austria"
            ],
            [
                "PL",
                "Poland"
            ],
            [
                "PT",
                "Portugal"
            ],
            [
                "RO",
                "Romania"
            ],
            [
                "SI",
                "Slovenia"
            ],
            [
                "SK",
                "Slovakia"
            ],
            [
                "FI",
                "Finland"
            ],
            [
                "SE",
                "Sweden"
            ],
            [
                "UK",
                "United Kingdom"
            ],
            [
                "EEA31",
                "European Economic Area (EU-28 plus IS, LI, NO)"
            ],
            [
                "EEA30",
                "European Economic Area (EU-27 plus IS, LI, NO)"
            ],
            [
                "EFTA",
                "European Free Trade Association"
            ],
            [
                "IS",
                "Iceland"
            ],
            [
                "LI",
                "Liechtenstein"
            ],
            [
                "NO",
                "Norway"
            ],
            [
                "CH",
                "Switzerland"
            ],
            [
                "ME",
                "Montenegro"
            ],
            [
                "MK",
                "Former Yugoslav Republic of Macedonia, the"
            ],
            [
                "AL",
                "Albania"
            ],
            [
                "RS",
                "Serbia"
            ],
            [
                "TR",
                "Turkey"
            ],
            [
                "AD",
                "Andorra"
            ],
            [
                "BY",
                "Belarus"
            ],
            [
                "BA",
                "Bosnia and Herzegovina"
            ],
            [
                "XK",
                "Kosovo (under United Nations Security Council Resolution 1244/99)"
            ],
            [
                "MD",
                "Moldova"
            ],
            [
                "MC",
                "Monaco"
            ],
            [
                "RU",
                "Russia"
            ],
            [
                "SM",
                "San Marino"
            ],
            [
                "UA",
                "Ukraine"
            ],
            [
                "AM",
                "Armenia"
            ],
            [
                "AZ",
                "Azerbaijan"
            ],
            [
                "GE",
                "Georgia"
            ]
        ],
        "freq": [
            [
                "A",
                "Annual"
            ],
            [
                "S",
                "Half-yearly, semester"
            ],
            [
                "Q",
                "Quarterly"
            ],
            [
                "M",
                "Monthly"
            ],
            [
                "W",
                "Weekly"
            ],
            [
                "B",
                "Business week"
            ],
            [
                "D",
                "Daily"
            ],
            [
                "H",
                "Hourly"
            ],
            [
                "N",
                "Minutely"
            ]
        ],
        "age": [
            [
                "TOTAL",
                "Total"
            ],
            [
                "Y_LT15",
                "Less than 15 years"
            ],
            [
                "Y15-64",
                "From 15 to 64 years"
            ],
            [
                "Y_GE65",
                "65 years or over"
            ],
            [
                "UNK",
                "Unknown"
            ]
        ],
        "sex": [
            [
                "T",
                "Total"
            ],
            [
                "M",
                "Males"
            ],
            [
                "F",
                "Females"
            ]
        ]
    },
    "lastUpdate": ISODate('2015-04-23T00:00:00.000Z'),
    "attributeList": {
        "obs_status": [
            [
                "b",
                "break in time series"
            ],
            [
                "c",
                "confidential"
            ],
            [
                "d",
                "definition differs (see metadata)"
            ],
            [
                "e",
                "estimated"
            ],
            [
                "f",
                "forecast"
            ],
            [
                "i",
                "see metadata (phased out)"
            ],
            [
                "n",
                "not significant"
            ],
            [
                "p",
                "provisional"
            ],
            [
                "r",
                "revised"
            ],
            [
                "s",
                "Eurostat estimate (phased out)"
            ],
            [
                "u",
                "low reliability"
            ],
            [
                "z",
                "not applicable"
            ]
        ],
        "time_format": [
            [
                "P1Y",
                "Annual"
            ],
            [
                "P6M",
                "Semi-annual"
            ],
            [
                "P3M",
                "Quarterly"
            ],
            [
                "P1M",
                "Monthly"
            ],
            [
                "P7D",
                "Weekly"
            ],
            [
                "P1D",
                "Daily"
            ],
            [
                "PT1M",
                "Minutely"
            ]
        ]
    },
    "name": "Population on 1 January by broad age group and sex",
    "provider": "Eurostat",
    "datasetCode": "demo_pjanbroad",
    "docHref": null
}
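
Since each dimension in the example is stored as a list of [code, label] pairs, a label lookup can be sketched by converting the pairs to a dictionary. The helper label_for and the shortened sample are assumptions for illustration; they are not part of the dlstats API.

```python
def label_for(dimension_list, dimension, code):
    """Return the label of code within dimension, or None if it is absent."""
    return dict(dimension_list[dimension]).get(code)


# Shortened copy of the dimensionList shown in the example above.
dimension_list = {
    "sex": [["T", "Total"], ["M", "Males"], ["F", "Females"]],
    "freq": [["A", "Annual"], ["Q", "Quarterly"]],
}

print(label_for(dimension_list, "sex", "F"))
```

Keeping the pairs in a list preserves the provider's ordering, which a plain dictionary lookup does not need but a display layer might.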

series

Fields
Unique Constraint
Fields:provider + datasetCode + key
id
required:Yes
unique:Yes
type:ObjectID
comments:Unique ID
provider
required:Yes
unique:No
type:String
comments:Name of Provider
  • Examples:
    • WorldBank
    • Eurostat
    • INSEE
    • IMF
key
required:Yes
unique:Yes
type:String
comments:Unique key of the series
  • Examples:
    • Q.PYP_MNAC.WDA.P3.IT
    • Q.PYP_MNAC.WDA.P3.LU
    • Q.PYP_MNAC.WDA.P3.LV
    • Q.PYP_MNAC.WDA.P31_S13.IT
    • Q.PYP_MNAC.WDA.P31_S13.LU
    • Q.PYP_MNAC.WDA.P31_S13.LV
name
required:Yes
unique:Yes
type:String
comments:Unique name of the series
attributes
required:No
unique:No
type:Dict
comments:???
datasetCode
required:Yes ???
unique:No ???
type:String
comments:???
  • Examples:
    • GEM
    • nama_gdp_c
    • namq_gdp_c
    • 158
    • 1427
    • 1430
    • WEO
    • namq_gdp_k
    • namq_gdp_p
    • nama_gdp_k
    • nama_gdp_p
    • demo_pjanbroad
    • namq_10_a10
    • gov_10a_taxag
    • namq_10_an6
    • lfsi_act_q
    • gov_10q_ggdebt
    • gov_10q_ggnfa
    • namq_10_a10_e
    • irt_st_q
    • namq_10_gdp
dimensions
required:Yes ???
unique:No
type:Dict
comments:???
startDate
required:Yes ???
unique:No
type:Integer ???
comments:???
endDate
required:Yes ???
unique:No
type:Integer ???
comments:???
frequency
required:Yes ???
unique:No
type:String
comments:???
  • Examples:
    • A
    • M
    • Q
releaseDates
required:Yes ???
unique:No
type:Array
comments:???
revisions
required:Yes ???
unique:No
type:Dict
comments:???
values
required:Yes ???
unique:No
type:Array
comments:???
notes
required:No ???
unique:No
type:String
comments:???
Example - IMF
{
    "_id": ObjectId('560154fe9f8f084db8e653a3'),
    "attributes": {
        "flag": [
            "",
            "",
            "",
            "",
            "",
            "",
            "",
            "",
            "",
            "",
            "",
            "",
            "",
            "",
            "",
            "",
            "",
            "",
            "",
            "",
            "",
            "",
            "",
            "",
            "",
            "",
            "",
            "",
            "",
            "",
            "",
            "",
            "",
            "e",
            "e",
            "e",
            "e",
            "e",
            "e",
            "e",
            "e"
        ]
    },
    "datasetCode": "WEO",
    "dimensions": {
        "Scale": "Billions",
        "WEO Country Code": "512",
        "Country": "AFG",
        "Units": "0",
        "Subject": "NGDP"
    },
    "endDate": 50,
    "frequency": "A",
    "key": "NGDP.AFG.0",
    "name": "Gross domestic product, current prices.Afghanistan.National currency",
    "notes": "Expressed in billions of national currency units. Expenditure-based GDP is total final expenditures at purchasers' prices (including the f.o.b. value of exports of goods and services), less the f.o.b. value of imports of goods and services. [SNA 1993]",
    "provider": "IMF",
    "releaseDates": [
        ISODate('2015-04-01T00:00:00.000Z'),
        ISODate('2015-04-01T00:00:00.000Z'),
        ISODate('2015-04-01T00:00:00.000Z'),
        ISODate('2015-04-01T00:00:00.000Z'),
        ISODate('2015-04-01T00:00:00.000Z'),
        ISODate('2015-04-01T00:00:00.000Z'),
        ISODate('2015-04-01T00:00:00.000Z'),
        ISODate('2015-04-01T00:00:00.000Z'),
        ISODate('2015-04-01T00:00:00.000Z'),
        ISODate('2015-04-01T00:00:00.000Z'),
        ISODate('2015-04-01T00:00:00.000Z'),
        ISODate('2015-04-01T00:00:00.000Z'),
        ISODate('2015-04-01T00:00:00.000Z'),
        ISODate('2015-04-01T00:00:00.000Z'),
        ISODate('2015-04-01T00:00:00.000Z'),
        ISODate('2015-04-01T00:00:00.000Z'),
        ISODate('2015-04-01T00:00:00.000Z'),
        ISODate('2015-04-01T00:00:00.000Z'),
        ISODate('2015-04-01T00:00:00.000Z'),
        ISODate('2015-04-01T00:00:00.000Z'),
        ISODate('2015-04-01T00:00:00.000Z'),
        ISODate('2015-04-01T00:00:00.000Z'),
        ISODate('2015-04-01T00:00:00.000Z'),
        ISODate('2015-04-01T00:00:00.000Z'),
        ISODate('2015-04-01T00:00:00.000Z'),
        ISODate('2015-04-01T00:00:00.000Z'),
        ISODate('2015-04-01T00:00:00.000Z'),
        ISODate('2015-04-01T00:00:00.000Z'),
        ISODate('2015-04-01T00:00:00.000Z'),
        ISODate('2015-04-01T00:00:00.000Z'),
        ISODate('2015-04-01T00:00:00.000Z'),
        ISODate('2015-04-01T00:00:00.000Z'),
        ISODate('2015-04-01T00:00:00.000Z'),
        ISODate('2015-04-01T00:00:00.000Z'),
        ISODate('2015-04-01T00:00:00.000Z'),
        ISODate('2015-04-01T00:00:00.000Z'),
        ISODate('2015-04-01T00:00:00.000Z'),
        ISODate('2015-04-01T00:00:00.000Z'),
        ISODate('2015-04-01T00:00:00.000Z'),
        ISODate('2015-04-01T00:00:00.000Z'),
        ISODate('2015-04-01T00:00:00.000Z')
    ],
    "revisions": {
        "33": [
            {
                "value": "1,148.113",
                "releaseDates": ISODate('2014-10-01T00:00:00.000Z')
            }
        ],
        "34": [
            {
                "value": "1,248.663",
                "releaseDates": ISODate('2014-10-01T00:00:00.000Z')
            }
        ],
        "35": [
            {
                "value": "1,378.499",
                "releaseDates": ISODate('2014-10-01T00:00:00.000Z')
            }
        ],
        "36": [
            {
                "value": "1,526.441",
                "releaseDates": ISODate('2014-10-01T00:00:00.000Z')
            }
        ],
        "37": [
            {
                "value": "1,682.614",
                "releaseDates": ISODate('2014-10-01T00:00:00.000Z')
            }
        ],
        "38": [
            {
                "value": "1,858.130",
                "releaseDates": ISODate('2014-10-01T00:00:00.000Z')
            }
        ],
        "39": [
            {
                "value": "2,057.319",
                "releaseDates": ISODate('2014-10-01T00:00:00.000Z')
            }
        ]
    },
    "startDate": 10,
    "values": [
        "n/a",
        "n/a",
        "n/a",
        "n/a",
        "n/a",
        "n/a",
        "n/a",
        "n/a",
        "n/a",
        "n/a",
        "n/a",
        "n/a",
        "n/a",
        "n/a",
        "n/a",
        "n/a",
        "n/a",
        "n/a",
        "n/a",
        "n/a",
        "n/a",
        "n/a",
        "181.605",
        "220.013",
        "246.210",
        "304.926",
        "345.817",
        "427.495",
        "517.509",
        "607.227",
        "711.759",
        "836.222",
        "1,033.591",
        "1,114.649",
        "1,165.605",
        "1,250.023",
        "1,382.709",
        "1,535.283",
        "1,699.171",
        "1,884.765",
        "2,081.098"
    ]
}
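
The startDate and endDate fields in the record above are stored as integers rather than dates. The following is a minimal sketch of one way to read them. It assumes (this section does not state it) that for annual series the integers count years from 1970, as a pandas annual Period ordinal would. The helper names are hypothetical and not part of dlstats.

```python
# Assumption: for annual ("A") series, startDate/endDate are year offsets from 1970.
EPOCH_YEAR = 1970


def annual_years(start_date, end_date):
    """Return the calendar years covered by an annual series, both ends included."""
    return [EPOCH_YEAR + n for n in range(start_date, end_date + 1)]


def observations(doc):
    """Pair each stored value with its year; this sketch handles annual series only."""
    if doc["frequency"] != "A":
        raise ValueError("only annual series are handled in this sketch")
    years = annual_years(doc["startDate"], doc["endDate"])
    # zip stops at the shorter sequence if the two lengths differ
    return list(zip(years, doc["values"]))


# Shortened stand-in for the IMF record above
doc = {"frequency": "A", "startDate": 10, "endDate": 12,
       "values": ["n/a", "n/a", "181.605"]}
print(observations(doc))
```

Under that assumption the IMF example (startDate 10, endDate 50) spans 1980 to 2020, which matches the 41 entries in its values list.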
Example - WorldBank (GEM)
{
    "_id": ObjectId('55f927739f8f087fa959e3ed'),
    "values": [
        "",
        "0.736533",
        "0.68195",
        "0.714125",
        "0.666342",
        "0.840883",
        "0.881617",
        "1.02235",
        "1.041133",
        "1.085208",
        "1.222867",
        "1.304292",
        "1.345858",
        "1.480267",
        "2.010875",
        "1.582225",
        "1.327092",
        "1.581",
        "1.506467",
        "2.138275",
        "2.884017",
        "2.759967",
        "2.474358",
        "2.389625",
        "2.439892",
        "2.27315",
        "2.154142",
        "2.09215",
        "2.385942",
        "2.517075",
        "2.569058",
        "2.563233",
        "2.663358",
        "2.454575",
        "2.617542",
        "2.33285",
        "1.907025",
        "1.785242",
        "1.855433",
        "1.725658",
        "1.84335",
        "1.932108",
        "2.129333",
        "2.1048",
        "1.979733",
        "2.512563",
        "2.616948",
        "2.547255",
        "2.602967",
        "3.137989",
        "2.636496",
        "3.351451",
        "4.042094",
        "4.142297",
        "4.073449",
        "4.948254",
        "NA",
        "NA",
        "NA",
        "NA",
        "NA",
        "NA",
        "NA",
        "NA",
        "NA",
        "NA",
        "NA"
    ],
    "key": "Commodity_Prices.Meat, beef, $/kg, nominal$.A",
    "startDate": -10,
    "releaseDates": [
        ISODate('2015-07-08T11:24:24.000Z'),
        ISODate('2015-07-08T11:24:24.000Z'),
        ISODate('2015-07-08T11:24:24.000Z'),
        ISODate('2015-07-08T11:24:24.000Z'),
        ISODate('2015-07-08T11:24:24.000Z'),
        ISODate('2015-07-08T11:24:24.000Z'),
        ISODate('2015-07-08T11:24:24.000Z'),
        ISODate('2015-07-08T11:24:24.000Z'),
        ISODate('2015-07-08T11:24:24.000Z'),
        ISODate('2015-07-08T11:24:24.000Z'),
        ISODate('2015-07-08T11:24:24.000Z'),
        ISODate('2015-07-08T11:24:24.000Z'),
        ISODate('2015-07-08T11:24:24.000Z'),
        ISODate('2015-07-08T11:24:24.000Z'),
        ISODate('2015-07-08T11:24:24.000Z'),
        ISODate('2015-07-08T11:24:24.000Z'),
        ISODate('2015-07-08T11:24:24.000Z'),
        ISODate('2015-07-08T11:24:24.000Z'),
        ISODate('2015-07-08T11:24:24.000Z'),
        ISODate('2015-07-08T11:24:24.000Z'),
        ISODate('2015-07-08T11:24:24.000Z'),
        ISODate('2015-07-08T11:24:24.000Z'),
        ISODate('2015-07-08T11:24:24.000Z'),
        ISODate('2015-07-08T11:24:24.000Z'),
        ISODate('2015-07-08T11:24:24.000Z'),
        ISODate('2015-07-08T11:24:24.000Z'),
        ISODate('2015-07-08T11:24:24.000Z'),
        ISODate('2015-07-08T11:24:24.000Z'),
        ISODate('2015-07-08T11:24:24.000Z'),
        ISODate('2015-07-08T11:24:24.000Z'),
        ISODate('2015-07-08T11:24:24.000Z'),
        ISODate('2015-07-08T11:24:24.000Z'),
        ISODate('2015-07-08T11:24:24.000Z'),
        ISODate('2015-07-08T11:24:24.000Z'),
        ISODate('2015-07-08T11:24:24.000Z'),
        ISODate('2015-07-08T11:24:24.000Z'),
        ISODate('2015-07-08T11:24:24.000Z'),
        ISODate('2015-07-08T11:24:24.000Z'),
        ISODate('2015-07-08T11:24:24.000Z'),
        ISODate('2015-07-08T11:24:24.000Z'),
        ISODate('2015-07-08T11:24:24.000Z'),
        ISODate('2015-07-08T11:24:24.000Z'),
        ISODate('2015-07-08T11:24:24.000Z'),
        ISODate('2015-07-08T11:24:24.000Z'),
        ISODate('2015-07-08T11:24:24.000Z'),
        ISODate('2015-07-08T11:24:24.000Z'),
        ISODate('2015-07-08T11:24:24.000Z'),
        ISODate('2015-07-08T11:24:24.000Z'),
        ISODate('2015-07-08T11:24:24.000Z'),
        ISODate('2015-07-08T11:24:24.000Z'),
        ISODate('2015-07-08T11:24:24.000Z'),
        ISODate('2015-07-08T11:24:24.000Z'),
        ISODate('2015-07-08T11:24:24.000Z'),
        ISODate('2015-07-08T11:24:24.000Z'),
        ISODate('2015-07-08T11:24:24.000Z'),
        ISODate('2015-07-08T11:24:24.000Z'),
        ISODate('2015-07-08T11:24:24.000Z'),
        ISODate('2015-07-08T11:24:24.000Z'),
        ISODate('2015-07-08T11:24:24.000Z'),
        ISODate('2015-07-08T11:24:24.000Z'),
        ISODate('2015-07-08T11:24:24.000Z'),
        ISODate('2015-07-08T11:24:24.000Z'),
        ISODate('2015-07-08T11:24:24.000Z'),
        ISODate('2015-07-08T11:24:24.000Z'),
        ISODate('2015-07-08T11:24:24.000Z'),
        ISODate('2015-07-08T11:24:24.000Z'),
        ISODate('2015-07-08T11:24:24.000Z')
    ],
    "dimensions": {
        "Commodity": "4"
    },
    "name": "Commodity Prices; Meat, beef, $/kg, nominal$; Annual",
    "frequency": "A",
    "attributes": {},
    "endDate": 55,
    "provider": "WorldBank",
    "datasetCode": "GEM"
}
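
Both examples store observations as strings, with thousands separators ("1,033.591") and several markers for missing data ("n/a", "NA" and the empty string). A minimal sketch of converting them to numbers on the client side follows. The function name and the set of markers are assumptions drawn only from the two records shown here.

```python
# Missing-value markers seen in the IMF and WorldBank examples above
MISSING = {"", "NA", "n/a"}


def parse_value(raw):
    """Convert a stored string to float, or None when the observation is missing."""
    text = raw.strip()
    if text in MISSING:
        return None
    # Drop thousands separators such as the comma in "1,033.591"
    return float(text.replace(",", ""))


values = ["n/a", "181.605", "1,033.591", "", "NA"]
print([parse_value(v) for v in values])
# [None, 181.605, 1033.591, None, None]
```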

Sources Referential

Standards

IN-PROGRESS SDMX
  • Standard for the exchange of statistical information
  • Raw data: SDMX-ML (XML), SDMX-EDI (EDIFACT)
  • DSD: Data Structure Definition, a logical container for specific data in a specific format
  • Dimension/attribute, identifier, attachment level, code list
  • Concept scheme: dimensions (time, country, frequency, topic), attributes (observation status: estimated or provisional), measures (observation values)
  • SDMX Compact, Query
  • Cross-domain concepts
  • Provision agreement: describes how data in a specific DSD is delivered over a given period
  • SDMX registry: central online repository
  • SDMX technical specifications: preparation, SDMX compliance with the DSD standard, implementation (installed), production
  • Sponsors: BIS, ECB, Eurostat, IMF, OECD, UN, World Bank
  • REST (SDMX 2.1): URL with a key filter and a period filter
  • http://sdmx.org
  • https://webgate.ec.europa.eu/fpfis/mwikis/sdmx/index.php/Main_Page
  • http://sdmx.wikispaces.com/
  • http://opensdmx.wikispaces.com/Presentations
  • SDMX-JSON:
  • SDMX ISO: 17369
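
The REST item in the list above refers to data URLs that combine a key filter with a period filter. A minimal sketch of building such a URL, following the SDMX 2.1 REST pattern data/{flow}/{key} with startPeriod and endPeriod parameters, is given below. The base URL is a placeholder, the function name is hypothetical, and the flow and key are illustrative values only.

```python
from urllib.parse import urlencode


def sdmx_data_url(base_url, flow, key, start_period=None, end_period=None):
    """Build an SDMX 2.1 REST data URL with a key filter and an optional period filter."""
    params = {}
    if start_period:
        params["startPeriod"] = start_period
    if end_period:
        params["endPeriod"] = end_period
    url = f"{base_url.rstrip('/')}/data/{flow}/{key}"
    if params:
        url += "?" + urlencode(params)
    return url


# Placeholder endpoint; each provider publishes its own base URL
print(sdmx_data_url("https://example.org/sdmx/rest", "EXR",
                    "M.USD.EUR.SP00.A", "2014-01", "2014-12"))
# https://example.org/sdmx/rest/data/EXR/M.USD.EUR.SP00.A?startPeriod=2014-01&endPeriod=2014-12
```

The key is a dot-separated list of dimension values in the order defined by the DSD.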
TODO Others
TODO RDF
  • Resource Description Framework representation: a metadata data model.

A general method for the conceptual description or modeling of information implemented in web resources, using a variety of syntax notations and data serialization formats.

TODO SAX
  • Simple API for XML (faster and uses less memory than DOM)
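
SAX saves memory because the parser reports elements one at a time as events instead of building the whole tree first. A minimal sketch using Python's standard xml.sax module follows. The sample document, the element name Obs and the attribute names are illustrative assumptions and not a real provider file.

```python
import xml.sax


class ObsCollector(xml.sax.ContentHandler):
    """Collect (period, value) pairs from Obs elements as the parser streams them."""

    def __init__(self):
        super().__init__()
        self.observations = []

    def startElement(self, name, attrs):
        # Called once per opening tag; no document tree is kept in memory
        if name == "Obs":
            self.observations.append(
                (attrs.get("TIME_PERIOD"), attrs.get("OBS_VALUE")))


# Illustrative sample, not an actual SDMX-ML message
SAMPLE = (b'<Series>'
          b'<Obs TIME_PERIOD="2013" OBS_VALUE="1.0"/>'
          b'<Obs TIME_PERIOD="2014" OBS_VALUE="2.0"/>'
          b'</Series>')

handler = ObsCollector()
xml.sax.parseString(SAMPLE, handler)
print(handler.observations)
# [('2013', '1.0'), ('2014', '2.0')]
```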
TODO XSLT
  • Extensible Stylesheet Language Transformations, for transforming XML documents
IN-PROGRESS RSS
  • Really Simple Syndication: RSS feeds enable publishers to syndicate data automatically
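
A fetcher could read such a feed to detect newly released datasets. A minimal sketch with the standard library follows. The feed content is invented for illustration and the function name is hypothetical. Only the title and pubDate elements of RSS 2.0 items are read.

```python
import xml.etree.ElementTree as ET
from email.utils import parsedate_to_datetime

# Invented sample feed; a real provider feed would be downloaded first
SAMPLE_FEED = """<rss version="2.0"><channel><title>Releases</title>
<item><title>Dataset A updated</title>
<pubDate>Wed, 08 Jul 2015 11:24:24 GMT</pubDate></item>
<item><title>Dataset B updated</title>
<pubDate>Wed, 01 Apr 2015 00:00:00 GMT</pubDate></item>
</channel></rss>"""


def feed_items(rss_text):
    """Return (title, publication datetime) for every item in an RSS 2.0 feed."""
    root = ET.fromstring(rss_text)
    items = []
    for item in root.iter("item"):
        published = parsedate_to_datetime(item.findtext("pubDate"))
        items.append((item.findtext("title"), published))
    return items


for title, published in feed_items(SAMPLE_FEED):
    print(published.date(), title)
```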

Sources

IN-PROGRESS African Development Bank
IN-PROGRESS Asian Development Bank
Bank for International Settlements
  • CSV, Excel, SDMX-ML, URL of favorite queries
IN-PROGRESS Belgium
IN-PROGRESS Statistics Belgium
IN-PROGRESS Canada
IN-PROGRESS China
IN-PROGRESS National Bureau of Statistics of China
IN-PROGRESS CIRCAB
CME Group
IN-PROGRESS EUROSTAT
  • SDMX (REST, SOAP with ZIP), TSV, DFT (multi-dimensional table) with GZ
  • HTML explorer: online search, including searches for New (data and DSD) and Modified items
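
A minimal sketch of reading a gzip-compressed TSV file is given below. The layout is an assumption about the bulk download format and is not described in this section: the first header cell lists the dimension names followed by a backslash and "time", the remaining header cells are periods, ":" marks a missing value, and a value may be followed by a flag such as "p". The function name is hypothetical.

```python
import csv
import gzip
import io


def read_tsv_gz(gz_bytes):
    """Parse a gzip-compressed TSV table into {dimension key: {period: value}}."""
    text = gzip.decompress(gz_bytes).decode("utf-8")
    reader = csv.reader(io.StringIO(text), delimiter="\t")
    header = next(reader)
    # Assumed first cell: "unit,geo\time" -> dimension names before the backslash
    dim_names = header[0].split("\\")[0].split(",")
    periods = [p.strip() for p in header[1:]]
    rows = {}
    for line in reader:
        key = tuple(line[0].split(","))
        obs = {}
        for period, cell in zip(periods, line[1:]):
            value = cell.strip().split(" ")[0]  # drop a trailing flag such as "p"
            obs[period] = None if value in (":", "") else float(value)
        rows[key] = obs
    return dim_names, rows


# Invented two-line sample in the assumed layout
sample = gzip.compress("unit,geo\\time\t2014 \t2013 \nPC,FR\t1.5 p\t: \n".encode("utf-8"))
print(read_tsv_gz(sample))
```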
IN-PROGRESS FAO
  • SDMX registry and repo, SDMX-ML
  • data.fao.org
Germany
Federal Statistical Office and the Statistical Offices of the Länder
IN-PROGRESS DESTATIS
Greece
Hellenic Statistical Authority (EL.STAT.)
IN-PROGRESS IMF
IN-PROGRESS India
Indian Statistical System official statistics

  • http://164.100.34.62:8080/dwh/

IN-PROGRESS INSEE
  • Banque de données macro-économiques (BDM) (server unavailable?): consult and download more than 170,000 series and indices covering all economic and social fields
IN-PROGRESS ISTAT
IN-PROGRESS Japan
Statistics of Japan
Bank of Japan
  • no API? Only files?
IN-PROGRESS Luxembourg
Central Service for Statistics and Economic Studies (STATEC)
IN-PROGRESS Madagascar
IN-PROGRESS Monaco
National Bureau of Statistics of China
IN-PROGRESS Netherlands
Statistics Netherlands (CBS)
OECD
OpenDataAPI
SDMX-JSON API
IN-PROGRESS Russia
IN-PROGRESS Sweden
Statistics Sweden
IN-PROGRESS South Korea
IN-PROGRESS Switzerland
IN-PROGRESS Federal Statistical Office (FSO)
IN-PROGRESS United Nations
IN-PROGRESS National accounts (Excel files)
Monthly Bulletin of Statistics Online (MBS)
IN-PROGRESS USA
Bureau of Economic Analysis (BEA)
Bureau of Labor Statistics (BLS)
World Bank
IN-PROGRESS General information on API
  • http://data.worldbank.org/node/9
  • RESTful interfaces
  • Indicators (or time series data): API, XML and JSON
  • Projects (or data on the World Bank’s operations): Atom representation
  • World Bank financial data (World Bank Finances API): API, XML, JSON and RDF
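
A minimal sketch of building a request URL for the Indicators interface listed above follows. The base URL and path follow the v2 layout of the public API and should be treated as an assumption to check against the World Bank documentation. The function name is hypothetical and no request is sent.

```python
from urllib.parse import urlencode

# Assumption: v2 layout of the World Bank Indicators API
BASE_URL = "http://api.worldbank.org/v2"


def indicator_url(indicator, country="all", fmt="json", date=None):
    """Build a URL for one indicator, optionally limited to a date range."""
    params = {"format": fmt}
    if date:
        params["date"] = date  # for example "2000:2015"
    # Keep ":" readable in the date range instead of percent-encoding it
    query = urlencode(params, safe=":")
    return f"{BASE_URL}/country/{country}/indicator/{indicator}?{query}"


print(indicator_url("NY.GDP.MKTP.CD", country="AFG", date="2000:2015"))
# http://api.worldbank.org/v2/country/AFG/indicator/NY.GDP.MKTP.CD?format=json&date=2000:2015
```

Passing fmt="xml" selects the XML representation instead of JSON.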

Indices and tables