Data packaging system and utilities
datapm COMMAND [OPTIONS]
datapm (data package manager) is a command line tool and Python library for working with Data Packages and interacting with data hubs such as those powered by CKAN.
about
About datapm
clone src-spec path [format-pattern] [url-pattern]
Download a package (i.e. metadata and resources) specified by src-spec to path
Resources to retrieve are selected interactively if no format-pattern is given. If provided, the optional glob-style format-pattern and url-pattern arguments are matched against the format and url of the resource to determine whether it should be retrieved.
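The pattern matching described above can be sketched in Python with the standard fnmatch module. This is an illustration only, not datapm's actual code: the function name select_resources, the dict layout of a resource, and the example.org URLs are all assumptions, and interactive selection is not modeled.

```python
from fnmatch import fnmatchcase


def select_resources(resources, format_pattern, url_pattern=None):
    """Return resources whose format and url match the glob patterns.

    Hypothetical helper: each resource is assumed to be a dict with
    "format" and "url" keys. A url_pattern of None matches any url.
    """
    selected = []
    for res in resources:
        fmt = res.get("format", "")
        url = res.get("url", "")
        # Compare formats case-insensitively (an assumption, so "CSV" == "csv").
        if not fnmatchcase(fmt.lower(), format_pattern.lower()):
            continue
        if url_pattern is not None and not fnmatchcase(url, url_pattern):
            continue
        selected.append(res)
    return selected


# Made-up resources for illustration.
resources = [
    {"format": "csv", "url": "http://example.org/data/spending.csv"},
    {"format": "json", "url": "http://example.org/data/spending.json"},
    {"format": "csv", "url": "http://mirror.example.net/spending.csv"},
]

csv_only = select_resources(resources, "csv")
from_example_org = select_resources(resources, "*", "http://example.org/*")
```

Here csv_only keeps the two CSV resources, while from_example_org keeps every format but only the URLs under http://example.org/.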
download src-spec path [format-pattern] [url-pattern]
Download a package (i.e. metadata and resources) specified by src-spec to path
Resources to retrieve are selected interactively if no format-pattern is given. If provided, the optional glob-style format-pattern and url-pattern arguments are matched against the format and url of the resource to determine whether it should be retrieved.
dump pkg-spec path-of-resource-within-pkg
Dump contents of specified resource in specified package to stdout.
help
Show available commands
info package-spec [manifest]
Get information about a package (print package metadata). If manifest is specified, show manifest info rather than package metadata.
WARNING: if you change the metadata for a python distribution you may need to rebuild the egg-info for changes to show up here.
init [path-or-name]
Initialize a data package at path. The package name is taken from the last portion of path. If path is simply a name, the package is created in the current directory.
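A minimal sketch of how the package name and target directory could be derived from the init argument, assuming POSIX-style paths. The function resolve_init_target and its cwd parameter are hypothetical and only illustrate the rule stated above.

```python
import os


def resolve_init_target(path_or_name, cwd=None):
    """Return (package_name, target_directory) for an init argument.

    Hypothetical helper: a bare name is resolved against the current
    directory, and the name is the last component of the resulting path.
    """
    cwd = cwd or os.getcwd()
    # os.path.join keeps an absolute path_or_name unchanged.
    target = os.path.normpath(os.path.join(cwd, path_or_name))
    name = os.path.basename(target)
    return name, target


by_name = resolve_init_target("mypkg", cwd="/home/user")
by_path = resolve_init_target("/data/packages/gdp/", cwd="/home/user")
```

With a bare name the package lands under the current directory; with a full path the trailing component becomes the package name.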
license
Show the license
list [index-spec]
List registered packages. If index-spec is not provided, the default index is used.
man
Show the manual
push [source-file] [webstore-url]
Push local package in current directory to remote repository specified in .dpm/config. Alternatively push a single file to the webstore.
register src-spec dest-spec
Register package at src-spec into index at dest-spec.
search index-spec query
Search registered packages in index-spec.
setup action
config [location]: Create configuration file at location. If no location is specified, use the default (see --config).
index [location]: Setup an index at location specified in config.
repo: Setup a repository. The repository will be created at the location specified via the --repository option or default location specified by config.
update src-spec dest-spec
As for register.
upload path upload-spec
Upload a file or package at path to upload-spec. An upload-spec has the form:
upload-dest-id://BUCKET/LABEL
For example:
## default ckan upload
ckan://BUCKET/LABEL
## an s3 upload destination
my-s3://BUCKET/LABEL
## local pairtree
my-pairtree://BUCKET/LABEL
## google storage
my-google-storage://BUCKET/LABEL
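To make the upload-spec form concrete, the following Python sketch splits a spec into its three parts. The names UploadSpec and parse_upload_spec are hypothetical and are not taken from datapm; treating everything after the bucket as the label (including further slashes) is also an assumption.

```python
from typing import NamedTuple


class UploadSpec(NamedTuple):
    dest_id: str  # matches an [upload:dest-id] section in the config file
    bucket: str
    label: str


def parse_upload_spec(spec):
    """Split 'upload-dest-id://BUCKET/LABEL' into its components."""
    dest_id, sep, rest = spec.partition("://")
    if not sep or not dest_id:
        raise ValueError("missing upload-dest-id in %r" % spec)
    bucket, sep, label = rest.partition("/")
    if not sep or not bucket or not label:
        raise ValueError("expected BUCKET/LABEL in %r" % spec)
    return UploadSpec(dest_id, bucket, label)


s3_spec = parse_upload_spec("my-s3://BUCKET/LABEL")
nested = parse_upload_spec("ckan://mybucket/reports/2011.csv")
```

A spec without a label, such as ckan://mybucket, raises ValueError in this sketch rather than guessing a default.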
Upload destinations are specified in your datapm config file and are of the form:
[upload:dest-id]
ofs.backend = s3|google|archive.org|...
## see OFS documentation for a given backend
config-option = config-value
--version
show program's version number and exit
-h, --help
show this help message and exit
-v, --verbose
Give more output
-d, --debug
Print debug output
-q, --quiet
Give less output
--log=FILENAME
Log file where a complete (maximum verbosity) record will be kept
-c CONFIG, --config=CONFIG
Path to config file (if any) - defaults to $HOME/.dpmrc
-r REPOSITORY, --repository=REPOSITORY
Path to repository - overrides value in config
-k API_KEY, --api-key=API_KEY
CKAN API Key (overrides value in config)
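The override behaviour of -r and -k can be sketched with argparse. This is an assumption-laden illustration: datapm's real option parsing may differ, and build_parser, resolve, and the sample values are hypothetical.

```python
import argparse


def build_parser():
    """Build a parser mirroring the options listed above (sketch only)."""
    parser = argparse.ArgumentParser(prog="datapm")
    parser.add_argument("-v", "--verbose", action="store_true", help="Give more output")
    parser.add_argument("-d", "--debug", action="store_true", help="Print debug output")
    parser.add_argument("-q", "--quiet", action="store_true", help="Give less output")
    parser.add_argument("--log", metavar="FILENAME", help="Log file")
    parser.add_argument("-c", "--config", default="~/.dpmrc", help="Path to config file")
    parser.add_argument("-r", "--repository", help="Path to repository")
    parser.add_argument("-k", "--api-key", dest="api_key", help="CKAN API Key")
    return parser


def resolve(cli_value, config_value):
    """A value given on the command line wins over the config file value."""
    return cli_value if cli_value is not None else config_value


args = build_parser().parse_args(["-k", "key-from-cli", "-q"])
api_key = resolve(args.api_key, "key-from-config")
repository = resolve(args.repository, "/home/user/.dpm/repository")
```

Because -r was not passed, repository falls back to the config value, while api_key takes the command line value.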
[dpm]
repo.default_path = $HOME/.dpm/repository
index.default = file

[index:ckan]
ckan.url = http://thedatahub.org/api/
ckan.api_key =

[index:db]
db.dburi = sqlite://$HOME/.datapm/repository/index.db

[upload:ckan]
ofs.backend = reststore
host = http://storage.ckan.net
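The configuration above is INI-style, so it can be read with Python's standard configparser. Whether datapm itself loads its config this way is an assumption; load_config and upload_destinations are hypothetical helpers, and $HOME is left unexpanded here.

```python
import configparser

# The default configuration shown above, one option per line.
SAMPLE_CONFIG = """
[dpm]
repo.default_path = $HOME/.dpm/repository
index.default = file

[index:ckan]
ckan.url = http://thedatahub.org/api/
ckan.api_key =

[index:db]
db.dburi = sqlite://$HOME/.datapm/repository/index.db

[upload:ckan]
ofs.backend = reststore
host = http://storage.ckan.net
"""


def load_config(text):
    """Parse INI text; interpolation is disabled so values are kept verbatim."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    return parser


def upload_destinations(parser):
    """Map each dest-id to the options of its [upload:dest-id] section."""
    prefix = "upload:"
    return {
        section[len(prefix):]: dict(parser[section])
        for section in parser.sections()
        if section.startswith(prefix)
    }


config = load_config(SAMPLE_CONFIG)
destinations = upload_destinations(config)
default_index = config["dpm"]["index.default"]
```

The dest-id keys returned by upload_destinations are the same identifiers used as the scheme of an upload-spec, for example ckan in ckan://BUCKET/LABEL.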
~/.dpmrc
Per user datapm configuration file.
Grabbing some data from an index
datapm index-add file:///....
datapm update
datapm search "military spending"
some-id Military Spending 1890-1914
some-id-2 Military Spending 1890-1914 (normalized)
datapm install some-id
datapm plot some-id
Get two different datasets and use them together
datapm install pkg-a
datapm install pkg-b
datapm create merged
# manual merge
# e.g. PPP, GDP
datapm register my-merged-package
For more information visit the documentation at: http://readthedocs.org/docs/dpm