Command-line interface

You should run all LexMapr commands inside a virtual environment:

$ conda activate LexMapr

Once inside the environment, you must supply the path to your input file:

$ lexmapr some_input_file.csv

Input files can be either CSV or TSV files. Each row must contain, in order, an id and a sample description. The first row is not mapped, so it can be a header. For example, as CSV and then as TSV:

id,sample
0,apple
1,corn
2,potato

id	sample
0	apple
1	corn
2	potato
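Files in this layout can be produced with any CSV writer. The following is a minimal sketch using Python's standard `csv` module. The helper name `write_lexmapr_input` is hypothetical and not part of LexMapr, and choosing the delimiter from the file extension is a convenience of this sketch rather than documented LexMapr behavior.

```python
import csv
from pathlib import Path


def write_lexmapr_input(path, samples):
    """Write (id, description) pairs as a two-column CSV or TSV file.

    Hypothetical helper, not part of LexMapr. The delimiter is picked
    from the file extension as a convenience of this sketch.
    """
    path = Path(path)
    delimiter = "\t" if path.suffix.lower() == ".tsv" else ","
    with path.open("w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle, delimiter=delimiter, lineterminator="\n")
        # Header row: LexMapr does not map the first row.
        writer.writerow(["id", "sample"])
        writer.writerows(samples)
    return path
```

Calling `write_lexmapr_input("some_input_file.csv", [(0, "apple"), (1, "corn"), (2, "potato")])` should produce the CSV example shown above.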

Running LexMapr with only an input file argument will map samples against a limited collection of pre-defined resources. For a list of all command-line options:

$ lexmapr --help

Command-line options

`-o [OUTPUT], --output [OUTPUT]`

Write results to the specified file instead of printing them to the terminal, which is the default. For example:

$ lexmapr some_input_file.csv -o some_output_file.tsv

Output file contents are always in TSV format.

`-f, --full`

Output a more detailed description of sample mappings.

`-c [CONFIG], --config [CONFIG]`

Map samples against ontology terms specified in a config file.

By default, LexMapr maps samples against a limited collection of pre-defined ontology terms. If you specify a config file, you can instruct LexMapr to use ontology terms of your own choosing. For example:

$ lexmapr some_input_file.csv -c some_config_file.json

some_config_file.json:

[
  {
    "http://purl.obolibrary.org/obo/foodon.owl":
      "http://purl.obolibrary.org/obo/BFO_0000001"
  },
  {
    "http://purl.obolibrary.org/obo/envo.owl":
      "http://purl.obolibrary.org/obo/BFO_0000001"
  }
]

Config files must be JSON files that follow a specific format: a single JSON array whose elements are JSON objects, each containing exactly one key-value pair. Each pair specifies a collection of ontology terms you want LexMapr to map your samples against. The key is the IRI of an ontology, and the value is the IRI of a “root term” inside that ontology. When you specify a config file, LexMapr maps your samples against all terms descended from the listed root terms inside the listed ontologies.
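Because the format is strict, it can help to generate or check a config file programmatically. The sketch below uses only Python's standard `json` module. The helper names `validate_config` and `dump_config` are hypothetical and not part of LexMapr, and the check covers only the shape described above, not whether the IRIs resolve.

```python
import json


def validate_config(config):
    """Check the layout described above: a list of single-pair objects.

    Hypothetical helper, not part of LexMapr. It checks only the shape,
    not whether the IRIs point at real ontologies or terms.
    """
    if not isinstance(config, list):
        raise ValueError("config must be a JSON array")
    for entry in config:
        if not isinstance(entry, dict) or len(entry) != 1:
            raise ValueError("each entry must be an object with exactly one pair")
        # Unpack the single (ontology IRI, root term IRI) pair.
        ((ontology_iri, root_iri),) = entry.items()
        if not (isinstance(ontology_iri, str) and isinstance(root_iri, str)):
            raise ValueError("ontology and root term IRIs must be strings")
    return config


def dump_config(pairs):
    """Serialize (ontology IRI, root term IRI) pairs as config JSON text."""
    config = [{ontology_iri: root_iri} for ontology_iri, root_iri in pairs]
    return json.dumps(validate_config(config), indent=2)
```

Passing the two FoodOn and ENVO pairs from the example to `dump_config` yields JSON equivalent to `some_config_file.json` above, which can then be written to disk and passed with `-c`.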

The first time you run LexMapr with a specific config file, it will take some time to fetch all the terms. LexMapr will cache the terms to disk so subsequent runs are much faster.

`--no-cache`

Do not retrieve ontology terms from cache.

If you update a config file after LexMapr has already cached the specified terms, your changes will not be reflected in the mappings. Use this flag to overwrite the cached terms.
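When LexMapr is driven from a script, the re-run after a config change can be assembled from the flags described so far. This is a rough sketch: `build_lexmapr_command` is a hypothetical helper, not part of LexMapr, and it only builds the argument list without running anything.

```python
def build_lexmapr_command(input_path, output=None, config=None, no_cache=False):
    """Assemble a lexmapr argument list from the -o, -c and --no-cache options.

    Hypothetical helper, not part of LexMapr. It builds the command only
    and does not execute it.
    """
    command = ["lexmapr", str(input_path)]
    if output is not None:
        command += ["-o", str(output)]
    if config is not None:
        command += ["-c", str(config)]
    if no_cache:
        # Ignore previously cached terms, e.g. after editing the config file.
        command.append("--no-cache")
    return command
```

The resulting list can be passed to `subprocess.run(command, check=True)` from inside the activated environment.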

`-p {ifsac}, --profile {ifsac}`

Map your samples against a specific profile.

Profiles are LexMapr extensions developed for third-party organizations that wish to map samples against their own classification schemes.

Currently, the only profile available is for IFSAC:

$ lexmapr some_input_file.csv -p ifsac

some_input_file.csv:

id,sample
0,potato
1,fish fillet
2,wheat

ifsac_output.tsv:

Sample_Id	Sample_Desc	Cleaned_Sample	Matched_Components	Third Party Classification
0	potato	potato	['potato (whole, raw):foodon_03301449']	['root/underground (tubers)']
1	fish fillet	fish fillet	['fish fillet:foodon_00002679']	['fish']
2	wheat	wheat	['wheat:foodon_03315184']	['grains']

By default, the IFSAC profile outputs to ifsac_output.tsv.
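The output can be post-processed with standard tools. The sketch below reads the TSV with Python's `csv` module and converts the bracketed columns with `ast.literal_eval`. It assumes those columns are Python-style list literals, as they appear in the sample above, and `read_ifsac_output` is a hypothetical helper, not part of LexMapr.

```python
import ast
import csv


def read_ifsac_output(handle):
    """Parse IFSAC-profile TSV rows into dictionaries.

    Assumes the list-like columns are Python-style list literals, as in
    the sample output above. Other output may need different handling.
    """
    rows = []
    for row in csv.DictReader(handle, delimiter="\t"):
        for column in ("Matched_Components", "Third Party Classification"):
            # Turn "['fish']" into the Python list ['fish'].
            row[column] = ast.literal_eval(row[column])
        rows.append(row)
    return rows
```

Open the file with `open("ifsac_output.tsv", newline="", encoding="utf-8")` and pass the handle to `read_ifsac_output` to get one dictionary per sample.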
