Ontologies are semantic representations of reality in which entities represent universals or classes linked by subtype relations. That is to say, ontologies represent things that exist, arranged in hierarchies according to specific architectural frameworks. GenEpiO is being built according to the principles of the Open Biomedical Ontologies (OBO) Foundry (http://www.obofoundry.org/), a community of scientists committed to developing interoperable semantic resources for common use through collaborative development, as a means of coordinating best practices. The OBO Foundry encourages the use of common relations (specified by the Relation Ontology, http://www.obofoundry.org/ontology/ro.html) and syntax, Aristotelian (simplified) definitions and good documentation. Ontology development can be summarized in four general steps:
- Demarcation of the nature and scope of the subject matter. Each new ontology should be independent of (orthogonal to) other domains of knowledge.
- Gathering of information. To gather terms with which to populate the ontology, domain experts, textbooks, community leaders and similar sources should be consulted in order to ensure maximum consensus about vocabulary usage. It is also useful to identify terms whose usage is inconsistent. Every term should be given a unique identifier in the form of a Uniform Resource Identifier (URI), and existing terms (and their identifiers) from other ontologies should be reused. Each term should be given a simple definition that captures its essential features, together with references supporting the described usage.
- Ordering of terms in a hierarchy. Every term should have a single parent. Careful ordering of terms in a hierarchy, from the root down to the most specific terms, is crucial to ensure coherence. Relationships between terms are built on an “is_a” relation backbone. The Basic Formal Ontology (BFO) outlines how knowledge within domains should be organized, while the Relation Ontology (RO) outlines ways in which classes and individuals can be related to one another.
- Formalization of the ontology in a computer-usable language, e.g. OWL or RDF, so that it can be implemented as a computable framework. Ontologies are commonly encoded in ontology languages such as OWL in order to standardize syntax.
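The steps above can be illustrated with a minimal Python sketch. It is not GenEpiO's actual tooling: the `Term` and `Ontology` classes, the `http://example.org/onto/` namespace and the example terms are all hypothetical, and definitions are written out as `rdfs:comment` purely for simplicity (a real OBO ontology would normally use a dedicated definition annotation property). The sketch only shows the idea of unique URIs, a single-parent is_a backbone, and formalization as OWL classes in Turtle syntax.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

# Hypothetical namespace used only for illustration; not a real GenEpiO IRI.
BASE = "http://example.org/onto/"


@dataclass(frozen=True)
class Term:
    uri: str                # unique identifier (URI) of the term
    label: str              # human-readable name
    definition: str         # simple definition capturing essential features
    parent: Optional[str]   # URI of the single is_a parent; None for the root


class Ontology:
    def __init__(self) -> None:
        self.terms: Dict[str, Term] = {}

    def add(self, term: Term) -> None:
        """Register a term, enforcing unique URIs and a known single parent."""
        if term.uri in self.terms:
            raise ValueError(f"duplicate identifier: {term.uri}")
        if term.parent is not None and term.parent not in self.terms:
            raise ValueError(f"unknown parent: {term.parent}")
        self.terms[term.uri] = term

    def ancestors(self, uri: str) -> List[str]:
        """Follow is_a links from a term up to the root."""
        chain: List[str] = []
        parent = self.terms[uri].parent
        while parent is not None:
            chain.append(parent)
            parent = self.terms[parent].parent
        return chain

    def to_turtle(self) -> str:
        """Serialize the terms as OWL classes in Turtle syntax.

        Labels and definitions are assumed to contain no double quotes.
        """
        lines = [
            "@prefix owl: <http://www.w3.org/2002/07/owl#> .",
            "@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .",
            "",
        ]
        for t in self.terms.values():
            lines.append(f"<{t.uri}> a owl:Class ;")
            if t.parent is not None:
                lines.append(f"    rdfs:subClassOf <{t.parent}> ;")
            lines.append(f'    rdfs:comment "{t.definition}" ;')
            lines.append(f'    rdfs:label "{t.label}" .')
        return "\n".join(lines)


if __name__ == "__main__":
    onto = Ontology()
    onto.add(Term(BASE + "EX_0000001", "sample", "Material collected for analysis.", None))
    onto.add(Term(BASE + "EX_0000002", "clinical sample",
                  "A sample collected from a patient.", BASE + "EX_0000001"))
    print(onto.to_turtle())
```

Because a parent must already be registered before a child can be added, the hierarchy in this sketch cannot contain cycles, and each term has exactly one is_a parent.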
GenEpiO is an application ontology and so reuses terminology and associated unique identifiers from many existing OBO Foundry ontologies. GenEpiO is encoded in OWL and can be edited using the free Protégé ontology editor produced by Stanford University (http://protege.stanford.edu/). To identify terms in existing ontologies, lookup services such as OLS (https://www.ebi.ac.uk/ols/index), Ontobee (http://www.ontobee.org/) and NCBO BioPortal (http://bioportal.bioontology.org/) can be used.
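As a rough illustration of how a term lookup might be scripted, the sketch below builds a search request and pulls candidate terms out of the JSON reply. The endpoint `https://www.ebi.ac.uk/ols4/api/search`, the `q` and `rows` parameters, and the response fields (`response.docs`, `label`, `iri`, `ontology_name`) are assumptions about the OLS search service and should be checked against the current OLS documentation. The function names are hypothetical.

```python
import json
import urllib.parse
import urllib.request
from typing import Dict, List

# Assumed endpoint; verify against the current OLS documentation before use.
OLS_SEARCH_URL = "https://www.ebi.ac.uk/ols4/api/search"


def build_search_url(query: str, rows: int = 5, base_url: str = OLS_SEARCH_URL) -> str:
    """Build a search URL for a free-text term query."""
    params = urllib.parse.urlencode({"q": query, "rows": rows})
    return f"{base_url}?{params}"


def extract_candidates(payload: dict) -> List[Dict[str, str]]:
    """Pull label, IRI and source ontology out of an (assumed) search reply."""
    docs = payload.get("response", {}).get("docs", [])
    return [
        {
            "label": d.get("label", ""),
            "iri": d.get("iri", ""),
            "ontology": d.get("ontology_name", ""),
        }
        for d in docs
    ]


def search_terms(query: str, rows: int = 5) -> List[Dict[str, str]]:
    """Query the service over the network and return candidate terms."""
    with urllib.request.urlopen(build_search_url(query, rows), timeout=30) as resp:
        return extract_candidates(json.load(resp))
```

Candidates returned this way still need to be reviewed by a curator: the label, definition and source ontology should match the intended usage before the existing identifier is reused.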
The Genomic Epidemiology Application Ontology is being constructed using a bottom-up approach to ensure practicality. This approach focuses on standardizing the terms needed to carry out the laboratory, clinical and epidemiological workflows of outbreak investigations, as well as other infectious disease research and interventions that employ diagnostic whole genome sequencing.