The RDF model of the Gene Ontology, demystified

The Gene Ontology (GO) is a controlled vocabulary of terms describing genes and gene products. It facilitates the integration of biological and biomedical data and is widely used in bioinformatics.

The ontology defines terms (GO terms) and relationships between those terms. The terms can be thought of as tags assigned to gene products, and a single gene product can be annotated with multiple GO terms.

Gene Ontology in RDF

The GO terms and relationships between them form a graph which can be described in RDF.

In the RDF representation of the Gene Ontology, each term has an IRI of the form <http://purl.obolibrary.org/obo/GO_XXXXXXX>, where XXXXXXX is the term's seven-digit identifier. For example, the term "mitochondrion inheritance" is represented by the node <http://purl.obolibrary.org/obo/GO_0000001> (obo:GO_0000001 when abbreviated with the obo: prefix). Here is an excerpt of the ontology showing information about obo:GO_0000001:

@prefix obo: <http://purl.obolibrary.org/obo/> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .

obo:GO_0000001 a owl:Class ;
                 rdfs:subClassOf obo:GO_0048308 ,
                                 obo:GO_0048311 ;
                 obo:IAO_0000115 "The distribution of mitochondria, including the mitochondrial genome, into daughter cells after mitosis or meiosis, mediated by interactions between mitochondria and the cytoskeleton." ;
                 rdfs:label "mitochondrion inheritance" .
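
GO terms are commonly cited by identifiers such as GO:0000001, while the RDF graph uses full IRIs. Converting between the two forms can be sketched in Python as follows. This is a minimal sketch, assuming the seven-digit pattern described above; the function names are illustrative and not part of any GO library.

```python
import re

OBO_BASE = "http://purl.obolibrary.org/obo/"
# Accepts both the "GO:0000001" and the "GO_0000001" spelling.
_GO_ID = re.compile(r"GO[:_](\d{7})")


def go_id_to_iri(go_id):
    """Turn 'GO:0000001' (or 'GO_0000001') into the full term IRI."""
    match = _GO_ID.fullmatch(go_id.strip())
    if match is None:
        raise ValueError("not a GO identifier: %r" % go_id)
    return OBO_BASE + "GO_" + match.group(1)


def iri_to_go_id(iri):
    """Turn a GO term IRI (with or without angle brackets) into 'GO:0000001'."""
    bare = iri.strip().strip("<>")
    if not bare.startswith(OBO_BASE):
        raise ValueError("not an OBO IRI: %r" % iri)
    match = _GO_ID.fullmatch(bare[len(OBO_BASE):])
    if match is None:
        raise ValueError("not a GO term IRI: %r" % iri)
    return "GO:" + match.group(1)
```

IRIs from other ontologies under the same base, such as the BFO classes, are rejected by iri_to_go_id rather than silently converted.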

Running SPARQL queries against the Gene Ontology data

The Gene Ontology project provides a SPARQL query interface, powered by Blazegraph, that makes it possible to query the GO RDF dataset directly in the browser. For example, the following query returns all superclasses of mitochondrion inheritance together with their labels:

PREFIX obo: <http://purl.obolibrary.org/obo/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?class ?classLabel
WHERE {
  obo:GO_0000001 rdfs:subClassOf ?class .
  ?class rdfs:label ?classLabel .
}

Query result:

class                                         classLabel
<http://purl.obolibrary.org/obo/BFO_0000003>  "occurrent"
<http://purl.obolibrary.org/obo/BFO_0000015>  "process"
<http://purl.obolibrary.org/obo/GO_0008150>   "biological_process"
<http://purl.obolibrary.org/obo/GO_0000001>   "mitochondrion inheritance"
<http://purl.obolibrary.org/obo/GO_0007005>   "mitochondrion organization"
<http://purl.obolibrary.org/obo/GO_0009987>   "cellular process"
<http://purl.obolibrary.org/obo/GO_0016043>   "cellular component organization"
<http://purl.obolibrary.org/obo/GO_0048308>   "organelle inheritance"
<http://purl.obolibrary.org/obo/GO_0048311>   "mitochondrion distribution"
<http://purl.obolibrary.org/obo/GO_0006996>   "organelle organization"
<http://purl.obolibrary.org/obo/GO_0051179>   "localization"
<http://purl.obolibrary.org/obo/GO_0051640>   "organelle localization"
<http://purl.obolibrary.org/obo/GO_0051641>   "cellular localization"
<http://purl.obolibrary.org/obo/GO_0051646>   "mitochondrion localization"
<http://purl.obolibrary.org/obo/GO_0071840>   "cellular component organization or biogenesis"
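
A superclass query of this kind can also be sent from a script instead of the browser. The following is a minimal sketch using only the Python standard library, assuming the endpoint follows the SPARQL 1.1 Protocol and can return the standard SPARQL JSON results format. ENDPOINT is a placeholder, not the actual address of the GO service, and the function names are illustrative.

```python
import json
import urllib.parse
import urllib.request

# Placeholder (assumption): replace with the address of the endpoint you use.
ENDPOINT = "https://example.org/sparql"

QUERY = """\
PREFIX obo: <http://purl.obolibrary.org/obo/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?class ?classLabel
WHERE {
  obo:GO_0000001 rdfs:subClassOf ?class .
  ?class rdfs:label ?classLabel .
}
"""


def build_request(endpoint, query):
    """Build a SPARQL Protocol GET request that asks for JSON results."""
    url = endpoint + "?" + urllib.parse.urlencode({"query": query})
    return urllib.request.Request(
        url, headers={"Accept": "application/sparql-results+json"}
    )


def parse_rows(payload):
    """Extract (class IRI, label) pairs from a SPARQL JSON results document."""
    bindings = json.loads(payload)["results"]["bindings"]
    return [(b["class"]["value"], b["classLabel"]["value"]) for b in bindings]


def run_query(endpoint=ENDPOINT, query=QUERY):
    """Send the query over HTTP and return the parsed rows."""
    with urllib.request.urlopen(build_request(endpoint, query)) as response:
        return parse_rows(response.read().decode("utf-8"))
```

Only run_query touches the network; parse_rows can be tried offline on a saved response.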

Bottom line

As an example of Linked Data, the Gene Ontology RDF graph is a step towards more efficient sharing and reuse of biological information. While the GO terms provide a common language for annotating genes, RDF enables the interoperability and integration of those annotations with other datasets and ontologies.
