An ontology of a given domain of interest is a formal representation of knowledge that provides a vocabulary for describing the domain of interest, annotations that explain and clarify each term in the vocabulary, and a logical theory (consisting of axioms and definitions) for the vocabulary, enabling effective modelling and validation of ontologies [1].
Ontology research is a multidisciplinary field that draws upon knowledge from information organization, natural language processing, information extraction, artificial intelligence, and knowledge representation and acquisition. In recent years, ontologies have gained popularity as an emerging technology for improving information organization, management, and understanding. They are shared between communities and application systems and have a significant impact on areas with vast amounts of heterogeneous computer-based information, such as the World Wide Web, industrial software applications, knowledge management, and electronic records.
In the Semantic Web, ontologies can be seen as metadata that explicitly represent semantics in a machine-processable way. Ontology-based reasoning services can apply these semantics and support tasks such as consistency checking, query answering, and subsumption reasoning [2]. Due to their popularity, they are represented online in various languages (e.g., OWL, RDF, XML, FOL); each one facilitates interoperability in distributed environments. The availability of multi-format releases facilitates the application of ontologies within microservices, triplestores, rule engines, and data-integration workflows, eliminating the need for additional conversion processes [3].
In this project, we introduce GBO (Gut-Brain Ontology), a structured knowledge base designed to capture the interactions along the gut-brain axis. By formalizing both anatomical and microbiota relationships, GBO enables systematic exploration of how gut microbiota, metabolites, and immune signals influence neurological health and disease. Our primary focus is a broad spectrum of disorders, ranging from Amyotrophic Lateral Sclerosis, Multiple Sclerosis, Parkinson's disease, Alzheimer's disease, and Frontotemporal Dementia to stroke, diabetes, obesity, and neuropsychiatric conditions such as depression and ADHD, each of which is connected to gut-brain interaction. By integrating clinical phenotypes, experimental findings, and molecular pathways into a unified and consistent ontology, GBO will help researchers to discover new risk factors, predict treatment responses, and raise public awareness of the critical role the gut-brain axis plays in human health.
Linked Open Data (LOD) exposes open, external sources that can be reused. To design our ontology, we searched for existing ontologies and vocabularies to maximize reuse, interoperability, and scientific consistency. Because the gut-brain axis encompasses microbiology, neuroscience, and clinical medicine, no single resource covers the entire conceptual space; instead, the web provides a set of specialized ontologies, vocabularies, and thesauri for specific domains.
We now identify the knowledge bases used to represent individuals in our knowledge graph, along with a brief description.
We also employed the Mondo Disease Ontology (Mondo), which describes specific disorders and diseases belonging to the DDF category.
To keep GBO consistent, we followed the principles below when defining classes and properties. These guidelines cover external references, annotation properties, metadata, and naming conventions.
GBO uses the schema namespace https://w3id.org/hereditary/ontology/schema/ and the resource namespace https://w3id.org/hereditary/ontology/resource/. All URIs that correspond to classes, object properties, and data properties belong to the first namespace, while instances belong to the second one. In addition to these, GBO also contains other elements with different namespaces, such as:

- SKOS Concept.
- skos:exactMatch, coming from the SKOS namespace, which is used only for classes. Since the classes have been defined using the GBO namespace, we added this property to ensure consistency by referring to the same class in an external knowledge base.

During ontology modelling, it is common to use external taxonomies to ensure terminology consistency, especially in the biomedical field. Despite the large number of comprehensive medical ontologies, reusing them entails significant costs that may outweigh the costs of a new implementation. In GBO, external sources have been employed to model humans, species, diseases, disorders, and lab techniques. What these elements have in common is that we are interested in the main idea behind the medical term. Modelling an external hierarchy as a hierarchy of ontological classes is a typical pattern in ontologies, known as classism. To avoid it, we model each such term as a named individual that belongs to two classes: SKOS Concept and the corresponding class in GBO.
The use of SKOS has two main advantages: first, it saves space by reducing the number of required URIs (e.g., when defining more classes); second, it reduces the complexity of the queries. The main drawback is that it prevents attaching class-specific characteristics directly to those classes. We adopted this pattern for all the class instances, except for Paper, PaperCollection, PaperTitle, PaperAbstract, Mention, and Sentence.
As the primary reference, we consulted other well-known resources. Individuals that were not present in external ontologies were defined in our own namespace, GBO.
In GBO, we introduce dedicated classes for structuring the primary literature: Paper, Mention, and Sentence. They are central components for organizing research articles and their content.
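The dual-typing pattern can be sketched in plain Python, with triples held as tuples. This is a minimal illustration under stated assumptions, not the GBO build code: the helper names `dual_typed_individual` and `class_mapping`, the local name `Escherichia_coli`, and the `example.org` IRI are hypothetical, and the two namespaces are taken from the prefix table of this document.

```python
# Minimal sketch of the SKOS dual-typing pattern (helper names are assumed).
SCHEMA = "https://w3id.org/hereditary/ontology/gutbrain/schema/"
RESOURCE = "https://w3id.org/hereditary/ontology/gutbrain/resource/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
SKOS = "http://www.w3.org/2004/02/skos/core#"


def dual_typed_individual(local_name, gbo_class):
    """Type one named individual as both skos:Concept and a GBO class."""
    subject = RESOURCE + local_name
    return [
        (subject, RDF_TYPE, SKOS + "Concept"),
        (subject, RDF_TYPE, SCHEMA + gbo_class),
    ]


def class_mapping(gbo_class, external_iri):
    """Link a GBO class to the same class in an external knowledge base."""
    return (SCHEMA + gbo_class, SKOS + "exactMatch", external_iri)


# Hypothetical example values, for illustration only.
triples = dual_typed_individual("Escherichia_coli", "Species")
triples.append(class_mapping("Species", "http://example.org/external/Species"))

for s, p, o in triples:
    print(f"<{s}> <{p}> <{o}> .")
```

Because the external term is an individual rather than a class, no new class URI is needed for it, which is the space saving the text refers to.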
The Paper class is designed to represent a general biomedical article within the ontology. Each paper has several key components:
- xsd:integer.
- rdf:PlainLiteral.
- xsd:integer.
- rdf:PlainLiteral.

Each Paper is linked to one or more Sentence instances via the partOf property. Sentences are extracted from the paper's title and abstract.
The Mention class captures every entity mention of an ontology individual within a paper's title or abstract. This establishes a connection between the paper and the individual found in it. Each mention has two main components:

- hasMentionText: captures the text's exact position inside a specific sentence.
- taggedAs: identifies the label of the category associated with that text span (e.g., human, anatomical location, DDF).

During the data ingestion process, after the matching step, the entity mentions are created using the raw text, and the individual is connected with them via the containedIn predicate. This relationship ensures that each term is traceable back to its exact textual source.
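As a rough sketch of how one entity mention could be turned into triples, the following plain-Python example uses the property names from the text (hasMentionText, taggedAs, containedIn). Everything else is an assumption made for illustration: the helper `mention_triples`, the identifiers `mention_0` and `Parkinsons_disease`, the example sentence, and the direction of containedIn (from the individual to the mention).

```python
# Minimal sketch: triples for one entity mention (identifiers are hypothetical).
SCHEMA = "https://w3id.org/hereditary/ontology/gutbrain/schema/"
RESOURCE = "https://w3id.org/hereditary/ontology/gutbrain/resource/"


def mention_triples(mention_id, sentence, start, end, label, individual):
    """Build the triples for the text span sentence[start:end]."""
    mention = RESOURCE + mention_id
    text = sentence[start:end]  # raw text span, copied verbatim
    return [
        (mention, SCHEMA + "hasMentionText", text),
        (mention, SCHEMA + "taggedAs", label),
        # Assumed direction: the matched individual points to the mention.
        (RESOURCE + individual, SCHEMA + "containedIn", mention),
    ]


sentence = "Gut microbiota influence Parkinson's disease."
span = "Parkinson's disease"
start = sentence.index(span)
triples = mention_triples(
    "mention_0", sentence, start, start + len(span), "DDF", "Parkinsons_disease"
)

for triple in triples:
    print(triple)
```

Keeping the character offsets alongside the raw text is what makes each term traceable back to its source sentence.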
We assign sent_id 0 to the title sentence, then sequentially index the sentences of the abstract. Finally, we link each sentence to its corresponding entity mentions using the locatedIn property. Each sentence has two main properties:

- hasSentenceText: shows the raw sentence text.
- partOf: connects the sentence to its title or abstract.

Figure 1 illustrates how Paper, Sentence, and Mention interconnect to capture both document structure and the text occurrences of biomedical concepts.
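The sentence indexing scheme (sent_id 0 for the title, then consecutive ids for the abstract) can be illustrated as follows. This is only a sketch: the function `index_sentences` and the example text are hypothetical, and the regular-expression splitter is a naive stand-in for whatever sentence segmentation the actual pipeline uses.

```python
import re


def index_sentences(title, abstract):
    """Return (sent_id, part, text) tuples: title first, then the abstract."""
    indexed = [(0, "title", title.strip())]
    # Naive splitter (assumption): break after ., ! or ? followed by whitespace.
    parts = re.split(r"(?<=[.!?])\s+", abstract.strip())
    sentences = [p for p in parts if p]
    for sent_id, text in enumerate(sentences, start=1):
        indexed.append((sent_id, "abstract", text))
    return indexed


rows = index_sentences(
    "Gut microbiota and the brain",
    "The gut hosts many microbes. Some produce metabolites. These may reach the brain.",
)
for sent_id, part, text in rows:
    print(sent_id, part, text)
```

The `part` field corresponds to the target of partOf: it records whether a sentence belongs to the title or to the abstract.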
To support high-quality semantic connections of gut-brain interactions, GBO defines a set of fourteen distinct entity types, which are crucial for annotating biomedical entities across the dataset. These categories serve as a semantic foundation for analyzing relationships within the gut-brain axis. Each entity type is associated with a unique URI that links it to a standardized concept in an external knowledge base and is accompanied by an explanation that defines its scope and semantic meaning. All category instances are linked to external identifiers whenever possible or to our namespace.
A relevant aspect of our category modeling is the fine-grained representation of bacterial taxa. We treat Bacteria as a general skos:Concept corresponding to the Family level. At the same time, each expert-annotated bacterial entity mention is instantiated at the Species level, our ontology's most specific rank. During the data ingestion process, any species that cannot be matched to an external ontology is automatically linked back to the Bacteria skos:Concept, ensuring all entity mentions are mapped.
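The fallback rule for unmatched species can be sketched as a simple lookup with a default. The function `resolve_species`, the local name `Bacteria` in the resource namespace, and the contents of `external_index` are assumptions made for illustration; the real matching step may be more elaborate than a case-insensitive dictionary lookup.

```python
RESOURCE = "https://w3id.org/hereditary/ontology/gutbrain/resource/"
# Hypothetical IRI for the general Bacteria skos:Concept.
BACTERIA_CONCEPT = RESOURCE + "Bacteria"


def resolve_species(mention_text, external_index):
    """Return the external IRI for a species, or the Bacteria fallback."""
    key = mention_text.strip().lower()
    return external_index.get(key, BACTERIA_CONCEPT)


# Hypothetical external index: normalized species name -> external IRI.
external_index = {"escherichia coli": "http://example.org/taxonomy/escherichia-coli"}

matched = resolve_species("Escherichia coli", external_index)
unmatched = resolve_species("Unknownus bacterium", external_index)
print(matched)
print(unmatched)
```

With this default in place, every bacterial mention resolves to some IRI, so no mention is left unmapped.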
To make GBO more complete and consistent, we included additional categories specific to the gut-brain axis, such as Assay, Intervention or Procedure, Metabolite, and Specimen, although we did not use them actively in this work.
GBO utilizes seventeen predicates, many of which are overloaded, meaning that the same predicate can link multiple combinations of head and tail entity types, resulting in over 50 possible (head, predicate, tail) triples. Figure 2 shows the Category ontology schema.
Moreover, Figure 3 presents the complete GBO, merging the document classes (Paper, Sentence, Mention) with the full set of entity categories and relations.
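Predicate overloading, where one predicate is valid for several (head, tail) type combinations, can be sketched as a set of allowed signatures. The combinations listed in `ALLOWED` are hypothetical examples chosen for illustration and do not reproduce the actual GBO schema; only the predicate names locatedIn and influence are taken from the ontology.

```python
# Hypothetical (head type, predicate, tail type) signatures, for illustration.
ALLOWED = {
    ("bacteria", "locatedIn", "human"),
    ("bacteria", "locatedIn", "anatomical location"),
    ("bacteria", "influence", "DDF"),
    ("chemical", "influence", "DDF"),
}


def is_valid(head, predicate, tail):
    """Check whether a typed triple matches one of the allowed signatures."""
    return (head, predicate, tail) in ALLOWED


def signatures(predicate):
    """List every (head, tail) combination that a predicate can link."""
    return sorted((h, t) for h, p, t in ALLOWED if p == predicate)


print(is_valid("bacteria", "locatedIn", "human"))
print(is_valid("human", "locatedIn", "bacteria"))
print(signatures("influence"))
```

Storing signatures rather than bare predicates lets a validator reject a triple whose predicate exists but whose head and tail types are not an allowed combination.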
| Prefix | Namespace IRI |
| --- | --- |
| [Ontology NS Prefix] | <https://w3id.org/hereditary/ontology/gutbrain/schema/> |
| [Ontology Individual Prefix] | <https://w3id.org/hereditary/ontology/gutbrain/resource/> |
| dc | <http://purl.org/dc/elements/1.1/> |
| obo | <http://purl.obolibrary.org/obo/> |
| orcid | <https://orcid.org/0000-0002-0676-682> |
| owl | <http://www.w3.org/2002/07/owl#> |
| rdf | <http://www.w3.org/1999/02/22-rdf-syntax-ns#> |
| rdfs | <http://www.w3.org/2000/01/rdf-schema#> |
| schema | <https://w3id.org/brainteaser/ontology/schema/> |
| skos | <http://www.w3.org/2004/02/skos/core#> |
| terms | <http://purl.org/dc/terms/> |
| xml | <http://www.w3.org/XML/1998/namespace> |
| xsd | <http://www.w3.org/2001/XMLSchema#> |
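
The prefix table can be used to expand a compact name such as skos:exactMatch into a full IRI. The sketch below copies a subset of the prefixes from the table; the function name `expand` and its error handling are assumptions made for this example.

```python
# Subset of the prefix table above, as a prefix -> namespace mapping.
PREFIXES = {
    "dc": "http://purl.org/dc/elements/1.1/",
    "owl": "http://www.w3.org/2002/07/owl#",
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "skos": "http://www.w3.org/2004/02/skos/core#",
    "xsd": "http://www.w3.org/2001/XMLSchema#",
}


def expand(curie):
    """Expand a prefixed name (prefix:local) into a full IRI."""
    prefix, sep, local = curie.partition(":")
    if not sep or prefix not in PREFIXES:
        raise KeyError(f"unknown prefix in {curie!r}")
    return PREFIXES[prefix] + local


print(expand("skos:exactMatch"))
print(expand("dc:creator"))
print(expand("xsd:integer"))
```

The first two expansions yield the same IRIs that appear for exactMatch and creator in the IRI listing of this document.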
IRI: https://w3id.org/hereditary/ontology/gutbrain/schema/Animal
IRI: https://w3id.org/hereditary/ontology/genomics/schema/Assay
IRI: https://w3id.org/hereditary/ontology/gutbrain/schema/BiomedicalTechnique
IRI: https://w3id.org/hereditary/ontology/gutbrain/schema/Chemical
IRI: https://w3id.org/hereditary/ontology/gutbrain/schema/Class
IRI: https://w3id.org/hereditary/ontology/gutbrain/schema/DietarySupplement
IRI: https://w3id.org/hereditary/ontology/gutbrain/schema/Domain
IRI: https://w3id.org/hereditary/ontology/gutbrain/schema/Drug
IRI: https://w3id.org/hereditary/ontology/gutbrain/schema/ExperimentalResult
IRI: https://w3id.org/hereditary/ontology/gutbrain/schema/Family
IRI: https://w3id.org/hereditary/ontology/gutbrain/schema/Food
IRI: https://w3id.org/hereditary/ontology/gutbrain/schema/Genome
IRI: https://w3id.org/hereditary/ontology/gutbrain/schema/Genus
IRI: https://w3id.org/hereditary/ontology/gutbrain/schema/Human
IRI: https://w3id.org/hereditary/ontology/gutbrain/schema/Mention
IRI: https://w3id.org/hereditary/ontology/gutbrain/schema/Metabolite
IRI: https://w3id.org/hereditary/ontology/gutbrain/schema/Microbiome
IRI: https://w3id.org/hereditary/ontology/gutbrain/schema/Microbiota
IRI: https://w3id.org/hereditary/ontology/gutbrain/schema/Neurotransmitter
IRI: https://w3id.org/hereditary/ontology/gutbrain/schema/Order
IRI: https://w3id.org/hereditary/ontology/gutbrain/schema/Organism
IRI: https://w3id.org/hereditary/ontology/gutbrain/schema/Paper
IRI: https://w3id.org/hereditary/ontology/gutbrain/schema/PaperAbstract
IRI: https://w3id.org/hereditary/ontology/gutbrain/schema/PaperCollection
IRI: https://w3id.org/hereditary/ontology/gutbrain/schema/PaperTitle
IRI: https://w3id.org/hereditary/ontology/gutbrain/schema/Phylum
IRI: https://w3id.org/hereditary/ontology/gutbrain/schema/Sentence
IRI: https://w3id.org/hereditary/ontology/gutbrain/schema/Species
IRI: https://w3id.org/hereditary/ontology/gutbrain/schema/Specimen
IRI: https://w3id.org/hereditary/ontology/gutbrain/schema/SpecimenCollectionProcess
IRI: https://w3id.org/hereditary/ontology/gutbrain/schema/StatisticalTechnique
IRI: https://w3id.org/hereditary/ontology/gutbrain/schema/SuperKingdom
IRI: https://w3id.org/hereditary/ontology/gutbrain/schema/administered
IRI: https://w3id.org/hereditary/ontology/gutbrain/schema/affect
IRI: https://w3id.org/hereditary/ontology/gutbrain/schema/changeAbundance
IRI: https://w3id.org/hereditary/ontology/gutbrain/schema/changeEffect
IRI: https://w3id.org/hereditary/ontology/gutbrain/schema/changeExpression
IRI: https://w3id.org/hereditary/ontology/gutbrain/schema/comparedTo
IRI: https://w3id.org/hereditary/ontology/gutbrain/schema/composedOf
IRI: https://w3id.org/hereditary/ontology/gutbrain/schema/containedIn
IRI: https://w3id.org/hereditary/ontology/gutbrain/schema/contains
IRI: https://w3id.org/hereditary/ontology/gutbrain/schema/hasAbstract
IRI: https://w3id.org/hereditary/ontology/gutbrain/schema/hasTitle
IRI: https://w3id.org/hereditary/ontology/gutbrain/schema/impact
IRI: https://w3id.org/hereditary/ontology/gutbrain/schema/influence
IRI: https://w3id.org/hereditary/ontology/gutbrain/schema/interact
IRI: https://w3id.org/hereditary/ontology/gutbrain/schema/isA
IRI: https://w3id.org/hereditary/ontology/gutbrain/schema/isLinkedTo
IRI: https://w3id.org/hereditary/ontology/gutbrain/schema/locatedIn
IRI: https://w3id.org/hereditary/ontology/gutbrain/schema/partOf
IRI: https://w3id.org/hereditary/ontology/gutbrain/schema/producedBy
IRI: https://w3id.org/hereditary/ontology/gutbrain/schema/strike
IRI: https://w3id.org/hereditary/ontology/gutbrain/schema/target
IRI: https://w3id.org/hereditary/ontology/gutbrain/schema/treatedBy
IRI: https://w3id.org/hereditary/ontology/gutbrain/schema/usedBy
IRI: https://w3id.org/hereditary/ontology/gutbrain/schema/hasAbstractText
IRI: https://w3id.org/hereditary/ontology/gutbrain/schema/hasMentionText
IRI: https://w3id.org/hereditary/ontology/gutbrain/schema/hasSentenceText
IRI: https://w3id.org/hereditary/ontology/gutbrain/schema/hasTitleText
IRI: https://w3id.org/hereditary/ontology/gutbrain/schema/paperAnnotator
IRI: https://w3id.org/hereditary/ontology/gutbrain/schema/paperAuthor
IRI: https://w3id.org/hereditary/ontology/gutbrain/schema/paperId
IRI: https://w3id.org/hereditary/ontology/gutbrain/schema/paperJournal
IRI: https://w3id.org/hereditary/ontology/gutbrain/schema/paperYear
IRI: https://w3id.org/hereditary/ontology/gutbrain/schema/taggedAs
IRI: http://purl.org/dc/elements/1.1/creator
IRI: http://www.w3.org/2004/02/skos/core#exactMatch
[1] Fabian Neuhaus. "What is an Ontology?" 2018. arXiv: 1810.09171 [cs.AI].
[2] Ying Ding and Schubert Foo. "Ontology research and development. Part I - A review of ontology generation". In: Journal of Information Science 28 (Apr. 2002), pp. 123–136. DOI: 10.1177/016555150202802004.
[3] Hossam Ishkewy, Hany Harb, and Hassan Farahat. "A Comprehensive Semantic Web Survey". In: Al-Azhar University Engineering Journal (JAUES), 9 (Dec. 2014), p. 2014.
[4] Jennifer Golbeck et al. "The National Cancer Institute’s Thesaurus and Ontology". In: SSRN Electronic Journal (Jan. 2003). DOI: 10.2139/ssrn.3199007.
[5] Kirill Degtyarenko et al. "ChEBI: a database and ontology for chemical entities of biological interest". In: Nucleic Acids Research 36. Database issue (Jan. 2008), pp. D344–D350. DOI: 10.1093/nar/gkm791.
[6] Damion Dooley et al. "FoodOn: a harmonized food ontology to increase global food traceability, quality control and data integration". In: npj Science of Food 2 (Dec. 2018). DOI: 10.1038/s41538-018-0032-6.
[7] Yongcun He et al. "OHMI: The ontology of host-microbiome interactions". In: Journal of Biomedical Semantics 10 (Dec. 2019). DOI: 10.1186/s13326-019-0217-1.
[8] Christopher Townsend et al. "OMIT: Domain Ontology and Knowledge Acquisition in MicroRNA Target Prediction". In: On the Move to Meaningful Internet Systems, OTM 2010. Ed. by Robert Meersman, Tharam Dillon, and Pilar Herrero. Berlin, Heidelberg: Springer Berlin Heidelberg, 2010, pp. 1160–1167. ISBN: 978-3-642-16949-6.
[9] Xizeng Mao et al. "Automated genome annotation and pathway identification using the KEGG Orthology (KO) as a controlled vocabulary". In: Bioinformatics 21.19 (Apr. 2005), pp. 3787–3793. ISSN: 1367-4803. DOI: 10.1093/bioinformatics/bti430.
[10] Guglielmo Faggioli et al. "The BrainTeaser Ontology for ALS and MS Clinical Data". Version 3.0. Zenodo, July 2024. DOI: 10.5281/zenodo.12789731.
[11] Elena Simperl. "Reusing ontologies on the Semantic Web: A feasibility study". In: Data & Knowledge Engineering 68.10 (2009), pp. 905–925. ISSN: 0169-023X. DOI: 10.1016/j.datak.2009.02.002.
[12] C. E. Lipscomb. "Medical Subject Headings (MeSH)". In: Bull Med Libr Assoc 88.3 (July 2000), pp. 265–266. URL: https://pmc.ncbi.nlm.nih.gov/articles/PMC35238/.