Ontotext GraphDB

Ontotext GraphDB (previously known as BigOWLIM) is a graph database and knowledge discovery tool compliant with RDF and SPARQL and available as a high-availability cluster. It is used in various European research projects. As of April 2021, GraphDB was ranked as the 4th most popular RDF store and the 6th most popular graph DBMS. Some categorize it as a NoSQL database. In 2014 Ontotext acquired the trademark "GraphDB" from Sones. As is typical for a graph database, ontologies are an important input. The underlying idea is a semantic repository.


Architecture

GraphDB is used to store and manage semantic knowledge graph data. It is built on top of the RDF4J architecture, implemented through RDF4J's Storage and Inference Layer (SAIL). The architecture is made of three main components:
* The Workbench is a web-based administration tool. Its user interface is based on the RDF4J Workbench web application.
* The Engine consists of a query optimizer, reasoner, storage and plugin manager. The reasoner in GraphDB uses forward chaining with the goal of total materialization. The plugin manager supports user-defined indexes and can be configured dynamically at run time. The plugins include:
** RDF Rank, an algorithm that identifies the most relevant entities by evaluating their interconnectedness, similar to Google's PageRank.
** GeoSPARQL, the standard for geographical linked data. The plugin can convert from other coordinate reference systems into the default, which OGC specifies as CRS84.
** Lucene, which supports full-text search. It provides a variety of indexing options and the ability to use multiple, differently configured indexes in the same query. It is built on Apache Lucene, a high-performance, full-featured text search engine.
* The Connectors: the performance of searches such as full-text search and faceted search can be vastly improved via the connectors, which delegate the implementation to an external component or service. GraphDB has connectors for both well-known open-source search engines, Solr and Elasticsearch.
** There is also a connector enabling MongoDB integration, providing scalability and performance advantages.
** Relational data virtualization (Ontology-Based Data Access, OBDA) is provided by integration of Ontop.
** SQL access over JDBC is provided for traditional analytics tools such as Tableau and Power BI.
** A Kafka Sink Connector is provided for ingesting large amounts of data.
** GraphQL access to knowledge graphs is provided, with semantic search based on Elasticsearch exposed through GraphQL.
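
The forward-chaining, total-materialization strategy mentioned for the reasoner can be illustrated with a minimal Python sketch. This is not GraphDB's implementation: the function name `materialize`, the two rules, and the `ex:` example terms are assumptions chosen for illustration. It repeatedly applies two RDFS-style rules (subclass transitivity and type inheritance) until no new triples appear, so that every inferred statement is stored explicitly.

```python
from typing import Set, Tuple

Triple = Tuple[str, str, str]

RDF_TYPE = "rdf:type"
SUBCLASS_OF = "rdfs:subClassOf"


def materialize(triples: Set[Triple]) -> Set[Triple]:
    """Forward-chain two RDFS-style rules until a fixpoint is reached.

    Hypothetical helper for illustration only; a real reasoner uses
    indexed storage and configurable rulesets.
    """
    closure = set(triples)
    while True:
        inferred = set()
        for (s, p, o) in closure:
            if p != SUBCLASS_OF:
                continue
            for (s2, p2, o2) in closure:
                # Rule 1: A subClassOf B, B subClassOf C => A subClassOf C
                if p2 == SUBCLASS_OF and s2 == o:
                    inferred.add((s, SUBCLASS_OF, o2))
                # Rule 2: x type A, A subClassOf B => x type B
                if p2 == RDF_TYPE and o2 == s:
                    inferred.add((s2, RDF_TYPE, o))
        new = inferred - closure
        if not new:
            # Nothing new was derived: the closure is fully materialized.
            return closure
        closure |= new


# Example data (invented for this sketch).
facts = {
    ("ex:Dog", SUBCLASS_OF, "ex:Mammal"),
    ("ex:Mammal", SUBCLASS_OF, "ex:Animal"),
    ("ex:rex", RDF_TYPE, "ex:Dog"),
}
closure = materialize(facts)
print(("ex:rex", RDF_TYPE, "ex:Animal") in closure)
```

The trade-off of this approach is that inference cost is paid when data is loaded, while queries afterwards read the inferred triples like any other stored data.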


Features and Integrations

According to Ontotext, GraphDB has the following features:
* It uses RDF4J as a library, utilizing its APIs for storage and querying.
* It supports the GraphQL, SPARQL and SeRQL languages and RDF serialization formats (e.g., RDF/XML, N3, Turtle).
* It supports custom reasoning rulesets, as well as RDFS, RDFS-plus, OWL 2 RL and QL.
* It integrates OpenRefine for the ingestion of tabular data and provides semantic similarity search at the document level.
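
As a rough illustration of SPARQL access, the sketch below builds (but does not send) an HTTP request in the style of the SPARQL 1.1 Protocol, using only the Python standard library. The base URL `http://localhost:7200`, the repository name `demo`, the helper name `build_sparql_request`, and the `/repositories/<id>` path (modelled on RDF4J's REST conventions) are all assumptions for this example and should be checked against the actual deployment.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_sparql_request(base_url: str, repository: str, query: str) -> Request:
    """Build a SPARQL query request as a URL-encoded POST.

    Hypothetical helper: the endpoint layout is an assumption based on
    RDF4J-style REST paths, not taken from the article.
    """
    endpoint = f"{base_url.rstrip('/')}/repositories/{repository}"
    body = urlencode({"query": query}).encode("utf-8")
    return Request(
        endpoint,
        data=body,
        headers={
            "Content-Type": "application/x-www-form-urlencoded",
            # Ask for SPARQL results serialized as JSON.
            "Accept": "application/sparql-results+json",
        },
        method="POST",
    )


# Assumed server address and repository name, for illustration only.
request = build_sparql_request(
    "http://localhost:7200",
    "demo",
    "SELECT ?s ?p ?o WHERE { ?s ?p ?o } LIMIT 10",
)
print(request.get_method(), request.full_url)
```

Passing the returned object to `urllib.request.urlopen` would execute the query against a running server; the request is left unsent here so the example runs without one.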


Uses

Ontotext GraphDB is used in various scientific areas, e.g., genetics, healthcare, data forensics, cultural heritage, geography, infrastructure planning, civil engineering, digital historiography, and oceanography. For more examples see "Diverse Uses of a Semantic Graph Database for Knowledge Organization and Research" below. Commercial clients include BBC Sport, the Financial Times, Springer Nature, UK Parliament, and AstraZeneca, as well as companies in the pharmaceutical and finance industries. Some use cases focus on scalability and large data sizes.


See also

* Graph databases
* Graph theory
* RDF database
* Glossary of graph theory


External links


* Ontotext's product website
* GitHub repository for the Apache-licensed Workbench for GraphDB
* Register article from 15 Jan 2020 about Ontotext GraphDB
* W3.org entry for GraphDB
* Diverse Uses of a Semantic Graph Database for Knowledge Organization and Research: presentation, video, GitHub project and Zotero bibliography.

