Open Semantic Framework
   HOME

TheInfoList



OR:

The Open Semantic Framework (OSF) is an integrated
software stack In computing, a solution stack or software stack is a set of software subsystems or components needed to create a complete platform such that no additional software is needed to support applications. Applications are said to "run on" or "run on t ...
using semantic technologies for
knowledge management Knowledge management (KM) is the collection of methods relating to creating, sharing, using and managing the knowledge and information of an organization. It refers to a multidisciplinary approach to achieve organisational objectives by making ...
. It has a layered architecture that combines existing
open source Open source is source code that is made freely available for possible modification and redistribution. Products include permission to use the source code, design documents, or content of the product. The open-source model is a decentralized sof ...
software Software is a set of computer programs and associated documentation and data. This is in contrast to hardware, from which the system is built and which actually performs the work. At the lowest programming level, executable code consists ...
with additional open source components developed specifically to provide a complete
Web application framework A web framework (WF) or web application framework (WAF) is a software framework that is designed to support the development of web applications including web services, web resources, and web APIs. Web frameworks provide a standard way to build and ...
. OSF is made available under the Apache 2 license. OSF is a platform-independent Web services framework for accessing and exposing
structured data A data model is an abstract model that organizes elements of data and standardizes how they relate to one another and to the properties of real-world entities. For instance, a data model may specify that the data element representing a car be co ...
,
semi-structured data Semi-structured data is a form of structured data that does not obey the tabular structure of data models associated with relational databases or other forms of data tables, but nonetheless contains tags or other markers to separate semantic eleme ...
, and
unstructured data Unstructured data (or unstructured information) is information that either does not have a pre-defined data model or is not organized in a pre-defined manner. Unstructured information is typically text-heavy, but may contain data such as dates, num ...
using
ontologies In computer science and information science, an ontology encompasses a representation, formal naming, and definition of the categories, properties, and relations between the concepts, data, and entities that substantiate one, many, or all domains ...
to reconcile semantic heterogeneities within the contributing data and
schema The word schema comes from the Greek word ('), which means ''shape'', or more generally, ''plan''. The plural is ('). In English, both ''schemas'' and ''schemata'' are used as plural forms. Schema may refer to: Science and technology * SCHEMA ...
. Internal to OSF, all data is converted to RDF to provide a
common data model A common data model (CDM) can refer to any standardised data model which allows for data and information exchange between different applications and data sources. Common data models aim to standardise logical infrastructure so that related applicat ...
. The
OWL 2 The Web Ontology Language (OWL) is a family of Knowledge representation and reasoning, knowledge representation languages for authoring Ontology (information science), ontologies. Ontologies are a formal way to describe taxonomies and classificat ...
ontology language In computer science and artificial intelligence, ontology languages are formal languages used to construct ontologies. They allow the encoding of knowledge about specific domains and often include reasoning rules that support the processing of th ...
is used to describe the data schema overlaying all of the constituent data sources. The
architecture Architecture is the art and technique of designing and building, as distinguished from the skills associated with construction. It is both the process and the product of sketching, conceiving, planning, designing, and constructing building ...
of OSF is built around a central layer of RESTful web services, designed to enable most constituent modules within the software stack to be substituted without major adverse impacts on the entire stack. A central organizing perspective of OSF is that of the
dataset A data set (or dataset) is a collection of data. In the case of tabular data, a data set corresponds to one or more database tables, where every column of a table represents a particular variable, and each row corresponds to a given record of the ...
. These datasets contain the records in any given OSF instance. One or more domain ontologies is used by a given OSF instance to define the structural relationships amongst the data and their attributes and concepts. Some of the use applications for OSF include
local government Local government is a generic term for the lowest tiers of public administration within a particular sovereign state. This particular usage of the word government refers specifically to a level of administration that is both geographically-loca ...
,
health information systems Health informatics is the field of science and engineering that aims at developing methods and technologies for the acquisition, processing, and study of patient data, which can come from different sources and modalities, such as electronic hea ...
, community indicator systems,
eLearning Educational technology (commonly abbreviated as edutech, or edtech) is the combined use of computer hardware, software, and educational theory and practice to facilitate learning. When referred to with its abbreviation, edtech, it often refers ...
, citizen engagement, or any domain that may be modeled by ontologies. Documentation and training videos are provided with the open-source OSF application.


History

Early components of OSF were provided under the names of structWSF and conStruct starting in June 2009. The first version 1.x of OSF was announced in August 2010. The first automated OSF installer was released in March 2012. OSF was expanded with an ontology manager, structOntology in August 2012. The version 2.x developments of OSF occurred for enterprise sponsors in the period of early 2012 until the end of 2013. None of these interim 2.x versions were released to the public. Then, at the conclusion of this period, Structured Dynamics, the main developer of OSF,
refactored In computer programming and software design, code refactoring is the process of restructuring existing computer code—changing the '' factoring''—without changing its external behavior. Refactoring is intended to improve the design, structur ...
these specific enterprise developments to leapfrog to a new version 3.0 of OSF, announced in early 2014. These public releases were last updated to OSF version 3.4.0 in August 2016.


Architecture and technologies

The Open Semantic Framework has a basic three-layer architecture. User interactions and content management are provided by an external
content management system A content management system (CMS) is computer software used to manage the creation and modification of digital content (content management).''Managing Enterprise Content: A Unified Content Strategy''. Ann Rockley, Pamela Kostur, Steve Manning. New ...
, which is currently
Drupal Drupal () is a free and open-source web content management system (CMS) written in PHP and distributed under the GNU General Public License. Drupal provides an open-source back-end framework for at least 14% of the top 10,000 websites worldwide ...
(but does not depend on it). This layer accesses the pivota
OSF Web Services
there are now more than 20 providing OSF's
distributed computing A distributed system is a system whose components are located on different computer network, networked computers, which communicate and coordinate their actions by message passing, passing messages to one another from any system. Distributed com ...
functionality. Full
CRUD In computer programming, create, read, update, and delete (CRUD) are the four basic operations of persistent storage. CRUD is also sometimes used to describe user interface conventions that facilitate viewing, searching, and changing information u ...
access and user permissions and security is provided to all digital objects in the stack. This
middleware Middleware is a type of computer software that provides services to software applications beyond those available from the operating system. It can be described as "software glue". Middleware makes it easier for software developers to implement co ...
layer then provides a means to access the third layer, the engines and indexers that drive the entire stack. Both the top CMS layer and the engines layer are provided by existing off-the-shelf software. What makes OSF a complete stack are the connecting scripts and the intermediate Web services layer. The premise of the OSF stack is based on the RDF data model. RDF provides the means for integrating existing structured data assets in any format, with semi-structured data like XML and HTML, and unstructured documents or text. The OSF framework is made operational via ontologies that capture the domain or knowledge space, matched with internal ontologies that guide OSF operations and data display. This design approach is known as ''ODapps'', for ontology-driven applications.


Content management layer

OSF delegates all direct user interactions and standard content management to an external
CMS CMS may refer to: Computing * Call management system * CMS-2 (programming language), used by the United States Navy * Code Morphing Software, a technology used by Transmeta * Collection management system for a museum collection * Color manag ...
. In the case of Drupal, this integration is tighter, and supports connectors and modules that can replace standard Drupal storage and databases with OSF
triplestore A triplestore or RDF store is a purpose-built database for the storage and retrieval of triples through semantic queries. A triple is a data entity composed of subject–predicate–object, like "Bob is 35" or "Bob knows Fred". Much like a relat ...
s.


Web services layer

This intermediate OSF Web Services layer may also be accessed directly via API or command line or utilities like
cURL cURL (pronounced like "curl", UK: , US: ) is a computer software project providing a library (libcurl) and command-line tool (curl) for transferring data using various network protocols. The name stands for "Client URL". History cURL was fi ...
, suitable for interfacing with standard content management systems (CMSs), or via a dedicated suite of connectors and modules that leverage the open source Drupal CMS. These connectors and modules, also part of the standard OSF stack and calle
OSF for Drupal
natively enable Drupal's existing thousands of modules and ecosystem of developers and capabilities to access OSF using familiar Drupal methods. The OSF middleware framework is generally RESTful in design and is based on
HTTP The Hypertext Transfer Protocol (HTTP) is an application layer protocol in the Internet protocol suite model for distributed, collaborative, hypermedia information systems. HTTP is the foundation of data communication for the World Wide Web, ...
and Web protocols and
W3C The World Wide Web Consortium (W3C) is the main international standards organization for the World Wide Web. Founded in 1994 and led by Tim Berners-Lee, the consortium is made up of member organizations that maintain full-time staff working to ...
open standards. The initial OSF framework comes packaged with a baseline set of more than 20 Web services in CRUD, browse, search, tagging, ontology management, and export and import. All Web services are exposed via APIs and
SPARQL SPARQL (pronounced "sparkle" , a recursive acronym for SPARQL Protocol and RDF Query Language) is an RDF query language—that is, a semantic query language for databases—able to retrieve and manipulate data stored in Resource Description F ...
endpoints. Each request to an individual Web service returns an HTTP status and optionally a document of ''resultsets''. Each results document can be serialized in many ways, and may be expressed as either RDF, pure
XML Extensible Markup Language (XML) is a markup language and file format for storing, transmitting, and reconstructing arbitrary data. It defines a set of rules for encoding documents in a format that is both human-readable and machine-readable. T ...
,
JSON JSON (JavaScript Object Notation, pronounced ; also ) is an open standard file format and data interchange format that uses human-readable text to store and transmit data objects consisting of attribute–value pairs and arrays (or other ser ...
, or other formats.


Engines layer

The engines layer represents the major workflow requirements and data management and indexing of the system. The premise of the Open Semantic Framework is based on the RDF data model. Using a common data model means that all Web services and actions against the data only need to be programmed via a single,
canonical form In mathematics and computer science, a canonical, normal, or standard form of a mathematical object is a standard way of presenting that object as a mathematical expression. Often, it is one which provides the simplest representation of an obje ...
. Simple converters convert external, native data formats to the RDF form at time of ingest; similar converters can translate the internal RDF form back into native forms for export (or use by external applications). This use of a canonical form leads to a simpler design at the core of the stack and a uniform basis to which tools or other work activities can be written. The OSF engines are all open source and work to support this premise. The OSF engines layer governs the index and management of all OSF content. Documents are indexed by the
Solr Solr (pronounced "solar") is an open-source enterprise-search platform, written in Java. Its major features include full-text search, hit highlighting, faceted search, real-time indexing, dynamic clustering, database integration, NoSQL features an ...
engine for full-text search, while information about their structural characteristics and metadata are stored in an RDF
triplestore A triplestore or RDF store is a purpose-built database for the storage and retrieval of triples through semantic queries. A triple is a data entity composed of subject–predicate–object, like "Bob is 35" or "Bob knows Fred". Much like a relat ...
database provided by OpenLink's
Virtuoso A virtuoso (from Italian ''virtuoso'' or , "virtuous", Late Latin ''virtuosus'', Latin ''virtus'', "virtue", "excellence" or "skill") is an individual who possesses outstanding talent and technical ability in a particular art or field such as ...
software. The schema aspects of the information (the "ontologies") are separately managed and manipulated with their own W3C standard application, th
OWL API
At ingest time, the system automatically routes and indexes the content into its appropriate stores. Another engine, GATE (
General Architecture for Text Engineering General Architecture for Text Engineering or GATE is a Java suite of tools originally developed at the University of Sheffield beginning in 1995 and now used worldwide by a wide community of scientists, companies, teachers and students for many nat ...
), provides semi-automatic assistance in tagging input information and other
natural language processing Natural language processing (NLP) is an interdisciplinary subfield of linguistics, computer science, and artificial intelligence concerned with the interactions between computers and human language, in particular how to program computers to pro ...
(NLP) tasks.


Alternatives

OSF is sometimes referred to as a linked data application. Alternative applications in this space include:
Callimachus
*
CubicWeb CubicWeb is a free and open-source semantic web application framework, licensed under the LGPL. It is written in Python. It has been an open free software project since October 2008, but the project began in 2000 and was initially developed by L ...

LOD2 Stack
*
Apache Marmotta Apache Marmotta is a linked data platform that comprises several components. In its most basic configuration it is a Linked Data server. Marmotta is one of the reference projects early implementing the new Linked Data Platform recommendation that ...
The Open Semantic Framework also has alternatives in the
semantic publishing Semantic publishing on the Web, or semantic web publishing, refers to publishing information on the web as documents accompanied by semantic markup. Semantic publication provides a way for computers to understand the structure and even the meaning ...
and semantic computing arenas.


See also

*
Data integration Data integration involves combining data residing in different sources and providing users with a unified view of them. This process becomes significant in a variety of situations, which include both commercial (such as when two similar companies ...
*
Data management Data management comprises all disciplines related to handling data as a valuable resource. Concept The concept of data management arose in the 1980s as technology moved from sequential processing (first punched cards, then magnetic tape) to r ...
*
Drupal Drupal () is a free and open-source web content management system (CMS) written in PHP and distributed under the GNU General Public License. Drupal provides an open-source back-end framework for at least 14% of the top 10,000 websites worldwide ...
*
Enterprise information integration Enterprise information integration (EII) is the ability to support an unified view of data and information for an entire organization. In a data virtualization application of EII, a process of information integration, using data abstraction to pr ...
*
Knowledge organization Knowledge organization (KO), organization of knowledge, organization of information, or information organization is an intellectual discipline concerned with activities such as document description, indexing, and classification that serve to ...
*
Linked data In computing, linked data (often capitalized as Linked Data) is structured data which is interlinked with other data so it becomes more useful through semantic queries. It builds upon standard Web technologies such as HTTP, RDF and URIs, but r ...
*
Middleware Middleware is a type of computer software that provides services to software applications beyond those available from the operating system. It can be described as "software glue". Middleware makes it easier for software developers to implement co ...
*
Ontology-based data integration Ontology-based data integration involves the use of one or more ontologies to effectively combine data or information from multiple heterogeneous sources. It is one of the multiple data integration approaches and may be classified as Global-As-View ...
*
Resource Description Framework The Resource Description Framework (RDF) is a World Wide Web Consortium (W3C) standard originally designed as a data model for metadata. It has come to be used as a general method for description and exchange of graph data. RDF provides a variety of ...
*
Resource-oriented architecture In software engineering, a resource-oriented architecture (ROA) is a style of software architecture and programming paradigm for supportive designing and developing software in the form of Internetworking of System resource, resources with "Represen ...
* Semantic computing *
Semantic integration Semantic integration is the process of interrelating information from diverse sources, for example calendars and to do lists, email archives, presence information (physical, psychological, and social), documents of all sorts, contacts (including ...
*
Semantic publishing Semantic publishing on the Web, or semantic web publishing, refers to publishing information on the web as documents accompanied by semantic markup. Semantic publication provides a way for computers to understand the structure and even the meaning ...
*
Semantic search Semantic search denotes search with meaning, as distinguished from lexical search where the search engine looks for literal matches of the query words or variants of them, without understanding the overall meaning of the query. Semantic search seek ...
*
Semantic service-oriented architecture A Semantic Service Oriented Architecture (SSOA) is an architecture that allows for scalable and controlled Enterprise Application Integration solutions. SSOA describes an approach to enterprise-scale IT infrastructure. It leverages rich, machine-i ...
*
Semantic technology The ultimate goal of semantic technology is to help machines understand data. To enable the encoding of semantics with the data, well-known technologies are RDF (Resource Description Framework) and OWL (Web Ontology Language). These technologies ...
*
Software framework In computer programming, a software framework is an abstraction in which software, providing generic functionality, can be selectively changed by additional user-written code, thus providing application-specific software. It provides a standard ...
*
Web Ontology Language The Web Ontology Language (OWL) is a family of knowledge representation languages for authoring ontologies. Ontologies are a formal way to describe taxonomies and classification networks, essentially defining the structure of knowledge for variou ...


References


External links


Official website

Drupal

GATE

Open Semantic Framework code repository
at
GitHub GitHub, Inc. () is an Internet hosting service for software development and version control using Git. It provides the distributed version control of Git plus access control, bug tracking, software feature requests, task management, continuous ...

OSF interest group

OWL API

Virtuoso


Further information

* Technical documentation library at the * Video training series at the {{cite web , title=OSF Academy , website=YouTube Open Semantic Framework Academy , accessdate=30 September 2014 , url=https://www.youtube.com/user/osfacademy Free content management systems Free software programmed in PHP Knowledge management Ontology (information science) Semantic Web Web frameworks