Open Mind Common Sense (OMCS) is an artificial intelligence project based at the Massachusetts Institute of Technology (MIT) Media Lab whose goal is to build and utilize a large commonsense knowledge base from the contributions of many thousands of people across the Web. It was active from 1999 to 2016. Over that period it accumulated more than a million English facts from over 15,000 contributors, in addition to knowledge bases in other languages. Much of OMCS's software is built on three interconnected representations: the natural language corpus that people interact with directly, a semantic network built from this corpus called ConceptNet, and a matrix-based representation of ConceptNet called AnalogySpace that can infer new knowledge using dimensionality reduction. The knowledge collected by Open Mind Common Sense has enabled research projects at MIT and elsewhere.


History

The project was the brainchild of Marvin Minsky, Push Singh, Catherine Havasi, and others. Development work began in September 1999, and the project opened to the Internet a year later. Havasi described it in her dissertation as "an attempt to ... harness some of the distributed human computing power of the Internet, an idea which was then only in its early stages." The original OMCS was influenced by the website Everything2 and its predecessor, and presented a minimalist interface inspired by Google. Push Singh was slated to become a professor at the MIT Media Lab and lead the Common Sense Computing group in 2007, but he died by suicide on February 28, 2006. The project was subsequently run by the Digital Intuition Group at the MIT Media Lab under Catherine Havasi.


Database and website

OMCS contains many different types of knowledge. Some statements convey relationships between objects or events, expressed as simple phrases of natural language: examples include "A coat is used for keeping warm", "The sun is very hot", and "The last thing you do when you cook dinner is wash your dishes". The database also contains information on the emotional content of situations, in statements such as "Spending time with friends causes happiness" and "Getting into a car wreck makes one angry". OMCS also contains information on people's desires and goals, both large and small, such as "People want to be respected" and "People want good coffee". Originally, these statements could be entered into the Web site as unconstrained sentences of text, which had to be parsed later. The current version of the Web site collects knowledge only through more structured fill-in-the-blank templates. OMCS also makes use of data collected by the Game With a Purpose "Verbosity". In its native form, the OMCS database is simply a collection of these short sentences that convey some common knowledge. To use this knowledge computationally, it has to be transformed into a more structured representation.
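As a rough illustration of what turning short sentences into a more structured representation can look like, the sketch below matches a few sentences against hand-written patterns and emits (relation, left, right) triples. The patterns, the relation labels, and the names `Assertion`, `TEMPLATES`, and `parse_statement` are assumptions made for this example; they are not the templates or the parser actually used by OMCS.

```python
import re
from typing import NamedTuple, Optional


class Assertion(NamedTuple):
    """A structured statement: two phrases joined by a named relation."""
    relation: str
    left: str
    right: str


# Hypothetical sentence patterns, each tied to one relation label.
TEMPLATES = [
    ("UsedFor", re.compile(
        r"^(?:an?\s+|the\s+)?(?P<left>.+?) is used for (?P<right>.+)$",
        re.IGNORECASE)),
    ("Causes", re.compile(
        r"^(?P<left>.+?) causes (?P<right>.+)$", re.IGNORECASE)),
    ("Desires", re.compile(
        r"^(?P<left>people) want (?P<right>.+)$", re.IGNORECASE)),
]


def parse_statement(sentence: str) -> Optional[Assertion]:
    """Return the first matching assertion, or None if no pattern applies."""
    text = sentence.strip().rstrip(".")
    for relation, pattern in TEMPLATES:
        match = pattern.match(text)
        if match:
            return Assertion(relation,
                             match.group("left").lower(),
                             match.group("right").lower())
    return None


if __name__ == "__main__":
    for s in ["A coat is used for keeping warm",
              "Spending time with friends causes happiness",
              "People want good coffee",
              "The sun is very hot"]:
        print(s, "->", parse_statement(s))
```

Sentences that fit none of the patterns, such as "The sun is very hot" here, simply yield no assertion in this sketch; a real system would need many more patterns and some handling of phrasing variants.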


ConceptNet

ConceptNet is a semantic network based on the information in the OMCS database. ConceptNet is expressed as a directed graph whose nodes are concepts and whose edges are assertions of common sense about these concepts. Concepts represent sets of closely related natural language phrases, which could be noun phrases, verb phrases, adjective phrases, or clauses. ConceptNet is created from the natural-language assertions in OMCS by matching them against patterns using a shallow parser. Assertions are expressed as relations between two concepts, selected from a limited set of possible relations. The various relations represent common sentence patterns found in the OMCS corpus; in particular, every fill-in-the-blank template used on the knowledge-collection Web site is associated with a particular relation.

The data structures that make up ConceptNet were significantly reorganized in 2007 and published as ConceptNet 3. The Software Agents group currently distributes a database and API for the new version 4.0. In 2010, OMCS co-founder and director Catherine Havasi, with Robyn Speer, Dennis Clark and Jason Alonso, created Luminoso, a text analytics software company that builds on ConceptNet. It uses ConceptNet as its primary lexical resource to help businesses make sense of and derive insight from vast amounts of qualitative data, including surveys, product reviews and social media.
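The directed-graph view of ConceptNet described above can be pictured with a very small data structure: concepts as nodes and each assertion as a labeled edge from one concept to another. The following is a minimal sketch under that reading only. The class `ConceptGraph`, its methods, and the sample assertions are hypothetical and do not reflect the actual ConceptNet database schema or API.

```python
from collections import defaultdict


class ConceptGraph:
    """Toy directed graph: nodes are concepts, labeled edges are assertions."""

    def __init__(self):
        # concept -> list of (relation, other concept)
        self._outgoing = defaultdict(list)
        self._incoming = defaultdict(list)

    def add_assertion(self, start, relation, end):
        """Record the edge start --relation--> end."""
        self._outgoing[start].append((relation, end))
        self._incoming[end].append((relation, start))

    def outgoing(self, concept, relation=None):
        """Edges leaving a concept, optionally filtered by relation."""
        edges = self._outgoing.get(concept, [])
        return [(r, c) for r, c in edges if relation is None or r == relation]

    def incoming(self, concept, relation=None):
        """Edges arriving at a concept, optionally filtered by relation."""
        edges = self._incoming.get(concept, [])
        return [(r, c) for r, c in edges if relation is None or r == relation]

    def concepts(self):
        """All concepts that appear at either end of an assertion."""
        return sorted(set(self._outgoing) | set(self._incoming))


# Illustrative assertions only; the relation labels are assumptions.
graph = ConceptGraph()
graph.add_assertion("coat", "UsedFor", "keep warm")
graph.add_assertion("blanket", "UsedFor", "keep warm")
graph.add_assertion("spend time with friends", "Causes", "happiness")
graph.add_assertion("person", "Desires", "good coffee")

if __name__ == "__main__":
    print(graph.outgoing("coat"))
    print(graph.incoming("keep warm", "UsedFor"))
```

Storing both outgoing and incoming edges makes it cheap to ask either "what is a coat used for?" or "what is used for keeping warm?", which is the kind of two-way lookup a semantic network tends to need.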


Machine learning tools

The information in ConceptNet can be used as a basis for machine learning algorithms. One representation, called AnalogySpace, uses singular value decomposition to generalize and represent patterns in the knowledge in ConceptNet, in a way that can be used in AI applications. Its creators distribute a Python machine learning toolkit called Divisi for performing machine learning based on text corpora, structured knowledge bases such as ConceptNet, and combinations of the two.
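To make the idea of generalizing with singular value decomposition more concrete, the sketch below builds a tiny concept-by-feature matrix, keeps only the two largest singular values, and reads the reconstructed matrix as plausibility scores. It uses NumPy rather than Divisi, and the concepts, features, matrix values, and function names are all invented for illustration; it should be read as a sketch of truncated SVD in general, not as the AnalogySpace implementation.

```python
import numpy as np

# Hypothetical toy data: rows are concepts, columns are (relation, concept)
# features; 1.0 means the assertion is present in the knowledge base.
concepts = ["dog", "cat", "car", "truck"]
features = [("IsA", "pet"), ("HasA", "fur"), ("CapableOf", "run"),
            ("HasA", "wheel"), ("UsedFor", "transport")]

A = np.array([
    [1, 1, 1, 0, 0],   # dog
    [1, 1, 0, 0, 0],   # cat: "CapableOf run" was never entered
    [0, 0, 0, 1, 1],   # car
    [0, 0, 0, 1, 1],   # truck
], dtype=float)


def low_rank_approximation(matrix, k):
    """Rebuild the matrix from its k largest singular values only."""
    u, s, vt = np.linalg.svd(matrix, full_matrices=False)
    return (u[:, :k] * s[:k]) @ vt[:k, :]


approx = low_rank_approximation(A, k=2)


def score(concept, feature):
    """Reconstructed strength of an assertion in the reduced space."""
    return approx[concepts.index(concept), features.index(feature)]


if __name__ == "__main__":
    print("cat CapableOf run:", round(score("cat", ("CapableOf", "run")), 3))
    print("cat HasA wheel:   ", round(score("cat", ("HasA", "wheel")), 3))
```

Because "cat" shares its other features with "dog", the reduced representation assigns a clearly positive score to the missing "cat CapableOf run" entry while leaving "cat HasA wheel" near zero. This is the sense in which dimensionality reduction can suggest knowledge that was never entered directly.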


Comparison to other projects

Other similar projects include Never-Ending Language Learning, Mindpixel (discontinued), Cyc, Learner, SenticNet, Freebase, YAGO, DBpedia, and Open Mind 1001 Questions, which have explored alternative approaches to collecting knowledge and providing incentive for participation. The Open Mind Common Sense project differs from Cyc because it has focused on representing the common sense knowledge it collected as English sentences, rather than using a formal logical structure. ConceptNet is described by one of its creators, Hugo Liu, as being structured more like WordNet than Cyc, due to its "emphasis on informal conceptual-connectedness over formal linguistic-rigor".

There is also a Brazilian initiative, Open Mind Common Sense in Brazil (OMCS-Br), led by the Advanced Interaction Lab at the Federal University of São Carlos (LIA-UFSCar). This project started in 2005 in collaboration with the Software Agents Group at the MIT Media Lab. Its main goal is to collect common sense stated in Brazilian Portuguese and use it to develop culturally sensitive software applications based on extracting cultural profiles' knowledge from ConceptNet. This is intended to give developers and users software with culturally contextualized content, making the final applications more flexible, adaptive, accessible and usable. The applications focus mainly on education and healthcare.


See also

* Attempto Controlled English (ACE), a controlled natural language
* Never-Ending Language Learning
* Mindpixel
* Semantic Web
* DBpedia
* Freebase (database)
* YAGO (database)


External links


* Open Mind Common Sense
* Open Mind Common Sense meta-repository on GitHub
* ConceptNet
* AnalogySpace
* The Divisi inference toolkit
* Commonsense Computing Initiative's webpage (site no longer exists)
* The Open Mind Initiative (site no longer exists)
* OMCSNetCPP, an open source C++ inference engine using the OMCSNet data
* Open Mind Common Sense in Brazil (site broken; legacy page)
* Advanced Interaction Laboratory