Cypher (query language)

Cypher is a declarative graph query language that allows for expressive and efficient data querying in a property graph. Cypher was largely an invention of Andrés Taylor while working for Neo4j, Inc. (formerly Neo Technology) in 2011. Cypher was originally intended to be used with the graph database Neo4j, but was opened up through the openCypher project in October 2015. The language was designed with the power and capability of SQL (the standard query language for the relational database model) in mind, but Cypher was based on the components and needs of a database built upon the concepts of graph theory. In a graph model, data is structured as nodes (vertices in math and network science) and relationships (edges in math and network science) to focus on how entities in the data are connected and related to one another.


Graph model

Cypher is based on the Property Graph Model, which organizes data into nodes and edges (called “relationships” in Cypher). In addition to those standard graph elements of nodes and relationships, the property graph model adds labels and properties for describing finer categories and attributes of the data.

Nodes are the entities in the graph. They can hold any number of attributes (key-value pairs) called properties. Nodes can be tagged with zero or more labels (like tags or categories), representing their different roles in a domain.

Relationships provide directed, named, semantically relevant connections between two node entities. A relationship always has a direction, a start node, an end node, and exactly one relationship type. Like nodes, relationships can also have properties.

Labels group similar nodes together and let a query specify which types of entities to look for or create. Properties are key-value pairs that bind a string key to a value from the Cypher type system.

Cypher queries are assembled from patterns of nodes and relationships, with optional filtering on labels and properties, to create, read, update, or delete the data found in the specified pattern.
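
As a minimal sketch of these elements, the statement below creates two nodes and one relationship. The labels, property keys, and values are illustrative assumptions chosen for this example, not part of any fixed schema.

```cypher
// A node with two labels (Person, Actor) and one property.
CREATE (nicole:Person:Actor {name: 'Nicole Kidman'})
// A node with one label and two properties.
CREATE (movie:Movie {title: 'Moulin Rouge!', year: 2001})
// A directed relationship with exactly one type (ACTED_IN) and its own property.
CREATE (nicole)-[:ACTED_IN {role: 'Satine'}]->(movie)
RETURN nicole, movie
```

Here the start node of the relationship is the actor and the end node is the movie; the same two nodes could carry further relationships of other types.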


Type system

The Cypher type system includes many of the common types used in other programming and query languages. Supported types include scalar value types such as boolean, string, and number, the last covering both integers and floating-point numbers. It also supports temporal types like datetime, localdatetime, date, time, localtime, and duration. Container types for maps and lists are available, along with graph types for node, relationship, and path, and a void type.
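
A short sketch of literals for several of these types follows. The column aliases are arbitrary, and the temporal constructor functions (`date`, `datetime`, `duration`) are assumed to be available, as they are in recent Neo4j releases; other implementations may support a different subset.

```cypher
RETURN
  true                             AS booleanValue,
  'graph'                          AS stringValue,
  42                               AS integerValue,
  3.14                             AS floatValue,
  date('2019-09-01')               AS dateValue,
  datetime('2019-09-01T12:00:00Z') AS datetimeValue,
  duration('P1DT2H')               AS durationValue,   // 1 day and 2 hours
  [1, 2, 3]                        AS listValue,
  {key: 'value'}                   AS mapValue
```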


Syntax

The Cypher query language depicts patterns of nodes and relationships and filters those patterns based on labels and properties. Cypher’s syntax is based on ASCII art, which is text-based visual art for computers. This makes the language very visual and easy to read because it both visually and structurally represents the data specified in the query. For instance, nodes are represented with parentheses around the attributes and information regarding the entity. Relationships are depicted with an arrow (either directed or undirected) with the relationship type in brackets.

    //node
    (variable:Label {propertyKey: 'propertyValue'})

    //relationship
    -[variable:RELATIONSHIP_TYPE]->

    //Cypher pattern
    (node1:LabelA)-[rel1:RELATIONSHIP_TYPE]->(node2:LabelB)
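
The arrow direction can be illustrated with three variants of the same pattern. This is a sketch only: the `Person` label and `FOLLOWS` relationship type are hypothetical names, not taken from the text above.

```cypher
// Outgoing: matches pairs where a follows b.
MATCH (a:Person)-[:FOLLOWS]->(b:Person) RETURN a, b;

// Incoming: matches pairs where b follows a.
MATCH (a:Person)<-[:FOLLOWS]-(b:Person) RETURN a, b;

// No arrowhead: matches the relationship regardless of its direction.
MATCH (a:Person)-[:FOLLOWS]-(b:Person) RETURN a, b;
```

A stored relationship always has a direction; the third form only tells the query to ignore it while matching.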


Keywords

Similar to other query languages, Cypher contains a variety of keywords for specifying patterns, filtering patterns, and returning results. Among the most common are MATCH, WHERE, and RETURN. These operate slightly differently from SELECT and WHERE in SQL; however, they have similar purposes.

MATCH precedes the search pattern for finding nodes, relationships, or combinations of the two. WHERE adds constraints to patterns and filters out unwanted matches. RETURN formats and organizes how the results are output. Just as with other query languages, results can be returned with specific properties, lists, ordering, and more.

Using the keywords with the pattern syntax shown above, the example query below searches for a node (Actor label and a property called name with the value 'Nicole Kidman') connected by a relationship (ACTED_IN type, directed away from the first node) to another node (Movie label). The WHERE clause then keeps only the patterns where the Movie node has a year property less than the value of the parameter passed in. The RETURN clause outputs the movie nodes that fit the pattern and the filtering.

    MATCH (nicole:Actor {name: 'Nicole Kidman'})-[:ACTED_IN]->(movie:Movie)
    WHERE movie.year < $yearParameter
    RETURN movie

Cypher also contains keywords for clauses that write, update, and delete data. CREATE and DELETE are used to create and delete nodes and relationships. SET and REMOVE are used to set and remove property values and labels on nodes. MERGE is used to create nodes uniquely, without duplicates. Nodes can only be deleted when they have no relationships still attached. For example:

    MATCH (startContent:Content)-[relationship:IS_RELATED_TO]->(endContent:Content)
    WHERE endContent.source = 'user'
    OPTIONAL MATCH (endContent)-[r]-()
    DELETE relationship, endContent
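
The remaining clauses can be sketched as a short sequence of separate statements. The `Movie` data and the `Musical` label are assumptions made for illustration. `DETACH DELETE`, used in the last statement, is a variant of DELETE that also removes the node's relationships, which sidesteps the restriction described above.

```cypher
// MERGE matches the node if it already exists and creates it otherwise.
MERGE (movie:Movie {title: 'Moulin Rouge!'})
// SET adds or overwrites a property, and can also add a label.
SET movie.year = 2001
SET movie:Musical
RETURN movie;

// RETURN can project individual properties and order the rows.
MATCH (actor:Actor)-[:ACTED_IN]->(movie:Movie)
RETURN actor.name AS actor, movie.title AS title, movie.year AS year
ORDER BY year DESC
LIMIT 10;

// REMOVE drops a property or a label without deleting the node.
MATCH (movie:Movie {title: 'Moulin Rouge!'})
REMOVE movie.year, movie:Musical
RETURN movie;

// DETACH DELETE removes the node together with any relationships attached to it.
MATCH (movie:Movie {title: 'Moulin Rouge!'})
DETACH DELETE movie;
```

Running the first statement twice leaves a single node, which is the difference between MERGE and CREATE.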


Cypher implementations

Cypher is implemented in Neo4j's database, in Amazon Neptune, in SAP's HANA Graph, by RedisGraph, by Cambridge Semantics' AnzoGraph, by Bitnine's AgensGraph, by Memgraph, and in open source projects Cypher for Gremlin, maintained by Neueda Labs in Riga, and Cypher for Apache Spark (now renamed to Morpheus), as well as in research projects such as Cypher.PL and Ingraph.


Standardization

With the openCypher project, an effort began to standardize Cypher as the query language for graph processing. As part of this process there have been five face-to-face openCypher Implementers Meetings (oCIMs). The first meeting took place in February 2017 at SAP's headquarters in Walldorf, Germany, coincident with a meeting of the Linked Data Benchmark Council. The most recent oCIM took place in Berlin in March 2019, coincident with the W3C Workshop on Web Standards for Graph Data Management. At that meeting, there was a consensus to work towards Cypher becoming a significant input into a wider project for an international standardized Graph Query Language called GQL.

In September 2019, a proposal for a GQL standard project was approved by a vote of national standards bodies which are members of ISO/IEC Joint Technical Committee 1 (responsible for information technology standards). Currently, the GQL standard is being written and is not yet publicly available. However, the first open-source implementation of a subset of the language is already available. Aside from the implementation, one can also find a formalization and read the syntax of that specific subset of GQL.


See also

* Neo4j, a popular graph database for the Cypher Query Language
* Graph database, the background, data models, components, and providers for this database category
* SPARQL, a W3C standard declarative query language for querying graph data
* Gremlin, another way to query graph data
* GQL (Graph Query Language)

