A citation graph (or citation network), in information science and bibliometrics, is a directed graph that describes the citations within a collection of documents.
Each vertex (or node) in the graph represents a document in the collection, and each edge is directed from one document toward another that it cites (or vice versa, depending on the specific implementation).
Citation graphs have been utilised in various ways, including forms of citation analysis, academic search tools and court judgements. They are predicted to become more relevant and useful in the future as the body of published research grows.
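The vertex-and-edge definition above can be sketched as a small data structure. This is a minimal illustration, not a standard implementation: the class name `CitationGraph`, its methods, and the document identifiers are all hypothetical, and the edge direction chosen here (citing document toward cited document) is only one of the two conventions mentioned.

```python
from collections import defaultdict


class CitationGraph:
    """Directed graph in which an edge (a, b) means document a cites document b."""

    def __init__(self):
        self.cites = defaultdict(set)     # document -> documents it cites
        self.cited_by = defaultdict(set)  # document -> documents that cite it

    def add_citation(self, citing, cited):
        self.cites[citing].add(cited)
        self.cited_by[cited].add(citing)
        # Touch the reverse keys so that every document is registered as a vertex.
        self.cites[cited]
        self.cited_by[citing]

    def documents(self):
        return set(self.cites)


# Hypothetical three-document collection.
graph = CitationGraph()
graph.add_citation("paper_c", "paper_a")
graph.add_citation("paper_c", "paper_b")
graph.add_citation("paper_b", "paper_a")
```

Storing both directions makes it cheap to ask either "what does this paper cite?" or "who cites this paper?", at the cost of holding each edge twice.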
Implementation
There is no standard format for the citations in bibliographies, and the record linkage of citations can be a time-consuming and complicated process. Furthermore, citation errors can occur at any stage of the publishing process. However, there is a long history of creating citation databases, also known as citation indexes, so these problems are well documented.
In principle, each document should have a unique publication date and can only refer to earlier documents. This means that an ideal citation graph is not only directed but acyclic; that is, there are no cycles (closed loops of citations) in the graph. This is not always the case in practice, since an academic paper goes through several versions in the publishing process. The timing of asynchronous updates to bibliographies may lead to edges that apparently point backward in time. Such "backward" citations seem to constitute less than 1% of the total number of links.
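Both properties described above can be checked mechanically. The sketch below assumes edges are stored as `(citing, cited)` pairs and that a publication year is known for each document; the function names and sample data are hypothetical. It flags citations that point forward in time and tests for cycles with Kahn's topological-sort algorithm.

```python
from collections import deque


def backward_citations(edges, pub_year):
    """Return edges whose cited document was published after the citing one."""
    return [(a, b) for a, b in edges if pub_year[b] > pub_year[a]]


def is_acyclic(edges):
    """Kahn's algorithm: a directed graph is acyclic iff every vertex
    can be removed in topological order."""
    nodes = {n for edge in edges for n in edge}
    out = {n: [] for n in nodes}
    indegree = {n: 0 for n in nodes}
    for a, b in edges:
        out[a].append(b)
        indegree[b] += 1
    queue = deque(n for n in nodes if indegree[n] == 0)
    removed = 0
    while queue:
        n = queue.popleft()
        removed += 1
        for m in out[n]:
            indegree[m] -= 1
            if indegree[m] == 0:
                queue.append(m)
    return removed == len(nodes)


# Hypothetical data: the last edge has an older paper citing a newer one.
pub_year = {"a": 2001, "b": 2003, "c": 2005}
edges = [("c", "a"), ("c", "b"), ("b", "a"), ("a", "c")]
suspect = backward_citations(edges, pub_year)
```

In this sample the single backward edge is also what closes a cycle; removing it leaves a directed acyclic graph.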
As citation links are meant to be permanent, the bulk of a citation graph should be static, and only the leading edge of the graph should change. Exceptions might occur when papers are withdrawn from circulation.
Background and history
A citation is a reference to a published or unpublished source (not always the original source). More precisely, a citation is an abbreviated alphanumeric expression embedded in the body of an intellectual work that denotes an entry in the bibliographic references section of the work. Its purpose is to acknowledge the relevance of the works of others to the topic of discussion at the point where the citation appears.
Generally the combination of both the in-body citation and the bibliographic entry constitutes what is commonly thought of as a citation (whereas bibliographic entries by themselves are not). References to single, machine-readable assertions in electronic scientific articles are known as nanopublications, a form of microattribution.
Citation networks are one kind of social network that has been studied quantitatively almost from the moment citation databases first became available. In 1965, Derek J. de Solla Price described the inherent linking characteristic of the Science Citation Index (SCI) in his paper entitled "Networks of Scientific Papers." The links between citing and cited papers became dynamic when the SCI began to be published online. In 1973, Henry Small published his work on co-citation analysis, which became a self-organizing classification system that led to document clustering experiments and eventually what is called "Research Reviews." (Source: Miray Kas, Structures and Statistics of Citation Networks.)
Applications
Citation Analysis
Citation graphs can be used to measure scholarly impact, the influence a particular paper has had on the academic world. Although scholarly impact is hard to quantify, a measure of it across many papers helps identify important papers and can also indicate the relevance of a particular academic community. Citation graphs suit this purpose because the number of incoming connections to an article in the graph equals the number of other papers that cite it, which is commonly taken as an indicator of its scholarly impact.
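As a rough illustration of the counting idea in the paragraph above, the number of times each paper is cited can be read straight off the edge list. The sketch assumes edges are `(citing, cited)` pairs; the function name and sample data are hypothetical, and a raw citation count is only the simplest possible impact measure.

```python
from collections import Counter


def citation_counts(edges):
    """Count incoming citations (in-degree) for each cited document."""
    return Counter(cited for _citing, cited in edges)


# Hypothetical edge list: "a" is cited three times, "b" once.
edges = [("b", "a"), ("c", "a"), ("c", "b"), ("d", "a")]
counts = citation_counts(edges)
most_cited, top_count = counts.most_common(1)[0]
```

A `Counter` returns zero for documents that never appear as a citation target, so uncited papers need no special handling.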
Similarity analysis is another area of citation analysis which frequently makes use of citation graphs. The relationship between two papers in the citation graph has been compared to their text-based similarity, and closeness in the citation graph has been found to predict a level of text-based similarity. Additionally, the two methods (citation graph closeness and traditional content-based similarity) have been found to work well in conjunction, producing a more accurate result than either alone.
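The text does not say how closeness in the citation graph is measured. One simple possibility, shown here purely as an assumption, is the number of citation links on the shortest path between two papers, ignoring edge direction. The function name and sample data are hypothetical.

```python
from collections import defaultdict, deque


def hop_distance(edges, source, target):
    """Breadth-first search for the fewest citation links between two papers,
    treating each citation as an undirected connection.
    Returns None if the papers are not connected."""
    neighbours = defaultdict(set)
    for a, b in edges:
        neighbours[a].add(b)
        neighbours[b].add(a)
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        node, dist = queue.popleft()
        if node == target:
            return dist
        for nxt in neighbours[node]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None


# Hypothetical chain d -> c -> b -> a, plus e -> a.
edges = [("b", "a"), ("c", "b"), ("d", "c"), ("e", "a")]
```

A smaller hop distance would then be read as greater closeness; other measures, such as the overlap between two papers' reference lists, would serve equally well as a starting point.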
Analyses of citation graphs have also led to the proposal of the citation graph as a way to identify different communities and research areas within the academic world. Analysing the citation graph for groups of documents in conjunction with keywords has been found to provide an accurate way to identify clusters of similar research. In a similar vein, the main “stream” of an area of research, or the progression of a research idea over time, can be identified by running depth-first search algorithms on the citation graph. Instead of looking at the similarity between two nodes, or clusters of many nodes, this method follows the links between nodes to trace a research idea back to its beginning, and so reveals its progression through different papers up to its current state.
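The depth-first tracing described above can be sketched as follows. The text does not specify which path counts as the main stream, so this example assumes, purely for illustration, that it is the longest chain of citations leading back from a starting paper. It also assumes the graph is acyclic (the recursion would not terminate on a cycle). The function name and data are hypothetical.

```python
def longest_chain(cites, start, _memo=None):
    """Depth-first search for the longest chain of citations beginning at
    `start`. `cites` maps each document to the documents it cites.
    Assumes the citation graph has no cycles."""
    memo = {} if _memo is None else _memo
    if start in memo:
        return memo[start]
    best = []
    for cited in cites.get(start, ()):
        chain = longest_chain(cites, cited, memo)
        if len(chain) > len(best):
            best = chain
    memo[start] = [start] + best
    return memo[start]


# Hypothetical collection: "d" cites "c" and "a"; the idea runs d -> c -> b -> a.
cites = {"d": ["c", "a"], "c": ["b"], "b": ["a"], "a": []}
stream = longest_chain(cites, "d")
```

The memo dictionary keeps the search linear in the number of edges, because each document's longest chain is computed only once.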
Search Tools
The traditional method used by academic search tools is to check for matches between a search term and keywords in papers to return potential matches. While mostly effective, this method can lead to errors in which a paper from a different discipline is recommended because of keyword matches, even when the two topics actually have little in common.
Many have argued that searching for relevant papers could be made more accurate if citation graphs were incorporated into academic paper search tools. For example, one proposed system combined keyword matching with a popularity measure based on how many connections a paper had in the citation graph. In this system, more connected papers were considered more popular and were therefore given a higher weighting in the paper recommendations.
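A combined keyword-and-popularity ranking of the kind described above might look like the sketch below. The scoring formula is an assumption made for illustration only (keyword matches scaled by a logarithm of the citation count); it is not the formula of the proposed system, and all names and data are hypothetical.

```python
import math


def rank_papers(query, papers, citation_count):
    """Rank papers that match at least one query term, weighting the number
    of keyword matches by a damped popularity term."""
    terms = set(query.lower().split())
    scored = []
    for paper_id, keywords in papers.items():
        matches = len(terms & {k.lower() for k in keywords})
        if matches == 0:
            continue  # no keyword overlap: not a candidate at all
        popularity = math.log1p(citation_count.get(paper_id, 0))
        scored.append((matches * (1.0 + popularity), paper_id))
    return [paper_id for _score, paper_id in sorted(scored, reverse=True)]


# Hypothetical data: p1 and p2 match equally well, but p2 is cited far more.
papers = {
    "p1": ["citation", "graph"],
    "p2": ["citation", "graph"],
    "p3": ["protein", "folding"],
}
citation_count = {"p1": 3, "p2": 40}
results = rank_papers("citation graph", papers, citation_count)
```

The logarithm keeps very highly cited papers from swamping the keyword signal entirely; a different damping function would be an equally reasonable design choice.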
In more recent years, visual search tools have been developed which use citation graphs to provide a visual representation of the connections between papers. A notable pioneer of this concept is the search tool ''Connected Papers,'' which began as a small project between friends and was released to the public in 2020. Given one academic paper, it analyses tens of thousands of other papers, selects those relevant to the origin paper, builds a citation graph from them, and returns a visual representation of that graph to the viewer. This way of looking at research allows the viewer to see an entire area of research at a glance, and can greatly aid in understanding the state of a research area and in quickly identifying key papers that have many connections.
Court Judgements
Citation graphs have a history of being used to organise and map the citations of legal documents. In a similar way to the aforementioned search tools, citation graphs built specifically for the types of citations found in legal documents have been used to find relevant past legal documents when they are needed for a court decision. As a replacement for, or an improvement upon, traditional search methods, organising legal documents with the aid of citation graphs can provide higher efficiency and accuracy.
Related networks
There are several other types of network graphs that are closely related to citation networks. The co-citation graph has documents as nodes, with two documents connected if at least one other document cites both of them. In bibliographic coupling, by contrast, two documents are connected if they both cite a common document (see Co-citation and Bibliographic coupling). Other related networks are formed using other information present in the document. For instance, in a collaboration graph, known in this context as a co-authorship network, the nodes are the authors of documents, linked if they have co-authored the same document. The link weights between two authors in co-authorship networks can increase over time if they collaborate further.
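Both co-citation and bibliographic coupling can be derived from the same citation edge list. The sketch below assumes edges are `(citing, cited)` pairs, counts two documents as co-cited once for every document that cites both, and counts two documents as coupled once for every document they both cite. Function names and data are hypothetical.

```python
from collections import Counter, defaultdict
from itertools import combinations


def co_citation_counts(edges):
    """For each pair of documents, count how many documents cite both."""
    references = defaultdict(set)  # citing document -> documents it cites
    for citing, cited in edges:
        references[citing].add(cited)
    counts = Counter()
    for refs in references.values():
        for pair in combinations(sorted(refs), 2):
            counts[pair] += 1
    return counts


def coupling_counts(edges):
    """For each pair of documents, count how many documents both of them cite."""
    citers = defaultdict(set)  # cited document -> documents that cite it
    for citing, cited in edges:
        citers[cited].add(citing)
    counts = Counter()
    for docs in citers.values():
        for pair in combinations(sorted(docs), 2):
            counts[pair] += 1
    return counts


# Hypothetical data: "c" and "d" each cite both "a" and "b"; "e" cites only "a".
edges = [("c", "a"), ("c", "b"), ("d", "a"), ("d", "b"), ("e", "a")]
co_cited = co_citation_counts(edges)
coupled = coupling_counts(edges)
```

The resulting counts can serve as link weights in the derived graphs, in the same way that repeated collaboration increases link weights in a co-authorship network.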
Future Developments
While citation graphs have already had a noticeable impact on several areas of academia, they are likely to become more relevant in the future. As the body of published research grows, more traditional ways of searching for papers will become less effective at narrowing down the papers relevant to a particular topic. For example, text-based similarity can only go so far in selecting which papers are relevant to a topic, whereas adding citation graphs would allow higher priority to be given to papers that have many connections to other papers relevant to the topic.
However, developments like this face the same challenge as most applications of citation graphs: there is no standardized format or way of citing. This makes the construction of these graphs very difficult, since it requires complex software analysis to extract citations from papers. One proposed solution is to create open databases of citation information in a format that anyone can use and easily convert to a different form, for example a citation graph.
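To illustrate why an open, uniform format would help, the sketch below converts a small table of citation records into a citation graph. The CSV layout, the column names `citing` and `cited`, and the identifiers are assumptions invented for this example; they do not describe any particular open citation database.

```python
import csv
import io
from collections import defaultdict

# Hypothetical open citation data: one row per citation link.
SAMPLE = """citing,cited
doc-c,doc-a
doc-c,doc-b
doc-b,doc-a
"""


def load_citation_graph(text_stream):
    """Build an adjacency mapping (citing document -> cited documents)
    from CSV rows with 'citing' and 'cited' columns."""
    graph = defaultdict(set)
    for row in csv.DictReader(text_stream):
        graph[row["citing"]].add(row["cited"])
    return graph


graph = load_citation_graph(io.StringIO(SAMPLE))
```

With citation links already expressed as identifier pairs, no text extraction or record linkage is needed, which is the difficulty that open citation data is meant to remove.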
See also
* Collaboration graph, a graph defined by the authors of documents
* Web graph, a citation graph of references from one web page to another in the World Wide Web
* Directed acyclic graph, the formal mathematical structure of a well-constructed citation graph
* Legal citation analysis, citation analysis in legal contexts
Further reading
* Lu, Wangzhong; Janssen, J.; Milios, E.; Japkowicz, N.; Zhang, Yongzheng (2007). "Node similarity in the citation graph". Knowledge and Information Systems. 11 (1): 105–129. doi:10.1007/s10115-006-0023-9. S2CID 26234247.
External links
Connected Papers