SemmleCode
   HOME

TheInfoList



OR:

Semmle Inc is a code-analysis platform with offices in
San Francisco San Francisco (; Spanish language, Spanish for "Francis of Assisi, Saint Francis"), officially the City and County of San Francisco, is the commercial, financial, and cultural center of Northern California. The city proper is the List of Ca ...
,
Seattle Seattle ( ) is a seaport city on the West Coast of the United States. It is the seat of King County, Washington. With a 2020 population of 737,015, it is the largest city in both the state of Washington and the Pacific Northwest regio ...
,
New York New York most commonly refers to: * New York City, the most populous city in the United States, located in the state of New York * New York (state), a state in the northeastern United States New York may also refer to: Film and television * '' ...
,
Oxford Oxford () is a city in England. It is the county town and only city of Oxfordshire. In 2020, its population was estimated at 151,584. It is north-west of London, south-east of Birmingham and north-east of Bristol. The city is home to the ...
,
Valencia Valencia ( va, València) is the capital of the Autonomous communities of Spain, autonomous community of Valencian Community, Valencia and the Municipalities of Spain, third-most populated municipality in Spain, with 791,413 inhabitants. It is ...
and
Copenhagen Copenhagen ( or .; da, København ) is the capital and most populous city of Denmark, with a proper population of around 815.000 in the last quarter of 2022; and some 1.370,000 in the urban area; and the wider Copenhagen metropolitan ar ...
. Semmle was acquired by
GitHub GitHub, Inc. () is an Internet hosting service for software development and version control using Git. It provides the distributed version control of Git plus access control, bug tracking, software feature requests, task management, continuous ...
(itself owned by
Microsoft Microsoft Corporation is an American multinational technology corporation producing computer software, consumer electronics, personal computers, and related services headquartered at the Microsoft Redmond campus located in Redmond, Washing ...
) on 18 September 2019 for an undisclosed amount. Semmle's LGTM technology automates
code review Code review (sometimes referred to as peer review) is a software quality assurance activity in which one or several people check a program mainly by viewing and reading parts of its source code, and they do so after implementation or as an interru ...
, tracks developer contributions, and flags software security issues. The LGTM platform leverages the CodeQL query engine (formerly QL) to perform semantic analysis on software code bases. GitHub aims to integrate Semmle technology to provide continuous vulnerability detection services. In November 2019, use of CodeQL was made free for research and open source. CodeQL either shares a direct pedigree with
.QL .QL (pronounced "dot-cue-el") is an object-oriented query language used to retrieve data from relational database management systems. It is reminiscent of the standard query language SQL and the object-oriented programming language Java. .QL is ...
(dot-que-ell), which derives from the
Datalog Datalog is a declarative logic programming language. While it is syntactically a subset of Prolog, Datalog generally uses a bottom-up rather than top-down evaluation model. This difference yields significantly different behavior and properties ...
family tree, or is an evolution of similar technology. SemmleCode is an
object-oriented Object-oriented programming (OOP) is a programming paradigm based on the concept of "objects", which can contain data and code. The data is in the form of fields (often known as attributes or ''properties''), and the code is in the form of pro ...
query language Query languages, data query languages or database query languages (DQL) are computer languages used to make queries in databases and information systems. A well known example is the Structured Query Language (SQL). Types Broadly, query language ...
for deductive
database In computing, a database is an organized collection of data stored and accessed electronically. Small databases can be stored on a file system, while large databases are hosted on computer clusters or cloud storage. The design of databases sp ...
s developed by Semmle. It is distinguished within this class by its support for
recursive Recursion (adjective: ''recursive'') occurs when a thing is defined in terms of itself or of its type. Recursion is used in a variety of disciplines ranging from linguistics to logic. The most common application of recursion is in mathematics ...
query.


Corporate background

The company is headquartered in
San Francisco San Francisco (; Spanish language, Spanish for "Francis of Assisi, Saint Francis"), officially the City and County of San Francisco, is the commercial, financial, and cultural center of Northern California. The city proper is the List of Ca ...
, with its development operations based in Blue Boar Court,
Alfred Street Alfred Street is a street running between the High Street to the north and the junction with Blue Boar Street and Bear Lane at the southern end, in central Oxford, England.
, central
Oxford Oxford () is a city in England. It is the county town and only city of Oxfordshire. In 2020, its population was estimated at 151,584. It is north-west of London, south-east of Birmingham and north-east of Bristol. The city is home to the ...
,
England England is a country that is part of the United Kingdom. It shares land borders with Wales to its west and Scotland to its north. The Irish Sea lies northwest and the Celtic Sea to the southwest. It is separated from continental Europe b ...
. Semmle's customers include
Credit Suisse Credit Suisse Group AG is a global investment bank and financial services firm founded and based in Switzerland. Headquartered in Zürich, it maintains offices in all major financial centers around the world and is one of the nine global " ...
,
NASA The National Aeronautics and Space Administration (NASA ) is an independent agency of the US federal government responsible for the civil space program, aeronautics research, and space research. NASA was established in 1958, succeeding t ...
and
Dell Dell is an American based technology company. It develops, sells, repairs, and supports computers and related products and services. Dell is owned by its parent company, Dell Technologies. Dell sells personal computers (PCs), servers, data ...
.


SemmleCode background


Academic

SemmleCode builds on academic research on querying the source of software programs. The first such system was Linton's Omega system, where queries were phrased in QUEL. QUEL did not allow for
recursion Recursion (adjective: ''recursive'') occurs when a thing is defined in terms of itself or of its type. Recursion is used in a variety of disciplines ranging from linguistics to logic. The most common application of recursion is in mathematics ...
in queries, making it difficult to inspect hierarchical program structures such as the
call graph A call graph (also known as a call multigraph) is a control-flow graph, which represents calling relationships between subroutines in a computer program. Each node represents a procedure and each edge ''(f, g)'' indicates that procedure ''f'' cal ...
. The next significant development was therefore the use of
logic programming Logic programming is a programming paradigm which is largely based on formal logic. Any program written in a logic programming language is a set of sentences in logical form, expressing facts and rules about some problem domain. Major logic prog ...
, which does allow such recursive queries, in the XL C++ Browser. The disadvantage of using a full logic programming language is however that it is very difficult to attain acceptable efficiency. The CodeQuest system, developed at the
University of Oxford , mottoeng = The Lord is my light , established = , endowment = £6.1 billion (including colleges) (2019) , budget = £2.145 billion (2019–20) , chancellor ...
, was the first to exploit the observation that
Datalog Datalog is a declarative logic programming language. While it is syntactically a subset of Prolog, Datalog generally uses a bottom-up rather than top-down evaluation model. This difference yields significantly different behavior and properties ...
, a very restrictive version of logic programming, is in the sweet spot between expressive power and efficiency. The QL
query language Query languages, data query languages or database query languages (DQL) are computer languages used to make queries in databases and information systems. A well known example is the Structured Query Language (SQL). Types Broadly, query language ...
is an object-oriented version of Datalog.


Industrial

The early research works on querying the source of software programs spun off a number of industrial applications. In particular it became the cornerstone of systems for application intelligence ( data mining on the source of software systems) and software renovation. In 2007,
Paris Paris () is the capital and most populous city of France, with an estimated population of 2,165,423 residents in 2019 in an area of more than 105 km² (41 sq mi), making it the 30th most densely populated city in the world in 2020. S ...
-based CAST is one of the market leaders in that area, and other significant players include BluePhoenix in
Herzliya Herzliya ( ; he, הֶרְצְלִיָּה ; ar, هرتسليا, Hirtsiliyā) is an affluent city in the central coast of Israel, at the northern part of the Tel Aviv District, known for its robust start-up and entrepreneurial culture. In it h ...
,
Israel Israel (; he, יִשְׂרָאֵל, ; ar, إِسْرَائِيل, ), officially the State of Israel ( he, מְדִינַת יִשְׂרָאֵל, label=none, translit=Medīnat Yīsrāʾēl; ), is a country in Western Asia. It is situated ...
. SemmleCode differs from these systems in its use of an object-oriented query language, which allows programmers to easily formulate new queries that are particular to their own project. A full account of the academic and industrial developments leading up to the creation of SemmleCode can be found in a paper by Hajiyev et al.Elnar Hajiyev, Mathieu Verbaere, and Oege de Moor, CodeQuest: Scalable Source Code Queries with Datalog. In ''ECOOP 2006: Proceedings of the 2006 European Conference on Object-Oriented Programming'', pages 2–27.
Springer Springer or springers may refer to: Publishers * Springer Science+Business Media, aka Springer International Publishing, a worldwide publishing group founded in 1842 in Germany formerly known as Springer-Verlag. ** Springer Nature, a multinationa ...
, 2006.


Sample query in QL

To illustrate the use of QL, consider the well-known rule in
object-oriented programming Object-oriented programming (OOP) is a programming paradigm based on the concept of "objects", which can contain data and code. The data is in the form of fields (often known as attributes or ''properties''), and the code is in the form of pr ...
that public fields should be declared final. To find violations of that rule, we should search for fields that are public but not final. In QL, that requirement is expressed as follows: from Field f where f.hasModifier("public") and not(f.hasModifier("final")) select f.getDeclaringType().getPackage(), f.getDeclaringType(), f Here not only is the offending field f selected, but also the package and type in which its declaration occurs.


SemmleCode integration with development environments

SemmleCode provides a
user interface In the industrial design field of human–computer interaction, a user interface (UI) is the space where interactions between humans and machines occur. The goal of this interaction is to allow effective operation and control of the machine f ...
via the
Eclipse IDE Eclipse is an integrated development environment (IDE) used in computer programming. It contains a base workspace and an extensible plug-in system for customizing the environment. It is the second-most-popular IDE for Java development, and, un ...
to query Java code (both source code and bytecode) as well as XML files, and to edit QL queries. This is however but one application of the technology that underlies it: QL can be used to query any other type of complex data. As part of the fold into the Microsoft/GitHub corporate house, the original
Eclipse An eclipse is an astronomical event that occurs when an astronomical object or spacecraft is temporarily obscured, by passing into the shadow of another body or by having another body pass between it and the viewer. This alignment of three ce ...
-based workflow has been supplanted with a workflow based around Microsoft's
Visual Studio Code Visual Studio Code, also commonly referred to as VS Code, is a source-code editor made by Microsoft with the Electron Framework, for Windows, Linux and macOS. Features include support for debugging, syntax highlighting, intelligent code complet ...
.


See also

*
List of tools for static code analysis This is a list of notable tools for static program analysis (program analysis is a synonym for code analysis). Static code analysis tools Languages Ada * * * * * * * * * * * C, C++ * * * * * * * * * * * * ...
*
.QL .QL (pronounced "dot-cue-el") is an object-oriented query language used to retrieve data from relational database management systems. It is reminiscent of the standard query language SQL and the object-oriented programming language Java. .QL is ...
*
Datalog Datalog is a declarative logic programming language. While it is syntactically a subset of Prolog, Datalog generally uses a bottom-up rather than top-down evaluation model. This difference yields significantly different behavior and properties ...


References


Further reading

* Mark A. Linton. Implementing relational views of programs. In Peter B. Henderson, editor, ''Software Development Environments (SDE)'', pages 132–140, 1984.


External links

* {{DEFAULTSORT:Semmle Companies based in Oxford Software companies of the United Kingdom Software testing tools Java development tools Static program analysis tools