System Package Data Exchange (SPDX, formerly Software Package Data Exchange) is an open standard capable of representing systems with digital components as bills of materials (BOMs).
First designed to describe software components, SPDX can describe the components of software systems, AI models, software builds, security data, and other data packages. SPDX allows the expression of components, licenses, copyrights, security references and other metadata relating to systems.
The original purpose of SPDX was to improve license compliance,
and it has since been expanded to facilitate additional use cases such as supply-chain transparency and security.
SPDX is authored by the community-driven SPDX Project, which involves key industry experts, organizations, and open-source enthusiasts, under the auspices of the Linux Foundation.
The SPDX specification is recognized as the international open standard for security, license compliance, and other software supply chain artifacts as ISO/IEC 5962:2021. The current version of the standard is 3.0.
Structure
Version 2.x
The SPDX 2.x standard defines an SBOM document, which contains SPDX metadata about software. The document itself can be expressed in multiple formats, including JSON, YAML, RDF/XML, tag–value, and spreadsheet. Each SPDX document describes one or more elements, which can be a software package, a specific file, or a snippet from a file. Each element is given a unique identifier, and metadata for an element can refer to other elements.
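As an illustration of the document, element, and identifier structure described above, the following sketch builds a minimal SPDX 2.x document in the JSON format. The field names reflect the SPDX 2.3 JSON serialization as commonly seen in practice; the package name, namespace URL, and tool name are hypothetical, and the result has not been checked against the official schema.

```python
import json


def build_minimal_document():
    """Return a small SPDX 2.x style document as a plain dictionary."""
    package_id = "SPDXRef-Package-example"  # unique identifier of the element
    return {
        "spdxVersion": "SPDX-2.3",
        "dataLicense": "CC0-1.0",
        "SPDXID": "SPDXRef-DOCUMENT",
        "name": "example-sbom",
        # Hypothetical namespace; real documents use a unique URI.
        "documentNamespace": "https://example.org/spdxdocs/example-sbom-1",
        "creationInfo": {
            "created": "2024-01-01T00:00:00Z",
            "creators": ["Tool: hypothetical-generator-0.1"],
        },
        # One element: a software package with its own identifier.
        "packages": [
            {
                "name": "example",
                "SPDXID": package_id,
                "versionInfo": "1.0.0",
                "downloadLocation": "NOASSERTION",
                "licenseConcluded": "MIT",
            }
        ],
        # Metadata for one element refers to another element by identifier.
        "relationships": [
            {
                "spdxElementId": "SPDXRef-DOCUMENT",
                "relationshipType": "DESCRIBES",
                "relatedSpdxElement": package_id,
            }
        ],
    }


document = build_minimal_document()
serialized = json.dumps(document, indent=2)
```

The same information could equally be written in one of the other formats listed above; JSON is used here only because it maps directly onto Python dictionaries.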
Version 3.0
SPDX 3.0 allows users to communicate information at a much more granular level without having to package it as "envelope" data. A key design principle in SPDX 3.0 is that all elements may be expressed and referenced independent of any other element. This independence is required to support a variety of content exchange and analysis use cases and makes it easier to communicate single elements of interest. The relationship structure has also been updated to be both more expressive and easier to understand compared to older versions of the spec.
The SPDX 3.0 data model is based on the Resource Description Framework (RDF). Data may be serialized in a variety of formats for storage and transmission, including formats defined in RDF 1.1 such as JSON-LD, Turtle (Terse RDF Triple Language), N-Triples, and RDF/XML.
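The idea that every element stands on its own and is linked to others only through relationships can be sketched with plain Python dictionaries. This is a simplified illustration, not a valid SPDX 3.0 serialization: the key names (`type`, `spdxId`, `from`, `to`, `relationshipType`) are modeled on the JSON-LD form as an assumption, the identifiers are hypothetical, and required properties such as creation information are omitted.

```python
# Hypothetical identifiers; each element is addressable on its own.
APP_ID = "https://example.org/spdx/app"
FILE_ID = "https://example.org/spdx/main-c"

ELEMENTS = [
    {"type": "software_Package", "spdxId": APP_ID, "name": "app"},
    {"type": "software_File", "spdxId": FILE_ID, "name": "main.c"},
    # The relationship is itself an element that points at the other two.
    {
        "type": "Relationship",
        "spdxId": "https://example.org/spdx/rel-1",
        "from": APP_ID,
        "relationshipType": "contains",
        "to": [FILE_ID],
    },
]


def targets_of(elements, source_id, relationship_type):
    """Collect the identifiers that source_id points to via relationship_type."""
    found = []
    for element in elements:
        if (
            element.get("type") == "Relationship"
            and element.get("from") == source_id
            and element.get("relationshipType") == relationship_type
        ):
            found.extend(element.get("to", []))
    return found


contained = targets_of(ELEMENTS, APP_ID, "contains")
```

Because no element is nested inside another, a single element (for example only the file entry) could be exchanged without the rest of the list.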
SPDX 3.0 Profiles
The 3.0 specification introduced profiles to support the expansion of use cases beyond software, without increasing overall complexity. Profiles allow users to define data for the use cases they need, while also increasing the amount of information that can be gathered directly from the SPDX data. There are eight profiles defined by SPDX 3.0:
* Core: foundational concepts common to all profiles
* Software: concepts related to software artifacts
* Security: security-related metadata specific to a piece of software
* Build: information required to describe an instance of a software build
* AI: concepts and data elements related to an AI system and model
* Dataset: concepts related to a dataset, including preparation process, characteristics, and access methods
* Licensing: license information necessary for compliance with typical licensing use cases
* Lite: a subset of the SPDX specification, aimed at balancing the requirements of the standard against actual workflows in some industries
Version history
The first version of the SPDX specification was intended to make compliance with software licenses easier, but subsequent versions of the specification added capabilities intended for other use cases, such as being able to contain references to known software vulnerabilities.
Recent versions of SPDX fulfill the
NTIA's 'Minimum Elements For a Software Bill of Materials'.
SPDX 2.2.1 was submitted to the International Organization for Standardization (ISO) in October 2020, and was published as "ISO/IEC 5962:2021 Information technology — SPDX® Specification V2.2.1" in August 2021.
SPDX-License-Identifier
Syntax
Each license is identified by a full name, such as "Mozilla Public License 2.0", and a short identifier, here "MPL-2.0".
Licenses can be combined using the operators AND and OR, and grouped using parentheses ( and ).
For example, (Apache-2.0 OR MIT) means that one can choose between Apache-2.0 (Apache License) or MIT (MIT license). On the other hand, (Apache-2.0 AND MIT) means that both licenses apply.
There is also a "+" operator which, when applied to a license, means that future versions of the license apply as well. For example, Apache-1.1+ means that Apache-1.1 and Apache-2.0 may apply (and future versions, if any).
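The operators described above can be illustrated with a small recursive-descent parser. This is a minimal sketch, not the reference implementation: it handles only AND, OR, parentheses, and the "+" suffix, it does not check identifiers against the SPDX License List, and it assumes that AND binds more tightly than OR. The function names and the tuple-based tree format are hypothetical.

```python
import re

# A token is a parenthesis, or an identifier/operator with an optional "+".
_TOKEN = re.compile(r"\s*(\(|\)|[A-Za-z0-9.\-]+\+?)")


def tokenize(expression):
    """Split a license expression into a list of tokens."""
    expression = expression.strip()
    tokens, pos = [], 0
    while pos < len(expression):
        match = _TOKEN.match(expression, pos)
        if match is None:
            raise ValueError(f"unexpected character at position {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


def parse(expression):
    """Parse an expression into nested tuples such as ("OR", left, right)."""
    tokens = tokenize(expression)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        token = peek()
        pos += 1
        return token

    def parse_or():
        node = parse_and()
        while peek() == "OR":
            advance()
            node = ("OR", node, parse_and())
        return node

    def parse_and():  # assumed to bind more tightly than OR
        node = parse_atom()
        while peek() == "AND":
            advance()
            node = ("AND", node, parse_atom())
        return node

    def parse_atom():
        token = advance()
        if token == "(":
            node = parse_or()
            if advance() != ")":
                raise ValueError("missing closing parenthesis")
            return node
        if token is None or token in (")", "AND", "OR"):
            raise ValueError(f"expected a license identifier, got {token!r}")
        if token.endswith("+"):
            return ("OR_LATER", token[:-1])
        return ("LICENSE", token)

    tree = parse_or()
    if peek() is not None:
        raise ValueError(f"unexpected token {peek()!r}")
    return tree
```

For instance, `parse("(Apache-2.0 OR MIT)")` yields an `("OR", ...)` node with two `("LICENSE", ...)` children, while `parse("Apache-1.1+")` yields `("OR_LATER", "Apache-1.1")`.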
SPDX describes the exact terms under which a piece of software is licensed. It does not attempt to categorize licenses by type, for instance by describing licenses with similar terms to the BSD License as "BSD-like".
In 2020, the European Commission published its Joinup Licensing Assistant, which allows users to select and compare more than 50 licenses, with access to their SPDX identifiers and full texts.
Deprecated license identifiers
The GNU family of licenses (e.g., GNU General Public License version 2) has a built-in option to choose a later version of the license. As a result, it was sometimes unclear whether the SPDX expression GPL-2.0 meant "exactly GPL version 2.0" or "GPL version 2.0 or any later version". Since version 3.0 of the SPDX License List, the GNU family of licenses has therefore had new identifiers: GPL-2.0-only means "exactly version 2.0" and GPL-2.0-or-later means "version 2.0 or any later version".
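The renaming can be sketched as a small rewriting function. This is an illustrative helper, not an official tool: the function name is hypothetical, it only recognizes bare GPL, LGPL, and AGPL version identifiers, and it assumes that a legacy identifier without "+" should be read as "only", which is precisely the ambiguity the new names were introduced to remove. Legacy identifiers in real projects may need human review.

```python
import re

# Matches bare legacy identifiers such as "GPL-2.0" or "LGPL-2.1+".
_BARE_GNU = re.compile(r"^((?:A|L)?GPL-\d+\.\d+)(\+?)$")


def modernize(identifier):
    """Rewrite a legacy GNU identifier to its -only / -or-later form."""
    match = _BARE_GNU.match(identifier)
    if match is None:
        return identifier  # not a bare GNU identifier; leave untouched
    base, plus = match.groups()
    # Assumption: no "+" is interpreted as "exactly this version".
    return base + ("-or-later" if plus else "-only")
```

Identifiers that already use the new form, and identifiers of other license families, pass through unchanged.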
Adoption
For licensing
The SPDX license identifier can be added to the top of source code files as a short string unambiguously declaring the license used. The syntax, pioneered by Das U-Boot in 2013, became part of SPDX in version 2.1. In 2017, the FSFE launched REUSE, which provides tools to validate the comment and to efficiently extract copyright information.
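A header comment of this kind typically has the form `SPDX-License-Identifier: <expression>`. The following sketch shows one way such a tag could be extracted from the first lines of a file. It is not how REUSE or any other tool is implemented; the function name, the 20-line limit, and the handling of comment terminators are assumptions made for illustration.

```python
import re

_TAG = re.compile(r"SPDX-License-Identifier:\s*(?P<expr>[^\n]*)")
# Comment terminators that may follow the expression on the same line.
_TRAILERS = ("*/", "-->")


def find_license_identifier(text, max_lines=20):
    """Return the license expression declared near the top of text, or None."""
    for line in text.splitlines()[:max_lines]:
        match = _TAG.search(line)
        if match is None:
            continue
        expression = match.group("expr").strip()
        for trailer in _TRAILERS:
            if expression.endswith(trailer):
                expression = expression[: -len(trailer)].strip()
        return expression
    return None


example_source = "// SPDX-License-Identifier: GPL-2.0-or-later\nint main(void) { return 0; }\n"
declared = find_license_identifier(example_source)
```

Here `declared` holds the expression from the comment, which could then be passed to a license-expression parser for further checks.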
The SPDX license identifier is also used in a number of package managers such as npm, Python, and Rust's Cargo. SPDX license expressions are used in RPM package metadata in Fedora Linux, replacing the earlier use of the Callaway system. Debian uses a slightly different license specification.
See also
* License proliferation
* Rights Expression Language
External links
* Nathan Willis, "A SPDX case study", LWN.net