Cap'n Proto

Cap'n Proto is a data serialization format and remote procedure call (RPC) framework for exchanging data between computer programs. The high-level design focuses on speed and security, making it suitable for network as well as inter-process communication. Cap'n Proto was created by the former maintainer of Google's popular Protocol Buffers framework (Kenton Varda) and was designed to avoid some of its perceived shortcomings.


Technical overview


IDL Schema

Like most RPC frameworks dating as far back as Sun RPC and OSF DCE RPC (and their object-based descendants CORBA and DCOM), Cap'n Proto uses an interface description language (IDL) to generate RPC libraries in a variety of programming languages, automating many low-level details such as handling network requests and converting between data types. The Cap'n Proto interface schema uses a C-like syntax and supports common primitive data types (booleans, integers, floats, etc.), compound types (structs, lists, enums), as well as generics and dynamic types. Cap'n Proto also supports object-oriented features such as multiple inheritance, which has been criticized for its complexity.

    @0xa558ef006c0c123;  # Unique identifiers are manually or automatically assigned to files and compound types

    struct Date @0x5c5a558ef006c0c1
    struct Contact @0xf032a54bcb3667e0

Values in Cap'n Proto messages are represented in binary, as opposed to the text encoding used by "human-readable" formats such as JSON or XML. Cap'n Proto tries to make the storage/network protocol appropriate as an in-memory format, so that no translation step is needed when reading data into memory or writing data out of memory. (Unlike Apache Arrow, Cap'n Proto's in-memory values are not suited for sharing mutable data.) For example, the representation of numbers (endianness) was chosen to match the representation used by the most popular CPU architectures. When the in-memory and wire-protocol representations match, Cap'n Proto can avoid copying and encoding data when creating or reading a message and instead point to the location of the value in memory. Cap'n Proto also supports random access to data, meaning that any field can be read without having to read the entire message.

Unlike other binary serialization protocols such as XMI, Cap'n Proto considers fine-grained data validation at the RPC level an anti-feature that limits a protocol's ability to evolve. This was informed by experiences at Google where simply changing a field from "mandatory" to "optional" would cause complex operational failures. (Marking a field as required was removed from Protocol Buffers 3.)

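
The evolution problem with protocol-level mandatory fields can be illustrated with a small sketch. It is a toy model, not code from Cap'n Proto or Protocol Buffers: plain dictionaries stand in for decoded messages, and every name (`strict_decode`, `lenient_decode`, `email_is_usable`, the `name`/`email` fields) is hypothetical.

```python
# Hypothetical illustration only: plain dicts stand in for decoded messages;
# none of these names come from Cap'n Proto or Protocol Buffers.

REQUIRED_V1 = {"name", "email"}          # schema v1: both fields are mandatory
DEFAULTS_V2 = {"name": "", "email": ""}  # schema v2: every field has a default


def strict_decode(message: dict) -> dict:
    """Old reader: rejects any message that lacks a v1-mandatory field."""
    missing = REQUIRED_V1 - message.keys()
    if missing:
        raise ValueError(f"missing required fields: {sorted(missing)}")
    return dict(message)


def lenient_decode(message: dict) -> dict:
    """Reader with no protocol-level 'required': absent fields get defaults."""
    return {**DEFAULTS_V2, **message}


def email_is_usable(contact: dict) -> bool:
    """Application-level validation, applied only where the value is needed."""
    return "@" in contact["email"]
```

If a newer writer starts omitting `email`, any component still running `strict_decode` rejects the whole message, even one that only forwards it. With `lenient_decode`, the message passes through and only the code that actually needs the address has to decide what to do.
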
Cap'n Proto schemas are designed to be as flexible as possible and push data validation to the application level, allowing arbitrary renaming of fields, adding new fields, and making concrete types generic. Cap'n Proto does, however, validate pointer bounds and type check individual values when they are first accessed. Enforcing complex schema constraints would also incur significant overhead, negating the benefits of reusing in-memory data structures and preventing random access to data. (Assuming the data has already been allocated, e.g. in network buffers or read from disk, access becomes O(1); additional serialization/deserialization steps, as required to inspect values, would limit performance to O(n).)

The Cap'n Proto protocol is theoretically suitable for very fast inter-process communication (IPC) via immutable shared memory, but as of October 2020 none of the implementations support data passing via shared memory. However, Cap'n Proto is still generally considered faster than Protocol Buffers and similar RPC libraries.
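
The idea of reading fields in place, with bounds checked only at the moment of access, can be sketched in a few lines of Python. The byte layout below is an assumption made up for illustration and is not the actual Cap'n Proto encoding; `DateReader`, `nth_date` and `RECORD_SIZE` are hypothetical names.

```python
import struct

# Hypothetical fixed layout for a "Date" record (NOT the real Cap'n Proto
# encoding): little-endian int16 year at byte 0, uint8 month at byte 2,
# uint8 day at byte 3.
YEAR = struct.Struct("<h")
RECORD_SIZE = 4


class DateReader:
    """Reads fields in place from a received buffer; nothing is parsed up front."""

    def __init__(self, buffer, offset=0):
        self._view = memoryview(buffer)  # a view of the bytes, not a copy
        self._offset = offset

    def _check(self, start, size):
        # Bounds are validated lazily, at the moment a field is accessed.
        if start < 0 or start + size > len(self._view):
            raise IndexError("field lies outside the message buffer")

    @property
    def year(self):
        self._check(self._offset, YEAR.size)
        return YEAR.unpack_from(self._view, self._offset)[0]

    @property
    def month(self):
        self._check(self._offset + 2, 1)
        return self._view[self._offset + 2]

    @property
    def day(self):
        self._check(self._offset + 3, 1)
        return self._view[self._offset + 3]


def nth_date(buffer, index):
    """Random access: jump straight to record `index` without reading earlier ones."""
    return DateReader(buffer, index * RECORD_SIZE)
```

Each property reads from a fixed offset, so the cost of reading one field does not depend on the size of the message. A reader pointing past the end of the buffer can be constructed, but it fails as soon as a field is touched, which mirrors the validate-on-first-access behaviour described above.
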


Networking

Cap'n Proto RPC is network-aware, supporting both the handling of disconnects and promise pipelining, in which a server pipes the output of one function into another function. This saves a client a round trip per successive call to the server without having to provide a dedicated API for every possible call graph. Cap'n Proto can be layered on top of TLS, and support for the Noise Protocol Framework is on the roadmap. Cap'n Proto RPC is transport agnostic, with the mainline implementation supporting WebSockets, HTTP, TCP, and UDP.
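
The round-trip saving from promise pipelining can be seen in a simplified, in-process simulation. This is a sketch of the general idea rather than the Cap'n Proto RPC API: `FakeServer`, `ResultOf` and the `double`/`increment` methods are all hypothetical, and a list of calls in one request stands in for calls addressed to a not-yet-resolved result.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ResultOf:
    """Placeholder meaning 'the result of call number `index` in this request'."""
    index: int


class FakeServer:
    """In-process stand-in for a remote server that counts round trips."""

    def __init__(self):
        self.round_trips = 0
        self._methods = {"double": lambda x: 2 * x, "increment": lambda x: x + 1}

    def send(self, calls):
        """One request/response exchange carrying a list of (method, argument) calls."""
        self.round_trips += 1
        results = []
        for method, arg in calls:
            if isinstance(arg, ResultOf):
                # The dependency is resolved on the server, with no extra trip.
                arg = results[arg.index]
            results.append(self._methods[method](arg))
        return results


def without_pipelining(server, x):
    # The client must wait for the first result before it can send the second call.
    (doubled,) = server.send([("double", x)])
    (answer,) = server.send([("increment", doubled)])
    return answer


def with_pipelining(server, x):
    # The second call refers to the first call's future result.
    _, answer = server.send([("double", x), ("increment", ResultOf(0))])
    return answer
```

Both functions compute the same value, but the first costs two exchanges with the server and the second costs one. The server did not need a dedicated "double then increment" method for this to work, which is the point made above about not providing an API for every call graph.
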


Capability security

The Cap'n Proto RPC standard has a rich capability security model based on the CapTP protocol used by the E programming language. As of October 2020, the reference implementation only supports level 2.
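
The general object-capability pattern behind such a model can be sketched as follows: holding a reference to an object is what grants the authority to use it, and a holder can hand out a narrower reference. This is an illustration of the pattern only, not of CapTP or the Cap'n Proto API; `Counter`, `ReadOnlyCounter` and `untrusted_reporter` are hypothetical, and Python itself does not enforce the encapsulation that a real capability system relies on.

```python
class Counter:
    """A resource. Whoever holds a reference to it may read and increment it."""

    def __init__(self):
        self._value = 0

    def read(self):
        return self._value

    def increment(self):
        self._value += 1


class ReadOnlyCounter:
    """An attenuated capability that exposes `read` but not `increment`."""

    def __init__(self, counter):
        self._read = counter.read

    def read(self):
        return self._read()


def untrusted_reporter(counter_cap):
    """Can only use the capability it was handed; it has no other way to find the counter."""
    return f"count={counter_cap.read()}"
```

The owner of a `Counter` can pass `ReadOnlyCounter(counter)` to code it does not fully trust. Access control then follows from which references are shared, rather than from a separate permission table consulted on every call.
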


Adoption

Cap'n Proto was originally created for Sandstorm.io, a startup offering a web application hosting platform with capability-based security. After Sandstorm.io failed commercially, the development team was acqui-hired by Cloudflare, which uses Cap'n Proto internally.

