Snowflake ID

Snowflake IDs, or snowflakes, are a form of unique identifier used in distributed computing. The format was created by Twitter and is used for the IDs of tweets. It has been adopted by other companies, including Discord and Instagram. The Mastodon social network uses a modified version.


Format

Snowflakes are 64 bits in binary. (Only 63 are used, so that the value fits in a signed integer.) The first 41 bits are a timestamp, representing milliseconds since the chosen epoch. The next 10 bits represent a machine ID, which prevents clashes between IDs generated on different machines. Twelve more bits represent a per-machine sequence number, which allows multiple snowflakes to be created in the same millisecond. The final number is generally serialized in decimal.

Snowflakes are sortable by time, because the timestamp occupies their most significant bits. The time a snowflake was created can also be calculated from the snowflake itself. This can be used to get snowflakes (and their associated objects) that were created before or after a particular date.
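The layout described above can be sketched in a few lines of Python. This is a minimal illustration under the 41/10/12 bit split given in the text, not Twitter's implementation; the function names `make_snowflake`, `split_snowflake`, and `first_snowflake_at` are hypothetical.

```python
from typing import Tuple

# Field widths as described in the text (41 + 10 + 12 = 63 bits).
TIMESTAMP_BITS = 41
MACHINE_BITS = 10
SEQUENCE_BITS = 12

MAX_MACHINE = (1 << MACHINE_BITS) - 1      # 1023
MAX_SEQUENCE = (1 << SEQUENCE_BITS) - 1    # 4095


def make_snowflake(ms_since_epoch: int, machine_id: int, sequence: int) -> int:
    """Pack the three fields into one integer; the top (64th) bit stays zero."""
    if not 0 <= ms_since_epoch < (1 << TIMESTAMP_BITS):
        raise ValueError("timestamp does not fit in 41 bits")
    if not 0 <= machine_id <= MAX_MACHINE:
        raise ValueError("machine ID does not fit in 10 bits")
    if not 0 <= sequence <= MAX_SEQUENCE:
        raise ValueError("sequence number does not fit in 12 bits")
    return (
        (ms_since_epoch << (MACHINE_BITS + SEQUENCE_BITS))
        | (machine_id << SEQUENCE_BITS)
        | sequence
    )


def split_snowflake(snowflake: int) -> Tuple[int, int, int]:
    """Return (ms_since_epoch, machine_id, sequence) for a snowflake."""
    sequence = snowflake & MAX_SEQUENCE
    machine_id = (snowflake >> SEQUENCE_BITS) & MAX_MACHINE
    ms_since_epoch = snowflake >> (MACHINE_BITS + SEQUENCE_BITS)
    return ms_since_epoch, machine_id, sequence


def first_snowflake_at(ms_since_epoch: int) -> int:
    """Smallest snowflake that can exist for a given millisecond.

    Useful as a bound when selecting IDs created before or after a date.
    """
    return make_snowflake(ms_since_epoch, 0, 0)
```

Because the timestamp sits in the high bits, comparing two snowflakes as plain integers compares their creation times first, and `first_snowflake_at` gives a cut-off value for range queries by date.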


Example

A tweet produced by @Wikipedia in June 2022 illustrates the layout. Written in binary, its snowflake ID divides into three parts:

* The first 41 bits (plus one leading zero bit) are the timestamp. Adding their decimal value to the Twitter epoch, expressed in Unix time milliseconds, gives the Unix time of the tweet: June 28, 2022 16:07:40.105 UTC.
* The middle 10 bits are the machine ID.
* The last 12 bits decode to all zero, meaning this tweet is the first tweet processed by the machine at the given millisecond.
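The decoding steps above can be reproduced with a short Python sketch. Two values here are assumptions that do not appear in the text: the Twitter epoch of 1288834974657 ms (the constant used in Twitter's reference implementation) and the machine ID, which is a made-up placeholder. The ID built below is therefore an illustration for the quoted instant, not the ID of the actual tweet.

```python
from datetime import datetime, timedelta, timezone

# Assumption: epoch constant from Twitter's reference implementation (Unix ms).
TWITTER_EPOCH_MS = 1288834974657
UNIX_EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def snowflake_to_datetime(snowflake: int, epoch_ms: int = TWITTER_EPOCH_MS) -> datetime:
    """Recover the creation time from the timestamp bits of a snowflake."""
    ms_since_epoch = snowflake >> 22          # drop machine ID (10) + sequence (12)
    unix_ms = ms_since_epoch + epoch_ms       # shift from custom epoch to Unix time
    return UNIX_EPOCH + timedelta(milliseconds=unix_ms)


# The instant quoted in the example, converted to Unix milliseconds.
created = datetime(2022, 6, 28, 16, 7, 40, 105000, tzinfo=timezone.utc)
unix_ms = (created - UNIX_EPOCH) // timedelta(milliseconds=1)

# Hypothetical machine ID, chosen only for illustration.
HYPOTHETICAL_MACHINE_ID = 1

# Sequence number 0: first ID issued by that machine in that millisecond.
example_id = ((unix_ms - TWITTER_EPOCH_MS) << 22) | (HYPOTHETICAL_MACHINE_ID << 12) | 0

print(snowflake_to_datetime(example_id))
```

Decoding only needs the right shift and the epoch offset; the machine ID and sequence bits do not affect the recovered time.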


Usage

The format was first announced by Twitter in June 2010. Due to implementation challenges, they waited until later in the year to roll out the update. Twitter uses snowflake IDs for tweets, direct messages, users, lists, and all other objects available over the API. Discord also uses snowflakes, with their epoch set to the first second of the year 2015. Instagram uses a modified version of the format, with 41 bits for a timestamp, 13 bits for a shard ID, and 10 bits for a sequence number. Mastodon's modified format has 48 bits for a millisecond-level timestamp, measured from the UNIX epoch. The remaining 16 bits are for sequence data.
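The variants listed above differ only in field widths and epoch, so one generic helper can decode them all. The sketch below is illustrative: `unpack` and the layout constants are hypothetical names, the field names are labels chosen here, and the Discord epoch is computed on the assumption that "the first second of the year 2015" means midnight UTC. Discord's own bit layout is not described in the text, so only its epoch is shown.

```python
from datetime import datetime, timezone
from typing import Dict, List, Tuple

# (name, width) pairs, listed from most significant to least significant field.
Layout = List[Tuple[str, int]]

TWITTER_LAYOUT: Layout = [("timestamp", 41), ("machine", 10), ("sequence", 12)]
INSTAGRAM_LAYOUT: Layout = [("timestamp", 41), ("shard", 13), ("sequence", 10)]
MASTODON_LAYOUT: Layout = [("timestamp", 48), ("sequence", 16)]

# Assumption: Discord's epoch is 2015-01-01 00:00:00 UTC, in Unix milliseconds.
DISCORD_EPOCH_MS = int(datetime(2015, 1, 1, tzinfo=timezone.utc).timestamp() * 1000)


def unpack(value: int, layout: Layout) -> Dict[str, int]:
    """Split an ID into named fields according to a layout."""
    fields: Dict[str, int] = {}
    # Walk from the least significant field upward, peeling bits off the low end.
    for name, width in reversed(layout):
        fields[name] = value & ((1 << width) - 1)
        value >>= width
    return fields
```

For example, `unpack(some_id, INSTAGRAM_LAYOUT)["shard"]` would return the 13-bit shard ID, and adding an epoch offset to the `timestamp` field converts it to Unix time in the same way as for Twitter's format.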


See also

* Universally unique identifier


External links

* Twitter's reference implementation on GitHub: https://github.com/twitter-archive/snowflake/tree/b3f6a3c6ca8e1b6847baa6ff42bf72201e2c2231