Snowflake ID

Snowflake IDs, or snowflakes, are a form of unique identifier used in distributed computing. The format was created by Twitter (now X) and is used for the IDs of tweets. It is popularly believed that every snowflake has a unique structure, which is where the name "snowflake ID" comes from. The format has been adopted by other companies, including Discord and Instagram. The Mastodon social network uses a modified version.


Format

Snowflakes are 64 bits in binary. (Only 63 are used, so that the value fits in a signed integer.) The first 41 bits are a timestamp, representing milliseconds since the chosen epoch. The next 10 bits represent a machine ID, preventing clashes between machines. Twelve more bits represent a per-machine sequence number, to allow creation of multiple snowflakes in the same millisecond. The final number is generally serialized in decimal.

Snowflakes are sortable by time, because the timestamp occupies their most significant bits. Additionally, the time a snowflake was created can be calculated from the snowflake. This can be used to get snowflakes (and their associated objects) that were created before or after a particular date.
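The bit layout described above can be sketched in a few lines of Python. This is a minimal illustration, not any vendor's implementation: the function names `compose`, `decompose`, and `lowest_id_at` are hypothetical, and the default `epoch_ms=0` (the Unix epoch) is an assumption made only to keep the example self-contained.

```python
from datetime import datetime, timedelta, timezone

TIMESTAMP_BITS = 41
MACHINE_BITS = 10
SEQUENCE_BITS = 12

UNIX_EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def compose(ms_since_epoch, machine_id, sequence):
    """Pack the three fields into one integer, timestamp in the high bits."""
    if not 0 <= ms_since_epoch < (1 << TIMESTAMP_BITS):
        raise ValueError("timestamp does not fit in 41 bits")
    if not 0 <= machine_id < (1 << MACHINE_BITS):
        raise ValueError("machine ID does not fit in 10 bits")
    if not 0 <= sequence < (1 << SEQUENCE_BITS):
        raise ValueError("sequence number does not fit in 12 bits")
    return (
        (ms_since_epoch << (MACHINE_BITS + SEQUENCE_BITS))
        | (machine_id << SEQUENCE_BITS)
        | sequence
    )


def decompose(snowflake):
    """Split a snowflake back into (ms_since_epoch, machine_id, sequence)."""
    sequence = snowflake & ((1 << SEQUENCE_BITS) - 1)
    machine_id = (snowflake >> SEQUENCE_BITS) & ((1 << MACHINE_BITS) - 1)
    ms_since_epoch = snowflake >> (MACHINE_BITS + SEQUENCE_BITS)
    return ms_since_epoch, machine_id, sequence


def lowest_id_at(moment, epoch_ms=0):
    """Smallest snowflake that could be created at a timezone-aware datetime.

    epoch_ms is the chosen epoch in Unix milliseconds (assumed 0 here).
    """
    unix_ms = (moment - UNIX_EPOCH) // timedelta(milliseconds=1)
    return compose(unix_ms - epoch_ms, 0, 0)
```

Because the timestamp sits in the most significant bits, comparing two snowflakes as integers compares their creation times first. Under the same assumptions, every snowflake created before a given date is strictly smaller than `lowest_id_at(date)`, which is how a date filter can be turned into an ID range.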


Example

A tweet produced by @Wikipedia in June 2022 illustrates the format. Written in binary, its snowflake ID splits into the three parts of the format:

* The first 41 bits (plus the leading zero bit) hold the timestamp, which converts to decimal as 367597485448. Adding this value to the X epoch of 1288834974657 (in Unix time milliseconds) gives the Unix time of the tweet, 1656432460105: June 28, 2022 16:07:40.105 UTC.
* The middle 10 bits are the machine ID.
* The last 12 bits decode to all zero, meaning this tweet is the first tweet processed by the machine at the given millisecond.
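The timestamp arithmetic in this example can be checked with a short Python sketch. Several values here are assumptions rather than quotations: 1288834974657 is the commonly documented X (Twitter) epoch in Unix milliseconds, 367597485448 is the offset that yields the date stated above, and the machine ID of 1 is made up, so `example_id` is a hypothetical snowflake and not the ID of the actual tweet. The helper name `created_at` is also hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Assumed X (Twitter) epoch, in Unix milliseconds.
X_EPOCH_MS = 1288834974657


def created_at(snowflake, epoch_ms=X_EPOCH_MS):
    """Recover the creation time from the top bits of a snowflake."""
    ms_since_epoch = snowflake >> 22  # drop 10 machine bits + 12 sequence bits
    unix_ms = ms_since_epoch + epoch_ms
    return datetime(1970, 1, 1, tzinfo=timezone.utc) + timedelta(milliseconds=unix_ms)


# Hypothetical ID: timestamp offset from the example, made-up machine ID 1,
# and sequence number 0 (the first ID issued in that millisecond).
example_id = (367597485448 << 22) | (1 << 12) | 0

print(created_at(example_id).isoformat(timespec="milliseconds"))
# prints 2022-06-28T16:07:40.105+00:00
```

Changing the machine ID or sequence number in `example_id` leaves the decoded time unchanged, since only the upper bits carry the timestamp.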


Usage

The format was first announced by Twitter in June 2010. Due to implementation challenges, Twitter waited until later in the year to roll out the update.

* Twitter uses snowflake IDs for tweets, direct messages, users, lists, and all other objects available over the API.
* The 10-bit machine ID field can be further split into sub-fields by a given implementation. For example, the archived version of the original Twitter snowflake library in Scala splits it into a 5-bit datacenter ID and a 5-bit worker ID.
* Discord also uses snowflakes, with their epoch set to 1420070400000 (in Unix time milliseconds), which translates to the first second of the year 2015.
* Instagram uses a modified version of the format, with 41 bits for a timestamp, 13 bits for a shard ID, and 10 bits for a sequence number.
* Mastodon's modified format has 48 bits for a millisecond-level timestamp, as it uses the UNIX epoch. The remaining 16 bits are for sequence data.
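To make the datacenter/worker split and the per-millisecond sequence number concrete, here is a minimal generator sketch in Python. It is an illustration under stated assumptions, not a port of the Scala library: the class name `SnowflakeGenerator`, its parameters, and the injectable `clock` callable are all hypothetical, and the sketch is not thread-safe.

```python
import time


class SnowflakeGenerator:
    """Hypothetical generator: 41-bit timestamp, 5-bit datacenter ID,
    5-bit worker ID, 12-bit sequence number."""

    def __init__(self, datacenter_id, worker_id, epoch_ms, clock=None):
        if not 0 <= datacenter_id < 32 or not 0 <= worker_id < 32:
            raise ValueError("datacenter_id and worker_id must each fit in 5 bits")
        # The two 5-bit sub-fields together form the 10-bit machine ID.
        self.machine_id = (datacenter_id << 5) | worker_id
        self.epoch_ms = epoch_ms
        # clock returns the current Unix time in milliseconds.
        self.clock = clock or (lambda: time.time_ns() // 1_000_000)
        self.last_ms = -1
        self.sequence = 0

    def next_id(self):
        now = self.clock()
        if now < self.last_ms:
            raise RuntimeError("clock moved backwards; refusing to issue IDs")
        if now == self.last_ms:
            # Same millisecond: bump the sequence, wrapping at 4096.
            self.sequence = (self.sequence + 1) & 0xFFF
            if self.sequence == 0:
                # All 4096 values used; spin until the next millisecond.
                while now <= self.last_ms:
                    now = self.clock()
        else:
            self.sequence = 0
        self.last_ms = now
        return ((now - self.epoch_ms) << 22) | (self.machine_id << 12) | self.sequence
```

A generator for datacenter 3, worker 7, using midnight UTC on 1 January 2015 as its epoch, could be created with `SnowflakeGenerator(3, 7, epoch_ms=1420070400000)`. Passing a custom `clock` is a design choice made here so that the sequence behaviour can be tested deterministically.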


See also

* Universally unique identifier


External links

* Twitter's reference implementation on GitHub: https://github.com/twitter-archive/snowflake/tree/b3f6a3c6ca8e1b6847baa6ff42bf72201e2c2231
* Snowflake ID generator tool