PetaBox is a storage unit from Capricorn Technologies. It was designed by the staff of the Internet Archive and C. R. Saikley to store and process one petabyte (a million gigabytes) of information.


Specifications

* Density: 1.4 petabytes per rack
* Power consumption: 3 kW per petabyte
* No air conditioning; excess heat is instead used to help heat the building
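The density and power figures above can be combined into a per-rack power draw. The following is a minimal sketch, assuming a rack is filled to the quoted density; the function and constant names are illustrative and do not come from any PetaBox software.

```python
# Figures taken from the specification list; names are hypothetical.
DENSITY_PB_PER_RACK = 1.4   # petabytes stored per rack
POWER_KW_PER_PB = 3.0       # kilowatts drawn per petabyte


def rack_power_kw(density_pb_per_rack: float, power_kw_per_pb: float) -> float:
    """Return the power drawn by one rack at the quoted density, in kW."""
    return density_pb_per_rack * power_kw_per_pb


if __name__ == "__main__":
    power = rack_power_kw(DENSITY_PB_PER_RACK, POWER_KW_PER_PB)
    print(f"{power:.1f} kW per rack")
```

Under that assumption, a rack draws roughly 4.2 kW.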


Design history

The PetaBox, custom-designed by Internet Archive staff, was originally created to safely store and process one petabyte (a million gigabytes) of information. The goals and design points were:

* Low power: 6 kW per rack, 60 kW for the entire storage cluster
* High density: 100+ TB per rack
* Local computing to process the data (800 low-end PCs)
* Multi-OS possible, Linux standard
* Colocation friendly
* Shipping container friendly: able to be run in a 20' by 8' by 8' shipping container
* Easy maintenance: one system administrator per petabyte
* Software to automate full mirroring
* Easy to scale
* Inexpensive design
* Inexpensive storage
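The power and density goals imply a cluster size, which a short calculation makes explicit. This is only a sketch: it assumes the cluster consists of identical racks, that each holds exactly the 100 TB lower bound, and that the 800 PCs are spread evenly across racks. None of these assumptions is stated in the design goals, and the function name is hypothetical.

```python
# Design-goal figures from the list; everything derived is an inference.
RACK_POWER_KW = 6        # power budget per rack
CLUSTER_POWER_KW = 60    # power budget for the whole cluster
RACK_CAPACITY_TB = 100   # lower bound ("100+ TB" per rack)
CLUSTER_PCS = 800        # low-end PCs for local computing


def implied_cluster(rack_kw: int, cluster_kw: int, rack_tb: int, total_pcs: int) -> dict:
    """Derive rack count, total capacity and PCs per rack from the goals."""
    racks = cluster_kw // rack_kw          # racks that fit in the power budget
    return {
        "racks": racks,
        "capacity_tb": racks * rack_tb,    # lower bound on cluster capacity
        "pcs_per_rack": total_pcs // racks,
    }


if __name__ == "__main__":
    print(implied_cluster(RACK_POWER_KW, CLUSTER_POWER_KW,
                          RACK_CAPACITY_TB, CLUSTER_PCS))
```

Under these assumptions the goals describe 10 racks holding at least 1,000 TB (one petabyte) in total, with 80 PCs per rack.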


History

The first rack, with a capacity of 100 terabytes, became operational at the European Archive in June 2004. A second, 80-terabyte rack became operational in San Francisco that same year. The Internet Archive then spun off its PetaBox production to the newly formed company Capricorn Technologies. Between 2004 and 2007, Capricorn replicated the Internet Archive's deployment of the PetaBox for major academic institutions, digital preservationists, government agencies, high-performance computing (HPC) and major research sites, medical imaging providers, digital image repositories, storage outsourcing sites, and other enterprises. Capricorn's largest product uses 750-gigabyte disks. In 2007 the Internet Archive data center housed approximately three petabytes of PetaBox storage technology.

As of 2010, the fourth version of the PetaBox was in operation. Its general specifications are:

* 24 disks per 4U-high rack unit
* 10 units per rack
* Running Linux
* 240 disks of 2 TB each per rack

As of December 2021, the PetaBox deployment consists of:

* 4 data centers
* 745 nodes
* 28,000 spinning disks

The Wayback Machine contains 57 petabytes of information, the book, music and video collections contain a further 42 petabytes, and ''Unique Data'' accounts for 99 petabytes. Total storage is given as 212 petabytes.
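The fourth-version rack figures and the December 2021 counts lend themselves to two quick derived numbers: raw capacity per rack and the average number of disks per node. The sketch below assumes raw capacity is simply disk count times disk size, ignoring redundancy and filesystem overhead; the function names are illustrative only.

```python
# Fourth-version rack figures (2010) and fleet counts (December 2021).
DISKS_PER_UNIT = 24
UNITS_PER_RACK = 10
DISK_TB = 2
FLEET_DISKS = 28_000
FLEET_NODES = 745


def raw_rack_capacity_tb(disks_per_unit: int, units_per_rack: int, disk_tb: int) -> int:
    """Raw capacity of one rack in TB, before any redundancy overhead."""
    return disks_per_unit * units_per_rack * disk_tb


def mean_disks_per_node(disks: int, nodes: int) -> float:
    """Average number of spinning disks attached to each node."""
    return disks / nodes


if __name__ == "__main__":
    print(raw_rack_capacity_tb(DISKS_PER_UNIT, UNITS_PER_RACK, DISK_TB), "TB per rack")
    print(f"{mean_disks_per_node(FLEET_DISKS, FLEET_NODES):.1f} disks per node")
```

On these figures a fourth-version rack holds 480 TB of raw disk, and the 2021 fleet averages between 37 and 38 disks per node.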

