MultiPar

Parchive (a portmanteau of "parity archive") is an erasure code system that produces par files for checksum verification of data integrity, with the capability to perform data recovery operations that can repair or regenerate corrupted or missing data. Parchive was originally written to solve the problem of reliable file sharing on Usenet, but it can be used to protect any kind of data from data corruption, disc rot, bit rot, and accidental or malicious damage. Despite the name, Parchive uses more advanced techniques (specifically error correction codes) than simplistic parity methods of error detection.

As of 2014, PAR1 is obsolete, PAR2 is mature for widespread use, and PAR3 is a discontinued experimental version developed by MultiPar author Yutaka Sawada. The original SourceForge Parchive project has been inactive since April 30, 2015. PAR2 specification author Michael Nahas has been working on a new PAR3 specification since April 28, 2019. An alpha version of the PAR3 specification was published on January 29, 2022, while the program itself is still being developed.


History

Parchive was intended to increase the reliability of transferring files via Usenet newsgroups. Usenet was originally designed for informal conversations, and the underlying protocol, NNTP, was not designed to transmit arbitrary binary data. Another limitation, which was acceptable for conversations but not for files, was that messages were normally fairly short and limited to 7-bit ASCII text.

Various techniques were devised to send files over Usenet, such as uuencoding and Base64. Later Usenet software allowed 8-bit Extended ASCII, which permitted new techniques like yEnc. Large files were broken up to reduce the effect of a corrupted download, but the unreliable nature of Usenet remained.

With the introduction of Parchive, parity files could be created and uploaded along with the original data files. If any of the data files were damaged or lost while being propagated between Usenet servers, users could download parity files and use them to reconstruct the damaged or missing files.

Parchive included the construction of small index files (*.par in version 1 and *.par2 in version 2) that do not contain any recovery data. These indexes contain file hashes that can be used to quickly identify the target files and verify their integrity. Because the index files were so small, they minimized the amount of extra data that had to be downloaded from Usenet to verify that the data files were all present and undamaged, or to determine how many parity volumes were required to repair any damage or reconstruct any missing files. They were most useful in version 1, where the parity volumes were much larger than the short index files. These larger parity volumes contain the actual recovery data along with a duplicate copy of the information in the index files, which allows them to be used on their own to verify the integrity of the data files if no small index file is available.

In July 2001, Tobias Rieper and Stefan Wehlus proposed the Parity Volume Set specification, and with the assistance of other project members, version 1.0 of the specification was published in October 2001. Par1 used Reed–Solomon error correction to create new recovery files. Any of the recovery files can be used to rebuild a missing file from an incomplete download.

Version 1 became widely used on Usenet, but it suffered from some limitations:
* It was restricted to handling at most 255 files.
* The recovery files had to be the size of the largest input file, so it did not work well when the input files were of various sizes. (This limited its usefulness when not paired with the proprietary RAR compression tool.)
* The recovery algorithm had a bug, due to a flaw in the academic paper on which it was based.
* It was strongly tied to Usenet, and it was felt that a more general tool might have a wider audience.

In January 2002, Howard Fukada proposed that a new Par2 specification should be devised with two significant changes: data verification and repair should work on blocks of data rather than whole files, and the algorithm should switch to 16-bit numbers rather than the 8-bit numbers that PAR1 used. Michael Nahas and Peter Clements took up these ideas in July 2002, with additional input from Paul Nettle and Ryan Gallagher (who both wrote Par1 clients). Version 2.0 of the Parchive specification was published by Michael Nahas in September 2002.

Peter Clements then went on to write the first two Par2 implementations, QuickPar and par2cmdline. After par2cmdline was abandoned in 2004, Paul Houle created phpar2 to supersede it. Yutaka Sawada created MultiPar to supersede QuickPar. MultiPar uses par2j.exe (which is partially based on par2cmdline's optimization techniques) as its backend engine.
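The role of the small index files described above can be illustrated with a minimal sketch. It is not the PAR file format: the index here is a plain hypothetical dictionary of file names and MD5 digests, and the function names `build_index` and `check_against_index` are invented for this example. It only shows how a small set of hashes lets a client decide which files are intact, damaged, or missing before fetching any recovery data.

```python
import hashlib
from pathlib import Path


def file_md5(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_index(paths):
    """Hypothetical index: map each file name to its MD5 digest."""
    return {Path(p).name: file_md5(Path(p)) for p in paths}


def check_against_index(index, directory):
    """Classify each indexed file as 'ok', 'damaged', or 'missing'."""
    report = {}
    for name, expected in index.items():
        candidate = Path(directory) / name
        if not candidate.exists():
            report[name] = "missing"
        elif file_md5(candidate) == expected:
            report[name] = "ok"
        else:
            report[name] = "damaged"
    return report
```

The number of entries reported as damaged or missing is what tells a user how much recovery data to download.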


Versions

Versions 1 and 2 of the file format are incompatible. (However, many clients support both.)


Par1

For Par1, given the files f1, f2, ..., fn, the Parchive consists of an index file (f.par), which is a CRC-type file with no recovery blocks, and a number of "parity volumes" (f.p01, f.p02, etc.). If one original file is missing (for example, f2), it can be recreated from all of the other original files and any one of the parity volumes. Likewise, two missing files can be recreated from any two of the parity volumes, and so forth. Par1 supports up to a total of 256 source and recovery files.
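The single-missing-file case can be sketched with plain XOR parity. This is a deliberate simplification and not what Par1 does: Par1 uses Reed–Solomon coding, which is what allows several parity volumes to recover several missing files, whereas one XOR parity volume can recover only one. The helper names below are hypothetical. The sketch also shows why a whole-file scheme needs a parity volume as large as the largest input file.

```python
def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def pad(data: bytes, size: int) -> bytes:
    """Right-pad data with zero bytes up to the given size."""
    return data.ljust(size, b"\x00")


def make_parity(files):
    """Build one parity volume the size of the largest input file."""
    size = max(len(data) for data in files)
    parity = bytes(size)
    for data in files:
        parity = xor_bytes(parity, pad(data, size))
    return parity


def recover_missing(surviving, parity, original_length):
    """Rebuild the single missing file from the survivors and the parity."""
    rebuilt = parity
    for data in surviving:
        rebuilt = xor_bytes(rebuilt, pad(data, len(parity)))
    # Padding was added to reach the parity size, so trim it again.
    return rebuilt[:original_length]
```

For example, with three files where the second is lost, `recover_missing([f1, f3], make_parity([f1, f2, f3]), len(f2))` returns the bytes of the lost file, provided its original length is known.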


Par2

Par2 files generally use this naming/extension system: filename.vol000+01.PAR2, filename.vol001+02.PAR2, filename.vol003+04.PAR2, filename.vol007+06.PAR2, etc. The number after the "+" in the filename indicates how many blocks the file contains, and the number after "vol" indicates the number of the first recovery block within the PAR2 file. If the index file of a download states that 4 blocks are missing, the easiest way to repair the files is to download filename.vol003+04.PAR2. However, due to the redundancy, filename.vol007+06.PAR2 is also acceptable. There is also an index file, filename.PAR2, which is identical in function to the small index file used in PAR1.

The Par2 specification supports up to 32,768 source blocks and up to 65,535 recovery blocks. Input files are split into multiple equal-sized blocks so that recovery files do not need to be the size of the largest input file.

Although Unicode is mentioned in the PAR2 specification as an option, most PAR2 implementations do not support Unicode. Directory support is included in the PAR2 specification, but most or all implementations do not support it.
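The naming convention above is regular enough to parse mechanically. The following is a minimal sketch, assuming names follow exactly the pattern shown in the examples; the function names `parse_volume_name` and `smallest_sufficient_volume` are hypothetical and do not come from any PAR2 client.

```python
import re

# Matches names such as "filename.vol003+04.PAR2" (case-insensitive).
VOLUME_NAME = re.compile(
    r"^(?P<base>.+)\.vol(?P<first>\d+)\+(?P<count>\d+)\.par2$",
    re.IGNORECASE,
)


def parse_volume_name(name):
    """Return (base, first_block, block_count), or None for other names."""
    match = VOLUME_NAME.match(name)
    if match is None:
        return None
    return match["base"], int(match["first"]), int(match["count"])


def smallest_sufficient_volume(names, blocks_needed):
    """Pick the volume with the fewest blocks that still covers the need."""
    candidates = []
    for name in names:
        parsed = parse_volume_name(name)
        if parsed is not None and parsed[2] >= blocks_needed:
            candidates.append((parsed[2], name))
    return min(candidates)[1] if candidates else None
```

With the four example volumes and 4 missing blocks, the sketch selects filename.vol003+04.PAR2, matching the reasoning in the text. A real client could also combine several smaller volumes, which this sketch does not attempt.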


Par3

The Par3 specification was originally planned to be published as an enhancement over the Par2 specification. However, specification owner Yutaka Sawada has kept it closed source to date.

A discussion on a new format started in the GitHub issue section of the maintained par2cmdline fork on January 29, 2019. The discussion led to a new format, which is also named Par3. The new Par3 specification is published on GitHub but remained an alpha draft as of January 28, 2022. It is written by Michael Nahas, the author of the Par2 specification, with help from Yutaka Sawada, animetosho and malaire.

The new format claims multiple advantages over the Par2 format:
* Supports more than 2^16 files and more than 2^16 blocks.
* Supports packing small files into one block, as well as deduplication when a block appears in multiple files.
* Supports UTF-8 file names, file permissions, hard links, and symbolic (soft) links.
* Supports embedding PAR data inside other formats, like ZIP archives or ISO disk images.
* Supports "incremental backups", where a user creates recovery files for some file or folder, changes some data, and creates new recovery files reusing some of the older files.
* Supports more error correction code algorithms (such as LDPC and sparse random matrix).
* Replaces the MD5 hash function used in Par2 with BLAKE3.
* Supports empty directories.
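Block-level deduplication, one of the advantages listed above, can be sketched as follows. This is an illustration of the general idea only, not the Par3 algorithm or its on-disk layout: it assumes fixed-size blocks, uses BLAKE2b from the Python standard library as a stand-in because BLAKE3 is not available there, and all function names are hypothetical.

```python
import hashlib


def split_blocks(data: bytes, block_size: int):
    """Yield consecutive fixed-size blocks; the last one may be shorter."""
    for offset in range(0, len(data), block_size):
        yield data[offset:offset + block_size]


def deduplicate(files: dict, block_size: int):
    """Store each distinct block once and describe files as digest lists.

    Returns (unique_blocks, layout): unique_blocks maps digest -> block
    bytes, and layout maps file name -> ordered list of block digests.
    """
    unique_blocks = {}
    layout = {}
    for name, data in files.items():
        digests = []
        for block in split_blocks(data, block_size):
            # BLAKE2b is a stand-in; BLAKE3 is not in the standard library.
            digest = hashlib.blake2b(block, digest_size=16).hexdigest()
            unique_blocks.setdefault(digest, block)
            digests.append(digest)
        layout[name] = digests
    return unique_blocks, layout


def rebuild(name, unique_blocks, layout):
    """Reassemble a file from its ordered list of block digests."""
    return b"".join(unique_blocks[digest] for digest in layout[name])
```

When two files share a block, only one copy is stored and protected, so recovery data does not have to cover the same bytes twice.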


Software


Multi-Platform


* par2+tbb (GPLv2): a concurrent (multithreaded) version of par2cmdline 0.4 using TBB. Only compatible with x86-based CPUs. It is available in the FreeBSD Ports system as par2cmdline-tbb.
* Original par2cmdline (obsolete). Available in the FreeBSD Ports system as par2cmdline.
* par2cmdline: maintained fork by BlackIkeEagle.
* par2cmdline-mt: another multithreaded version of par2cmdline using OpenMP, licensed under GPLv2 or later. Currently merged into BlackIkeEagle's fork and maintained there.
* ParPar (CC0): a high-performance, multithreaded PAR2 client and Node.js library. It does not support verifying or repairing; it can currently only create PAR2 archives.
* par2deep (LGPL-3.0): produces, verifies, and repairs par2 files recursively, both on the command line and with the aid of a graphical user interface. It is available in the Python Package Index as par2deep.


Windows


* MultiPar (freeware): builds upon QuickPar's features and GUI, and uses Yutaka Sawada's par2j.exe as the PAR2 backend. MultiPar supports multiple languages through Unicode. The name MultiPar was derived from "multi-lingual PAR client". MultiPar is also verified to work with Wine under TrueOS and Ubuntu, and may work with other operating systems too. Although the Par2 components are (or will be) open source, the MultiPar GUI on top of them is currently not open source.
* QuickPar (freeware): unmaintained since 2004, superseded by MultiPar.
* phpar2: advanced par2cmdline with multithreading and highly optimized assembler code (about 66% faster than QuickPar 0.9.1).
* The first PAR implementation, unmaintained since 2001.


Mac OS X




* UnRarX


POSIX

Software for POSIX-conforming operating systems:
* Par2 for KDE 4
* PyPar2 1.4: a frontend for par2.
* GPar2 2.03


See also

* Comparison of file archivers: some file archivers are capable of integrating parity data into their formats for error detection and correction.
* RAID: RAID levels at and above RAID 5 make use of parity data to detect and repair errors.




External links


Parchive project - full specifications and math behind it



Slyck's Guide To The Usenet Newsgroups: PAR & PAR2 Files

Guide to repair files using PAR2

UsenetReviewz's guide to opening par files