Tab-separated values (TSV) is a simple text format for storing data in a tabular structure, such as a database table or spreadsheet data, and for exchanging information between databases.
Each record in the table is one line of the text file. Each field value of a record is separated from the next by a tab character. The TSV format is thus a variation of the comma-separated values format.
TSV is a simple file format that is widely supported, so it is often used to move tabular data between different computer programs. For example, a TSV file might be used to transfer information from a database program to a spreadsheet.
The IANA standard for TSV achieves simplicity by disallowing tabs within fields.
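Because tabs never occur inside a field under this rule, reading TSV text reduces to splitting on line breaks and then on tabs. The following is a minimal sketch in Python, not a reference implementation: the function name `parse_tsv` is chosen here for illustration, the input is assumed to use a line feed as the record terminator, and the sample uses a shortened row of the Iris flower data set.

```python
def parse_tsv(text):
    """Split TSV text into a list of records, each a list of field strings.

    Assumes no field contains a tab or a line break and that records
    are terminated by a line feed.
    """
    records = []
    for line in text.split("\n"):
        if line == "":
            # Skip the empty string left after a trailing line feed.
            # (This also skips blank lines, which is an assumption.)
            continue
        records.append(line.split("\t"))
    return records


sample = "Sepal length\tSepal width\tSpecies\n5.1\t3.5\tI. setosa\n"
header, *rows = parse_tsv(sample)
print(header)  # ['Sepal length', 'Sepal width', 'Species']
print(rows)    # [['5.1', '3.5', 'I. setosa']]
```

All values come back as strings; converting numeric columns is left to the caller.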
Example
The head of the Iris flower data set can be stored as TSV using the following plain text (note that the HTML rendering may convert tabs to spaces):
Sepal length	Sepal width	Petal length	Petal width	Species
5.1	3.5	1.4	0.2	I. setosa
4.9	3.0	1.4	0.2	I. setosa
4.7	3.2	1.3	0.2	I. setosa
4.6	3.1	1.5	0.2	I. setosa
5.0	3.6	1.4	0.2	I. setosa
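The TSV plain text above corresponds to the following tabular data:
Sepal length  Sepal width  Petal length  Petal width  Species
5.1           3.5          1.4           0.2          I. setosa
4.9           3.0          1.4           0.2          I. setosa
4.7           3.2          1.3           0.2          I. setosa
4.6           3.1          1.5           0.2          I. setosa
5.0           3.6          1.4           0.2          I. setosa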
Conventions for lossless conversion to TSV
Since the values in the TSV format cannot contain literal tabs or newline characters, a convention is necessary for lossless conversion of text values with these characters. A common convention is to perform the following escapes:
\n for newline,
\t for tab,
\r for carriage return,
\\ for backslash.
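The escape convention above can be sketched in Python as a pair of functions. The names `escape_field` and `unescape_field` are hypothetical, and the choice to leave unrecognized backslash sequences untouched when unescaping is an assumption, since the convention does not say how to treat them.

```python
import re

# Escape sequences from the convention above, keyed by the raw character.
_ESCAPES = {"\\": "\\\\", "\t": "\\t", "\n": "\\n", "\r": "\\r"}
# The inverse mapping, keyed by the two-character escape sequence.
_UNESCAPES = {"\\\\": "\\", "\\t": "\t", "\\n": "\n", "\\r": "\r"}


def escape_field(value):
    """Replace backslash, tab, newline and carriage return with escapes."""
    return re.sub(r"[\\\t\n\r]", lambda m: _ESCAPES[m.group(0)], value)


def unescape_field(value):
    """Invert escape_field; other backslash sequences are left as-is."""
    return re.sub(r"\\[\\tnr]", lambda m: _UNESCAPES[m.group(0)], value)


original = "line one\nline two\twith a tab and a \\ backslash"
escaped = escape_field(original)
assert "\t" not in escaped and "\n" not in escaped
assert unescape_field(escaped) == original
```

Both directions use a single left-to-right pass rather than chained `str.replace` calls. With chained replacements, an escaped backslash followed by the letter `n` could be misread as an escaped newline when unescaping.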
Another common convention is to use the CSV convention from RFC 4180 and enclose values containing these special characters in double quotes. This can lead to ambiguities.
Another ambiguity is whether records are separated by newlines, as would be typical for lines on UNIX, or a carriage return followed by a newline, as would be typical for Microsoft platforms. Many programs such as LibreOffice expect a carriage return followed by a newline.
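One way to cope with the line-ending ambiguity is to accept both terminators when reading and to pick one explicitly when writing. The sketch below is an illustration under stated assumptions: the names `split_records` and `join_records` are hypothetical, fields are assumed not to contain raw carriage returns, and the carriage return plus newline default for writing is only a guess at what the consuming program expects.

```python
def split_records(text):
    """Split TSV text into lines, accepting both LF and CRLF terminators."""
    lines = text.split("\n")
    # Drop the empty string that follows a final terminator.
    if lines and lines[-1] == "":
        lines.pop()
    # Strip the carriage return that CRLF leaves at the end of each line.
    return [line[:-1] if line.endswith("\r") else line for line in lines]


def join_records(lines, terminator="\r\n"):
    """Join lines into TSV text, ending every record with the terminator."""
    return "".join(line + terminator for line in lines)


unix_text = "a\tb\nc\td\n"
windows_text = join_records(split_records(unix_text))
assert windows_text == "a\tb\r\nc\td\r\n"
assert split_records(windows_text) == split_records(unix_text)
```

The sketch splits on `"\n"` instead of calling `str.splitlines()`, because `splitlines()` also breaks on other characters such as form feed and the Unicode line separator, which may legitimately appear inside a field.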
See also
* Comma-separated values
* Delimiter collision
References
Bibliography
* IANA, Text Media Types
Definition of tab-separated-values (tsv) Paul Lindner, U of MN Internet Gopher Team, June 1993
Jukka Korpela, created 2000-09-01, last update 2005-02-12.
External links
Gnumeric manual