sync is a standard
system call
In computing, a system call (syscall) is the programmatic way in which a computer program requests a service from the operating system on which it is executed. This may include hardware-related services (for example, accessing a hard disk drive ...
in the
Unix
Unix (, ; trademarked as UNIX) is a family of multitasking, multi-user computer operating systems that derive from the original AT&T Unix, whose development started in 1969 at the Bell Labs research center by Ken Thompson, Dennis Ritchie, a ...
operating system, which commits all data from the
kernel filesystem buffers to
non-volatile
Non-volatile memory (NVM) or non-volatile storage is a type of computer memory that can retain stored information even after power is removed. In contrast, volatile memory needs constant power in order to retain data.
Non-volatile memory typ ...
storage, i.e., data which has been scheduled for writing via low-level
I/O system calls. Higher-level I/O layers such as
stdio
The C programming language provides many standard library functions for file input and output. These functions make up the bulk of the C standard library header . The functionality descends from a "portable I/O package" written by Mike Lesk at ...
may maintain separate buffers of their own.
As a function in
C, the
sync()
call is typically declared as
void sync(void)
in
. The system call is also available via a
command line
A command-line interface (CLI) is a means of interacting with software via command (computing), commands each formatted as a line of text. Command-line interfaces emerged in the mid-1960s, on computer terminals, as an interactive and more user ...
utility also called ''sync'', and similarly named functions in other languages such as
Perl
Perl is a high-level, general-purpose, interpreted, dynamic programming language. Though Perl is not officially an acronym, there are various backronyms in use, including "Practical Extraction and Reporting Language".
Perl was developed ...
and
Node.js (in the fs module).
The related system call
fsync()
commits just the buffered data relating to a specified
file descriptor
In Unix and Unix-like computer operating systems, a file descriptor (FD, less frequently fildes) is a process-unique identifier (handle) for a file or other input/output resource, such as a pipe or network socket.
File descriptors typically h ...
.
fdatasync()
is also available to write out just the changes made to the data in the file, and not necessarily the file's related metadata.
Some Unix systems run a kind of ''flush'' or ''update''
daemon
A demon is a malevolent supernatural being, evil spirit or fiend in religion, occultism, literature, fiction, mythology and folklore.
Demon, daemon or dæmon may also refer to:
Entertainment Fictional entities
* Daemon (G.I. Joe), a character ...
, which calls the ''sync'' function on a regular basis. On some systems, the
cron
The cron command-line utility is a job scheduler on Unix-like operating systems. Users who set up and maintain software environments use cron to schedule jobs (commands or shell scripts), also known as cron jobs, to run periodically at fixed t ...
daemon does this, and on
Linux
Linux ( ) is a family of open source Unix-like operating systems based on the Linux kernel, an kernel (operating system), operating system kernel first released on September 17, 1991, by Linus Torvalds. Linux is typically package manager, pac ...
it was handled by the
pdflush daemon which was replaced by a new implementation and finally removed from the Linux kernel in 2012. Buffers are also flushed when filesystems are
unmounted or remounted
read-only, for example prior to system shutdown.
Some applications, such as
LibreOffice
LibreOffice () is a free and open-source office productivity software suite developed by The Document Foundation (TDF). It was created in 2010 as a fork of OpenOffice.org, itself a successor to StarOffice. The suite includes applications ...
, also call the ''sync'' function to save recovery information in an interval.
Database use
In order to provide proper
durability
Durability is the ability of a physical product to remain functional, without requiring excessive maintenance or repair, when faced with the challenges of normal operation over its design lifetime. There are several measures of durability in us ...
, databases need to use some form of sync in order to make sure the information written has made it to
non-volatile storage
Non-volatile memory (NVM) or non-volatile storage is a type of computer memory that can retain stored information even after power is removed. In contrast, volatile memory needs constant power in order to retain data.
Non-volatile memory typ ...
rather than just being stored in a memory-based write cache that would be lost if power failed.
PostgreSQL
PostgreSQL ( ) also known as Postgres, is a free and open-source software, free and open-source relational database management system (RDBMS) emphasizing extensibility and SQL compliance. PostgreSQL features transaction processing, transactions ...
for example may use a variety of different sync calls, including
fsync()
and
fdatasync()
, in order for commits to be durable. Unfortunately, for any single client writing a series of records, a rotating hard drive can only commit once per rotation, which makes for at best a few hundred such commits per second. Turning off the fsync requirement can therefore greatly improve commit performance, but at the expense of potentially introducing database corruption after a crash.
Databases also employ
transaction log
In the field of databases in computer science, a transaction log (also transaction journal, database log, binary log or audit trail) is a history of actions executed by a database management system used to guarantee ACID properties over crashes ...
files (typically much smaller than the main data files) that have information about recent changes, such that changes can be reliably redone in case of crash; then the main data files can be synced less often.
Error reporting and checking
To avoid any data loss return values of
fsync()
should be checked because when performing I/O operations that are buffered by the library or the kernel, errors may not be reported at the time of using the
write()
The write is one of the most basic Subroutine, routines provided by a Unix-like operating system kernel (operating system), kernel. It writes data from a buffer declared by the user to a given device, such as a file. This is the primary way to o ...
system call or the
fflush()
call, since the data may not be written to non-volatile storage but only be written to the memory
page cache
In computing, a page cache, sometimes also called disk cache, is a transparent cache for the pages originating from a secondary storage device such as a hard disk drive (HDD) or a solid-state drive (SSD). The operating system keeps a page ca ...
. Errors from writes are instead often reported during system calls to
fsync()
,
msync()
or
close()
. Prior to 2018, Linux's
fsync()
behavior under certain circumstances failed to report error status, change behavior was proposed on 23 April 2018.
Performance controversies
Hard disks may default to using their own volatile write cache to buffer writes, which greatly improves performance while introducing a potential for lost writes. Tools such as
hdparm
hdparm is a command line program for Linux to set and view ATA hard disk drive hardware parameters and test performance. It can set parameters such as drive caches, sleep mode, power management, acoustic management, and DMA settings. GParted ...
-F will instruct the HDD controller to flush the on-drive write cache buffer. The performance impact of turning caching off is so large that even the normally conservative
FreeBSD
FreeBSD is a free-software Unix-like operating system descended from the Berkeley Software Distribution (BSD). The first version was released in 1993 developed from 386BSD, one of the first fully functional and free Unix clones on affordable ...
community rejected disabling write caching by default in FreeBSD 4.3.
In
SCSI
Small Computer System Interface (SCSI, ) is a set of standards for physically connecting and transferring data between computers and peripheral devices, best known for its use with storage devices such as hard disk drives. SCSI was introduced ...
and in
SATA
SATA (Serial AT Attachment) is a computer bus interface that connects host bus adapters to mass storage devices such as hard disk drives, optical drives, and solid-state drives. Serial ATA succeeded the earlier Parallel ATA (PATA) standard ...
with
Native Command Queuing
In computing, Native Command Queuing (NCQ) is an extension of the Serial ATA protocol allowing hard disk drives to internally optimize the order in which received read and write commands are executed. This can reduce the amount of unnecessary driv ...
(but not in plain ATA, even with TCQ) the host can specify whether it wants to be notified of completion when the data hits the disk's platters or when it hits the disk's buffer (on-board cache). Assuming a correct hardware implementation, this feature allows the disk's on-board cache to be used while guaranteeing correct semantics for system calls like
fsync
. This hardware feature is called
Force Unit Access (FUA) and it allows consistency with less overhead than flushing the entire cache as done for ATA (or SATA non-NCQ) disks.
Although Linux enabled NCQ around 2007, it did not enable SATA/NCQ FUA until 2012, citing lack of support in the early drives.
Firefox 3.0, released in 2008, introduced
fsync
system calls that were found to degrade its performance; the call was introduced in order to guarantee the integrity of the embedded
SQLite
SQLite ( "S-Q-L-ite", "sequel-ite") is a free and open-source relational database engine written in the C programming language. It is not a standalone app; rather, it is a library that software developers embed in their apps. As such, it ...
database.
Linux Foundation
The Linux Foundation (LF) is a non-profit organization established in 2000 to support Linux development and open-source software projects.
Background
The Linux Foundation started as Open Source Development Labs in 2000 to standardize and prom ...
chief technical officer Theodore Ts'o
Theodore Yue Tak Ts'o (; born 1968) is an American software engineer mainly known for his contributions to the Linux kernel, in particular his contributions to file systems. He is the secondary developer and maintainer of e2fsprogs, the usersp ...
claims there is no need to "fear fsync", and that the real cause of Firefox 3 slowdown is the excessive use of
fsync
. He also concedes however (quoting
Mike Shaver) that
On some rather common Linux configurations, especially using the ext3
ext3, or third extended filesystem, is a journaling file system, journaled file system that is commonly used with the Linux kernel. It used to be the default file system for many popular Linux distributions but generally has been supplanted by ...
filesystem in the "data=ordered" mode, calling fsync doesn't just flush out the data for the file it's called on, but rather on all the buffered data for that filesystem.
See also
*
Page cache
In computing, a page cache, sometimes also called disk cache, is a transparent cache for the pages originating from a secondary storage device such as a hard disk drive (HDD) or a solid-state drive (SSD). The operating system keeps a page ca ...
*
inode
An inode (index node) is a data structure in a Unix-style file system that describes a file-system object such as a file or a directory. Each inode stores the attributes and disk block locations of the object's data. File-system object attribu ...
References
External links
sync(8) - Linux man page* http://austingroupbugs.net/view.php?id=672
{{Core Utilities commands
C POSIX library
Data synchronization
Standard Unix programs
Unix file system-related software
System calls