Write amplification (WA) is an undesirable phenomenon associated with flash memory and solid-state drives (SSDs) where the actual amount of information physically written to the storage media is a multiple of the logical amount intended to be written. Because flash memory must be erased before it can be rewritten, and the erase operation has a much coarser granularity than the write operation, performing these operations results in moving (or rewriting) user data and metadata more than once. Thus, rewriting some data requires an already-used portion of flash to be read, updated, and written to a new location, after first erasing the new location if it was previously used. Due to the way flash works, much larger portions of flash must be erased and rewritten than the amount of new data actually requires. This multiplying effect increases the number of writes required over the life of the SSD, which shortens the time it can operate reliably. The increased writes also consume bandwidth to the flash memory, which reduces random write performance to the SSD. Many factors affect the WA of an SSD; some can be controlled by the user and some are a direct result of the data written to and usage of the SSD.

Intel and SiliconSystems (acquired by Western Digital in 2009) used the term ''write amplification'' in their papers and publications as early as 2008. WA is typically measured by the ratio of writes committed to the flash memory to the writes coming from the host system. Without compression, WA cannot drop below one. Using compression, SandForce has claimed to achieve a write amplification of 0.5, with best-case values as low as 0.14 in the SF-2281 controller.


Basic SSD operation

Due to the nature of flash memory's operation, data cannot be directly overwritten as it can in a hard disk drive. When data is first written to an SSD, the cells all start in an erased state, so data can be written directly, one page at a time. The SSD controller, which manages the flash memory and interfaces with the host system, uses a logical-to-physical mapping system known as logical block addressing (LBA) that is part of the flash translation layer (FTL). When new data comes in replacing older data already written, the SSD controller writes the new data to a new location and updates the logical mapping to point to the new physical location. The data in the former location is no longer valid and must be erased before that location can be written again.

Flash memory can be programmed and erased only a limited number of times. This is often referred to as the maximum number of program/erase cycles (P/E cycles) it can sustain over the life of the flash memory. Single-level cell (SLC) flash, designed for higher performance and longer endurance, can typically operate between 50,000 and 100,000 cycles. Multi-level cell (MLC) flash is designed for lower-cost applications and has a greatly reduced cycle count, typically between 3,000 and 5,000. Since 2013, triple-level cell (TLC) flash (e.g., 3D NAND) has been available, with cycle counts dropping to 1,000 P/E cycles. A lower write amplification is more desirable, as it corresponds to a reduced number of P/E cycles on the flash memory and thereby to an increased SSD life.
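The out-of-place update described above can be illustrated with a toy mapping table. This is only a sketch under simplifying assumptions: the class name `TinyFTL`, its fields, and the page states are hypothetical and do not correspond to any real controller firmware.

```python
class TinyFTL:
    """Toy logical-to-physical map (illustrative names, not a real FTL)."""

    def __init__(self, num_pages):
        self.l2p = {}                        # LBA -> physical page index
        self.state = ["erased"] * num_pages  # per-page state
        self.next_free = 0                   # next erased page to program

    def write(self, lba):
        if self.next_free >= len(self.state):
            raise RuntimeError("no erased pages left; garbage collection needed")
        old = self.l2p.get(lba)
        if old is not None:
            # The old copy cannot be overwritten in place; it is only marked stale.
            self.state[old] = "stale"
        new = self.next_free
        self.next_free += 1
        self.state[new] = "valid"
        self.l2p[lba] = new                  # remap the LBA to the new location
        return new


ftl = TinyFTL(num_pages=4)
first = ftl.write(lba=7)    # initial write of LBA 7
second = ftl.write(lba=7)   # overwrite lands on a different physical page
```

After the second write, the first physical page still holds the old data but is marked stale; it occupies space until its whole block is erased.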


Calculating the value

Write amplification was always present in SSDs before the term was defined, but it was in 2008 that both Intel and SiliconSystems started using the term in their papers and publications. All SSDs have a write amplification value, and it is based on both what is currently being written and what was previously written to the SSD. To measure the value accurately for a specific SSD, the selected test should be run long enough to ensure the drive has reached a steady state condition. A simple formula to calculate the write amplification of an SSD is:

:\text{write amplification} = \frac{\text{data written to the flash memory}}{\text{data written by the host}}
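As a worked instance of this ratio, the sketch below computes WA from two byte counts. The function name and the sample figures are made up for illustration; in practice the two counts would come from drive counters sampled over the same steady-state interval.

```python
def write_amplification(flash_bytes_written, host_bytes_written):
    """Ratio of bytes committed to flash to bytes sent by the host."""
    if host_bytes_written <= 0:
        raise ValueError("host_bytes_written must be positive")
    return flash_bytes_written / host_bytes_written


# Hypothetical interval: the host wrote 100 GB while the flash absorbed 300 GB.
wa = write_amplification(flash_bytes_written=300 * 10**9,
                         host_bytes_written=100 * 10**9)
```

With these assumed figures the drive wrote three times as much data to flash as the host requested.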


Factors affecting the value

Many factors affect the write amplification of an SSD. The table below lists the primary factors and how they affect the write amplification. For factors that are variable, the table notes if it has a ''direct'' relationship or an ''inverse'' relationship. For example, as the amount of over-provisioning increases, the write amplification decreases (inverse relationship). If the factor is a toggle (''enabled'' or ''disabled'') function then it has either a ''positive'' or ''negative'' relationship.


Garbage collection

Data is written to the flash memory in units called pages (made up of multiple cells). However, the memory can only be erased in larger units called blocks (made up of multiple pages). If the data in some of the pages of a block is no longer needed (such pages are called stale pages), only the pages with good data in that block are read and rewritten into another, previously erased block. The original block can then be erased, and the pages freed by not moving the stale data become available for new data. This process is called ''garbage collection'' (GC). All SSDs include some level of garbage collection, but they may differ in when and how fast they perform the process. Garbage collection is a big part of write amplification on the SSD.

Reads do not require an erase of the flash memory, so they are not generally associated with write amplification. In the limited chance of a read disturb error, the data in that block is read and rewritten, but this would not have any material impact on the write amplification of the drive.
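The copy-then-erase step can be sketched on a single block. The model below is a deliberate simplification: pages are list entries, the `STALE` and `ERASED` markers and the function name are hypothetical, and the write amplification for the episode assumes the host then fills every page that was freed.

```python
STALE = "stale"    # page whose data has been superseded
ERASED = "erased"  # page ready to be programmed


def garbage_collect(victim):
    """Copy the valid pages of `victim` into a fresh block, then erase `victim`."""
    valid = [page for page in victim if page not in (STALE, ERASED)]
    destination = valid + [ERASED] * (len(victim) - len(valid))
    erased_victim = [ERASED] * len(victim)
    return destination, erased_victim, len(valid)


victim = ["A", STALE, "B", STALE]
destination, erased_victim, copied = garbage_collect(victim)

# If the host now fills the freed pages, the flash has programmed
# `copied` relocated pages plus `free_pages` new host pages.
free_pages = destination.count(ERASED)
episode_wa = (copied + free_pages) / free_pages
```

In this example two valid pages are relocated to make room for two host pages, so four pages are programmed for two pages of host data.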


Background garbage collection

The process of garbage collection involves reading and rewriting data to the flash memory. This means that a new write from the host will first require a read of the whole block, a write of the parts of the block which still include valid data, and then a write of the new data. This can significantly reduce the performance of the system. Many SSD controllers implement background garbage collection (BGC), sometimes called idle garbage collection or idle-time garbage collection (ITGC), where the controller uses idle time to consolidate blocks of flash memory before the host needs to write new data. This enables the performance of the device to remain high.

If the controller were to background garbage collect all of the spare blocks before it was absolutely necessary, new data written from the host could be written without having to move any data in advance, letting the device operate at its peak speed. The trade-off is that some of those blocks of data are actually not needed by the host and will eventually be deleted, but the OS did not tell the controller this information (until TRIM was introduced). The result is that the soon-to-be-deleted data is rewritten to another location in the flash memory, increasing the write amplification. In some of the SSDs from OCZ, the background garbage collection clears up only a small number of blocks and then stops, thereby limiting the amount of excessive writes. Another solution is to have an efficient garbage collection system which can perform the necessary moves in parallel with the host writes. This solution is more effective in high-write environments where the SSD is rarely idle. The SandForce SSD controllers and the systems from Violin Memory have this capability.


Filesystem-aware garbage collection

In 2010, some manufacturers (notably Samsung) introduced SSD controllers that extended the concept of BGC to analyze the file system used on the SSD, to identify recently deleted files and unpartitioned space. Samsung claimed that this would ensure that even systems (operating systems and SATA controller hardware) which do not support TRIM could achieve similar performance. The operation of the Samsung implementation appeared to assume and require an NTFS file system. It is not clear if this feature is still available in currently shipping SSDs from these manufacturers. Systemic data corruption has been reported on these drives if they are not formatted properly using MBR and NTFS.


TRIM

TRIM is a SATA command that enables the operating system to tell an SSD which blocks of previously saved data are no longer needed as a result of file deletions or volume formatting. When an LBA is replaced by the OS, as with an overwrite of a file, the SSD knows that the original LBA can be marked as stale or invalid and it will not save those blocks during garbage collection. If the user or operating system erases a file (not just removes parts of it), the file will typically be marked for deletion, but the actual contents on the disk are never actually erased. Because of this, the SSD does not know that it can erase the LBAs previously occupied by the file, so the SSD will keep including such LBAs in the garbage collection.

The introduction of the TRIM command resolves this problem for operating systems that support it, such as Windows 7, Mac OS (latest releases of Snow Leopard, Lion, and Mountain Lion, patched in some cases), FreeBSD since version 8.1, and Linux since version 2.6.33 of the Linux kernel mainline. When a file is permanently deleted or the drive is formatted, the OS sends the TRIM command along with the LBAs that no longer contain valid data. This informs the SSD that the LBAs in use can be erased and reused. This reduces the number of LBAs that must be moved during garbage collection. The result is that the SSD will have more free space, enabling lower write amplification and higher performance.
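The effect of TRIM on garbage collection can be sketched by listing which LBAs in a block the controller still has to relocate. The function name, the LBA numbers, and the set-based bookkeeping are assumptions made for illustration only.

```python
def pages_to_relocate(block_lbas, overwritten, trimmed=frozenset()):
    """Return the LBAs in a block that must be copied during garbage collection.

    An LBA is skipped only if the controller knows it is invalid, either
    because the host overwrote it or because the host trimmed it.
    """
    return [lba for lba in block_lbas
            if lba not in overwritten and lba not in trimmed]


block = [10, 11, 12, 13]
overwritten = {10}        # the host rewrote LBA 10, so the controller knows it is stale
deleted_file = {12, 13}   # LBAs of a file the OS deleted

without_trim = pages_to_relocate(block, overwritten)
with_trim = pages_to_relocate(block, overwritten, trimmed=deleted_file)
```

Without TRIM the controller still copies the deleted file's pages; with TRIM only LBA 11 has to be moved.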


Limitations and dependencies

The TRIM command also needs the support of the SSD. If the firmware in the SSD does not support the TRIM command, the LBAs received with the TRIM command will not be marked as invalid, and the drive will continue to garbage collect the data assuming it is still valid. Only when the OS saves new data into those LBAs will the SSD know to mark the original LBA as invalid. SSD manufacturers that did not originally build TRIM support into their drives can either offer a firmware upgrade to the user, or provide a separate utility that extracts the information on the invalid data from the OS and separately TRIMs the SSD. The benefit would be realized only after each run of that utility by the user. The user could set up that utility to run periodically in the background as an automatically scheduled task.

Just because an SSD supports the TRIM command does not necessarily mean it will be able to perform at top speed immediately after a TRIM command. The space which is freed up after the TRIM command may be at random locations spread throughout the SSD. It will take a number of passes of writing data and garbage collecting before those spaces are consolidated to show improved performance.

Even after the OS and SSD are configured to support the TRIM command, other conditions might prevent any benefit from TRIM. Databases and RAID systems that are not TRIM-aware will not know how to pass that information on to the SSD. In those cases the SSD will continue to save and garbage collect those blocks until the OS uses those LBAs for new writes.

The actual benefit of the TRIM command depends upon the free user space on the SSD. If the user capacity on the SSD was 100 GB and the user actually saved 95 GB of data to the drive, any TRIM operation would not add more than 5 GB of free space for garbage collection and wear leveling. In those situations, increasing the amount of over-provisioning by 5 GB would allow the SSD to have more consistent performance because it would always have the additional 5 GB of free space without having to wait for the TRIM command to come from the OS.


Over-provisioning

Over-provisioning (sometimes spelled as OP, over provisioning, or overprovisioning) is the difference between the physical capacity of the flash memory and the logical capacity presented through the operating system (OS) as available for the user. During the garbage collection, wear-leveling, and bad block mapping operations on the SSD, the additional space from over-provisioning helps lower the write amplification when the controller writes to the flash memory. Over-provisioning is represented as a percentage ratio of extra capacity to user-available capacity:

:\text{over-provisioning} = \frac{\text{physical capacity} - \text{user capacity}}{\text{user capacity}}

Over-provisioning typically comes from three sources:

# The computation of the capacity and use of gigabyte (GB) as the unit instead of gibibyte (GiB). Both HDD and SSD vendors use the term GB to represent a ''decimal GB'' or 1,000,000,000 (= 10^9) bytes. Like most other electronic storage, flash memory is assembled in powers of two, so calculating the physical capacity of an SSD would be based on 1,073,741,824 (= 2^30) bytes per ''binary GB'' or GiB. The difference between these two values is 7.37% (= (2^30 − 10^9) / 10^9 × 100%). Therefore, a 128 GB SSD with 0% additional over-provisioning would provide 128,000,000,000 bytes to the user (out of 137,438,953,472 total). This initial 7.37% is typically not counted in the total over-provisioning number, and the true amount available is usually less, as some storage space is needed for the controller to keep track of non-operating system data such as block status flags. The 7.37% figure may extend to 9.95% in the terabyte range, as the gap between a decimal terabyte (10^12 bytes) and a binary tebibyte (2^40 bytes) is 9.95%.
# Manufacturer decision. This is done typically at 0%, 7% or 28%, based on the difference between the decimal gigabyte of the physical capacity and the decimal gigabyte of the available space to the user. As an example, a manufacturer might publish a specification for their SSD at 100, 120 or 128 GB based on 128 GB of possible capacity. This difference is 28%, 7% and 0% respectively and is the basis for the manufacturer claiming they have 28% of over-provisioning on their drive. This does not count the additional 7.37% of capacity available from the difference between the decimal and binary gigabyte.
# Known free user space on the drive, gaining endurance and performance at the expense of reporting unused portions, or at the expense of current or future capacity. This free space can be identified by the operating system using the TRIM command. Alternatively, some SSDs provide a utility that permits the end user to select additional over-provisioning. Furthermore, if any SSD is set up with an overall partitioning layout smaller than 100% of the available space, that unpartitioned space will be automatically used by the SSD as over-provisioning as well. Yet another source of over-provisioning is operating system minimum free space limits; some operating systems maintain a certain minimum free space per drive, particularly on the boot or main drive. If this additional space can be identified by the SSD, perhaps through continuous usage of the TRIM command, then this acts as semi-permanent over-provisioning.

Over-provisioning often takes away from user capacity, either temporarily or permanently, but it gives back reduced write amplification, increased endurance, and increased performance.
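The percentages quoted in this section follow from the over-provisioning ratio of extra capacity to user-available capacity. The sketch below reproduces them; the function name and the constants `GB`, `GIB`, `TB`, and `TIB` are illustrative choices.

```python
GB = 10**9     # decimal gigabyte
GIB = 2**30    # binary gigabyte (gibibyte)
TB = 10**12    # decimal terabyte
TIB = 2**40    # binary terabyte (tebibyte)


def over_provisioning(physical_bytes, user_bytes):
    """Extra capacity as a fraction of user-available capacity."""
    return (physical_bytes - user_bytes) / user_bytes


# Source 1: binary physical capacity sold in decimal units.
inherent_gb = over_provisioning(128 * GIB, 128 * GB)
inherent_tb = over_provisioning(1 * TIB, 1 * TB)

# Source 2: manufacturer decision on a drive with 128 GB of possible capacity.
sold_as_100 = over_provisioning(128 * GB, 100 * GB)
sold_as_120 = over_provisioning(128 * GB, 120 * GB)
sold_as_128 = over_provisioning(128 * GB, 128 * GB)
```

Expressed as percentages, these evaluate to roughly 7.37%, 9.95%, 28%, 7% (6.67% before rounding), and 0%.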


Free user space

The SSD controller will use free blocks on the SSD for garbage collection and wear leveling. The portion of the user capacity which is free from user data (either already TRIMed or never written in the first place) will look the same as over-provisioning space (until the user saves new data to the SSD). If the user saves data consuming only half of the total user capacity of the drive, the other half of the user capacity will look like additional over-provisioning (as long as the TRIM command is supported in the system).


Secure erase

The ATA Secure Erase command is designed to remove all user data from a drive. With an SSD without integrated encryption, this command will put the drive back to its original out-of-box state. This will initially restore its performance to the highest possible level and the best (lowest number) possible write amplification, but as soon as the drive starts garbage collecting again the performance and write amplification will start returning to the former levels. Many tools use the ATA Secure Erase command to reset the drive and provide a user interface as well. One free tool that is commonly referenced in the industry is called HDDerase. GParted and Ubuntu live CDs provide a bootable Linux system of disk utilities including secure erase.

Drives which encrypt all writes on the fly ''can'' implement ATA Secure Erase in another way. They simply zeroize the encryption key and generate a new random one each time a secure erase is done. In this way the old data cannot be read anymore, as it cannot be decrypted. Some drives with integrated encryption will physically clear all blocks after that as well, while other drives may require a TRIM command to be sent to the drive to put it back to its original out-of-box state (as otherwise their performance may not be maximized).


ATA Secure Erase – failure to erase data

Some drives may either completely or partially fail to erase the data with the ATA Secure Erase, and the data will remain recoverable from such drives.


Wear leveling

If a particular block was programmed and erased repeatedly without writing to any other blocks, that block would wear out before all the other blocks, thereby prematurely ending the life of the SSD. For this reason, SSD controllers use a technique called wear leveling to distribute writes as evenly as possible across all the flash blocks in the SSD. In a perfect scenario, this would enable every block to be written to its maximum life so they all fail at the same time. Unfortunately, the process to evenly distribute writes requires data previously written and not changing (cold data) to be moved, so that data which are changing more frequently (hot data) can be written into those blocks. Each time data are relocated without being changed by the host system, this increases the write amplification and thus reduces the life of the flash memory. The key is to find an algorithm that best balances even wear against low write amplification.
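One simple way to picture wear leveling is a policy that always programs the least-erased free block and relocates cold data only when the wear gap grows too large. The sketch below is an assumed policy for illustration, not the algorithm of any particular controller; the function names and the `threshold` parameter are hypothetical.

```python
def pick_block_for_write(erase_counts, free_blocks):
    """Choose the free block that has been erased the fewest times."""
    return min(free_blocks, key=lambda block: erase_counts[block])


def needs_cold_data_move(erase_counts, threshold):
    """Signal that cold data should be relocated once the wear gap is too wide.

    Moving cold data costs extra writes, so a larger threshold trades
    less even wear for lower write amplification.
    """
    spread = max(erase_counts.values()) - min(erase_counts.values())
    return spread > threshold


erase_counts = {0: 120, 1: 15, 2: 90, 3: 14}  # block id -> erase count
free_blocks = [0, 2, 1]                       # block 3 holds cold data

target = pick_block_for_write(erase_counts, free_blocks)
move_cold = needs_cold_data_move(erase_counts, threshold=100)
```

Here block 1 is chosen for the next write, and the gap of 106 erases between blocks 0 and 3 exceeds the assumed threshold, so the cold data in block 3 would be moved to free that lightly worn block.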


Separating static and dynamic data

The separation of static (cold) and dynamic (hot) data to reduce write amplification is not a simple process for the SSD controller. The process requires the SSD controller to separate the LBAs with data which is constantly changing and requiring rewriting (dynamic data) from the LBAs with data which rarely changes and does not require any rewrites (static data). If the data is mixed in the same blocks, as with almost all systems today, any rewrites will require the SSD controller to rewrite both the dynamic data (which caused the rewrite initially) and static data (which did not require any rewrite). Any garbage collection of data that would not have otherwise required moving will increase write amplification. Therefore, separating the data will enable static data to stay at rest and if it never gets rewritten it will have the lowest possible write amplification for that data. The drawback to this process is that somehow the SSD controller must still find a way to wear level the static data because those blocks that never change will not get a chance to be written to their maximum P/E cycles.
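The benefit of separation can be sketched by counting how many valid pages must be copied to reclaim blocks after all hot data has been rewritten elsewhere. The layout, the labels `h0`..`h3` and `c0`..`c3`, and the function name are hypothetical.

```python
def copies_to_reclaim(blocks, hot):
    """Count valid pages copied when reclaiming every block that holds stale pages.

    Assumes every LBA in `hot` has been rewritten elsewhere, so its old
    copy is stale, while all other pages are still valid.
    """
    copies = 0
    for block in blocks:
        stale = [lba for lba in block if lba in hot]
        if stale:                        # block is worth reclaiming
            copies += len(block) - len(stale)
    return copies


hot = {"h0", "h1", "h2", "h3"}
mixed = [["h0", "c0", "h1", "c1"], ["h2", "c2", "h3", "c3"]]
separated = [["h0", "h1", "h2", "h3"], ["c0", "c1", "c2", "c3"]]

mixed_copies = copies_to_reclaim(mixed, hot)
separated_copies = copies_to_reclaim(separated, hot)
```

With mixed blocks, all four cold pages are rewritten even though the host never changed them; with separated blocks, the hot block is entirely stale and can be erased without copying anything.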


Performance implications


Sequential writes

When an SSD is writing large amounts of data sequentially, the write amplification is equal to one, meaning there is no write amplification. The reason is that as the data is written, the entire block is filled sequentially with data related to the same file. If the OS determines that file is to be replaced or deleted, the entire block can be marked as invalid, and there is no need to read parts of it to garbage collect and rewrite into another block. It will need only to be erased, which is much easier and faster than the ''read–erase–modify–write'' process needed for randomly written data going through garbage collection.


Random writes

The peak random write performance on an SSD is driven by plenty of free blocks after the SSD is completely garbage collected, secure erased, 100% TRIMed, or newly installed. The maximum speed will depend upon the number of parallel flash channels connected to the SSD controller, the efficiency of the firmware, and the speed of the flash memory in writing to a page. During this phase the write amplification will be the best it can ever be for random writes and will be approaching one. Once the blocks are all written once, garbage collection will begin and the performance will be gated by the speed and efficiency of that process. Write amplification in this phase will increase to the highest levels the drive will experience.
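The contrast between sequential and random writes can be explored with a toy log-structured simulator. Everything here is an assumption made for illustration: the greedy victim choice (fewest valid pages), the single reserved free block, and all names and sizes are hypothetical and far simpler than real firmware.

```python
import random


def simulate_wa(num_blocks, pages_per_block, user_pages, host_writes,
                sequential, seed=0):
    """Return flash page writes divided by host page writes for a toy SSD."""
    # Enough spare space that some closed block always has a stale page.
    assert user_pages <= (num_blocks - 2) * pages_per_block
    rng = random.Random(seed)
    blocks = [[] for _ in range(num_blocks)]  # LBAs programmed, in page order
    where = {}                                # LBA -> (block, page) of valid copy
    valid = [0] * num_blocks                  # valid page count per block
    free = list(range(1, num_blocks))         # erased blocks
    active = 0                                # block currently being filled
    flash_writes = 0

    def program(lba):
        nonlocal flash_writes
        old = where.get(lba)
        if old is not None:
            valid[old[0]] -= 1                # previous copy becomes stale
        where[lba] = (active, len(blocks[active]))
        blocks[active].append(lba)
        valid[active] += 1
        flash_writes += 1

    def make_room():
        nonlocal active
        if len(blocks[active]) < pages_per_block:
            return
        if len(free) > 1:
            active = free.pop()
            return
        # Only one erased block left: garbage collect the emptiest closed block.
        closed = [b for b in range(num_blocks) if b not in free]
        victim = min(closed, key=lambda b: valid[b])
        survivors = [lba for i, lba in enumerate(blocks[victim])
                     if where[lba] == (victim, i)]
        active = free.pop()
        for lba in survivors:                 # relocation writes
            program(lba)
        blocks[victim] = []                   # erase the victim
        valid[victim] = 0
        free.append(victim)

    for n in range(host_writes):
        lba = n % user_pages if sequential else rng.randrange(user_pages)
        make_room()
        program(lba)
    return flash_writes / host_writes


seq_wa = simulate_wa(16, 8, 96, 5000, sequential=True)
rand_wa = simulate_wa(16, 8, 96, 5000, sequential=False)
```

In this model the sequential workload always finds a fully stale block to erase, so no pages are relocated, while the random workload leaves valid pages scattered across blocks and pays for copying them.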


Impact on performance

The overall performance of an SSD is dependent upon a number of factors, including write amplification. Writing to a flash memory device takes longer than reading from it. An SSD generally uses multiple flash memory components connected in parallel as channels to increase performance. If the SSD has a high write amplification, the controller will be required to write that many more times to the flash memory. This requires even more time to write the data from the host. An SSD with a low write amplification will not need to write as much data and can therefore be finished writing sooner than a drive with a high write amplification.
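The timing argument above amounts to simple arithmetic: the flash must absorb the host data multiplied by the write amplification. The sketch below assumes a fixed aggregate flash write bandwidth and ignores caching and parallelism effects; the function name and the figures are hypothetical.

```python
def flash_write_seconds(host_bytes, wa, flash_bytes_per_second):
    """Time for the flash to absorb a host write at a given write amplification."""
    return host_bytes * wa / flash_bytes_per_second


host_bytes = 10 * 10**9     # 10 GB written by the host
bandwidth = 200 * 10**6     # assumed 200 MB/s aggregate flash write speed

low_wa_time = flash_write_seconds(host_bytes, wa=1.1, flash_bytes_per_second=bandwidth)
high_wa_time = flash_write_seconds(host_bytes, wa=3.0, flash_bytes_per_second=bandwidth)
```

Under these assumptions the same host write takes about 55 seconds at a write amplification of 1.1 and 150 seconds at 3.0.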


Product statements

In September 2008, Intel announced the X25-M SATA SSD with a reported WA as low as 1.1. In April 2009, SandForce announced the SF-1000 SSD Processor family with a reported WA of 0.5, which appears to come from some form of data compression. Before this announcement, a write amplification of 1.0 was considered the lowest that could be attained with an SSD.


See also

* Flash file system
* Partition alignment
* Wear leveling

