Scalability

Scalability is the property of a system to handle a growing amount of work by adding resources to the system. In an economic context, a scalable business model implies that a company can increase sales given increased resources. For example, a package delivery system is scalable because more packages can be delivered by adding more delivery vehicles. However, if all packages had to first pass through a single warehouse for sorting, the system would not be as scalable, because one warehouse can handle only a limited number of packages.

In computing, scalability is a characteristic of computers, networks, algorithms, networking protocols, programs and applications. An example is a search engine, which must support increasing numbers of users and a growing number of indexed topics. Webscale is a computer architectural approach that brings the capabilities of large-scale cloud computing companies into enterprise data centers.

In mathematics, scalability mostly refers to closure under scalar multiplication.


Examples

The Incident Command System (ICS) is used by emergency response agencies in the United States. ICS can scale resource coordination from a single-engine roadside brushfire to an interstate wildfire. The first resource on scene establishes command, with authority to order resources and delegate responsibility (managing five to seven officers, who will again delegate to up to seven, and so on as the incident grows). As an incident expands, more senior officers assume command.


Dimensions

Scalability can be measured over multiple dimensions, such as:
* ''Administrative scalability'': The ability for an increasing number of organizations or users to access a system.
* ''Functional scalability'': The ability to enhance the system by adding new functionality without disrupting existing activities.
* ''Geographic scalability'': The ability to maintain effectiveness during expansion from a local area to a larger region.
* ''Load scalability'': The ability for a distributed system to expand and contract to accommodate heavier or lighter loads, including the ease with which a system or component can be modified, added, or removed to accommodate changing loads.
* ''Generation scalability'': The ability of a system to scale by adopting new generations of components.
* ''Heterogeneous scalability'': The ability to adopt components from different vendors.


Domains

* A routing protocol is considered scalable with respect to network size if the size of the necessary routing table on each node grows as O(log ''N''), where ''N'' is the number of nodes in the network. Some early peer-to-peer (P2P) implementations of Gnutella had scaling issues: each node flooded its queries to all other nodes, so the demand on each peer grew in proportion to the total number of peers, quickly overrunning their capacity. Other P2P systems like BitTorrent scale well because the demand on each peer is independent of the number of peers. Nothing is centralized, so the system can expand indefinitely without any resources other than the peers themselves. (A rough comparison of per-peer load under the two approaches is sketched after this list.)
* A scalable online transaction processing system or database management system is one that can be upgraded to process more transactions by adding new processors, devices and storage, and which can be upgraded easily and transparently without shutting it down.
* The distributed nature of the Domain Name System (DNS) allows it to work efficiently, serving billions of hosts on the worldwide Internet.
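The contrast between query flooding and a structured overlay can be made concrete with a back-of-the-envelope model. The sketch below is illustrative only (the function names are made up for this example) and assumes an idealized network in which every node issues one lookup: flooding touches every peer, while a DHT-style structured overlay resolves a lookup in roughly log2(''N'') hops.

    import math

    def per_peer_load_flooding(n_nodes, queries_per_node=1):
        # Every query is forwarded to every peer, so each peer sees all queries.
        total_messages = n_nodes * queries_per_node * n_nodes
        return total_messages / n_nodes

    def per_peer_load_dht(n_nodes, queries_per_node=1):
        # A structured overlay resolves each lookup in about log2(N) hops.
        total_messages = n_nodes * queries_per_node * math.log2(n_nodes)
        return total_messages / n_nodes

    for n in (1_000, 10_000, 100_000):
        print(n, per_peer_load_flooding(n), round(per_peer_load_dht(n), 1))

Under this model the per-peer load of flooding grows linearly with the number of peers, while that of the structured overlay grows only logarithmically, which is why the former overruns peer capacity as the network expands.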


Horizontal (scale out) and vertical scaling (scale up)

Resources fall into two broad categories: horizontal and vertical.


Horizontal or scale out

Scaling horizontally (out/in) means adding more nodes to (or removing nodes from) a system, such as adding a new computer to a distributed software application. An example might involve scaling out from one web server to three; a minimal sketch of that operation is given after this paragraph. High-performance computing applications, such as seismic analysis and biotechnology, scale workloads horizontally to support tasks that once would have required expensive supercomputers. Other workloads, such as large social networks, exceed the capacity of the largest supercomputer and can only be handled by scalable systems. Exploiting this scalability requires software for efficient resource management and maintenance.
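As a toy illustration of scaling out a pool of web servers (the names here are hypothetical, not an actual load-balancer API), the sketch below keeps a mutable pool of backends behind a round-robin dispatcher; adding capacity is simply a matter of appending another node, with no change to the dispatch logic.

    import itertools

    class RoundRobinPool:
        """Toy load balancer: cycles requests across a mutable pool of backends."""

        def __init__(self, backends):
            self.backends = list(backends)

        def scale_out(self, backend):
            # Horizontal scaling out: add another node to the pool.
            self.backends.append(backend)

        def scale_in(self, backend):
            # Horizontal scaling in: remove a node from the pool.
            self.backends.remove(backend)

        def dispatch(self, n_requests):
            # Round-robin assignment of requests to whatever backends exist now.
            cycle = itertools.cycle(self.backends)
            return [next(cycle) for _ in range(n_requests)]

    pool = RoundRobinPool(["web1"])
    pool.scale_out("web2")
    pool.scale_out("web3")      # scaled out from one web server to three
    print(pool.dispatch(6))     # ['web1', 'web2', 'web3', 'web1', 'web2', 'web3']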


Vertical or scale up

Scaling vertically (up/down) means adding resources to (or removing resources from) a single node, typically the addition of CPUs, memory or storage to a single computer. Horizontal scaling, by contrast, involves larger numbers of elements, which increases management complexity and requires more sophisticated programming to allocate tasks among resources and to handle issues such as throughput and latency across nodes; moreover, some applications do not scale horizontally at all.


Network scalability

Network function virtualization defines these terms differently: scaling out/in is the ability to scale by adding/removing resource instances (e.g., a virtual machine), whereas scaling up/down is the ability to scale by changing allocated resources (e.g., memory/CPU/storage capacity). A sketch contrasting the two operations follows.
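To make the distinction concrete, here is a minimal, purely illustrative model (the class and field names are hypothetical and not tied to any NFV framework) in which scaling out/in changes the number of instances, while scaling up/down changes the resources allocated to each instance.

    from dataclasses import dataclass, field

    @dataclass
    class Instance:
        cpus: int = 2
        memory_gb: int = 4

    @dataclass
    class Cluster:
        instances: list = field(default_factory=lambda: [Instance()])

        def scale_out(self, count=1):
            # Scaling out: add more resource instances (e.g., virtual machines).
            self.instances += [Instance() for _ in range(count)]

        def scale_in(self, count=1):
            # Scaling in: remove resource instances.
            del self.instances[-count:]

        def scale_up(self, extra_cpus=0, extra_memory_gb=0):
            # Scaling up: grow the resources allocated to each existing instance.
            for inst in self.instances:
                inst.cpus += extra_cpus
                inst.memory_gb += extra_memory_gb

    cluster = Cluster()
    cluster.scale_out(2)              # out: 3 instances of 2 CPUs / 4 GB each
    cluster.scale_up(extra_cpus=2)    # up: every instance now has 4 CPUs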


Database scalability

Scalability for databases requires that the database system be able to perform additional work given greater hardware resources, such as additional servers, processors, memory and storage. Workloads have continued to grow, and demands on databases have followed suit. Algorithmic innovations include row-level locking and table and index partitioning. Architectural innovations include shared-nothing and shared-everything architectures for managing multi-server configurations; a sketch of hash partitioning, one common building block of shared-nothing designs, follows.
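The following sketch (hypothetical key values; rebalancing, replication and transactions are ignored) illustrates the partitioning idea behind a shared-nothing configuration: rows are assigned to shards by hashing their key, so adding shards spreads both data and query work across more servers.

    import hashlib

    def shard_for(key, n_shards):
        # Deterministically map a row key to one of n_shards partitions.
        digest = hashlib.sha256(key.encode()).hexdigest()
        return int(digest, 16) % n_shards

    shards = {i: [] for i in range(4)}
    for order_id in ("A17", "B22", "C35", "D41", "E58"):
        shards[shard_for(order_id, len(shards))].append(order_id)

    print(shards)   # each shard stores (and serves queries for) only its own rows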


Strong versus eventual consistency (storage)

In the context of scale-out data storage, scalability is defined as the maximum storage cluster size which guarantees full data consistency, meaning there is only ever one valid version of stored data in the whole cluster, independently of the number of redundant physical data copies. Clusters which provide "lazy" redundancy by updating copies asynchronously are called 'eventually consistent'. This type of scale-out design is suitable when availability and responsiveness are rated higher than consistency, which is true for many web file-hosting services and web caches (''if you want the latest version, wait some seconds for it to propagate''). For classical transaction-oriented applications, this design should be avoided.

Many open-source and even commercial scale-out storage clusters, especially those built on top of standard PC hardware and networks, provide eventual consistency only, as do some NoSQL databases such as CouchDB. Write operations invalidate other copies, but often don't wait for their acknowledgements. Read operations typically don't check every redundant copy prior to answering, and may therefore miss the preceding write operation. Handling the large amount of metadata signal traffic with acceptable performance (i.e., so the cluster acts like a non-clustered storage device or database) would require specialized hardware and short distances.

Whenever strong data consistency is expected, look for these indicators:
* the use of InfiniBand, Fibre Channel or similar low-latency networks to avoid performance degradation with increasing cluster size and number of redundant copies;
* short cable lengths and limited physical extent, avoiding signal-runtime performance degradation;
* majority/quorum mechanisms to guarantee data consistency whenever parts of the cluster become inaccessible.

Indicators of eventually consistent designs (not suitable for transactional applications) are:
* write performance that increases linearly with the number of connected devices in the cluster;
* all parts remaining responsive while the storage cluster is partitioned, with a risk of conflicting updates.

A minimal sketch of a quorum-based replica set, one common way to obtain strong consistency, follows.
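As an illustration of the quorum mechanism mentioned above, the sketch below (hypothetical names, no networking or failure handling) keeps versioned copies on N replicas and requires W acknowledgements per write and R consultations per read, with R + W > N so that every read quorum overlaps every write quorum and therefore sees the latest acknowledged write.

    class QuorumStore:
        """Toy replica set: a write succeeds on W replicas, a read consults R."""

        def __init__(self, n_replicas=3, write_quorum=2, read_quorum=2):
            assert write_quorum + read_quorum > n_replicas   # quorums must overlap
            self.replicas = [dict() for _ in range(n_replicas)]
            self.w, self.r = write_quorum, read_quorum
            self.version = 0

        def write(self, key, value):
            self.version += 1
            # Strong consistency: the write completes only once W replicas hold it.
            for replica in self.replicas[: self.w]:
                replica[key] = (self.version, value)

        def read(self, key):
            # Consult R replicas and return the value with the highest version seen.
            answers = [rep[key] for rep in self.replicas[: self.r] if key in rep]
            return max(answers)[1] if answers else None

    store = QuorumStore()
    store.write("x", "hello")
    print(store.read("x"))   # 'hello', although one replica never saw the write

An eventually consistent store would instead acknowledge the write after updating a single copy and propagate it to the others in the background, trading the overlap guarantee for lower latency and better availability.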


Performance tuning versus hardware scalability

It is often advised to focus system design on hardware scalability rather than on capacity. It is typically cheaper to add a new node to a system in order to achieve improved performance than to partake in performance tuning to improve the capacity that each node can handle. But this approach can have diminishing returns (as discussed in performance engineering). For example: suppose 70% of a program can be sped up if parallelized and run on multiple CPUs instead of one. If \alpha is the fraction of a calculation that is sequential, and 1-\alpha is the fraction that can be parallelized, the maximum speedup that can be achieved by using ''P'' processors is given according to Amdahl's law:

: \frac{1}{\alpha + \frac{1-\alpha}{P}}.

Substituting the values for this example (\alpha = 0.3), using 4 processors gives

: \frac{1}{0.3 + \frac{0.7}{4}} = 2.105.

Doubling the computing power to 8 processors gives

: \frac{1}{0.3 + \frac{0.7}{8}} = 2.581.

Doubling the processing power has only sped up the process by roughly one-fifth. If the whole problem were parallelizable, the speed would also double. Therefore, throwing in more hardware is not necessarily the optimal approach. These figures can be reproduced with the short calculation sketched below.
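The figures above can be checked with a few lines of code; this is simply a direct evaluation of Amdahl's law for the example's serial fraction of 0.3.

    def amdahl_speedup(serial_fraction, processors):
        # Maximum speedup when (1 - serial_fraction) of the work runs in parallel.
        return 1.0 / (serial_fraction + (1.0 - serial_fraction) / processors)

    for p in (4, 8):
        print(p, round(amdahl_speedup(0.3, p), 3))   # 4 -> 2.105, 8 -> 2.581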


Weak versus strong scaling

High-performance computing has two common notions of scalability:
* ''Strong scaling'' is defined as how the solution time varies with the number of processors for a fixed ''total'' problem size.
* ''Weak scaling'' is defined as how the solution time varies with the number of processors for a fixed problem size ''per processor''.
The corresponding parallel efficiencies can be computed as sketched after this list.
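As a hedged illustration of how these notions are commonly quantified (the exact definitions vary by author), strong-scaling efficiency compares the time on ''P'' processors against the single-processor time for the same total problem, while weak-scaling efficiency compares run times for a problem that grows proportionally with ''P''.

    def strong_scaling_efficiency(t1, tp, p):
        # Fixed total problem size: the ideal time on p processors is t1 / p.
        return t1 / (p * tp)

    def weak_scaling_efficiency(t1, tp):
        # Fixed problem size per processor: ideally the run time stays constant.
        return t1 / tp

    print(strong_scaling_efficiency(t1=100.0, tp=30.0, p=4))   # ~0.83
    print(weak_scaling_efficiency(t1=100.0, tp=110.0))         # ~0.91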


See also

* Computational complexity theory
* Extensibility
* Gustafson's law
* List of system quality attributes
* Load balancing (computing)
* Lock (computer science)
* NoSQL
* Scalable Video Coding (SVC)
* Similitude (model)
* Scale (analytical tool)


External links


* Links to diverse learning resources – page curated by the memcached project
* Scalable Definition – by The Linux Information Project (LINFO)
* Scale in Distributed Systems – B. Clifford Neuman, in: ''Readings in Distributed Computing Systems'', IEEE Computer Society Press, 1994