HOME

TheInfoList



OR:

Operating system An operating system (OS) is system software that manages computer hardware, software resources, and provides common daemon (computing), services for computer programs. Time-sharing operating systems scheduler (computing), schedule tasks for ef ...
s use lock managers to organise and serialise the access to resources. A distributed lock manager (DLM) runs in every machine in a cluster, with an identical copy of a cluster-wide lock database. In this way a DLM provides
software application Software is a set of computer programs and associated documentation and data. This is in contrast to hardware, from which the system is built and which actually performs the work. At the lowest programming level, executable code consists ...
s which are
distributed Distribution may refer to: Mathematics *Distribution (mathematics), generalized functions used to formulate solutions of partial differential equations *Probability distribution, the probability of a particular value or value range of a varia ...
across a cluster on multiple machines with a means to synchronize their accesses to
shared resource In computing, a shared resource, or network share, is a computer resource made available from one host to other hosts on a computer network. It is a device or piece of information on a computer that can be remotely accessed from another compu ...
s. DLMs have been used as the foundation for several successful
clustered file system A clustered file system is a file system which is shared by being simultaneously mounted on multiple servers. There are several approaches to clustering, most of which do not employ a clustered file system (only direct attached storage for ...
s, in which the machines in a
cluster may refer to: Science and technology Astronomy * Cluster (spacecraft), constellation of four European Space Agency spacecraft * Asteroid cluster, a small asteroid family * Cluster II (spacecraft), a European Space Agency mission to study th ...
can use each other's storage via a unified
file system In computing, file system or filesystem (often abbreviated to fs) is a method and data structure that the operating system uses to control how data is stored and retrieved. Without a file system, data placed in a storage medium would be one larg ...
, with significant advantages for performance and
availability In reliability engineering, the term availability has the following meanings: * The degree to which a system, subsystem or equipment is in a specified operable and committable state at the start of a mission, when the mission is called for at ...
. The main performance benefit comes from solving the problem of disk cache coherency between participating computers. The DLM is used not only for
file locking File locking is a mechanism that restricts access to a computer file, or to a region of a file, by allowing only one user or process to modify or delete it at a specific time and to prevent reading of the file while it's being modified or deleted ...
but also for coordination of all disk access. VMScluster, the first clustering system to come into widespread use, relied on the
OpenVMS OpenVMS, often referred to as just VMS, is a multi-user, multiprocessing and virtual memory-based operating system. It is designed to support time-sharing, batch processing, transaction processing and workstation applications. Customers using Ope ...
DLM in just this way.


Resources

The DLM uses a generalized concept of a resource, which is some entity to which shared access must be controlled. This can relate to a file, a record, an area of shared memory, or anything else that the
application Application may refer to: Mathematics and computing * Application software, computer software designed to help the user to perform specific tasks ** Application layer, an abstraction layer that specifies protocols and interface methods used in a c ...
designer chooses. A hierarchy of resources may be defined, so that a number of levels of locking can be implemented. For instance, a hypothetical
database In computing, a database is an organized collection of data stored and accessed electronically. Small databases can be stored on a file system, while large databases are hosted on computer clusters or cloud storage. The design of databases ...
might define a resource hierarchy as follows: * Database * Table * Record * Field A process can then acquire locks on the database as a whole, and then on particular parts of the database. A lock must be obtained on a parent resource before a subordinate resource can be locked.


Lock modes

A process running within a VMSCluster may obtain a lock on a resource. There are six lock modes that can be granted, and these determine the level of exclusivity being granted, it is possible to convert the lock to a higher or lower level of lock mode. When all processes have unlocked a resource, the system's information about the resource is destroyed. * Null (NL). Indicates interest in the resource, but does not prevent other processes from locking it. It has the advantage that the resource and its lock value block are preserved, even when no processes are locking it. * Concurrent Read (CR). Indicates a desire to read (but not update) the resource. It allows other processes to read or update the resource, but prevents others from having exclusive access to it. This is usually employed on high-level resources, in order that more restrictive locks can be obtained on subordinate resources. * Concurrent Write (CW). Indicates a desire to read and update the resource. It also allows other processes to read or update the resource, but prevents others from having exclusive access to it. This is also usually employed on high-level resources, in order that more restrictive locks can be obtained on subordinate resources. * Protected Read (PR). This is the traditional ''share lock'', which indicates a desire to read the resource but prevents other from updating it. Others can however also read the resource. * Protected Write (PW). This is the traditional ''update lock'', which indicates a desire to read and update the resource and prevents others from updating it. Others with Concurrent Read access can however read the resource. * Exclusive (EX). This is the traditional ''exclusive lock'' which allows read and update access to the resource, and prevents others from having any access to it. The following
truth table A truth table is a mathematical table used in logic—specifically in connection with Boolean algebra, boolean functions, and propositional calculus—which sets out the functional values of logical expressions on each of their functional arg ...
shows the compatibility of each lock mode with the others:


Obtaining a lock

A process can obtain a lock on a resource by ''enqueueing'' a lock request. This is similar to the QIO technique that is used to perform I/O. The enqueue lock request can either complete synchronously, in which case the process waits until the lock is granted, or asynchronously, in which case an AST occurs when the lock has been obtained. It is also possible to establish a ''blocking AST'', which is triggered when a process has obtained a lock that is preventing access to the resource by another process. The original process can then optionally take action to allow the other access (e.g. by demoting or releasing the lock).


Lock value block

A lock value block is associated with each resource. This can be read by any process that has obtained a lock on the resource (other than a null lock) and can be updated by a process that has obtained a protected update or exclusive lock on it. It can be used to hold any information about the resource that the application designer chooses. A typical use is to hold a ''version number'' of the resource. Each time the associated entity (e.g. a database record) is updated, the holder of the lock increments the lock value block. When another process wishes to read the resource, it obtains the appropriate lock and compares the current lock value with the value it had last time the process locked the resource. If the value is the same, the process knows that the associated entity has not been updated since last time it read it, and therefore it is unnecessary to read it again. Hence, this technique can be used to implement various types of
cache Cache, caching, or caché may refer to: Places United States * Cache, Idaho, an unincorporated community * Cache, Illinois, an unincorporated community * Cache, Oklahoma, a city in Comanche County * Cache, Utah, Cache County, Utah * Cache County ...
in a database or similar application.


Deadlock detection

When one or more processes have obtained locks on resources, it is possible to produce a situation where each is preventing another from obtaining a lock, and none of them can proceed. This is known as a
deadlock In concurrent computing, deadlock is any situation in which no member of some group of entities can proceed because each waits for another member, including itself, to take action, such as sending a message or, more commonly, releasing a loc ...
( E. W. Dijkstra originally called it a deadly embrace). A simple example is when Process 1 has obtained an exclusive lock on Resource A, and Process 2 has obtained an exclusive lock on Resource B. If Process 1 then tries to lock Resource B, it will have to wait for Process 2 to release it. But if Process 2 then tries to lock Resource A, both processes will wait forever for each other. The OpenVMS DLM periodically checks for deadlock situations. In the example above, the second lock enqueue request of one of the processes would return with a deadlock status. It would then be up to this process to take action to resolve the deadlock—in this case by releasing the first lock it obtained.


Linux clustering

Both
Red Hat Red Hat, Inc. is an American software company that provides open source software products to enterprises. Founded in 1993, Red Hat has its corporate headquarters in Raleigh, North Carolina, with other offices worldwide. Red Hat has become a ...
and
Oracle An oracle is a person or agency considered to provide wise and insightful counsel or prophetic predictions, most notably including precognition of the future, inspired by deities. As such, it is a form of divination. Description The word ...
have developed clustering software for
Linux Linux ( or ) is a family of open-source Unix-like operating systems based on the Linux kernel, an operating system kernel first released on September 17, 1991, by Linus Torvalds. Linux is typically packaged as a Linux distribution, whi ...
. OCFS2, the Oracle Cluster File System was added to the official
Linux kernel The Linux kernel is a free and open-source, monolithic, modular, multitasking, Unix-like operating system kernel. It was originally authored in 1991 by Linus Torvalds for his i386-based PC, and it was soon adopted as the kernel for the GNU ...
with version 2.6.16, in January 2006. The alpha-quality code warning on OCFS2 was removed in 2.6.19. Red Hat's cluster software, including their DLM and GFS2 was officially added to the Linux kernel with version 2.6.19, in November 2006. Both systems use a DLM modeled on the venerable VMS DLM.The OCFS2 filesystem
Lwn.net (2005-05-24). Retrieved on 2013-09-18. Oracle's DLM has a simpler API. (the core function, dlmlock(), has eight parameters, whereas the VMS SYS$ENQ service and Red Hat's dlm_lock both have 11.)


Other implementations

Other DLM implementations include the following: *
Google Google LLC () is an American Multinational corporation, multinational technology company focusing on Search Engine, search engine technology, online advertising, cloud computing, software, computer software, quantum computing, e-commerce, ar ...
has developed ''Chubby'', a lock service for loosely coupled distributed systems.Google Research Publication: Chubby Distributed Lock Service
Research.google.com. Retrieved on 2013-09-18.
It is designed for coarse-grained locking and also provides a limited but reliable distributed file system. Key parts of Google's infrastructure, including Google File System,
Bigtable Bigtable is a fully managed wide-column and key-value NoSQL database service for large analytical and operational workloads as part of the Google Cloud portfolio. History Bigtable development began in 2004.. It is now used by a number of Googl ...
, and
MapReduce MapReduce is a programming model and an associated implementation for processing and generating big data sets with a parallel, distributed algorithm on a cluster. A MapReduce program is composed of a ''map'' procedure, which performs filtering ...
, use Chubby to synchronize accesses to shared resources. Though Chubby was designed as a lock service, it is now heavily used inside Google as a
name server A name server refers to the server component of the Domain Name System (DNS), one of the two principal namespaces of the Internet. The most important function of DNS servers is the translation (resolution) of human-memorable domain names (example ...
, supplanting DNS. * Apache ZooKeeper, which was created at
Yahoo Yahoo! (, styled yahoo''!'' in its logo) is an American web services provider. It is headquartered in Sunnyvale, California and operated by the namesake company Yahoo! Inc. (2017–present), Yahoo Inc., which is 90% owned by investment funds ma ...
, is open-source software and can be used to perform distributed locks
Zookeeper.apache.org. Retrieved on 2013-09-18.
as well. *
Etcd Container Linux (formerly CoreOS Linux) is a discontinued open-source lightweight operating system based on the Linux kernel and designed for providing infrastructure to clustered deployments, while focusing on automation, ease of application ...
is open-source software, developed at CoreOS under the Apache License. It can be used to perform distributed locks as well. *
Redis Redis (; Remote Dictionary Server) is an in-memory data structure store, used as a distributed, in-memory key–value database, cache and message broker, with optional durability. Redis supports different kinds of abstract data structures, s ...
is an open source, BSD licensed, advanced key-value cache and store. Redis can be used to implement the Redlock Algorithm for distributed lock management. * HashiCorp's
Consul Consul (abbrev. ''cos.''; Latin plural ''consules'') was the title of one of the two chief magistrates of the Roman Republic, and subsequently also an important title under the Roman Empire. The title was used in other European city-states throu ...
,Consul Overview
Retrieved on 2015-02-19.
which was created by HashiCorp, is open-source software and can be used to perform distributed locks as well. * Taooka distributed lock managerTaooka Description
Retrieved on 2017-05-04.
uses the "try lock" methods to avoid
deadlock In concurrent computing, deadlock is any situation in which no member of some group of entities can proceed because each waits for another member, including itself, to take action, such as sending a message or, more commonly, releasing a loc ...
s. It can also specify a TTL for each lock with nanosecond precision. * A DLM is also a key component of more ambitious single system image (SSI) projects such as
OpenSSI OpenSSI is an open-source single-system image clustering system. It allows a collection of computers to be treated as one large system, allowing applications running on any one machine access to the resources of all the machines in the cluster. ...
.


References


HP OpenVMS Systems Services Reference Manual – $ENQ

Officer - A simple distributed lock manager written in Ruby

FLoM - A free open source distributed lock manager that can be used to synchronize shell commands, scripts and custom developed C, C++, Java, PHP and Python software
{{DEFAULTSORT:Distributed Lock Manager Distributed computing architecture