GridFTP
   HOME

TheInfoList



OR:

GridFTP is an extension of the File Transfer Protocol (FTP) for
grid computing Grid computing is the use of widely distributed computer resources to reach a common goal. A computing grid can be thought of as a distributed system with non-interactive workloads that involve many files. Grid computing is distinguished from co ...
. The protocol was defined within the GridFTP working group of the
Open Grid Forum The Open Grid Forum (OGF) is a community of users, developers, and vendors for standardization of grid computing. It was formed in 2006 in a merger of the Global Grid Forum and the Enterprise Grid Alliance. The OGF models its process on the In ...
. There are multiple implementations of the protocol; the most widely used is that provided by the
Globus Toolkit The Globus Toolkit is an open-source toolkit for grid computing developed and provided by the Globus Alliance. On 25 May 2017 it was announced that the open source support for the project would be discontinued in January 201 due to a lack of fin ...
. The aim of GridFTP is to provide a more reliable and high performance file transfer, for example to enable the transmission of very large files. GridFTP is used extensively within large science projects such as the
Large Hadron Collider The Large Hadron Collider (LHC) is the world's largest and highest-energy particle collider. It was built by the European Organization for Nuclear Research (CERN) between 1998 and 2008 in collaboration with over 10,000 scientists and hundred ...
and by many supercomputer centers and other scientific facilities. GridFTP also addresses the problem of incompatibility between storage and access systems. Previously, each data provider would make their data available in their own specific way, providing a library of access functions. This made it difficult to obtain data from multiple sources, requiring a different access method for each, and thus dividing the total available data into partitions. GridFTP provides a uniform way of accessing the data, encompassing functions from all the different modes of access, building on and extending the universally accepted FTP standard. FTP was chosen as a basis for it because of its widespread use, and because it has a well defined architecture for extensions to the protocol (which may be dynamically discovered). Numerous GridFTP clients have been developed. The Globus Online software-as-a-service system is particularly popular.


Features of GridFTP

GridFTP integrates with the
Grid Security Infrastructure The Grid Security Infrastructure (GSI), formerly called the Globus Security Infrastructure, is a specification for secret, tamper-proof, delegatable communication between software in a grid computing environment. Secure, authenticatable communicat ...
, which provides authentication and encryption to file transfers, with user-specified levels of confidentiality and data integrity, also for cross-server transfers (what FTP calls the
File eXchange Protocol File eXchange Protocol (FXP or FXSP) is a method of data transfer which uses FTP to transfer data from one remote server to another (inter-server) without routing this data through the client's connection. Conventional FTP involves a single serve ...
, FXP). GridFTP achieves much greater use of bandwidth than conventional data stream technology by using multiple simultaneous TCP streams. Files can be downloaded in pieces simultaneously from multiple sources; or even in separate parallel streams from the same source, which is still able to make better use of the bandwidth. Striped and interleaved transfers, again either from multiple or single sources, allow further speed increases. Although FTP has the ability to resume an interrupted file transfer from a specific point in a file, it does not support the transmission of only a certain portion of a file. GridFTP allows a subset of a file to be sent. Such a feature is useful in applications where only small sections of a very large data file are required for processing (a motivating example being the processing of data from a high energy physics experiment, a traditional use of Grid technology). GridFTP provides a fault tolerant implementation of FTP, to handle network unavailability and server problems. Transfers can also be automatically restarted if a problem occurs. The underlying TCP connection in FTP has numerous settings such as window size and buffer size. GridFTP allows automatic (or manual) negotiation of these settings to provide optimal transfer speeds and reliability (optimal settings are likely to be different with large ''files'' and for large ''groups'' of files).


References

{{DEFAULTSORT:Gridftp Grid computing File Transfer Protocol Network file transfer protocols