DeepSpeed is an
open source
Open source is source code that is made freely available for possible modification and redistribution. Products include permission to use the source code, design documents, or content of the product. The open-source model is a decentralized so ...
deep learning
Deep learning (also known as deep structured learning) is part of a broader family of machine learning methods based on artificial neural networks with representation learning. Learning can be supervised, semi-supervised or unsupervised.
...
optimization library for
PyTorch
PyTorch is a machine learning framework based on the Torch library, used for applications such as computer vision and natural language processing, originally developed by Meta AI and now part of the Linux Foundation umbrella. It is free and open ...
. The library is designed to reduce computing power and
memory use and to train large
distributed Distribution may refer to:
Mathematics
*Distribution (mathematics), generalized functions used to formulate solutions of partial differential equations
*Probability distribution, the probability of a particular value or value range of a varia ...
models with better
parallelism on existing
computer hardware
Computer hardware includes the physical parts of a computer, such as the case, central processing unit (CPU), random access memory (RAM), monitor, mouse, keyboard, computer data storage, graphics card, sound card, speakers and motherboard.
...
. DeepSpeed is optimized for low latency, high throughput training. It includes the Zero Redundancy Optimizer (ZeRO) for training models with 1 trillion or more parameters. Features include mixed precision training, single-GPU, multi-GPU, and multi-node training as well as custom model parallelism. The DeepSpeed source code is licensed under
MIT License
The MIT License is a permissive free software license originating at the Massachusetts Institute of Technology (MIT) in the late 1980s. As a permissive license, it puts only very limited restriction on reuse and has, therefore, high license comp ...
and available on
GitHub
GitHub, Inc. () is an Internet hosting service for software development and version control using Git. It provides the distributed version control of Git plus access control, bug tracking, software feature requests, task management, cont ...
.
The team claimed to achieve up to a 6.2x throughput improvement, 2.8x faster convergence, and 4.6x less communication.
See also
*
Comparison of deep learning software
The following table compares notable software frameworks, libraries and computer programs for deep learning.
Deep-learning software by name
Comparison of compatibility of machine learning models
See also
*Comparison of numerical-analy ...
*
Deep learning
Deep learning (also known as deep structured learning) is part of a broader family of machine learning methods based on artificial neural networks with representation learning. Learning can be supervised, semi-supervised or unsupervised.
...
*
Machine learning
Machine learning (ML) is a field of inquiry devoted to understanding and building methods that 'learn', that is, methods that leverage data to improve performance on some set of tasks. It is seen as a part of artificial intelligence.
Machine ...
*
TensorFlow
TensorFlow is a free and open-source software library for machine learning and artificial intelligence. It can be used across a range of tasks but has a particular focus on training and inference of deep neural networks. "It is machine learnin ...
References
Further reading
*
External links
AI at Scale - Microsoft ResearchGitHub - microsoft/DeepSpeedZeRO & DeepSpeed: New system optimizations enable training models with over 100 billion parameters - Microsoft Research
C++ libraries
Python (programming language) libraries
Free and open-source software
Microsoft development tools
Microsoft free software
Microsoft Research
Software using the MIT license
2020 software
Deep learning software
{{Microsoft-software-stub