BLOOM (language model)
BigScience Large Open-science Open-access Multilingual Language Model (BLOOM) is a transformer-based large language model (LLM) that is freely available to the public. It was trained on approximately 366 billion tokens from March to July 2022. Initiated by a co-founder of Hugging Face, the BLOOM project involved six main groups: Hugging Face's BigScience team, the Microsoft DeepSpeed team, the NVIDIA Megatron-LM team, the IDRIS/GENCI team, the PyTorch team, and volunteers in the BigScience Engineering workgroup. The training data encompasses 46 natural languages and 13 programming languages, amounting to 1.6 terabytes of pre-processed text converted into 350 billion unique tokens for BLOOM's training datasets.

