Chinchilla (language model)
Chinchilla is a family of large language models (LLMs) developed by the research team at Google DeepMind and presented in March 2022. It is named "Chinchilla" because it is a further development of a previous model family named Gopher; both families were trained in order to investigate the scaling laws of large language models. Chinchilla was claimed to outperform GPT-3, and it considerably simplifies downstream use because it requires much less compute for inference and fine-tuning. Based on the training of previously employed language models, it was determined that if one doubles the model size, one must also double the number of training tokens; this hypothesis was used by DeepMind to train Chinchilla. At roughly the same training cost as Gopher, Chinchilla has 70B parameters and was trained on about four times as much data. Chinchilla has an average accuracy of 67.5% on the Measuring Massive Multitask Language Understanding (MMLU) benchmark, which is about 7% higher than Gopher's performance.
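The trade-off described above can be made concrete with a minimal sketch. Assuming the commonly used estimate that training compute is roughly C ≈ 6·N·D FLOPs (N parameters, D training tokens) and the roughly 20-tokens-per-parameter rule of thumb associated with Chinchilla, the following Python snippet (not from DeepMind's code; the token and parameter counts are approximate published figures) compares the Gopher and Chinchilla training runs:

```python
# Illustrative sketch of the Chinchilla rule of thumb: scale parameters and
# training tokens together (~20 tokens per parameter), with training compute
# approximated by the common C ~ 6 * N * D FLOP estimate.

def training_flops(params: float, tokens: float) -> float:
    """Approximate training compute in FLOPs (6 * N * D estimate)."""
    return 6.0 * params * tokens

def compute_optimal_tokens(params: float, tokens_per_param: float = 20.0) -> float:
    """Token count suggested by the ~20 tokens/parameter heuristic."""
    return tokens_per_param * params

if __name__ == "__main__":
    gopher_params, gopher_tokens = 280e9, 300e9          # Gopher: ~280B params, ~300B tokens
    chinchilla_params, chinchilla_tokens = 70e9, 1.4e12  # Chinchilla: ~70B params, ~1.4T tokens

    # Both runs cost roughly the same amount of compute ...
    print(f"Gopher FLOPs     ~ {training_flops(gopher_params, gopher_tokens):.2e}")
    print(f"Chinchilla FLOPs ~ {training_flops(chinchilla_params, chinchilla_tokens):.2e}")

    # ... but Chinchilla sits near the ~20 tokens/parameter optimum, Gopher does not.
    print(f"Chinchilla tokens/param: {chinchilla_tokens / chinchilla_params:.0f}")
    print(f"Gopher tokens/param:     {gopher_tokens / gopher_params:.1f}")
```

At the same compute budget, the smaller but longer-trained model comes out ahead, which is why Chinchilla is also cheaper to use for inference and fine-tuning.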



Large Language Models
A large language model (LLM) is a language model trained with self-supervised machine learning on a vast amount of text, designed for natural language processing tasks, especially language generation. The largest and most capable LLMs are generative pretrained transformers (GPTs), which are largely used in generative chatbots such as ChatGPT or Gemini. LLMs can be fine-tuned for specific tasks or guided by prompt engineering. These models acquire predictive power regarding syntax, semantics, and ontologies inherent in human language corpora, but they also inherit inaccuracies and biases present in the data they are trained on. History: Before the emergence of transformer-bas ...
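As a rough illustration of the self-supervised objective mentioned above (the "labels" come from the text itself, not from human annotation), here is a toy Python sketch of next-token prediction using a simple bigram count model; real LLMs use transformer networks trained on vastly larger corpora, so this is illustrative only:

```python
# Toy sketch of self-supervised next-token prediction: for each token in the
# corpus, the following token serves as the training target.
from collections import Counter, defaultdict

corpus = "the cat sat on the mat . the dog sat on the rug .".split()

# "Training": count which tokens follow each token (a bigram model).
next_counts: dict[str, Counter] = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    next_counts[prev][nxt] += 1

def predict_next(token: str) -> str:
    """Return the token most often seen after `token` in the corpus."""
    counts = next_counts.get(token)
    return counts.most_common(1)[0][0] if counts else "<unk>"

print(predict_next("the"))   # e.g. 'cat' (whichever follower is most frequent)
print(predict_next("sat"))   # 'on'
```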


