Huawei PanGu, also known as PanGu, PanGu-Σ or PanGu-π (Chinese: 盘古大模型; pinyin: pángǔ dà móxíng), is a multimodal large language model developed by Huawei. It was announced on July 7, 2023.
The name ''PanGu'' is derived from Pangu, a primordial being in Chinese mythology and folklore who is associated with the creation of the world.
History
Early development
In April 2023, Huawei released a paper detailing the development of PanGu-Σ, a language model with 1.085 trillion parameters. Developed within Huawei's MindSpore 5 framework, PanGu-Σ was trained for over 100 days on a cluster of 512 Ascend 910 AI accelerator chips, processing 329 billion tokens in more than 40 natural and programming languages.
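The training figures above imply a rough average throughput. The following back-of-envelope sketch uses only the quoted numbers; because training took more than 100 days, the results are upper bounds on the average rate, not measured values.

```python
# Back-of-envelope estimate from the figures quoted in the text.
TOKENS = 329e9   # tokens processed during training
DAYS = 100       # lower bound on training time ("over 100 days")
CHIPS = 512      # Ascend 910 accelerators in the cluster

SECONDS = DAYS * 24 * 60 * 60

# Average throughput, assuming exactly 100 days (so an upper bound).
cluster_tokens_per_s = TOKENS / SECONDS
chip_tokens_per_s = cluster_tokens_per_s / CHIPS

print(f"cluster:  {cluster_tokens_per_s:,.0f} tokens/s (upper bound)")
print(f"per chip: {chip_tokens_per_s:.1f} tokens/s (upper bound)")
```

This comes to roughly 38,000 tokens per second across the cluster, or about 74 tokens per second per chip.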
PanGu-Σ combines Random Routed Experts (RRE) with the Transformer decoder architecture, which allows sub-models to be extracted easily for applications such as conversation, translation, code generation, and natural-language understanding. The model achieves 6.3 times higher training throughput than mixture-of-experts (MoE) models with the same hyper-parameters. In the Chinese domain, it outperforms previous state-of-the-art models across 16 tasks in a zero-shot setting. Trained on datasets from 40 domains, including Chinese, English, bilingual, and code, PanGu-Σ performs well in few-shot natural-language understanding, open-domain dialogue, question answering, machine translation, and code generation.
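The text does not describe how RRE routes tokens, so the following is only a toy sketch under stated assumptions: tokens are first sent to a group of experts by domain, then assigned to one expert in that group through a fixed, randomly initialised lookup table rather than a learned gate. All names and sizes here (`build_routing_table`, `EXPERTS_PER_DOMAIN`, `VOCAB_SIZE`) are hypothetical and are not taken from Huawei's implementation. The sketch illustrates why a domain-specific sub-model can be cut out without retraining the router.

```python
import random

DOMAINS = ["chinese", "english", "bilingual", "code"]  # domains named in the text
EXPERTS_PER_DOMAIN = 4   # assumed value, for illustration only
VOCAB_SIZE = 1000        # assumed toy vocabulary size


def build_routing_table(seed: int = 0) -> dict[str, list[int]]:
    """For each domain, map every token id to a fixed, randomly chosen expert."""
    rng = random.Random(seed)
    return {
        domain: [rng.randrange(EXPERTS_PER_DOMAIN) for _ in range(VOCAB_SIZE)]
        for domain in DOMAINS
    }


def route(table: dict[str, list[int]], domain: str, token_id: int) -> tuple[str, int]:
    """Level 1 selects the domain's expert group; level 2 is a table lookup."""
    return domain, table[domain][token_id]


def extract_submodel(table: dict[str, list[int]], domain: str) -> dict[str, list[int]]:
    """Keep only one domain's routing entries (a real model would also keep its experts)."""
    return {domain: table[domain]}


table = build_routing_table()
code_only = extract_submodel(table, "code")

# The extracted sub-model routes a token to the same expert as the full model,
# because the routing table is fixed rather than learned jointly across domains.
print(route(table, "code", 42))
print(route(code_only, "code", 42))
```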
Launch
During the Huawei Developer Conference on July 7, 2023, Huawei introduced PanGu 3.0, a large language model (LLM) tailored to industry solutions in sectors such as government, finance, manufacturing, mining, and meteorology. The following month, Huawei launched the Celia virtual assistant with advanced AI features; it can generate long text replies from user voice commands and was set to be released with HarmonyOS 4.0 for eligible devices.
The LLM was designed for enterprises seeking advantages in the AI industry, focusing on task execution over creative work, unlike traditional models used for general purposes like chatbots, poetry, and visual content creation.
Using the same technology as ChatGPT, Huawei's LLM features a hierarchical architecture that lets customers adapt the model to different tasks and train it on their own datasets, making it applicable across many industries.
Updates
On August 5, 2023, Huawei partnered with the European Centre for Medium-Range Weather Forecasts (ECMWF) to launch a global weather forecasting AI model. The model uses Huawei Cloud solutions and the PanGu-Weather model with MindSpore. It is accessible on the ECMWF website and aims to provide accurate weather data.
On December 19, 2023, Huawei announced its financial services on the PanGu-powered AI Finance platform for the global market. The tech giant introduced this product at the 2023 Huawei Cloud Fintech Summit, aiming to reshape the digital finance industry with efficient features to boost Fintech firms worldwide. The platform incorporated a variety of advanced technologies, including AI, big data analytics, and blockchain.
On June 21, 2024, at HDC 2024, Huawei announced the upgraded PanGu 5.0 alongside HarmonyOS NEXT. This version is integrated with Harmony Intelligence, which features a smarter Celia (Xiaoyi) and focuses on generative AI updates to its LLM platform for creating new content such as text, code, or images. To make PanGu accessible to a wide range of developers and businesses, Huawei offered scalable options: smaller models that need less computational power for those with limited resources, and larger models with greater capacity for complex tasks.
Technical specifications
PanGu Large Model 3.0, designed for industry use, was structured with a 5+N+X three-tier architecture.
* First Layer (L0): Comprises PanGu's five basic large models to provide a variety of capabilities for different industry scenarios. These include Natural Language Processing (NLP) models, Visual models, Multimodal models, Prediction models, and Scientific Computing models.
* Second Layer (L1): Consists of N large industry-specific models. These models are trained using public data from various industries, such as government, finance, manufacturing, mining, and weather. Customers can additionally use their own data, on top of the L0 and L1 models, to train proprietary models tailored to them.
* Third Layer (L2): Provides customers with detailed scenario-specific models. This layer focuses on specific applications or business needs, offering ready-to-use model services.
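The layering above can be pictured as a small data structure. This is an illustrative sketch only: the `IndustryModel` class and the example pairings of industries, base models, and scenarios are hypothetical and are not taken from Huawei's catalogue.

```python
from dataclasses import dataclass, field

# L0: the five basic large models listed in the text.
L0_BASE_MODELS = ["NLP", "Visual", "Multimodal", "Prediction", "Scientific Computing"]


@dataclass
class IndustryModel:
    """L1: one of the N industry models, built on an L0 base model."""
    industry: str
    base: str
    scenarios: list[str] = field(default_factory=list)  # L2: scenario models

    def __post_init__(self) -> None:
        # An L1 model must sit on top of one of the five L0 models.
        if self.base not in L0_BASE_MODELS:
            raise ValueError(f"unknown L0 base model: {self.base}")


# Hypothetical example entries, for illustration only.
weather = IndustryModel("weather", base="Prediction", scenarios=["typhoon track forecast"])
finance = IndustryModel("finance", base="NLP", scenarios=["contract review"])

for model in (weather, finance):
    print(f"L0 {model.base} -> L1 {model.industry} -> L2 {model.scenarios}")
```

Modelling L2 as a list attached to each L1 entry reflects the description of scenario models as specialisations of an industry model; other representations would work equally well.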
The updated PanGu Model 5.0, from Huawei's Huawei Cloud business division, offered three key features: adaptability to different business scenarios, multi-style modeling, and advanced intelligence. Huawei divided the AI model platform into four series, each with a different parameter scale:
* PanGu E Series: The Embedded version supports smart apps on phones, tablets, PCs, and other devices, with a parameter scale of 1 billion.
* PanGu P Series: The Professional version features a 10-billion parameter scale, ideal for low-latency and low-cost reasoning conditions.
* PanGu U Series: The Ultra version comes in two variants, with 135 billion and 230 billion parameters, capable of handling complex tasks and serving as a base for large models.
* PanGu S Series: The Super PanGu is the top-tier edition, featuring trillion-level parameters, designed to manage advanced AI technology scenarios such as cross-domain or multi-tasking applications.
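As a minimal sketch of how the tiers trade capacity against compute, the listed parameter scales can be put in a lookup that returns the largest variant fitting a given parameter budget. The function name and the budget-based selection rule are assumptions for illustration; the S series is only described as "trillion-level", so 1e12 is used as an assumed lower bound.

```python
# Parameter scales as listed in the text. The S series figure (1e12) is an
# assumed lower bound, since the text only says "trillion-level".
SERIES = [
    ("E", 1e9),
    ("P", 10e9),
    ("U", 135e9),
    ("U", 230e9),
    ("S", 1e12),
]


def largest_series_within(max_params: float):
    """Return the largest listed variant whose parameter count fits the budget."""
    fitting = [(name, params) for name, params in SERIES if params <= max_params]
    return max(fitting, key=lambda item: item[1]) if fitting else None


print(largest_series_within(20e9))    # a budget of 20 billion parameters
print(largest_series_within(500e9))   # a budget of 500 billion parameters
print(largest_series_within(0.5e9))   # below the smallest listed model
```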
See also
* Large language model
* Gemini
* GPT-4