Imagen is a series of text-to-image models developed by Google DeepMind. They were developed by Google Brain until that team's merger with DeepMind in April 2023. Imagen is primarily used to generate images from text prompts, similar to Stability AI's Stable Diffusion, OpenAI's DALL-E, or Midjourney. The original version of the model was first discussed in a paper from May 2022. The tool produces high-quality images and is available to all users with a Google account through services including Gemini, ImageFX, and Vertex AI.


History

Imagen's original version was first presented in a paper published in May 2022. It featured the ability to generate high-fidelity images from natural language. The second version, Imagen 2, was released in December 2023; its standout feature was text and logo generation. Imagen 3 was released in August 2024. According to Google, Imagen 3 provides better detail and lighting in generated images. On 20 May 2025, at Google I/O 2025, the company released an improved model, Imagen 4.


Technology

Imagen uses two key technologies. The first is the use of transformer-based large language models, notably T5, to understand the text prompt and encode it for image synthesis. The second is the use of cascaded diffusion models, which provide high-fidelity image generation. An image is generated in three stages: a base image of 64×64 pixels is produced first and is then upsampled to 256×256 and finally to 1024×1024.
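
The three-stage cascade described above can be sketched as a resolution schedule. This is only an illustration of the sizes named in the text, not Imagen's implementation: the `Stage` class, the `cascade_schedule` function, and the stage names are hypothetical, and no diffusion model is actually run.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Stage:
    """One step of the cascade (names are illustrative, not Imagen's own)."""
    name: str
    input_size: Optional[int]  # None for the base stage, which has no input image
    output_size: int           # side length of the square output, in pixels


def cascade_schedule(base_size: int = 64,
                     factors: Tuple[int, ...] = (4, 4)) -> List[Stage]:
    """Return a base stage followed by one upsampling stage per factor."""
    stages = [Stage("base", None, base_size)]
    size = base_size
    for index, factor in enumerate(factors, start=1):
        # Each upsampling stage takes the previous output as its input.
        stages.append(Stage(f"super_resolution_{index}", size, size * factor))
        size *= factor
    return stages


if __name__ == "__main__":
    for stage in cascade_schedule():
        print(stage.name, stage.input_size, "->", stage.output_size)
```

With the default arguments, each upsampling stage multiplies the side length by 4, which reproduces the 64, 256, and 1024 pixel sizes given in the text.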


Capabilities

Imagen can generate photorealistic images from text prompts. It can also create various styles, such as cinematic, 35mm film, illustration, and surreal. The model can generate images in five aspect ratios, namely 9:16, 3:4, 1:1, 4:3, and 16:9. Imagen can also refine already generated images by editing existing text prompts.
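
As a small illustration of the five aspect ratios listed above, the following sketch checks a ratio string against that list and classifies its orientation. It does not call any Imagen or Google API; `SUPPORTED_ASPECT_RATIOS`, `parse_aspect_ratio`, and `orientation` are hypothetical names introduced only for this example.

```python
from fractions import Fraction

# The five ratios named in the text, written as "width:height".
SUPPORTED_ASPECT_RATIOS = ("9:16", "3:4", "1:1", "4:3", "16:9")


def parse_aspect_ratio(text: str) -> Fraction:
    """Return width/height as an exact fraction, rejecting unlisted ratios."""
    if text not in SUPPORTED_ASPECT_RATIOS:
        raise ValueError(f"unsupported aspect ratio: {text!r}")
    width, height = text.split(":")
    return Fraction(int(width), int(height))


def orientation(text: str) -> str:
    """Classify a listed ratio as portrait, square, or landscape."""
    ratio = parse_aspect_ratio(text)
    if ratio > 1:
        return "landscape"
    if ratio < 1:
        return "portrait"
    return "square"


if __name__ == "__main__":
    for ratio in SUPPORTED_ASPECT_RATIOS:
        print(ratio, orientation(ratio))
```

Validating against a fixed tuple keeps the example tied to the ratios the text actually names; any other string raises `ValueError`.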


See also

* Artificial intelligence art
* Computer art
* Generative art
* DALL-E
* Midjourney
* Stable Diffusion


External links


Imagen website