Veo (text-to-video model)

Veo is a text-to-video model developed by Google DeepMind and announced in May 2024. As a generative AI model, it creates videos based on user prompts. Veo 3, released in May 2025, can also generate accompanying audio.


Development

In May 2024, a multimodal video generation model called Veo was announced at Google I/O 2024. Google claimed that it could generate 1080p videos over a minute long. In December 2024, Google released Veo 2, available via VideoFX. It supports 4K resolution
video generation and has an improved understanding of physics. In April 2025, Google announced that Veo 2 had become available to Gemini Advanced users in the Gemini app. In May 2025, Google released Veo 3, which generates not only video but also synchronized audio, including dialogue, music, sound effects, and ambient noise, to match the visuals. Google also announced Flow, a video-creation tool powered by Veo and Imagen. Google DeepMind CEO Demis Hassabis described the release as the moment when AI video generation left the era of the silent film.


Reactions

A reporter for ''Gizmodo'' reacted to the release of Veo 3 by observing that users directed the model to generate low-quality content, such as man-on-the-street interviews or haul videos of people unboxing products. Another media commentator reported that the tool tended to repeat the same joke in response to different prompts. Commentators speculated that Google had trained the service on YouTube videos or Reddit posts. Google itself had not stated the source of its training content.

