
Human image synthesis is technology that can be applied to make believable and even
photorealistic renditions of human-likenesses, moving or still. It has effectively existed since the early 2000s. Many films using
computer generated imagery
have featured synthetic images of human-like characters
digitally composited onto the real or other simulated film material. Towards the end of the 2010s
deep learning artificial intelligence
has been applied to
synthesize images and video that look like humans, without need for human assistance, once the training phase has been completed, whereas the old school 7D-route required massive amounts of human work.
Timeline of human image synthesis
* In 1971
Henri Gouraud made the first
CG geometry
capture
and representation of a human face. Modeling was his wife Sylvie Gouraud. The 3D model was a simple
wire-frame model
and he applied
the Gouraud shader he is most known for to produce the first known representation of human-likeness on computer.
* The 1972 short film ''
A Computer Animated Hand
'' by
Edwin Catmull
and
Fred Parke was the first time that
computer-generated imagery
was used in film to simulate moving human appearance. The film featured a computer simulated hand and face.
* The 1976 film ''
Futureworld
'' reused parts of ''A Computer Animated Hand'' on the big screen.
* The 1983
music video for song Musique Non-Stop by German band
Kraftwerk
aired in 1986. Created by the artist
Rebecca Allen, it features non-realistic looking, but clearly recognizable computer simulations of the band members.
* The 1994 film
The Crow
was the first film production to make use of digital compositing of a computer simulated representation of a face onto scenes filmed using a
body double
. Necessity was the muse as the actor
Brandon Lee
portraying the protagonist was accidentally killed on set.
* In 1999
Paul Debevec et al. of
USC captured the reflectance field of a human face with their first version of a
light stage
. They presented their method at the
SIGGRAPH 2000.

* In 2003
audience
debut of photo realistic human-likenesses in the 2003 films ''
The Matrix Reloaded'' in
the burly brawl sequence where up-to-100
Agent Smiths fight
Neo
and in ''
The Matrix Revolutions
'' where at the start of the end showdown Agent Smith's
cheekbone
gets punched in by Neo leaving the digital look-alike unnaturally unhurt. The Matrix Revolutions bonus DVD documents and depicts the process in some detail and the techniques used, including
facial motion capture and
limbal
motion capture
, and
projection onto models.
* In 2003 ''The Animatrix: Final Flight of the Osiris'', made by Square Pictures, featured state-of-the-art would-be human likenesses that did not quite fool the viewer.
* In 2003 digital likeness of
Tobey Maguire
was made for movies ''
Spider-man 2
'' and ''
Spider-man 3'' by
Sony Pictures Imageworks.
* In 2005 the Face of the Future project was established by the
University of St Andrews
and Perception Lab, funded by the
EPSRC.
The website contains a "Face Transformer", which enables users to transform their face into any
ethnicity
and
age as well as the ability to transform their face into a painting (in the style of either
Sandro Botticelli
or
Amedeo Modigliani
). This process is achieved by combining the user's photograph with an
average
face.
* In 2009 Debevec et al. presented new digital likenesses, made by Image Metrics, this time of actress Emily O'Brien, whose reflectance was captured with the USC light stage 5. In the TED talk video
at 00:04:59 you can see ''two clips, one with the real Emily shot with a real camera and one with a digital look-alike of Emily, shot with a simulation of a camera – Which is which is difficult to tell''. Bruce Lawmen was scanned using USC light stage 6 in still position and also recorded running there on a treadmill
. Many digital look-alikes of Bruce are seen running fluently and naturally in the ending sequence of the TED talk video. The motion looks fairly convincing compared to the clunky run in ''The Animatrix: Final Flight of the Osiris'', which was state-of-the-art in 2003 if photorealism was the intention of the animators.
* In 2009 a digital look-alike of a younger Arnold Schwarzenegger was made for the movie '' Terminator Salvation'' though the end result was critiqued as unconvincing. Facial geometry was acquired from a 1984 mold of Schwarzenegger.
* In 2010 Walt Disney Pictures
released a sci-fi sequel entitled ''Tron: Legacy'' with a digitally rejuvenated digital look-alike of actor Jeff Bridges playing the antagonist CLU.
* At SIGGRAPH 2013 Activision and USC presented a real-time "Digital Ira", a digital face look-alike of Ari Shapiro, an ICT USC research scientist, utilizing the USC light stage X by Ghosh et al. for both reflectance field and motion capture. The end result, both precomputed and rendered in real time with the most modern game GPU of the time, looks fairly realistic.
* In 2014 The Presidential Portrait by USC ICT
in conjunction with the Smithsonian Institution
was made using the latest USC mobile light stage wherein President Barack Obama
had his geometry, textures and reflectance captured.
* In 2014 Ian Goodfellow et al. presented the principles of a generative adversarial network. GANs made the headlines in early 2018 with the deepfakes controversies.
* For the 2015 film ''Furious 7'' a digital look-alike of actor Paul Walker, who died in an accident during the filming, was made by Weta Digital to enable the completion of the film.
* In 2016 techniques which allow near real-time counterfeiting
of facial expressions in existing 2D video were believably demonstrated.
* In 2016 a digital look-alike of Peter Cushing was made for the film ''Rogue One'', appearing to be the same age as the actor was during the filming of the original 1977 ''Star Wars'' film.
* At SIGGRAPH 2017 an audio-driven digital look-alike of the upper torso of Barack Obama was presented by researchers from the University of Washington. It was driven only by a voice track as source data for the animation, after the training phase to acquire lip sync and wider facial information from training material consisting of 2D videos with audio had been completed.
* Late 2017 and early 2018 saw the surfacing of the deepfakes controversy, where porn videos were doctored using deep machine learning so that the face of the actress was replaced by the software's opinion of what another person's face would look like in the same pose and lighting.
* At GDC 2018 Epic Games and Tencent Games demonstrated "Siren", a digital look-alike of the actress Bingjie Jiang. It was made possible with the following technologies: CubicMotion's computer vision system, 3Lateral's facial rigging system and Vicon's motion capture system. The demonstration ran in near real time at 60 frames per second in the Unreal Engine 4.
* In 2018 at the World Internet Conference in Wuzhen the Xinhua News Agency
presented two digital look-alikes made to resemble its real news anchors Qiu Hao (Chinese language) and Zhang Zhao (English language). The digital look-alikes were made in conjunction with Sogou. Neither the speech synthesis used nor the gesturing of the digital look-alike anchors was good enough to deceive the viewer into mistaking them for real humans imaged with a TV camera.
* In September 2018 Google added "involuntary synthetic pornographic imagery" to its ban list, allowing anyone to request the search engine block results that falsely depict them as "nude or in a sexually explicit situation."
* In February 2019 Nvidia open sourced StyleGAN, a novel generative adversarial network. Right after this Phillip Wang made the website ThisPersonDoesNotExist.com with StyleGAN to demonstrate that unlimited amounts of often photo-realistic looking facial portraits of no-one can be made automatically using a GAN. Nvidia's StyleGAN was presented in a not yet peer-reviewed paper in late 2018.
* At the June 2019 CVPR the MIT CSAIL presented a system titled ''"Speech2Face: Learning the Face Behind a Voice"'' that synthesizes likely faces based on just a recording of a voice. It was trained with massive amounts of video of people speaking.
* Since 1 July 2019 Virginia has criminalized the sale and dissemination of unauthorized synthetic pornography, but not the manufacture, as § 18.2–386.2 titled 'Unlawful dissemination or sale of images of another; penalty.' became part of the Code of Virginia. The law text states: "''Any person who, with the intent to coerce, harass, or intimidate, maliciously disseminates or sells any videographic or still image created by any means whatsoever that depicts another person who is totally nude, or in a state of undress so as to expose the genitals, pubic area, buttocks, or female breast, where such person knows or has reason to know that he is not licensed or authorized to disseminate or sell such videographic or still image is guilty of a Class 1 misdemeanor.''" The identical bills were House Bill 2678, presented by Delegate Marcus Simon to the Virginia House of Delegates on 14 January 2019, and Senate Bill 1736, introduced to the Senate of Virginia by Senator Adam Ebbin three days later.
* On 1 September 2019 the Texas Senate bill SB 751 amendments to the election code came into effect, giving candidates in elections a 30-day protection period before the elections during which making and distributing digital look-alikes or synthetic fakes of the candidates is an offense. The law text defines the subject of the law as "''a video, created with the intent to deceive, that appears to depict a real person performing an action that did not occur in reality''".
* In September 2019 Yle, the Finnish public broadcasting company
, aired a result of experimental journalism, a deepfake of the President in office Sauli Niinistö
in its main news broadcast for the purpose of highlighting the advancing disinformation technology and problems that arise from it.
* On 1 January 2020 the California state law AB-602 came into effect, banning the manufacturing and distribution of synthetic pornography without the consent of the people depicted. AB-602 provides victims of synthetic pornography with injunctive relief and poses legal threats of statutory and punitive damages on criminals making or distributing synthetic pornography without consent. The bill AB-602 was signed into law by California Governor Gavin Newsom on 3 October 2019 and was authored by California State Assembly member Marc Berman.
* On 1 January 2020 a Chinese law requiring that synthetically faked footage bear a clear notice about its fakeness came into effect. Failure to comply could be considered a crime, the Cyberspace Administration of China stated on its website. China announced this new law in November 2019. The Chinese government seems to be reserving the right to prosecute both users and online video platforms failing to abide by the rules.
Key breakthrough to photorealism: reflectance capture
In 1999 Paul Debevec et al. of USC did the first known reflectance capture over the human face with their extremely simple light stage. They presented their method and results at SIGGRAPH 2000.
The scientific breakthrough required finding the subsurface light component (the simulation models are glowing from within slightly), which can be found using the knowledge that light reflected from the oil-to-air layer retains its polarization while the subsurface light loses its polarization. So, equipped only with a movable light source, a movable video camera, 2 polarizers and a computer program doing extremely simple math, the last piece required to reach photorealism was acquired.
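The "extremely simple math" behind the polarization argument can be illustrated with a minimal sketch. It assumes ideal polarizers, linear pixel intensities, and two captures of the same view: one with the camera polarizer parallel to the light's polarizer and one with it crossed. The function name `separate_reflectance` and the list-of-floats image representation are hypothetical choices for illustration, not the method published by the authors.

```python
def separate_reflectance(parallel, cross):
    """Split per-pixel intensities into surface and subsurface parts.

    parallel: intensities seen through a polarizer aligned with the light's polarizer
    cross:    intensities seen through a polarizer rotated 90 degrees
    Both are flat lists of linear intensities (an assumption of this sketch).
    """
    if len(parallel) != len(cross):
        raise ValueError("both captures must have the same number of pixels")
    specular, subsurface = [], []
    for p, c in zip(parallel, cross):
        # Depolarized subsurface light passes either analyzer orientation equally,
        # so the crossed capture holds half of it and none of the surface part.
        subsurface.append(2.0 * c)
        # Polarization-preserving surface reflection shows up only in the
        # parallel capture; clamp to zero to absorb sensor noise.
        specular.append(max(p - c, 0.0))
    return specular, subsurface
```

Under these assumptions, a pixel that is equally bright in both captures is treated as pure subsurface light, while any excess in the parallel capture is attributed to the surface reflection.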
For a believable result both the light reflected from the skin (BRDF) and the light scattered within the skin (a special case of BTDF), which together make up the BSDF, must be captured and simulated.
Capture
* The 3D geometry
and textures are captured onto a 3D model by a 3D reconstruction
method, such as sampling the target by means of 3D scanning with an RGB XYZ scanner such as Arius3d or Cyberware (textures from photos, not pure RGB XYZ scanner), stereophotogrammetrically from synchronized photos or even from enough repeated non-simultaneous photos. Digital sculpting can be used to make up models of the body parts for which data cannot be acquired e.g. parts of the body covered by clothing.
* For believable results also the reflectance field
must be captured or an approximation must be picked from the libraries to form a 7D reflectance model of the target.
Synthesis
The whole process of making digital look-alikes i.e. characters so lifelike and realistic that they can be passed off as pictures of humans is a very complex task as it requires photorealistically modeling, animating, cross-mapping, and rendering the soft body dynamics of the human appearance.
Synthesis with an actor and suitable algorithms is applied using powerful computers. The actor's part in the synthesis is to take care of mimicking human expressions in still picture synthesizing and also human movement in motion picture synthesizing. Algorithms are needed to simulate laws of physics and physiology and to map the models and their appearance, movements and interaction accordingly.
Often both physics/physiology-based (i.e. skeletal animation) and image-based modeling and rendering are employed in the synthesis part. Hybrid models employing both approaches have shown the best results in realism and ease of use. Morph target animation reduces the workload by giving higher-level control: different facial expressions are defined as deformations of the model, which allows facial expressions to be tuned intuitively. Morph target animation can then morph the model between different defined facial expressions or body poses without much need for human intervention.
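The morphing described above can be sketched as a weighted sum of per-vertex offsets from a neutral pose. The following is a minimal illustration, not any particular production system: the function name `blend_morph_targets`, the three-vertex mesh and the target names `smile` and `jaw_open` are all hypothetical.

```python
from typing import Dict, List, Tuple

Vec3 = Tuple[float, float, float]


def blend_morph_targets(
    neutral: List[Vec3],
    targets: Dict[str, List[Vec3]],
    weights: Dict[str, float],
) -> List[Vec3]:
    """Return neutral + sum_k weight_k * (target_k - neutral) for every vertex."""
    blended = []
    for i, base in enumerate(neutral):
        x, y, z = base
        for name, weight in weights.items():
            tx, ty, tz = targets[name][i]
            # Each target contributes its offset from the neutral pose.
            x += weight * (tx - base[0])
            y += weight * (ty - base[1])
            z += weight * (tz - base[2])
        blended.append((x, y, z))
    return blended


# Hypothetical three-vertex "mouth" mesh and two sculpted expressions.
NEUTRAL = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
TARGETS = {
    "smile": [(0.0, 1.0, 0.0), (1.0, 0.0, 0.0), (2.0, 1.0, 0.0)],
    "jaw_open": [(0.0, 0.0, 0.0), (1.0, -2.0, 0.0), (2.0, 0.0, 0.0)],
}

# A weight of 0.5 moves each vertex halfway towards the "smile" shape.
half_smile = blend_morph_targets(NEUTRAL, TARGETS, {"smile": 0.5})
```

Animating a weight from 0 to 1 over time morphs the mesh from the neutral pose to the target, and several weights can be combined, which is what makes expressions tunable with a handful of sliders.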
Displacement mapping plays an important part in getting a realistic result with fine skin detail, such as pores and wrinkles as small as 100 µm.
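As a rough sketch of the idea behind displacement mapping, each surface point is moved along its normal by an amount read from a height map. The example below assumes the heights have already been sampled per vertex (real renderers typically sample a texture on a finely tessellated mesh); the function name `displace`, the units (millimetres) and the sample values are illustrative assumptions.

```python
from typing import List, Tuple

Vec3 = Tuple[float, float, float]


def displace(
    vertices: List[Vec3],
    normals: List[Vec3],
    heights: List[float],
    scale: float,
) -> List[Vec3]:
    """Move each vertex along its (normalized) normal by scale * height."""
    displaced = []
    for (px, py, pz), (nx, ny, nz), height in zip(vertices, normals, heights):
        length = (nx * nx + ny * ny + nz * nz) ** 0.5
        if length == 0.0:
            raise ValueError("normal must be non-zero")
        # Dividing by the length makes the offset independent of how the
        # normal happens to be scaled.
        d = scale * height / length
        displaced.append((px + d * nx, py + d * ny, pz + d * nz))
    return displaced


# Assumed units: millimetres. A scale of 0.1 mm corresponds to 100 µm detail.
skin = displace(
    vertices=[(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    normals=[(0.0, 0.0, 1.0), (0.0, 0.0, 2.0)],
    heights=[1.0, -0.5],  # positive = raised ridge, negative = pore
    scale=0.1,
)
```

Unlike bump or normal mapping, which only change shading, this changes the actual geometry, so the detail also shows up in silhouettes and shadows.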
Machine learning approach
In the late 2010s, machine learning, and more precisely generative adversarial networks (GANs), were used by NVIDIA to produce random yet photorealistic human-like portraits. The system, named StyleGAN, was trained on a database of 70,000 images from the image hosting website Flickr. The source code was made public on GitHub in 2019. Outputs of the generator network from random input were made publicly available on a number of websites.
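To illustrate what "outputs of the generator network from random input" means, the sketch below samples a random latent vector and runs it through a forward pass of a tiny stand-in generator. This is not StyleGAN: the class `ToyGenerator`, its two-layer architecture, its untrained random weights and all of the sizes are assumptions made only to show the data flow from noise to image.

```python
import math
import random
from typing import List

# Hypothetical sizes, far smaller than any real generator.
LATENT_DIM = 8
HIDDEN_DIM = 16
IMAGE_SIDE = 4


def make_weights(rows: int, cols: int, rng: random.Random) -> List[List[float]]:
    """Random weight matrix; a trained GAN would load learned weights instead."""
    return [[rng.gauss(0.0, 1.0) / math.sqrt(cols) for _ in range(cols)]
            for _ in range(rows)]


def matvec(matrix: List[List[float]], vector: List[float]) -> List[float]:
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


class ToyGenerator:
    """Untrained two-layer stand-in for a GAN generator network."""

    def __init__(self, seed: int = 0) -> None:
        rng = random.Random(seed)
        self.w1 = make_weights(HIDDEN_DIM, LATENT_DIM, rng)
        self.w2 = make_weights(IMAGE_SIDE * IMAGE_SIDE, HIDDEN_DIM, rng)

    def __call__(self, z: List[float]) -> List[List[float]]:
        hidden = [max(0.0, h) for h in matvec(self.w1, z)]      # ReLU
        flat = [math.tanh(p) for p in matvec(self.w2, hidden)]  # pixels in [-1, 1]
        return [flat[r * IMAGE_SIDE:(r + 1) * IMAGE_SIDE]
                for r in range(IMAGE_SIDE)]


def sample_latent(rng: random.Random) -> List[float]:
    """Draw a latent vector from a standard normal distribution."""
    return [rng.gauss(0.0, 1.0) for _ in range(LATENT_DIM)]


generator = ToyGenerator(seed=0)
image = generator(sample_latent(random.Random(42)))  # 4x4 grid of pixel values
```

With trained weights, every new latent sample yields a different plausible portrait, and no further human input is needed at generation time.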
Similarly, since 2018, deepfake technology has allowed GANs to swap faces between actors; combined with the ability to fake voices, GANs can thus generate fake videos that seem convincing.
Applications
Main applications fall within the domains of stock photography, synthetic datasets, virtual cinematography, computer and video games, and covert disinformation attacks.
Furthermore, some research suggests that it can have therapeutic effects, as "psychologists and counselors have also begun using avatars to deliver therapy to clients who have phobias, a history of trauma, addictions, Asperger’s syndrome or social anxiety." The strong memory imprint and brain activation effects caused by watching a digital look-alike avatar of yourself are dubbed the Doppelgänger effect. The doppelgänger effect can heal when a covert disinformation attack is exposed as such to the targets of the attack.
Related issues
Speech synthesis has been verging on being completely indistinguishable from a recording of a real human's voice since the 2016 introduction of the voice editing and generation software Adobe Voco, a prototype slated to be a part of the Adobe Creative Suite, and DeepMind WaveNet, a prototype from Google.
The ability to steal and manipulate other people's voices raises obvious ethical concerns.
At the 2018 Conference on Neural Information Processing Systems (NeurIPS), researchers from Google presented the work 'Transfer Learning from Speaker Verification to Multispeaker Text-To-Speech Synthesis', which transfers learning from speaker verification to achieve text-to-speech synthesis that can be made to sound almost like anybody from a speech sample of only 5 seconds.
Sourcing images for AI training raises a question of privacy, as the people whose images are used for training did not consent.
Digital sound-alike technology has found its way into the hands of criminals: in 2019, Symantec researchers knew of 3 cases where the technology had been used for crime.
This, coupled with the fact that (as of 2016) techniques which allow near real-time counterfeiting of facial expressions in existing 2D video have been believably demonstrated, increases the stress on the disinformation situation.
See also
* Motion-capture acting
* Internet manipulation
* Media synthesis
* Propaganda techniques
* 3D data acquisition and object reconstruction
* 3D reconstruction from multiple images
* 3D pose estimation in general, and articulated body pose estimation in particular, as they relate to capturing human likeness.
* 4D reconstruction
* Finger tracking
* Gesture recognition
* StyleGAN