Human image synthesis is a technology that can be applied to make believable and even photorealistic renditions of human likenesses, moving or still. It has effectively existed since the early 2000s. Many films using computer-generated imagery have featured synthetic images of human-like characters digitally composited onto real or other simulated film material. Towards the end of the 2010s deep learning artificial intelligence was applied to synthesize images and video that look like humans without need for human assistance once the training phase has been completed, whereas the old-school 7D route required massive amounts of human work.
Timeline of human image synthesis
* In 1971 Henri Gouraud made the first CG geometry capture and representation of a human face. The model was his wife, Sylvie Gouraud. The 3D model was a simple wire-frame model, and he applied the Gouraud shader he is best known for to produce the first known representation of a human likeness on a computer.
* The 1972 short film ''A Computer Animated Hand'' by Edwin Catmull and Fred Parke was the first time that computer-generated imagery was used in film to simulate moving human appearance. The film featured a computer-simulated hand and face.
* The 1976 film ''Futureworld'' reused parts of ''A Computer Animated Hand'' on the big screen.
* The 1983 music video for the song Musique Non-Stop by the German band Kraftwerk aired in 1986. Created by the artist Rebecca Allen, it features non-realistic-looking but clearly recognizable computer simulations of the band members.
* The 1994 film ''The Crow'' was the first film production to make use of digital compositing of a computer-simulated representation of a face onto scenes filmed using a body double. Necessity was the muse, as Brandon Lee, the actor portraying the protagonist, was accidentally killed on set.
* In 1999 Paul Debevec et al. of USC captured the reflectance field of a human face with their first version of a light stage. They presented their method at SIGGRAPH 2000.
* In 2003 photorealistic human likenesses made their audience debut in the films ''The Matrix Reloaded'', in the burly brawl sequence where up to 100 Agent Smiths fight Neo, and ''The Matrix Revolutions'', where at the start of the final showdown Agent Smith's cheekbone gets punched in by Neo, leaving the digital look-alike unnaturally unhurt. The ''The Matrix Revolutions'' bonus DVD documents and depicts the process and the techniques used in some detail, including facial motion capture, limb motion capture, and projection onto models.
* In 2003 ''The Animatrix: Final Flight of the Osiris'', made by Square Pictures, featured state-of-the-art would-be human likenesses that did not quite fool the viewer.
* In 2003 a digital likeness of Tobey Maguire was made for the movies ''Spider-Man 2'' and ''Spider-Man 3'' by Sony Pictures Imageworks.
* In 2005 the Face of the Future project was established by the University of St Andrews and Perception Lab, funded by the EPSRC. The website contains a "Face Transformer", which enables users to transform their face into any ethnicity and age, as well as into a painting (in the style of either Sandro Botticelli or Amedeo Modigliani). This process is achieved by combining the user's photograph with an average face.
* In 2009 Debevec et al. presented new digital likenesses, made by Image Metrics, this time of actress Emily O'Brien, whose reflectance was captured with the USC light stage 5. In a TED talk video, at 00:04:59, you can see ''two clips, one with the real Emily shot with a real camera and one with a digital look-alike of Emily, shot with a simulation of a camera – Which is which is difficult to tell''. Bruce Lawmen was scanned using USC light stage 6 in a still position and also recorded running there on a treadmill. Many digital look-alikes of Bruce are seen running fluently and naturally in the ending sequence of the TED talk video. The motion looks fairly convincing compared with the clunky run in ''The Animatrix: Final Flight of the Osiris'', which was state-of-the-art in 2003 if photorealism was the intention of the animators.
* In 2009 a digital look-alike of a younger Arnold Schwarzenegger was made for the movie ''Terminator Salvation'', though the end result was critiqued as unconvincing. Facial geometry was acquired from a 1984 mold of Schwarzenegger.
* In 2010 Walt Disney Pictures released a sci-fi sequel entitled ''Tron: Legacy'' with a digitally rejuvenated look-alike of actor Jeff Bridges playing the antagonist CLU.
* At SIGGRAPH 2013 Activision and USC presented a real-time "Digital Ira", a digital face look-alike of Ari Shapiro, an ICT USC research scientist, utilizing the USC light stage X by Ghosh et al. for both reflectance field and motion capture. The end result, shown both precomputed and rendered in real time on the most modern game GPU of the time, looks fairly realistic.
* In 2014 The Presidential Portrait was made by USC ICT in conjunction with the Smithsonian Institution using the latest USC mobile light stage, with which President Barack Obama had his geometry, textures and reflectance captured.
* In 2014 Ian Goodfellow et al. presented the principles of a generative adversarial network. GANs made the headlines in early 2018 with the deepfakes controversies.
* For the 2015 film ''Furious 7'' a digital look-alike of actor Paul Walker, who died in an accident during the filming, was made by Weta Digital to enable the completion of the film.
* In 2016 techniques that allow near real-time counterfeiting of facial expressions in existing 2D video were believably demonstrated.
* In 2016 a digital look-alike of Peter Cushing was made for the film ''Rogue One'', in which it appears to be the same age as the actor was during the filming of the original 1977 ''Star Wars'' film.
* At SIGGRAPH 2017 an audio-driven digital look-alike of the upper torso of Barack Obama was presented by researchers from the University of Washington. It was driven only by a voice track as source data for the animation, once the training phase, which acquired lip sync and wider facial information from training material consisting of 2D videos with audio, had been completed.
* Late 2017 and early 2018 saw the surfacing of the deepfakes controversy, in which porn videos were doctored using deep machine learning so that the face of the actress was replaced by the software's opinion of what another person's face would look like in the same pose and lighting.
* At the 2018 GDC Epic Games and Tencent Games demonstrated "Siren", a digital look-alike of the actress Bingjie Jiang. It was made possible with the following technologies: CubicMotion's computer vision system, 3Lateral's facial rigging system and Vicon's motion capture system. The demonstration ran in near real time at 60 frames per second in the Unreal Engine 4.
* In 2018 at the World Internet Conference in Wuzhen the Xinhua News Agency presented two digital look-alikes made to resemble its real news anchors Qiu Hao (Chinese language) and Zhang Zhao (English language). The digital look-alikes were made in conjunction with Sogou. Neither the speech synthesis used nor the gesturing of the digital look-alike anchors was good enough to lead the viewer to mistake them for real humans imaged with a TV camera.
* In September 2018 Google added "involuntary synthetic pornographic imagery" to its ban list, allowing anyone to request the search engine block results that falsely depict them as "nude or in a sexually explicit situation."
* In February 2019 Nvidia open-sourced StyleGAN, a novel generative adversarial network. Right after this Phillip Wang made the website ThisPersonDoesNotExist.com with StyleGAN to demonstrate that unlimited numbers of often photorealistic-looking facial portraits of no one can be made automatically using a GAN. Nvidia's StyleGAN was presented in a not yet peer-reviewed paper in late 2018.
* At the June 2019 CVPR the MIT CSAIL presented a system titled ''"Speech2Face: Learning the Face Behind a Voice"'' that synthesizes likely faces based on just a recording of a voice. It was trained with massive amounts of video of people speaking.
* Since 1 July 2019 Virginia has criminalized the sale and dissemination of unauthorized synthetic pornography, but not the manufacture, as § 18.2–386.2, titled 'Unlawful dissemination or sale of images of another; penalty.', became part of the Code of Virginia. The law text states: "''Any person who, with the intent to coerce, harass, or intimidate, maliciously disseminates or sells any videographic or still image created by any means whatsoever that depicts another person who is totally nude, or in a state of undress so as to expose the genitals, pubic area, buttocks, or female breast, where such person knows or has reason to know that he is not licensed or authorized to disseminate or sell such videographic or still image is guilty of a Class 1 misdemeanor.''" The identical bills were House Bill 2678, presented by Delegate Marcus Simon to the Virginia House of Delegates on 14 January 2019, and Senate Bill 1736, introduced three days later to the Senate of Virginia by Senator Adam Ebbin.
* On 1 September 2019 the amendments to the election code made by Texas Senate bill SB 751 came into effect, giving candidates in elections a 30-day protection period before the election during which making and distributing digital look-alikes or synthetic fakes of the candidates is an offense. The law text defines the subject of the law as "''a video, created with the intent to deceive, that appears to depict a real person performing an action that did not occur in reality''".
* In September 2019 Yle, the Finnish public broadcasting company, aired a piece of experimental journalism in its main news broadcast: a deepfake of the sitting President, Sauli Niinistö, made to highlight advancing disinformation technology and the problems that arise from it.
* On 1 January 2020 the California state law AB-602 came into effect, banning the manufacturing and distribution of synthetic pornography without the consent of the people depicted. AB-602 provides victims of synthetic pornography with injunctive relief and poses legal threats of statutory and punitive damages on those making or distributing synthetic pornography without consent. The bill AB-602 was signed into law by California Governor Gavin Newsom on 3 October 2019 and was authored by California State Assembly member Marc Berman.
* On 1 January 2020, a Chinese law requiring that synthetically faked footage bear a clear notice about its fakeness came into effect. Failure to comply could be considered a crime, the Cyberspace Administration of China stated on its website. China announced this new law in November 2019. The Chinese government seems to be reserving the right to prosecute both users and online video platforms failing to abide by the rules.
Key breakthrough to photorealism: reflectance capture
In 1999 Paul Debevec et al. of USC did the first known reflectance capture over the human face with their extremely simple light stage. They presented their method and results in SIGGRAPH 2000.
The scientific breakthrough required finding the subsurface light component (the simulation models glow slightly from within), which can be found using the knowledge that light reflected from the oil-to-air layer retains its polarization while the subsurface light loses its polarization. So, equipped only with a movable light source, a movable video camera, 2 polarizers and a computer program doing extremely simple math, the last piece required to reach photorealism was acquired.
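The polarization argument can be sketched numerically. This is a minimal sketch, not the exact procedure of Debevec et al. It assumes two photographs of the same pose under the same polarized light, one taken through a polarizer aligned with the light's polarizer and one through a polarizer rotated 90 degrees. It further assumes ideal polarizers and fully depolarized subsurface light, so the crossed image holds half of the subsurface light and none of the surface reflection. The function name and pixel values are hypothetical.

```python
def separate_reflection(parallel, crossed):
    """Split per-pixel intensities into surface and subsurface parts.

    parallel: intensities seen through a polarizer aligned with the light's polarizer
    crossed:  intensities seen through a polarizer rotated 90 degrees
    Assumes ideal polarizers and fully depolarized subsurface light.
    """
    if len(parallel) != len(crossed):
        raise ValueError("images must have the same number of pixels")
    surface, subsurface = [], []
    for p, c in zip(parallel, crossed):
        # Surface light keeps its polarization, so it is blocked in the
        # crossed image and survives only in the parallel one.
        surface.append(max(p - c, 0.0))
        # Depolarized subsurface light is split evenly between both images.
        subsurface.append(2.0 * c)
    return surface, subsurface


# Hypothetical intensities for three pixels (arbitrary units).
parallel_img = [0.90, 0.40, 0.25]
crossed_img = [0.20, 0.15, 0.25]
surface, subsurface = separate_reflection(parallel_img, crossed_img)
```

The arithmetic is a subtraction and a doubling per pixel, which is consistent with the "extremely simple math" the text describes.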
For a believable result, both the light reflected from the skin (BRDF) and the light scattered within the skin (a special case of BTDF), which together make up the BSDF, must be captured and simulated.
Capture
* The 3D geometry and textures are captured onto a 3D model by a 3D reconstruction method, such as sampling the target by means of 3D scanning with an RGB XYZ scanner such as Arius3d or Cyberware (textures from photos, not pure RGB XYZ scanner), stereophotogrammetrically from synchronized photos, or even from enough repeated non-simultaneous photos. Digital sculpting can be used to make up models of the body parts for which data cannot be acquired, e.g. parts of the body covered by clothing.
* For believable results, the reflectance field must also be captured, or an approximation must be picked from the libraries, to form a 7D reflectance model of the target.
Synthesis
The whole process of making digital look-alikes, i.e. characters so lifelike and realistic that they can be passed off as pictures of humans, is a very complex task, as it requires photorealistically modeling, animating, cross-mapping, and rendering the soft body dynamics of the human appearance.
Synthesis with an actor and suitable algorithms is applied using powerful computers. The actor's part in the synthesis is to mimic human expressions in still picture synthesis and human movement in motion picture synthesis. Algorithms are needed to simulate the laws of physics and physiology and to map the models and their appearance, movements and interaction accordingly.
Often both physics/physiology-based modeling (e.g. skeletal animation) and image-based modeling and rendering are employed in the synthesis part. Hybrid models employing both approaches have shown the best results in realism and ease of use. Morph target animation reduces the workload by giving higher-level control: different facial expressions are defined as deformations of the model, which allows facial expressions to be tuned intuitively. Morph target animation can then morph the model between the defined facial expressions or body poses without much need for human intervention.
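Morphing between defined expressions can be illustrated with the common linear blend, in which each morph target contributes its weighted offset from the neutral model. This is a minimal sketch under that assumption. The function name, the two-vertex model and the expression names are hypothetical and are not taken from any particular animation package.

```python
def blend_morph_targets(base, targets, weights):
    """Return the base vertices deformed by weighted morph targets.

    base:    list of (x, y, z) vertex positions of the neutral model
    targets: dict name -> list of (x, y, z) positions, same vertex order as base
    weights: dict name -> blend weight, typically between 0.0 and 1.0
    """
    result = [list(v) for v in base]
    for name, weight in weights.items():
        target = targets[name]
        if len(target) != len(base):
            raise ValueError(f"target {name!r} has a different vertex count")
        for i, (b, t) in enumerate(zip(base, target)):
            for axis in range(3):
                # Add this target's weighted offset from the neutral pose.
                result[i][axis] += weight * (t[axis] - b[axis])
    return [tuple(v) for v in result]


# Hypothetical two-vertex model with two defined expressions.
neutral = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
expressions = {
    "smile": [(0.0, 1.0, 0.0), (1.0, 1.0, 0.0)],
    "jaw_open": [(0.0, -2.0, 0.0), (1.0, 0.0, 0.0)],
}
half_smile = blend_morph_targets(neutral, expressions, {"smile": 0.5})
mixed = blend_morph_targets(neutral, expressions, {"smile": 1.0, "jaw_open": 0.5})
```

An animator then only keys the weights over time instead of moving individual vertices, which is where the higher-level control comes from.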
Displacement mapping plays an important part in getting a realistic result with fine skin detail such as pores and wrinkles as small as 100 µm.
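The core of displacement mapping is moving each surface point along its normal by a height sampled from a map. The following is a minimal sketch of that step only. It assumes the mesh is already tessellated finely enough and that one height sample per vertex is available. The function name, the patch of vertices and the height values are hypothetical.

```python
import math


def displace_vertices(positions, normals, heights, scale):
    """Move each vertex along its normalized normal by heights[i] * scale."""
    displaced = []
    for p, n, h in zip(positions, normals, heights):
        length = math.sqrt(sum(c * c for c in n))
        if length == 0.0:
            raise ValueError("normal must be non-zero")
        unit = [c / length for c in n]
        # New position = old position + height * scale * unit normal.
        displaced.append(tuple(pc + h * scale * uc for pc, uc in zip(p, unit)))
    return displaced


# Hypothetical flat patch of skin, coordinates in millimetres.
positions = [(0.0, 0.0, 0.0), (0.1, 0.0, 0.0), (0.2, 0.0, 0.0)]
normals = [(0.0, 0.0, 1.0)] * 3
heights = [0.0, -1.0, 0.0]   # height-map samples: a pore at the middle vertex
pore_depth_mm = 0.1          # 100 µm expressed in millimetres
skin = displace_vertices(positions, normals, heights, pore_depth_mm)
```

Unlike bump or normal mapping, the geometry itself changes here, so the pore also shows up in silhouettes and self-shadowing.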
Machine learning approach
In the late 2010s, machine learning, and more precisely generative adversarial networks (GANs), were used by NVIDIA to produce random yet photorealistic human-like portraits. The system, named StyleGAN, was trained on a database of 70,000 images from the image hosting website Flickr. The source code was made public on GitHub in 2019. Outputs of the generator network from random input were made publicly available on a number of websites.
Similarly, since 2018, deepfake technology has allowed GANs to swap faces between actors; combined with the ability to fake voices, GANs can thus generate fake videos that seem convincing.
Applications
Main applications fall within the domains of stock photography, synthetic datasets, virtual cinematography, computer and video games and covert disinformation attacks.
Furthermore, some research suggests that it can have therapeutic effects as "psychologists and counselors have also begun using avatars to deliver therapy to clients who have phobias, a history of trauma, addictions, Asperger’s syndrome or social anxiety." The strong memory imprint and brain activation effects caused by watching a digital look-alike avatar of yourself are dubbed the Doppelgänger effect. The doppelgänger effect can heal when a covert disinformation attack is exposed as such to the targets of the attack.
Related issues
Speech synthesis has been verging on being completely indistinguishable from a recording of a real human's voice since the 2016 introduction of the voice editing and generation software Adobe Voco, a prototype slated to be a part of the Adobe Creative Suite, and DeepMind WaveNet, a prototype from Google.
The ability to steal and manipulate other people's voices raises obvious ethical concerns.
At the 2018 Conference on Neural Information Processing Systems (NeurIPS), researchers from Google presented the work 'Transfer Learning from Speaker Verification to Multispeaker Text-To-Speech Synthesis', which transfers learning from speaker verification to achieve text-to-speech synthesis that can be made to sound almost like anybody from a speech sample of only 5 seconds.
Sourcing images for AI training raises a question of privacy, as the people whose images are used for training did not consent.
Digital sound-alike technology has found its way into the hands of criminals: in 2019 Symantec researchers knew of 3 cases where the technology had been used for crime.
This, coupled with the fact that (as of 2016) techniques which allow near real-time counterfeiting of facial expressions in existing 2D video have been believably demonstrated, increases the stress on the disinformation situation.
See also
* Motion-capture acting
Motion-capture acting, also called performance-capture acting and often abbreviated as mo-cap or P-cap, is a type of acting in which an actor wears markers or sensors on a skintight bodysuit or directly on the skin. Hugh Hart, January 24, 2012, W ...
* Internet manipulation
Internet manipulation refers to the co-optation of digital technology, such as social media algorithms and automated scripts, for commercial, social or political purposes. Such tactics may be employed with the explicit intent to manipulate public ...
* Media synthesis
* Propaganda techniques
A number of propaganda techniques based on social psychological research are used to generate propaganda. Many of these same techniques can be classified as logical fallacies, since propagandists use arguments that, while sometimes convincing, are ...
* 3D data acquisition and object reconstruction
* 3D reconstruction from multiple images
* 3D pose estimation in general and articulated body pose estimation, especially to do with capturing human likeness.
* 4D reconstruction
* Finger tracking
* Gesture recognition
* StyleGAN
References
Simulation
Computer graphics
Pornography
Forgery controversies
Propaganda techniques
Special effects
Applications of computer vision