Pony Preservation Project

15.ai is a non-commercial freeware artificial intelligence web application that generates natural, emotive, high-fidelity text-to-speech voices for an assortment of fictional characters from a variety of media sources. Developed by an anonymous MIT researcher under the eponymous pseudonym 15, the project uses a combination of audio synthesis algorithms, speech synthesis deep neural networks, and sentiment analysis models to generate and serve emotive character voices faster than real time, even for characters with very little training data.

Launched in early 2020, 15.ai began as a proof of concept of the democratization of voice acting and dubbing using technology. Its gratis and non-commercial nature (with the only stipulation being that the project be properly credited when used), ease of use, and substantial improvements over existing text-to-speech implementations have been lauded by users; however, some critics and voice actors have questioned the legality and ethicality of leaving such technology publicly available and readily accessible.

Credited as the impetus behind the popularization of AI vocal reconstruction technology in content creation, 15.ai has had a significant impact on multiple Internet fandoms, most notably the ''My Little Pony: Friendship Is Magic'', ''Team Fortress 2'', and ''SpongeBob SquarePants'' fandoms. Several commercial alternatives have emerged alongside the rising popularity of 15.ai, leading to cases of misattribution and theft. In January 2022, it was discovered that Voiceverse NFT, a company with which voice actor Troy Baker had announced a partnership, had plagiarized 15.ai's work as part of its platform.


Features

Available characters include GLaDOS and Wheatley from ''Portal'', characters from ''Team Fortress 2'', Twilight Sparkle and a number of main, secondary, and supporting characters from ''My Little Pony: Friendship Is Magic'', SpongeBob from ''SpongeBob SquarePants'', Daria Morgendorffer and Jane Lane from ''Daria'', the Tenth Doctor from ''Doctor Who'', HAL 9000 from ''2001: A Space Odyssey'', the Narrator from ''The Stanley Parable'', the Wii U/3DS/Switch ''Super Smash Bros.'' Announcer (formerly), Carl Brutananadilewski from ''Aqua Teen Hunger Force'', Steven Universe and the Crystal Gems from ''Steven Universe'', Dan from ''Dan Vs.'', Sans from ''Undertale'', the Griffin family from ''Family Guy'', and characters from ''Rick and Morty'' and ''DC Super Hero Girls'' (2019).

The deep learning model used by the application is nondeterministic: each time that speech is generated from the same string of text, the intonation of the speech will be slightly different. The application also supports manually altering the emotion of a generated line using ''emotional contextualizers'' (a term coined by this project): a sentence or phrase that conveys the emotion of the take and serves as a guide for the model during inference. Emotional contextualizers are representations of the emotional content of a sentence, deduced via transfer-learned emoji embeddings using DeepMoji, a deep neural network sentiment analysis algorithm developed by the MIT Media Lab in 2017. DeepMoji was trained on 1.2 billion emoji occurrences in Twitter data from 2013 to 2017, and has been found to outperform human subjects in correctly identifying sarcasm in Tweets and other online modes of communication.

15.ai uses a ''multi-speaker model'': hundreds of voices are trained concurrently rather than sequentially, decreasing the required training time and enabling the model to learn and generalize shared emotional context, even for voices with no exposure to such emotional context. Consequently, the entire lineup of characters in the application is powered by a single trained model, as opposed to multiple single-speaker models trained on different datasets.

The lexicon used by 15.ai has been scraped from a variety of Internet sources, including Oxford Dictionaries, Wiktionary, the CMU Pronouncing Dictionary, 4chan, Reddit, and Twitter. Pronunciations of unfamiliar words are automatically deduced using phonological rules learned by the deep learning model.

The application supports a simplified version of a set of English phonetic transcriptions known as ARPABET to correct mispronunciations or to account for heteronyms, words that are spelled the same but are pronounced differently (such as the word ''read'', which can be pronounced as either /ˈrɛd/ or /ˈriːd/ depending on its tense). While the original ARPABET codes developed in the 1970s by the Advanced Research Projects Agency support 50 unique symbols to designate and differentiate between English phonemes, the CMU Pronouncing Dictionary's ARPABET convention (the set of transcription codes followed by 15.ai) reduces the symbol set to 39 phonemes by combining allophonic phonetic realizations into a single standard (e.g. AXR/ER; UX/UW) and using multiple common symbols together to replace syllabic consonants (e.g. EN/AH0 N). ARPABET strings can be invoked in the application by wrapping the string of phonemes in curly braces within the input box (e.g. {AA1 R P AH0 B EH2 T} to denote /ˈɑːrpəˌbɛt/, the pronunciation of the word ''ARPABET''). The following is a table of phonemes used by 15.ai and the CMU Pronouncing Dictionary:
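The curly-brace convention and the symbol reduction described above can be illustrated with a minimal Python sketch. It is not 15.ai's actual parser, which is not public; the function names `split_input` and `to_cmu` are hypothetical, and the substitution table contains only the three examples named in the text (AXR/ER, UX/UW, EN/AH0 N). Stress digits are passed through unchanged and never added.

```python
import re

# Substitutions named in the text: the CMU convention folds these original
# ARPABET symbols into its smaller phoneme set. This table is illustrative
# and deliberately incomplete.
CMU_EQUIVALENTS = {
    "AXR": ["ER"],       # allophone merged into ER
    "UX": ["UW"],        # allophone merged into UW
    "EN": ["AH0", "N"],  # syllabic consonant written as two symbols
}

# A brace-delimited run of phoneme symbols, e.g. {R EH1 D}
BRACED = re.compile(r"\{([^{}]*)\}")


def to_cmu(symbols):
    """Rewrite original-ARPABET symbols into CMU-style equivalents."""
    out = []
    for symbol in symbols:
        key = symbol.upper()
        out.extend(CMU_EQUIVALENTS.get(key, [key]))
    return out


def split_input(text):
    """Split input into ("text", str) and ("phonemes", list) segments."""
    segments = []
    pos = 0
    for match in BRACED.finditer(text):
        plain = text[pos:match.start()].strip()
        if plain:
            segments.append(("text", plain))
        segments.append(("phonemes", to_cmu(match.group(1).split())))
        pos = match.end()
    tail = text[pos:].strip()
    if tail:
        segments.append(("text", tail))
    return segments
```

For example, `split_input("I have {R EH1 D} that book")` yields a plain-text segment, a phoneme segment `["R", "EH1", "D"]` that forces the past-tense pronunciation of ''read'', and a trailing plain-text segment. Keeping the two segment kinds separate lets a later stage look up ordinary words in a lexicon while taking braced phonemes verbatim.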

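The multi-speaker design described earlier in this section, in which one trained model serves every character, is commonly realized by conditioning shared weights on a small per-speaker embedding. The toy sketch below illustrates that general idea only. 15.ai's architecture is not documented in this article, so the class name `MultiSpeakerModel`, the embedding size, the random (untrained) weights, and the stand-in text encoder are all assumptions made for illustration.

```python
import random

EMBED_DIM = 4  # assumed toy size; real models use far larger vectors


class MultiSpeakerModel:
    """Toy model: one shared weight matrix, one embedding per speaker."""

    def __init__(self, speakers, seed=0):
        rng = random.Random(seed)
        # Shared parameters, reused for every speaker.
        self.shared = [
            [rng.uniform(-1, 1) for _ in range(EMBED_DIM)]
            for _ in range(EMBED_DIM)
        ]
        # Per-speaker parameters: a single small vector each.
        self.speaker_table = {
            name: [rng.uniform(-1, 1) for _ in range(EMBED_DIM)]
            for name in speakers
        }

    def encode_text(self, text):
        """Hypothetical stand-in for a text encoder (character statistics)."""
        vec = [0.0] * EMBED_DIM
        for i, ch in enumerate(text):
            vec[i % EMBED_DIM] += ord(ch) / 1000.0
        return vec

    def forward(self, text, speaker):
        """Condition the shared weights on the chosen speaker's embedding."""
        text_vec = self.encode_text(text)
        spk_vec = self.speaker_table[speaker]
        conditioned = [t + s for t, s in zip(text_vec, spk_vec)]
        return [
            sum(w * x for w, x in zip(row, conditioned))
            for row in self.shared
        ]
```

Calling `forward` with the same text and two different speaker names gives different outputs from the same `shared` matrix, which is the property that lets a single model cover a whole character lineup. In a trained system every speaker's data would update the shared weights, which is one way a voice with little data of its own could still benefit from what the others have taught the model.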

Background


Speech synthesis

In 2016, with the proposal of DeepMind's WaveNet, deep-learning-based models for speech synthesis began to gain popularity as a method of modeling waveforms and generating human-like speech. Tacotron 2, a neural network architecture for speech synthesis developed by Google AI, was published in 2018 and required tens of hours of audio data to produce intelligible speech; when trained on 2 hours of speech, the model was able to produce intelligible speech with mediocre quality, and when trained on 36 minutes of speech, the model was unable to produce intelligible speech.

For years, reducing the amount of data required to train a realistic high-quality text-to-speech model has been a primary goal of researchers in the field of deep learning speech synthesis. The developer of 15.ai claims that as little as 15 seconds of data is sufficient to clone a voice up to human standards, a significant reduction in the amount of data required.


Copyrighted material in deep learning

A landmark case between Google and the Authors Guild in 2013 ruled that Google Books (a service that searches the full text of printed copyrighted books) was transformative, thus meeting all requirements for fair use. This case set an important legal precedent for the field of deep learning and artificial intelligence: using copyrighted material to train a discriminative model or a ''non-commercial'' generative model was deemed legal. The legality of ''commercial'' generative models trained using copyrighted material is still under debate; due to the black-box nature of machine learning models, any allegations of copyright infringement via direct competition would be difficult to prove.


Development

15.ai was designed and created by an anonymous research scientist affiliated with the Massachusetts Institute of Technology known by the alias ''15''. The project began development while the developer was an undergraduate. The developer has stated that they are capable of paying the high cost of running the site out of pocket. According to posts made by the developer on Hacker News, 15.ai costs several thousand dollars per month to operate; they are able to support the project due to a successful startup exit. The developer has stated that during their undergraduate years at MIT, they were paid the minimum hourly rate (approximately $14 an hour in Massachusetts) to work on a related project that eventually evolved into 15.ai. They also stated that the democratization of voice cloning technology is not the only function of the website; in response to a user asking whether the research could be conducted without a public website, the developer wrote:

The algorithm used by the project to facilitate the cloning of voices with minimal viable data has been dubbed DeepThroat (a double entendre in reference to speech synthesis using deep neural networks and the sexual act of deep-throating). The project and algorithm, initially conceived as part of MIT's Undergraduate Research Opportunities Program, had been in development for years before the first release of the application.

The developer has also worked closely with the Pony Preservation Project from /mlp/, the ''My Little Pony'' board of 4chan. The Pony Preservation Project, which began in 2019, is a "collaborative effort by /mlp/ to build and curate pony datasets" with the aim of creating applications in artificial intelligence. The ''Friendship Is Magic'' voices on 15.ai were trained on a large dataset crowdsourced by the Pony Preservation Project: audio and dialogue from the show and related media (including all nine seasons of ''Friendship Is Magic'', the 2017 movie, spinoffs, leaks, and various other content voiced by the same voice actors) were parsed, hand-transcribed, and processed to remove background noise. According to the developer, the collective efforts and constructive criticism from the Pony Preservation Project have been integral to the development of 15.ai. In addition, the developer has stated that the logo of 15.ai, which features a robotic Twilight Sparkle, is an homage to the fact that her voice (as originally portrayed by Tara Strong) was indispensable to the implementation of emotional contextualizers.


Reception

15.ai has been met with largely positive reviews. Liana Ruppert of ''Game Informer'' described 15.ai as "simplistically brilliant." Lauren Morton of ''Rock, Paper, Shotgun'' and Natalie Clayton of ''PC Gamer'' called it "fascinating," and José Villalobos of ''LaPS4'' wrote that it "works as easy as it looks." Users praised the ability to easily create audio of popular characters that sounds believable to those unaware that the voices had been synthesized by artificial intelligence: Zack Zwiezen of ''Kotaku'' reported that "[his] girlfriend was convinced it was a new voice line from GLaDOS' voice actor, Ellen McLain," while Rionaldi Chandraseta of ''Towards Data Science'' wrote that, upon watching a YouTube video featuring popular character voices generated by 15.ai, "[his] first thought was the video creator used cameo.com to pay for new dialogues from the original voice actors" and stated that "the quality of voices done by 15.ai is miles ahead of [its] competitors."

Computer scientist and technology entrepreneur Andrew Ng commented in his newsletter ''The Batch'' that the technology behind 15.ai could be "enormously productive" and could "revolutionize the use of virtual actors"; however, he also noted that "synthesizing a human actor's voice without consent is arguably unethical and possibly illegal" and could potentially enable impersonation and fraud. In his blog ''Marginal Revolution'', economist Tyler Cowen deemed 15 one of the "most underrated talents in AI and machine learning."


Impact


Fandom content creation

15.ai has been frequently used for content creation in various fandoms, including the ''My Little Pony: Friendship Is Magic'' fandom, the ''Team Fortress 2'' fandom, the ''Portal'' fandom, and the ''SpongeBob SquarePants'' fandom. Numerous videos and projects containing speech from 15.ai have gone viral. However, some videos and projects that contain speech not generated by 15.ai have also gone viral, many of which do not properly credit the source(s) of the synthetic speech featured in them. As a consequence, many videos and projects made with other speech synthesis software have been mistaken for 15.ai creations, and vice versa. Due to this misattribution and absence of proper credit, 15.ai's terms of service include a rule that forbids combining speech generated by 15.ai with speech generated by other tools in the same video or project.

The ''My Little Pony: Friendship Is Magic'' fandom has seen a resurgence in video and musical content creation as a direct result, inspiring a new genre of fan-created content assisted by artificial intelligence. Some fan fiction has been adapted into fully voiced "episodes": ''The Tax Breaks'' is a 17-minute animated video rendition of a fan-written story published in 2014 that uses voices generated from 15.ai with sound effects and audio editing, emulating the episodic style of the early seasons of ''Friendship Is Magic''.

Viral videos from the ''Team Fortress 2'' fandom that feature voices from 15.ai include ''Spy is a Furry'' (which has gained over 3 million views on YouTube in total across multiple videos) and ''The RED Bread Bank'', both of which have inspired Source Filmmaker animated video renditions. Other fandoms have used voices from 15.ai to produce viral videos. The viral video ''Among Us Struggles'' (which uses voices from ''Friendship Is Magic'') has over 5.5 million views on YouTube; YouTubers, TikTokers, and Twitch streamers have also used 15.ai for their videos, such as FitMC's video on the history of 2b2t, one of the oldest running ''Minecraft'' servers, and datpon3's TikTok video featuring the main characters of ''Friendship Is Magic'', which have 1.4 million and 510 thousand views, respectively.

Some users have created AI virtual assistants using 15.ai and external voice control software. One user on Twitter created their own personal GLaDOS desktop assistant using the voice control system VoiceAttack; the assistant is able to boot up applications, utter corresponding random dialogues, and thank the user in response to actions.


Troy Baker / Voiceverse NFT plagiarism scandal

In December 2021, the developer of 15.ai posted on Twitter that they had no interest in incorporating non-fungible tokens (NFTs) into their work. On January 14, 2022, it was discovered that Voiceverse NFT, a company with which video game and anime dub voice actor Troy Baker had announced a partnership, had plagiarized voice lines generated from 15.ai as part of its marketing campaign. Log files showed that Voiceverse had generated audio of Twilight Sparkle and Rainbow Dash from the show ''My Little Pony: Friendship Is Magic'' using 15.ai, pitched the clips up to make them unrecognizable as the original voices, and used them without proper credit to falsely market its own platform, a violation of 15.ai's terms of service.

A week prior to the announcement of the partnership with Baker, Voiceverse made a (now-deleted) Twitter post directly responding to a (now-deleted) video posted by Chubbiverse, an NFT platform with which Voiceverse had partnered. The video showcased an AI-generated voice, and Voiceverse claimed that it had been generated using Voiceverse's platform, remarking ''"I wonder who created the voice for this? ;)"'' A few hours after news of the partnership broke, the developer of 15.ai (having been alerted by another Twitter user who asked for the developer's opinion on the partnership, which the developer speculated "sounds like a scam") posted screenshots of log files proving that a user of the website (with their IP address redacted) had submitted inputs of the exact words spoken by the AI voice in the Chubbiverse video. The developer then responded to Voiceverse's claim directly, tweeting "Certainly not you :)". Following the tweet, Voiceverse admitted to passing off voices from 15.ai as the output of its own platform, claiming that its marketing team had used the project without giving proper credit and that the "Chubbiverse team [h]ad no knowledge of this." In response to the admission, 15 tweeted "Go fuck yourself." The final tweet went viral, accruing over 75,000 total likes and 13,000 total retweets across multiple reposts.

The initial partnership between Baker and Voiceverse was met with severe backlash and universally negative reception. Critics highlighted the environmental impact of NFT sales and the potential for exit scams associated with them. Commentators also pointed out the irony in Baker's initial tweet announcing the partnership, which ended with "You can hate. Or you can create. What'll it be?", hours before the public revelation that the company in question had resorted to theft instead of creating its own product. Baker responded that he appreciated people sharing their thoughts and that their responses were "giving [h]im a lot to think about." He also acknowledged that the "hate/create" part of his initial tweet might have been "a bit antagonistic," and asked fans on social media to forgive him. Two weeks later, on January 31, Baker announced that he would discontinue his partnership with Voiceverse.


Resistance from voice actors

Some voice actors have publicly decried the use of voice cloning technology. Cited reasons include concerns about impersonation and fraud, unauthorized use of an actor's voice in pornography, and the potential for AI to make voice actors obsolete.


List of voices

All characters available on 15.ai (both currently and formerly) are listed in the table below.


See also

* Audio deepfake
* Character.ai
* ChatGPT
* DALL-E
* Deepfakes
* Midjourney
* Stable Diffusion
* Synthetic media
* WaveNet


External links

* ''The Tax Breaks (Twilight) (15.ai)''