Internet linguistics

Internet linguistics is a domain of linguistics advocated by the British linguist David Crystal. It studies new language styles and forms that have arisen under the influence of the Internet and of other new media, such as Short Message Service (SMS) text messaging. Since the beginning of human–computer interaction (HCI) leading to computer-mediated communication (CMC) and Internet-mediated communication (IMC), experts such as Gretchen McCulloch have acknowledged that linguistics has a contributing role in it, in terms of web interface and usability. Studying the emerging language on the Internet can help improve conceptual organization, translation and web usability. Such study aims to benefit both linguists and web users. The study of Internet linguistics can take place through four main perspectives: sociolinguistics, education, stylistics and applied linguistics. Further dimensions have developed as a result of further technological advances, which include the development of the Web as corpus and the spread and influence of the stylistic variations brought forth by the spread of the Internet, through the mass media and through literary works. In view of the increasing number of users connected to the Internet, the linguistic future of the Internet remains to be determined, as new computer-mediated technologies continue to emerge and people adapt their languages to suit these new media. The Internet continues to play a significant role both in encouraging people and in diverting attention away from the usage of languages.


Main perspectives

David Crystal has identified four main perspectives for further investigation: the sociolinguistic perspective, the educational perspective, the stylistic perspective and the applied perspective. The four perspectives are effectively interlinked and affect one another.


Sociolinguistic perspective

This perspective deals with how society views the impact of Internet development on languages. The advent of the Internet has revolutionized communication in many ways; it changed the way people communicate and created new platforms with far-reaching social impact. Significant avenues include but are not limited to SMS text messaging, e-mails, chatgroups, virtual worlds and the Web. The evolution of these new media of communication has raised much concern with regard to the way language is being used. According to Crystal (2005), these concerns are neither without grounds nor unseen in history; they surface almost always when a new technological breakthrough influences languages, as seen in the 15th century when printing was introduced, the 19th century when the telephone was invented and the 20th century when broadcasting began to penetrate society.

At a personal level, CMC such as SMS text messaging and mobile e-mailing (push mail) has greatly enhanced instantaneous communication. Some examples include the iPhone and the BlackBerry. In schools, it is not uncommon for educators and students to be given personalized school e-mail accounts for communication and interaction purposes. Classroom discussions are increasingly being brought onto the Internet in the form of discussion forums. For instance, at Nanyang Technological University, students engage in collaborative learning at the university's portal edveNTUre, where they participate in forum discussions, take online quizzes and view streaming podcasts prepared by their course instructors, among other activities. In 2008, iTunes U began to collaborate with universities as the Apple music service was converted into a store that makes academic lectures and scholastic materials available for free; it has partnered with more than 600 institutions in 18 countries, including Oxford, Cambridge and Yale Universities. These forms of academic social networking and media are slated to rise as educators from all over the world continue to seek new ways to better engage students. It is commonplace for students at New York University to interact with “guest speakers weighing in via Skype, library staffs providing support via instant messaging, and students accessing library resources from off campus”. This will affect the way language is used as students and teachers begin to use more of these CMC platforms.

At a professional level, it is common for companies to have their computers and laptops connected to the Internet (via wired and wireless Internet connection), and for employees to have individual e-mail accounts. This greatly facilitates internal (among staff of the company) and external (with parties outside one's organization) communication. Mobile communication devices such as smart phones are increasingly making their way into the corporate world. For instance, in 2008, Apple announced its intention to step up its efforts to help companies incorporate the iPhone into their enterprise environment, facilitated by technological developments in streamlining integrated features (push e-mail, calendar and contact management) using ActiveSync.

In general, these new CMCs made possible by the Internet have altered the way people use language: there is heightened informality and consequently a growing fear of its deterioration. However, as David Crystal puts it, this should be seen positively, as it reflects the power of the creativity of a language.


Themes

The sociolinguistics of the Internet may also be examined through five interconnected themes.

1. Multilingualism – It looks at the prevalence and status of various languages on the Internet.
2. Language change – From a sociolinguistic perspective, language change is influenced by the physical constraints of technology (e.g. typed text) and by shifting socio-economic priorities such as globalization. It explores the linguistic changes over time, with emphasis on Internet lingo.
3. Conversation discourse – It explores the changes in patterns of social interaction and communicative practice on the Internet.
4. Stylistic diffusion – It involves the study of the spread of Internet jargon and related linguistic forms into common usage. As language changes, conversation discourse and stylistic diffusion overlap with the aspect of language stylistics. See below: Stylistic perspective.
5. Metalanguage and folk linguistics – It involves looking at the way these linguistic forms and changes on the Internet are labelled and discussed (e.g. the claim that Internet lingo has resulted in the "death" of the apostrophe and the loss of capitalization).


Educational perspective

The educational perspective of Internet linguistics examines the Internet's impact on formal language use, specifically on Standard English, which in turn affects language education. The rise and rapid spread of Internet use has brought about new linguistic features specific to the Internet platform. These include, but are not limited to, an increase in the use of informal written language, inconsistency in written styles and stylistics, and the use of new abbreviations in Internet chats and SMS text messaging, where technological constraints on message length contributed to the rise of new abbreviations. Such acronyms exist primarily for practical reasons: apart from working around technological limitations, they reduce the time and effort required to communicate through these media. Examples of common acronyms include "lol" (for "laughing out loud"; a general expression of laughter), "omg" ("oh my god") and "gtg" ("got to go").

The educational perspective has been considerably established in the research on the Internet's impact on language education. It is a crucial aspect, as it involves educating current and future generations of students in the appropriate and timely use of the informal language that arises from Internet usage. There are concerns about the growing infiltration of informal language and incorrect word use into academic or formal situations, such as the use of casual words like "guy" or the choice of the word "preclude" in place of "precede" in academic papers by students. Educators have also noted spelling and grammar problems occurring at a higher frequency in students' academic work, with the use of abbreviations such as "u" for "you" and "2" for "to" being the most common.

Linguists and professors like Eleanor Johnson suspect that widespread mistakes in writing are strongly connected to Internet usage, and educators have similarly reported new kinds of spelling and grammar mistakes in student work. There is, however, no scientific evidence to confirm the proposed connection. Naomi S. Baron argues in Always On that student writing suffers little impact from the use of Internet-mediated communication (IMC) such as Internet chat, SMS text messaging and e-mail. A 2009 study published in the British Journal of Developmental Psychology found that students who regularly texted (sent messages via SMS using a mobile phone) displayed a wider range of vocabulary, and that this may have a positive impact on their reading development.

Though the use of the Internet has resulted in stylistics that are not deemed appropriate in academic and formal language use, Internet use may not hinder language education but instead aid it. The Internet has shown in different ways that it can provide potential benefits in enhancing language learning, especially in second or foreign-language learning. Language education through the Internet in relation to Internet linguistics is, most significantly, applied through the communication aspect (use of e-mails, discussion forums, chat messengers, blogs, etc.). IMC allows greater interaction between language learners and native speakers of the language, providing for more error correction and better opportunities to learn the standard language, and in the process allowing learners to pick up specific skills such as negotiation and persuasion.
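
The abbreviations cited in this section can be illustrated with a minimal Python sketch that expands them back into their full forms. The function name `expand_abbreviations` and the lookup table are hypothetical and used only for illustration; the table holds just the five abbreviations mentioned above. The matching is deliberately naive: it would also rewrite a genuine numeral "2" as "to".

```python
import re

# Abbreviations and expansions taken from the examples in the text above.
# The table and function name are illustrative assumptions, not an established tool.
ABBREVIATIONS = {
    "lol": "laughing out loud",
    "omg": "oh my god",
    "gtg": "got to go",
    "u": "you",
    "2": "to",
}

def expand_abbreviations(text):
    """Replace each whole word found in ABBREVIATIONS with its expansion."""
    def replace(match):
        word = match.group(0)
        # Look the word up case-insensitively; leave unknown words unchanged.
        return ABBREVIATIONS.get(word.lower(), word)
    return re.sub(r"\b\w+\b", replace, text)

print(expand_abbreviations("omg u have 2 see this, gtg"))
# prints: oh my god you have to see this, got to go
```

A real tool would need context to tell the abbreviation "2" from the number 2, which is one reason such expansion is harder than a lookup table suggests.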


Stylistic perspective

This perspective examines how the Internet and its related technologies have encouraged new and different forms of creativity in language, especially in literature. It looks at the Internet as a medium through which new language phenomena have arisen. This new mode of language is interesting to study because it is an amalgam of both spoken and written languages. For example, traditional writing is static compared to the dynamic nature of the new language on the Internet, where words can appear in different colors and font sizes on the computer screen. Yet, this new mode of language also contains other elements not found in natural languages. One example is the concept of framing found in e-mails and discussion forums. In replying to e-mails, people generally use the sender's e-mail message as a frame to write their own messages. They can choose to respond to certain parts of an e-mail message while leaving other bits out. In discussion forums, one can start a new thread, and anyone regardless of their physical location can respond to the idea or thought that was set down through the Internet. This is something that is usually not found in written language. Future research also includes new varieties of expressions that the Internet and its various technologies are constantly producing and their effects not only on written languages but also their spoken forms. The communicative style of Internet language is best observed in the CMC channels below, as there are often attempts to overcome technological restraints such as transmission time lags and to re-establish social cues that are often vague in written text.


Mobile phones

Mobile phones (also called cell phones) have an expressive potential beyond their basic communicative functions. This can be seen in text-messaging poetry competitions such as the one held by The Guardian. The 160-character limit imposed by the cell phone has motivated users to exercise their linguistic creativity to overcome it. A similar example of new technology with character constraints is Twitter, which has a 280-character limit. There have been debates as to whether the new abbreviated forms introduced in users’ Tweets are "lazy" or whether they are creative fragments of communication. Despite the ongoing debate, there is no doubt that Twitter has contributed new lingo to the linguistic landscape and brought about a new dimension of communication.

The cell phone has also created a new literary genre: cell phone novels. A typical cell phone novel consists of several chapters, which readers download in short installments. These novels are in their "raw" form, as they do not go through editing processes like traditional novels. They are written in short sentences, similar to text messaging. Authors of such novels are also able to receive feedback and new ideas from their readers through e-mails or online feedback channels. Unlike in traditional novel writing, readers’ ideas sometimes get incorporated into the storyline, and authors may also decide to change their story's plot according to the demand for and popularity of their novel (typically gauged by the number of download hits). Despite their popularity, there has also been criticism regarding the novels’ "lack of diverse vocabulary" and poor grammar.


Blogs

Blogging has brought about new ways of writing diaries and, from a linguistic perspective, the language used in blogs is "in its most 'naked' form", published for the world to see without undergoing the formal editing process. This is what makes blogs stand out, because almost all other forms of printed language have gone through some form of editing and standardization. David Crystal stated that blogs were "the beginning of a new stage in the evolution of the written language". Blogs have become so popular that they have expanded beyond written blogs, with the emergence of photoblogs, videoblogs, audioblogs and moblogs. These developments in interactive blogging have created new linguistic conventions and styles, with more expected to arise in the future.


Virtual worlds

Virtual worlds provide insights into how users are adapting the usage of natural language for communication within these new media. The Internet language that has arisen through user interactions in text-based chatrooms and computer-simulated worlds has led to the development of slang within digital communities. Examples of these include "pwn" and "noob". Emoticons are further examples of how users have adapted different expressions to suit the limitations of cyberspace communication, one of which is the "loss of emotivity". Communication in niches such as role-playing games (RPG) of multi-user domains (MUDs) and virtual worlds is highly interactive, with emphasis on speed, brevity and spontaneity. As a result, CMC is generally more vibrant, volatile, unstructured and open. There is often complex organization of sequences and exchange structures, evident in the connection of conversational strands and short turns. Some of the CMC strategies used include capitalization for words such as "EMPHASIS", usage of symbols such as the asterisk to enclose words as seen in "*stress*" and the creative use of punctuation like "???!?!?!?". Symbols are also used for discourse functions, such as the asterisk as a conversational repair marker and arrows and carets as deixis and referent markers.

Besides contributing to these new forms in language, virtual worlds are also being used to teach languages. Virtual world language learning provides students with simulations of real-life environments, allowing them to find creative ways to improve their language skills. Virtual worlds are good tools for language learning among younger learners because they already see such places as a "natural place to learn and play".

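The three CMC strategies named above (capitalized words, asterisk-enclosed words and runs of repeated punctuation) can be picked out of a message with simple pattern matching. The following Python sketch is only an illustration: the function name `find_cmc_markers` and the category labels are hypothetical, and the patterns are rough assumptions. In particular, any fully capitalized word of two or more letters, including an acronym, is counted as emphasis.

```python
import re

# Hypothetical category labels; each pattern is a rough approximation of
# one of the strategies described in the text above.
PATTERNS = {
    "capitalized_emphasis": re.compile(r"\b[A-Z]{2,}\b"),  # e.g. EMPHASIS
    "asterisk_stress": re.compile(r"\*\w+\*"),             # e.g. *stress*
    "repeated_punctuation": re.compile(r"[?!]{2,}"),       # e.g. ???!?!?!?
}

def find_cmc_markers(message):
    """Return, per category, the substrings of `message` that match its pattern."""
    return {name: pattern.findall(message) for name, pattern in PATTERNS.items()}

markers = find_cmc_markers("that was *so* GOOD???!?!?!?")
for name, found in markers.items():
    print(name, found)
```

Counting such markers across a chat log would give a crude measure of how often users compensate for missing social cues, though it cannot distinguish emphasis from ordinary acronyms or from the asterisk used as a repair marker.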

E-mail

One of the most popular Internet-related technologies to be studied under this perspective is e-mail, which has expanded the stylistics of languages in many ways. A study of the linguistic profile of e-mails has shown that there is a hybrid of speech and writing styles in terms of format, grammar and style. E-mail is rapidly replacing traditional letter-writing because of its convenience, speed and spontaneity. It is often associated with informality, as it feels temporary and can be deleted easily. However, as this medium of communication matures, e-mail is no longer confined to sending informal messages between friends and relatives. Instead, business correspondence is increasingly being carried out through e-mail. Job seekers are also using e-mail to send their resumes to potential employers. The result of a move towards more formal usage will be a medium representing a range of formal and informal stylistics. While e-mail has been blamed for students’ increased usage of informal language in their written work, David Crystal argues that e-mail is "not a threat, for language education" because e-mail, with its array of stylistic expressiveness, can act as a domain for language learners to make their own linguistic choices responsibly. Furthermore, the younger generation's high propensity for using e-mail may improve their writing and communication skills because of the efforts they are making to formulate their thoughts and ideas, albeit through a digital medium.


Instant messaging

Like other forms of online communication, instant messaging has also developed its own acronyms and short forms. However, instant messaging is quite different from e-mail and chatgroups because it allows participants to interact with one another in real-time while conversing in private. With instant messaging, there is an added dimension of familiarity among participants. This increased degree of intimacy allows greater informality in language and "typographical idiosyncrasies". There are also greater occurrences of stylistic variation because there can be a very wide age gap between participants. For example, a granddaughter can catch up with her grandmother through instant messaging. Unlike chatgroups where participants come together with shared interests, there is no pressure to conform in language here.


Applied perspective

The applied perspective views the linguistic exploitation of the Internet in terms of its communicative capabilities, both the good and the bad. The Internet provides a platform where users can experience multilingualism. Although English is still the dominant language used on the Internet, other languages are gradually increasing in their number of users. The Global Internet usage page provides some information on the number of users of the Internet by language, nationality and geography. This multilingual environment continues to increase in diversity as more language communities become connected to the Internet. The Internet is thus a platform where minority and endangered languages can seek to revive their language use and/or create awareness. It offers these languages opportunities for progress in two important regards: language documentation and language revitalization.


Language documentation

Firstly, the Internet facilitates language documentation. Digital archives of media such as audio and video recordings not only help to preserve language documentation, but also allow for global dissemination through the Internet. Publicity about endangered languages has helped to spur a worldwide interest in linguistic documentation. Foundations such as the Hans Rausing Endangered Languages Project (HRELP), funded by Arcadia, also help to develop the interest in linguistic documentation. The HRELP is a project that seeks, among other aims, to document endangered languages and to preserve and disseminate documentation materials. The materials gathered are made available online under its Endangered Languages Archive (ELAR) program. Other online materials that support language documentation include the Language Archive Newsletter, which provides news and articles about topics in endangered languages. The web version of Ethnologue also provides brief information on all of the world's known living languages. Making resources and information on endangered languages and language documentation available on the Internet allows researchers to build on these materials and hence preserve endangered languages.


Language revitalization

Secondly, the Internet facilitates language revitalization. Over the years, the digital environment has developed in various sophisticated ways that allow virtual contact. From e-mail and chat to instant messaging, these virtual environments have helped to bridge the spatial distance between communicators. E-mail has been adopted in language courses to encourage students to communicate in various styles, such as conference-type formats, and to generate discussions. Similarly, e-mail facilitates language revitalization in the sense that speakers of a minority language who have moved to a location where their native language is not spoken can use the Internet to communicate with family and friends, thus maintaining the use of their native language. With the development and increasing use of telephone broadband communication such as Skype, language revitalization through the Internet is no longer restricted to literate users.
Hawaiian educators have been taking advantage of the Internet in their language revitalization programs. The graphical bulletin board system Leoki (Powerful Voice) was established in 1994. The content, interface and menus of the system are entirely in the Hawaiian language. It is installed throughout the immersion school system and includes components for e-mail, chat, a dictionary and an online newspaper, among others. In higher institutions such as colleges and universities where the Leoki system is not yet installed, educators make use of other software and Internet tools such as Daedalus Interchange, e-mail and the Web to connect students of the Hawaiian language with the broader community.
Another use of the Internet is having students of minority languages write about their native cultures in their native languages for distant audiences. Also, in an attempt to preserve their language and culture, Occitan speakers have been taking advantage of the Internet to reach out to other Occitan speakers around the world. These methods give speakers reasons to use minority languages by communicating in them. In addition, digital technologies, which the younger generation think of as "cool", may appeal to them and in turn maintain their interest in and usage of their native languages.


Exploitation of the Internet

The Internet can also be exploited for activities such as terrorism, internet fraud and pedophilia. In recent years, there has been an increase in crimes that involve Internet services such as e-mail and Internet Relay Chat (IRC), as it is relatively easy to remain anonymous. Such activities raise concerns for security and protection. From a forensic linguistic point of view, there are many potential areas to explore. While a chat room child protection procedure based on search-term filtering can be effective, there is still little linguistically orientated literature to facilitate the task. In other areas, it is observed that the Semantic Web has been involved in tasks such as personal data protection, which helps to prevent fraud.
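
The idea of search-term filtering can be sketched in a few lines. This is only an illustration under stated assumptions: the watch list `WATCH_TERMS` and the function `flag_message` are hypothetical names, and the matching is a plain whole-word lookup, not the procedure of any actual system.

```python
import re

# Hypothetical watch list; a real list would need forensic linguistic research.
WATCH_TERMS = {"password", "bank details"}

def flag_message(message, terms=WATCH_TERMS):
    """Return the sorted watch terms that occur as whole words or phrases."""
    text = message.lower()
    hits = []
    for term in terms:
        # \b anchors keep "password" from matching inside longer words.
        pattern = r"\b" + re.escape(term) + r"\b"
        if re.search(pattern, text):
            hits.append(term)
    return sorted(hits)
```

A message such as "send ur passw0rd" is not flagged by this sketch. Purely lexical filters miss respellings like this, which is one reason such filtering benefits from linguistically informed work.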


Dimensions

The dimensions covered in this section include looking at the Web as a corpus and issues of language identification and normalization. The impacts of Internet linguistics on everyday life are examined under the spread and influence of Internet stylistics, trends of language change on the Internet and conversation discourse.


The Web as a corpus

With the Web being a huge reservoir of data and resources, language scientists and technologists are increasingly turning to the Web for language data. Corpora were first formally mentioned in the field of computational linguistics at the 1989 ACL meeting in Vancouver. They were met with much controversy, as they were seen to lack theoretical integrity, which led to much skepticism about their role in the field. It was not until the publication of the journal "Using Large Corpora" in 1993 that the relationship between computational linguistics and corpora became widely accepted. To establish whether the Web is a corpus, it is worthwhile to turn to the definition established by McEnery and Wilson (1996, p. 21). Relating more closely to the Web as a corpus, Manning and Schütze (1999, p. 120) further streamline the definition. Hit counts for carefully constructed search engine queries have been used to identify rank orders for word sense frequencies, as an input to a word sense disambiguation engine. This method was further explored with the introduction of the concept of parallel corpora, in which existing Web pages that exist in parallel in local and major languages are brought together. It was demonstrated that it is possible to build a language-specific corpus from a single document in that specific language.
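
The hit-count approach to ranking word senses can be illustrated with a minimal sketch. The function name `rank_senses` and the counts in `example` are invented for illustration and do not come from any real search engine.

```python
def rank_senses(hit_counts):
    """Rank senses by hit count and convert counts to relative frequencies."""
    total = sum(hit_counts.values())
    # Sort from most to fewest hits.
    ranked = sorted(hit_counts.items(), key=lambda kv: kv[1], reverse=True)
    return [(sense, count / total if total else 0.0) for sense, count in ranked]

# Hypothetical hit counts for queries pairing "bank" with sense-specific cue words.
example = {"bank (river)": 100, "bank (finance)": 900}
```

The resulting rank order could then serve as a prior for a disambiguation engine. The reliability of that prior depends on how carefully the queries are constructed.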


Themes

There has been much discussion about the possible developments in the arena of the Web as a corpus. The use of the Web as a data source for word sense disambiguation was brought forward in the EU MEANING project in 2002. It used the assumption that within a domain, words often have a single meaning, and that domains are identifiable on the Web. This was further explored by using Web technology to gather manual word sense annotations on the Word Expert Web site. In areas of language modeling, the Web has been used to address data sparseness. Lexical statistics have been gathered for resolving prepositional phrase attachments, while Web documents were used to seek a balance in the corpus. In areas of information retrieval, a Web track was integrated as a component in the community's TREC evaluation initiative. The sample of the Web used for this exercise amounted to around 100 GB, comprising largely documents in the .gov top-level domain.
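
Data sparseness in language modeling can be made concrete with a toy measurement. The sketch below is an illustration only: the token lists and the function `unseen_rate` are hypothetical. It counts how many bigrams in a test text were never observed in training, first with a small corpus and then with extra Web-like text added.

```python
from collections import Counter

def bigrams(tokens):
    """Return the list of adjacent token pairs."""
    return list(zip(tokens, tokens[1:]))

def unseen_rate(train_tokens, test_tokens):
    """Fraction of test bigrams never observed in the training tokens."""
    seen = Counter(bigrams(train_tokens))
    test = bigrams(test_tokens)
    if not test:
        return 0.0
    return sum(1 for bg in test if seen[bg] == 0) / len(test)

# Toy data: a small "clean" corpus and a larger, Web-like sample.
small = "the cat sat on the mat".split()
web_sample = "the dog sat on the rug and the cat sat on the sofa".split()
test = "the dog sat on the mat".split()
```

With these toy inputs, two of the five test bigrams are unseen when training on `small` alone, and none are unseen once `web_sample` is added. A model that assigns zero probability to unseen bigrams would fail on the first case, which is the problem that larger Web-derived counts are meant to ease.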


British National Corpus

The British National Corpus contains ample information on the dominant meanings and usage patterns for the 10,000 words that form the core of English. The number of words in the British National Corpus (about 100 million) is sufficient for many empirical strategies for learning about language for linguists and lexicographers, and is satisfactory for technologies that utilize quantitative information about the behavior of words as input (parsing). However, for some other purposes it is insufficient, as a consequence of the Zipfian nature of word frequencies. Because the bulk of the lexical stock occurs fewer than 50 times in the British National Corpus, the corpus cannot support statistically stable conclusions about such words. Furthermore, for some rarer words, rare meanings of common words, and combinations of words, no data has been found. Researchers find that probabilistic models of language based on very large quantities of data are better than ones based on estimates from smaller, cleaner data sets.
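
The effect of Zipfian word frequencies on corpus coverage can be sketched as follows. The function names are hypothetical, and the rank-frequency relation used, f(r) = f(1) / r, is the idealized form of Zipf's law with exponent 1. Real corpora only approximate it.

```python
from collections import Counter

def rare_type_share(tokens, threshold=50):
    """Share of distinct word types occurring fewer than `threshold` times."""
    counts = Counter(tokens)
    if not counts:
        return 0.0
    rare = sum(1 for c in counts.values() if c < threshold)
    return rare / len(counts)

def zipf_expected(rank, top_frequency):
    """Idealized Zipf frequency for a given rank: f(r) = f(1) / r."""
    return top_frequency / rank
```

Under this idealization, a word at rank 4 is expected to occur a quarter as often as the most frequent word, so frequencies fall off quickly down the ranking. Applying `rare_type_share` to a tokenized corpus gives the proportion of the vocabulary that falls below a chosen count, such as the 50-occurrence threshold mentioned above.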


The multilingual Web

The Web is clearly a multilingual corpus. It is estimated that 71% of the pages (453 million out of 634 million Web pages indexed by the Excite engine) were written in English, followed by Japanese (6.8%), German (5.1%), French (1.8%), Chinese (1.5%), Spanish (1.1%), Italian (0.9%), and Swedish (0.7%). A test to find contiguous words like "deep breath" revealed 868,631 Web pages containing the terms in AlltheWeb. The counts found through search engines are more than three times those generated by the British National Corpus, indicating the significant size of the English corpus available on the Web. The massive size of text available on the Web can be seen in the analysis of controlled data in which corpora of different languages were mixed in various proportions. The estimated Web size in words by AltaVista saw English at the top of the list with 76,598,718,000 words. Next is German, with 7,035,850,000 words, along with six other languages with over a billion hits. Even languages with fewer hits on the Web, such as Slovenian, Croatian, Malay, and Turkish, have more than 100 million words on the Web. This reveals the potential strength and accuracy of using the Web as a corpus given its significant size, which warrants much additional research, such as the project currently being carried out by the British National Corpus to exploit its scale.


Challenges

In areas of language modeling, there are limits on the applicability of any language model, as the statistics for different types of text will differ. When a language technology application is put into use (applied to a new text type), it is not certain that the language model will perform as it did on the training corpus. Model performance has been found to vary substantially when the training corpus changes. This lack of a theory of text types limits the assessment of the usefulness of language-modeling work.
As Web texts are easily produced (in terms of cost and time) by many different authors, there is often little concern for accuracy. Grammatical and typographical errors are regarded as “erroneous” forms that make the Web a dirty corpus. Nonetheless, it may still be useful even with some noise.
The issue of whether sublanguages should be included remains unsettled. Proponents of inclusion argue that removing all sublanguages results in an impoverished view of language. Since language is made up of lexicons, grammar and a wide array of different sublanguages, they argue, sublanguages should be included. However, only recently has this become a viable option. Striking a middle ground by including some sublanguages is contentious because the choice of which to include is arbitrary. The decision of what to include in a corpus lies with corpus developers, and it has been made pragmatically. The desiderata and criteria used for the British National Corpus serve as a good model for a general-purpose, general-language corpus, with the focus on being representative replaced by a focus on being balanced.
Search engines such as Google serve as a default means of access to the Web and its wide array of linguistic resources. However, for linguists working in the field of corpora, they present a number of challenges. These include the limited number of instances presented by the search engines (1,000 or 5,000 maximum); insufficient context for each instance (Google provides a fragment of around ten words); results selected according to criteria that are distorted from a linguistic point of view, as search terms in titles and headings often occupy the top result slots; the inability to specify searches according to linguistic criteria, such as the citation form of a word or its word class; and the unreliability of statistics, with results varying according to search engine load and many other factors. At present, in view of the conflicting priorities among the different stakeholders, the best solution is for linguists to attempt to correct these problems themselves. Doing so would open up a large number of possibilities for harnessing the rich potential of the Web.


Representation

Despite the sheer size of the Web, it may still not be representative of all the languages and domains in the world, and neither are other corpora. However, the huge quantities of text, in numerous languages and language types and on a huge range of topics, make it a good starting point that opens up a large number of possibilities in the study of corpora.


Impact of its spread and influence

Stylistics arising from Internet usage has spread beyond the new media into other areas and platforms, including but not limited to films, music and literary works. The infiltration of Internet stylistics is important because mass audiences are exposed to these works, reinforcing certain Internet-specific language styles which may not be acceptable in standard or more formal forms of language. Apart from Internet slang, grammatical and typographical errors are features of writing on the Internet and other CMC channels. As users of the Internet get accustomed to these errors, the errors progressively infiltrate everyday language use, in both written and spoken forms. It is also common to witness such errors in mass media works, from typographical errors in news articles to grammatical errors in advertisements and even Internet slang in drama dialogues.
The more the Internet is incorporated into daily life, the greater the impact it has on formal language. This is especially true in modern Language Arts classes through the use of smartphones, tablets, and social media. Students are exposed to the language of the Internet more than ever, and as such, the grammatical structure and slang of the Internet are bleeding into their formal writing. Full immersion in a language is always the best way to learn it. Mark Lester in his book ''Teaching Grammar and Usage'' states: “The biggest single problem that basic writers have in developing successful strategies for coping with errors is simply their lack of exposure to formal written English ... We would think it absurd to expect a student to master a foreign language without extensive exposure to it.” Since students are immersed in Internet language, that is the form and structure they are mirroring. In addition, the rise of the Internet and the overall immersion of people within it have brought forth a new wave of Internet activism that has an impact on the public every day.


Memes

The origin of the term "meme" can be traced back to Richard Dawkins, an ethologist, who describes it as "a noun that conveys the idea of a unit of cultural transmission, or a unit of imitation". The term was later adapted to the realm of the Internet by David Beskow, Sumeet Kumar, and Kathleen Carley, who labeled Internet memes as "any digital unit that transfers culture".


Mass media

There have been instances of television advertisements using Internet slang, reinforcing the penetration of Internet stylistics into everyday language use. For example, in a Cingular advertisement in the United States, acronyms such as "BFF Jill" (which means "Best Friend Forever, Jill") were used. More businesses have adopted Internet slang in their advertisements as more people grow up using the Internet and other CMC platforms, in an attempt to relate to and connect with them better. Such advertisements have received relatively enthusiastic feedback from their audiences.
The use of Internet lingo has also spread into the arena of music, most visibly in popular music. A recent example is a Trey Songz song whose lyrics incorporated much Internet lingo and mentions of Twitter and texting.
The spread of Internet linguistics is also present in films made by both commercial and independent filmmakers. Though independent films are primarily screened at film festivals, DVDs of them are often available for purchase over the Internet, as is paid live streaming, making the films more easily accessible to the public. The very nature of commercial films being screened at public cinemas allows wide exposure to the mainstream mass audience, resulting in a faster and wider spread of Internet slang. One commercial film is titled "LOL" (acronym for ''Laugh Out Loud'' or ''Laughing Out Loud''), starring Miley Cyrus and Demi Moore. This movie is a 2011 remake of Lisa Azuelos's popular 2008 French film similarly titled "LOL (Laughing Out Loud)".
The use of Internet slang is not limited to the English language but extends to other languages as well. The Korean language has incorporated the English alphabet in the formation of its slang, while other slang terms were formed from common misspellings arising from fast typing. The new Korean slang is further reinforced and brought into everyday language use by television shows such as soap operas or comedy dramas like “High Kick Through the Roof”, released in 2009.


Linguistic future of the Internet

With the emergence of greater computer/Internet mediated communication systems, coupled with the readiness with which people adapt to meet the new demands of a more technologically sophisticated world, it is expected that users will remain under pressure to alter their language use to suit the new dimensions of communication. As the number of Internet users increases rapidly around the world, the cultural backgrounds, linguistic habits and language differences among users are brought into the Web at a much faster pace. These individual differences among Internet users are predicted to significantly impact the future of Internet linguistics, notably in the aspect of the multilingual Web. From 2000 to 2010, Internet penetration experienced its greatest growth in non-English speaking countries such as China and India and countries in Africa, resulting in more languages apart from English penetrating the Web. Also, the interaction between English and other languages is predicted to be an important area of study (Ivkovic, D., & Lotherington, H. (2009). Multilingualism in cyberspace: Conceptualising the virtual linguistic landscape. International Journal of Multilingualism, 6(1), 17–36). As global users interact with each other, references to different languages may continue to increase, resulting in the formation of new Internet stylistics that span languages. Chinese and Korean have already experienced the infiltration of English, leading to the formation of their multilingual Internet lingo. At present, the Internet provides a form of education and promotion for minority languages. However, just as cross-language interaction has resulted in the infiltration of English into Chinese and Korean to form new slang, minority languages are also affected by the more common languages used on the Internet (such as English and Spanish).
While language interaction can cause a loss in the authentic standard of minority languages, familiarity with the majority language can also affect minority languages in adverse ways. For example, users attempting to learn a minority language may opt to read about it in a majority language and stop there, resulting in a loss instead of a gain in potential speakers of the minority language. Also, speakers of minority languages may be encouraged to learn the more common languages used on the Web in order to gain access to more resources, in turn leading to a decline in the usage of their own language. The future of endangered minority languages in view of the spread of the Internet remains to be observed.


See also

* Internet slang
* Stylistics (linguistics)
* Standard English
* Enron Corpus, publicly available database of 600,000 emails within the Enron Corporation
* Applied linguistics
* Glossary of Internet-related terminology
* Appendix: Internet Slang
* Internetlinguistik (German)


Further reading

* Aitchison, J., & Lewis, D. M. (Eds.). (2003). New Media Language. London and New York: Routledge.
* Baron, N. S. (2000). Alphabet to Email: How Written English Evolved and Where It's Heading. London and New York: Routledge.
* Beard, A. (2004). Language Change. London and New York: Routledge.
* Biewer, C., Nesselhauf, N., & Hundt, M. (Eds.). (2006). Corpus Linguistics and the Web. The Netherlands: Rodopi.
* Boardman, M. (2005). The Language of Websites. New York and London: Routledge.
* Crystal, D. (2004). A Glossary of Netspeak and Textspeak. Edinburgh: Edinburgh University Press.
* Crystal, D. (2004). The Language Revolution (Themes for the 21st Century). United Kingdom: Polity Press Ltd.
* Crystal, D. (2006). Language and the Internet (2nd Ed.). Cambridge: Cambridge University Press.
* Crystal, D. (2011). Internet Linguistics: A Student Guide. New York: Routledge.
* Dieter, J. (2007). Webliteralität: Lesen und Schreiben im World Wide Web.
* Enteen, J. (2010). Virtual English: Internet Use, Language, and Global Subjects. London and New York: Routledge.
* Gerrand, P. (2009). Minority Languages on the Internet: Promoting the Regional Languages of Spain. VDM Verlag.
* Gibbs, D., & Krause, K. (Eds.). (2006). Cyberlines 2.0: Languages and Cultures of the Internet. Australia: James Nicholas Publishers.
* Jenkins, J. (2003). World Englishes: A Resource Book for Students. London and New York: Routledge.
* Macfadyen, L. P., Roche, J., & Doff, S. (2005). Communicating Across Cultures in Cyberspace: A Bibliographical Review of Intercultural Communication Online. Lit Verlag.
* Thurlow, C., Lengel, L. B., & Tomic, A. (2004). Computer Mediated Communication: Social Interaction and the Internet. London: Sage Publications.