
Search engine privacy is a subset of internet privacy that deals with user data being collected by search engines. Both types of privacy fall under the umbrella of information privacy. Privacy concerns regarding search engines can take many forms, such as the ability of search engines to log individual search queries, browsing history, IP addresses, and cookies of users, and to conduct user profiling in general. The collection of personally identifiable information (PII) of users by search engines is referred to as "tracking". (Pekala, Shayna. 2017. "Privacy and User Experience in 21st Century Library Discovery". ''Information Technology and Libraries'' 36(2):48–58.)

This is controversial because search engines often claim to collect a user's data in order to better tailor results to that specific user and to provide the user with a better searching experience. However, search engines can also abuse and compromise their users' privacy by selling their data to advertisers for profit. In the absence of regulations, users must decide what is more important to their search engine experience, relevance and speed of results or their privacy, and choose a search engine accordingly. (Lenard, Thomas M. and Paul H. Rubin. 2010. "In Defense of Data: Information and the Costs of Privacy". ''Policy & Internet'' 2(1):1–56.) The legal framework for protecting user privacy is not very solid. (Foley, Jayni. 2007. "Are Google Searches Private? An Originalist Interpretation of the Fourth Amendment in Online Communication Cases". ''Berkeley Technology Law Journal'' 22(1):447–75.) The most popular search engines collect personal information, but other search engines that are focused on privacy have emerged. There have been several well-publicized breaches of search engine user privacy at companies like AOL and Yahoo. Individuals interested in preserving their privacy have options available to them, such as using software like Tor, which anonymizes the user's location and personal information (Ridgway, Renee. 2017. "Against a Personalisation of the Self". ''Ephemera: Theory & Politics in Organization'' 17(2):377–97), or using a privacy-focused search engine.


Privacy policies

Search engines generally publish privacy policies to inform users about what data of theirs may be collected and what purposes it may be used for. While these policies may be an attempt at transparency by search engines, many people never read them (Strahilevitz, Lior Jacob and Matthew B. Kugler. 2016. "Is Privacy Policy Language Irrelevant to Consumers?" ''The Journal of Legal Studies'' 45(S2).) and are therefore unaware of how much of their private information, like passwords and saved files, is collected from cookies and may be logged and kept by the search engine. (Dolin, Ron A. 2010. "Search Query Privacy: The Problem of Anonymization". ''Hastings Science and Technology Law Journal'' 2(2):137–82.) (Nissenbaum, Helen. 2011. "A Contextual Approach to Privacy Online". ''Daedalus, the Journal of the American Academy of Arts & Sciences'' 140(4):32–48.) This ties in with the phenomenon of notice and consent, which is how many privacy policies are structured. Notice and consent policies essentially consist of a site showing the user a privacy policy and having them click to agree. This is intended to let the user freely decide whether or not to go ahead and use the website. This decision, however, may not actually be made so freely because the costs of opting out can be very high. (Tene, Omer. 2008. "What Google Knows: Privacy and Internet Search Engines". ''Utah Law Review'' 2008(4):1433–92.) Another major problem with presenting a privacy policy and asking users to accept it quickly is that such policies are often very hard to understand, even in the unlikely case that a user decides to read them. Privacy-minded search engines, such as DuckDuckGo, state in their privacy policies that they collect much less data than search engines such as Google or Yahoo, and may not collect any. As of 2008, search engines were not in the business of selling user data to third parties, though they do note in their privacy policies that they comply with government subpoenas.


Google and Yahoo

Google, founded in 1998, is the most widely used search engine, receiving billions of search queries every month. Google logs all search terms in a database along with the date and time of the search, the browser and operating system, the IP address of the user, the Google cookie, and the URL that shows the search engine and search query. (Church, Peter and Georgina Kon. 2007. "Google at the Heart of a Data Protection Storm". ''Computer Law & Security Report'' 23(5):461–65.) The privacy policy of Google states that it passes user data on to various affiliates, subsidiaries, and "trusted" business partners. Yahoo, founded in 1995, also collects user data. Users rarely read privacy policies, even for services that they use daily, such as Yahoo! Mail and Gmail. This persistent failure of consumers to read privacy policies can be disadvantageous to them because, while they may not pick up on differences in the language of privacy policies, judges in court cases certainly do. This means that search engine and email companies like Google and Yahoo are technically able to keep targeting advertisements based on email content, since they declare in their privacy policies that they do so. One study examined how much consumers cared about the privacy policies of Google, specifically Gmail, and their level of detail. It found that users often thought Google's practices were somewhat intrusive but would not often be willing to counteract this by paying a premium for their privacy.


DuckDuckGo

DuckDuckGo, founded in 2008, claims to be privacy focused. (Hands, Africa. 2012. "Duckduckgo http://www.duckduckgo.com or http://www.ddg.gg". ''Technical Services Quarterly'' 29(4):345–347.) (Allen, Jeffrey and Ashley Hallene. 2018. "Privacy and Security Tips for Avoiding Financial Chaos". ''American Journal of Family Law'' 101–7.) DuckDuckGo does not collect or share any personal information of users, such as IP addresses or cookies, which other search engines usually do log and keep for some time. It also does not have spam, and it protects user privacy further by using encryption and by hiding search queries from the websites the user chooses to visit. Similarly privacy-oriented search engines include Startpage and Disconnect.


Types of data collected by search engines

Most search engines can, and do, collect personal information about their users according to their own privacy policies. This user data could be anything from location information to cookies, IP addresses, search query histories, click-through history, and online fingerprints. (Wicker, Jörg and Stefan Kramer. 2017. "The Best Privacy Defense Is a Good Privacy Offense: Obfuscating a Search Engine User's Profile". ''Data Mining and Knowledge Discovery'' 31(5):1419–43.) (Squitieri, Chad. 2015. "Confronting Big Data: Applying the Confrontation Clause to Government Data Collection". ''Virginia Law Review'' 101(7):2011–49.) This data is often stored in large databases, and users may be assigned numbers in an attempt to provide them with anonymity. Data can be stored for an extended period of time. For example, the data collected by Google on its users is retained for up to 9 months. (Viejo, Alexandre and Jordi Castellà-Roca. 2010. "Using Social Networks to Distort Users' Profiles Generated by Web Search Engines". ''Computer Networks'' 54(9):1343–57.) (Evans, David S. n.d. "The Online Advertising Industry: Economics, Evolution, and Privacy". ''The Journal of Economic Perspectives'' 23(3):37–60.) Some studies state that this number is actually 18 months. (Chiru, Claudiu. 2016. "Search Engines: Ethical Implications". ''Economics, Management, and Financial Markets'' 11(1):162–67.) This data is used for various purposes, such as optimizing and personalizing search results for users, targeting advertising, and trying to protect users from scams and phishing attacks. By using cookies, search engines can collect such data even when a user is not logged in to their account or is using a different IP address.
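As a rough illustration of the point about cookies, the sketch below shows how log entries could be grouped by a cookie identifier even when the IP address changes and no account is involved. The `LogEntry` fields mirror the kinds of data listed above, but the field names, the sample values, and the `group_by_cookie` helper are assumptions made for illustration and do not describe any search engine's actual schema.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class LogEntry:
    """One hypothetical search-log record (field names are assumptions)."""
    cookie_id: str   # identifier stored in the browser cookie
    ip_address: str  # address the request came from
    timestamp: str   # date and time of the search
    query: str       # the search terms


def group_by_cookie(entries):
    """Collect queries per cookie ID, regardless of IP address."""
    profiles = defaultdict(list)
    for entry in entries:
        profiles[entry.cookie_id].append(entry.query)
    return dict(profiles)


# Made-up sample data: the same browser appears from two IP addresses.
log = [
    LogEntry("cookie-A", "192.0.2.10", "2024-01-01T09:00", "weather today"),
    LogEntry("cookie-B", "192.0.2.10", "2024-01-01T09:05", "train times"),
    LogEntry("cookie-A", "198.51.100.7", "2024-01-02T18:30", "pizza near me"),
]

profiles = group_by_cookie(log)
print(profiles["cookie-A"])  # both queries, despite the IP change
```

In this toy log, the IP address alone would split one browser's activity in two and merge two different browsers into one, while the cookie identifier keeps each browser's queries together.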


Uses


User profiling and personalization

Once search engines have collected information about a user's habits, they often create a profile of that user, which helps the search engine decide which links to show for different search queries submitted by that user or which ads to target them with. A notable development in this field is the use of automated learning, also known as machine learning. Using this, search engines can refine their profiling models to more accurately predict what any given user may want to click on by A/B testing the results offered to users and measuring the users' reactions. (van Otterlo, Martijn. 2014. "Automated Experimentation in Walden 3.0. : The Next Step in Profiling, Predicting, Control and Surveillance". ''Surveillance & Society'' 12(2):255–72.) Companies like Google, Netflix, YouTube, and Amazon have all started personalizing results more and more. One notable example is how Google Scholar takes into account the publication history of a user in order to produce results it deems relevant. Personalization also occurs when Amazon recommends books or when IMDb suggests movies by using previously collected information about a user to predict their tastes. For personalization to occur, a user need not even be logged into their account.
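The A/B testing mentioned above can be sketched as follows: users are split at random between two variants of a results page, and the click-through rate of each variant is compared. This is a minimal simulation, not a description of how any particular company runs its experiments. The function name, the click probabilities, and the simulated users are all hypothetical.

```python
import random


def run_ab_test(num_users, click_prob_a, click_prob_b, seed=0):
    """Simulate an A/B test and return the click-through rate per variant.

    click_prob_a / click_prob_b are made-up probabilities that a simulated
    user clicks a result when shown variant A or B.
    """
    rng = random.Random(seed)
    shown = {"A": 0, "B": 0}
    clicked = {"A": 0, "B": 0}
    for _ in range(num_users):
        variant = rng.choice(["A", "B"])  # random assignment to a variant
        prob = click_prob_a if variant == "A" else click_prob_b
        shown[variant] += 1
        if rng.random() < prob:           # simulated user reaction
            clicked[variant] += 1
    return {v: clicked[v] / shown[v] if shown[v] else 0.0 for v in shown}


rates = run_ab_test(num_users=10_000, click_prob_a=0.10, click_prob_b=0.15)
better = max(rates, key=rates.get)
print(rates, "-> keep variant", better)
```

A real system would also check whether the difference between the two rates is statistically significant before preferring one variant.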


Targeted advertising

The internet advertising company DoubleClick, which helped advertisers target users for specific ads, was bought by Google in 2008 and was a subsidiary until June 2018, when Google rebranded and merged DoubleClick into its Google Marketing Platform. DoubleClick worked by depositing cookies on users' computers that would track the sites they visited that carried DoubleClick ads. When Google was in the process of acquiring DoubleClick, there was concern that the acquisition would let Google create even more comprehensive profiles of its users, since it would be collecting data about search queries and additionally tracking websites visited. This could lead to users being shown ads that are increasingly effective through the use of behavioral targeting. With more effective ads comes the possibility of more purchases by consumers that they might not have made otherwise. In 1994, a conflict between selling ads and relevance of results on search engines began. This was sparked by the development of the cost-per-click model, which challenged the methods of the already-created cost-per-mille model. The cost-per-click method was directly related to what users searched, whereas the cost-per-mille method was directly influenced by how much a company could pay for an ad, no matter how many times people interacted with it.
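To make the difference between the two pricing models concrete, the following sketch computes what an advertiser would pay under each. The formulas are the usual definitions (a price per thousand impressions and a price per click). The rates and traffic figures are invented for illustration.

```python
def cpm_cost(impressions, rate_per_thousand):
    """Cost-per-mille: pay for every 1,000 times the ad is shown."""
    return impressions / 1000 * rate_per_thousand


def cpc_cost(clicks, rate_per_click):
    """Cost-per-click: pay only when a user clicks the ad."""
    return clicks * rate_per_click


# Hypothetical campaign: 200,000 impressions, 1% of which are clicked.
impressions = 200_000
clicks = 2_000

print(cpm_cost(impressions, rate_per_thousand=5.0))  # 1000.0
print(cpc_cost(clicks, rate_per_click=0.25))         # 500.0
```

Under cost-per-mille the bill depends only on how often the ad is displayed, whereas under cost-per-click it rises and falls with how often users actually interact with the ad.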


Improving search quality

Besides ad targeting and personalization, Google also uses data collected on users to improve the quality of searches. Search result click histories and query logs are crucial in helping search engines optimize search results for individual users. Search logs also help search engines develop the algorithms they use to return results, such as Google's well-known PageRank. For example, Google uses databases of information to refine Google Spell Checker.


Privacy organizations

Many believe that user profiling is a severe invasion of user privacy, and organizations such as the Electronic Privacy Information Center (EPIC) and Privacy International focus on advocating for user privacy rights. In fact, EPIC filed a complaint with the Federal Trade Commission in 2007, claiming that Google should not be able to acquire DoubleClick on the grounds that it would compromise user privacy.


Users' perception of privacy

Experiments have examined consumer behavior when shoppers are given information on the privacy practices of retailers through privacy ratings integrated with search engines. (Tsai, Janice Y., Serge Egelman, Lorrie Cranor, and Alessandro Acquisti. 2011. "The Effect of Online Privacy Information on Purchasing Behavior: An Experimental Study". ''Information Systems Research'' 22(2):254–68.) For the treatment group, researchers used a search engine called Privacy Finder, which scans websites and automatically generates an icon showing how the level of privacy a site offers compares with the privacy preferences the consumer has specified. Subjects in the treatment group, who used a search engine that indicated the privacy levels of websites, purchased products from websites that gave them higher levels of privacy, whereas participants in the control groups opted for the products that were simply the cheapest. The study participants were also given a financial incentive because they would get to keep leftover money from purchases. The study suggests that, since participants had to use their own credit cards, they had a significant aversion to purchasing products from sites that did not offer the level of privacy they wanted, indicating that consumers place a monetary value on their privacy.


Ethical debates

Many individuals and scholars have recognized the ethical concerns regarding search engine privacy.


Pro data collection

The collection of user data by search engines can be viewed as a positive practice because it allows the search engine to personalize results. This implies that users would receive more relevant results, and be shown more relevant advertisements, when their data, such as past search queries, location information, and clicks, is used to create a profile for them. Also, search engines are generally free of charge for users and can remain afloat because one of their main sources of revenue is advertising, which can be more effective when targeted.


Anti-data collection

This collection of user data can also be seen as an overreach by private companies for their own financial gain or as an intrusive surveillance tactic. Search engines can make money using targeted advertising because advertisers are willing to pay a premium to present their ads to the most receptive consumers. Also, when a search engine collects and catalogs large amounts of data about its users, there is the potential for that data to be leaked accidentally or breached. The government can also subpoena user data from search engines when they maintain databases of it. Search query database information may also be subpoenaed by private litigants for use in civil cases, such as divorces or employment disputes.


Data and privacy breaches


AOL search data leak

One major controversy regarding search engine privacy was the AOL search data leak of 2006. For academic and research purposes, AOL made public a list of about 20 million search queries made by about 650,000 unique users. Although AOL assigned unique identification numbers to the users instead of attaching names to each query, it was still possible to ascertain the true identities of many users simply by analyzing what they had searched, including locations near them and names of friends and family members. A notable example of this was how the New York Times identified Thelma Arnold through "reverse searching". Users also sometimes do "ego searches", in which they search for themselves to see what information about them is on the internet, making it even easier to identify supposedly anonymous users. Many of the search queries released by AOL were incriminating or seemingly extremely private, such as "how to kill your wife" and "can you adopt after a suicide attempt". This data has since been used in several experiments that attempt to measure the effectiveness of user privacy solutions.
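The weakness of replacing names with identification numbers can be illustrated with a small sketch. The `pseudonymize` function below is a hypothetical stand-in for the general idea described above and is not AOL's actual procedure. The names and queries are invented and are not taken from the released data.

```python
def pseudonymize(records):
    """Replace each user name with a stable number, keeping queries intact."""
    ids = {}
    released = []
    for name, query in records:
        # The first time a name is seen it gets the next free number;
        # afterwards the same number is reused for that name.
        user_id = ids.setdefault(name, len(ids) + 1)
        released.append((user_id, query))
    return released


# Invented sample data in the form (user name, query).
raw = [
    ("alice", "plumbers in springfield"),
    ("bob", "cheap flights"),
    ("alice", "jane example springfield"),  # an "ego search" on a made-up name
]

released = pseudonymize(raw)
# The names are gone, but every query by the same person still shares an ID,
# so the queries can be read together to narrow down who that person is.
linked = [query for user_id, query in released if user_id == 1]
print(released)
print(linked)
```

Because the number is stable across queries, a location in one query and a personal name in another can be combined, which is the kind of linkage that made re-identification possible.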


Google and Yahoo

Both Google and Yahoo were subjects of a Chinese hack in 2010. (Trautman, Lawrence J. and Peter C. Ormerod. 2016. "Corporate Directors' and Officers' Cybersecurity Standard of Care: The Yahoo Data Breach". ''SSRN Electronic Journal'' 66(5).) While Google responded to the situation seriously by hiring new cybersecurity engineers and investing heavily in securing user data, Yahoo took a much more lax approach. Google started paying hackers to find vulnerabilities in 2010, while it took Yahoo until 2013 to follow suit. Yahoo was also identified in the Snowden data leaks as a common hacking target for spies of various nations, yet Yahoo still did not give its newly hired chief information security officer the resources to really effect change within the company. In 2012, Yahoo hired Marissa Mayer, previously a Google employee, to be the new CEO, but she chose not to invest much in the security infrastructure of Yahoo and went as far as to refuse to implement a basic, standard security measure: forcing the reset of all passwords after a breach. Yahoo is known for being the subject of multiple breaches and hacks that have compromised large amounts of user data. As of late 2016, Yahoo had announced that at least 1.5 billion user accounts had been breached during 2013 and 2014. The breach of 2013 compromised over a billion accounts, while the breach of 2014 included about 500 million accounts. The data compromised in the breaches included personally identifiable information such as phone numbers, email addresses, and birth dates, as well as information like security questions (used to reset passwords) and encrypted passwords. Yahoo stated that the breaches were the work of state-sponsored actors, and in 2017 two Russian intelligence officers were indicted by the United States Department of Justice as part of a conspiracy to hack Yahoo and steal user data. As of 2016, the Yahoo breaches of 2013 and 2014 were the largest of all time. In October 2018, there was a Google+ data breach that potentially affected about 500,000 accounts, which led to the shutdown of the Google+ platform.


Government subpoenas of data

The government may want to subpoena user data from search engines for any number of reasons, which is why such subpoenas are a big threat to user privacy. In 2006, the U.S. government sought search data as part of its defense of the Child Online Protection Act (COPA), and only Google refused to comply. While protecting children online may be an honorable goal, there are concerns about whether the government should have access to such personal data to achieve it. At other times, the government may want the data for national security purposes; access to big databases of search queries in order to prevent terrorist attacks is a common example of this. Whatever the reason, it is the fact that search engines create and maintain these databases of user data that makes it possible for the government to access them. Another concern regarding government access to search engine user data is "function creep", a term that here refers to how data originally collected by the government for national security purposes may eventually be used for other purposes, such as debt collection. Many would regard this as government overreach. While protections for search engine user privacy have started developing recently, the government has increasingly been on the side that wants to ensure search engines retain data, making users less protected and their data more available for anyone to subpoena.


Methods for increasing privacy


Switching search engines

A popular route for a privacy-conscious user is simply to start using a privacy-oriented search engine, such as DuckDuckGo. This search engine maintains the privacy of its users by not collecting data on or tracking them. While this may sound simple, users must take into account the trade-off between privacy and relevant results when deciding to switch search engines. Results for search queries can be very different when the search engine has no search history to aid it in personalization.


Using privacy oriented browsers

Mozilla is known for its commitment to protecting user privacy in Firefox. Firefox users can delete the tracking cookie that Google places on their computer, making it much harder for Google to group data. Firefox also has a button called "Clear Private Data", which allows users to have more control over their settings. Internet Explorer users have this option as well. When using a browser like Google Chrome or Safari, users also have the option to browse in "incognito" or "private browsing" mode, respectively. In these modes, the user's browsing history and cookies are not kept by the browser once the session ends.


Opting out

The Google, Yahoo!, AOL, and MSN search engines all allow users to opt out of the behavioral targeting they use. Users can also delete search and browsing history at any time. The Ask.com search engine also has AskEraser, which, when used, purges user data from its servers. Deleting a user's profile and history of data from search engine logs also helps protect user privacy in the event a government agency wants to subpoena it. If there are no records, there is nothing the government can access. It is important to note that simply deleting browsing history does not delete all the information the search engine has on a user; some companies do not delete the data associated with an account when its browsing history is cleared. Companies that do delete user data usually do not delete all of it, keeping records of how the user used the search engine.


Social network solution

An innovative solution, proposed by researchers Viejo and Castellà-Roca, is a social network solution whereby user profiles are distorted. In their plan, each user would belong to a group, or network, of people who all use the search engine. Every time somebody wanted to submit a search query, it would be passed on to another member of the group to submit on their behalf until someone submitted it. This would ideally lead to all search queries being divvied up equally between all members of the network. This way, the search engine cannot make a useful profile of any individual user in the group since it has no way to discern which query actually belonged to each user.
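The forwarding idea described above can be sketched as a small simulation. This is a simplified illustration and not the researchers' actual protocol: the `submit_via_group` helper, the `forward_prob` parameter, and the member names are assumptions introduced here.

```python
import random
from collections import Counter


def submit_via_group(origin, members, rng, forward_prob=0.5):
    """Return the member who ends up submitting a query issued by `origin`.

    The query is handed to another member, who either submits it or passes
    it on again. forward_prob is an assumed parameter for this sketch.
    """
    holder = origin
    while True:
        # Hand the query to some member other than the current holder.
        holder = rng.choice([m for m in members if m != holder])
        if rng.random() >= forward_prob:
            return holder  # this member submits the query


members = ["u1", "u2", "u3", "u4"]
rng = random.Random(42)

# What the search engine sees: (submitting member, query), never the origin.
observed = [
    (submit_via_group("u1", members, rng), f"query {i}") for i in range(1000)
]
print(Counter(submitter for submitter, _ in observed))
```

Although every query in this run originates from `u1`, the search engine's log attributes them to several different members, so a profile built from the log would not reflect any one user's interests.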


Delisting and reordering

After the ''Google Spain v. AEPD'' case, it was established that people have the right to request that search engines remove personal information from their search results, in compliance with European data protection regulations. This process of removing certain search results is called de-listing (de Mars, Sylvia and Patrick O'Callaghan. 2016. "Privacy and Search Engines: Forgetting or Contextualizing?" ''Journal of Law and Society'' 43(2):257–84). While de-listing is effective in protecting the privacy of those who do not want information about them to be found through a search engine, it does not necessarily protect the contextual integrity of search results. For data that is not highly sensitive or compromising, reordering search results is another option: people would be able to rank how relevant certain data is at any given point in time, which would then alter the results shown when someone searches for their name.


Anonymity networks

A sort of DIY option for privacy-minded users is to use software like Tor, which is an anonymity network. Tor functions by encrypting user data in layers and routing each query through a short chain of relays chosen from the thousands of volunteer-run relays in the network. While this process is effective at masking IP addresses, it can slow the delivery of results. Masking the IP address is also not a complete defense: studies have shown that simulated attacker software could still match search queries to users even when the queries were anonymized using Tor (Petit, Albin et al. 2016. "SimAttack: Private Web Search under Fire". ''Journal of Internet Services and Applications'' 7(2):1–17; Peddinti, Sai Teja and Nitesh Saxena. 2014. "Web Search Query Privacy: Evaluating Query Obfuscation and Anonymizing Networks". ''Journal of Computer Security'' 22(1):155–99).
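The idea of encrypting data in layers so that each relay can remove only its own layer can be shown with a toy sketch. This is not Tor's actual protocol or cryptography: the XOR keystream below is an insecure stand-in for a real cipher, and the relay keys and function names are hypothetical.

```python
import hashlib
from typing import List, Sequence


def keystream(key: bytes, length: int) -> bytes:
    """Derive `length` pseudo-random bytes from `key` (toy construction only)."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(4, "big")).digest()
        counter += 1
    return out[:length]


def xor_layer(key: bytes, data: bytes) -> bytes:
    """Add or remove one layer; XOR with the same keystream undoes itself."""
    return bytes(a ^ b for a, b in zip(data, keystream(key, len(data))))


def wrap(query: bytes, relay_keys: Sequence[bytes]) -> bytes:
    """Client side: apply one layer per relay, exit relay's layer first."""
    data = query
    for key in reversed(relay_keys):
        data = xor_layer(key, data)
    return data


def peel(onion: bytes, relay_keys: Sequence[bytes]) -> List[bytes]:
    """Return what each relay holds after removing its own layer, in path order."""
    views = []
    data = onion
    for key in relay_keys:
        data = xor_layer(key, data)
        views.append(data)
    return views


if __name__ == "__main__":
    keys = [b"entry-key", b"middle-key", b"exit-key"]  # hypothetical relay keys
    onion = wrap(b"example search query", keys)
    for name, view in zip(["entry", "middle", "exit"], peel(onion, keys)):
        print(name, view)
```

In this sketch only the last relay in the chain recovers the plaintext query, and the first relay, which is the one that sees the user's IP address, holds only data that is still wrapped in the other relays' layers.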


Unlinkability and indistinguishability

Unlinkability and indistinguishability are also well-known approaches to search engine privacy, although they have proven somewhat ineffective in actually making users anonymous. Both try to dissociate search queries from the user who made them, with the aim of preventing the search engine from definitively linking a specific query with a specific user and building a useful profile of them. This can be done in a couple of different ways.


Unlinkability

An unlinkability solution hides identifying information, such as the user's IP address, from the search engine. This is perhaps the simpler option because any user can do it by using a VPN, although it still does not guarantee total privacy from the search engine.


Indistinguishability

An indistinguishability solution has the user run a plugin or other software that generates multiple different search queries for every real search query the user makes. It works by obscuring the user's real searches so that the search engine cannot tell which queries are the software's and which are the user's. This makes it more difficult for the search engine to use the data it collects on a user to do things like target ads.
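A minimal sketch of the decoy-query idea follows. It does not describe any particular plugin: the function name, the decoy pool, and the choice to send decoys and the real query in one shuffled batch are assumptions made for illustration.

```python
import random


def with_decoys(real_query, decoy_pool, k, rng):
    """Return the real query hidden among `k` decoys, in random order.

    `decoy_pool` is a hypothetical list of plausible-looking queries; a real
    tool would need decoys that resemble the user's own searches.
    """
    candidates = [q for q in decoy_pool if q != real_query]
    batch = rng.sample(candidates, k) + [real_query]
    rng.shuffle(batch)
    return batch


if __name__ == "__main__":
    pool = [
        "weather tomorrow",
        "pasta recipes",
        "train timetable",
        "football scores",
        "used bicycles",
        "museum opening hours",
    ]
    rng = random.Random()
    for query in with_decoys("symptoms of flu", pool, k=3, rng=rng):
        print("sending:", query)
```

The search engine receives four queries of which only one is genuine. How well this hides the real query depends on how realistic the decoys are, which is one reason such approaches have been found only partly effective.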


Legal rights and court cases

Because the internet and search engines are relatively recent creations, no solid legal framework for privacy protections specific to search engines has been put in place. However, scholars do write about the implications of existing privacy laws to determine what right to privacy search engine users have. As this is a developing field of law, there have been several lawsuits concerning the privacy that search engines are expected to afford their users.


United States


The Fourth Amendment

The Fourth Amendment is well known for the protections it offers citizens from unreasonable searches and seizures, but in ''Katz v. United States'' (1967), these protections were extended to cover intrusions on the privacy of individuals, in addition to intrusions on property and persons. Privacy of individuals is a broad term, but it is not hard to imagine that it includes the online privacy of an individual.


The Sixth Amendment

The Confrontation Clause of the Sixth Amendment is applicable to the protection of big data from government surveillance. The Confrontation Clause essentially states that defendants in criminal cases have the right to confront witnesses who provide testimonial statements. If a search engine company like Google gives information to the government to prosecute a case, these witnesses are the Google employees involved in selecting which data to hand over to the government. The specific employees who must be available to be confronted under the Confrontation Clause are the producer who decides what data is relevant and provides the government with what it has asked for, the Google analyst who certifies the proper collection and transmission of data, and the custodian who keeps records. The data these employees curate for use at trial is then treated as a testimonial statement. The overall effect of the Confrontation Clause on search engine privacy is that it places a check on how the government can use big data and provides defendants with protection from human error.


''Katz v. United States''

This 1967 case is prominent because it established a new interpretation of privacy under the Fourth Amendment, specifically that people have a reasonable expectation of it. ''Katz v. United States'' was about whether it was constitutional for the government to listen to and record, using an electronic listening device, a conversation Katz had from a public phone booth. The court ruled that this violated the Fourth Amendment because the government's actions constituted a "search" for which it needed a warrant. When thinking about search engine data collected about users, the way telephone communications were classified under ''Katz v. United States'' could be a precedent for how such data should be handled. In ''Katz v. United States'', public telephones were deemed to have a "vital role" in private communications. The case was decided in 1967, but the internet and search engines arguably now play this vital role in private communications, and people's search queries and IP addresses can be thought of as analogous to the private phone calls placed from public booths.


''United States v. Miller''

This 1976 Supreme Court case is relevant to search engine privacy because the court ruled that the Fourth Amendment did not apply to information that third parties had gathered or been given. Jayni Foley argues that the ruling of ''United States v. Miller'' implies that people cannot have an expectation of privacy when they provide information to third parties. This is important for search engine privacy because people willingly provide search engines with information in the form of their search queries, along with various other data points that they may not realize are being collected.


''Smith v. Maryland''

In ''Smith v. Maryland'' (1979), the Supreme Court built on the precedent about assumption of risk set in the 1976 ''United States v. Miller'' case. The court ruled that the Fourth Amendment did not prevent the government from using a pen register to monitor which phone numbers were dialed, because doing so did not qualify as a "search". Both ''United States v. Miller'' and ''Smith v. Maryland'' have been used to deny users Fourth Amendment privacy protections for the records that internet service providers (ISPs) keep. This is also articulated in the Sixth Circuit ''Guest v. Leis'' case as well as the ''United States v. Kennedy'' case, where the courts ruled that Fourth Amendment protections did not apply to ISP customer data because customers willingly provided ISPs with their information just by using their services. Similarly, the current legal structure regarding privacy and assumption of risk can be interpreted to mean that users of search engines cannot expect privacy with regard to the data they communicate by using search engines.


Electronic Communications Privacy Act

The Electronic Communications Privacy Act (ECPA) of 1986 was passed by Congress in an effort to start creating a legal structure for privacy protections in the face of new forms of technology. It was by no means comprehensive, because current technologies raise considerations that Congress never imagined in 1986 and could not account for. The ECPA does little to regulate ISPs and mainly prevents government agencies from gathering information stored by ISPs without a warrant. What the ECPA does not do, unsurprisingly given that it was enacted before internet usage became common, is say anything about search engine privacy and the protections users are afforded in terms of their search queries.


''Gonzales v. Google Inc.''

The background of this 2006 case is that the government was trying to bolster its defense of the Child Online Protection Act (COPA). It was conducting a study to see how effective its filtering software was with regard to child pornography. To do this, the government subpoenaed search data from Google, AOL, Yahoo!, and Microsoft to use in its analysis and to show that people search for information that is potentially compromising to children. The search data the government wanted included both the URLs that appeared to users and the users' actual search queries. Of the search engines subpoenaed, only Google refused to comply, even after the request was reduced in size. Google claimed that handing over these logs would amount to handing over personally identifiable information and user identities. The court ruled that Google had to hand over 50,000 randomly selected URLs to the government but not search queries, because disclosing queries could sow public distrust of the company and therefore compromise its business.


Law of Confidentiality

While not a strictly defined law enacted by Congress, the Law of Confidentiality is common law that protects information shared by a party who trusts, and expects privacy from, the party they share the information with. If the content of search queries and the logs in which they are stored is thought of in the same manner as information shared with a physician, as it is similarly confidential, then it ought to be afforded the same privacy protections.


Europe


''Google Spain v. AEPD''

In 2014, the European Court of Justice ruled in the ''Google Spain SL v. Agencia Española de Protección de Datos'' case that European Union citizens have the "Right to Be Forgotten". While this single court decision did not directly establish the "right to be forgotten", the court interpreted existing law to mean that people have the right to request that some information about them be removed from the search results provided by search engine companies like Google. The background of this case is that one Spanish citizen, Mario Costeja Gonzalez, set out to erase himself from Google's search results because they revealed potentially compromising information about his past debts. In ruling in favor of Mario Costeja Gonzalez, the court noted that search engines can significantly impact the privacy rights of many people and that Google controlled the dissemination of personal data. The decision did not claim that all citizens should be able to request that information about them be completely wiped from Google at any time, but rather that there are specific types of information, particularly information that is inadequate, irrelevant, or no longer relevant, that do not need to be so easily accessible through search engines.


General Data Protection Regulation (GDPR)

The GDPR is a European Union regulation that was put in place to protect the personal data and privacy of people in the European Union, regardless of where the organization processing their data is located. This means that companies around the globe that handle such data, including search engines, have had to comply with its rules so that the people it covers are afforded the proper protections. The regulation became enforceable in May 2018.

