
A search engine is a software system that is designed to carry out web searches. Search engines search the World Wide Web in a systematic way for particular information specified in a textual web search query. The search results are generally presented in a line of results, often referred to as search engine results pages (SERPs). The information may be a mix of links to web pages, images, videos, infographics, articles, research papers, and other types of files. Some search engines also mine data available in databases or open directories. Unlike web directories, which are maintained only by human editors, search engines also maintain real-time information by running an algorithm on a web crawler. Internet content that cannot be searched by a web search engine is generally described as the deep web.
History
Pre-1990s
A system for locating published information, intended to overcome the ever-increasing difficulty of locating information in ever-growing centralized indices of scientific work, was described in 1945 by Vannevar Bush, who wrote an article in The Atlantic Monthly titled "As We May Think" in which he envisioned libraries of research with connected annotations not unlike modern hyperlinks. Link analysis would eventually become a crucial component of search engines through algorithms such as Hyper Search and PageRank.
1990s: Birth of search engines
The first internet search engines predate the debut of the Web in December 1990: WHOIS user search dates back to 1982, and the Knowbot Information Service multi-network user search was first implemented in 1989. The first well-documented search engine that searched content files, namely FTP files, was Archie, which debuted on 10 September 1990.
Prior to September 1993, the World Wide Web was entirely indexed by hand. There was a list of webservers edited by Tim Berners-Lee and hosted on the CERN webserver. One snapshot of the list in 1992 remains, but as more and more web servers went online the central list could no longer keep up. On the NCSA site, new servers were announced under the title "What's New!".
The first tool used for searching content (as opposed to users) on the Internet was Archie. ["Internet History - Search Engines" (from Search Engine Watch), Universiteit Leiden, Netherlands, September 2001.] The name stands for "archive" without the "v". It was created by Alan Emtage, a computer science student at McGill University in Montreal, Quebec, Canada. The program downloaded the directory listings of all the files located on public anonymous FTP (File Transfer Protocol) sites, creating a searchable database of file names; however, the Archie search engine did not index the contents of these sites since the amount of data was so limited it could be readily searched manually.
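The idea of a searchable database of file names can be illustrated with a minimal Python sketch. It is not Archie's actual implementation: the host names, paths, and the functions `build_filename_index` and `search_filenames` are hypothetical, and the listings are hard-coded rather than downloaded over FTP.

```python
from collections import defaultdict
import posixpath

# Hypothetical directory listings, standing in for what would be
# gathered from public anonymous FTP sites.
LISTINGS = {
    "ftp.example.org": ["/pub/gnu/emacs-18.59.tar.Z", "/pub/README"],
    "ftp.example.net": ["/mirrors/emacs-18.59.tar.Z", "/docs/rfc959.txt"],
}


def build_filename_index(listings):
    """Map each lowercased file name to the (host, path) pairs where it appears."""
    index = defaultdict(list)
    for host, paths in listings.items():
        for path in paths:
            name = posixpath.basename(path).lower()
            index[name].append((host, path))
    return index


def search_filenames(index, term):
    """Return sorted (host, path) pairs whose file name contains the term.

    Only names are matched; directory names and file contents are never examined.
    """
    term = term.lower()
    hits = []
    for name, locations in index.items():
        if term in name:
            hits.extend(locations)
    return sorted(hits)
```

Calling `search_filenames(build_filename_index(LISTINGS), "emacs")` returns both hosts that list the archive, while a term that occurs only in a directory name or inside a file finds nothing, which mirrors the name-only limitation described above.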
The rise of Gopher (created in 1991 by Mark McCahill at the University of Minnesota) led to two new search programs, Veronica and Jughead. Like Archie, they searched the file names and titles stored in Gopher index systems. Veronica (Very Easy Rodent-Oriented Net-wide Index to Computerized Archives) provided a keyword search of most Gopher menu titles in the entire Gopher listings. Jughead (Jonzy's Universal Gopher Hierarchy Excavation And Display) was a tool for obtaining menu information from specific Gopher servers. While the name of the search engine "Archie" was not a reference to the Archie comic book series, "Veronica" and "Jughead" are characters in the series, thus referencing their predecessor.
In the summer of 1993, no search engine existed for the web, though numerous specialized catalogues were maintained by hand. Oscar Nierstrasz at the University of Geneva wrote a series of Perl scripts that periodically mirrored these pages and rewrote them into a standard format. This formed the basis for W3Catalog, the web's first primitive search engine, released on September 2, 1993.
In June 1993, Matthew Gray, then at MIT, produced what was probably the first web robot, the Perl-based World Wide Web Wanderer, and used it to generate an index called "Wandex". The purpose of the Wanderer was to measure the size of the World Wide Web, which it did until late 1995. The web's second search engine, Aliweb, appeared in November 1993. Aliweb did not use a web robot, but instead depended on being notified by website administrators of the existence at each site of an index file in a particular format.
JumpStation (created in December 1993 by Jonathon Fletcher) used a web robot to find web pages and to build its index, and used a web form as the interface to its query program. It was thus the first WWW resource-discovery tool to combine the three essential features of a web search engine (crawling, indexing, and searching) as described below. Because of the limited resources available on the platform it ran on, its indexing and hence searching were limited to the titles and headings found in the web pages the crawler encountered.
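The three features named above (crawling, indexing, and searching) can be sketched in a few lines of Python. This is an illustrative toy under stated assumptions, not JumpStation's code: the "web" is a hypothetical in-memory dictionary `PAGES` instead of real HTTP fetches, and the names `TitleHeadingParser`, `crawl_and_index`, and `search_titles` are invented for the example. As in the description, only titles and headings are indexed.

```python
from collections import defaultdict, deque
from html.parser import HTMLParser
import re

# Hypothetical two-page web; a real crawler would fetch these over HTTP.
PAGES = {
    "http://a.example/": (
        "<html><head><title>Physics Home</title></head><body>"
        "<h1>Particle physics</h1><p>Body text about gophers.</p>"
        "<a href='http://b.example/'>next</a></body></html>"
    ),
    "http://b.example/": (
        "<title>Library index</title><h2>Physics preprints</h2>"
        "<a href='http://a.example/'>back</a>"
        "<a href='http://missing.example/'>dead</a>"
    ),
}


class TitleHeadingParser(HTMLParser):
    """Collects text inside <title> and <h1>..<h6>, plus href targets of <a> tags."""

    INDEXED = {"title", "h1", "h2", "h3", "h4", "h5", "h6"}

    def __init__(self):
        super().__init__()
        self.depth = 0   # > 0 while inside a title or heading element
        self.text = []   # text fragments found in titles and headings
        self.links = []  # outgoing link targets

    def handle_starttag(self, tag, attrs):
        if tag in self.INDEXED:
            self.depth += 1
        elif tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

    def handle_endtag(self, tag):
        if tag in self.INDEXED and self.depth > 0:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth > 0:
            self.text.append(data)


def crawl_and_index(pages, seed):
    """Breadth-first crawl from seed; index words from titles and headings only."""
    index = defaultdict(set)
    seen, queue = {seed}, deque([seed])
    while queue:
        url = queue.popleft()
        html = pages.get(url)
        if html is None:  # unreachable page: skip it
            continue
        parser = TitleHeadingParser()
        parser.feed(html)
        for word in re.findall(r"[a-z0-9]+", " ".join(parser.text).lower()):
            index[word].add(url)
        for link in parser.links:
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return index


def search_titles(index, query):
    """Return the sorted URLs whose titles or headings contain every query word."""
    words = re.findall(r"[a-z0-9]+", query.lower())
    if not words:
        return []
    return sorted(set.intersection(*(index.get(w, set()) for w in words)))
```

With this data, a query for "physics" matches both pages, while "gophers" matches nothing because that word occurs only in body text, which the index never sees.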
One of the first "all text" crawler-based search engines was WebCrawler, which came out in 1994. Unlike its predecessors, it allowed users to search for any word in any webpage, which has become the standard for all major search engines since. It was also widely known by the public. Also in 1994, Lycos (which started at Carnegie Mellon University) was launched and became a major commercial endeavor.
The first popular search engine on the Web was Yahoo! Search. The first product from Yahoo!, founded by Jerry Yang and David Filo in January 1994, was a Web directory called Yahoo! Directory. In 1995, a search function was added, allowing users to search Yahoo! Directory. It became one of the most popular ways for people to find web pages of interest, but its search function operated on its web directory, rather than on full-text copies of web pages.
Soon after, a number of search engines appeared and vied for popularity. These included Magellan, Excite, Infoseek, Inktomi, Northern Light, and AltaVista. Information seekers could also browse the directory instead of doing a keyword-based search.
In 1996, Robin Li developed the RankDex site-scoring algorithm for search engine results page ranking ["About: RankDex", rankdex.com] and received a US patent for the technology. It was the first search engine that used hyperlinks to measure the quality of websites it was indexing, predating the very similar algorithm patent filed by Google two years later in 1998. Larry Page referenced Li's work in some of his U.S. patents for PageRank. Li later used his RankDex technology for the Baidu search engine, which he founded in China and launched in 2000.
In 1996, Netscape was looking to give a single search engine an exclusive deal as the featured search engine on Netscape's web browser. There was so much interest that instead Netscape struck deals with five of the major search engines: for $5 million a year, each search engine would be in rotation on the Netscape search engine page. The five engines were Yahoo!, Magellan, Lycos, Infoseek, and Excite.
Google adopted the idea of selling search terms in 1998, from a small search engine company named goto.com. This move had a significant effect on the search engine business, which went from struggling to one of the most profitable businesses on the Internet.
Search engines were also known as some of the brightest stars in the Internet investing frenzy that occurred in the late 1990s. Several companies entered the market spectacularly, receiving record gains during their initial public offerings. Some have taken down their public search engine, and are marketing enterprise-only editions, such as Northern Light. Many search engine companies were caught up in the dot-com bubble, a speculation-driven market boom that peaked in March 2000.
2000s–present: Post dot-com bubble
Around 2000, Google's search engine rose to prominence. The company achieved better results for many searches with an algorithm called PageRank, as was explained in the paper "Anatomy of a Search Engine" written by Sergey Brin and Larry Page, the later founders of Google. This iterative algorithm ranks web pages based on the number and PageRank of other web sites and pages that link there, on the premise that good or desirable pages are linked to more than others. Larry Page's patent for PageRank cites Robin Li's earlier RankDex patent as an influence.
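The iterative idea can be sketched as a simple power iteration in Python. This is a textbook-style illustration rather than Google's production algorithm: the function name `pagerank`, the toy graph `LINKS`, the damping factor of 0.85, the fixed iteration count, and the even redistribution of rank from pages without outgoing links are all assumptions made for the example.

```python
# Hypothetical link graph: each page maps to the pages it links to.
LINKS = {
    "a": ["c"],
    "b": ["c"],
    "c": ["a"],
    "d": ["c"],
}


def pagerank(links, damping=0.85, iterations=50):
    """Return a dict of page -> rank computed by repeated redistribution.

    Each round, every page passes a damped share of its current rank to the
    pages it links to, so a page's score depends on both the number and the
    rank of the pages linking to it.
    """
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    n = len(pages)

    rank = {page: 1.0 / n for page in pages}  # start from a uniform guess
    for _ in range(iterations):
        # Baseline share every page receives regardless of incoming links.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # Assumption: a page with no outgoing links spreads its rank evenly.
                share = damping * rank[page] / n
                for target in pages:
                    new_rank[target] += share
        rank = new_rank
    return rank
```

In the toy graph, page "c" is linked from three pages and ends up with the highest score. Page "a" has only one incoming link but outranks "b" and "d" because that link comes from the highly ranked "c".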
Google also maintained a minimalist interface to its search engine. In contrast, many of its competitors embedded a search engine in a web portal. In fact, the Google search engine became so popular that spoof engines emerged such as Mystery Seeker.
By 2000, Yahoo! was providing search services based on Inktomi's search engine. Yahoo! acquired Inktomi in 2002, and Overture (which owned AlltheWeb and AltaVista) in 2003. Yahoo! switched to Google's search engine and used it until 2004, when it launched its own search engine based on the combined technologies of its acquisitions.
Microsoft first launched MSN Search in the fall of 1998 using search results from Inktomi. In early 1999 the site began to display listings from Looksmart, blended with results from Inktomi. For a short time in 1999, MSN Search used results from AltaVista instead. In 2004, Microsoft began a transition to its own search technology, powered by its own web crawler (called msnbot).
Microsoft's rebranded search engine, Bing, was launched on June 1, 2009. On July 29, 2009, Yahoo! and Microsoft finalized a deal in which Yahoo! Search would be powered by Microsoft Bing technology.
As of 2019, active search engine crawlers include those of Google, Petal, Sogou, Baidu, Bing, Gigablast, Mojeek, DuckDuckGo and Yandex.
Approach
A search engine maintains the following processes in near real time:
# Web crawling
# Indexing
# Searching
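The three processes listed above can be illustrated with a toy pipeline. This is only a minimal sketch, not how any production search engine works: the in-memory `PAGES` dictionary stands in for the Web, and the names `crawl`, `build_index` and `search`, as well as the example URLs, are hypothetical and chosen purely for illustration.

```python
from collections import defaultdict, deque

# Hypothetical in-memory "web": URL -> (page text, outgoing links).
PAGES = {
    "a.example/": ("web search engines crawl pages", ["a.example/about", "b.example/"]),
    "a.example/about": ("about the crawl and index pipeline", []),
    "b.example/": ("an index maps words to pages", ["a.example/"]),
}

def crawl(seed, pages):
    """Web crawling: follow links breadth-first from the seed URL."""
    seen, queue, fetched = {seed}, deque([seed]), {}
    while queue:
        url = queue.popleft()
        if url not in pages:
            continue  # dead link: nothing to fetch
        text, links = pages[url]
        fetched[url] = text
        for link in links:
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return fetched

def build_index(fetched):
    """Indexing: build an inverted index mapping each word to the URLs containing it."""
    index = defaultdict(set)
    for url, text in fetched.items():
        for word in text.lower().split():
            index[word].add(url)
    return index

def search(index, query):
    """Searching: return the URLs that contain every word of the query."""
    words = query.lower().split()
    if not words:
        return []
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return sorted(results)

index = build_index(crawl("a.example/", PAGES))
print(search(index, "crawl"))        # ['a.example/', 'a.example/about']
print(search(index, "index pages"))  # ['b.example/']
```

A real engine runs these stages continuously and at far larger scale, and ranks results rather than returning them in sorted order; the sketch only shows how the output of each stage feeds the next.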
Web search engines get their information by web crawling from site to site. The "spider" checks for the standard filename ''robots.txt'', addressed to it. The robots.txt file contains directives for search spiders, telling them which pages to crawl and which pages not to crawl. After checking for robots.txt and either finding it or not, the spider sends certain information back to be