''YaCy'' (pronounced “ya see”) is a free distributed search engine built on the principles of peer-to-peer (P2P) networks, created by Michael Christen in 2003. The engine is written in Java and distributed on several hundred computers, so-called YaCy-peers. Each YaCy-peer independently crawls the Internet, analyzes and indexes the web pages it finds, and stores the indexing results in a common database (the so-called index), which is shared with other YaCy-peers using peer-to-peer principles. It is a search engine that anyone can use to build a search portal for their intranet or to help search the public internet. Compared to semi-distributed search engines, the YaCy network has a fully distributed architecture: all YaCy-peers are equal and no central server exists. It can be run either in a crawling mode or as a local proxy server, indexing web pages visited by the person running YaCy on their computer. Several mechanisms are provided to protect the user's privacy. Search is accessed through a locally run web server, which provides a search box for entering search terms and returns results in a format similar to that of other popular search engines.
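As an illustration of querying the locally run web server from a script, the following is a minimal sketch. It assumes a peer listening at `http://localhost:8090` and a JSON search endpoint named `yacysearch.json` that accepts `query` and `maximumRecords` parameters; none of these details are stated in this article, so check them against your own installation before relying on them.

```python
import json
import urllib.parse
import urllib.request

# Assumption: address of a locally running YaCy peer (not stated in the article).
DEFAULT_BASE = "http://localhost:8090"


def build_search_url(query, base=DEFAULT_BASE, max_records=10):
    """Build a search URL for the assumed JSON endpoint of a local peer."""
    params = urllib.parse.urlencode(
        {"query": query, "maximumRecords": max_records}
    )
    return f"{base}/yacysearch.json?{params}"


def search(query, base=DEFAULT_BASE, max_records=10):
    """Send the query to the local peer and return the decoded JSON reply.

    Requires a running peer; the structure of the reply is not assumed here.
    """
    url = build_search_url(query, base, max_records)
    with urllib.request.urlopen(url, timeout=10) as response:
        return json.load(response)
```

Calling `build_search_url("peer to peer")` only constructs the URL, so it can be inspected without a running peer; `search(...)` performs the actual request.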


System components

The YaCy search engine is based on four elements:
;Crawler: A search robot that traverses web pages and analyzes their content.
;Indexer: Creates a reverse word index (RWI), in which each word has its own list of relevant URLs and ranking information. Words are stored in the form of word hashes.
;Search and administration interface: A web interface provided by a local HTTP servlet with a servlet engine.
;Data storage: Stores the reverse word index database using a distributed hash table.
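To make the indexer and data-storage roles more concrete, here is a small sketch of a reverse word index keyed by word hashes, plus a simple rule for choosing which peer stores a given hash. The hash function (truncated SHA-256), the term-count ranking, and the ring-based peer selection are all illustrative assumptions; the article does not describe how YaCy actually computes hashes or assigns them to peers.

```python
import hashlib
import re
from collections import defaultdict
from typing import Dict, List, Tuple


def word_hash(word: str) -> str:
    """Hypothetical word hash: first 12 hex digits of SHA-256 (an assumption)."""
    return hashlib.sha256(word.lower().encode("utf-8")).hexdigest()[:12]


def tokenize(text: str) -> List[str]:
    """Split text into lowercase alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_rwi(pages: Dict[str, str]) -> Dict[str, Dict[str, int]]:
    """Build a reverse word index: word hash -> {url: occurrences of the word}."""
    rwi: Dict[str, Dict[str, int]] = defaultdict(dict)
    for url, text in pages.items():
        for word in tokenize(text):
            key = word_hash(word)
            rwi[key][url] = rwi[key].get(url, 0) + 1
    return dict(rwi)


def lookup(rwi: Dict[str, Dict[str, int]], word: str) -> List[Tuple[str, int]]:
    """Return (url, count) pairs for a word, most frequent first."""
    postings = rwi.get(word_hash(word), {})
    return sorted(postings.items(), key=lambda item: (-item[1], item[0]))


def responsible_peer(key: str, peer_ids: List[str]) -> str:
    """Pick the peer that stores a word hash in a toy hash ring.

    Peers are placed on the ring by hashing their IDs; the first peer at or
    after the key owns it, wrapping around to the start of the ring.
    """
    if not peer_ids:
        raise ValueError("at least one peer is required")
    ring = sorted((word_hash(peer), peer) for peer in peer_ids)
    for peer_hash, peer in ring:
        if peer_hash >= key:
            return peer
    return ring[0][1]
```

For example, indexing two pages and calling `lookup(rwi, "search")` returns the URLs containing that word, ordered by how often it occurs, while `responsible_peer(word_hash("search"), peers)` names the peer that would hold that entry in this toy scheme.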


Search-engine technology

* ''YaCy is a complete search appliance with user interface, index, administration and monitoring.''
* YaCy harvests web pages with a web crawler. Documents are then parsed and indexed, and the search index is stored locally. If your peer is part of a peer network, your local search index is also merged into the shared index for that network.
* When a search is started, the local index contributes results together with a global search index from peers in the YaCy search network.
* The YaCy Grid is a second-generation implementation of the YaCy peer-to-peer search. A YaCy Grid installation consists of microservices that communicate using the Master Connect Program (MCP).
* The YaCy Parser is a microservice that can be deployed using Docker. When the Parser component is started, it searches for an MCP and connects to it. By default, the local host is searched for an MCP, but you can configure one yourself.
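The way local and peer results come together at search time can be sketched as follows. This is only an illustration: the result format (URL and score pairs) and the merge rule (keep the highest score per URL) are assumptions, not a description of YaCy's actual ranking.

```python
from typing import Dict, Iterable, List, Tuple

# Hypothetical result format: (url, relevance score).
Result = Tuple[str, float]


def merge_results(
    local: Iterable[Result],
    remote_lists: Iterable[Iterable[Result]],
    limit: int = 10,
) -> List[Result]:
    """Combine local results with result lists returned by other peers.

    Duplicate URLs are collapsed, keeping the highest score seen, and the
    merged list is sorted by descending score (ties broken by URL).
    """
    best: Dict[str, float] = {}
    for source in [local, *remote_lists]:
        for url, score in source:
            if url not in best or score > best[url]:
                best[url] = score
    ranked = sorted(best.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:limit]
```

Keeping the maximum score per URL is just one possible choice; summing scores or preferring local entries would be equally easy to express in the same structure.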


YaCy platform architecture

YaCy uses a combination of techniques for the networking, administration, and maintenance of the search engine's indexing, including blacklisting, moderation, and communication with the community. These operations are organized as follows:
* Community components
*# Web forum
*# Statistics
*# XML API
* Maintenance
*# Web Server
*# Indexing
*# Crawler with Balancer
*# Peer-to-Peer Server Communication
* Content organization
*# Blacklisting and Filtering
*# Search interface
*# Bookmarks
*# Monitoring search results


Distribution

YaCy is available in packages for Linux, Windows, and Macintosh, and also as a Docker image. YaCy can be installed on any other operating system either by compiling it manually or by using a tarball. YaCy requires Java 8; OpenJDK 8 is recommended. The Debian package can be installed from a repository available at a subdomain of the project's website. The package is not yet maintained in the official Debian package repository.


See also

* Dooble – an open-source web browser with an integrated YaCy Search Engine Tool Widget


Further reading

YaCy at LinuxReviews

