WebCite was an on-demand archive site designed to digitally preserve scientific and educationally important material on the web by taking snapshots of Internet content as it existed at the time a blogger or scholar cited or quoted from it. The preservation service kept claims supported by cited sources verifiable even after the original web pages had been revised or removed, or had disappeared for other reasons, an effect known as link rot.

Service features

WebCite allowed for preservation of all types of web content, including HTML web pages, PDF files, style sheets, JavaScript and digital images. It also archived metadata about the collected resources, such as access time, MIME type, and content length.

WebCite was a non-profit consortium supported by publishers and editors, and individuals could use it without charge. It was one of the first services to offer on-demand archiving of pages, a feature later adopted by many other archiving services, such as archive.today and the Wayback Machine. It did not crawl web pages.
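The metadata named in this section (access time, MIME type, and content length) can be illustrated with a minimal Python sketch. It is not WebCite's actual schema or code: the `SnapshotRecord` class, the `make_record` function, and the field names are hypothetical, and the MIME type is assumed to come from an HTTP `Content-Type` header.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class SnapshotRecord:
    """Hypothetical metadata record for one archived resource."""
    url: str
    accessed_at: datetime   # when the snapshot was taken
    mime_type: str          # e.g. "text/html" or "application/pdf"
    content_length: int     # size of the archived body in bytes


def make_record(url: str, body: bytes, content_type_header: str,
                accessed_at: Optional[datetime] = None) -> SnapshotRecord:
    """Build a record from a fetched body and its Content-Type header."""
    # Drop parameters such as "; charset=UTF-8" to keep only the media type.
    mime_type = content_type_header.split(";", 1)[0].strip().lower()
    return SnapshotRecord(
        url=url,
        accessed_at=accessed_at or datetime.now(timezone.utc),
        mime_type=mime_type,
        content_length=len(body),
    )
```

Storing the access time with each record is what lets a citation point to the page as it existed at the moment it was cited.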

History

Conceived in 1997 by Gunther Eysenbach, WebCite was publicly described the following year, when an article on Internet quality control declared that such a service could also measure the citation impact of web pages. In the next year, a pilot service was set up at the address webcite.net. Although the need for WebCite seemed to decrease when Google began offering ''short term'' copies of web pages through Google Cache and the Internet Archive expanded its crawling (which started in 1996), WebCite was the only one allowing "on-demand" archiving by users. WebCite also offered interfaces to scholarly journals and publishers to automate the archiving of cited links. By 2008, over 200 journals had begun routinely using WebCite.

WebCite was formerly a member of the International Internet Preservation Consortium. In response to a 2012 message on Twitter relating to WebCite's former membership of the consortium, Eysenbach commented that "WebCite has no funding, and IIPC charges €4000 per year in annual membership fees."

WebCite said that it "feeds its content" to other digital preservation projects, including the Internet Archive. Lawrence Lessig, an American academic who writes extensively on copyright and technology, used WebCite in his ''amicus'' brief in the Supreme Court of the United States case of ''MGM Studios, Inc. v. Grokster, Ltd.''

Sometime between July 9 and 17, 2019, WebCite stopped accepting new archiving requests. In a further outage, as of October 29, 2021, all previously archived content was no longer available, and only the home page still worked.

Fundraising

WebCite ran a fund-raising campaign using FundRazr from January 2013 with a target of $22,500, a sum which its operators stated was needed to maintain and modernize the service beyond the end of 2013. This included relocating the service to Amazon EC2 cloud hosting and legal support. It remained undecided whether WebCite would continue as a non-profit or as a for-profit entity.

Business model

The term "WebCite" is a registered trademark. WebCite did not charge individual users, journal editors and publishers any fee to use its service. It earned revenue from publishers who wanted to "have their publications analyzed and cited webreferences archived". Early support was from the University of Toronto.

Copyright issues

WebCite maintained the legal position that its archiving activities were allowed by the copyright doctrines of fair use and implied license. To support the fair use argument, WebCite noted that its archived copies were transformative, socially valuable for academic research, and not harmful to the market value of any copyrighted work. WebCite argued that caching and archiving web pages was not considered a copyright infringement when the archiver offers the copyright owner an opportunity to "opt-out" of the archive system, thus creating an implied license. To that end, WebCite would not archive in violation of Web site "do-not-cache" and "no-archive" metadata, as well as robot exclusion standards, the absence of which creates an "implied license" for web archive services to preserve the content.

In a similar case involving Google's web caching activities, on January 19, 2006, the United States District Court for the District of Nevada agreed with that argument in the case of ''Field v. Google'' (CV-S-04-0413-RCJ-LRL), holding that fair use and an "implied license" meant that Google's caching of Web pages did not constitute copyright violation. The "implied license" referred to general Internet standards.
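The opt-out signals described in this section can be sketched in Python using only the standard library. This is an illustration under stated assumptions, not WebCite's implementation: the function name `may_archive` and the user-agent string `example-archiver` are hypothetical, the "no-archive" signal is assumed to be a `noarchive` token in a `<meta name="robots">` tag, and "do-not-cache" HTTP headers are not checked.

```python
from html.parser import HTMLParser
from urllib.robotparser import RobotFileParser


class _RobotsMetaParser(HTMLParser):
    """Collects the directives of <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() == "robots":
            for token in attr.get("content", "").split(","):
                self.directives.add(token.strip().lower())


def may_archive(url, html_text, robots_txt, user_agent="example-archiver"):
    """Return True only if neither robots.txt nor a noarchive tag opts out."""
    robots = RobotFileParser()
    robots.parse(robots_txt.splitlines())  # parse supplied text, no network access
    if not robots.can_fetch(user_agent, url):
        return False
    meta = _RobotsMetaParser()
    meta.feed(html_text)
    # Assumption: "noarchive" is the opt-out token; absence is treated as permission.
    return "noarchive" not in meta.directives
```

Returning `True` when no opt-out is found mirrors the implied-license reasoning: permission is inferred from the absence of an exclusion rather than from an explicit grant.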

DMCA requests

According to its policy, after receiving a legitimate DMCA request from a copyright holder, WebCite would remove saved pages from public access, as the archived pages were still under the safe harbor of being citations. The pages were moved to a "dark archive", and in cases of legal controversies or evidence requests, pay-per-view access to the copyrighted content was available for "$200 (up to 5 snapshots) plus $100 for each further 10 snapshots".
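The quoted fee schedule can be worked through with a short Python sketch. The function name `dark_archive_fee` is hypothetical, and the rounding rule is an assumption: the quoted policy does not say how a partial block of ten further snapshots was billed, so each started block is charged in full here.

```python
def dark_archive_fee(snapshots: int) -> int:
    """Fee in US dollars for pay-per-view access to `snapshots` dark-archive copies."""
    if snapshots < 1:
        raise ValueError("at least one snapshot must be requested")
    extra = max(0, snapshots - 5)   # snapshots beyond the first five
    blocks = -(-extra // 10)        # ceiling division: started blocks of ten (assumption)
    return 200 + 100 * blocks
```

Under this reading, a request for up to 5 snapshots costs $200, 6 to 15 snapshots cost $300, and 16 to 25 snapshots cost $400.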
