Karen Spärck Jones

Karen Spärck Jones was a computer science researcher and innovator who pioneered inverse document frequency (IDF), a term-weighting method used by search engines. While many early information scientists and computer engineers focused on developing programming languages and coding computer systems, Spärck Jones thought it more beneficial to develop information retrieval systems that could understand human language.


Background

Karen Spärck Jones was born in Huddersfield, Yorkshire, England in 1935 and completed her university education at Girton College, Cambridge. Although she did not study computer science, she began her research career in a niche organization known as the Cambridge Language Research Unit (CLRU). Through her work at the CLRU, Spärck Jones began pursuing her Ph.D. At the time of submission, her Ph.D. thesis was cast aside as uninspired and lacking original thought, but it was later published in its entirety as a book.


Professional Career

After completing her Ph.D., Spärck Jones continued to research language computation techniques. Using unofficial connections she made through her marriage to Roger Needham in 1958, she was able to continue her work on refined term clustering in language and information retrieval. This study, combined with some of her husband's published work, enabled Spärck Jones to develop her method of inverse document frequency.

Very soon after she published her first paper on IDF in 1972, the use of IDF in laboratory research began gaining traction, and laboratories such as the Vector Space Laboratory at Cornell had already made it a primary part of their procedures.


Innovative Legacy

Karen Spärck Jones's discoveries helped pave the way for modern information retrieval, which allows search engines to quickly identify the most relevant results and curate millions of responses to internet queries.
The problem that she solved seemed at the time to be specific to one vein of academic research in Information Retrieval, then a branch of Computer Information Systems (CIS). With the creation and widespread use of the internet in the years after her paper was published, however, it is now clear that Spärck Jones had a lasting impact on the general public as well.

Although direct implementations of IDF have given way to more sophisticated and complex designs that evolved in the decades since Spärck Jones's research, her original papers are among the most cited in the field of CIS. In this sense, the innovation made possible by Spärck Jones is not a direct invention but a foundation, laid through years of study and work at a time when her efforts were overlooked because her standing as a woman was not respected.

The impact that Karen Spärck Jones left on the world lies in the current reliance on the internet and the World Wide Web for all information needs. In the digital age, it is expected that questions should be answered through a simple search and retrieval process in which results are curated so that the first listed answer solves the user's problem. The quantity of data will continue to grow, but in this era its utilization and analysis are of utmost importance.


Footnotes

(Bowles, 2019) (Robertson & Tait, 2008) (Tait, 2007) (A Brief History, n.d.)


References

Bowles, N. (2019, January 2). ''Overlooked no more: Karen Sparck Jones, who established the basis for search engines''. The New York Times. Retrieved November 28, 2022, from https://www.nytimes.com/2019/01/02/obituaries/karen-sparck-jones-overlooked.html
Robertson, S., & Tait, J. (2008). Karen Spärck Jones. Journal of the American Society for Information Science & Technology, 59(5), 852–854. https://doi-org.ezproxy.neu.edu/10.1002/asi.20784
Tait, J. I. (2007). Karen Spärck Jones. Computational Linguistics, 33(3), 289–291. https://doi-org.ezproxy.neu.edu/10.1162/coli.2007.33.3.289
University System of Georgia. (n.d.). ''A Brief History of the Internet''. Online Library Learning Center. Retrieved November 28, 2022, from https://www.usg.edu/galileo/skills/about_ollc_site.phtml


Karen Spärck Jones (26 August 1935 – 4 April 2007) was a pioneering British computer scientist responsible for the concept of
inverse document frequency (IDF), a technology that underlies most modern search engines. In 2019, ''The New York Times'' published her belated obituary in its series ''Overlooked'', calling her "a pioneer of computer science for work combining statistics and linguistics, and an advocate for women in the field." Since 2008, the Karen Spärck Jones Award has recognized her achievements in information retrieval (IR) and natural language processing (NLP); it is given annually to a new recipient with outstanding research in one or both of her fields.


Early life and education

Karen Ida Boalth Spärck Jones was born in Huddersfield, Yorkshire, England. Spärck Jones was educated at a grammar school in Huddersfield and then from 1953 to 1956 at Girton College, Cambridge, studying history, with an additional final year in Moral Sciences (philosophy). She briefly became a school teacher, before moving into computer science.


Career

Spärck Jones worked at the Cambridge Language Research Unit from the late 1950s, then at Cambridge University Computer Laboratory from 1974 until her retirement in 2002. From 1999 she held the post of Professor of Computers and Information. Prior to 1999 she was employed on a series of short-term contracts. She continued to work in the Computer Laboratory until shortly before her death. Her publications include nine books and numerous papers. A full list of her publications is available from the Cambridge Computer Laboratory. Her main research interests, since the late 1950s, were natural language processing and information retrieval. One of her most important contributions was the concept of inverse document frequency (IDF) weighting in information retrieval, which she introduced in a 1972 paper. IDF is used in most search engines today, usually as part of the term frequency–inverse document frequency (TF–IDF) weighting scheme. In 1982 she became involved in the Alvey Programme.
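
As an illustration of the IDF and TF–IDF weighting mentioned above, the following is a minimal Python sketch. It assumes the common textbook form idf(t) = log(N / n_t), where N is the number of documents in the collection and n_t is the number of documents containing term t. The function names and the toy corpus are hypothetical and are not taken from Spärck Jones's 1972 paper; production search engines typically use smoothed or otherwise modified variants.

```python
import math
from collections import Counter


def inverse_document_frequency(documents):
    """Return {term: log(N / n_t)} for a list of tokenised documents.

    Assumption: plain log(N / n_t) with no smoothing.
    """
    n_docs = len(documents)
    doc_freq = Counter()
    for tokens in documents:
        # Count each term at most once per document.
        doc_freq.update(set(tokens))
    return {term: math.log(n_docs / df) for term, df in doc_freq.items()}


def tf_idf(tokens, idf):
    """Weight each term in one document by its raw count times its IDF."""
    counts = Counter(tokens)
    return {term: count * idf.get(term, 0.0) for term, count in counts.items()}


# Hypothetical toy corpus of three short documents.
corpus = [
    "the cat sat on the mat".split(),
    "the dog chased the cat".split(),
    "the bird sang".split(),
]

idf = inverse_document_frequency(corpus)
weights = tf_idf(corpus[0], idf)
```

In this sketch a term that occurs in every document, such as "the", receives an IDF of zero and so contributes nothing to the weighting, while a term confined to a single document receives the highest IDF in the collection.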


Honours and awards

An annual Karen Spärck Jones Award and lecture is named in her honour. In August 2017, the University of Huddersfield renamed one of its campus buildings in her honour. Formerly known as Canalside West, the Spärck Jones building houses the University's School of Computing and Engineering. Other honours and awards include:
* Elected a Fellow of the British Academy (FBA), where she also served as Vice-President in 2000–2002
* Elected a Fellow of the Association for the Advancement of Artificial Intelligence (AAAI) in 1993
* Fellow of the European Association for Artificial Intelligence (ECCAI)
* President of the Association for Computational Linguistics (ACL) in 1994
* Gerard Salton Award (1988)
* Association for Information Science and Technology (ASIS&T) Award of Merit (2002)
* Association for Computational Linguistics (ACL) Lifetime Achievement Award (2004)
* BCS Lovelace Medal (2007)
* ACM-AAAI Allen Newell Award (2006)
* Association for Computing Machinery (ACM) Women's Group Athena Award (2007)


Personal life

Spärck Jones married fellow Cambridge computer scientist Roger Needham in 1958.

