HOME

TheInfoList



OR:

A bibliogram is a graphical representation of the frequency of certain target words, usually noun phrases, in a given text. The term was introduced in 2005 by Howard D. White to name the linguistic object studied, but not previously named, in
informetrics Informetrics is the study of quantitative aspects of information, it is an extension and evolution of traditional bibliometrics and scientometrics. Informetrics uses bibliometrics and scientometrics methods to study mainly the problems of literat ...
,
scientometrics Scientometrics is the field of study which concerns itself with measuring and analysing scholarly literature. Scientometrics is a sub-field of informetrics. Major research issues include the measurement of the impact of research papers and academi ...
and
bibliometrics Bibliometrics is the use of statistical methods to analyse books, articles and other publications, especially in regard with scientific contents. Bibliometric methods are frequently used in the field of library and information science. Biblio ...
. The noun phrases in the
ranking A ranking is a relationship between a set of items such that, for any two items, the first is either "ranked higher than", "ranked lower than" or "ranked equal to" the second. In mathematics, this is known as a weak order or total preorder of o ...
may be authors, journals, subject headings, or other indexing terms. The "stretches of text” may be a book, a set of related articles, a subject bibliography, a set of Web pages, and so on. Bibliograms are always generated from writings, usually from scholarly or scientific literature.


Definition

A bibliogram is verbal construct made when
noun phrase In linguistics, a noun phrase, or nominal (phrase), is a phrase that has a noun or pronoun as its head or performs the same grammatical function as a noun. Noun phrases are very common cross-linguistically, and they may be the most frequently oc ...
s from extended stretches of text are ranked high to low by their
frequency Frequency is the number of occurrences of a repeating event per unit of time. It is also occasionally referred to as ''temporal frequency'' for clarity, and is distinct from ''angular frequency''. Frequency is measured in hertz (Hz) which is eq ...
of
co-occurrence In linguistics, co-occurrence or cooccurrence is an above-chance frequency of occurrence of two terms (also known as coincidence or concurrence) from a text corpus alongside each other in a certain order. Co-occurrence in this linguistic sense can ...
with one or more user-supplied seed terms. Each bibliogram has three components: * A seed term that sets a
context Context may refer to: * Context (language use), the relevant constraints of the communicative situation that influence language use, language variation, and discourse summary Computing * Context (computing), the virtual environment required to su ...
. * Words that co-occur with the seed across some set of records. * Counts (frequencies) by which co-occurring words can be ordered high to low. As a family of term-frequency distributions, the bibliogram has frequently been written about under descriptions such as: * positive skew distribution * empirical hyperbolic * scale-free (see also
Scale-free network A scale-free network is a network whose degree distribution follows a power law, at least asymptotically. That is, the fraction ''P''(''k'') of nodes in the network having ''k'' connections to other nodes goes for large values of ''k'' as : P(k) ...
) *
power law In statistics, a power law is a Function (mathematics), functional relationship between two quantities, where a Relative change and difference, relative change in one quantity results in a proportional relative change in the other quantity, inde ...
*
size frequency distribution Size in general is the Magnitude (mathematics), magnitude or dimensions of a thing. More specifically, ''geometrical size'' (or ''spatial size'') can refer to linear dimensions (length, width, height, diameter, perimeter), area, or volume ...
* reverse-J It is sometimes called a "core and scatter" distribution. The "core" consists of relatively few top-ranked terms that account for a disproportionately large share of co-occurrences overall. The "scatter” consists of relatively many lower-ranked terms that account for the remaining share of co-occurrences. Usually the top-ranked terms are not tied in frequency, but identical frequencies and tied ranks become more common as the frequencies get smaller. At the bottom of the distribution, a long tail of terms are tied in rank because each co-occurs with the seed term only once. In most cases bibliograms can be described by
power law In statistics, a power law is a Function (mathematics), functional relationship between two quantities, where a Relative change and difference, relative change in one quantity results in a proportional relative change in the other quantity, inde ...
s such as Zipf's law and
Bradford's law Bradford's law is a pattern first described by Samuel C. Bradford in 1934 that estimates the exponentially diminishing returns of searching for references in science journals. One formulation is that if journals in a field are sorted by number of ...
. In this regard, they have long been studied by mathematicians and statisticians in information science. However, these treatments typically ignore the qualitative meanings of the ranked terms themselves, which are often of interest in their own right. For example, the following bibliogram was made with an author's name as seed and shows the descriptors that co-occur with her name in the
ERIC The given name Eric, Erich, Erikk, Erik, Erick, or Eirik is derived from the Old Norse name ''Eiríkr'' (or ''Eríkr'' in Old East Norse due to monophthongization). The first element, ''ei-'' may be derived from the older Proto-Norse ''* ain ...
database. The descriptors are ranked by how many of her articles they were used to index: 6 Creativity 4 Creativity Tests 3 Divergent Thinking 2 Elementary School Mathematics 2 Instruction 2 Mathematics Education 2 Problem Solving 2 Research 2 Time 1 Acceleration 1 Anxiety 1 Beginning Teachers 1 Behavioral Objectives 1 Child Development 1 Classroom Techniques 1 Cognitive Development etc. This author is a researcher in education, and it will be seen that the terms profile her intellectual interests over the years. In general, bibliograms can be used to: * suggest additional terms for search strategies * characterize the work of scholars, scientists, or institutions * show who an author cites over time * show who cites an author over time * show the other authors with whom an author is co-cited over time * show the subjects associated with a journal or an author * show the authors, organizations, or journals associated with a subject * show library classification codes associated with subject headings and vice versa * show the popularity of items in the collections of libraries * model the structure of literatures with title terms, descriptors, author names, journal names Bibliograms can be created with the RANK command on Dialog (other vendors have similar commands), ranking options within
WorldCat WorldCat is a union catalog that itemizes the collections of tens of thousands of institutions (mostly libraries), in many countries, that are current or past members of the OCLC global cooperative. It is operated by OCLC, Inc. Many of the OCL ...
,
HistCite HistCite is a software package used for bibliometric analysis and information visualization. It was developed by Eugene Garfield, the founder of the Institute for Scientific Information and the inventor of important information retrieval tools such ...
,
Google Scholar Google Scholar is a freely accessible web search engine that indexes the full text or metadata of scholarly literature across an array of publishing formats and disciplines. Released in beta in November 2004, the Google Scholar index includes p ...
, and inexpensive content analysis software. White suggests that bibliograms have a parallel construct in what he calls ''associograms''. These are the rank-ordered lists of word association norms studied in
psycholinguistics Psycholinguistics or psychology of language is the study of the interrelation between linguistic factors and psychological aspects. The discipline is mainly concerned with the mechanisms by which language is processed and represented in the mind ...
. They are similar to bibliograms in statistical structure but are not generated from writings. Rather, they are generated by presenting panels of people with a stimulus term (which functions like a seed term) and tabulating the words they associate with the seed by frequency of co-occurrence. They are currently of interest to information scientists as a nonstandard way of creating thesauri for document retrieval.


Examples

Other examples of bibliograms are the ordered set of an author's
co-authors Collaborative writing, or collabwriting is a method of group work that takes place in the workplace and in the classroom. Researchers expand the idea of collaborative writing beyond groups working together to complete a writing task. Collaboration ...
or the list of authors that are published in a specific journal together with their number of articles. A popular example is the list of additional titles to consider for purchase that you get when you search an item in
Amazon Amazon most often refers to: * Amazons, a tribe of female warriors in Greek mythology * Amazon rainforest, a rainforest covering most of the Amazon basin * Amazon River, in South America * Amazon (company), an American multinational technology c ...
. These suggested titles are the top terms in the "core" of a bibliogram formed with your search term as seed. The frequencies are counts of the times they have been co-purchased with the seed. Examples of associagrams may be found in th
Edinburgh Associative Thesaurus


Other methods

Similar but different methods are used in
data clustering Cluster analysis or clustering is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some sense) to each other than to those in other groups (clusters). It is a main task of ...
and data mining
Google Sets
does also create list of associated terms to a given set of terms.


See also

*
Bibliographic coupling Bibliographic coupling, like co-citation, is a similarity measure that uses citation analysis to establish a similarity relationship between documents. Bibliographic coupling occurs when two works reference a common third work in their bibliographi ...
*
Co-citation Co-citation is the frequency with which two documents are ''cited'' together by other documents.. If at least one other document cites two documents in common, these documents are said to be ''co-cited''. The more co-citations two documents recei ...


References

* Howard D. White (2005): ''On Extending Informetrics: An Opinion Paper''. In: Proceedings of the 10th International Congress of the International Society for Scientometrics and Informetrics. Stockholm p. 442-449 Bibliometrics