An infobox is a digital or physical table used to collect and present a subset of information about its subject, such as a document. It is a structured document containing a set of attribute–value pairs, and in Wikipedia represents a summary of information about the subject of an article. In this way, infoboxes are comparable to data tables in some respects. When presented within the larger document it summarizes, an infobox is often presented in a sidebar format. An infobox may be implemented in another document by transcluding it into that document and specifying some or all of the attribute–value pairs associated with that infobox, known as parameterization.

Wikipedia

An infobox may be used to summarize the information of an article on Wikipedia. Infoboxes are used on similar articles to ensure consistency of presentation by using a common format. Originally, infoboxes (and templates in general) were used for page layout purposes. An infobox may be transcluded into an article by specifying the value for some or all of its parameters. The parameter name used must be the same as that specified in the infobox template, but any value may be associated with it. The name is delimited from the value by an equals sign. The parameter name may be regarded as an attribute of the article's subject.

Figure: Crostata infobox, February 2018
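The relationship between a template's declared parameters and the values an article supplies can be sketched in a few lines of Python. This is a minimal illustration and not MediaWiki's implementation: the template `DESSERT_TEMPLATE`, its parameter names, the supplied values, and the function `render_infobox` are all hypothetical, and the sketch assumes that a supplied name the template does not declare is simply ignored.

```python
# Hypothetical template: the parameter names it declares, in display order.
DESSERT_TEMPLATE = ["type", "place_of_origin", "main_ingredient"]


def render_infobox(template_params, supplied):
    """Return (name, value) rows for the parameters the template declares.

    Assumption of this sketch: supplied names that the template does not
    declare are ignored, and declared parameters left empty produce no row.
    """
    rows = []
    for name in template_params:
        value = supplied.get(name, "").strip()
        if value:
            rows.append((name, value))
    return rows


# Illustrative values only; "colour" is not declared by the template.
rows = render_infobox(
    DESSERT_TEMPLATE,
    {"type": "Tart", "place_of_origin": "Italy", "colour": "golden"},
)
for name, value in rows:
    print(f"{name}: {value}")
```

Only some of the declared parameters are given values here, which corresponds to specifying "some or all" of a template's parameters.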

On Wikipedia, an infobox is transcluded into an article by enclosing its name and attribute–value pairs within a double set of braces. The MediaWiki software on which Wikipedia operates then parses the document, and the infobox and other templates are processed by a template processor. This is a template engine which produces a web document and a style sheet used for presentation of the document. This enables the design of the infobox to be separated from the content it manipulates; that is, the design of the template may be updated without affecting the information within it, and the new design will automatically propagate to all articles that transclude the infobox.

Usually, infoboxes are formatted to appear in the top-right corner of a Wikipedia article in the desktop view, or at the top in the mobile view. Placement of an infobox within the wikitext of an article is important for accessibility. A best practice is to place infoboxes after ''disambiguation'' templates (those that direct readers to articles about topics with similar names) and maintenance templates (such as one marking an article as unreferenced), but before all other content.

Baeza-Yates and King say that some editors find templates such as infoboxes complicated, as the template may hide text about a property or resource that the editor wishes to change; this is exacerbated by chained templates, that is, templates transcluded within other templates.

As of August 2009, English Wikipedia used about infobox templates that collectively used more than attributes. Since then, many have been merged to reduce redundancy. As of June 2013, there were at least transclusions of the parent Infobox template, used by some, but not all, infoboxes, on articles.

The name of an infobox is typically "Infobox [genre]"; however, widely used infoboxes may be assigned shorter names, such as "taxobox" for taxonomy.
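As a rough illustration of the syntax described in this section, the following Python sketch reads a flat infobox call into its template name and attribute–value pairs. It is not the MediaWiki parser: the function `parse_infobox` and the sample text are hypothetical, and the sketch assumes parameters are separated by the pipe character and that values contain no nested templates or piped links, which real wikitext frequently does.

```python
def parse_infobox(wikitext):
    """Parse a flat {{Name | param = value | ...}} call into (name, params)."""
    text = wikitext.strip()
    if not (text.startswith("{{") and text.endswith("}}")):
        raise ValueError("expected text enclosed in a double set of braces")
    # Drop the enclosing braces, then split on the pipe separator.
    parts = text[2:-2].split("|")
    name = parts[0].strip()
    params = {}
    for part in parts[1:]:
        # The parameter name is delimited from its value by the first equals sign.
        key, sep, value = part.partition("=")
        if sep:
            params[key.strip()] = value.strip()
    return name, params


# Illustrative wikitext; the parameter names and values are assumptions.
SAMPLE = """{{Infobox food
| name = Crostata
| type = Tart
}}"""

name, params = parse_infobox(SAMPLE)
print(name)
print(params)
```

Splitting on every pipe is the main simplification; handling chained templates would require tracking brace depth while scanning.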

Machine learning

About 44.2% of Wikipedia articles contained an infobox in 2008, and about 33% in 2010. Automated semantic knowledge extraction using machine learning algorithms is used to "extract machine-processable information at a relatively low complexity cost". However, the low coverage makes extraction more difficult, though this can be partially overcome by complementing article data with that in the categories in which the article is included. The French Wikipedia initiated the project ''Infobox Version 2'' in May 2011; it is hosted on the French Wikipedia page Projet:Infobox/V2.

Knowledge obtained by machine learning can be used to improve an article, such as by using automated software suggestions to editors for adding infobox data. The iPopulator project created a system to add a value to an article's infobox parameter via automated parsing of the text of that article.

DBpedia uses structured content extracted from infoboxes by machine learning algorithms to create a resource of linked data in the Semantic Web; it has been described by Tim Berners-Lee as "one of the more famous" components of the linked data project. Machine extraction creates a triple consisting of a subject, a predicate or relation, and an object. Each attribute–value pair of the infobox is used to create a Resource Description Framework (RDF) statement using an ontology. This is facilitated by the narrower gap between Wikipedia and an ontology than exists between unstructured or free text and an ontology. The semantic relationship between the subject and object is established by the predicate. The article's topic is used as the subject, the parameter name is used as the predicate, and the parameter's value as the object. In the example infobox, the triple ("crostata", type, "tart") indicates that a crostata is a type of tart.

Each type of infobox is mapped to an ontology class, and each property (parameter) within an infobox is mapped to an ontology property. These mappings are used when parsing a Wikipedia article to extract data.
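The mapping from infobox parameters to triples can be sketched as follows. This is a simplified illustration and not DBpedia's extraction framework: the mapping tables `CLASS_MAP` and `PROPERTY_MAP`, the class name "Food", the infobox name, and the function `extract_triples` are all hypothetical, and parameters without a mapped ontology property are assumed to be skipped.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


# Hypothetical mappings: infobox type -> ontology class, and
# infobox parameter -> ontology property for that infobox type.
CLASS_MAP = {"Infobox food": "Food"}
PROPERTY_MAP = {"Infobox food": {"type": "type", "main_ingredient": "ingredient"}}


def extract_triples(subject, infobox_name, params):
    """Return the mapped ontology class and one triple per mapped parameter."""
    ontology_class = CLASS_MAP.get(infobox_name)
    property_map = PROPERTY_MAP.get(infobox_name, {})
    triples = []
    for name, value in params.items():
        predicate = property_map.get(name)
        # Skip parameters with no mapped property or with an empty value.
        if predicate and value:
            triples.append(Triple(subject, predicate, value))
    return ontology_class, triples


ontology_class, triples = extract_triples(
    "crostata",
    "Infobox food",
    {"type": "tart", "caption": "A crostata"},  # "caption" has no mapping
)
print(ontology_class)
for triple in triples:
    print(triple)
```

The article topic supplies the subject of every triple, so only the predicate and object vary from one parameter to the next.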

Citations

Works cited

