Document Structuring





Document Structuring
Document structuring is a subtask of natural language generation (NLG) which involves deciding the order and grouping (for example into paragraphs) of sentences in a generated text. It is closely related to the content determination NLG task.
Example
Assume we have four sentences which we want to include in a generated text:
1. It will rain on Saturday
2. It will be sunny on Sunday
3. Max temperature will be 10 °C on Saturday
4. Max temperature will be 15 °C on Sunday
There are 24 (4!) orderings of these messages, including:
* (1234) It will rain on Saturday. It will be sunny on Sunday. Max temperature will be 10 °C on Saturday. Max temperature will be 15 °C on Sunday.
* (2341) It will be sunny on Sunday. Max temperature will be 10 °C on Saturday. Max temperature will be 15 °C on Sunday. It will rain on Saturday.
* (4321) Max temperature will be 15 °C on Sunday. Max temperature will be 10 °C on Saturday. It will be sunny on Sunday. It will ra ...
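The count of 24 orderings can be checked by enumerating the permutations directly. The following is a minimal sketch, not part of the original article; the names `messages`, `all_orderings`, and `render` are illustrative assumptions.

```python
from itertools import permutations

# The four messages from the example, keyed by their number in the list.
messages = {
    1: "It will rain on Saturday.",
    2: "It will be sunny on Sunday.",
    3: "Max temperature will be 10 °C on Saturday.",
    4: "Max temperature will be 15 °C on Sunday.",
}


def all_orderings(msgs):
    """Return every ordering of the message numbers as a list of tuples."""
    return list(permutations(sorted(msgs)))


def render(order, msgs):
    """Join the messages in the given order into a single text."""
    return " ".join(msgs[i] for i in order)


orderings = all_orderings(messages)
print(len(orderings))                    # number of candidate orderings
print(render((2, 3, 4, 1), messages))    # the (2341) ordering
```

Enumeration only lists the candidates; choosing among them is the document structuring task itself.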


Natural Language Generation
Natural language generation (NLG) is a software process that produces natural language output. In one of the most widely cited surveys of NLG methods, NLG is characterized as "the subfield of artificial intelligence and computational linguistics that is concerned with the construction of computer systems that can produce understandable texts in English or other human languages from some underlying non-linguistic representation of information". While it is widely agreed that the output of any NLG process is text, there is some disagreement on whether the inputs of an NLG system need to be non-linguistic. Common applications of NLG methods include the production of various reports, for example weather and patient reports; image captions; and chatbots. Automated NLG can be compared to the process humans use when they turn ideas into writing or speech. Psycholinguists prefer the term language production for this process, which can also be described in mathematical terms, or modeled in ...


Content Determination
Content determination is the subtask of natural language generation (NLG) that involves deciding on the information to be communicated in a generated text. It is closely related to the task of document structuring.
Example
Consider an NLG system which summarises information about sick babies. Suppose this system has four pieces of information it can communicate:
1. The baby is being given morphine via an IV drip
2. The baby's heart rate shows bradycardias (temporary drops)
3. The baby's temperature is normal
4. The baby is crying
Which of these bits of information should be included in the generated texts?
Issues
There are three general issues which almost always impact the content determination task, and can be illustrated with the above example. Perhaps the most fundamental issue is the communicative goal of the text, i.e. its purpose and reader. In the above example, for instance, a doctor who wants to make a decision about medical treatment would probably be most int ...


Text Corpus
In linguistics, a corpus (plural corpora) or text corpus is a language resource consisting of a large and structured set of texts (nowadays usually electronically stored and processed). In corpus linguistics, corpora are used for statistical analysis and hypothesis testing, checking occurrences or validating linguistic rules within a specific language territory. In search technology, a corpus is the collection of documents which is being searched.
Overview
A corpus may contain texts in a single language (monolingual corpus) or text data in multiple languages (multilingual corpus). In order to make the corpora more useful for doing linguistic research, they are often subjected to a process known as annotation. An example of annotating a corpus is part-of-speech tagging, or POS-tagging, in which information about each word's part of speech (verb, noun, adjective, etc.) is added to the corpus in the form o ...
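One simple way to picture a POS-annotated corpus is as a list of sentences in which each word is paired with a tag. The sketch below is an assumption made for illustration: the two sentences, the tag labels (`DET`, `NOUN`, `VERB`), and the helper names are hand-written and do not come from any particular corpus or tagset.

```python
from collections import Counter

# Hypothetical, hand-tagged mini corpus: each sentence is a list of
# (word, tag) pairs. The tag labels are illustrative only.
annotated_corpus = [
    [("The", "DET"), ("dog", "NOUN"), ("barks", "VERB")],
    [("A", "DET"), ("cat", "NOUN"), ("sleeps", "VERB")],
]


def tag_counts(corpus):
    """Count how often each part-of-speech tag occurs in the corpus."""
    return Counter(tag for sentence in corpus for _, tag in sentence)


def words_with_tag(corpus, wanted):
    """Return all words carrying the given tag, in corpus order."""
    return [word for sentence in corpus for word, tag in sentence if tag == wanted]


print(tag_counts(annotated_corpus))
print(words_with_tag(annotated_corpus, "NOUN"))
```

With the annotation in place, questions such as "which nouns occur in the corpus" reduce to simple lookups over the tags.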




Automatic Summarisation
Automatic summarization is the process of shortening a set of data computationally, to create a subset (a summary) that represents the most important or relevant information within the original content. Artificial intelligence algorithms are commonly developed and employed to achieve this, specialized for different types of data. Text summarization is usually implemented by natural language processing methods, designed to locate the most informative sentences in a given document. On the other hand, visual content can be summarized using computer vision algorithms. Image summarization is the subject of ongoing research; existing approaches typically attempt to display the most representative images from a given image collection, or generate a video that only includes the most important content from the entire collection. Video summarization algorithms identify and extract from the original video content the most important frames (''key-frames''), and/or the most important video s ...
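As a rough illustration of locating the most informative sentences, the sketch below scores each sentence by the average document-wide frequency of its words and keeps the top-scoring ones. This word-frequency heuristic is a stand-in chosen for brevity, not the method the article describes; the function name `summarize` and the sample text are assumptions.

```python
import re
from collections import Counter


def summarize(text, n=1):
    """Return the n highest-scoring sentences, in their original order.

    A sentence's score is the mean frequency (counted over the whole
    text) of the words it contains. This is a simple heuristic only.
    """
    # Naive sentence split on whitespace that follows ., ! or ?
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    freq = Counter(re.findall(r"[a-z']+", text.lower()))

    def score(sentence):
        tokens = re.findall(r"[a-z']+", sentence.lower())
        return sum(freq[t] for t in tokens) / len(tokens) if tokens else 0.0

    ranked = sorted(range(len(sentences)), key=lambda i: score(sentences[i]), reverse=True)
    return [sentences[i] for i in sorted(ranked[:n])]


sample = "Rain is expected. Rain and more rain will fall on Saturday. Sunday is dry."
print(summarize(sample, n=1))
```

Because the result is a subset of the original sentences, this is an extractive summary in the sense of the definition above.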



Narrative
A narrative, story, or tale is any account of a series of related events or experiences, whether nonfictional (memoir, biography, news report, documentary, travelogue, etc.) or fictional (fairy tale, fable, legend, thriller, novel, etc.). Narratives can be presented through a sequence of written or spoken words, through still or moving images, or through any combination of these. The word derives from the Latin verb ''narrare'' (to tell), which is derived from the adjective ''gnarus'' (knowing or skilled). Narration (i.e., the process of presenting a narrative), broadly defined, is one of four rhetorical modes of discourse, paralleling argumentation, description, and exposition. More narrowly defined, it is the fiction-writing mode in which a narrator communicates directly to an audience. The school of literary criticism known as Russian formalism has applied metho ...



Computational Linguistics
Computational linguistics is an interdisciplinary field concerned with the computational modelling of natural language, as well as the study of appropriate computational approaches to linguistic questions. In general, computational linguistics draws upon linguistics, computer science, artificial intelligence, mathematics, logic, philosophy, cognitive science, cognitive psychology, psycholinguistics, anthropology and neuroscience, among others.
Sub-fields and related areas
Traditionally, computational linguistics emerged as an area of artificial intelligence performed by computer scientists who had specialized in the application of computers to the processing of a natural language. With the formation of the Association for Computational Linguistics (ACL) and the establishment of independent conference series, the field consolidated during the 1970s and 1980s. The Association for Computational Linguistics defines computational linguistics as: The term "comp ...



Natural Language Processing
Natural language processing (NLP) is an interdisciplinary subfield of linguistics, computer science, and artificial intelligence concerned with the interactions between computers and human language, in particular how to program computers to process and analyze large amounts of natural language data. The goal is a computer capable of "understanding" the contents of documents, including the contextual nuances of the language within them. The technology can then accurately extract information and insights contained in the documents as well as categorize and organize the documents themselves. Challenges in natural language processing frequently involve speech recognition, natural-language understanding, and natural-language generation.
History
Natural language processing has its roots in the 1950s. Already in 1950, Alan Turing published an article titled "Computing Machinery and Intelligence" which proposed what is now called the Turing test as a criterion of intelligence, t ...