SiSU (SiSU information structuring universe, or Structured information, serialized units) is a Unix command line-oriented framework for document structuring, publishing and search.


Usage

Using markup applied to a document, or a collection of documents, SiSU can produce plain text, HTML, XHTML, EPUB, XML, OpenDocument, LaTeX or PDF files, and populate an SQL database.


Document structuring

SiSU offers its user a way to structure plain text and to add graphics, hyperlinks, endnotes, footnotes etc. with simple text editing programs such as Notepad (Windows), TextEdit (Mac) or Gedit (Linux). The lightweight markup language is mnemonic and human readable. To process the marked-up document(s) with SiSU, the user issues a command on the command line of a computer terminal. The output can be generated in multiple formats (HTML, PDF, EPUB, and others) with a single command.
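
The single-source, multiple-output idea can be illustrated with a toy sketch. It does not use SiSU's markup, its processing engine, or its command-line options; the function names and the blank-line paragraph convention are assumptions made only for illustration.

```python
from html import escape


def split_blocks(source: str) -> list[str]:
    """Split source text into paragraph-sized blocks separated by blank lines."""
    return [" ".join(block.split()) for block in source.split("\n\n") if block.strip()]


def to_plain_text(blocks: list[str]) -> str:
    """Render the blocks as plain text, one blank line between paragraphs."""
    return "\n\n".join(blocks) + "\n"


def to_html(blocks: list[str]) -> str:
    """Render the same blocks as a minimal HTML page."""
    body = "\n".join(f"<p>{escape(block)}</p>" for block in blocks)
    return f"<html><body>\n{body}\n</body></html>\n"


# Hypothetical registry: one renderer per output format.
RENDERERS = {"txt": to_plain_text, "html": to_html}


def build_all(source: str) -> dict[str, str]:
    """Parse the source once and produce every registered output format."""
    blocks = split_blocks(source)
    return {ext: render(blocks) for ext, render in RENDERERS.items()}


outputs = build_all("First paragraph.\n\nSecond & last.")
```

Here the source is parsed once and each renderer consumes the same intermediate list, which is the general shape of a one-command, many-formats workflow.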


Publishing and self-publishing

A document, or a collection of documents, that has been processed by SiSU is technically ready to be published on the web or printed on paper. Canadian author Cory Doctorow, for instance, has used SiSU as a publishing tool and blogged about it. In a newspaper article, Doctorow called SiSU an "automated ebook workflow tool". Earlier examples of web publishing with SiSU are ''Projet de traité instituant l'Union Européenne / Draft Treaty Establishing the European Union'' and the novel ''Tainaron'' by Finnish author Leena Krohn.


Search

SiSU can populate an SQL database with ''objects'' (equating generally to paragraph-sized chunks) so searches may be performed and matches returned with that degree of granularity (e.g. your search criteria are met by these documents and at these locations within each document). Document output formats share a common object numbering system for locating content. This is particularly suitable for "published" works (finalized texts as opposed to works that are frequently changed or updated) for which it provides a fixed means of reference of content.
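
Object-level search can be sketched as follows. This is a minimal illustration using Python's built-in `sqlite3` module; the `objects` table, its `doc`, `ocn` and `body` columns, and the blank-line splitting rule are hypothetical and do not reflect SiSU's actual database schema.

```python
import sqlite3


def index_document(conn: sqlite3.Connection, doc_name: str, text: str) -> None:
    """Store each paragraph-sized object with a sequential object number."""
    blocks = [" ".join(b.split()) for b in text.split("\n\n") if b.strip()]
    conn.executemany(
        "INSERT INTO objects (doc, ocn, body) VALUES (?, ?, ?)",
        [(doc_name, number, body) for number, body in enumerate(blocks, start=1)],
    )


def search(conn: sqlite3.Connection, term: str) -> list[tuple[str, int]]:
    """Return (document, object number) pairs whose text contains the term."""
    rows = conn.execute(
        "SELECT doc, ocn FROM objects WHERE body LIKE ? ORDER BY doc, ocn",
        (f"%{term}%",),
    )
    return rows.fetchall()


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE objects (doc TEXT, ocn INTEGER, body TEXT, PRIMARY KEY (doc, ocn))"
)
index_document(conn, "treaty", "Article one concerns trade.\n\nArticle two concerns law.")
index_document(conn, "novel", "A city of insects.\n\nLetters about trade winds.")

hits = search(conn, "trade")
print(hits)  # [('novel', 2), ('treaty', 1)]
```

Because each match is reported as a document name plus an object number, the same number can locate the passage in any output format that carries that numbering.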


History

SiSU has been under development since 1997, and has been written in Ruby since 2000. It was released under the GPL in January 2005. SiSU developed out of work, begun in 1993, on documents related to (primarily private) international commercial law and international trade law, on a site known then as Ananse and more recently as LexMercatoria. SiSU's first open-source release was on January 5, 2005, and its first release to Debian was in July 2005. SiSU version 1 was released in December 2009, and version 2 in March 2010. Version 2 features a new processing engine. Markup remains substantially identical between versions, apart from changes to the markup for document headers (which contain document metadata and processing instructions). Both the version 1 and version 2 text processing engines are available in the version 2 tarball. Development takes place on the version 2 branch. Version 1 remains available to guarantee compatibility with older prepared texts (those prepared before the document headers were updated), and as an earlier reference implementation.


External links

* SiSU original homepage