SiSU (SiSU information structuring universe, or Structured information, serialized units) is a Unix command-line-oriented framework for document structuring, publishing and search.
Usage
Using markup applied to a document, or a collection of documents, SiSU can produce plain text, HTML, XHTML, EPUB, XML, OpenDocument, LaTeX or PDF files, and populate an SQL database.
Document structuring
SiSU offers its user a way to structure plain text and to add graphics, hyperlinks, endnotes, footnotes etc. with simple text editing programs such as Notepad (Windows), TextEdit (Mac) or Gedit (Linux). The lightweight markup language is mnemonic and human readable.
To process the marked-up document(s) with SiSU, the user issues a command via the command line of the computer terminal. The output can be generated in multiple formats (HTML, PDF, EPUB, and others) with a single command.
Publishing and self-publishing
A document, or a collection of documents, which has been processed by SiSU is technically ready to be published on the web, or printed on paper. Canadian author Cory Doctorow, for instance, has used SiSU as a publishing tool and blogged about it. In a newspaper article, Doctorow called SiSU an "automated ebook workflow tool".
Earlier examples of web publishing with SiSU are ''Projet de traité instituant l'Union Européenne / Draft Treaty Establishing the European Union'' and the novel ''Tainaron'' by Finnish author Leena Krohn.
Search
SiSU can populate an SQL database with ''objects'' (equating generally to paragraph-sized chunks), so that searches may be performed and matches returned with that degree of granularity (e.g. your search criteria are met by these documents and at these locations within each document). Document output formats share a common object numbering system for locating content. This is particularly suitable for "published" works (finalized texts, as opposed to works that are frequently changed or updated), for which it provides a fixed means of referring to content.
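The object-level search described above can be illustrated with a minimal sketch. This is not SiSU's actual schema or code: the table layout, the column names (`doc`, `object_number`, `body`), the function names, and the sample texts are all assumptions made for illustration, and blank-line paragraph splitting stands in for SiSU's own parsing of its markup.

```python
import sqlite3


def split_into_objects(text):
    """Split text into paragraph-sized objects, numbered from 1."""
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
    return list(enumerate(paragraphs, start=1))


def build_index(documents):
    """Load every object of every document into an in-memory SQLite table."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical schema: one row per numbered object.
    conn.execute(
        "CREATE TABLE objects ("
        "doc TEXT, object_number INTEGER, body TEXT, "
        "PRIMARY KEY (doc, object_number))"
    )
    for name, text in documents.items():
        conn.executemany(
            "INSERT INTO objects (doc, object_number, body) VALUES (?, ?, ?)",
            [(name, number, body) for number, body in split_into_objects(text)],
        )
    return conn


def search(conn, term):
    """Return (document, object number) pairs whose text contains the term."""
    rows = conn.execute(
        "SELECT doc, object_number FROM objects "
        "WHERE body LIKE ? ORDER BY doc, object_number",
        (f"%{term}%",),
    )
    return rows.fetchall()


# Made-up sample texts, two objects each.
documents = {
    "treaty": "Article 1 covers trade.\n\nArticle 2 covers arbitration.",
    "novel": "A letter about insects.\n\nA letter about trade winds.",
}
conn = build_index(documents)
hits = search(conn, "trade")
print(hits)
```

Because each match is reported as a document name plus an object number, the same number could in principle be used to locate the passage in any output format that carries the shared numbering.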
History
SiSU has been under development since 1997, and has been written in Ruby since 2000. It was released under the GPL in January 2005. SiSU developed out of an earlier project, started in 1993, on documents related to (primarily private) international commercial law and international trade law, on a site known then as Ananse and more recently as LexMercatoria.
SiSU's first open source release was on January 5, 2005, and its first release to Debian was in July 2005. SiSU version 1 was released in December 2009, and version 2 in March 2010. Version 2 features a new processing engine. Markup remains substantially identical between versions, apart from changes to the markup for document headers (which contain document metadata and processing instructions). Both the version 1 and version 2 text processing engines are available in the version 2 tarball. Development takes place on the version 2 branch. Version 1 remains available to guarantee compatibility with older prepared texts (prior to the updating of document headers), and as an earlier reference implementation.
External links
* SiSU original homepage