Nokogiri (software)
Nokogiri is an open-source software library for parsing HTML and XML in Ruby. It depends on libxml2 and libxslt to provide its functionality.

Overview
It markets itself as providing a sensible, easy-to-understand API for reading, writing, modifying, and querying documents. It is available for Ruby as well as for Java through JRuby. It provides fast, standards-compliant parsing by relying on native parsers such as libxml2 (CRuby) and Xerces (JRuby). It is one of the most downloaded Ruby gems, having been downloaded over 550 million times from the rubygems.org repository.

Features
* DOM Parser for XML, HTML4, and HTML5
* SAX Parser for XML and HTML4
* Push Parser for XML and HTML4
* Document search via XPath 1.0
* Document search via CSS3 selectors
* XSD Schema validation
* XSLT


GitHub
GitHub, Inc. is an Internet hosting service for software development and version control using Git. It provides the distributed version control of Git plus access control, bug tracking, software feature requests, task management, continuous integration, and wikis for every project. Headquartered in California, it has been a subsidiary of Microsoft since 2018. It is commonly used to host open-source software development projects. As of June 2022, GitHub reported having over 83 million developers and more than 200 million repositories, including at least 28 million public repositories. It is the largest source code host.

History
GitHub.com
Development of the GitHub.com platform began on October 19, 2007. The site was launched in April 2008 by Tom Preston-Werner, Chris Wanstrath, P. J. Hyett and Scott Chacon, after it had been available for a few months as a beta release. GitHub has an annual keynote called GitHub Universe. Organizational ...


Libxml2
libxml2 is a software library for parsing XML documents. It is also the basis for the libxslt library, which processes XSLT 1.0 stylesheets.

Description
Written in the C programming language, libxml2 provides bindings to C++, Ch, XSH, C#, Python, Kylix/Delphi and other Pascals, Ruby, Perl, Common Lisp, and PHP. It was originally developed for the GNOME project, but can be used outside it. libxml2's code is highly portable, since it depends on standard ANSI C libraries only, and it is released under the MIT license. This library was written by Daniel Veillard and receives active feedback from its users. It includes the command-line utility xmllint and an HTML parser.

See also
* libxslt (libxml2's XSLT module)
* XML validation
* Comparison of HTML parsers
* Expat (library)
* Saxon XSLT
Saxon is an XSLT and XQuery processor created by Michael Kay and now developed and maintained by his company, Saxonica. There are open-source and also closed-source commercial versi ...


XML Parsers
Extensible Markup Language (XML) is a markup language and file format for storing, transmitting, and reconstructing arbitrary data. It defines a set of rules for encoding documents in a format that is both human-readable and machine-readable. XML is defined by the World Wide Web Consortium's XML 1.0 Specification of 1998 and several other related specifications, all of them free open standards. The design goals of XML emphasize simplicity, generality, and usability across the Internet. It is a textual data format with strong support via Unicode for different human languages. Although the design of XML focuses on documents, the language is widely used for the representation of arbitrary data structures such as those used in web services. Several schema systems exist to aid in the definition of XML-based languages, while programmers have developed many application programming interfaces (APIs) to aid the processing of XML data.

Overview
The main purpose of XML is serialization, i. ...


XSLT
XSLT (Extensible Stylesheet Language Transformations) is a language originally designed for transforming XML documents into other XML documents, or into other formats such as HTML for web pages, plain text, or XSL Formatting Objects, which may subsequently be converted to other formats such as PDF, PostScript and PNG. Support for JSON and plain-text transformation was added in later updates to the XSLT 1.0 specification. As of August 2022, the most recent stable version of the language is XSLT 3.0, which achieved Recommendation status in June 2017. XSLT 3.0 implementations support Java, .NET, C/C++, Python, PHP and NodeJS. An XSLT 3.0 JavaScript library can also be hosted within the web browser. Modern web browsers also include native support for XSLT 1.0. In an XSLT transformation, the original document is not changed; rather, a new document is created based on the content of the existing one. Typically, input documents are XML files, but anything from which the processo ...




XML Schema (W3C)
XSD (XML Schema Definition), a recommendation of the World Wide Web Consortium (W3C), specifies how to formally describe the elements in an Extensible Markup Language (XML) document. Programmers can use it to verify each piece of item content in a document and ensure that it adheres to the description of the element in which it is placed. Like all XML schema languages, XSD can be used to express a set of rules to which an XML document must conform to be considered "valid" according to that schema. However, unlike most other schema languages, XSD was also designed with the intent that determination of a document's validity would produce a collection of information adhering to specific data types. Such a post-validation infoset can be useful in the development of XML document processing software.

History
XML Schema, published as a W3C recommendation in May 2001, is one of several XML schema languages. It was the first separate schema language for XML to achieve Recommendation stat ...


XPath
XPath (XML Path Language) is an expression language designed to support the query or transformation of XML documents. It was defined by the World Wide Web Consortium (W3C) and can be used to compute values (e.g., strings, numbers, or Boolean values) from the content of an XML document. Support for XPath exists in applications that support XML, such as web browsers, and in many programming languages.

Overview
The XPath language is based on a tree representation of the XML document, and provides the ability to navigate around the tree, selecting nodes by a variety of criteria. In popular use (though not in the official specification), an XPath expression is often referred to simply as "an XPath". Originally motivated by a desire to provide a common syntax and behavior model between XPointer and XSLT, subsets of the XPath query language are used in other W3C specifications such as XML Schema, XForms and the Internationalization Tag Set (ITS). XPath has been adopted by a number of ...
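
To make the idea of selecting nodes from a tree by a path expression concrete, here is a minimal sketch in Python. It uses the standard library's xml.etree.ElementTree, which implements only a limited subset of XPath, as a stand-in; it is not Nokogiri's API. The sample document and the function name titles_in_language are invented for this illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, invented for this illustration.
XML = """<library>
  <book lang="en"><title>Dune</title></book>
  <book lang="fr"><title>Candide</title></book>
  <book lang="en"><title>Emma</title></book>
</library>"""


def titles_in_language(xml_text, lang):
    """Return the titles of all <book> elements whose lang attribute matches."""
    root = ET.fromstring(xml_text)
    # Path expression: every <book> descendant with the given attribute
    # value, then its <title> child. ElementTree supports this subset.
    path = f".//book[@lang='{lang}']/title"
    return [node.text for node in root.findall(path)]


if __name__ == "__main__":
    print(titles_in_language(XML, "en"))
```

The predicate in square brackets filters nodes by attribute value, and the trailing step moves from each matching book to its title child. A full XPath 1.0 engine accepts many more axes and functions than this subset does.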


Simple API For XML
SAX (Simple API for XML) is an event-driven online algorithm for parsing XML documents, with an API developed by the XML-DEV mailing list. SAX provides a mechanism for reading data from an XML document that is an alternative to that provided by the Document Object Model (DOM). Where the DOM operates on the document as a whole, building the full abstract syntax tree of an XML document for convenience of the user, SAX parsers operate on each piece of the XML document sequentially, issuing parsing events while making a single pass through the input stream.

Definition
Unlike DOM, there is no formal specification for SAX. The Java implementation of SAX is considered to be normative. SAX processes documents state-independently, in contrast to DOM, which is used for state-dependent processing of XML documents.

Benefits
A SAX parser only needs to report each parsing event as it happens, and normally discards almost all of that information once reported (it does, however, keep some ...
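
The event-driven, single-pass style described above can be sketched with Python's standard xml.sax module, used here as a stand-in rather than as Nokogiri's SAX interface. The handler class TitleCollector, the function collect_titles, and the sample document are hypothetical names introduced only for this example.

```python
import xml.sax

# Hypothetical sample document, invented for this illustration.
SAMPLE = "<library><book><title>Dune</title></book><book><title>Candide</title></book></library>"


class TitleCollector(xml.sax.ContentHandler):
    """Collects the text of every <title> element as parsing events arrive."""

    def __init__(self):
        super().__init__()
        self.titles = []
        self._in_title = False
        self._buffer = []

    def startElement(self, name, attrs):
        # Event fired when an opening tag is read.
        if name == "title":
            self._in_title = True
            self._buffer = []

    def characters(self, content):
        # Text may arrive in several chunks, so accumulate it.
        if self._in_title:
            self._buffer.append(content)

    def endElement(self, name):
        # Event fired when a closing tag is read.
        if name == "title":
            self.titles.append("".join(self._buffer))
            self._in_title = False


def collect_titles(xml_text):
    """Parse xml_text in one pass and return the collected titles."""
    handler = TitleCollector()
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return handler.titles


if __name__ == "__main__":
    print(collect_titles(SAMPLE))
```

The parser never builds a tree: the handler keeps only the small amount of state it needs (a flag and a text buffer), which is why this style suits large inputs.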


Document Object Model
The Document Object Model (DOM) is a cross-platform and language-independent interface that treats an XML or HTML document as a tree structure wherein each node is an object representing a part of the document. The DOM represents a document with a logical tree. Each branch of the tree ends in a node, and each node contains objects. DOM methods allow programmatic access to the tree; with them one can change the structure, style or content of a document. Nodes can have event handlers attached to them. Once an event is triggered, the event handlers get executed. The principal standardization of the DOM was handled by the World Wide Web Consortium (W3C), which last developed a recommendation in 2004. WHATWG took over the development of the standard, publishing it as a living document. The W3C now publishes stable snapshots of the WHATWG standard. In the HTML DOM, every element is a node:
* A document is a document node.
* All HTML elements are element nodes.
* ...
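
As a rough illustration of programmatic access to a document tree, the following sketch uses Python's standard xml.dom.minidom, a small DOM implementation, as a stand-in. The function mark_and_append, the "seen" attribute, and the sample document are assumptions made up for this example.

```python
from xml.dom import minidom

# Hypothetical sample document, invented for this illustration.
SAMPLE = "<list><item>a</item><item>b</item></list>"


def mark_and_append(xml_text):
    """Parse xml_text into a tree, change existing nodes, and add a new one."""
    doc = minidom.parseString(xml_text)

    # Change content: set an attribute on every existing <item> element node.
    for item in doc.getElementsByTagName("item"):
        item.setAttribute("seen", "yes")

    # Change structure: create an element node with a text node child
    # and attach it under the root element.
    new_item = doc.createElement("item")
    new_item.appendChild(doc.createTextNode("c"))
    doc.documentElement.appendChild(new_item)
    return doc


if __name__ == "__main__":
    print(mark_and_append(SAMPLE).documentElement.toxml())
```

Unlike the event-driven approach, the whole document is held in memory as nodes, so any part of it can be revisited and modified after parsing.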


RubyGems
RubyGems is a package manager for the Ruby programming language that provides a standard format for distributing Ruby programs and libraries (in a self-contained format called a "gem"), a tool designed to easily manage the installation of gems, and a server for distributing them. It was created by Chad Fowler, Jim Weirich, David Alan Black, Paul Brannan and Richard Kilmer during RubyConf 2004. The interface for RubyGems is a command-line tool called gem, which can install and manage libraries (the gems). RubyGems integrates with the Ruby run-time loader to help find and load installed gems from standardized library folders. Though it is possible to use a private RubyGems repository, the public repository is most commonly used for gem management. The public repository helps users find gems, resolve dependencies and install them. RubyGems is bundled with the standard Ruby package as of Ruby 1.9.

History
Development on RubyGems started in November 2003 and was released to th ...




Apache Xerces
In computing, Xerces is Apache's collection of software libraries for parsing, validating, serializing and manipulating XML. The library implements a number of standard APIs for XML parsing, including DOM, SAX and SAX2. The implementation is available in the Java, C++ and Perl programming languages. The name "Xerces" is believed to commemorate the extinct Xerces blue butterfly (Glaucopsyche xerces).

Xerces language versions
There are several language versions of the Xerces parser:
* Xerces2 Java, the Java reference implementation
* Xerces C++, a C++ implementation
* Xerces Perl, a Perl implementation. This implementation is a wrapper around the C++ API.
Though technically marked active by Apache, there has been no major release in any language since 2010.

Features
The features supported by Xerces depend on the language, the Java version having the most features.

See also
* Apache License
* Java XML
The Java programming langua ...


CRuby
Matz's Ruby Interpreter, or Ruby MRI (also called CRuby), was the reference implementation of the Ruby programming language, named after Ruby creator Yukihiro Matsumoto ("Matz"). Until the specification of the Ruby language in 2011, the MRI implementation was considered the de facto reference, especially since an independent attempt to create the specification (RubySpec) had failed. Starting with Ruby 1.9, and continuing with Ruby 2.x and above, the official Ruby interpreter has been YARV ("Yet Another Ruby VM"). The latest stable version is Ruby 3.1.0.

History
Yukihiro Matsumoto ("Matz") started working on Ruby on February 24, 1993, and released it to the public in 1995. "Ruby" was named as a gemstone because of a joke within Matsumoto's circle of friends alluding to the name of the Perl programming language. The 1.8 branch was maintained until June 2013, with 1.8.7 releases issued from April 2008. This version provides bug fixes, but also many Ruby feat ...


JRuby
JRuby is an implementation of the Ruby programming language atop the Java Virtual Machine, written largely in Java. It is free software released under a three-way EPL/GPL/LGPL license. JRuby is tightly integrated with Java to allow the embedding of the interpreter into any Java application with full two-way access between the Java and the Ruby code (similar to Jython for the Python language). JRuby's lead developers are Charles Oliver Nutter and Thomas Enebo, with many current and past contributors including Ola Bini and Nick Sieger. In September 2006, Sun Microsystems hired Enebo and Nutter to work on JRuby full-time. In June 2007, ThoughtWorks hired Ola Bini to work on Ruby and JRuby. In July 2009, the JRuby developers left Sun to continue JRuby development at Engine Yard. In May 2012, Nutter and Enebo left Engine Yard to work on JRuby at Red Hat.

History
JRuby was originally created by Jan Arne Petersen in 2001. At that time and for several years following, the code was ...