HOME
*





HtmlUnit
HtmlUnit is a headless web browser written in Java. It allows high-level manipulation of websites from other Java code, including filling and submitting forms and clicking hyperlinks. It also provides access to the structure and the details within received web pages. HtmlUnit emulates parts of browser behaviour including the lower-level aspects of TCP/IP and HTTP. A sequence such as getPage(url), getLinkWith("Click here"), click() allows a user to navigate through hypertext and obtain web pages that include HTML, JavaScript, Ajax and cookies. This headless browser can deal with HTTPS security, basic HTTP authentication, automatic page redirection and other HTTP headers. It allows Java test code to examine returned pages either as text, an XML DOM, or as collections of forms, tables, and links. The goal is to simulate real browsers; namely Chrome, Firefox ESR 38, Internet Explorer 8 and 11, and Edge (experimental). The most common use of HtmlUnit is test automation of web pages ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]  


Headless Web Browser
A headless browser is a web browser without a graphical user interface. Headless browsers provide automated control of a web page in an environment similar to popular web browsers, but they are executed via a command-line interface or using network communication. They are particularly useful for testing web pages as they are able to render and understand HTML the same way a browser would, including styling elements such as page layout, colour, font selection and execution of JavaScript and Ajax which are usually not available when using other testing methods. Since version 59 of Google Chrome and version 56 of Firefox, there is native support for remote control of the browser. This made earlier efforts obsolete, notably PhantomJS. Use cases The main use cases for headless browsers are: * Test automation in modern web applications (web testing) * Taking screenshots of web pages. * Running automated tests for JavaScript libraries. * Automating interaction of web pages. Other uses ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]  


Selenium (software)
Selenium is an open source umbrella project for a range of tools and libraries aimed at supporting browser automation. It provides a playback tool for authoring functional tests across most modern web browsers, without the need to learn a test scripting language (Selenium IDE). It also provides a test domain-specific language (Selenese) to write tests in a number of popular programming languages, including JavaScript (Node.js), C#, Groovy, Java, Perl, PHP, Python, Ruby and Scala. Selenium runs on Windows, Linux, and macOS. It is open-source software released under the Apache License 2.0. History Selenium was originally developed by Jason Huggins in 2004 as an internal tool at ThoughtWorks. Huggins was later joined by other programmers and testers at ThoughtWorks, before Paul Hammant joined the team and steered the development of the second mode of operation that would later become "Selenium Remote Control" (RC). The tool was open sourced that year. In 2005 Dan Fabulich ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]  


picture info

Java (programming Language)
Java is a high-level, class-based, object-oriented programming language that is designed to have as few implementation dependencies as possible. It is a general-purpose programming language intended to let programmers ''write once, run anywhere'' ( WORA), meaning that compiled Java code can run on all platforms that support Java without the need to recompile. Java applications are typically compiled to bytecode that can run on any Java virtual machine (JVM) regardless of the underlying computer architecture. The syntax of Java is similar to C and C++, but has fewer low-level facilities than either of them. The Java runtime provides dynamic capabilities (such as reflection and runtime code modification) that are typically not available in traditional compiled languages. , Java was one of the most popular programming languages in use according to GitHub, particularly for client–server web applications, with a reported 9 million developers. Java was originally developed ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]  


picture info

Google Chrome
Google Chrome is a cross-platform web browser developed by Google. It was first released in 2008 for Microsoft Windows, built with free software components from Apple WebKit and Mozilla Firefox. Versions were later released for Linux, macOS, iOS, and also for Android, where it is the default browser. The browser is also the main component of ChromeOS, where it serves as the platform for web applications. Most of Chrome's source code comes from Google's free and open-source software project ''Chromium'', but Chrome is licensed as proprietary freeware. WebKit was the original rendering engine, but Google eventually forked it to create the Blink engine; all Chrome variants except iOS now use Blink. , StatCounter estimates that Chrome has a 67% worldwide browser market share (after peaking at 72.38% in November 2018) on personal computers (PC), is most used on tablets (having surpassed Safari), and is also dominant on smartphones and at 65% across all platforms combined. ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]  


Google Web Toolkit
Google Web Toolkit (GWT ), or GWT Web Toolkit, is an open-source set of tools that allows web developers to create and maintain JavaScript front-end applications in Java. It is licensed under the Apache License 2.0. GWT emphasizes reusable approaches to everyday web development tasks, namely asynchronous remote procedure calls, history management, bookmarking, UI abstraction, internationalization, and cross-browser portability. History GWT version 1.0 RC 1 was released on May 16, 2006. Google announced GWT at the JavaOne conference in 2006. In August 2010, Google acquired Instantiations, a company known for focusing on Eclipse Java developer tools, including GWT Designer, which is now bundled with Google Plugin for Eclipse. In 2011 with the introduction of the Dart programming language, Google reassured the GWT community that GWT would continue to be supported for the foreseeable future but also hinted at a possible rapprochement between the two Google approaches t ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]  




Xalan
Xalan is a popular open source software library from the Apache Software Foundation, that implements the XSLT 1.0 XML transformation language and the XPath 1.0 language. The Xalan XSLT processor is available for both the Java and C++ programming languages. It combines technology from two main sources: an XSLT processor originally created by IBM under the name LotusXSL, and an XSLT compiler created by Sun Microsystems under the name XSLTC. A wrapper for the Eiffel language is available. See also * Java XML * Apache Xerces * libxml2 * Saxon XSLT Saxon is an XSLT and XQuery processor created by Michael Kay and now developed and maintained by his company, Saxonica. There are open-source and also closed-source commercial versions. Versions exist for Java, JavaScript and .NET. The current ... References External links Xalan Home page Xalan Java (programming language) libraries Java platform Software using the Apache license XSLT processors {{Compu-library-stub ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]  


XPath
XPath (XML Path Language) is an expression language designed to support the query or transformation of XML documents. It was defined by the World Wide Web Consortium (W3C) and can be used to compute values (e.g., strings, numbers, or Boolean values) from the content of an XML document. Support for XPath exists in applications that support XML, such as web browsers, and many programming languages. Overview The XPath language is based on a tree representation of the XML document, and provides the ability to navigate around the tree, selecting nodes by a variety of criteria. In popular use (though not in the official specification), an XPath expression is often referred to simply as "an XPath". Originally motivated by a desire to provide a common syntax and behavior model between XPointer and XSLT, subsets of the XPath query language are used in other W3C specifications such as XML Schema, XForms and the Internationalization Tag Set (ITS). XPath has been adopted by a number of ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]  


picture info

Parsing
Parsing, syntax analysis, or syntactic analysis is the process of analyzing a string of symbols, either in natural language, computer languages or data structures, conforming to the rules of a formal grammar. The term ''parsing'' comes from Latin ''pars'' (''orationis''), meaning part (of speech). The term has slightly different meanings in different branches of linguistics and computer science. Traditional sentence parsing is often performed as a method of understanding the exact meaning of a sentence or word, sometimes with the aid of devices such as sentence diagrams. It usually emphasizes the importance of grammatical divisions such as subject and predicate. Within computational linguistics the term is used to refer to the formal analysis by a computer of a sentence or other string of words into its constituents, resulting in a parse tree showing their syntactic relation to each other, which may also contain semantic and other information (p-values). Some parsing algor ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]  


Rhino (JavaScript Engine)
Rhino is a JavaScript engine written fully in Java and managed by the Mozilla Foundation as open source software. It is separate from the SpiderMonkey engine, which is also developed by Mozilla, but written in C++ and used in Mozilla Firefox. History The Rhino project was started at Netscape in 1997. At the time, Netscape was planning to produce a version of Netscape Navigator written fully in Java and so it needed an implementation of JavaScript written in Java. When Netscape stopped work on ''Javagator'', as it was called, the Rhino project was finished as a JavaScript engine. Since then, a couple of major companies (including Sun Microsystems) have licensed Rhino for use in their products and paid Netscape to do so, allowing work to continue on it. Originally, Rhino compiled all JavaScript code to Java bytecode in generated Java class files. This produced the best performance, often beating the C++ implementation of JavaScript run with just-in-time compilation (JIT), b ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]  


Apache Software Foundation
The Apache Software Foundation (ASF) is an American nonprofit corporation (classified as a 501(c)(3) organization in the United States) to support a number of open source software projects. The ASF was formed from a group of developers of the Apache HTTP Server, and incorporated on March 25, 1999. As of 2021, it includes approximately 1000 members. The Apache Software Foundation is a decentralized open source community of developers. The software they produce is distributed under the terms of the Apache License and is a non-copyleft form of free and open-source software (FOSS). The Apache projects are characterized by a collaborative, consensus-based development process and an open and pragmatic software license, which is to say that it allows developers who receive the software freely, to re-distribute it under nonfree terms. Each project is managed by a self-selected team of technical experts who are active contributors to the project. The ASF is a meritocracy, implying t ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]  


picture info

Document Object Model
The Document Object Model (DOM) is a cross-platform and language-independent interface that treats an XML or HTML document as a tree structure wherein each node is an object representing a part of the document. The DOM represents a document with a logical tree. Each branch of the tree ends in a node, and each node contains objects. DOM methods allow programmatic access to the tree; with them one can change the structure, style or content of a document. Nodes can have event handlers attached to them. Once an event is triggered, the event handlers get executed. The principal standardization of the DOM was handled by the World Wide Web Consortium (W3C), which last developed a recommendation in 2004. WHATWG took over the development of the standard, publishing it as a living document. The W3C now publishes stable snapshots of the WHATWG standard. In HTML DOM (Document Object Model), every element is a node: * A document is a document node. * All HTML elements are element nodes. * ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]  




Web Scraping
Web scraping, web harvesting, or web data extraction is data scraping used for extracting data from websites. Web scraping software may directly access the World Wide Web using the Hypertext Transfer Protocol or a web browser. While web scraping can be done manually by a software user, the term typically refers to automated processes implemented using a bot or web crawler. It is a form of copying in which specific data is gathered and copied from the web, typically into a central local database or spreadsheet, for later retrieval or analysis. Scraping a web page involves fetching it and extracting from it. Fetching is the downloading of a page (which a browser does when a user views a page). Therefore, web crawling is a main component of web scraping, to fetch pages for later processing. Once fetched, extraction can take place. The content of a page may be parsed, searched and reformatted, and its data copied into a spreadsheet or loaded into a database. Web scrapers typically ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]