Java Open Source Projects

HTML Parsers

HTML Parser
HTML Parser is a Java library used to parse HTML in either a linear or nested fashion. Primarily used for transformation or extraction, it features filters, visitors, custom tags a...
CyberNeko HTML Parser
NekoHTML is a simple HTML scanner and tag balancer that enables application programmers to parse HTML documents and access the information using standard XML interfaces. The parser...
HtmlCleaner
HtmlCleaner is an open-source HTML parser written in Java. HTML found on the Web is usually dirty, ill-formed and unsuitable for further processing. For any serious consumption of such do...
Cobra: Java HTML Renderer & Parser
Cobra is a pure Java HTML renderer and DOM parser that is being developed to support HTML 4, JavaScript and CSS 2. Cobra can be used as a JavaScript-aware and CSS-aware HTML DOM...
TagSoup
TagSoup is a SAX-compliant parser written in Java that, instead of parsing well-formed or valid XML, parses HTML as it is found in the wild: poor, nasty and brutish, though quite of...
Jericho HTML Parser
Jericho HTML Parser is a simple but powerful Java library allowing analysis and manipulation of parts of an HTML document, including some common server-side tags, while reproducing...
HotSAX
HotSAX is a small, fast SAX2 parser for HTML, XHTML and XML.
JTidy
JTidy is a Java port of HTML Tidy, an HTML syntax checker and pretty printer. Like its non-Java cousin, JTidy can be used as a tool for cleaning up malformed and faulty HTML. In add...
HtmlRipper
HtmlRipper is a Java package that contains routines that enable dynamic data to be extracted from Web pages (HTML documents) using predefined rule sets. These routines allow you t...
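
Several of the entries above describe the same basic idea: read HTML that may be ill-formed and hand it to the application as a stream of tag and text events. The sketch below illustrates that idea only. It does not use any of the libraries listed here; as an assumption made for the sake of a self-contained example, it relies on the lenient HTML parser that ships with the JDK (javax.swing.text.html.parser.ParserDelegator). The class name LenientHtmlEvents and the method name events are hypothetical.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.swing.text.MutableAttributeSet;
import javax.swing.text.html.HTML;
import javax.swing.text.html.HTMLEditorKit;
import javax.swing.text.html.parser.ParserDelegator;

// Hypothetical helper: records parser events for a (possibly malformed) HTML string.
// Uses only the JDK's built-in Swing HTML parser, not any library from the list above.
public class LenientHtmlEvents {

    /** Returns start-tag, end-tag and text events in the order the parser reports them. */
    public static List<String> events(String html) throws IOException {
        final List<String> out = new ArrayList<>();

        // The callback plays the role of a SAX-style handler or visitor.
        HTMLEditorKit.ParserCallback callback = new HTMLEditorKit.ParserCallback() {
            @Override
            public void handleStartTag(HTML.Tag tag, MutableAttributeSet attrs, int pos) {
                out.add("start:" + tag);
            }

            @Override
            public void handleEndTag(HTML.Tag tag, int pos) {
                out.add("end:" + tag);
            }

            @Override
            public void handleText(char[] data, int pos) {
                out.add("text:" + new String(data));
            }
        };

        // The third argument tells the parser to ignore any charset declaration in the input.
        new ParserDelegator().parse(new StringReader(html), callback, true);
        return out;
    }

    public static void main(String[] args) throws IOException {
        // The <p> and <b> elements are deliberately left unclosed.
        for (String event : events("<p>Hello <b>world")) {
            System.out.println(event);
        }
    }
}
```

The dedicated libraries in this list generally go further than this sketch, for example by exposing standard SAX or DOM interfaces or by writing out cleaned-up markup; consult each project's own documentation for its actual API.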

Copyright © 2007 - 2008 BizDrv.com