Use HTML Cleaner to process and clean HTML data in a smart and efficient way. HTML Cleaner can clean, or alter, a source HTML document or page through the use of Cleaners. A Cleaner is a list of operations that can be built, saved and applied to an HTML document as required. The layout of HTML Cleaner makes it easy to compare the original document (or HTML source) with the processed view.
With the batch processing capability, hundreds of HTML files can be processed in seconds, making it easy to clean a whole website!
It is ideal for taking an HTML page written in a rich text editor like Word and cleaning it before publishing to a blog or CMS, removing redundant code and formatting.
For example, many people compose content intended for SharePoint using Word, but struggle when the formatting in the published document clashes with that of their SharePoint intranet site. Rons Cleaner can strip the document of redundant formatting and code so that SharePoint can display the document using the corporate styling in use on that site. This applies equally to any blog or CMS.
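To make the idea of stripping Word formatting concrete, here is a minimal sketch in Python using only the standard library. It is not how Rons Cleaner works internally; the names `WordCleanup`, `clean_word_html`, `DROP_ATTRS` and `UNWRAP_TAGS` are hypothetical, and the choice of which attributes and tags to remove is an assumption made for illustration.

```python
from html import escape
from html.parser import HTMLParser

# Hypothetical rule sets, chosen for illustration only.
DROP_ATTRS = {"style", "class", "lang"}   # a "Delete Attribute" style rule
UNWRAP_TAGS = {"span", "font", "o:p"}     # a "Delete Tag" style rule (text is kept)


class WordCleanup(HTMLParser):
    """Re-emits HTML while dropping the attributes and tags listed above."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []

    def _open_tag(self, tag, attrs, end=">"):
        kept = [(k, v) for k, v in attrs if k not in DROP_ATTRS]
        rendered = "".join(
            f" {k}" if v is None else f' {k}="{escape(v, quote=True)}"'
            for k, v in kept
        )
        return f"<{tag}{rendered}{end}"

    def handle_starttag(self, tag, attrs):
        if tag not in UNWRAP_TAGS:
            self.out.append(self._open_tag(tag, attrs))

    def handle_startendtag(self, tag, attrs):
        if tag not in UNWRAP_TAGS:
            self.out.append(self._open_tag(tag, attrs, end=" />"))

    def handle_endtag(self, tag):
        if tag not in UNWRAP_TAGS:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        # Character references were decoded by the parser, so re-escape them.
        self.out.append(escape(data, quote=False))

    # handle_comment is deliberately not overridden, so HTML comments are
    # dropped. Script and style contents get no special treatment here.


def clean_word_html(html):
    parser = WordCleanup()
    parser.feed(html)
    parser.close()  # flush any buffered text
    return "".join(parser.out)
```

Calling `clean_word_html` on a Word-style paragraph such as `<p class="MsoNormal"><span lang="EN-GB">Hello</span></p>` would leave plain `<p>Hello</p>`, so the host site's own stylesheet decides how it looks.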
* Clean and Format a Web Page : Build Cleaners from a list of rules, and view the formatted output.
* 9 powerful HTML Cleaning rules : Add Tag, Change Tag, Change Tag Content, Delete Tag, Set Attribute, Delete Attribute, Replace Attribute Value, Replace Text, Text Encoding
* Formatting Options : Remove Empty Tags, Remove HTML Comments
* Combine any number of rules in a Cleaner : Any number of rules can be combined into Cleaners and saved allowing total flexibility and fast operation.
* Save Cleaners : Saved Cleaners are listed above the processed page for quick retrieval.
* Live preview : Instant preview of the page after being cleaned by a Cleaner.
* Batch Processing : Scan a directory (and sub-directories) to process hundreds of HTML files in seconds.
* Content Extraction : Show the content (text and pictures) of a web page with no distraction.
* Link and Resource Extraction : View lists of links and resources from a web page.
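
The batch scanning and link/resource extraction features listed above can be approximated with a short Python sketch using only the standard library. This is an illustration under stated assumptions, not the product's own code: `LinkCollector`, `extract` and `scan_directory` are hypothetical names, and treating `img`/`script` `src` and `link` `href` values as "resources" is an assumption about what that list would contain.

```python
from html.parser import HTMLParser
from pathlib import Path


class LinkCollector(HTMLParser):
    """Collects hyperlinks and referenced resources from one HTML page."""

    def __init__(self):
        super().__init__()
        self.links = []      # values of <a href="...">
        self.resources = []  # images, scripts and <link> targets (assumed)

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if tag == "a" and attr.get("href"):
            self.links.append(attr["href"])
        elif tag in ("img", "script") and attr.get("src"):
            self.resources.append(attr["src"])
        elif tag == "link" and attr.get("href"):
            self.resources.append(attr["href"])


def extract(html):
    """Return (links, resources) for a single HTML string."""
    collector = LinkCollector()
    collector.feed(html)
    collector.close()
    return collector.links, collector.resources


def scan_directory(root):
    """Walk root and its sub-directories, extracting from every .html file."""
    results = {}
    for path in sorted(Path(root).rglob("*.html")):
        text = path.read_text(encoding="utf-8", errors="replace")
        results[path] = extract(text)
    return results
```

`scan_directory("site")` would return a dictionary mapping each `.html` file under the (hypothetical) `site` folder to its links and resources. A real batch run would likely also match `.htm` files and handle other encodings.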