XML Module

Last modified by Thomas Mortagne on 2013/07/17 15:19
Offers XML and HTML/XHTML manipulation and cleaning APIs
Type: JAR
Developed by: XWiki Development Team
License: GNU Lesser General Public License 2.1
Bundled With: XWiki Enterprise, XWiki Enterprise Manager

Description

Features

  • XML Utility methods
  • HTML Utility methods
  • HTML Cleaner: cleans HTML and produces valid XHTML 1.1 content
  • Factory to create optimised XMLReader instances. This adds a level of indirection over using javax.xml.parsers.SAXParserFactory directly. For example, the factory checks whether the underlying parser is Xerces and, if so, configures it to cache parsed DTD grammars for better performance.
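
The indirection described in the last item can be illustrated with a minimal sketch that uses only JDK classes. It is not the module's actual factory: the class name XmlReaderFactorySketch and the helpers looksLikeXerces, createReader and countElements are hypothetical names chosen for this example, and the Xerces check is a simple class-name heuristic. The grammar-caching configuration itself is left out, since it depends on Xerces-specific settings not shown on this page.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

public class XmlReaderFactorySketch {

    /** Hypothetical heuristic: true when the reader's class name suggests a Xerces-based parser. */
    static boolean looksLikeXerces(XMLReader reader) {
        return reader.getClass().getName().toLowerCase().contains("xerces");
    }

    /** Single place where XMLReader instances are created and tuned. */
    static XMLReader createReader() {
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true);
            XMLReader reader = factory.newSAXParser().getXMLReader();
            if (looksLikeXerces(reader)) {
                // Assumption: a real factory would apply Xerces-specific tuning here,
                // such as enabling a cache for parsed DTD grammars. Omitted in this sketch.
            }
            return reader;
        } catch (Exception e) {
            throw new IllegalStateException("Failed to create an XMLReader", e);
        }
    }

    /** Parses the given XML with a reader from the factory and counts its elements. */
    static int countElements(String xml) {
        final int[] count = {0};
        XMLReader reader = createReader();
        reader.setContentHandler(new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                count[0]++;
            }
        });
        try {
            reader.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("Failed to parse XML", e);
        }
        return count[0];
    }
}
```

Because callers only go through createReader(), parser-specific optimisations can be added or changed in one place without touching the code that parses documents.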

HTML Cleaning

The HTML Cleaner uses the HTMLCleaner library to produce valid XML, then applies a series of transformations to turn the result into valid XHTML 1.1 content (see the test suite).

Example:

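// Imports needed by this snippet
import java.io.StringReader;
import org.junit.Assert;
import org.xwiki.component.embed.EmbeddableComponentManager;
import org.xwiki.xml.html.HTMLCleaner;
import org.xwiki.xml.html.HTMLUtils;

// Initialize the component manager so that component instances can be looked up
EmbeddableComponentManager componentManager = new EmbeddableComponentManager();
componentManager.initialize(this.getClass().getClassLoader());

// Look up the cleaner, clean an HTML fragment and serialize the resulting document
HTMLCleaner cleaner = componentManager.lookup(HTMLCleaner.class);
String xhtml = HTMLUtils.toString(cleaner.clean(new StringReader("this <b>is</b> bold")));
Assert.assertEquals("<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n"
   + "<!DOCTYPE html PUBLIC \"-//W3C//DTD XHTML 1.0 Strict//EN\" \"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd\">\n"
   + "<html><head></head><body>"
   + "<p>this <strong>is</strong> bold</p>"
   + "</body></html>\n", xhtml);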

To use the HTML Cleaner, you need the following dependency in your Maven pom.xml (available in Maven's Central Repository):

<dependency>
 <groupId>org.xwiki.commons</groupId>
 <artifactId>xwiki-commons-xml</artifactId>
 <version>3.2-milestone-3</version>
</dependency>
Created by Vincent Massol on 2011/10/04 17:41