Transform XML into HTML with XSLT using Apache XALAN

XSLT provides a way to transform an XML document into various formats. In this screencast, find out how to convert XML into HTML using a Java-based XSLT processor.

Detailed Video Notes

XSL stands for Extensible Stylesheet Language. Its transformation language, XSLT, describes how to turn an XML document into another format, much as CSS describes how to present HTML. In other words, given an XML input and an XSL file, you can produce new documents in different formats. In this tutorial we will walk through transforming XML into HTML using Apache's XALAN implementation.

Understanding implementations

Java typically provides a specification or a set of interfaces, and vendors and open source projects supply the implementations. Transforming XML with XSLT is no exception: the JAXP transformation API (javax.xml.transform) defines the interfaces, and you choose a processor that implements them. Popular Java processors, some commercial and others open source, include SAXON, Apache Cocoon, Context framework and Apache XALAN, which we will use in our examples.
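
To see which processor JAXP resolves on your classpath, you can ask the factory for its class name. This is a minimal sketch that relies only on the JDK; the class name ShowTransformerImplementation and the helper defaultFactoryClassName are made up for illustration, and the printed value depends on your JDK and on which jars are present.

```java
import javax.xml.transform.TransformerFactory;

class ShowTransformerImplementation {

    // Returns the class name of the TransformerFactory that JAXP resolves by default
    static String defaultFactoryClassName() {
        return TransformerFactory.newInstance().getClass().getName();
    }

    public static void main(String[] args) {
        // The output varies by JDK and classpath, so no fixed value is shown here
        System.out.println(defaultFactoryClassName());
    }
}
```

With only the JDK available this typically reports the built-in processor; adding another processor's jar may change the result.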

Project set up

[0:45]

We will create a Maven project within Eclipse and add the xalan dependency to our pom.xml:

<dependency>
    <groupId>xalan</groupId>
    <artifactId>xalan</artifactId>
    <version>2.7.2</version>
</dependency>

Create xml document

[0:50]

We will create a file named books.xml from a sample XML file of books published by Microsoft. This file acts as the data source; in practice the data could come from a local file store, a database, a SOAP web service or a REST web service that returns XML.

<?xml version="1.0"?>
<catalog>
   ...
   <book id="bk101">
      <author>Gambardella, Matthew</author>
      <title>XML Developer's Guide</title>
      <genre>Computer</genre>
      <price>44.95</price>
      <publish_date>2000-10-01</publish_date>
      <description>An in-depth look at creating applications
      with XML.</description>
   </book>
   ...
</catalog>   

Create stylesheet

[1:05]

Once we have the XML, we need the view, or template, that the XML will bind to, so let's create books.xsl. This tutorial focuses on converting to HTML, but you could substitute other output types such as PDF, XML, a Microsoft Word document or just about any other format. We won't explain each XSLT element in detail, as that can be found on the Mozilla XSLT elements reference site. We did include a reference to Bootstrap for table styling.

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:fo="http://www.w3.org/1999/XSL/Format">

    <xsl:output method="html" indent="no" omit-xml-declaration="yes"
        doctype-public="-//W3C//DTD XHTML 1.0 Transitional//EN"
        doctype-system="http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"
        encoding="UTF-8" />

    <xsl:template match="catalog">
        <html lang="en">
            <head>
                <meta charset="UTF-8" />
                <title>Level up lunch book transformation with XSLT</title>

                <!-- Latest compiled and minified CSS -->
                <link rel="stylesheet"
                    href="https://maxcdn.bootstrapcdn.com/bootstrap/3.3.1/css/bootstrap.min.css" />

                <!-- Optional theme -->
                <link rel="stylesheet"
                    href="https://maxcdn.bootstrapcdn.com/bootstrap/3.3.1/css/bootstrap-theme.min.css" />

                <!-- Latest compiled and minified JavaScript -->
                <script
                    src="https://maxcdn.bootstrapcdn.com/bootstrap/3.3.1/js/bootstrap.min.js"></script>
            </head>
            <body>

                <h2>Level up lunch - XSL + XML = Bootstrap</h2>

                <table class="table table-striped">
                    <thead>
                        <tr>
                            <th>Title</th>
                            <th>Author</th>
                            <th>Genre</th>
                            <th>Description</th>
                        </tr>
                    </thead>
                    <tbody>
                        <xsl:for-each select="book">
                            <tr>
                                <td>
                                    <xsl:value-of select="title" />
                                </td>
                                <td>
                                    <xsl:value-of select="author" />
                                </td>
                                <td>
                                    <xsl:value-of select="genre" />
                                </td>
                                <td>
                                    <xsl:value-of select="description" />
                                </td>
                            </tr>
                        </xsl:for-each>
                    </tbody>
                </table>
            </body>
        </html>
    </xsl:template>
</xsl:stylesheet>

Transforming in java code

[1:35]

Next let's write the Java code that takes the XML and the XSL stylesheet and produces an HTML document. TransformerFactory is the entry point: it creates the Transformer objects that perform the transformation, and it was designed so that consumers can plug in different providers. Passing a class name to newInstance selects a specific implementation. Each implementation may differ in which configuration options it supports, so be aware of this if you plan on supporting multiple implementation libraries.

TransformerFactory transformerFactory = TransformerFactory.newInstance(
                "<insert class name as string from classpath>", null);

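If the named class cannot be loaded, newInstance throws a TransformerFactoryConfigurationError. The sketch below is an assumption about how you might handle that, not something shown in the screencast: it tries a named implementation first and falls back to whatever processor JAXP resolves by default. It then uses getFeature to ask the chosen factory whether it accepts StreamSource input, as a small illustration of checking what an implementation supports. The class name FactoryWithFallback and the helper create are hypothetical.

```java
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.TransformerFactoryConfigurationError;
import javax.xml.transform.stream.StreamSource;

class FactoryWithFallback {

    // Class name this tutorial uses for Apache XALAN
    static final String XALAN_FACTORY =
            "org.apache.xalan.processor.TransformerFactoryImpl";

    // Tries the named implementation first, then falls back to the JAXP default
    static TransformerFactory create(String factoryClassName) {
        try {
            return TransformerFactory.newInstance(factoryClassName, null);
        } catch (TransformerFactoryConfigurationError e) {
            // Thrown when the class cannot be loaded or instantiated
            return TransformerFactory.newInstance();
        }
    }

    public static void main(String[] args) {
        TransformerFactory factory = create(XALAN_FACTORY);
        System.out.println("Using: " + factory.getClass().getName());
        // Ask the implementation whether it accepts StreamSource input
        System.out.println("StreamSource supported: "
                + factory.getFeature(StreamSource.FEATURE));
    }
}
```

Whether a silent fallback is appropriate depends on your project; if your stylesheets rely on behavior specific to one processor, failing fast may be the better choice.
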
StreamSource is one implementation of the Source interface, so the same transformation code can be used with other sources such as DOMSource, SAXSource, StAXSource or JAXBSource. We will read in the books.xml and books.xsl files using the Java 7 Paths API and pass them into StreamSource objects. The final piece before making the transformation is deciding where the output should go. We initialize a StreamResult by converting a Path to a File, so the contents of the transformation are written to myfile.html. The last step is obtaining a transformer from the stylesheet and passing in the data and the output. Let's run this code and view the output in Chrome.

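import java.nio.file.Path;
import java.nio.file.Paths;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public static void main(String[] args) throws TransformerException {

    // The stylesheet (template) read from the file system
    StreamSource xlsStreamSource = new StreamSource(Paths
            .get("src/test/resources/books.xsl")
            .toAbsolutePath().toFile());

    // The XML data read from the file system
    StreamSource xmlStreamSource = new StreamSource(Paths
            .get("src/test/resources/books.xml")
            .toAbsolutePath().toFile());

    // Explicitly select the XALAN implementation
    TransformerFactory transformerFactory = TransformerFactory.newInstance(
            "org.apache.xalan.processor.TransformerFactoryImpl", null);

    // The transformed HTML is written to this file
    Path pathToHtmlFile = Paths.get("src/test/resources/myfile.html");
    StreamResult result = new StreamResult(pathToHtmlFile.toFile());

    // Compile the stylesheet into a Transformer, then apply it to the XML
    Transformer transformer = transformerFactory.newTransformer(xlsStreamSource);
    transformer.transform(xmlStreamSource, result);

}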
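
If you want to experiment without creating files, the same steps work with in-memory strings. The following is a minimal sketch, assuming a trimmed-down catalog and a reduced stylesheet embedded as Java strings; it uses the JDK's default TransformerFactory rather than XALAN so it runs without extra dependencies. The class name InMemoryTransform and its members are hypothetical.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class InMemoryTransform {

    // Trimmed-down catalog holding a single book from books.xml
    static final String XML =
            "<catalog><book id=\"bk101\">"
            + "<author>Gambardella, Matthew</author>"
            + "<title>XML Developer's Guide</title>"
            + "</book></catalog>";

    // Reduced stylesheet: one table row per book
    static final String XSL =
            "<xsl:stylesheet version=\"1.0\""
            + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
            + "<xsl:output method=\"html\" indent=\"no\"/>"
            + "<xsl:template match=\"catalog\">"
            + "<table><xsl:for-each select=\"book\">"
            + "<tr><td><xsl:value-of select=\"title\"/></td>"
            + "<td><xsl:value-of select=\"author\"/></td></tr>"
            + "</xsl:for-each></table>"
            + "</xsl:template></xsl:stylesheet>";

    // Applies the stylesheet to the XML and returns the result as a String
    static String transform(String xml, String xsl) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(xsl)));
            StringWriter out = new StringWriter();
            transformer.transform(
                    new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("Transformation failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(transform(XML, XSL));
    }
}
```

Because StreamSource and StreamResult accept readers and writers as well as files, only the construction of the source and result objects changes; the transformer calls stay the same.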

Outputting as ByteArray

If you are working within a web context you may want to write your results to the HttpServletResponse as a byte array. Below is a code snippet that could help get you started.

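// Collect the transformed output in memory instead of writing to a file
ByteArrayOutputStream baos = new ByteArrayOutputStream();
StreamResult result = new StreamResult(baos);

transformerFactory.newTransformer(xlsStreamSource).transform(
        xmlStreamSource, result);

// The raw bytes, ready to be written to an output stream such as the servlet response
byte[] htmlBytes = baos.toByteArray();

// Printing decodes the bytes with the platform default charset
System.out.println(baos.toString());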
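
Decoding with the platform default charset may not match the encoding the stylesheet declares. One way to keep the two aligned, sketched here under the assumption that UTF-8 is the encoding you want, is to set the output encoding on the transformer and decode the bytes with the same charset. The class name TransformToBytes, the helper toHtmlBytes and the inline documents are hypothetical; in a servlet you would write the returned bytes to the stream from response.getOutputStream() instead of printing them.

```java
import java.io.ByteArrayOutputStream;
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class TransformToBytes {

    // Small stand-ins for books.xml and books.xsl
    static final String XML =
            "<catalog><book><author>Gambardella, Matthew</author></book></catalog>";

    static final String XSL =
            "<xsl:stylesheet version=\"1.0\""
            + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
            + "<xsl:output method=\"html\" indent=\"no\"/>"
            + "<xsl:template match=\"catalog\">"
            + "<ul><xsl:for-each select=\"book\">"
            + "<li><xsl:value-of select=\"author\"/></li>"
            + "</xsl:for-each></ul>"
            + "</xsl:template></xsl:stylesheet>";

    // Runs the transformation and returns the output as UTF-8 encoded bytes
    static byte[] toHtmlBytes(String xml, String xsl) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(xsl)));
            // Override the stylesheet's encoding so the bytes decode predictably
            transformer.setOutputProperty(OutputKeys.ENCODING, "UTF-8");

            ByteArrayOutputStream baos = new ByteArrayOutputStream();
            transformer.transform(
                    new StreamSource(new StringReader(xml)), new StreamResult(baos));
            return baos.toByteArray();
        } catch (TransformerException e) {
            throw new IllegalStateException("Transformation failed", e);
        }
    }

    public static void main(String[] args) {
        byte[] html = toHtmlBytes(XML, XSL);
        // Decode with the same charset that was requested from the transformer
        System.out.println(new String(html, StandardCharsets.UTF_8));
    }
}
```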

Hope you enjoyed today's level up, have a great day!