Create sitemap with MarkupBuilder

In a prior example we showed how to read a sitemap with XmlSlurper; the snippets below focus on how to create one using MarkupBuilder and StreamingMarkupBuilder. A sitemap lets users and crawlers identify the accessible webpages within a website. The setup defines a Groovy class named MyUrls that holds the attributes used to create the sitemap.

Setup

class MyUrls {
    def loc
    def lastmod
    def changefreq
    def priority
}

Create with MarkupBuilder

@Test
public void create_sitemap_markupbuilder() {

    def writer = new StringWriter()
    def builder = new groovy.xml.MarkupBuilder(writer)

    def myUrls = [new MyUrls(changefreq: "weekly", loc: "http://www.leveluplunch.com/groovy/examples/", lastmod: "2014-07-22T11:46:18-05:00", priority: "0.8"),
                  new MyUrls(changefreq: "weekly", loc: "http://www.leveluplunch.com/java/examples/", lastmod: "2014-07-22T11:46:18-05:00", priority: "0.8")]

    builder.urlset {
        myUrls.each() { obj ->
            url() {
                changefreq(obj.changefreq)
                loc(obj.loc)
                lastmod(obj.lastmod)
                priority(obj.priority)
            }
        }
    }
    println writer.toString()
}

Output

<urlset>
  <url>
    <changefreq>weekly</changefreq>
    <loc>http://www.leveluplunch.com/groovy/examples/</loc>
    <lastmod>2014-07-22T11:46:18-05:00</lastmod>
    <priority>0.8</priority>
  </url>
  <url>
    <changefreq>weekly</changefreq>
    <loc>http://www.leveluplunch.com/java/examples/</loc>
    <lastmod>2014-07-22T11:46:18-05:00</lastmod>
    <priority>0.8</priority>
  </url>
</urlset>
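
To double check the generated markup, the string can be parsed back with XmlSlurper, as in the prior reading example. The snippet below is a minimal sketch rather than part of the original test: it builds a trimmed-down sitemap inline instead of reusing MyUrls, and it assumes Groovy 3 or later, where XmlSlurper lives in groovy.xml (on Groovy 2.x drop that import, since groovy.util.XmlSlurper is available by default).

```groovy
import groovy.xml.MarkupBuilder
import groovy.xml.XmlSlurper   // assumption: Groovy 3+; remove this line on Groovy 2.x

// Build a small sitemap the same way as the example above.
def writer = new StringWriter()
new MarkupBuilder(writer).urlset {
    url {
        loc('http://www.leveluplunch.com/groovy/examples/')
        priority('0.8')
    }
    url {
        loc('http://www.leveluplunch.com/java/examples/')
        priority('0.8')
    }
}

// Parse the generated string and collect the text of every <loc> element.
def parsed = new XmlSlurper().parseText(writer.toString())
def locs = parsed.url.collect { it.loc.text() }

assert locs == ['http://www.leveluplunch.com/groovy/examples/',
                'http://www.leveluplunch.com/java/examples/']
```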

Create with StreamingMarkupBuilder

If you need to take it a step further and declare namespaces, with or without a prefix, StreamingMarkupBuilder provides this option through mkp.declareNamespace. I didn't realize when first adding entries to declareNamespace that it prefixes "xmlns" to each key, so a key that already contained "xmlns" resulted in the error: Attribute name "xmlns:xmlns" associated with an element type "urlset" must be followed by the ' = ' character. Pass only the bare prefix as the key, or an empty string for the default namespace. Finally, calling XmlUtil.serialize() will serialize the XML to a string.

@Test
public void create_sitemap_streamingmarkupbuilder() {

    def myUrls = [new MyUrls(changefreq: "weekly", loc: "http://www.leveluplunch.com/blog/2014/09/29/amazon-cloudfront-s3-subfolders-default-index/", lastmod: "2014-07-22T11:46:18-05:00", priority: "0.8"),
                  new MyUrls(changefreq: "weekly", loc: "http://www.leveluplunch.com/blog/2013/12/30/convert-recorded-audio-text-using-osx-dictation-audacity-soundflower/", lastmod: "2014-07-22T11:46:18-05:00", priority: "0.8")]

    def builder = new groovy.xml.StreamingMarkupBuilder()
    builder.encoding = 'UTF-8'
    def mySitemap = builder.bind {
        mkp.xmlDeclaration()
        mkp.declareNamespace(
                "xsi":"http://www.w3.org/2001/XMLSchema-instance",
                "schemaLocation" : "http://www.sitemaps.org/schemas/sitemap/0.9 http://www.sitemaps.org/schemas/sitemap/0.9/sitemap.xsd",
                "":"http://www.sitemaps.org/schemas/sitemap/0.9")
        urlset() {
            myUrls.each() { obj ->
                url() {
                    changefreq(obj.changefreq)
                    loc(obj.loc)
                    lastmod(obj.lastmod)
                    priority(obj.priority)
                }
            }
        }
    }
    println groovy.xml.XmlUtil.serialize(mySitemap)
}

Output

<?xml version="1.0" encoding="UTF-8"?><urlset xmlns:schemaLocation="http://www.sitemaps.org/schemas/sitemap/0.9 http://www.sitemaps.org/schemas/sitemap/0.9/sitemap.xsd" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <changefreq>weekly</changefreq>
    <loc>http://www.leveluplunch.com/blog/2014/09/29/amazon-cloudfront-s3-subfolders-default-index/</loc>
    <lastmod>2014-07-22T11:46:18-05:00</lastmod>
    <priority>0.8</priority>
  </url>
  <url>
    <changefreq>weekly</changefreq>
    <loc>http://www.leveluplunch.com/blog/2013/12/30/convert-recorded-audio-text-using-osx-dictation-audacity-soundflower/</loc>
    <lastmod>2014-07-22T11:46:18-05:00</lastmod>
    <priority>0.8</priority>
  </url>
</urlset>
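
The object returned by bind is a Writable, so the sitemap can also be streamed to a Writer instead of being serialized to a string first. The following is a rough sketch under a few assumptions: the temp file is a hypothetical stand-in for wherever sitemap.xml really lives, and the sitemap is cut down to a single url entry to keep it short.

```groovy
import groovy.xml.StreamingMarkupBuilder

def builder = new StreamingMarkupBuilder()
builder.encoding = 'UTF-8'

// bind only captures the markup closure; nothing is rendered until writeTo is called.
def sitemap = builder.bind {
    mkp.xmlDeclaration()
    // An empty key declares the default namespace.
    mkp.declareNamespace('': 'http://www.sitemaps.org/schemas/sitemap/0.9')
    urlset {
        url {
            loc('http://www.leveluplunch.com/groovy/examples/')
        }
    }
}

// Hypothetical target: a temp file stands in for the real sitemap.xml location.
def target = File.createTempFile('sitemap', '.xml')
target.withWriter('UTF-8') { w -> sitemap.writeTo(w) }

def written = target.getText('UTF-8')
assert written.contains('<loc>http://www.leveluplunch.com/groovy/examples/</loc>')
```

Unlike XmlUtil.serialize(), this path does not indent the markup, so it suits large sitemaps where readability of the file matters less than avoiding an intermediate string.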