Not so static, static site generator

The concept of static sites is foreign to folks who have grown up with three-tier architectures or designing single-page applications. It presents unique challenges, such as showing personalized information or resisting the urge to reach for a REST URL to fetch data. It forces us to find solutions at build time rather than at page render time. As teams move towards continuous delivery, will they become more comfortable populating pages that would have shown "live data" with data produced at build time, for the sake of page performance?

As we got to brainstorming, we wanted to incorporate the concept of popular pages or popular posts that traditional CMSs provide. The idea is to place an indicator next to links to the pages with the most hits, specifically on our landing pages such as java examples, java exercises and java tutorials. Since Level Up Lunch is built with Jekyll, a static site generator, there is no data store behind it, so we couldn't just pull stats from a database. Or could we?

Jekyll 1.3 added support for reading in data files, so all we needed to do was get JSON, YAML or CSV data into a directory named _data and it would be available in our Liquid templates. LUL, like most websites, uses Google Analytics, which holds information about all of our top pages, so we created a relatively simple, modular process to populate data files before the build starts.

High level process

We use the Jekyll Grunt plugin to build LUL, primarily because it gives us more control when building for a specific environment (for example, selecting which plugins are available) and because we lack Ruby experience. At a high level, during a production build we make a request to Google Analytics for the top pages of the last thirty days and drop the result into the _data directory. Once the data is present, we kick off the Jekyll build to generate the site.
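
The ordering described above (fetch the analytics data first, then build) could be wired together with a Grunt alias task. What follows is only a minimal sketch: the exec target name `analytics_popular` is taken from the Gruntfile shown later in this post, while the task name `build-production` and the Jekyll target `jekyll:production` are hypothetical names chosen for illustration.

```javascript
// Sketch only: every name except "exec:analytics_popular" is an
// assumption made for illustration, not part of the real build.
function registerProductionBuild(grunt) {
    // An alias task runs its list in order, so the data file exists
    // in _data before Jekyll starts reading it.
    grunt.registerTask("build-production", [
        "exec:analytics_popular", // fetch top pages into _data
        "jekyll:production"       // hypothetical Jekyll target name
    ]);
}
```

Calling `registerProductionBuild(grunt)` from inside the Gruntfile's exported function would then let `grunt build-production` run both steps in sequence.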

Level up lunch's build process

Google analytics java plugin

Getting permission to access the Analytics API was the biggest challenge, due to Google Cloud's less than intuitive, built-by-a-developer-looking interface (why do you need to create an application to access analytics? Just give me a key to analytics!). We cloned the java ga client project and made a couple of slight modifications. Since LUL has a pretty good, straightforward URL structure, we were able to use the content drilldown report, which groups data by subfolder, and use the URL as an artificial key to match against while processing templates.

When making a request, we pass the start date, end date and profile ID to the GA API to pull a listing of the top 25 pages in a directory. It returns the URL, title and visit count of each page, which are written to a file in the _data directory. We package this code up in a jar and call it in our build process via a Grunt task.
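
For reference, the file dropped into _data might have the shape produced below. This is a sketch in JavaScript rather than the Java our client is written in. The `rows` key and the [url, title, visits] ordering are inferred from the Liquid template further down (`site.data.popular-java-exercises.rows`, `page[0]`, `page[1]`); the helper name `toDataFile` and the `.json` file name in the comment are assumptions.

```javascript
// Hypothetical helper: serialises report rows into the JSON shape the
// Liquid template appears to expect: { "rows": [[url, title, visits], ...] }.
function toDataFile(rows) {
    // Copy before sorting so the caller's array is left untouched.
    var sorted = rows.slice().sort(function (a, b) {
        return Number(b[2]) - Number(a[2]); // most visited first
    });
    return JSON.stringify({ rows: sorted }, null, 2);
}

// The GA client returns every cell as a string, including the counts.
var sample = [
    ["/java/examples/construct-build-uri/", "Java - Construct or build URI | Level Up Lunch", "928"],
    ["/java/examples/maximum-value-in-array/", "Java - Max value in array | Level Up Lunch", "1009"]
];

// Saved as, say, _data/popular-java-exercises.json, this would be
// readable as site.data.popular-java-exercises during the Jekyll build.
console.log(toDataFile(sample));
```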

Google analytics java code

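import java.io.IOException;
import com.google.api.services.analytics.Analytics;
import com.google.api.services.analytics.model.GaData;

// Queries the top 25 pages under pagePath, ordered by visits (descending).
private static GaData executeDataQuery(Analytics analytics,
        String profileId, String startDate, String endDate, String pagePath)
        throws IOException {

    return analytics.data().ga()
            .get("ga:" + profileId, startDate, endDate, "ga:visits") // metric
            .setDimensions("ga:pagePath,ga:pageTitle")
            .setSort("-ga:visits")
            // Regex filter: keep only paths that start with pagePath
            .setFilters(String.format("ga:pagePath=~^%s", pagePath))
            .setMaxResults(25).execute();
}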

Output

/java/examples/maximum-value-in-array/ Java - Max value in array | Level Up Lunch 1009
/java/examples/construct-build-uri/ Java - Construct or build URI | Level Up Lunch 928
/java/examples/convert-list-to-map/ Java - Convert list to map | Level Up Lunch 850

Calling java jar via grunt

Using the grunt-exec plugin, we run the Java client from the command line, passing in the start and end dates of the range we want to fetch from Google Analytics.

module.exports = function(grunt) {

    // Left-pad nr with str (default "0") to a width of n characters.
    function padLeft(nr, n, str) {
        return Array(n - String(nr).length + 1).join(str || '0') + nr;
    }

    // Format a Date as yyyy-mm-dd, the format the GA API expects.
    function yyyy_mm_dd(dateIn) {
        var yyyy = dateIn.getFullYear();
        var mm = dateIn.getMonth() + 1; // getMonth() is zero-based
        var dd = dateIn.getDate();
        return yyyy + "-" + padLeft(mm, 2) + "-" + padLeft(dd, 2);
    }

    // Report window: the last thirty days, ending today.
    var endDate = new Date(), startDate = new Date();
    startDate.setDate(endDate.getDate() - 30);

    grunt.config.set("exec", {
        analytics_popular: {
            cwd: "../analytics-cmdline-sample/",
            command: "mvn -q exec:java -Dexec.args='" + yyyy_mm_dd(startDate) + " " + yyyy_mm_dd(endDate) + "'",
            stdout: true,
            stderr: true
        }
    });

    grunt.loadNpmTasks("grunt-exec");
};
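
To check which arguments the configuration above ends up passing, the date logic can be pulled into a pure function that takes "today" as a parameter. This is a sketch, not part of the real Gruntfile: `formatDate` and `buildAnalyticsArgs` are hypothetical names, and `padStart` (ES2017) stands in for the hand-written `padLeft`.

```javascript
// Hypothetical helpers, not part of the original Gruntfile.
function formatDate(d) {
    var mm = String(d.getMonth() + 1).padStart(2, "0"); // getMonth() is zero-based
    var dd = String(d.getDate()).padStart(2, "0");
    return d.getFullYear() + "-" + mm + "-" + dd;
}

function buildAnalyticsArgs(today, daysBack) {
    var start = new Date(today.getTime());
    // setDate() accepts values below 1 and rolls back into earlier months.
    start.setDate(today.getDate() - daysBack);
    return formatDate(start) + " " + formatDate(today);
}

// Noon local time on 15 March 2014 (months are zero-based).
console.log(buildAnalyticsArgs(new Date(2014, 2, 15, 12), 30));
// prints "2014-02-13 2014-03-15"
```

Passing the date in, rather than calling `new Date()` inside the function, makes the range easy to verify and lets a build be re-run for a fixed window.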

Liquid template code

Once the data is written to the _data directory, the Jekyll build process can begin. Below is a sample snippet that outputs the popular java exercises on the site. Since the page title from Google Analytics includes "| Level Up Lunch", we opted to write a Ruby plugin to prettify it.

{% for page in site.data.popular-java-exercises.rows %}
  {% assign titleScrubbed = page[1] | scrubPageTitle %}
  <a href="{{ page[0] }}">{{ titleScrubbed }}</a>
{% endfor %}

Ruby plugin code

module Jekyll
  module ScrubPageTitle
    # Strips the "Java -" prefix and "| Level Up Lunch" suffix from a page title.
    def scrubPageTitle(input)
      map = {"Java -" => "", "| Level Up Lunch" => ""}
      re = Regexp.new(map.keys.map { |x| Regexp.escape(x) }.join('|'))
      # strip removes the whitespace left behind by the removed prefix and suffix
      input.gsub(re, map).strip
    end
  end
end

Liquid::Template.register_filter(Jekyll::ScrubPageTitle)

This pattern opens up many possibilities for providing build-time data to Jekyll sites, which might be good enough in many instances. In what ways has working with static sites challenged your way of thinking, and how did you get creative to solve those challenges?