Big Data/Analytics Zone is brought to you in partnership with:

A friendly Finnish hacker. I am technology consultant, open source advocator and entrepreneur. My expertise areas cover HTML5, Python, Plone, Javascript, WebGL,UNIX and mobile web. Mikko likes sushi, Angry Birds and dislikes winter. Mikko is a DZone MVB and is not an employee of DZone and has posted 43 posts at DZone. You can read more from them at their website. View Full User Profile

A Script for Generating Google Docs from Spreadsheet Data Sources

01.31.2013
| 10378 views |
  • submit to reddit

Microsoft Office supports “data sources” to generate e.g. letters, invoices, address stickers and other repeating documents based on Microsoft Word template and Microsoft Excel data. This is a very common small business problem and Office has had a solution for it from mid-90s. Google Apps, the cloud based alternative to Microsoft Office, do not offer similar functionality natively (or at least if they do they hide it really well). However, you can quite easily create your own document generator using Google Apps scripting as long as your are proficient in programming. In this blog post I’ll show an example how to create such a script and learn the basics of Google Apps Scripts.

1. Ingredients of the document generator

We have following the following inputs for our business problem

  • A Google Apps spreadsheet which contains customer data.
  • A Google Apps docs template document. Based on it, we want to generate a document for each customer by filling this template document with the data from the spreadsheet.
  • A Google Drive folder where the resulting documents are stored.
  • A Google Apps script which automatizes the task for us (Javascript based)

In my case the use case was generating contract texts for the customers based on their price and quality of service data. Then I just exported and emailed the resulting Google Docs as PDF.

All of these are stored on your Google Apps account in Google Drive. All editing happens through Google Apps user interface, no external tools needed.

Example of source data (obfuscated with obfuscate.js)

Example of the template document (obfuscated with obfuscate.js). You can see the source labels, unfilled.

Example of the resulting document – labels filled in and no longer in bold (obfuscated withobfuscate.js)

2. Short introduction to Google Apps Script

Google Apps Scripts can be invoked in two ways

Because we are not working on the spreadsheet we need to use the former approach.

Google Apps Script is a JavaScript (ECMAScript version unspecified? Does it run V8?) cloud scripting language that provides easy ways to automate tasks across Google products and third party services. The Google Apps Script has extensive API documentation with examples and tutorials, but they are still much subject to change as almost everything is marked as experimental and already there exist a lot of deprecated methods. The Google Apps Scripts can also access Google Maps, contacts, email, sites, Google Apps domain setting and basically have automation solution almost everything you can do in Google cloud.

The script is executed on the server-side and you have a non-fancy localized browser based UI to edit and debug your script.

The philosophy and UI design patters feel like a step back to 90s, to the Visual Basic scripting environment. Maybe Google Apps developers wanted this… so that Visual Basic developers feel back home. However, coming from a web development, Javascript and general programming background you will find the lack of Firebug / Web Inspector like console disturbing. It does not feel like any other Javascript development, though certainly the syntax is the same.

So my minor complains include, but are not limited to

  • Logging from the applications is possible, but the log trace is very unreadable in UI
  • The program does not have a specific entry point, you need to choose a function using a selection widget. This makes the script feel like a toy.
  • Debugger (and lack of console) does not seem to allow you to modify and dynamically poke objects in run-time (call functions, etc.)
  • Debugger is a bit slow (round-trip to Google servers, a bit), though still pretty much useable
  • Lack of low end user interaction tools in standalone scripting (please see below)
  • API documents and reality did not always match (as everything is still experimental)

Debugger in action

Things could be better, but in the end I managed to get done what I was looking for and I am still not paying a penny for Google Apps, so I am happy. Also, I do not wish to go back to Microsoft Office unless I need to write well-formatted print documents… Google Docs is a toy what comes to heavy and graphically sensitive document authoring like offers…. Or presentations… where Keynote is the king.

3. The generator script

In the beginning of the script you have constants which define on which data to operate. You could build an user interface making the script to full web application, but this is too cumbersome approach for such a small task. The UI builder seemed nice, but definitely an overkill. Though there exist Google Apps Script API methods for performing simple prompt() question in the browser,  for some reason they were not supported in standalone scripting… so the fastest approach to enter data into the script was simply edit the script itself before each run. I sooo started to miss command line… first time in my life.

So, in the beginning of the script you define the source data

  • Spreadsheet id (you can pick it up from URL when you edit the document)
  • Template document id (you can pick it up from URL when you edit the document)
  • Customer id which is the spreadsheet row number, for the current script run
  • The Google Driver folder id where the resulting document will be placed for sharing. Again you can pick the id from URL when opening the folder.

Then the script simply replaces words with data. The keyword to be replaced in the template document are identified as the column labels (1st row) in the spreadsheet data. I am pretty sure there would be more efficient methods to do this, but I did not wish to spend time to go to knee deep to GS to figure out its nuances.

And then the script… please feel free to modify to your own needs (generator.gs):

/**
 * Generate Google Docs based on a template document and data incoming from a Google Spreadsheet
 *
 * License: MIT
 *
 * Copyright 2013 Mikko Ohtamaa, http://opensourcehacker.com
 */

// Row number from where to fill in the data (starts as 1 = first row)
var CUSTOMER_ID = 1;

// Google Doc id from the document template
// (Get ids from the URL)
var SOURCE_TEMPLATE = "xxx";

// In which spreadsheet we have all the customer data
var CUSTOMER_SPREADSHEET = "yyy";

// In which Google Drive we toss the target documents
var TARGET_FOLDER = "zzz";

/**
 * Return spreadsheet row content as JS array.
 *
 * Note: We assume the row ends when we encounter
 * the first empty cell. This might not be 
 * sometimes the desired behavior.
 *
 * Rows start at 1, not zero based!!!  
 *
 */
function getRowAsArray(sheet, row) {
  var dataRange = sheet.getRange(row, 1, 1, 99);
  var data = dataRange.getValues();
  var columns = [];

  for (i in data) {
    var row = data[i];

    Logger.log("Got row", row);

    for(var l=0; l<99; l++) {
        var col = row[l];
        // First empty column interrupts
        if(!col) {
            break;
        }

        columns.push(col);
    }
  }

  return columns;
}

/**
 * Duplicates a Google Apps doc
 *
 * @return a new document with a given name from the orignal
 */
function createDuplicateDocument(sourceId, name) {
    var source = DocsList.getFileById(sourceId);
    var newFile = source.makeCopy(name);

    var targetFolder = DocsList.getFolderById(TARGET_FOLDER);
    newFile.addToFolder(targetFolder);

    return DocumentApp.openById(newFile.getId());
}

/**
 * Search a paragraph in the document and replaces it with the generated text 
 */
function replaceParagraph(doc, keyword, newText) {
  var ps = doc.getParagraphs();
  for(var i=0; i<ps.length; i++) {
    var p = ps[i];
    var text = p.getText();

    if(text.indexOf(keyword) >= 0) {
      p.setText(newText);
      p.setBold(false);
    }
  } 
}

/**
 * Script entry point
 */
function generateCustomerContract() {

  var data = SpreadsheetApp.openById(CUSTOMER_SPREADSHEET);

  // XXX: Cannot be accessed when run in the script editor?
  // WHYYYYYYYYY? Asking one number, too complex?
  //var CUSTOMER_ID = Browser.inputBox("Enter customer number in the spreadsheet", Browser.Buttons.OK_CANCEL);
  if(!CUSTOMER_ID) {
      return; 
  }

  // Fetch variable names
  // they are column names in the spreadsheet
  var sheet = data.getSheets()[0];
  var columns = getRowAsArray(sheet, 1);

  Logger.log("Processing columns:" + columns);

  var customerData = getRowAsArray(sheet, CUSTOMER_ID);  
  Logger.log("Processing data:" + customerData);

  // Assume first column holds the name of the customer
  var customerName = customerData[0];

  var target = createDuplicateDocument(SOURCE_TEMPLATE, customerName + " agreement");

  Logger.log("Created new document:" + target.getId());

  for(var i=0; i<columns.length; i++) {
      var key = columns[i] + ":"; 
      // We don't replace the whole text, but leave the template text as a label
      var text = customerData[i] || ""; // No Javascript undefined
      var value = key + " " + text;
      replaceParagraph(target, key, value);
  }

}


Published at DZone with permission of Mikko Ohtamaa, author and DZone MVB. (source)

(Note: Opinions expressed in this article and its replies are the opinions of their respective authors and not those of DZone, Inc.)