MarkLogic Toolkits for Word, Excel, and PowerPoint

I thought I’d use this post to provide a brief introduction to the MarkLogic Toolkits for Office.  So here’s an overview:

What is a Toolkit?

A Toolkit is a set of tools for jumpstarting your development with MarkLogic Server and Microsoft Office 2007 / Office 2010 / Open XML.

There are currently 3 Office Toolkits:

  1. MarkLogic Toolkit for Word
  2. MarkLogic Toolkit for Excel
  3. MarkLogic Toolkit for PowerPoint

We care about Word, Excel, and PowerPoint because, as of Office 2007, their respective document formats are XML.  Take a .docx, .xlsx, or .pptx and change its file extension to .zip.  Extract the file and inside you’ll find a bunch of interrelated XML parts.
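If you’d rather peek inside a package without renaming anything, here’s a minimal Python sketch of the same idea.  It’s not part of any Toolkit; the function names are mine, and the demo package is a tiny stand-in built in memory (not a valid Word document) so the snippet runs without a real file.

```python
import io
import zipfile


def list_package_parts(source):
    """Return the sorted part names inside an Open XML package.

    `source` may be a file path or a file-like object. A .docx, .xlsx,
    or .pptx file is an ordinary ZIP archive, so zipfile can read it.
    """
    with zipfile.ZipFile(source) as package:
        return sorted(package.namelist())


def build_demo_package():
    """Build a tiny stand-in package in memory (NOT a valid Word document)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as package:
        package.writestr("[Content_Types].xml", "<Types/>")
        package.writestr("_rels/.rels", "<Relationships/>")
        package.writestr("word/document.xml", "<document/>")
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    for name in list_package_parts(build_demo_package()):
        print(name)
```

To try it on a real document, pass a path instead, for example `list_package_parts("report.docx")` (the file name is just a placeholder).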

This update to the document formats provides an interesting opportunity as people can now work with XML without learning new, specialized tools, or even really being aware of the fact that they’re working with XML.  Authors continue to use the tools they know and are familiar with in Office, and we can provide additional functionality to them by taking advantage of the XML.  The Toolkits provide ways for us to enhance the authoring experience within Office as well as on the Server where we can prepare content for Office as well as additional consumers.

Each Toolkit is composed of 3 major components:

  1. Add-in for Word | Excel | PowerPoint
  2. XQuery API
  3. Sample Applications

Add-in with supporting JavaScript API

The Add-in is just a standard Windows application you install using a .msi.  Double-click the .msi to start installation, click next, next, next through the dialog screens as you would with any Windows app, and the next time you start Office you’ll find a Task Pane on the right-hand side of the application (see image below).

NOTE: The Task Pane is just a browser!  It’s using whatever version of IE is installed on the client, and exposing that within Office.

The Add-in may just be a browser, but it also installs a supporting library for interacting with the active document (the document being authored).  The browser reaches this library through the JavaScript API that comes with the Add-in.  Developers can quickly create a webapp within Word, Excel, or PowerPoint that communicates with, or is even served from, MarkLogic Server, and they can use the JavaScript APIs to get XML in and out of the document being authored.

We wanted to avoid creating a situation where users had to constantly re-install Add-ins on the client.  By making it a browser, we can update functionality by simply changing the application code on the Server.

JSDocs are provided with each Toolkit for the respective JavaScript API.

XQuery API

The XQuery APIs for Word, Excel, and PowerPoint provide functions for developers to manipulate and generate Office documents on the Server.

The goal is to simplify the use of Open XML.  Along with each XQuery API, a CPF pipeline for MarkLogic is also provided that will automatically update Office documents on the Server as they are ingested, to make all the XML more friendly for search and reuse.  These updates are done without using any custom XML and without losing any document fidelity.
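To give a feel for why raw Open XML can be unfriendly for search, here’s a small Python sketch.  In WordprocessingML, the text of a single paragraph is often split across several runs (for example, when part of a word is bold), so a phrase you’d expect to find in one place is scattered over several elements.  This is only an illustration of the problem; it is not the Toolkit’s XQuery API, and I’m not claiming it mirrors what the CPF pipeline does internally.  The sample markup and the function name are made up for the example.

```python
import xml.etree.ElementTree as ET

# The standard WordprocessingML namespace.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
NS = {"w": W_NS}

# Hand-written sample: "MarkLogic" is split across two runs because the
# second half is bold. Illustrative markup, not output from Word.
SAMPLE = f"""<w:document xmlns:w="{W_NS}">
  <w:body>
    <w:p>
      <w:r><w:t>Mark</w:t></w:r>
      <w:r><w:rPr><w:b/></w:rPr><w:t>Logic</w:t></w:r>
    </w:p>
    <w:p>
      <w:r><w:t xml:space="preserve">Toolkit for </w:t></w:r>
      <w:r><w:t>Word</w:t></w:r>
    </w:p>
  </w:body>
</w:document>"""


def paragraph_texts(document_xml):
    """Return one string per w:p, joining the text of all of its w:t elements."""
    root = ET.fromstring(document_xml)
    return [
        "".join(t.text or "" for t in p.iterfind(".//w:t", NS))
        for p in root.iterfind(".//w:p", NS)
    ]


if __name__ == "__main__":
    for text in paragraph_texts(SAMPLE):
        print(text)
```

A search for "MarkLogic" would miss the first paragraph if it only looked at individual text elements, which is the kind of friction that server-side cleanup is meant to remove.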

NOTE: The Add-in and XQuery API can work in concert or separately.  If your authors use Office, it might make sense to use both.  If you’re querying Office documents (or some other XML format) on the Server and delivering results through a regular browser or some other consumer, you might not need Office on the client at all.  If you’re delivering an Office document as a result, you may still require the Office application on the client, but you don’t necessarily require the Add-in.  It all depends on your use cases and particular goals.

XQuery API docs are provided with each TK as well.

Sample Applications

Rather than just give developers an empty browser with a .js file and JavaScript API documentation to start development with, each TK comes with Sample Applications.  A developer can just drop the Sample right into MarkLogic, configure their Add-in to reference the URL of the Server, and quickly be up and running with applications within the Task Pane.

These Samples are VERY simple.  They provide just a sliver of the available API functionality.  Again, they’re intended to jumpstart development.  We provide these samples so a developer can quickly see some useful functionality, open the source to see how the code looks, and get in there and start hacking to create the app they actually want.  Developers can reference the API docs and add/change functionality as they require.  When you look at the docs you’ll see that there is a LOT that can be done on the client and in the Server that isn’t demonstrated in the Samples at all.

NOTE: The Sample Application is not the Toolkit.  It’s just one example of the type of application you can build using a TK.

NOTE: The Sample Application is not Office.  We skinned the samples to be the colors of Office, but it’s just HTML, JavaScript, and CSS.  Remember, the Pane is just a browser serving up pages from MarkLogic.  The goal is to keep authors comfortable in their authoring environment, letting MarkLogic do what it does best (search, reuse, enrich, analyze, etc.) and letting Office do what it does best (author, analyze, present).  If you want to use crazy colors and the blink tag for your app, go for it!

A Toolkit Guide rounds out the documentation with details on creating, configuring, and delivering solutions that use the Toolkits.


Office is ubiquitous.  The goal of the Toolkits is to keep authors authoring and analysts analyzing in the tools they are already using and comfortable with.  Office is a publisher and consumer of XML, and MarkLogic is an XML Server.  The products complement each other very nicely, and we can create a much richer Office experience for authors without requiring them to learn new, custom tools, or even be aware of the fact that behind the scenes, it’s all XML.


The Toolkits are all free and now available on CodePlex.  They are all open source, released under the Apache 2.0 license.

The response to the TKs has been very positive.  I’ve seen an increase in interest lately, and it’s been great to hear people are using these and finding them very useful.  I was surprised to hear at #MLUC10 how one person has deployed all 3 TKs across his organization and is very excited about the possibilities.  He also told me that multiple authors are enjoying the Sample apps in PowerPoint as-is.  Very cool!

I just have to say, I get excited too!  Each Office application has a different degree of XML friendliness and Word is by far the friendliest.  With the Toolkit for Word we can use Word as a browser into MarkLogic.  At work I send content back and forth between Word and MarkLogic and never have to save a local .docx on the client.  It’s just XML going back and forth.  Office consumes and publishes Open XML.  Using the XQuery API, I can dynamically create the XML Word requires for consumption from alternative XML formats. Also, Word publishes WordprocessingML, but my destination XML format isn’t necessarily always Office docs.  It’s pretty awesome, and that’s just Word!  Similar opportunities exist for Excel and PowerPoint as well.
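To make the "create the XML Word requires from alternative XML formats" idea a bit more concrete, here’s a rough Python sketch.  With the Toolkit you’d do this kind of thing on the Server with the XQuery API; this is just a stand-alone illustration, not that API.  The source format (`article` and `para` elements) and the function names are hypothetical, and the output is only the main document part.  A complete .docx package also needs content type and relationship parts.

```python
import xml.etree.ElementTree as ET

# The standard WordprocessingML namespace.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
ET.register_namespace("w", W_NS)


def w(tag):
    """Qualify a local name with the WordprocessingML namespace."""
    return f"{{{W_NS}}}{tag}"


def to_wordprocessingml(source_xml):
    """Turn <article><para>..</para></article> into a minimal w:document.

    The <article>/<para> source vocabulary is made up for this example.
    Each <para> becomes one paragraph holding a single run of plain text.
    """
    source = ET.fromstring(source_xml)
    document = ET.Element(w("document"))
    body = ET.SubElement(document, w("body"))
    for para in source.iter("para"):
        paragraph = ET.SubElement(body, w("p"))
        run = ET.SubElement(paragraph, w("r"))
        text = ET.SubElement(run, w("t"))
        text.text = "".join(para.itertext())
    return ET.tostring(document, encoding="unicode")


if __name__ == "__main__":
    article = "<article><para>Hello</para><para>World</para></article>"
    print(to_wordprocessingml(article))
```

The reverse direction works the same way: walk the paragraphs and runs Word publishes and emit whatever destination format you need.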

Spoiler Alert: there’s more awesome coming!

So that’s it.  You’re now Toolkit experts.  Go download the Toolkits and have fun creating your own MarkLogic applications for Office.  If you have any questions, comments, or suggestions for the TKs, please feel free to drop me a line in the comments.  Thanks!