I thought I’d use this post to provide a brief introduction to the MarkLogic Toolkits for Office. So here’s an overview:
What is a Toolkit?
We care about Word, Excel, and PowerPoint because, with Office 2007, their respective document formats are now XML. Take a .docx, .xlsx, or .pptx and change its file extension to .zip. Extract the archive and inside you’ll find a bunch of interrelated XML parts.
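If you’d rather not rename files by hand, here is a minimal Python sketch of the same idea: it opens a package as a zip archive and lists the parts inside. The `list_parts` and `build_demo_package` names are made up for this illustration, and the demo package is a tiny in-memory stand-in, not a valid Word document.

```python
import io
import zipfile


def list_parts(package):
    """Return the sorted part names inside an Office Open XML package.

    `package` may be a path to a .docx/.xlsx/.pptx or a file-like object.
    """
    with zipfile.ZipFile(package) as zf:
        return sorted(zf.namelist())


def build_demo_package():
    """Build a tiny stand-in package in memory.

    Hypothetical content for illustration only; Word would not open this.
    """
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("[Content_Types].xml", "<Types/>")
        zf.writestr("_rels/.rels", "<Relationships/>")
        zf.writestr("word/document.xml", "<document/>")
    buf.seek(0)
    return buf


if __name__ == "__main__":
    for name in list_parts(build_demo_package()):
        print(name)
```

Pointing `list_parts` at the path of a real .docx should show the same kind of listing, with many more parts.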
This update to the document formats provides an interesting opportunity: people can now work with XML without learning new, specialized tools, or even being aware that they’re working with XML. Authors continue to use the Office tools they already know, and we can provide additional functionality to them by taking advantage of the XML. The Toolkits let us enhance the authoring experience within Office, and on the Server they let us prepare content for Office as well as for other consumers.
Each Toolkit is composed of 3 major components:
- Add-in for Word | Excel | PowerPoint
- XQuery API
- Sample Applications
The Add-in is just a standard Windows application you install using a .msi. Double-click the .msi to start installation, click Next through the dialog screens as you would with any Windows app, and the next time you start Office you’ll find a Task Pane on the right-hand side of the application (see image below).
NOTE: The Task Pane is just a browser! It’s using whatever version of IE is installed on the client, and exposing that within Office.
We wanted to avoid creating a situation where users had to constantly re-install Add-ins on the client. By making it a browser, we can update functionality by simply changing the application code on the Server.
The XQuery APIs for Word, Excel, and PowerPoint provide functions for developers to manipulate and generate Office documents on the Server.
The goal is to simplify the use of Open XML. Along with each XQuery API, a CPF pipeline for MarkLogic is also provided that will automatically update Office documents on the Server as they are ingested to make all the XML more friendly for search and reuse. These updates are done without using any custom XML and without losing any document fidelity.
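To give a feel for why raw Open XML benefits from some tidying before search, here is a small Python sketch. Word frequently splits the text of one paragraph across several runs (`w:r`), so a word like "MarkLogic" may not appear in any single text node. The sketch joins the `w:t` text of each paragraph back together. It illustrates the problem only; it is not what the Toolkit's CPF pipeline does, and the sample XML and the `paragraph_texts` name are made up for this example.

```python
import xml.etree.ElementTree as ET

# The WordprocessingML main namespace.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
NS = {"w": W_NS}

# Hand-written sample (an assumption, not taken from a real document):
# the first paragraph's text is split across three runs.
SAMPLE = f"""<w:document xmlns:w="{W_NS}">
  <w:body>
    <w:p>
      <w:r><w:t xml:space="preserve">Mark</w:t></w:r>
      <w:r><w:rPr><w:b/></w:rPr><w:t>Logic</w:t></w:r>
      <w:r><w:t xml:space="preserve"> Toolkits</w:t></w:r>
    </w:p>
    <w:p><w:r><w:t>for Office</w:t></w:r></w:p>
  </w:body>
</w:document>"""


def paragraph_texts(document_xml):
    """Return one string per w:p, joining the text of all its w:t nodes."""
    root = ET.fromstring(document_xml)
    return [
        "".join(t.text or "" for t in p.iterfind(".//w:t", NS))
        for p in root.iterfind(".//w:p", NS)
    ]


if __name__ == "__main__":
    for text in paragraph_texts(SAMPLE):
        print(text)
```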
NOTE: The Add-in and XQuery API can work in concert or separately. If your authors use Office, it might make sense to use both. If you’re querying Office documents (or some other XML format) on the Server, and delivering results through a regular browser or some other consumer, you might not need Office on the client at all. If you’re delivering an Office document as a result, you may still require the Office application on the client, but you don’t necessarily require the Add-in. It all depends on your use cases and particular goals.
XQuery API docs are provided with each TK as well.
These Samples are VERY simple. They provide just a sliver of the available API functionality. Again, they’re intended to jumpstart development. We provide these samples so a developer can quickly see some useful functionality, open the source to see how the code looks, and get in there and start hacking to create the app they actually want. Developers can reference the API docs and add/change functionality as they require. When you look at the docs you’ll see that there is a LOT that can be done on the client and in the Server that isn’t demonstrated in the Samples at all.
NOTE: The Sample Application is not the Toolkit. It’s just one example of the type of application you can build using a TK.
A Toolkit Guide rounds out the documentation with details on creating, configuring, and delivering solutions that use the Toolkits.
Office is ubiquitous. The goal of the Toolkits is to keep authors authoring and analysts analyzing in the tools they are already comfortable with. Office is a publisher and consumer of XML, and MarkLogic is an XML Server. The products complement each other very nicely, and we can create a much richer Office experience for authors without requiring them to learn new, custom tools, or even be aware that behind the scenes, it’s all XML.
The response to the TKs has been very positive. I’ve seen an increase in interest lately, and it’s been great to hear people are using these and finding them very useful. I was surprised to hear at #MLUC10 how one person has deployed all 3 TKs across his organization and is very excited about the possibilities. He also told me that multiple authors are enjoying the Sample apps in PowerPoint as-is. Very cool!
I just have to say, I get excited too! Each Office application has a different degree of XML friendliness and Word is by far the friendliest. With the Toolkit for Word we can use Word as a browser into MarkLogic. At work I send content back and forth between Word and MarkLogic and never have to save a local .docx on the client. It’s just XML going back and forth. Office consumes and publishes Open XML. Using the XQuery API, I can dynamically create the XML Word requires for consumption from alternative XML formats. Also, Word publishes WordprocessingML, but my destination XML format isn’t necessarily always Office docs. It’s pretty awesome, and that’s just Word! Similar opportunities exist for Excel and PowerPoint as well.
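To make "creating the XML Word requires from alternative XML formats" a little more concrete, here is a rough Python sketch of the idea. It is not the Toolkit's XQuery API. It turns a made-up `<article>`/`<para>` format into the WordprocessingML for a main document part, with one `w:p` per paragraph and one run (`w:r`) holding the text (`w:t`). The source format and the `to_wordprocessingml` name are assumptions for illustration, and a real .docx also needs the surrounding package parts.

```python
import xml.etree.ElementTree as ET

# The WordprocessingML main namespace.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
ET.register_namespace("w", W_NS)


def w(tag):
    """Qualify a local name with the WordprocessingML namespace."""
    return f"{{{W_NS}}}{tag}"


def to_wordprocessingml(source_xml):
    """Convert <article><para>...</para></article> (a hypothetical source
    format) into a w:document string with one w:p per <para>."""
    source = ET.fromstring(source_xml)
    document = ET.Element(w("document"))
    body = ET.SubElement(document, w("body"))
    for para in source.iter("para"):
        p = ET.SubElement(body, w("p"))
        r = ET.SubElement(p, w("r"))
        t = ET.SubElement(r, w("t"))
        t.text = para.text or ""
    return ET.tostring(document, encoding="unicode")


if __name__ == "__main__":
    print(to_wordprocessingml(
        "<article><para>Hello</para><para>World</para></article>"
    ))
```

Going the other way, from WordprocessingML to some other destination format, is the same kind of tree-to-tree transformation in reverse.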
Spoiler Alert: there’s more awesome coming!
So that’s it. You’re now Toolkit experts. Go download the Toolkits and have fun creating your own MarkLogic applications for Office. If you have any questions, comments, or suggestions for the TKs, please feel free to drop me a line in the comments. Thanks!