Written by John DeVight on 2011-June-10
Download RAZOR Source Code
Download ASPX Source Code
Download WebForm Source Code
Overview
I recently needed to generate a legal document based on an existing Word document, and I had to preserve the exact format of that document. I decided that the best solution was to use the Open XML SDK 2.0 for Microsoft Office to generate the document. Because a Microsoft Office 2007 document is a ZIP package of XML files, it was easy to open the Word document as a ZIP file, extract the main XML document, turn that XML document into an XSL template, transform the template with my data into a new XML document, and then insert the new XML document into the Word document.
To generate an XML document containing all the data to apply to the XSL template for the transformation, I used the System.Xml.Serialization.XmlSerializer class to serialize my model object into XML.
The XML Template
Creating the XML Template
The fastest way to create the XML Template is to create a Word document in Microsoft Word and put sample data into it. The sample data marks the places where the xsl tags will go.
Once the Word document is finished, rename the Word document to have a ZIP file extension. Open the ZIP file, and in the ZIP file, open the word folder and copy the document.xml file. Paste the document.xml file in a location outside of the ZIP file. Rename the document.xml file as document.xslt.
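The manual rename-and-copy steps above can also be scripted. The following is a minimal sketch, not part of the original sample code: it assumes .NET 4.5 or later (for System.IO.Compression.ZipFile), and the TemplateExtractor class and its method name are hypothetical.

```csharp
using System;
using System.IO;
using System.IO.Compression; // ZipFile: .NET 4.5+ (reference System.IO.Compression.FileSystem on .NET Framework)

public static class TemplateExtractor
{
    // Copies word/document.xml out of a .docx package and saves it under the given path.
    public static void ExtractDocumentXml(string docxPath, string xsltPath)
    {
        using (ZipArchive archive = ZipFile.OpenRead(docxPath))
        {
            // The main document part of a Word file lives in the "word" folder.
            ZipArchiveEntry entry = archive.GetEntry("word/document.xml");
            if (entry == null)
                throw new InvalidDataException("word/document.xml not found in " + docxPath);

            using (Stream source = entry.Open())
            using (FileStream target = File.Create(xsltPath))
            {
                source.CopyTo(target);
            }
        }
    }

    public static void Main(string[] args)
    {
        // Default file names are assumptions; pass real paths on the command line.
        string docx = args.Length > 0 ? args[0] : "Website.docx";
        string xslt = args.Length > 1 ? args[1] : "document.xslt";
        ExtractDocumentXml(docx, xslt);
        Console.WriteLine("Wrote " + xslt);
    }
}
```

Reading the package directly means the .docx file never has to be renamed, so the original template stays usable in Word.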
Open the document.xslt file and replace the XML declaration (the <?xml ... ?> line) at the top of the file with the following:
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:ms="urn:schemas-microsoft-com:xslt">
  <xsl:template match="/">
At the end of the document.xslt file, add the following:
  </xsl:template>
</xsl:stylesheet>
Adding XSL tags
In the attached example, I am creating a Word document that shows some statistics for my website. I have some heading information to be populated in places and I have two tables to populate. The two types of xsl tags that I used are: xsl:value-of and xsl:for-each.
To display the website name, I used the xsl:value-of element. Since the Website class contains a property called Name, when the Website object is serialized into XML, the Name property looks like this:
<Website>
  <Name>ASP.NET Wiki</Name>
  ...
</Website>
So the xsl:value-of element that displays the Website name is:
<xsl:value-of select="Website/Name"/>
To display the table of the top 10 countries that visit the website, I used the xsl:for-each element. When Word creates a table, it creates a <w:tbl> element. Each row in the table is a <w:tr> element and each cell is a <w:tc> element. Since I put placeholder values in the Word document when it was originally created, it was easy to identify where the information is displayed, which is in a <w:t> element.
I put the xsl:for-each start tag before the <w:tr> start tag and the xsl:for-each end tag after the </w:tr> end tag, so that one table row is generated for each country. Then I used the xsl:value-of element to display each value in the appropriate cell. Here is a sample of what I did to display the list of top 10 countries:
<xsl:for-each select="Website/Statistics/CountryStatisticsList/CountryStatistics">
  <w:tr w:rsidR="004508CE" w:rsidTr="004508CE">
    <w:tc>
      ...
      <w:p w:rsidR="004508CE" w:rsidRDefault="004508CE">
        <w:r>
          <w:t>
            <xsl:value-of select="Name"/>
          </w:t>
        </w:r>
      </w:p>
    </w:tc>
    ...
  </w:tr>
</xsl:for-each>
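To try the xsl:value-of and xsl:for-each pattern without a full Word document, a small console sketch like the one below may help. It is an illustration under assumptions: plain <table>, <title> and <row> elements stand in for the WordprocessingML tags, the country names are made-up sample data, and the ForEachDemo class is hypothetical.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.Xsl;

public static class ForEachDemo
{
    // Simplified stand-in for the serialized Website object (sample data only).
    public const string SampleXml = @"<Website>
  <Name>ASP.NET Wiki</Name>
  <Statistics>
    <CountryStatisticsList>
      <CountryStatistics><Name>United States</Name></CountryStatistics>
      <CountryStatistics><Name>India</Name></CountryStatistics>
    </CountryStatisticsList>
  </Statistics>
</Website>";

    // <table>, <title> and <row> stand in for <w:tbl>, <w:t> and <w:tr> to keep the sketch short.
    public const string SampleXslt = @"<xsl:stylesheet version='1.0'
    xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>
  <xsl:output omit-xml-declaration='yes'/>
  <xsl:template match='/'>
    <table>
      <title><xsl:value-of select='Website/Name'/></title>
      <xsl:for-each select='Website/Statistics/CountryStatisticsList/CountryStatistics'>
        <row><xsl:value-of select='Name'/></row>
      </xsl:for-each>
    </table>
  </xsl:template>
</xsl:stylesheet>";

    // Applies the stylesheet to the XML and returns the result as a string.
    public static string Transform(string xml, string xslt)
    {
        XslCompiledTransform transform = new XslCompiledTransform();
        using (XmlReader stylesheet = XmlReader.Create(new StringReader(xslt)))
        {
            transform.Load(stylesheet);
        }

        using (XmlReader input = XmlReader.Create(new StringReader(xml)))
        using (StringWriter output = new StringWriter())
        {
            transform.Transform(input, null, output);
            return output.ToString();
        }
    }

    public static void Main()
    {
        // Prints one <title> element and one <row> element per CountryStatistics entry.
        Console.WriteLine(Transform(SampleXml, SampleXslt));
    }
}
```

Inside the xsl:for-each, the select expressions are relative to the current CountryStatistics element, which is why select='Name' is enough.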
Using the Microsoft XPath Extension Functions
In the example I need to be able to format date and time values. I used the ms:format-date and ms:format-time Microsoft XPath Extension Functions to do this.
The ms:format-date function takes two parameters: a date value in XSD format and a string containing the format for the date. Here is an example of how I used the ms:format-date function:
<xsl:value-of select="ms:format-date(Website/Statistics/StartDate, 'MMM dd, yyyy')"/>
The ms:format-time function takes two parameters: a time value in XSD format and a string containing the format for the time. Here is an example of how I used the ms:format-time function:
<xsl:value-of select="ms:format-time(AvgTimeOnSite, 'mm:ss')"/>
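The two extension functions can be tried in isolation with a sketch like the following. It assumes the stylesheet is run through XslCompiledTransform, as it is later in this article; the sample values and the FormatDateDemo class are hypothetical. The month abbreviation produced by 'MMM' depends on the culture of the machine, so no fixed output is shown.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.Xsl;

public static class FormatDateDemo
{
    // Hypothetical sample values in XSD date and XSD time format.
    public const string SampleXml =
        "<Statistics><StartDate>2011-06-01</StartDate><AvgTimeOnSite>00:03:25</AvgTimeOnSite></Statistics>";

    // The ms prefix must be bound to urn:schemas-microsoft-com:xslt for the functions to resolve.
    public const string SampleXslt = @"<xsl:stylesheet version='1.0'
    xmlns:xsl='http://www.w3.org/1999/XSL/Transform'
    xmlns:ms='urn:schemas-microsoft-com:xslt'>
  <xsl:output method='text'/>
  <xsl:template match='/'>
    <xsl:value-of select=""ms:format-date(Statistics/StartDate, 'MMM dd, yyyy')""/>
    <xsl:text> | </xsl:text>
    <xsl:value-of select=""ms:format-time(Statistics/AvgTimeOnSite, 'mm:ss')""/>
  </xsl:template>
</xsl:stylesheet>";

    // Applies the stylesheet to the XML and returns the text output.
    public static string Transform(string xml, string xslt)
    {
        XslCompiledTransform transform = new XslCompiledTransform();
        using (XmlReader stylesheet = XmlReader.Create(new StringReader(xslt)))
        {
            transform.Load(stylesheet);
        }

        using (XmlReader input = XmlReader.Create(new StringReader(xml)))
        using (StringWriter output = new StringWriter())
        {
            transform.Transform(input, null, output);
            return output.ToString();
        }
    }

    public static void Main()
    {
        // Prints the formatted date, a separator, and the formatted time.
        Console.WriteLine(Transform(SampleXml, SampleXslt));
    }
}
```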
Creating the Website Model Class
The Website model class is defined much like any other model class, with two things of note. I added a ToXml() method to convert an instance of the Website class into XML, and for properties of type DateTime I applied the System.Xml.Serialization.XmlElementAttribute to specify whether the XML data type is a date or a time.
Implementing the ToXml() Method with System.Xml.Serialization.XmlSerializer
For the example, I implemented the ToXml() method in the Website class, but it is generic enough to be placed in a base class. Here is the code:
public string ToXml()
{
    // Serialize into a StringWriter so the XML never has to be decoded
    // from a byte buffer. Serialization errors propagate to the caller.
    XmlSerializer serializer = new XmlSerializer(this.GetType());
    using (StringWriter writer = new StringWriter())
    {
        serializer.Serialize(writer, this);
        return writer.ToString();
    }
}
Implementing the System.Xml.Serialization.XmlElementAttribute for DateTime Properties
I have a few properties that hold a date and one property that holds a time. The XmlElementAttribute has several optional named parameters. The one that I set is DataType: for the date properties I set DataType to "date", and for the time property I set DataType to "time".
Here is the code:
[XmlElement(DataType = "date")]
public DateTime StartDate { get; set; }

[XmlElement(DataType = "time")]
public DateTime AvgTimeOnSite { get; set; }
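To see what the DataType setting does to the serialized XML, a cut-down sketch may be useful. The Statistics class below is a hypothetical stand-in that carries only the two DateTime properties, and the values are sample data.

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

// Hypothetical stand-in for the article's model; only the two DateTime properties are shown.
public class Statistics
{
    [XmlElement(DataType = "date")]
    public DateTime StartDate { get; set; }

    [XmlElement(DataType = "time")]
    public DateTime AvgTimeOnSite { get; set; }
}

public static class SerializationDemo
{
    // Serializes the object to an XML string.
    public static string Serialize(Statistics stats)
    {
        XmlSerializer serializer = new XmlSerializer(typeof(Statistics));
        using (StringWriter writer = new StringWriter())
        {
            serializer.Serialize(writer, stats);
            return writer.ToString();
        }
    }

    public static void Main()
    {
        Statistics stats = new Statistics
        {
            StartDate = new DateTime(2011, 6, 1),
            // Only the time-of-day part is written for a "time" element.
            AvgTimeOnSite = new DateTime(2011, 6, 1, 0, 3, 25)
        };
        Console.WriteLine(Serialize(stats));
    }
}
```

With DataType = "date" the element contains only the date part (2011-06-01 here), and with DataType = "time" it contains only the time part, which is the XSD format that ms:format-date and ms:format-time expect.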
Generating the Word Document using the Open XML SDK 2.0 for Microsoft Office
The first step in generating the Word document is to convert the instance of the Website class to an XML string and then transform that data with the XSL file using the XslCompiledTransform class. The resulting XML is the document.xml that gets placed in the Word document.
I then open the Word document that I created with the placeholder data in it and replace the document.xml file in the Word document with the new document.xml that I created using the XslCompiledTransform class. To do this, I load the Word document into a MemoryStream and then use the DocumentFormat.OpenXml.Packaging.WordprocessingDocument class to open the MemoryStream. I then create an instance of the DocumentFormat.OpenXml.Wordprocessing.Body class from the XML string created by the transformation, set it as the new body for the Word document, and close the Word document. Finally, I send the MemoryStream as a response back to the browser and the user is prompted to open or save the Word document.
I put the code for performing the transformation and replacing the Word document body in a method called HomeController.GenerateWordDocument() in the MVC samples and _Default.ExportToWordButton_OnClick() in the WebForm sample. It is a generic method that could be placed in a base class or in another class that gets called by the HomeController / _Default class. Here is the code:
private static void GenerateWordDocument(string xmlBody, string xsltBody, ref MemoryStream templateDocumentStream)
{
    // Create the Xsl Transformation object and load the stylesheet.
    XslCompiledTransform transform = new XslCompiledTransform();
    using (XmlReader xsltReader = XmlReader.Create(new StringReader(xsltBody)))
    {
        transform.Load(xsltReader);
    }

    // Transform the xml data into Open XML 2.0 Wordprocessing format.
    // Disposing the XmlWriter flushes the output into the StringWriter.
    StringWriter sw = new StringWriter();
    using (XmlReader xmlReader = XmlReader.Create(new StringReader(xmlBody)))
    using (XmlWriter xw = XmlWriter.Create(sw))
    {
        transform.Transform(xmlReader, xw);
    }

    // Create an Xml Document with the new content.
    XmlDocument wordBody = new XmlDocument();
    wordBody.LoadXml(sw.ToString());

    // Use the Open XML SDK version 2.0 to open the output document in edit mode.
    using (WordprocessingDocument output = WordprocessingDocument.Open(templateDocumentStream, true))
    {
        // The document element is <w:document>, so its inner XML is the
        // <w:body> element. Use it to create a new Open Xml Body object.
        Body updatedBodyContent = new Body(wordBody.DocumentElement.InnerXml);

        // Replace the existing Document Body with the new content.
        output.MainDocumentPart.Document.Body = updatedBodyContent;

        // The using block closes the output document.
    }
}
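The body replacement step can be tried on its own, without a template file or a stylesheet. The sketch below is an illustration under assumptions: it needs a reference to the Open XML SDK (DocumentFormat.OpenXml), it builds a one-paragraph document in memory to stand in for the template, and the BodySwapDemo class and its method names are hypothetical.

```csharp
using System;
using System.IO;
using DocumentFormat.OpenXml;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Wordprocessing;

public static class BodySwapDemo
{
    // Outer XML of a <w:body> element, as the XSL transformation would produce it.
    public const string NewBodyXml =
        "<w:body xmlns:w='http://schemas.openxmlformats.org/wordprocessingml/2006/main'>" +
        "<w:p><w:r><w:t>ASP.NET Wiki</w:t></w:r></w:p></w:body>";

    // Builds a one-paragraph .docx in memory to stand in for the template document.
    public static MemoryStream CreateTemplate()
    {
        MemoryStream stream = new MemoryStream();
        using (WordprocessingDocument doc =
            WordprocessingDocument.Create(stream, WordprocessingDocumentType.Document))
        {
            MainDocumentPart main = doc.AddMainDocumentPart();
            main.Document = new Document(new Body(new Paragraph(new Run(new Text("placeholder")))));
            main.Document.Save();
        }
        return stream;
    }

    // Opens the package in edit mode and swaps in a Body built from outer XML.
    public static void ReplaceBody(MemoryStream docx, string bodyOuterXml)
    {
        using (WordprocessingDocument doc = WordprocessingDocument.Open(docx, true))
        {
            doc.MainDocumentPart.Document.Body = new Body(bodyOuterXml);
            doc.MainDocumentPart.Document.Save();
        }
    }

    // Reopens the package read-only and returns the text of the body.
    public static string ReadBodyText(MemoryStream docx)
    {
        using (WordprocessingDocument doc = WordprocessingDocument.Open(docx, false))
        {
            return doc.MainDocumentPart.Document.Body.InnerText;
        }
    }

    public static void Main()
    {
        MemoryStream docx = CreateTemplate();
        ReplaceBody(docx, NewBodyXml);
        Console.WriteLine(ReadBodyText(docx));
    }
}
```

Note that the XML passed to the Body constructor has to carry its own xmlns:w declaration, which is why the method above takes the inner XML of the transformed <w:document> element rather than a bare fragment.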
The HomeController.ExportToWord() method is called by a javascript function.
Here is the code for the HomeController.ExportToWord() method:
public ActionResult ExportToWord()
{
    string xml = GetWebsite().ToXml();
    string templateDocumentPath = Path.Combine(Server.MapPath("~/App_Data"), "Website.docx");
    string xsltBodyPath = Path.Combine(Server.MapPath("~/App_Data"), "document.xslt");

    // Copy the template into an expandable MemoryStream so that the
    // file on disk is never locked or modified.
    byte[] templateDocumentByteArray = System.IO.File.ReadAllBytes(templateDocumentPath);
    MemoryStream templateDocumentStream = new MemoryStream();
    templateDocumentStream.Write(templateDocumentByteArray, 0, templateDocumentByteArray.Length);

    string xsltBody = System.IO.File.ReadAllText(xsltBodyPath);

    GenerateWordDocument(xml, xsltBody, ref templateDocumentStream);

    byte[] fileContent = templateDocumentStream.ToArray();
    templateDocumentStream.Close();

    // File() sets the Content-Disposition header from the download name.
    return File(fileContent,
        "application/vnd.openxmlformats-officedocument.wordprocessingml.document",
        "Website.docx");
}
Here is the code for the javascript function:
window.location.replace('/Home/ExportToWord/');
References
- Open XML SDK 2.0 for Microsoft Office
- Microsoft XPath Extension Functions
- XSLT <xsl:value-of> Element
- XSLT <xsl:for-each> Element
Thanks for this post. This is what I was looking for. Most examples tend to assume the developer is building the doc from scratch in VBA or equivalent, which is unrealistic in practice, as most likely the business will come to them and say "Make these for me".
I have a very large, complex requirements document to build based on an ISO approved Word doc, so building from scratch would be impractical.
A couple of questions, though.
1. Will the process above work OK with auto numbering? So if I were to wrap the required Word XML with a "for-each", would it auto number?
2. Generally our servers are Linux based and not MS. I assume I will need to fire up an MS based server. I would need to load .NET and the SDK; does it need to be running IIS, or will Apache work? Anything else I would need to load?
Thanks for the help.
Mike
Hi Mike,
1. The process will work with auto numbering.
2. Yes, you will need an MS based server, .NET, the SDK. You will not need IIS. Apache will work just fine.
Regards,
John DeVight
Telerik MVP