Add DOC to PDF and Other Conversions to Microsoft Office SharePoint Server 2007 with Aspose Components

Introduction

This article describes how you can add an ability to convert DOC to PDF (DOC2PDF) to Microsoft Office SharePoint Server 2007 (MOSS) using Aspose.Words and Aspose.Pdf.

In this article, you will create a small console application in Visual Studio that works as a document converter for SharePoint and invokes the Aspose components to perform the conversion.

It is easy to add other types of conversions, such as DOC to RTF, RTF to DOC, DOC to WordML, WordML to DOC and HTML to DOC, by following the example in this article. You can also investigate other Aspose file format components, such as Aspose.Cells and Aspose.Slides, and use them to support even more types of document conversions in SharePoint.

Document Converters in SharePoint

Microsoft Office SharePoint Server 2007 includes a new feature that lets you convert documents from one format (content type) to another. You can use document conversions to transform your content to suit your business requirements. You can invoke the conversion from the user interface or programmatically via the SharePoint Object Model.

Built-in Document Converters in SharePoint

SharePoint includes several document converters that you can use out of the box:

  • .DOCX (Office Open XML) to HTML web page (also .DOCM to web page)
  • InfoPath to HTML web page
  • XML to HTML web page

Converting a Word Document (DOCX) to a Web Page using a built-in MOSS document converter.

Screenshot - image002.jpg

Need for More Document Converters

The set of document converters included with MOSS is limited. You can only convert DOCX, InfoPath and XML documents into web pages.

There are many possible scenarios where additional converters might be required:

  • When a draft document is stored in one format (Microsoft Word DOC) and the final document is published in another format (Adobe PDF) to a customer-facing site.
  • When the main format for documents is DOCX inside the organization, but it needs to make the documents available to its customers and partners as DOC documents and vice versa.

Extensible Document Converter Framework

Thankfully, the document converter framework in SharePoint is extensible. Custom converters can be implemented and integrated seamlessly into SharePoint, so any required content type conversion can be supported.

MSDN has a good section on document converters, Document Converters Overview. Although it is geared towards developers implementing a custom document converter, it is good reading for any IT professional who has to plan or support document converters in SharePoint.

Summary of Document Converters in SharePoint

To summarize the features of Document Converters in SharePoint:

  • Extensible. Custom converters can be added to facilitate almost any content conversion.
  • A Document Converter is an executable. You can develop one or find a suitable commercial product.
  • Document conversions are usually resource intensive; they run on the server(s) and are controlled by the SharePoint load balancer service.
  • Documents that result from a conversion can be versioned, and they maintain a link to the original document in their metadata, history, properties, etc.

Aspose to the Rescue

Aspose provides a great line of .NET and Java components. Trusted by thousands of customers worldwide, the products include File Format Components, Reporting Products, Visual Components and Utility Components.

Aspose File Format Components include products such as Aspose.Words, Aspose.Cells, Aspose.Pdf and Aspose.Slides that let you programmatically open, modify, generate, save, merge and convert documents in various formats, including DOC, RTF, WordML, HTML, PDF, XLS and PPT. These products are class libraries that developers use when building .NET or Java applications that need access to documents in different formats.

Aspose File Format Components are often chosen over Microsoft Office Automation for their superior performance, scalability and stability in a server environment. Microsoft Office Automation is not recommended on the server.

While Aspose components cannot be directly used as document converters for SharePoint out of the box, this article shows how you can easily create a small .NET application that wraps an Aspose component and works as a document converter for SharePoint.

Create a Document Converter for MOSS

A document converter for SharePoint (MOSS) is a custom executable that SharePoint calls with command line arguments. The arguments specify the input, output, configuration and log files. The command line arguments are described in detail in Document Converter Run Command in MSDN.

We are going to create a simple console application in Visual Studio 2005 that supports the command line arguments passed by SharePoint and performs the DOC to PDF conversion using Aspose.Words and Aspose.Pdf.

In this example we use Visual Studio 2005 and build the application for .NET 2.0. You can also use Visual Studio 2003 and build the document converter for .NET 1.1; it will work just as well. SharePoint places no requirements on the .NET version of a document converter. In fact, a document converter does not have to be a .NET application at all; it just needs to be an executable.

Download and Install Aspose Components

You need to download Aspose.Words for .NET and Aspose.Pdf for .NET from Aspose Downloads. You can either download Aspose.Total or download the individual products.

Install Aspose.Words and Aspose.Pdf on your development computer. All Aspose components, when installed, work in evaluation mode. The evaluation mode has no time limit and injects watermarks into produced documents.

Create a Project

Start Visual Studio 2005 and create a new console application. This example will show a C# console application, but you can use VB.NET too.

Add References

This project will use Aspose.Words and Aspose.Pdf. Add references to both Aspose.Words and Aspose.Pdf to your project:

  1. Add a reference to C:\Program Files\Aspose\Aspose.Words\Bin\net2.0\Aspose.Words.dll
  2. Add a reference to C:\Program Files\Aspose\Aspose.Pdf\Bin\net2.0\Aspose.Pdf.dll

Screenshot - image004.jpg

Add Code

The following is the complete code of the document converter in the Class1.cs file.

using System;
using System.IO;

namespace AsposeDoc2Pdf
{
    /// <summary>
    /// DOC2PDF document converter for SharePoint.
    /// Uses Aspose.Words and Aspose.Pdf to perform the conversion.
    /// </summary>
    class Class1
    {
        /// <summary>
        /// The main entry point for the application.
        /// </summary>
        [STAThread]
        static void Main(string[] args)
        {
            // Although SharePoint passes "-log <filename>" to us and we are
            // supposed to log there, for the sake of simplicity, we will use
            // our own hard coded path to the log file.
            //
            // Make sure there are permissions to write into this folder.
            // The document converter will be called under the document
            // conversion account (not sure what name), so for testing purposes
            // I would give the Users group write permissions into this folder.
            gLog = new StreamWriter(@"C:\Aspose2Pdf\log.txt", true);
            try
            {
                gLog.WriteLine(DateTime.Now.ToString() + " Started");
                gLog.WriteLine(Environment.CommandLine);
                ParseCommandLine(args);

                // Uncomment the code below when you have purchased licenses for
                // Aspose products. You can purchase licenses for individual
                // products such as Aspose.Words and Aspose.Pdf or you can
                // purchase Aspose.Total license for all Aspose components.
                //
                // You need to deploy the license in the same folder as your
                // executable, alternatively you can add the license file as an
                // embedded resource to your project.
                //
                // // Set license for Aspose.Words.
                // Aspose.Words.License wordsLicense = new Aspose.Words.License();
                // wordsLicense.SetLicense("Aspose.Total.lic");
                //
                // // Set license for Aspose.Pdf.
                // Aspose.Pdf.License pdfLicense = new Aspose.Pdf.License();
                // pdfLicense.SetLicense("Aspose.Total.lic");

                ConvertDoc2Pdf(gInFileName, gOutFileName);
            }
            catch (Exception e)
            {
                gLog.WriteLine(e.Message);
                Environment.ExitCode = 100;
            }
            finally
            {
                gLog.Close();
            }
        }

        private static void ParseCommandLine(string[] args)
        {
            int i = 0;
            while (i < args.Length)
            {
                string s = args[i];
                switch (s.ToLower())
                {
                    case "-in":
                        i++;
                        gInFileName = args[i];
                        break;
                    case "-out":
                        i++;
                        gOutFileName = args[i];
                        break;
                    case "-config":
                        // Skip the name of the config file and do nothing.
                        i++;
                        break;
                    case "-log":
                        // Skip the name of the log file and do nothing.
                        i++;
                        break;
                    default:
                        throw new Exception("Unknown command line argument: " + s);
                }
                i++;
            }
        }

        private static void ConvertDoc2Pdf(string inFileName, string outFileName)
        {
            // Load the DOC file into Aspose.Words.
            // You can load not only DOC here, but any format supported by
            // Aspose.Words: DOC, RTF, WordML, HTML.
            Aspose.Words.Document srcDoc = new Aspose.Words.Document(inFileName);

            // If the document contains any images, they will be saved as temporary
            // files (deleted by Aspose.Pdf when conversion is finished).
            //
            // Since we will be saving the document into a memory stream, Aspose.Words
            // will attempt to save the images into the system temporary folder.
            //
            // This could bring up issues with permissions, therefore, it is better
            // to specify where to save the images explicitly so we can be sure there
            // will be no problems with permissions. Let's save them into the same
            // folder as the output file.
            srcDoc.SaveOptions.ExportImagesFolder = Path.GetDirectoryName(outFileName);

            // Save the DOC file as Aspose.Pdf.Xml in memory.
            MemoryStream xmlDoc = new MemoryStream();
            srcDoc.Save(xmlDoc, Aspose.Words.SaveFormat.AsposePdf);
            xmlDoc.Position = 0;

            // Read the document into Aspose.Pdf.
            Aspose.Pdf.Pdf pdf = new Aspose.Pdf.Pdf();
            pdf.BindXML(xmlDoc, null);

            // Instruct Aspose.Pdf to delete temporary image files.
            pdf.IsImagesInXmlDeleteNeeded = true;

            // Produce the PDF file.
            pdf.Save(outFileName);
        }

        private static string gInFileName;
        private static string gOutFileName;
        private static StreamWriter gLog;
    }
}

Select the Release configuration and rebuild the solution.

You now have the AsposeDoc2Pdf.exe executable that can be used as a document converter for SharePoint.
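
The converter above skips the -log argument and writes to a hard-coded path. If you would rather log to the file SharePoint names, the parsing could be sketched as follows. This is only a sketch: the ConverterArgs class and its members are hypothetical names, it assumes every argument is followed by a value, and the log format SharePoint expects is not covered in this article.

```csharp
using System;
using System.IO;

// Hypothetical helper, not part of the original converter.
class ConverterArgs
{
    public string InFileName;
    public string OutFileName;
    public string ConfigFileName;
    public string LogFileName;

    // Parses "-in <file> -out <file> -config <file> -log <file>".
    public static ConverterArgs Parse(string[] args)
    {
        ConverterArgs result = new ConverterArgs();
        for (int i = 0; i < args.Length; i += 2)
        {
            string name = args[i].ToLower();
            if (i + 1 >= args.Length)
                throw new ArgumentException("Missing value for argument: " + name);
            string value = args[i + 1];
            switch (name)
            {
                case "-in": result.InFileName = value; break;
                case "-out": result.OutFileName = value; break;
                case "-config": result.ConfigFileName = value; break;
                case "-log": result.LogFileName = value; break;
                default:
                    throw new ArgumentException("Unknown command line argument: " + name);
            }
        }
        return result;
    }

    // Opens the log named by -log, or the fallback path when -log was absent.
    public StreamWriter OpenLog(string fallbackPath)
    {
        string path = (LogFileName != null) ? LogFileName : fallbackPath;
        return new StreamWriter(path, true); // true = append
    }
}
```

In Main you would call ConverterArgs.Parse(args) first and then OpenLog(@"C:\Aspose2Pdf\log.txt"), so the hard-coded path is used only when no -log argument was passed.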

How to Build Converters for Other Formats

It is very easy to build more document converters. Aspose.Words supports DOC, RTF, WordML and HTML documents and can perform conversions between these formats in any direction. Conversions between Microsoft Word formats (DOC, RTF and WordML) are high-fidelity, meaning no content or formatting in the document is lost.

For example, to create an RTF to WordML converter with Aspose.Words, you will use the following code:

// Load the RTF file into Aspose.Words.
Aspose.Words.Document doc = new Aspose.Words.Document(inFileName);
// Save the document in the WordprocessingML format.
doc.Save(outFileName, Aspose.Words.SaveFormat.WordML);
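
If one executable should serve several target formats, the save format could be chosen from the output file extension. The sketch below is only an illustration: FormatPicker is a hypothetical name, it assumes the output file name carries the target extension (this article does not verify what SharePoint passes), and of the member names it returns only "WordML" appears in this article. "Doc", "Rtf" and "Html" are assumed names that you should check against your Aspose.Words version.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical helper: maps an output file extension to the NAME of an
// Aspose.Words.SaveFormat member. Only "WordML" is confirmed by this
// article; the other names are assumptions.
static class FormatPicker
{
    public static string SaveFormatNameFor(string outFileName)
    {
        Dictionary<string, string> map = new Dictionary<string, string>();
        map[".doc"] = "Doc";
        map[".rtf"] = "Rtf";
        map[".xml"] = "WordML";
        map[".html"] = "Html";
        map[".htm"] = "Html";

        string ext = Path.GetExtension(outFileName).ToLower();
        string name;
        if (!map.TryGetValue(ext, out name))
            throw new NotSupportedException("No save format known for extension: " + ext);
        return name;
    }
}
```

Inside the converter, the returned name could then be turned into the enum value with Enum.Parse(typeof(Aspose.Words.SaveFormat), name) and passed to doc.Save.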

Deploy a Document Converter to MOSS

The document converter for SharePoint must be packaged as a SharePoint Feature and deployed at the Web-application level.

If you need an overview of deploying document converters as SharePoint Features, see the relevant topics in MSDN.

A Feature in SharePoint is a unit of functionality that can be added to or removed from a SharePoint server. A feature is defined in an XML file that describes the feature, its name, scope and required files. The feature definition XML and accompanying files must be placed in a folder in the C:\Program Files\Common Files\Microsoft Shared\web server extensions\12\TEMPLATE\FEATURES folder.

Each feature needs to have a Feature.xml file that specifies the feature name, unique id, scope and the elements that comprise the feature.

Create a Folder for the Feature

Create the C:\Program Files\Common Files\Microsoft Shared\web server extensions\12\TEMPLATE\FEATURES\AsposeDoc2Pdf folder on the SharePoint server.

Create a Feature Definition XML File

In the feature folder, create the Feature.xml as shown below.

Content of the Feature.xml file.

<Feature xmlns="http://schemas.microsoft.com/sharepoint/"
Id="{b4ce4c29-8aaf-4b80-bb63-d676e836f8ef}"
Title="DOC to PDF Converter (by Aspose)"
Description="Makes it possible to convert documents from DOC to PDF."
Scope="WebApplication">
<ElementManifests>
<ElementManifest Location="Elements.xml"/>
<ElementFile Location="AsposeDoc2Pdf.exe"/>
<ElementFile Location="Aspose.Words.dll"/>
<ElementFile Location="Aspose.Pdf.dll"/>
</ElementManifests>
</Feature>

If you create more converters later on, pick a different GUID for the feature. The easiest way to generate a unique GUID is to use the Tools / Create GUID menu in Visual Studio.
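
A GUID can also be generated in code. The sketch below uses the standard System.Guid type; GuidMaker is a hypothetical name chosen for illustration.

```csharp
using System;

static class GuidMaker
{
    // Returns a new GUID in the braced form used by the Id attributes,
    // that is {xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx}.
    public static string NewBracedGuid()
    {
        return Guid.NewGuid().ToString("B");
    }
}
```

Call it once for each Id you need, because a feature and a converter must not share the same value.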

Create a Document Converter Definition XML File

The ElementManifest element in the Feature.xml file refers to the Elements.xml file. This file contains the actual definition of the document converter. The definition of the document converter includes unique id, display name, the name of the executable to launch and the extensions of the source and destination content types.

In the feature folder, create the Elements.xml as shown below.

Content of the Elements.xml file.

<Elements xmlns="http://schemas.microsoft.com/sharepoint/">
<DocumentConverter Id="{a4df1dac-a22c-431a-bbf6-dcc91848fee9}"
Name="Word Document to PDF (by Aspose)"
App="AsposeDoc2Pdf.exe"
From="doc"
To="pdf"
/>
</Elements>

If you create more converters later on, pick a different GUID for the converter. The easiest way to generate a unique GUID is to use the Tools / Create GUID menu in Visual Studio.
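
Before installing the feature, it may be worth checking that both XML files parse at all, since a slip such as an unquoted attribute value makes a file unreadable. A minimal sketch using the standard XmlDocument class follows; FeatureXmlCheck is a hypothetical name, and the check covers well-formedness only, not the SharePoint schema.

```csharp
using System;
using System.Xml;

static class FeatureXmlCheck
{
    // Returns null when the file parses as well-formed XML,
    // otherwise the parser's error message.
    public static string CheckWellFormed(string path)
    {
        try
        {
            XmlDocument doc = new XmlDocument();
            doc.Load(path);
            return null;
        }
        catch (XmlException e)
        {
            return e.Message;
        }
    }
}
```

Run it against Feature.xml and Elements.xml in the feature folder; a non-null result includes the line and position of the first problem.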

AsposeDoc2Pdf is now deployed as a SharePoint Feature.

Screenshot - image006.jpg

Enable Document Converters

You need to enable document conversions in SharePoint as they seem to come disabled by default.

Go to the Central Administration / Application Management / Configure Document Conversion screen and enable document conversions.

Enable document conversions in MOSS.

Screenshot - image008.jpg

It is a good idea to check that the document conversion services are installed and running. In my case they were installed and running.

Go to Central Administration / Operations / Services on Server and make sure that the Document Conversions Launcher Service and the Document Conversions Load Balancer Service are installed and running.

Check Document Conversion services are installed and running.

Screenshot - image010.jpg

Install the Document Converter Feature

Now we need to install the feature so the document converter becomes available in SharePoint. Execute the following command on the server:

"C:\Program Files\Common Files\Microsoft Shared\web server extensions\12\BIN\STSADM.EXE"
-o installfeature -filename AsposeDoc2Pdf\Feature.xml -force

Activate the Document Converter Feature

Now we need to activate the document converter. Execute the following command on the server:

"C:\Program Files\Common Files\Microsoft Shared\web server extensions\12\BIN\STSADM.EXE"
-o activatefeature -name AsposeDoc2Pdf -url http://win2k3r2ee

Note that you need to specify a URL in the -url argument. I have not fully figured out exactly which URL must be specified; I just specified the name of my SharePoint server and it worked, making the document converter available to all SharePoint sites on this server.
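
If you deploy converters often, the two STSADM commands can be run from a small program. The sketch below uses the standard System.Diagnostics.Process class and builds the same arguments as the commands above. The Stsadm class and its methods are hypothetical names, and treating a zero exit code as success is an assumption that this article does not confirm.

```csharp
using System;
using System.Diagnostics;

static class Stsadm
{
    const string ExePath =
        @"C:\Program Files\Common Files\Microsoft Shared\web server extensions\12\BIN\STSADM.EXE";

    // Arguments for installing the feature found in the given FEATURES subfolder.
    public static string InstallArgs(string featureFolder)
    {
        return "-o installfeature -filename " + featureFolder + @"\Feature.xml -force";
    }

    // Arguments for activating the feature for the given URL.
    public static string ActivateArgs(string featureName, string url)
    {
        return "-o activatefeature -name " + featureName + " -url " + url;
    }

    // Runs STSADM with the given arguments and returns its exit code.
    public static int Run(string arguments)
    {
        ProcessStartInfo info = new ProcessStartInfo(ExePath, arguments);
        info.UseShellExecute = false;
        using (Process p = Process.Start(info))
        {
            p.WaitForExit();
            return p.ExitCode;
        }
    }
}
```

For this article's converter you would call Stsadm.Run(Stsadm.InstallArgs("AsposeDoc2Pdf")) and then Stsadm.Run(Stsadm.ActivateArgs("AsposeDoc2Pdf", "http://win2k3r2ee")), replacing the URL with your own server.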

Now is a good time to verify that the feature was indeed installed and activated. In my case, I found I still needed to click the Activate button in the SharePoint Central Administration / Application Management / Manage Web Application Features window.

Make sure the new document converter feature is activated in the Manage Web Application Features window.

Screenshot - image012.jpg

Copy the Document Converter Files!

After installing and activating the document converter as a SharePoint Feature, I expected the conversions to just run. But the conversions did not run (nothing happened), and I had to examine the logs in the C:\Program Files\Common Files\Microsoft Shared\web server extensions\12\LOGS folder.

The error message I was getting was that the Document Conversion Launcher Service was attempting to start my converter C:\Program Files\Microsoft Office Servers\12.0\TransformApps\AsposeDoc2Pdf.exe, but there was no such file in that directory.

Most likely, I have done something wrong in the Feature Definition XML file so the files of the feature were not copied to the correct location, but I could not find a solution here, so I copied the files manually.

Copy the following files from C:\Program Files\Common Files\Microsoft Shared\web server extensions\12\TEMPLATE\FEATURES\AsposeDoc2Pdf to C:\Program Files\Microsoft Office Servers\12.0\TransformApps:

  • AsposeDoc2Pdf.exe
  • Aspose.Words.dll
  • Aspose.Pdf.dll
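
The manual copy can also be scripted. The following is a minimal sketch, assuming the account running it may write to the TransformApps folder; ConverterDeployer and its method are hypothetical names.

```csharp
using System;
using System.IO;

static class ConverterDeployer
{
    // Copies the converter files from the feature folder to the folder the
    // launcher service starts converters from. Returns the number of files copied.
    public static int CopyConverterFiles(string featureFolder, string transformAppsFolder)
    {
        string[] files = { "AsposeDoc2Pdf.exe", "Aspose.Words.dll", "Aspose.Pdf.dll" };
        int copied = 0;
        foreach (string name in files)
        {
            string source = Path.Combine(featureFolder, name);
            string target = Path.Combine(transformAppsFolder, name);
            File.Copy(source, target, true); // true = overwrite an older copy
            copied++;
        }
        return copied;
    }
}
```

Pass the two folder paths named above. File.Copy throws if a source file is missing, so a failed deployment does not go unnoticed.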

Make Sure the Converter is Enabled for the Site

Go to your SharePoint home page, click Site Actions, Site Settings, Modify All Site Settings, Site Content Types, Document, Manage Document Conversion for This Content Type and make sure your document converter is enabled.

Checking Document Converter is enabled for Documents on this SharePoint Site.

Screenshot - image014.jpg

Test Your Conversion

Finally, we can test if the conversion works.

Upload a test DOC file to the server. I uploaded a document called "Distributable VHD Image EULA.doc".

Upload a DOC file to MOSS.

Screenshot - image016.jpg

Click on your test file so the context menu opens, select Convert Document, Word Document to PDF (by Aspose).

Selecting to convert a DOC document to PDF.

Screenshot - image018.jpg

Click OK in the window confirming the request for conversion.

Note that the conversion is managed by the document conversion schedule and the document conversion load balancer service, so it might not happen instantaneously. By default, the document conversion kicks off every minute.

Just refresh the page with the list of documents every few seconds until the converted document appears in the list.

The document that is a result of the conversion appears in the list.

Screenshot - image020.jpg

Click on the document to download it. It is a PDF document and will fire up Adobe Reader on your machine.

Adobe Reader displays the PDF document downloaded from the MOSS site.

Screenshot - image022.jpg

Just to verify that Aspose.Words and Aspose.Pdf did a great job at accurately converting DOC to PDF, open the original DOC in Microsoft Word and compare with what you have in Adobe Reader.

The original DOC file opened in Microsoft Word to compare how well it was converted to PDF.

Screenshot - image024.jpg

Troubleshooting

If the conversion does not work, check the MOSS log files; they are pretty detailed. The log files are in C:\Program Files\Common Files\Microsoft Shared\web server extensions\12\LOGS.
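
The logs are large, so it can help to pull out only the lines that mention your converter. The sketch below searches the most recently written log file for a given text. LogScanner is a hypothetical name, and the *.log file pattern is an assumption about how the files in that folder are named.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

static class LogScanner
{
    // Returns the lines of the most recently written *.log file in the
    // folder that contain the given text (case-insensitive).
    public static List<string> FindInNewestLog(string logFolder, string text)
    {
        List<string> hits = new List<string>();
        string newest = null;
        DateTime newestTime = DateTime.MinValue;
        foreach (string file in Directory.GetFiles(logFolder, "*.log"))
        {
            DateTime t = File.GetLastWriteTimeUtc(file);
            if (t > newestTime) { newestTime = t; newest = file; }
        }
        if (newest == null) return hits;

        // Open with FileShare.ReadWrite because the log may still be in use.
        using (FileStream fs = new FileStream(newest, FileMode.Open, FileAccess.Read, FileShare.ReadWrite))
        using (StreamReader reader = new StreamReader(fs))
        {
            string line;
            while ((line = reader.ReadLine()) != null)
            {
                if (line.IndexOf(text, StringComparison.OrdinalIgnoreCase) >= 0)
                    hits.Add(line);
            }
        }
        return hits;
    }
}
```

For this article's converter, a call such as LogScanner.FindInNewestLog(logFolder, "AsposeDoc2Pdf") would list the lines that mention the executable.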

Summary

In this article I have shown how to use Aspose.Words and Aspose.Pdf to add the DOC2PDF conversion feature to MOSS. I have also shown it is very easy to add many more types of conversions to SharePoint using Aspose components.

As of the time of writing (April 2007), Aspose.Words does not yet support the Office Open XML (DOCX) format, but we are working on it. Aspose.Words export to DOCX is planned for Q2 2007 and DOCX import is planned for later in 2007. DOCX support in Aspose.Words will allow even more high-fidelity document conversions on the server, so bear with us. Information about new releases of Aspose.Words is published in the Aspose.Words Blog.

If you feel that building your own document converter as shown in this article is too much for you and you would prefer a finished product with a simple installer, let us know. We might package it as a product eventually, say Aspose.Words for SharePoint.

If you are using Microsoft SQL Server Reporting Services, make sure to check out another great product of ours, Aspose.Words for Reporting Services, which makes it possible to generate true DOC, RTF and WordprocessingML reports in Microsoft SQL Server 2005 Reporting Services.

Please excuse any technical inaccuracies regarding SharePoint (if you find any) because my experience with SharePoint before this article was nil and I had to spend some time grasping many concepts that were new to me such as sites, web applications, document libraries and so on and what they mean in the context of MOSS.

Any questions or comments are welcome in the Aspose.Words Forums.

Posted on 2007-10-08 16:52 by AllenFeng