18/01/2014

C#, doc to pdf (MS Office, Spire, OpenOffice)

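C#: doc to pdf in 3 ways, with Microsoft Word, Spire or OpenOffice
This time I'm sharing how to convert *.doc and *.docx files to *.pdf in C#.

It came up at work; otherwise I wouldn't have gone digging around.
Sadly, none of the options was accepted in the end. Never mind, at least I learned a few things.
Since the work is done, there's no point wasting it: I've tidied it up and posted it here so it's easy to look up later.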


Example files for download

Download: c# - doc to pdf in 3 ways
3 Projects
  • CSharp-MSOffice-doc2pdf
  • CSharp-Spire-doc2pdf-withPwd
  • CSharp-OpenOffice-doc2pdf

c# - Microsoft word doc to pdf

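The first thing that came to mind was Microsoft Word. This isn't an advert; office software simply doesn't get more ordinary than this, and with a global company of that size, MS Office and Windows are practically inseparable.
There had to be an API of some sort, and sure enough I found one without any trouble.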

Microsoft.Office.Interop.Word - Version 11=2003
Microsoft.Office.Interop.Word - Version 12=2007
Microsoft.Office.Interop.Word - Version 14=2010
Microsoft.Office.Interop.Word - Version 15=2013
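
The version-to-release pairs above can be kept in a small lookup, for instance to log which Office release a referenced interop assembly targets. This is only a minimal sketch: the names `WordInteropVersions` and `ReleaseFor` are made up for illustration and are not part of any Microsoft API, and the table holds nothing beyond the four pairs listed above.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: maps the major version of Microsoft.Office.Interop.Word
// to the Office release it belongs to, using only the four pairs listed above.
public static class WordInteropVersions
{
    private static readonly Dictionary<int, string> Releases = new Dictionary<int, string>
    {
        { 11, "2003" },
        { 12, "2007" },
        { 14, "2010" },
        { 15, "2013" },
    };

    // Returns the Office release for a major version, or null if it is not in the list.
    public static string ReleaseFor(int majorVersion)
    {
        string release;
        return Releases.TryGetValue(majorVersion, out release) ? release : null;
    }
}
```

For example, `WordInteropVersions.ReleaseFor(14)` gives "2010", while a version that is not in the list, such as 13, gives null.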

Right-click the solution / project > Add Reference

(Screenshots: Add Reference dialog in Microsoft Visual Studio 2012 and Microsoft Visual Studio 2008)

using Microsoft.Office.Interop.Word;

Coding

            ApplicationClass wordApplication = new ApplicationClass();
            Document wordDocument = null;
            object paramSourceDocPath = @"importDocument.doc";
            object paramMissing = Type.Missing;

            string paramExportFilePath = @"exportDocument.pdf";
            WdExportFormat paramExportFormat = WdExportFormat.wdExportFormatPDF;
            bool paramOpenAfterExport = false;
            WdExportOptimizeFor paramExportOptimizeFor =
                WdExportOptimizeFor.wdExportOptimizeForPrint;
            WdExportRange paramExportRange = WdExportRange.wdExportAllDocument;
            int paramStartPage = 0;
            int paramEndPage = 0;
            WdExportItem paramExportItem = WdExportItem.wdExportDocumentContent;
            bool paramIncludeDocProps = true;
            bool paramKeepIRM = true;
            WdExportCreateBookmarks paramCreateBookmarks =
                WdExportCreateBookmarks.wdExportCreateWordBookmarks;
            bool paramDocStructureTags = true;
            bool paramBitmapMissingFonts = true;
            bool paramUseISO19005_1 = false;

            try
            {
                // Open the source document.
                wordDocument = wordApplication.Documents.Open(
                    ref paramSourceDocPath, ref paramMissing, ref paramMissing,
                    ref paramMissing, ref paramMissing, ref paramMissing,
                    ref paramMissing, ref paramMissing, ref paramMissing,
                    ref paramMissing, ref paramMissing, ref paramMissing,
                    ref paramMissing, ref paramMissing, ref paramMissing,
                    ref paramMissing);

                // Export it in the specified format.
                if (wordDocument != null)
                    wordDocument.ExportAsFixedFormat(paramExportFilePath,
                        paramExportFormat, paramOpenAfterExport,
                        paramExportOptimizeFor, paramExportRange, paramStartPage,
                        paramEndPage, paramExportItem, paramIncludeDocProps,
                        paramKeepIRM, paramCreateBookmarks, paramDocStructureTags,
                        paramBitmapMissingFonts, paramUseISO19005_1,
                        ref paramMissing);
            }
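            catch (Exception ex)
            {
                // Respond to the error; at the very least report it instead of swallowing it.
                Console.Error.WriteLine("doc to pdf failed: " + ex.Message);
            }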
            finally
            {
                // Close and release the Document object.
                if (wordDocument != null)
                {
                    wordDocument.Close(ref paramMissing, ref paramMissing,
                        ref paramMissing);
                    wordDocument = null;
                }

                // Quit Word and release the ApplicationClass object.
                if (wordApplication != null)
                {
                    wordApplication.Quit(ref paramMissing, ref paramMissing,
                        ref paramMissing);
                    wordApplication = null;
                }

                GC.Collect();
                GC.WaitForPendingFinalizers();
                GC.Collect();
                GC.WaitForPendingFinalizers();
            }
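It works by having MS Word open the doc you specify in the background and save it as a pdf. If Word is not installed at all, the code fails right at the declaration, when the ApplicationClass object is created.
The fix is to install MS Word. That sounds like stating the obvious, but it is the only way.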
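
To fail with a clearer message than the exception raised at the declaration, one option is to probe for Word before creating the application object. The sketch below assumes that a registered "Word.Application" COM ProgID is a good enough sign that Word is installed; `WordProbe` and `IsWordAvailable` are hypothetical names, not part of the interop library.

```csharp
using System;

// Hypothetical helper: checks whether the "Word.Application" COM ProgID is
// registered, which the interop code needs before it can start Word.
public static class WordProbe
{
    public static bool IsWordAvailable()
    {
        try
        {
            // GetTypeFromProgID returns null (rather than throwing) when the
            // ProgID is not registered on this machine.
            return Type.GetTypeFromProgID("Word.Application") != null;
        }
        catch (PlatformNotSupportedException)
        {
            // COM is Windows-only; on other platforms Word interop cannot work.
            return false;
        }
    }
}
```

A caller could check `WordProbe.IsWordAvailable()` first and report "MS Word is not installed" when it returns false. A registered ProgID only shows that Word is present; it does not prove that the PDF export function is available.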

c# - Spire doc to pdf

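If you'd rather not spend money on MS Office: I happened to hear someone mention Spire. This one is far better, with no extra software to install. You only reference a few DLLs during development and put them in the same directory as the executable when it runs.

Add reference

Spire provides its API for different .NET versions.

.NET Framework is backward compatible, but pay attention to the version you are actually using.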
using Spire.Doc;
using Spire.Doc.Documents;
using Spire.Doc.Fields;
using Spire.Pdf;
// add if need to encrypt the pdf
using Spire.Pdf.Security;

Coding

            Spire.Doc.Document document = new Spire.Doc.Document();
            document.LoadFromFile(@"importDoc.doc");

            // add "open password" to importDoc.doc and save as docx
            document.Encrypt("doc");
            document.SaveToFile(@"encryptedDoc.docx", Spire.Doc.FileFormat.Docx);

            //Save importDoc.doc file to pdf.
            document.SaveToFile(@"exportPDF.pdf", Spire.Doc.FileFormat.PDF);
            document.Close();

            // load exportPDF.pdf and save a copy protected with passwords
            PdfDocument pdfDoc = new PdfDocument();
            pdfDoc.LoadFromFile(@"exportPDF.pdf", Spire.Pdf.FileFormat.PDF);

            pdfDoc.Security.KeySize = PdfEncryptionKeySize.Key128Bit;

            pdfDoc.Security.OwnerPassword = "123456";

            pdfDoc.Security.UserPassword = "654321";

            pdfDoc.Security.Permissions = PdfPermissionsFlags.Print | PdfPermissionsFlags.FillFields;

            pdfDoc.SaveToFile(@"exportPDFwithPwd.pdf");

            pdfDoc.Close();
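The code above saves importDoc.doc as encryptedDoc.docx with the open password "doc",
then saves importDoc.doc as exportPDF.pdf,
and finally reads exportPDF.pdf and saves it as exportPDFwithPwd.pdf with the open password "654321".

Wow! That is exactly the feature I wanted, and the code is short and easy to follow. The joy didn't last long, though: there's no such thing as a free lunch. Spire is only a free trial, and until you pay it adds a conspicuous message to the output.

Converting doc to pdf adds one line, and encrypting the pdf adds another,
so there are two very prominent lines saying that the file was converted with Spire.
On top of that, Spire's license is more expensive and is sold per year.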
Evaluation Warning in Spire free trial

c# - OpenOffice doc to pdf

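When I hit a wall I turn around; my principle has always been to take the easy road rather than the hard one.
So I found something that sits between MS Office and Spire.

Before the cloud took off, MS Office's would-be rival was OpenOffice. Flying the open source flag, it was bound to hand over a way to convert doc to pdf for free.

With a clear next step, GO! GO! GO~~~
OpenOffice provides a .NET library; both the development and the runtime environment need 5 DLLs.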

Add reference


using System.IO;
using System.Diagnostics;
using System.Threading;
using uno;
using uno.util;
using unoidl.com.sun.star.lang;
using unoidl.com.sun.star.uno; // not in http://tinyway.wordpress.com/
using unoidl.com.sun.star.bridge; //not in http://tinyway.wordpress.com/
using unoidl.com.sun.star.frame;
using unoidl.com.sun.star.beans;

Coding

            ConvertToPdf("importDoc.doc", "exportPdf.pdf");
            
            
        public static void ConvertToPdf(string inputFile, string outputFile)
        {
            if (ConvertExtensionToFilterType(Path.GetExtension(inputFile)) == null)
                throw new InvalidProgramException("Unknown file type for OpenOffice. File = " + inputFile);

            StartOpenOffice();

            //Get a ComponentContext
            var xLocalContext = Bootstrap.bootstrap();
            //Get MultiServiceFactory
            unoidl.com.sun.star.lang.XMultiServiceFactory xRemoteFactory = (unoidl.com.sun.star.lang.XMultiServiceFactory)
               xLocalContext.getServiceManager();
            //Get a CompontLoader
            XComponentLoader aLoader = (XComponentLoader)xRemoteFactory.createInstance("com.sun.star.frame.Desktop");
            //Load the sourcefile

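            XComponent xComponent = null;
            try
            {
                xComponent = initDocument(aLoader,
                   PathConverter(inputFile), "_blank");
                // loadComponentFromURL is synchronous, so a null result means the
                // load failed; fail fast instead of waiting forever.
                if (xComponent == null)
                    throw new IOException("OpenOffice could not load " + inputFile);

                // save/export the document
                saveDocument(xComponent, inputFile, PathConverter(outputFile));

            }
            finally
            {
                // Only dispose what was actually loaded.
                if (xComponent != null)
                    xComponent.dispose();
            }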
        }

        private static void StartOpenOffice()
        {
            // GetProcessesByName expects the name without ".exe" and never returns null.
            Process[] ps = Process.GetProcessesByName("soffice");
            if (ps.Length > 0)
                return;

            Process p = new Process();
            p.StartInfo.Arguments = "-headless -nofirststartwizard";
            p.StartInfo.FileName = "soffice.exe";
            p.StartInfo.CreateNoWindow = true;
            // Start throws if soffice.exe cannot be found,
            // for example when OpenOffice is not installed.
            bool result = p.Start();
            if (result == false)
                throw new InvalidProgramException("OpenOffice failed to start.");
        }


        private static XComponent initDocument(XComponentLoader aLoader, string file, string target)
        {
            PropertyValue[] openProps = new PropertyValue[1];
            openProps[0] = new PropertyValue();
            openProps[0].Name = "Hidden";
            openProps[0].Value = new uno.Any(true);


            XComponent xComponent = aLoader.loadComponentFromURL(
               file, target, 0,
               openProps);

            return xComponent;
        }


        private static void saveDocument(XComponent xComponent, string sourceFile, string destinationFile)
        {
            PropertyValue[] propertyValues = new PropertyValue[2];
            // Setting the flag for overwriting 
            propertyValues[1] = new PropertyValue();
            propertyValues[1].Name = "Overwrite";
            propertyValues[1].Value = new uno.Any(true);
            //// Setting the filter name 
            propertyValues[0] = new PropertyValue();
            propertyValues[0].Name = "FilterName";
            propertyValues[0].Value = new uno.Any(ConvertExtensionToFilterType(Path.GetExtension(sourceFile)));
            ((XStorable)xComponent).storeToURL(destinationFile, propertyValues);
        }


        private static string PathConverter(string file)
        {
            if (string.IsNullOrEmpty(file))
                throw new ArgumentException("Null or empty path passed to OpenOffice");

            // OpenOffice needs an absolute file URL, so resolve relative paths first.
            return String.Format("file:///{0}", Path.GetFullPath(file).Replace(@"\", "/"));

        }

        public static string ConvertExtensionToFilterType(string extension)
        {
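            // Compare case-insensitively so that ".DOC" is treated like ".doc".
            switch (extension == null ? null : extension.ToLowerInvariant())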
            {
                case ".doc":
                case ".docx":
                case ".txt":
                case ".rtf":
                case ".html":
                case ".htm":
                case ".xml":
                case ".odt":
                case ".wps":
                case ".wpd":
                    return "writer_pdf_Export";
                case ".xls":
                case ".xlsb":
                case ".ods":
                    return "calc_pdf_Export";
                case ".ppt":
                case ".pptx":
                case ".odp":
                    return "impress_pdf_Export";

                default: return null;
            }
        }
The first statement, the call to ConvertToPdf, is the part that matters; the rest can be copied as it is.
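
PathConverter builds the file URL by swapping backslashes for slashes, which leaves spaces and similar characters unescaped. As an alternative sketch, System.Uri can build the URL instead. `FileUrl.FromPath` is a hypothetical name, and the example assumes OpenOffice accepts the percent-encoded form of the URL.

```csharp
using System;
using System.IO;

// Hypothetical alternative to PathConverter: lets System.Uri build the file URL.
public static class FileUrl
{
    public static string FromPath(string file)
    {
        if (string.IsNullOrEmpty(file))
            throw new ArgumentException("Null or empty path", "file");

        // Resolve relative paths against the current directory, then let Uri
        // add the file scheme and percent-encode characters such as spaces.
        return new Uri(Path.GetFullPath(file)).AbsoluteUri;
    }
}
```

On Windows, a path such as `C:\docs\my file.doc` should come out as `file:///C:/docs/my%20file.doc`.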

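This time, though, OpenOffice has to be installed before it can convert ~_~
At least this one is a free lunch. From what I found, OpenOffice can also encrypt the pdf, which Microsoft.Office.Interop.Word cannot.

That feature is not loaded by default, however. The official position is that adding password protection, specifying the range of pages to export and so on make no sense there.

If you want to encrypt the pdf, the official documentation gives no concrete method for C#; only Java comes with an example.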

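Summary

All that copy and paste is a bit tiring, and the post ran long, so thanks for waiting. Here is how to choose between the conversion methods.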

| | Microsoft Word | Spire | OpenOffice |
|---|---|---|---|
| $ | The machine that runs the program needs MS Word with the PDF conversion function; Microsoft Word Viewer cannot export to pdf | Spire.Doc Pro Edition supports PDF export ($799/year); Spire.PDF Pro Edition ($599/year) if you need PDF encryption | OpenOffice and the OpenOffice SDK are free |
| Development environment | 1 external DLL to reference, found inside Visual Studio | 5 DLLs; no installation needed if you have them | 5 DLLs; no installation needed if you have them |
| PDF conversion (import types) | doc, docx | Doc/Docx to XML, RTF, EMF, TXT, XPS, EPUB, HTML and vice versa | DOC, DOCX, XLS, XLSX, PPT, PPTX... |
| PDF encryption | No | Yes | Yes (official example in Java only) |


Apart from the methods above, commercial software more often goes through a PDF virtual printer. I may cover that in detail another time.

Exception

Below are some problems you may run into as well, along with how to solve them.

Microsoft Word

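Microsoft.Office.Interop.Word is not available in Visual Studio Express.

Microsoft Word 2007 needs an add-in installed first (2007 Microsoft Office Add-in: Microsoft Save as PDF) before it can export to PDF.
Word 2010 and 2013 ship with it, so there is nothing to do. As long as you can see the PDF conversion option in Word, you can ignore this.


Make sure MS Word is installed on your computer. If your program runs on a server, the server needs MS Word too; without it, the code fails right at the declaration.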

OpenOffice

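X__X I kept running into System.Runtime.InteropServices.COMException.
The computer had OpenOffice 4.0.0 installed from the start, and when I wanted to try this feature I went and installed the OpenOffice SDK on top.
It kept throwing exceptions, and I was stuck for a whole day before it dawned on me. FUXX

I forgot to save a screenshot as well; the exception should have been System.Runtime.InteropServices.COMException.


The SDK version used for development must match the OpenOffice version on the machine that runs the program.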
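
A start-up check can make that mismatch visible before the COMException appears. The sketch below only compares two version strings; how they are obtained (installer metadata, DLL file versions and so on) is left open, and `OpenOfficeVersionCheck.SameVersion` is a hypothetical name rather than anything from the OpenOffice SDK.

```csharp
using System;

// Hypothetical helper: compares the SDK version used at build time with the
// OpenOffice version found on the target machine.
public static class OpenOfficeVersionCheck
{
    public static bool SameVersion(string sdkVersion, string installedVersion)
    {
        // System.Version parses dotted strings such as "4.0.0" and compares
        // them numerically, so "4.0.0" and "4.0.1" are reported as different.
        return Version.Parse(sdkVersion).Equals(Version.Parse(installedVersion));
    }
}
```

Both strings need the same number of components: System.Version treats "4.0" and "4.0.0" as different versions.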
