Converting an Excel Table to a Word Table with DocumentFormat.OpenXML

Author: 路漫漫其修远兮-

Category: Open-source components. Tags: Excel to Word, ClosedXML, OpenXML, Docx, C#

Article link: https://blog.csdn.net/always987905790/article/details/79515444

Note: this post is original work. You are welcome to repost it as long as the copyright is respected.

Before getting to the topic, let's look at what DocumentFormat.OpenXML is.

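Because the Open XML Format is open and built on the familiar ZIP and XML technologies, it is very convenient for developers. Microsoft also provides a library in the System.IO.Packaging namespace for accessing these files. That library is part of WinFX and is also the foundation of the Open XML Format SDK.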
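To make the "ZIP plus XML" point concrete, here is a minimal sketch that uses only System.IO.Compression. It builds a tiny stand-in archive in memory and lists its parts. The class and method names (`PackagePeek`, `BuildSamplePackage`, `ListParts`) are made up for this illustration, and the archive is only a placeholder, not a valid Word document.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Linq;
using System.Text;

public static class PackagePeek
{
    // Builds a small stand-in ZIP archive in memory. It only mimics the
    // layout of a .docx package and is NOT a valid Word document.
    public static byte[] BuildSamplePackage()
    {
        using (var buffer = new MemoryStream())
        {
            using (var zip = new ZipArchive(buffer, ZipArchiveMode.Create, leaveOpen: true))
            {
                WriteEntry(zip, "[Content_Types].xml", "<Types/>");
                WriteEntry(zip, "word/document.xml", "<document/>");
            }
            // The archive is finalized once the ZipArchive has been disposed.
            return buffer.ToArray();
        }
    }

    private static void WriteEntry(ZipArchive zip, string name, string xml)
    {
        using (var writer = new StreamWriter(zip.CreateEntry(name).Open(), Encoding.UTF8))
            writer.Write(xml);
    }

    // Lists the part names stored in a ZIP-based package.
    public static string[] ListParts(byte[] package)
    {
        using (var zip = new ZipArchive(new MemoryStream(package), ZipArchiveMode.Read))
            return zip.Entries.Select(e => e.FullName).ToArray();
    }

    public static void Main()
    {
        foreach (var part in ListParts(BuildSamplePackage()))
            Console.WriteLine(part);
    }
}
```

Passing the bytes of a real .docx or .xlsx file to `ListParts` (for example via `File.ReadAllBytes`) should list that file's XML parts in the same way.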

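We don't need to dig into the details above. It is enough to know that DocumentFormat.OpenXml is a DLL for processing Office documents, with parts that handle Excel workbooks, Word documents, and so on. Today we use two open-source libraries built on top of it: ClosedXML and DocX (to get the latest versions, search for the two libraries on GitHub, www.gitHub.com). I chose them because they work directly on the XML data and need no COM support, which means you can read and write documents without installing Office. As is well known, however, Office only adopted the XML-based formats after the 2003 version. If your workbook was saved in the 2003 format or earlier (*.xls), ClosedXML cannot open it. The same applies to *.doc files.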
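One way to tell the two generations of file apart, independent of the extension, is to look at the first bytes of the file. The sketch below is an illustration rather than part of the original code: `FormatSniffer` and `OfficeContainer` are hypothetical names, and a ZIP signature only proves that the file is a ZIP archive, not that it is a valid workbook.

```csharp
using System;

public enum OfficeContainer { ZipPackage, LegacyBinary, Unknown }

public static class FormatSniffer
{
    // ZIP local-file-header signature "PK\x03\x04", found at the start of
    // .xlsx and .docx files (and of every other ordinary ZIP archive).
    private static readonly byte[] ZipMagic = { 0x50, 0x4B, 0x03, 0x04 };

    // OLE compound-file signature found at the start of legacy .xls and .doc files.
    private static readonly byte[] OleMagic = { 0xD0, 0xCF, 0x11, 0xE0, 0xA1, 0xB1, 0x1A, 0xE1 };

    public static OfficeContainer Detect(byte[] header)
    {
        if (StartsWith(header, ZipMagic)) return OfficeContainer.ZipPackage;
        if (StartsWith(header, OleMagic)) return OfficeContainer.LegacyBinary;
        return OfficeContainer.Unknown;
    }

    private static bool StartsWith(byte[] data, byte[] prefix)
    {
        if (data == null || data.Length < prefix.Length) return false;
        for (int i = 0; i < prefix.Length; i++)
            if (data[i] != prefix[i]) return false;
        return true;
    }

    public static void Main()
    {
        byte[] modern = { 0x50, 0x4B, 0x03, 0x04, 0x14, 0x00 };
        byte[] legacy = { 0xD0, 0xCF, 0x11, 0xE0, 0xA1, 0xB1, 0x1A, 0xE1, 0x00 };
        Console.WriteLine(Detect(modern));
        Console.WriteLine(Detect(legacy));
    }
}
```

In practice the header would come from the first eight bytes of the file on disk. A check like this could complement the extension test in the listing further down, since a renamed .xls file would pass an extension test but fail here.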

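That was a long preamble, mainly so that readers who have never touched DocumentFormat.OpenXML don't find the jump too abrupt. After all, every effect has its cause. Now to the topic: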

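First, the questions:

1. What needs attention or explanation up front?

2. The data in an Office document's XML falls roughly into a few categories: formatting, text, and miscellaneous (everything else). You can download an XML viewer to inspect the structure. So should the copy be done in two passes, one that handles only text and one that covers only formatting?

3. There is a detail that is easy to overlook but that you will hit as soon as you start coding. Word and Excel differ in how they treat merged cells:

In the figure above, a complete row (or column) has been merged in both Excel and Word. The result occupies two rows in Excel but one row in Word. How do we handle this?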

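Let's begin:

1. Points to note:

When you reference ClosedXML and DocX, also add a reference to DocumentFormat.OpenXml to the project, because both libraries are built on it and depend on it, as shown in the figure:

2. Copy in two passes: one that covers only formatting and one that covers only text.

3. After a merge, both kinds of table keep only the text of the first cell in the region, so merging the corresponding cells with a bottom-right-first rule solves this problem. Supporting detail: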

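        // Namespaces used by this method. "Novacode" is assumed here as the namespace of
        // older DocX releases; newer releases use Xceed.Words.NET instead.
        // using System; using System.IO; using System.Linq; using System.Collections.Generic;
        // using System.Drawing; using ClosedXML.Excel; using Novacode;

        /// <summary>
        /// Converts an Excel worksheet to a Word table.
        /// </summary>
        /// <param name="Filename">Path of the workbook</param>
        /// <param name="SheetName">Name of the worksheet</param>
        /// <param name="Catalog">Directory in which the new Word document is saved</param>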
        public void ExcelToWord(string Filename, string SheetName, string Catalog)
        {
            #region Validate arguments
            if (!File.Exists(Filename))
                throw new Exception("Workbook not found at " + Filename);
            string extension = Path.GetExtension(Filename);
            // Compare case-insensitively so that ".XLSX" is accepted as well
            if (!(extension.Equals(".xlsx", StringComparison.OrdinalIgnoreCase) || extension.Equals(".xlsm", StringComparison.OrdinalIgnoreCase)))
                throw new Exception("Unsupported extension; supported extensions are '.xlsx' and '.xlsm'");
            if (!Directory.Exists(Catalog))
                throw new Exception("Directory not found at " + Catalog);
            XLWorkbook TempExcel = new XLWorkbook(Filename);
            if (!TempExcel.Worksheets.Any(sheet => sheet.Name == SheetName))
                throw new Exception("The workbook has no worksheet named " + SheetName);
            #endregion
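            #region Conversion
            else
            {
                IXLWorksheet TempSheet = TempExcel.Worksheet(SheetName);
                // Create a new Word document
                DocX TempWord = DocX.Create(Path.Combine(Catalog, Path.GetFileNameWithoutExtension(Filename) + ".docx"));
                // Add a table to the Word document, sized to the used range of the sheet
                // (the same bounds as the loops below, so blank cells cannot make the table too small)
                TempWord.InsertTable(TempSheet.LastRowUsed().RowNumber(), TempSheet.LastColumnUsed().ColumnNumber());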
                Dictionary<string, int> FontDic = new Dictionary<string, int>();
                for (int i = 0; i < FontFamily.Families.Length; i++)
                {
                    // Map each font name to its index so that a font can be looked up by name
                    FontDic[FontFamily.Families[i].Name] = i;
                }
                #region Per-cell pass: formatting, font, color, italic, bold, font size, alignment, borders
                for (int row = 1; row <= TempSheet.LastRowUsed().RowNumber(); row++)
                {
                    for (int col = 1; col <= TempSheet.LastColumnUsed().ColumnNumber(); col++)
                    {
                        IXLCell xlsx = TempSheet.Cell(row, col);
                        Cell docx = TempWord.Tables[0].Rows[row - 1].Cells[col - 1];
                        // Copy the font formatting
                        Formatting formatting = new Formatting();
                        formatting.Bold = xlsx.Style.Font.Bold; // bold
                        formatting.Italic = xlsx.Style.Font.Italic; // italic
                        // Font color
                        string fontColor = xlsx.Style.Font.FontColor.ToString();
                        if (fontColor.IndexOf(":") > -1)
                        {
                            // In this approach a colon in the string marks a non-RGB color such as a theme color
                            if (fontColor.Split(':')[1].Split(',').First().IndexOf("Text1") > -1)
                                formatting.FontColor = Color.Black;
                            else
                                throw new Exception("Accent color " + fontColor + " is not supported yet");
                        }
                        else
                            formatting.FontColor = xlsx.Style.Font.FontColor.Color;
                        // Font size
                        formatting.Size = xlsx.Style.Font.FontSize;
                        formatting.FontFamily = new FontFamily(xlsx.Style.Font.FontName);
                        // Copy the text together with its formatting
                        docx.Paragraphs[0].InsertText(xlsx.Value.ToString(), false, formatting);

                        #region Font
                        // No direct setter is provided, so all available fonts are indexed in FontDic and
                        // looked up by name (not fully implemented). Names missing from FontDic are
                        // skipped instead of throwing KeyNotFoundException.
                        int fontIndex;
                        if (FontDic.TryGetValue(xlsx.Style.Font.FontName, out fontIndex))
                            docx.Paragraphs[0].Font(FontFamily.Families[fontIndex]);
                        #endregion
                        #region Alignment
                        switch (xlsx.Style.Alignment.Horizontal)
                        {
                            case XLAlignmentHorizontalValues.Left:
                                docx.Paragraphs[0].Alignment = Alignment.left;
                                break;
                            case XLAlignmentHorizontalValues.Right:
                                docx.Paragraphs[0].Alignment = Alignment.right;
                                break;
                            case XLAlignmentHorizontalValues.Center:
                                docx.Paragraphs[0].Alignment = Alignment.center;
                                break;
                        }
                        switch (xlsx.Style.Alignment.Vertical)
                        {
                            case XLAlignmentVerticalValues.Bottom:
                                docx.VerticalAlignment = VerticalAlignment.Bottom;
                                break;
                            case XLAlignmentVerticalValues.Top:
                                docx.VerticalAlignment = VerticalAlignment.Top;
                                break;
                            case XLAlignmentVerticalValues.Center:
                                docx.VerticalAlignment = VerticalAlignment.Center;
                                break;
                        }
                        #endregion
                        #region Borders
                        docx.SetBorder(TableCellBorderType.Bottom, new Border(BorderStyle.Tcbs_thick, BorderSize.one, 0, Color.Black));
                        docx.SetBorder(TableCellBorderType.Left, new Border(BorderStyle.Tcbs_thick, BorderSize.one, 0, Color.Black));
                        docx.SetBorder(TableCellBorderType.Top, new Border(BorderStyle.Tcbs_thick, BorderSize.one, 0, Color.Black));
                        docx.SetBorder(TableCellBorderType.Right, new Border(BorderStyle.Tcbs_thick, BorderSize.one, 0, Color.Black));
                        #endregion
                    }
                }
                #endregion

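                #region Merged cells
                //List<string> mergedranges = new List<string>();
                var MergedRanges = TempSheet.MergedRanges.ToList();
                #region Decide the merge order and collect the regions in that order
                List<List<Point>> Temp = new List<List<Point>>();
                List<List<Point>> RangePoints = new List<List<Point>>();
                #region Put the four corner points of each region into a list
                // Each Point below stores the zero-based column in X and the zero-based row in Y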
                for (int i = 0; i < MergedRanges.Count(); i++)
                {
                    List<Point> point = new List<Point>();
                    string range = MergedRanges[i].ToString().Split('!').Last();
                    string StartCell = range.Split(':').First();
                    string EndCell = range.Split(':').Last();
                    int X1, X2, X3, X4, Y1, Y2, Y3, Y4;
                    X1 = X2 = TempSheet.Cell(StartCell).WorksheetRow().RowNumber() - 1;
                    X3 = X4 = TempSheet.Cell(EndCell).WorksheetRow().RowNumber() - 1;
                    Y1 = Y4 = TempSheet.Cell(StartCell).WorksheetColumn().ColumnNumber() - 1;
                    Y3 = Y2 = TempSheet.Cell(EndCell).WorksheetColumn().ColumnNumber() - 1;
                    Point P1 = new Point(Y1, X1);
                    Point P2 = new Point(Y2, X2);
                    Point P3 = new Point(Y3, X3);
                    Point P4 = new Point(Y4, X4);
                    point.Add(P1);
                    point.Add(P2);
                    point.Add(P3);
                    point.Add(P4);
                    RangePoints.Add(point);
                }
                #endregion
                #region Sort the regions into processing order
                for (;RangePoints.Count() > 0;)
                {
                    List<Point> temp = RangePoints[0];
                    int stata = -1;
                    for (int i = 1; i< RangePoints.Count() && RangePoints.Count()>1; i++)
                    {
                        if (RangePoints[i][2].X >= temp[2].X && RangePoints[i][2].Y >= temp[2].Y)
                        {
                            temp = RangePoints[i];
                            stata = i;
                        }
                        if (RangePoints[i][2].X >= temp[2].X && RangePoints[i][2].Y <= temp[2].Y)
                        {
                            if (RangePoints[i][3].X >= temp[1].X && RangePoints[i][3].Y >= temp[1].Y)
                            {
                                temp = RangePoints[i];
                                stata = i;
                            }
                        }
                        if (RangePoints[i][2].X <= temp[2].X && RangePoints[i][2].Y >= temp[2].Y)
                        {
                            if (RangePoints[i][1].X >= temp[3].X && RangePoints[i][1].Y >= temp[3].Y)
                            {
                                temp = RangePoints[i];
                                stata = i;
                            }
                        }
                    }
                    Temp.Add(temp);
                    if (stata >= 0)
                        RangePoints.Remove(RangePoints[stata]);
                    else
                        RangePoints.Remove(RangePoints[0]);
                }
                #endregion
                #endregion
                #region Merge the cells
                for (int i = 0; i < Temp.Count(); i++)
                {
                    int firstrow = Temp[i][0].Y;
                    int firstcolumn = Temp[i][0].X;
                    int lastrow = Temp[i][2].Y;
                    int lastcolumn = Temp[i][2].X;
                    //public void MergeCellsInColumn(int columnIndex, int startRow, int endRow); vertical merge
                    // Nothing to do for a single-row region
                    if(lastrow - firstrow > 0)
                    {
                        for (int j = 0; j <= lastcolumn - firstcolumn; j++)
                        {
                            TempWord.Tables[0].MergeCellsInColumn(firstcolumn + j, firstrow, lastrow);
                        }
                    }
                    //public void MergeCells(int startIndex, int endIndex); horizontal merge
                    // Nothing to do for a single-column region
                    if (lastcolumn - firstcolumn > 0)
                    {
                        for (int j = 0; j <= lastrow - firstrow; j++)
                        {
                            TempWord.Tables[0].Rows[firstrow + j].MergeCells(firstcolumn, lastcolumn);
                        }
                    }
                    Cell docx = TempWord.Tables[0].Rows[firstrow].Cells[firstcolumn];
                    // Adjust the borders
                    docx.SetBorder(TableCellBorderType.Bottom, new Border(BorderStyle.Tcbs_thick, BorderSize.one, 0, Color.Black));
                    docx.SetBorder(TableCellBorderType.Left, new Border(BorderStyle.Tcbs_thick, BorderSize.one, 0, Color.Black));
                    docx.SetBorder(TableCellBorderType.Top, new Border(BorderStyle.Tcbs_thick, BorderSize.one, 0, Color.Black));
                    docx.SetBorder(TableCellBorderType.Right, new Border(BorderStyle.Tcbs_thick, BorderSize.one, 0, Color.Black));
                    docx.SetBorder(TableCellBorderType.InsideV, new Border(BorderStyle.Tcbs_none, 0, 0, Color.White));
                    docx.SetBorder(TableCellBorderType.InsideH, new Border(BorderStyle.Tcbs_none, 0, 0, Color.White));
                    // Center the cell content vertically
                    docx.VerticalAlignment = VerticalAlignment.Center;
                }
                #endregion
                #endregion
                //#region Adjust row heights
                //if (TempWord.Tables[0].RowCount == TempSheet.LastRowUsed().RowNumber())
                //{
                //    for (int i = 0; i < TempWord.Tables[0].RowCount; i++)
                //    {
                //        TempWord.Tables[0].Rows[i].Height = TempSheet.Row(i + 1).Height;
                //    }
                //}
                //#endregion
                //#region Adjust column widths
                //if (TempWord.Tables[0].ColumnCount == TempSheet.LastColumnUsed().ColumnNumber())
                //{
                //    for (int i = 0; i < TempWord.Tables[0].ColumnCount; i++)
                //    {
                //        TempWord.Tables[0].Rows[0].Cells[i].Width = TempSheet.Column(i + 1).Width;
                //    }
                //}
                //#endregion
                TempWord.Save();
            }
            #endregion
        }

The above is the source code, offered for reference only. If anything is wrong, corrections are welcome!
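The "bottom-right first" rule can also be sketched in isolation. The version below is a simplified stand-in for the selection loop in the listing, not an exact equivalent: it sorts regions by their bottom row and then by their right-most column, both descending. The `Region` type and the `MergeOrder` class are hypothetical names introduced for this sketch. A plausible reason the order matters (my assumption, not stated in the original) is that merging cells in a Word table can shift the indices of cells further right or further down, so handling the far corner first keeps the remaining indices valid.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class MergeOrder
{
    // Zero-based, inclusive corners of one merged region (hypothetical type).
    public sealed class Region
    {
        public readonly int FirstRow, FirstCol, LastRow, LastCol;

        public Region(int firstRow, int firstCol, int lastRow, int lastCol)
        {
            FirstRow = firstRow;
            FirstCol = firstCol;
            LastRow = lastRow;
            LastCol = lastCol;
        }

        public override string ToString()
        {
            return "(" + FirstRow + "," + FirstCol + ")-(" + LastRow + "," + LastCol + ")";
        }
    }

    // Orders regions so that the one with the lowest bottom edge comes first;
    // ties are broken by the right-most column.
    public static List<Region> BottomRightFirst(IEnumerable<Region> regions)
    {
        return regions
            .OrderByDescending(r => r.LastRow)
            .ThenByDescending(r => r.LastCol)
            .ToList();
    }

    public static void Main()
    {
        var regions = new List<Region>
        {
            new Region(0, 0, 1, 1), // top-left block
            new Region(2, 2, 3, 3), // bottom-right block
            new Region(2, 0, 3, 0)  // bottom-left column
        };
        foreach (var region in BottomRightFirst(regions))
            Console.WriteLine(region);
    }
}
```

With the sample data, the bottom-right block is processed first, then the bottom-left column, then the top-left block.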

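Note: at present DocX does not support Chinese fonts such as SimSun (宋体), with the exception of Microsoft YaHei, but this does not affect most of the functionality. It is worth mentioning that some open-source libraries from abroad, such as PDFSharp, also lack Chinese font support for now. A reasonable workaround is to declare the font before referencing it (search online for details). So why does ClosedXML support Chinese fonts? My guess is that one of its former maintainers was Chinese.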
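Given that limitation, one defensive option is to fall back to a font known to work whenever the requested one is unavailable. The sketch below is a plain helper with no dependency on DocX. `FontFallback` and `Resolve` are hypothetical names, and the set of available fonts is an assumption that a real program would have to build itself.

```csharp
using System;
using System.Collections.Generic;

public static class FontFallback
{
    // Returns the requested font name if it is in the available set,
    // otherwise the fallback name.
    public static string Resolve(string requested, ISet<string> available, string fallback)
    {
        return requested != null && available.Contains(requested) ? requested : fallback;
    }

    public static void Main()
    {
        // Assumed list of usable fonts; compared case-insensitively.
        var available = new HashSet<string>(StringComparer.OrdinalIgnoreCase)
        {
            "Microsoft YaHei",
            "Arial"
        };

        Console.WriteLine(Resolve("arial", available, "Microsoft YaHei"));
        Console.WriteLine(Resolve("SimSun", available, "Microsoft YaHei"));
    }
}
```

In the listing above, a helper like this could choose the name passed to `new FontFamily(...)`, so that an unsupported font degrades to the fallback instead of failing.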
