[Original] Using the Open-Source Math.NET Math Library (03): Parsing Matlab's MAT Format in C#

Series index: [Index] Using the Open-Source Math.NET Math Library

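Preface

When I first used this component back in 2009, my main motivation was to replace Matlab for numerical work. Plenty of people, students especially, still rely on Matlab, and many of them also write C#. Without deep command of either language, they often cannot solve the whole problem in C# or in Matlab alone, so they turn to mixed programming and call one from the other. I have done a lot of work on Matlab/.NET mixed programming myself, recorded a set of video tutorials, and written quite a few articles on it; see the following: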

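1. China's first introductory video tutorial on Matlab and C#.NET mixed programming [completely free]

2. Calling a Figure window through Matlab.NET mixed programming

3. Matlab.NET mixed programming tips: calling Matlab built-in functions directly (with source code)

4. Matlab.NET mixed programming tips: finding Matlab's built-in functions

5. Getting started with type-safe-interface mixed programming between Matlab and .NET

6. Solving a face recognition problem with Matlab and .NET mixed programming

With that in mind, I have often reminded people that in .NET the Math.NET component can take over much of the work Matlab would otherwise do, although the results may not be quite as good. Today's post covers one particularly practical feature: using Math.NET from C# to read and write Matlab's MAT data format. It has many uses, probably mostly in research, and you are free to take it further.

If the resources or layout of this post look broken, please refer to the original at: http://www.cnblogs.com/asxinyu/p/4265972.html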

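1. The MAT data format

Anyone who has used Matlab for a while knows that variables in the workspace can be saved in the MAT data format and loaded again later for further computation, which is very convenient. I had never looked closely at the format before, so this post is a good excuse to examine what it does and how it is put together. Reading and writing MAT files with Math.NET is simple, but a little background knowledge does no harm.

MAT files come in several versions. The latest, version 7.3, is built on Hierarchical Data Format (HDF), specifically HDF5, a general-purpose storage format for numerical data; earlier versions use MathWorks' own binary layout. HDF was originally developed at the US National Center for Supercomputing Applications (NCSA) and is now maintained and promoted by the non-profit HDF Group. It is very widely used (under a BSD-style license): well-known commercial and non-commercial packages such as LabVIEW, MATLAB, Scilab, Octave and Mathematica all support it. The two main variants today are HDF4 and HDF5.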

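Further material on the HDF and MAT file formats:

Wikipedia: http://en.wikipedia.org/wiki/Hierarchical_Data_Format

MATLAB official documentation: http://cn.mathworks.com/help/matlab/import_export/mat-file-versions.html

HDF official site: http://www.hdfgroup.org/

As far as I know, the MAT file format has two levels, Level 4 and Level 5. Level 4 MAT files support only two-dimensional matrices and strings, whereas Level 5 adds multidimensional arrays, string arrays, cell arrays, sparse matrices, objects, structures and more. The MathNet.Numerics.Data.Matlab package covered here works with Level 5 files directly, so it is the more capable option.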

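2. Using the MAT format in Matlab

Saving and loading MAT data in Matlab is simple: the save and load commands are all you need. Anyone familiar with Matlab can open it and try a few commands. My computer was too slow, so I uninstalled Matlab a while ago; this section therefore only covers the syntax for reading and saving MAT files, which is easy to use in practice.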

http://www.ilovematlab.cn/thread-78257-1-1.html

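● save: saves all workspace variables to a file named matlab.mat in the current folder. A MAT file can be loaded back into the workspace with load, can be read on other machines, and can even be consumed by other programs.
● save('filename'): saves all workspace variables to the file named by filename. If filename includes a path, the file is written to that directory; otherwise the current folder is used.
● save('filename', 'var1', 'var2', ...): saves only the listed variables to the file named by filename.
● save('filename', '-struct', 's'): saves every field of the structure s as a separate variable.
● save('filename', '-struct', 's', 'f1', 'f2', ...): saves only the listed fields of the structure s.
● save('-regexp', expr1, expr2, ...): uses regular expressions to select the variables to save.
● save(..., 'format'): sets the format of the saved file, such as a MAT file or an ASCII file.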

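In MATLAB, data is usually imported with the load function, which is used as follows:
● load: if matlab.mat exists, imports all variables in it; otherwise returns an error.
● load filename: imports all variables in filename into the workspace.
● load filename X Y Z ...: imports the variables X, Y, Z and so on from filename into the workspace. For MAT files, the wildcard "*" can be used when naming variables.
● load filename -regexp expr1 expr2 ...: uses regular expressions to select the variables to import.
● load -ascii filename: imports the file as ASCII regardless of its extension; returns an error if the file is not numeric text.
● load -mat filename: imports the file as a MAT file regardless of its extension; returns an error if the file is not a MAT file.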

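3. Reading the MAT format in C#

In Math.NET, the component that reads and writes MAT data is MathNet.Numerics.Data.Matlab. Reading is handled mainly by the MatlabReader class, and the parsing itself is done by the following function: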

/// <summary>Extracts all matrix blocks in a format we support from a stream.</summary>
internal static List<MatlabMatrix> ParseFile(Stream stream)
{
    var matrices = new List<MatlabMatrix>();

    using (var reader = new BinaryReader(stream))
    {
        // skip header (116 bytes)
        // skip subsystem data offset (8 bytes)
        // skip version (2 bytes)
        reader.BaseStream.Position = 126;

        // endian indicator (2 bytes)
        if (reader.ReadByte() != LittleEndianIndicator)
        {
            throw new NotSupportedException(Resources.BigEndianNotSupported);
        }

        // set position to first data element, right after full file header (128 bytes)
        reader.BaseStream.Position = 128;
        var length = stream.Length;

        // for each data element add a MATLAB object to the file.
        while (reader.BaseStream.Position < length)
        {
            // small format: size (2 bytes), type (2 bytes), data (4 bytes)
            // long format: type (4 bytes), size (4 bytes), data (size, aligned to 8 bytes)

            DataType type;
            int size;
            bool smallBlock;
            ReadElementTag(reader, out type, out size, out smallBlock);

            // read element data of the size provided in the element header
            // uncompress if compressed
            byte[] data;
            if (type == DataType.Compressed)
            {
                data = UnpackCompressedBlock(reader.ReadBytes(size), out type);
            }
            else
            {
                data = new byte[size];
                reader.Read(data, 0, size);
                SkipElementPadding(reader, size, smallBlock);
            }

            if (type == DataType.Matrix)
            {
                using (var matrixStream = new MemoryStream(data))
                using (var matrixReader = new BinaryReader(matrixStream))
                {
                    matrixReader.BaseStream.Seek(20, SeekOrigin.Current);
                    var matrixDim = matrixReader.ReadInt32()/8;
                    if (matrixDim > 2)
                    {
                        continue;
                    }

                    matrixReader.BaseStream.Seek(10, SeekOrigin.Current);
                    int matrixSize = matrixReader.ReadInt16();
                    if (matrixSize == 0)
                    {
                        matrixSize = matrixReader.ReadInt32();
                    }

                    var matrixName = Encoding.ASCII.GetString(matrixReader.ReadBytes(matrixSize));

                    matrices.Add(new MatlabMatrix(matrixName, data));
                }
            }
        }
    }

    return matrices;
}
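
The header layout that ParseFile skips over (116 bytes of text, 8 bytes of subsystem data offset, 2 bytes of version, 2 bytes of endian indicator) can be inspected directly. The following is a minimal sketch that uses only the .NET base class library. MatHeaderInspector and Describe are hypothetical names for illustration and are not part of Math.NET. The header bytes in Main are synthetic and built in memory so the example runs without a real file, and the endian check assumes the indicator bytes appear as 0x49 0x4D on disk, which is what Math.NET's own writer emits.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical helper, not part of Math.NET.
public static class MatHeaderInspector
{
    // Offsets taken from the comments in ParseFile.
    public const int TextLength = 116;   // descriptive text
    public const int VersionOffset = 124; // after text + 8-byte subsystem offset
    public const int EndianOffset = 126;
    public const int HeaderLength = 128;

    public static (string Text, short Version, bool IsLittleEndian) Describe(Stream stream)
    {
        var header = new byte[HeaderLength];
        int read = 0;
        while (read < HeaderLength)
        {
            int n = stream.Read(header, read, HeaderLength - read);
            if (n == 0)
            {
                throw new EndOfStreamException("Stream is shorter than the 128-byte header.");
            }
            read += n;
        }

        string text = Encoding.ASCII.GetString(header, 0, TextLength).TrimEnd(' ', '\0');

        // Assumption: little-endian files store the indicator as bytes 0x49, 0x4D.
        bool littleEndian = header[EndianOffset] == 0x49 && header[EndianOffset + 1] == 0x4D;

        // Decode the 2-byte version according to the detected byte order.
        int lo = header[VersionOffset];
        int hi = header[VersionOffset + 1];
        short version = littleEndian ? (short)(lo | (hi << 8)) : (short)((lo << 8) | hi);

        return (text, version, littleEndian);
    }

    public static void Main()
    {
        // Build a synthetic header: space-padded text, version 0x0100, little endian.
        var bytes = new byte[HeaderLength];
        for (int i = 0; i < VersionOffset; i++)
        {
            bytes[i] = (byte)' ';
        }
        Encoding.ASCII.GetBytes("MATLAB 5.0 MAT-file, synthetic example").CopyTo(bytes, 0);
        bytes[VersionOffset] = 0x00;
        bytes[VersionOffset + 1] = 0x01;
        bytes[EndianOffset] = 0x49;
        bytes[EndianOffset + 1] = 0x4D;

        using (var stream = new MemoryStream(bytes))
        {
            var info = Describe(stream);
            Console.WriteLine($"{info.Text} | version 0x{info.Version:X4} | little endian: {info.IsLittleEndian}");
        }
    }
}
```

To try it on a real file, pass a File.OpenRead(...) stream to Describe instead of the in-memory one. Note also that ParseFile skips any matrix with more than two dimensions, so such variables will not show up in the results.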

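Calling it from C# is even simpler; the implementation above is shown only as background that may help when you need to parse similar data formats later. Here is the calling code: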

using System.Collections.Generic;
using System.Numerics;
using MathNet.Numerics.LinearAlgebra;
using MathNet.Numerics.Data.Matlab;

// Read the first double matrix from collection.mat
Matrix<double> first = MatlabReader.Read<double>("collection.mat");

// Read the specific matrix named "vd" from collection.mat
Matrix<double> vd = MatlabReader.Read<double>("collection.mat", "vd");

// Convert to another element type directly while reading
Matrix<Complex> firstComplex = MatlabReader.Read<Complex>("collection.mat");

// Read every matrix in the file into a dictionary keyed by name
Dictionary<string, Matrix<double>> all = MatlabReader.ReadAll<double>("collection.mat");

// Read only the matrices named "vd" and "Ad" into a dictionary
var selected = MatlabReader.ReadAll<double>("collection.mat", "vd", "Ad");
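
Once read, a matrix is an ordinary Math.NET Matrix<double>, so it can go straight into a computation. The sketch below solves a small linear system whose coefficients come from a MAT file. It assumes the MathNet.Numerics and MathNet.Numerics.Data.Matlab packages are referenced. The file name system.mat, the variable names "A" and "b", and the ReadAndCompute class are made up for illustration, and the file is created first with MatlabWriter.Write (covered in section 4) only so that the example runs on its own.

```csharp
using System;
using MathNet.Numerics.LinearAlgebra;
using MathNet.Numerics.Data.Matlab;

// Hypothetical example class, not part of Math.NET.
public static class ReadAndCompute
{
    // Reads a coefficient matrix and a right-hand side (stored as a column
    // matrix) from a MAT file and solves A * x = b.
    public static Vector<double> SolveFromFile(string path, string matrixName, string rhsName)
    {
        Matrix<double> a = MatlabReader.Read<double>(path, matrixName);
        Matrix<double> b = MatlabReader.Read<double>(path, rhsName);
        return a.Solve(b.Column(0));
    }

    public static void Main()
    {
        // Create a small test file so the example does not depend on Matlab.
        var a = Matrix<double>.Build.DenseOfArray(new double[,] { { 2, 1 }, { 1, 3 } });
        var b = Matrix<double>.Build.DenseOfArray(new double[,] { { 3 }, { 5 } });
        MatlabWriter.Write("system.mat", new[] { a, b }, new[] { "A", "b" });

        // Solves 2x + y = 3, x + 3y = 5.
        Vector<double> x = SolveFromFile("system.mat", "A", "b");
        Console.WriteLine(x);
    }
}
```

With a file produced by Matlab's save command, only the path and the variable names passed to SolveFromFile would change.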

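This lets you do the calculations directly in C#, without the hassle of mixed programming.

4. Saving the MAT format in C#

Writing MAT data is handled mainly by the MatlabWriter class; its core function is the following: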

/// <summary>Writes all matrix blocks to a stream.</summary>
internal static void FormatFile(Stream stream, IEnumerable<MatlabMatrix> matrices)
{
    using (var buffer = new BufferedStream(stream))
    using (var writer = new BinaryWriter(buffer))
    {
        // write header and subsystem data offset (116+8 bytes)
        var header = Encoding.ASCII.GetBytes(HeaderText + DateTime.Now.ToString(Resources.MatlabDateHeaderFormat));
        writer.Write(header);
        Pad(writer, 116 - header.Length + 8, 32);

        // write version (2 bytes)
        writer.Write((short)0x100);

        // write little endian indicator (2 bytes)
        writer.Write((byte)0x49);
        writer.Write((byte)0x4D);

        foreach (var matrix in matrices)
        {
            // write data type
            writer.Write((int)DataType.Compressed);

            // compress data
            var compressedData = PackCompressedBlock(matrix.Data, DataType.Matrix);

            // write compressed data to file
            writer.Write(compressedData.Length);
            writer.Write(compressedData);
        }

        writer.Flush();
        writer.Close();
    }
}

Calling it from C# is just as simple:

// Pack each matrix under a name, then store them all in one file
var matrices = new List<MatlabMatrix>();
matrices.Add(MatlabWriter.Pack(myFirstMatrix, "m1"));
matrices.Add(MatlabWriter.Pack(mySecondMatrix, "m2"));
MatlabWriter.Store("file.mat", matrices);

// Write the single matrix "myMatrix" and name it "m1"
MatlabWriter.Write("file.mat", myMatrix, "m1");

// Write several matrices; note the list of matrices and the matching list of names
MatlabWriter.Write("file.mat", new[] { m1, m2 }, new[] { "m1", "m2" });

// Write a dictionary of matrices, mirroring how ReadAll returns them
var dict = new Dictionary<string, Matrix<double>>();
dict.Add("m1", m1);
dict.Add("m2", m2);
MatlabWriter.Write("file.mat", dict);
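
A quick way to check that writing and reading agree is a round trip: write a dictionary of matrices, read the file back with ReadAll, and compare entry by entry. The following is a minimal sketch under the same package assumptions as above. The RoundTrip class, the file name roundtrip.mat and the sample values are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using MathNet.Numerics.LinearAlgebra;
using MathNet.Numerics.Data.Matlab;

// Hypothetical example class, not part of Math.NET.
public static class RoundTrip
{
    // Writes the matrices to a MAT file, reads them back, and reports
    // whether every named matrix came back with identical contents.
    public static bool SurvivesRoundTrip(string path, Dictionary<string, Matrix<double>> matrices)
    {
        MatlabWriter.Write<double>(path, matrices);
        Dictionary<string, Matrix<double>> loaded = MatlabReader.ReadAll<double>(path);

        foreach (var pair in matrices)
        {
            Matrix<double> back;
            if (!loaded.TryGetValue(pair.Key, out back) || !back.Equals(pair.Value))
            {
                return false;
            }
        }
        return true;
    }

    public static void Main()
    {
        var dict = new Dictionary<string, Matrix<double>>
        {
            { "m1", Matrix<double>.Build.DenseOfArray(new double[,] { { 1, 2 }, { 3, 4 } }) },
            { "m2", Matrix<double>.Build.DenseIdentity(3) }
        };

        Console.WriteLine(SurvivesRoundTrip("roundtrip.mat", dict));
    }
}
```

The same file can then be opened in Matlab with load, which is the practical point of the exercise: data produced in C# becomes available to Matlab without any mixed-programming setup.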

5. Resources

Upcoming articles will continue with other features of Math.NET.


posted @ 2015-02-13 20:25 数据之巅
