# Malware Analysis

Analysing malware means understanding what it does, how it got onto the system, what it has changed, what its purpose is, who put it there, how it works, how to identify it, and how to defeat or eliminate it.

The machine code covered here is x86 and AMD64.

[Broken image link removed]

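## Types of Malware

• Trojan horse
Software that disguises itself as something else. When you download a game or a PDF from the Internet, you may end up downloading something you did not want.

• Computer virus, worm
Worm or virus: code that replicates itself and infects other computers.

• Spam-sending malware
Sends large volumes of spam from the victim's machine; the attacker makes money by selling this sending service.

• Information-stealing malware (spyware)
Monitors everything the user does, including recording account names and passwords (especially for email, bank cards and credit cards), and sends this sensitive information to the attacker. Typical components: sniffers, password hash grabbers, and keyloggers.

• Scareware and ransomware
Scareware frightens victims into paying, for example by telling you that you visited a malicious website and must pay a fine. Ransomware encrypts all of your files and demands payment, often in Bitcoin.

• Botnet
The name combines the second half of "robot" with the first half of "network", so a botnet is literally a network made of robots. It gives the controller access to the infected systems, which execute commands received from a command-and-control server.

• Downloader

• Launcher
Used to launch other malicious code, using non-traditional techniques to stay stealthy.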

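• Rootkit
A rootkit embeds itself in the system and hides itself and selected files, processes and network connections on the target. Rootkits are usually combined with other malware such as Trojans and backdoors. A rootkit stays on the target computer persistently and unnoticed, manipulates the system, and collects data through covert channels. Its three essentials are hiding, manipulation and data collection. The "root" in "rootkit" comes from Unix, where the system administrator account is called root and has the fewest security restrictions; taking full administrator control of a host is called "rooting" it. Being able to root a host does not mean being able to keep control of it, because the administrator may notice the intrusion and clean up. Hence the original meaning of rootkit: "a set of tools for keeping root access".
A rootkit typically hides information by loading a special driver and modifying the system kernel. It stays invisible both at rest (as a file) and while active (as a process), so it may be present on a computer without anyone knowing. This is attractive to attackers and forensic investigators alike: an attacker can plant a rootkit after an intrusion to quietly watch for sensitive information or wait for the right moment to act, while an investigator can use one to monitor a suspect in real time, which helps both with collecting evidence and with acting promptly.
The goal of a rootkit is to hide itself and avoid detection by other software, which stops the user from identifying and removing the attacker's software. A rootkit can hide almost any software, including file servers, keyloggers, botnets and remailers. Many rootkits can even hide large collections of files, letting the attacker store many files on your computer that you cannot see.
Rootkit attacks mostly aim at stealing sensitive data, so the "central servers" of a company or government body are the obvious first choice of target. Such hosts are usually well protected and hard to compromise. However, data does not sit still on a server; it flows through the organisation's network. Senior staff often have read and write access to this data, yet their personal computers are usually less well protected than the central servers. That gives a data thief an opening: plant a rootkit on the personal computer of such a person, settle in quietly, and send back important data from time to time.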

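• Backdoor
A backdoor is a method of gaining access to a program or system by bypassing its security controls. Its main purpose is to make it easy to get back into the system secretly later, control it and execute commands. Backdoors on a host mainly come from the following sources:
An attacker uses deception, sending an email or a file and tricking the operator of the host into opening or running it; the Trojan hidden inside then creates a backdoor on the host. An attacker who has compromised a host and gained control of it installs a backdoor, for example a Trojan, to use in the next intrusion. A third kind of backdoor is introduced during software development: programmers often create backdoors to make it easier to test the software or fix defects, and if such a backdoor is deliberately or accidentally left in when the software is released, the software ships with a backdoor and every host that installs it inherits it. Most backdoors try to evade logging, so in most cases nothing shows that the intruder is online even while they are using the system. A backdoor is a serious security risk: anyone who knows about it can later access and control the system covertly, and intruders can also attack it like any other vulnerability.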

## 1. Basic Static Analysis

Basic analysis, just looking at the ‘properties’ of the malware.

Static Analysis — Looking at the Malware ‘at Rest’

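Basic static analysis is a quick and simple way to decide whether a file is malicious, but it is often ineffective against sophisticated malware.

### Antivirus Scanning [Tool]

VirusTotal is a website that provides a free analysis service for suspicious files.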

https://www.virustotal.com

Imports

https://www.virustotal.com/gui/file/cc695909a072cb8c1da4dbb69385737465f59c31c12c0253a97516d4cf34105c/details

https://www.virustotal.com/gui/file/be1a5327e00826b456c8e8188e5a45e343d2d730320cc46fdc96347e472e5d12/details

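The Names field shows the file names the sample has previously been seen under.

#### Fingerprinting Malware

A hash of the file serves as the fingerprint of a malware sample. MD5 (128-bit) and SHA-1 (160-bit) are the most commonly used algorithms; SHA-256 produces a 256-bit digest. Hash collisions, where two different files have exactly the same hash, are possible but rare.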
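As a minimal sketch of how such a fingerprint can be computed, the following uses Python's standard `hashlib` module. The helper name `fingerprint` and the sample input `b"abc"` are illustrative assumptions; in practice the bytes would be read from the sample file.

```python
import hashlib

def fingerprint(data: bytes) -> dict:
    """Return the MD5, SHA-1 and SHA-256 hex digests of a byte string."""
    return {
        "md5": hashlib.md5(data).hexdigest(),
        "sha1": hashlib.sha1(data).hexdigest(),
        "sha256": hashlib.sha256(data).hexdigest(),
    }

# b"abc" stands in for the contents of a sample file.
digests = fingerprint(b"abc")
for name, value in digests.items():
    # Each hex character encodes 4 bits of the digest.
    print(f"{name:7s} {len(value) * 4:3d} bits  {value}")
```

The resulting hex string is what you would paste into a service such as VirusTotal to check whether the sample is already known.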

### Finding Strings [Tool]

Windows uses both ASCII and Unicode characters, so a string search has to look for both encodings.

• Looking at strings can reveal various things about the program
• Filenames
• Registry Keys
• Error messages (these can be very helpful)
• Names of functions called…

strings -n 5 C:\Windows\System32\kbd101a.dll
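To make clear what a strings tool does, here is a rough Python sketch that looks for runs of printable characters in both ASCII and UTF-16LE (the usual Windows "Unicode" encoding). The function name `extract_strings` and the sample bytes are assumptions for illustration; this is not how any particular strings implementation works internally.

```python
import re

def extract_strings(data: bytes, min_len: int = 5) -> list[str]:
    """Find printable ASCII and UTF-16LE strings of at least min_len characters."""
    ascii_pat = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    # In UTF-16LE each printable ASCII character is followed by a zero byte.
    wide_pat = re.compile(rb"(?:[\x20-\x7e]\x00){%d,}" % min_len)
    found = [m.group().decode("ascii") for m in ascii_pat.finditer(data)]
    found += [m.group().decode("utf-16le") for m in wide_pat.finditer(data)]
    return found

# Invented sample: one ASCII string, one wide string, one string that is too short.
sample = b"\x00\x01hello world\x00\xff" + "CreateFileW".encode("utf-16le") + b"\x00\x00abc"
strings_found = extract_strings(sample)
print(strings_found)
```

The `min_len` parameter plays the same role as `-n 5` in the command above: shorter runs are usually noise.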

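### Packing

A packed program consists of a small stub plus the compressed original; the file is still executable.

The code in the stub is responsible for decompressing the file.

#### PEiD [Tool]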

[Broken image link removed]

Unpacking UPX is fairly simple: download UPX (upx.sourceforge.net) and run: upx -d xxx.exe

e.g. by using UPXPacker https://upx.github.io

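PEiD (PE Identifier) is a well-known packer-detection tool. It is powerful and can detect almost every packer, with signatures for more than 470 packers and other types of PE file.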

Can use the program PEiD to detect if a file has been packed or not.

You can use PEiD to detect the type of packer or compiler employed to build an application, which makes analyzing the packed file much easier.

Development and support for PEiD has been discontinued since April 2011, but it’s still the best tool available for packer and compiler detection.

In many cases, it will also identify which packer was used to pack the file.

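### The PE File Format

The PE file format is a data structure that contains the information the Windows loader needs to manage executable code.

A PE file begins with a header that describes the code, the type of application, the libraries it requires and its space requirements. This information is very valuable.

"Portable" means that the PE format is the same across Windows versions and CPU types. The CPUs themselves still differ, so the binary encoding of the instructions differs too.

A PE file uses a flat address space: all code and data are merged into one large structure.

PE is the file format for executables, object files and dynamic link libraries, used mainly on 32-bit and 64-bit Windows.

The PE format wraps the information Windows needs when it loads executable code, including dynamic library references, API import and export tables, resource management data and thread-local storage data.

PE ("portable executable") is the executable binary format of Microsoft Windows NT, Windows 95 and Win32s; on Windows NT, drivers use this format as well. It can also be used for object files and libraries.

PE file format structure: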
[Broken image link removed]

[Broken image link removed]

https://blog.csdn.net/shitdbg/article/details/49734495

https://blog.csdn.net/qq_30145355/article/details/78859214

https://blog.csdn.net/evileagle/article/details/11693499

https://wiki.osdev.org/PE

PE files begin with a header that includes information about the code, the type of application, required library functions, and space requirements. The information in the PE header is of great value to the malware analyst.

https://blog.csdn.net/StriveScript/article/details/6279488

https://www.77169.net/html/271468.html

https://www.sohu.com/a/278839463_653604

https://www.y4f.net/71203.html

https://zhuanlan.zhihu.com/p/31967907

https://www.cnblogs.com/mfm11111/archive/2009/04/18/1438474.html

https://www.cnblogs.com/mfm11111/archive/2009/04/18/1438848.html

https://www.4hou.com/posts/5QnB

https://www.cnblogs.com/qintangtao/archive/2013/01/11/2857179.html

https://my.oschina.net/u/4293620/blog/3809007

https://www.hetianlab.com/expc.do?ec=ECID172.19.104.182015051313294800001

The PE header provides more valuable information than the imported functions alone.
The PE file format includes a PE header.

[Broken image link removed]

The first byte of a PE file is the start of the MS-DOS header.

1. Entry point.
2. File offset.
3. Base address, ImageBase.
4. Relative Virtual Address (RVA): the offset of code or data in memory relative to the base address; RVA = VA - ImageBase.
5. AddressOfEntryPoint (RVA): where the PE file starts executing, usually inside the .text section. The field applies to both exe and dll files. A pointer to the entry point function, relative to the image base address. For executable files, this is the starting address. For device drivers, this is the address of the initialization function. The entry point function is optional for DLLs. When no entry point is present, this member is zero.

It is the RVA of the first instruction the PE loader will run.

BaseOfCode (RVA): the start address of the code section. The code section usually comes before the data section and after the PE header. In exes produced by the Microsoft linker this value is usually 0x1000. Borland's TLINK32 usually sets it to 0x10000, because TLINK aligns to 64 KB by default whereas the Microsoft linker uses 4 KB. A pointer to the beginning of the code section, relative to the image base.

• This works, but is slightly complicated by the fact that the addresses are all stored in memory as Relative Virtual Addresses

• Need to add the address of the base of the DLL onto them before using them

• This applies to both the name pointers and the address of the function

• If disassembling shell code, keep an eye out for techniques like this…

[Broken image link removed]

ImageBase：

The preferred address of the first byte of the image when it is loaded in memory. This value is a multiple of 64K bytes. The default value for DLLs is 0x10000000. The default value for applications is 0x00400000, except on Windows CE where it is 0x00010000.

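What does ImageBase contain?
The address the program has been compiled to be loaded at, i.e. the preferred load address.

ImageBase is the base address of the image. This base address is only a suggestion: for a DLL, if it cannot be loaded at this address, the system automatically chooses another address for it.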

Assuming the program is not relocated when loaded, where will the program start executing code?

The memory location where the first instruction will be placed can be found using the following formula: EP (Memory) = AddressOfEntryPoint + ImageBase
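The formula can be checked against the raw header fields. The sketch below reads AddressOfEntryPoint and ImageBase with Python's `struct` module, following the field offsets in the Microsoft PE format documentation. The function name `entry_point_va` and the synthetic header bytes are assumptions for illustration; a real analysis would read the bytes from the file on disk.

```python
import struct

def entry_point_va(pe: bytes) -> tuple[int, int, int]:
    """Return (AddressOfEntryPoint, ImageBase, entry point VA) from raw PE bytes."""
    if pe[:2] != b"MZ":
        raise ValueError("missing MS-DOS 'MZ' signature")
    (e_lfanew,) = struct.unpack_from("<I", pe, 0x3C)   # file offset of the PE signature
    if pe[e_lfanew:e_lfanew + 4] != b"PE\x00\x00":
        raise ValueError("missing PE signature")
    opt = e_lfanew + 4 + 20                            # skip signature and 20-byte COFF header
    (magic,) = struct.unpack_from("<H", pe, opt)
    (entry_rva,) = struct.unpack_from("<I", pe, opt + 16)
    if magic == 0x10B:                                 # PE32: 4-byte ImageBase at offset 28
        (image_base,) = struct.unpack_from("<I", pe, opt + 28)
    elif magic == 0x20B:                               # PE32+: 8-byte ImageBase at offset 24
        (image_base,) = struct.unpack_from("<Q", pe, opt + 24)
    else:
        raise ValueError("unknown optional header magic")
    return entry_rva, image_base, image_base + entry_rva

# Synthetic, mostly empty header used only to exercise the function.
fake = bytearray(0x200)
fake[0:2] = b"MZ"
struct.pack_into("<I", fake, 0x3C, 0x80)           # e_lfanew
fake[0x80:0x84] = b"PE\x00\x00"
struct.pack_into("<H", fake, 0x98, 0x10B)          # optional header magic (PE32)
struct.pack_into("<I", fake, 0x98 + 16, 0x1000)    # AddressOfEntryPoint
struct.pack_into("<I", fake, 0x98 + 28, 0x400000)  # ImageBase
result = entry_point_va(bytes(fake))
print([hex(v) for v in result])
```

The result only matches the real entry point if the image is loaded at its preferred base; after relocation the actual load address replaces ImageBase in the sum.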

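Two statements about ImageBase that need qualifying:

Statement B
ImageBase is where the program is loaded in virtual address space, i.e. the address the exe occupies once it is in memory. This holds only when the image is loaded at its preferred address; if it is relocated, the actual load address differs from ImageBase.

Statement C
ImageBase is the initial address at which the program is loaded into memory, i.e. the very start of the whole PE image, and every PE structure can be located from it because RVAs are offsets from this base. It is not the execution entry point, which is ImageBase + AddressOfEntryPoint.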

#### DLL File Structure

• Once we’ve found the address where the DLL has been loaded

• Can then parse the PE file format (in memory) to find the address of any functions of interest

• PE has several tables we need to consult

• Main table of interest is the Export Address Table
• Contains the address of each function

• But it is indexed by ordinals — need to find the ordinal for the function of interest

Ordinal is just a number
Export table might not necessarily start at ordinal 1 — need to consult the base ordinal details

https://docs.microsoft.com/en-us/windows/win32/debug/pe-format

##### DLL Export Tables

• Another table is the Export Name Pointer Table — list of the names of functions

• Can search this to find the function name (manually!)

• But the index into this table is not the ordinal

• Fortunately, there is another table we can use — Export Ordinal Table

• Can use the index of the name in the Export Name Pointer Table

• To find the ordinal in the Export Ordinal Table

• Then use that ordinal in the Export Address Table to find the address

Remember that you don’t know where strcmp is, so you’ll need to provide your own string comparison.

This can be done quickly by reading longwords and comparing them with constants in the machine code, e.g. the constant 0x50746547 matches the first four bytes of GetProcAddress (the ASCII characters "GetP" read as a little-endian 32-bit value).
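A quick way to check such a constant is to read the first four bytes of the name as a little-endian 32-bit value, which is what an x86 `mov` from memory would load. This is a small sketch using Python's `struct` module; the variable names are illustrative.

```python
import struct

name = b"GetProcAddress"
# "<I" = little-endian unsigned 32-bit: the bytes 47 65 74 50 ("GetP").
(first_dword,) = struct.unpack_from("<I", name, 0)
print(hex(first_dword))
```

The same trick works for any function name: shellcode often compares two or three such constants instead of carrying a full string comparison routine.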

This section records the exports of the image (yes, EXEs can export things). This takes the form of:

The export address table: an array of length N holding the addresses of the exported functions/data (the addresses are stored relative to the image base). Indexes into this table are called ordinals.

The export name pointer table: an array of length M holding pointers to strings that represent the name of an export. This array is lexically ordered by name, to allow binary searches for a given export.

The export ordinal table: a parallel array of length M holding the ordinal of the corresponding name in the export name pointer table.
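The lookup through the three tables can be sketched as follows. All table contents, the DLL base address and the function names are invented for illustration. The sketch also assumes that entries in the ordinal table are zero-based indexes into the export address table and that the ordinal base is added to obtain the exported ordinal.

```python
from bisect import bisect_left

# Hypothetical export directory of a DLL loaded at dll_base (all values invented).
dll_base = 0x10000000
ordinal_base = 1
export_address_table = [0x1500, 0x1200, 0x1800]   # RVAs of the exported functions
export_name_table = ["Alpha", "Beta", "Gamma"]    # lexically sorted export names
export_ordinal_table = [2, 0, 1]                  # parallel to the name table

def resolve_export(name: str) -> tuple[int, int]:
    """Return (ordinal, virtual address) for an exported function name."""
    i = bisect_left(export_name_table, name)      # binary search works because names are sorted
    if i == len(export_name_table) or export_name_table[i] != name:
        raise KeyError(name)
    index = export_ordinal_table[i]               # same position in the parallel ordinal table
    rva = export_address_table[index]             # address table entry for that index
    return index + ordinal_base, dll_base + rva   # an RVA becomes usable only after adding the base

beta = resolve_export("Beta")
alpha = resolve_export("Alpha")
print(beta[0], hex(beta[1]))
print(alpha[0], hex(alpha[1]))
```

Shellcode performs the same three steps directly on the in-memory PE image, usually with a linear scan and a hand-written comparison instead of a binary search.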

A piece of code is being delivered by an exploit (such as a buffer overflow) to a remote machine. Explain how the exploit might make use of the Portable Executable header structure of DLLs to call Windows API functions.

The code will need to find the addresses of Windows API functions manually by walking the PE file format:
• Find the start of a DLL by using the PEB to list the loaded DLLs.
• Once the correct DLL has been found, use the PE file format to find the function:
• Search the Export Name Pointer Table for the name of the function
• Map the index of the name in the Export Name Pointer Table to an ordinal in the Export Ordinal Table
• Then use the ordinal to find the address of the function in the Export Address Table
• Call the address to execute the function

API functions can therefore be reached either in the standard fashion (using LoadLibrary) or manually via the Process Environment Block.

#### Sections

The PE format divides an executable into several sections; different resources are stored in different sections.

[Broken image link removed]

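##### .text (code section)

Contains the executable code; it is normally the only section that contains code.

##### .idata (import data)

.idata holds information about the external DLL functions, files and data the executable uses, i.e. the import table.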

##### .reloc

Contains information for relocation of library files

.bss

.crt

.tls

.sdata

.pdata

.didat

### PE View [Tool]

Windows uses the PE format to store executable files and programs.

[Broken image link removed]

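1 shows the PE header information.

2 shows basic information about the file.
3 shows the compile time. This value can be faked. A very old compile time suggests an old attack, which antivirus software may well already cover. Note that every Delphi program carries the same compile time, 19 June 1992.

Subsystem indicates whether the program is a console program (IMAGE_SUBSYSTEM_WINDOWS_CUI) or a GUI program (IMAGE_SUBSYSTEM_WINDOWS_GUI).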

[Broken image link removed]

1, the virtual size, shows how much space is allocated to a section when it is loaded.
2, the size of raw data, shows how large the section is on disk.

[Broken image link removed]
[Broken image link removed]

### PE Studio [Tool]

PeStudio is a free tool for inspecting applications.

It clearly shows imported and exported functions that are flagged as blacklisted.

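### Linked Libraries and Functions

#### DLL Files

A DLL (Dynamic Link Library) file is a dynamically linked library.

DLL files use the same format as EXE files.

Windows uses DLLs to implement operating system functionality. The Windows API (the Win32 API, its 64-bit counterpart and so on) is exposed through DLLs for programs to call.

① Extending applications

② Making it easier for programmers to collaborate

③ Saving memory

④ Sharing program resources

⑤ Solving application localisation problems

#### Static, Runtime and Dynamic Linking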

[Broken image link removed]

#### Dependency Walker [Tool]

Shows which DLLs and functions an exe uses, how many functions a DLL contains, and which functions of which DLL the exe calls.

[Broken image link removed]

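Pane 2 shows the list of DLLs the program imports. After selecting a DLL in pane 2, pane 4 shows all the functions in that DLL that can be imported, which is not especially useful to us. The columns in panes 3 and 4 include ordinals; an executable can import a function by ordinal rather than by name.

Panes 5 and 6 show additional information about the DLL versions loaded when the program runs, and any errors reported.

Dependency Walker is a very useful PE module dependency analysis tool shipped with Microsoft Visual C++.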

https://www.pianshen.com/article/2385335/

#### Imported Functions

WINDOWS IMPORTS
• The imported functions from the Win32 API can tell us a lot about what the program potentially does
• Can use several tools to list the imports in a PE file
• At the very least, they should pose some questions to direct further analysis
• As you become more proficient, you will start to recognise various imports and the potential behaviour they indicate
• But at the start you’ll end up having to look up the functions in the documentation (see MSDN)

[Broken image link removed]

Example taken from Practical Malware Analysis, p19.
The A/W suffix on a function name indicates whether the function takes ANSI (A) or wide-character Unicode (W) strings.
The documentation for each function is available at:

https://docs.microsoft.com/en-us/windows/win32/api/

QUESTIONS ARISING
• It’s searching for files — which files and why?
• It’s creating/manipulating files — which files and why?
• It’s hooking into Windows — what for?
• It has a GUI — how do we activate it, what options does it give?
• Potentially via a hot key (see RegisterHotKey()) — which key combo?
• Does the .rsrc section contain something interesting?

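#### Exported Functions

A PE file contains information about which functions the file exports. Because a DLL exists precisely to implement exported functions for executables to use, exports are most common in DLLs and rare in exe files.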

## 2. Basic Dynamic Analysis

Dynamic Analysis — Looking at the malware ‘as it runs’

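Like basic static analysis, it is often ineffective against sophisticated malware.

Static first, then dynamic: move on to dynamic analysis once static analysis can go no further.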

### Sandboxes

Cuckoo Sandbox

Hybrid Analysis runs the malicious file for you and generates a report, including some static analysis.
https://www.hybrid-analysis.com/

### Running Malware

[Broken image link removed]
Command-line arguments should be visible as strings in the static analysis.

DLLs are a special case: they can be run with rundll32.exe.

C:\>rundll32.exe DLLname, Export arguments

Export must be a function name or an ordinal from the DLL's export table (for example #1 or #5).

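### Process Monitor [Tool]

[Broken image link removed]

Before starting a monitoring session:
pause capture first (File->Capture Events),
then clear the display (Edit->Clear Display).

Use filters to cut down the output, otherwise there is far too much of it.
Filtering only affects what is displayed; it does not stop Process Monitor from consuming memory.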

### Process Explorer [Tool]

[Broken image link removed]
[Broken image link removed]

### Network Monitoring

tcpdump or WireShark

FakeNet-NG
https://www.fireeye.com/services/freeware/fakenet-ng.html

It is also relatively straightforward to build an isolated LAN environment for testing; this page contains details of how to implement one.

Golden Rule: Check you are isolated before you run anything! traceroute/ping are your friends here.

#### Isolated Network Environments

Lots of malware makes network connections to local and remote machines. Can be instructive to watch this traffic to see what is happening.
e.g. by using tools such as tcpdump or WireShark

However, we do not want the malware to infect other computers or phone home, so we need to be able to run it within an isolated network environment.

Relatively straight-forward to build an isolated LAN.

Virtual machine hypervisors allow us to create virtual network switches, and to connect VMs to these switches. It is optional whether these switches are then connected to the host computer or to the wider network environment (Internet). If we only connect the VMs to the virtual network switch, we create a contained network environment that allows the VMs to talk to each other, but nothing else.

But we want to be able to see traffic going to remote machines on the Internet…

Need to build an isolated ‘Internet’…

Not as hard as it sounds!

We only have to ensure that the VM can send/receive packets to the addresses it tries to, we don’t have to emulate the whole Internet!

The routes packets take across the Internet can change, so even if the malware author were to know the IP address of the institution, it is unlikely they’d be able to detect that something had changed.

Can always introduce network latency to the connection to simulate travel over the open Internet.

The following assumes that you are building this using virtual machines, but the same principles apply if you want to build it using real hardware (e.g. to watch malware on a phone handset via an isolated wireless network).

There are also some tools, such as FakeNet-NG, that can be used to simulate the network from within the test VM. The approach below runs completely outside the VM, and as such requires multiple VMs to be setup. It is likely that you may need to use both approaches.

This page details how to set up the environment for malware analysis. There are many tutorials available online that detail how to configure Linux/OpenBSD/etc. as routers, DHCP servers or DNS servers, so that information is not duplicated here. Any links given are purely examples and were the first hits I came across on Google; other documentation is available, so search for it 😃

#### Isolated Internet Topology

Aiming for a topology similar to the one shown in the following diagram:

[Broken image link removed]
Technically, only the VMs in the dashed box need to be completely isolated, but I’d advise caution and connect the other bits to the real Internet only when needed.

Three main components of the network.

#### Isolated Network

(represented by the dashed box)

This is where we run our isolated VMs. It is important that this network does not connect to the host machine in any way.

Notice, I’ve assigned this an IP range in an RFC1918 space but this is entirely optional — remember this network is not on the Internet so we can use any IP addresses we like 😃

Need to create this as a separate isolated network in the hypervisor. Using VMware as an example, this is a two-step process, which mimics the physical process of setting up a network:

1. Create a new network (under Preferences)
[Broken image link removed]

Ensure that all the options are turned off, we do not want the host computer to be able to access this and we’ll provide DHCP/DNS via our router.

2. Attach the isolated VMs’ NICs to this network alone:
[Broken image link removed]

Make sure this is the only NIC in these VMs otherwise they’ll be able to talk to the real internet.

Other Hypervisor environments (VirtualBox, Hyper-V, etc.) offer similar facilities.

#### Router VM

Routes packets from the isolated network to elsewhere.

Standard VM, could run Windows, Linux, (Free|Open|Net|Dragonfly)BSD etc. — any OS that can route packets.

You may need to enable packet forwarding with sysctl.

My preference here is to use OpenBSD.

Designed to be secure by default!

Also has some nice networking features (such as allowing multiple virtual routing tables) that can be useful as setups get more advanced…

Also pf, OpenBSD’s firewall, is much simpler to configure than iptables IMO!

Will have multiple (virtual) NICs.

One in the isolated network.

One to connect to the network containing any services being provided to the isolated network.

Note down the MAC addresses of these NICs when creating them — very useful to help find which NIC is which in the router OS.

Provides DNS/DHCP services to our isolated network.
Need to be able to provide a recursive DNS server for isolated VMs to use.

DNS server logs will tell us what domains our machines are looking at.

Can be clever and return a honeypot address for all domains queried.

But watch out for malware that checks for this!

Do you want this to fetch real addresses from the Internet? Interesting question, DNS can be used as a back channel for communication.

And also, an authoritative DNS server so we can add/override our own entries.

Lots of software options here, but I’d probably start with djbdns for DNS…

Install WireShark, tcpdump etc.

Can be used to sniff traffic leaving the isolated network. Depending on the implementation of the hypervisor, you might be able to sniff all network traffic on the isolated network.

Note that it is possible to use tcpdump from the command line in the router VM to capture network packets to a file which can then be opened on another machine in WireShark for later perusal.

Even if hypervisor doesn’t let you sniff all traffic on the isolated network, it is possible to architect your network around this (create an individual isolated network for each VM and bridge together in the router…)

#### VMs to provide copies of Internet Services

These are optional, you don’t necessarily need them.

Again exist on a separate isolated network in your hypervisor. Create this in the same manner as before.

Again, the diagram shows I’ve used RFC1918 IP addresses here. Note though that I chose a vastly different set of IPs to make it clear which network is which!

Probably makes sense to use static IP allocation here, so you know which VM is providing which service.

Can use whatever OS you like.

Setup services as normal, e.g. use Apache WWW server to create a ‘Fake Google’…

Remember: we don’t need to duplicate all the functionality of the service, only the parts the malware accesses.

Again we would probably develop this iteratively. Run the malware, see what it accesses, set up a duplicate server, then rerun the malware.

Configure DNS server on router to provide the server’s IP for any domain (e.g. www.google.com should return 192.168.42.23 for our ‘Fake Google’ example in the diagram above).

Some machines will attempt to connect directly to a known IP, e.g. to 172.217.23.14 (instead of www.google.com)

This is relatively straight-forward to cope with.

Set up the desired IP as an alias on the loopback interface of the machine providing the service with a netmask of 255.255.255.255 (/32), so for the example in the diagram, we’d create the alias using:

OpenBSD: ifconfig lo0 inet alias 172.217.23.14 netmask 255.255.255.255

Tell the router to route packets to that IP, to the IP address of the VM providing the service.

The router will now send any packets for 172.217.23.14 to the correct VM.

Turn on packet forwarding on the VM providing the service, via sysctl, otherwise the VM won’t answer the packets.

The same approach will work for SSL/TLS encrypted connections, but you’ll need to add your signing certificate to the infected VMs.

Will probably end up adding more services as you see what the malware tries to access.

Run them on separate VMs and build up a library of common services to reuse (via VM snapshots/clones).

Could also allow some packets to access the real Internet, but be careful! (Add another virtual NIC to the router VM that is connected to the Internet).

### Unsorted Notes

[Broken image link removed]
Instructions encode the algorithms
Manipulate data, change values in memory, update the values in registers

[Broken image link removed]
[Broken image link removed]
[Broken image link removed]

[Broken image link removed]
Stuxnet’s 21 DLL exports
Taken from the Symantec analysis report on Stuxnet
W32.Stuxnet Dossier by Nicolas Falliere, Liam O Murchu, and Eric Chien

Often DLL behaviour is contained within a DllMain function — called when the DLL started

DllMain and DllEntryPoint:

DllEntryPoint:
A DLL entry point, typically called DllMain

DllMain: The DLL entry point. The name DllMain is a placeholder for the library-defined function name. The DirectShow implementation uses the name DllEntryPoint.

[Broken image link removed]

The first one is a bit like the old ‘spot the difference’ game.
The last one won’t necessarily tell us what the program does, but we can use it to see whether the program has been tampered with.
Look at process memory space exploration in the lab.

[Broken image link removed]
A bit like ‘spot the difference’.
What do we mean by state? Files, registry etc.: see if any have changed.
Fortunately we can automate this.

[Broken image link removed]

Our static analysis might reveal ‘files of interest’. If so, we can start off by just looking at them, but we might miss something.
Hashes will tell us which files have changed, but we wouldn’t have the original file to compare with.
Snapshots can be compared automatically, e.g. by using a tool such as Regshot.
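The ‘spot the difference’ idea can be sketched in a few lines of Python: hash every file before the run, hash again afterwards, and compare. This is only an illustration of the principle, not how Regshot works internally; the function names `snapshot` and `diff`, the temporary directory and the file names are all invented for the example.

```python
import hashlib
import os
import tempfile

def snapshot(root: str) -> dict[str, str]:
    """Map each file path under root to the SHA-256 of its contents."""
    state = {}
    for folder, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(folder, name)
            with open(path, "rb") as fh:
                state[path] = hashlib.sha256(fh.read()).hexdigest()
    return state

def diff(before: dict[str, str], after: dict[str, str]) -> dict[str, list[str]]:
    """Report files added, removed or modified between two snapshots."""
    return {
        "added": sorted(set(after) - set(before)),
        "removed": sorted(set(before) - set(after)),
        "modified": sorted(p for p in before.keys() & after.keys() if before[p] != after[p]),
    }

# Demonstration on a throwaway directory standing in for the analysis VM's disk.
with tempfile.TemporaryDirectory() as root:
    with open(os.path.join(root, "hosts"), "w") as fh:
        fh.write("127.0.0.1 localhost\n")
    before = snapshot(root)
    with open(os.path.join(root, "hosts"), "a") as fh:      # simulated tampering
        fh.write("192.168.42.23 www.google.com\n")
    with open(os.path.join(root, "keys.log"), "w") as fh:   # simulated dropped file
        fh.write("captured keystrokes\n")
    changes = diff(before, snapshot(root))
    summary = {kind: [os.path.basename(p) for p in paths] for kind, paths in changes.items()}
    print(summary)
```

As the notes point out, the hashes only say which files changed; keeping a copy of the original files (or a VM snapshot) is what lets you see how they changed.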

[Broken image link removed]
It might not always do the same thing each time it is run.
Enables us to confirm our suspicions. For example, if static analysis suggested this was a key logger, dynamic analysis might reveal the file that’s created containing the logged keys.
Can then roll back and run again, typing in a specific message to search for in the created file to confirm.
What are the limitations of this approach? (Worth thinking about.)

[Broken image link removed]
What if the malware changes something, then restores the original value?
Saw that with SolarWinds: it changed the DLLHost.exe registry key, then put it back afterwards.
Rootkits can hide files from programs running on the infected system.
But not if we look at the file system externally: by examining the disk image from outside the infected system, we can see the information the rootkit hides.

[Broken image link removed]
Outside world == anything outside the program

Network traffic is easy to intercept; Win32 API calls are harder.

[Broken image link removed]
Assuming we let it talk to the internet
We can feed in our own packets and create a fake internet for the malware to talk to, capturing the data it sends and feeding in our own replies.
Or proxy the connections to see what it is talking to (and how).

[Broken image link removed]

If networking is easy to monitor, monitoring API calls is harder.
Look at the tools on Wednesday.

WIN32 API CALLS

[Broken image link removed]

[Broken image link removed]

[Broken image link removed]
e.g. might see that the malware makes a DNS lookup to somewhere.
Then go and rerun it (from the same starting point), logging connections made to that machine.
Then find out that the data uses encryption (such as HTTPS).
So run it again using something

## 3. Advanced Static Analysis

[Broken image link removed]

### Machine Code

#### Opcodes

Opcode: one machine instruction. An x86 instruction is laid out as follows:

• Prefix (1 byte, optional)
• Opcode (1, 2 or 3 bytes)
• ModR/M (1 byte, optional): modifies which registers/memory are used
• SIB (1 byte, optional)
• Immediate value (1, 2 or 4 bytes), if used by the instruction
• 64-bit mode can have an extra byte in here too

[Broken image link removed]

• x86 instructions can vary in length from 1 to 15 bytes
• Difficult to fetch from memory, since the length is not known until you are three or four bytes into decoding it
• eip points to the first byte of the next instruction, and is updated as instructions are read
• It is possible to use this structure to make it difficult for a disassembler to find code
• But not the CPU…

#### Types of Machine Instructions

• Basic operations
• Maths (add, subtract, multiply, divide)
• Boolean Algebra
• Comparisons
• Memory Access, usually including support for stacks
• Flow control: jumps and branches (both conditional and unconditional)
• Subroutines: calling other functions. The function being called is the subroutine of its caller.
• And others

Other instructions depend on the type of CPU. RISC CPUs will often just have a minimal subset of the above (perhaps with instructions to switch processor modes).

CISC CPUs (such as the x86) add many more: AES encoding, string manipulation etc.

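### The x86 Architecture

Windows runs on both x86 and ARM CPUs.
x86, in its 32-bit and 64-bit forms, is by far the most common platform.

[Broken image link removed]
The control unit fetches instructions to execute from memory using a register (the instruction pointer), which stores the address of the instruction.

The ALU (Arithmetic Logic Unit) executes the instructions fetched from memory and places the results in registers or memory.

#### CISC Architecture

CISC stands for "Complex Instruction Set Computer". CISC instruction sets have been in use since the early days of computing.

CISC servers are mainly based on IA-32 (Intel Architecture) and are mostly low-end and mid-range machines.

Little-endian: the low-order byte is stored at the starting (lowest) address.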

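#### CPU

The CPU (central processing unit) is mainly responsible for executing code.

• In June 1978 Intel released the 8086 microprocessor, marking the arrival of the third generation of microprocessors. It had 16-bit registers, a 16-bit data bus and 29,000 transistors built on a 3-micron process, and sold for 360 US dollars. That price was too high for most people to afford a computer built around the chip, so a year later Intel released the 8088, a version with an 8-bit data bus. The first IBM PC, produced in 1981, used this chip.

• In the late 1970s Intel produced the famous 16-bit 8086 processor, followed by the 80186 and 80286;

• In 1985 Intel, following Motorola, became the second company to develop a 32-bit microprocessor, the 80386;

• In 1989 Intel released the 80486, which had a built-in floating-point unit;

• In 1993 Intel released the Pentium and stopped naming its products with numbers;

In industry and academia people still habitually call Intel's CPUs the x86 family, with the x acting as a wildcard for the leading digits.

The more formal name for x86 is IA-32 (Intel Architecture, 32-bit).

A characteristic of the x86 (IA-32) architecture is that the CPU registers are 32 bits wide, so it is also called a 32-bit CPU.

32-bit operating systems are also commonly called x86 systems.

AMD Opteron, 64-bit

[Broken image link removed]

x86, x86_64 and AMD64

When Intel started moving to a 64-bit architecture it had two options:

1. Stay backward compatible with x86
2. Redesign the instruction set completely, without x86 compatibility

i386

i386 is commonly used as a general term for Intel's 32-bit microprocessors.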

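##### Registers

A register is a small unit of temporary storage inside the CPU.
Accessing a register is faster than accessing memory, so by keeping values in registers the CPU can often avoid a memory access and save time.

A register can hold either data or an address.

1. Arithmetic and logical operations can be performed on data held in registers.
2. An address held in a register can point to a location in memory, i.e. addressing.

x86 register types:

• General registers: 8 of them, 32 bits wide (64 bits in 64-bit mode), used by the CPU during execution;
• Segment registers: 6 of them, 16 bits wide, used to track sections of memory;
• Status flags, Status Register (EFLAGS): used for conditional decisions;
• Instruction pointer, Instruction Pointer (EIP): used to locate the next instruction to execute;

64-bit mode adds another 8 general registers, r8-r15.

For example, string instructions use the contents of the ECX, ESI, and EDI registers as operands.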

[Broken image link removed]
Could access the ax register as two 8-bit registers (high byte ah and low byte al).

The same register was then extended to 32 bits, as eax, but you could still access the 16 bits (ax) or the 8 bits (ah and al). This happened around the time of the 80386.

The same register was then extended to 64 bits, as rax, but you could still access the 32 bits (eax), the 16 bits (ax) or the 8 bits (ah and al). This came with the AMD Opteron.
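The relationship between rax, eax, ax, ah and al is just a matter of which bits of the same register are selected. The following Python sketch shows this with masks and shifts; the value placed in `rax` is arbitrary and chosen so that each byte is easy to recognise.

```python
rax = 0x1122334455667788          # arbitrary 64-bit value

eax = rax & 0xFFFFFFFF            # low 32 bits of rax
ax = rax & 0xFFFF                 # low 16 bits of eax
ah = (rax >> 8) & 0xFF            # high byte of ax
al = rax & 0xFF                   # low byte of ax

print(hex(eax), hex(ax), hex(ah), hex(al))
```

Note that this only models reading the sub-registers; the rules for what happens to the upper bits on a write differ between the 32-bit and the smaller forms and are not modelled here.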

[Broken image link removed]
[Broken image link removed]

[Broken image link removed]

[Broken image link removed]

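###### General Registers

General registers usually hold data or memory addresses, and are often used interchangeably to get the job done.

The multiplication and division instructions implicitly use EAX and EDX.

There are also some conventions:
EAX usually holds the return value of a function. If you see EAX used immediately after a function call, the code is probably handling the return value.

EAX: accumulator.

EAX is the default register for many additions and multiplications.

AX is the main register for arithmetic operations.

EBX: base register.

ECX: count register. Its low 16 bits are CX, which in turn splits into the high 8 bits CH and the low 8 bits CL.

EDX: data register.

ESI/EDI: the source and destination index registers.

DS:ESI points to the source string.

EBP/ESP: the base pointer register and the stack pointer register; their low 16 bits are BP and SP. Each holds a pointer into the topmost stack frame of the system stack: EBP points to the bottom (base) of that frame and ESP points to its top.

SP is the stack pointer register.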

###### Flags Register

[Broken image link removed]

###### Instruction Pointer

The only job of EIP is to tell the processor what to do next.

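#### Instructions

mov ecx, 0x42

x86 has a register-memory architecture: instructions can operate on registers or directly on memory.

On a 32-bit system every address is 4 bytes long,
so consecutive slots, such as function arguments addressed by offset, are 4 bytes apart.

##### Opcodes and Endianness

The encoding of mov ecx, 0x42 is B9 42 00 00 00: B9 is the opcode for mov ecx, and the immediate 0x42 follows as the 4-byte little-endian value 42 00 00 00.

The IP address 127.0.0.1 is represented as 0x7F000001 in big-endian order (as used on the network) and as 0x0100007F in little-endian order (as stored locally in memory).

x86 CPUs use little-endian byte order.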
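Byte order is easy to check with Python's `struct` and `socket` modules. The sketch below rebuilds the encoding of `mov ecx, 0x42` from its opcode byte and a little-endian immediate, then reads the four bytes of 127.0.0.1 both ways. The variable names are illustrative.

```python
import socket
import struct

# mov ecx, 0x42: opcode B9 followed by the 32-bit immediate in little-endian order.
encoding = b"\xB9" + struct.pack("<I", 0x42)
print(encoding.hex(" "))

ip = socket.inet_aton("127.0.0.1")       # 4 bytes in network (big-endian) order
(as_big,) = struct.unpack(">I", ip)      # how the value reads on the wire
(as_little,) = struct.unpack("<I", ip)   # how an x86 CPU reads the same bytes
print(hex(as_big), hex(as_little))
```

This is why IP addresses and port numbers found in a disassembly often look "backwards" compared with what a packet capture shows.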

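##### Operands

• Immediate
A fixed value, such as 0x42.

• Register
Refers to a register, such as ecx.

• Memory address
Refers to a memory address. It is written as a value, a register or an expression (which computes the address) inside square brackets, such as [eax], which refers to the data at the memory address held in EAX. Computing the address with an expression inside the operand saves space, because no extra instructions are needed for the calculation; writing such an expression without the square brackets is an invalid instruction. The point to remember is that memory addresses can be computed.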

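##### Common Instructions

###### Data Movement Instructions

mov destination, source
Moves (copies) data from one location to another; it is used to read and write memory and to load registers.
Note that the destination is overwritten, not added to.
(mov is the simplest and most common instruction.)

Is it a copy or a cut? A copy: the source is left unchanged.

mov eax, ebx
Copies the contents of EBX into EAX.

mov eax, 0x42
Copies the value 0x42 into EAX.

mov eax, [0x4037C4]
Copies the 4 bytes at memory address 0x4037C4 into EAX.

mov eax, [ebx]
Copies the 4 bytes at the memory address held in EBX into EAX.

mov eax, [ebx + esi * 4]
Copies the 4 bytes at the memory address given by ebx + esi * 4 into EAX.

lea destination, source
Load effective address: stores the address computed from source, not the data at that address.

lea eax, [ebx+8]
Puts the value EBX+8 into EAX; this value may be an address.

mov eax, [ebx+8]
Loads the data at memory address EBX+8 into EAX.

[Broken image link removed]

[ebx+8] means: compute a memory address.

For example, if EBX holds 0x00B30040, then 0x00B30040 + 8 = 0x00B30048.

lea ebx, [eax * 5 + 5]
lea can also be used for plain arithmetic: this computes EAX * 5 + 5 and stores it in EBX in a single instruction.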
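The difference between lea and mov with the same bracketed operand can be modelled with a dictionary standing in for memory. This is a toy model, not an emulator: the register contents, the value stored at 0x00B30048 and the helper names `lea` and `mov_load` are all invented for illustration.

```python
# Hypothetical machine state: register values and a sparse memory map.
registers = {"eax": 0, "ebx": 0x00B30040}
memory = {0x00B30048: 0x20}          # invented 4-byte value stored at EBX+8

def lea(dest: str, address: int) -> None:
    """lea dest, [address]: store the computed address itself."""
    registers[dest] = address & 0xFFFFFFFF

def mov_load(dest: str, address: int) -> None:
    """mov dest, [address]: store the value read from memory at that address."""
    registers[dest] = memory[address & 0xFFFFFFFF]

lea("eax", registers["ebx"] + 8)
after_lea = registers["eax"]          # the address EBX+8
mov_load("eax", registers["ebx"] + 8)
after_mov = registers["eax"]          # the data stored at EBX+8
print(hex(after_lea), hex(after_mov))
```

Both instructions perform the same address calculation; only mov goes on to touch memory.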

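###### Arithmetic Instructions

sub destination, value
Subtracts value from destination.

sub modifies two important flags: ZF is set if the result is zero, and CF is set if the destination is smaller than the value subtracted.

sub eax, 0x10
Subtracts 0x10 from EAX.

inc edx

Increments EDX by 1

dec ecx

Decrements ECX by 1

The multiplication and division instructions work on the predefined registers EAX and EDX.
These registers are often assigned many instructions earlier,
so you need to look back through the surrounding code to find their values.

mul value
Multiplication: always multiplies EAX by value.

The result is a 64-bit value: EDX holds the high 32 bits and EAX holds the low 32 bits.

mul 0x50
Multiplies EAX by 0x50 and stores the result in EDX:EAX. (The immediate is shorthand; the real mul instruction takes a register or memory operand.)

div value

Divides the 64-bit value held in EDX and EAX together by value.

The quotient is stored in EAX and the remainder in EDX.

div 0x75
Divides EDX:EAX by 0x75, leaving the quotient in EAX and the remainder in EDX.

imul and idiv are the signed versions of mul and div.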
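How mul and div use the EDX:EAX pair can be sketched with plain integer arithmetic. The function names and input values below are illustrative assumptions; the model covers only the unsigned 32-bit forms.

```python
MASK32 = 0xFFFFFFFF

def mul(eax: int, value: int) -> tuple[int, int]:
    """Unsigned mul: return (edx, eax) holding the 64-bit product EDX:EAX."""
    product = (eax & MASK32) * (value & MASK32)
    return (product >> 32) & MASK32, product & MASK32

def div(edx: int, eax: int, value: int) -> tuple[int, int]:
    """Unsigned div of EDX:EAX by value: return (edx, eax) = (remainder, quotient)."""
    dividend = ((edx & MASK32) << 32) | (eax & MASK32)
    quotient, remainder = divmod(dividend, value & MASK32)
    if quotient > MASK32:
        # On a real CPU this case raises a divide error exception.
        raise OverflowError("quotient does not fit in EAX")
    return remainder, quotient

high, low = mul(0x80000000, 0x50)     # product overflows 32 bits, so EDX is non-zero
rem, quo = div(0, 0x100, 0x75)        # 256 / 117
print(hex(high), hex(low), rem, quo)
```

This is also why code usually clears or sign-extends EDX before a division: whatever is left in EDX becomes the high half of the dividend.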

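xor eax, eax
Clears the EAX register

xor ecx, ecx
Clears the ECX register

or eax, 0x7575
Performs a bitwise OR of EAX with 0x7575.

[Broken image link removed]

shr destination, count
Shifts destination right by count bits.

shl destination, count
Shifts destination left by count bits.

ror: rotate right; bits shifted out of the low end re-enter at the high end.

rol: rotate left; bits shifted out of the high end re-enter at the low end.

mov eax, 0xA
shl eax, 2
Shifts 0xA (binary 1010) left by 2 bits, leaving EAX = 0x28 (binary 101000).

mov bl, 0xA
ror bl, 2
Rotates the 8-bit value 00001010 right by 2 bits, leaving BL = 0x82 (binary 10000010).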
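The two examples can be reproduced in Python, which has shifts but no rotate operator, so the rotate is built from two shifts. The helper names `shl32`, `ror8` and `rol8` are assumptions made for this sketch.

```python
def shl32(value: int, count: int) -> int:
    """Shift left within a 32-bit register; bits shifted past bit 31 are lost."""
    return (value << count) & 0xFFFFFFFF

def ror8(value: int, count: int) -> int:
    """Rotate an 8-bit value right; bits leaving the low end re-enter at the top."""
    count %= 8
    return ((value >> count) | (value << (8 - count))) & 0xFF

def rol8(value: int, count: int) -> int:
    """Rotate an 8-bit value left; bits leaving the high end re-enter at the bottom."""
    count %= 8
    return ((value << count) | (value >> (8 - count))) & 0xFF

eax = shl32(0xA, 2)      # mov eax, 0xA / shl eax, 2
bl = ror8(0xA, 2)        # mov bl, 0xA  / ror bl, 2
print(hex(eax), format(bl, "08b"))
```

Rotates lose no information, which is one reason they show up so often in simple encoding and hashing routines inside malware.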

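###### Conditional Instructions

test eax, eax

test performs a bitwise AND of the destination and source operands, exactly as and does, but it discards the result and only sets the flags; the operands are left unchanged.

After a test instruction, look at the ZF flag: test eax, eax sets ZF when EAX is zero.

test eax,eax
je xxxxxxxx
Jumps if EAX is zero.

Flags
CF: carry flag
PF: parity flag
AF: auxiliary carry flag
ZF: zero flag
SF: sign flag
OF: overflow flag

cmp [ebp+argc], 3

cmp [ebp+VersionInformation.dwPlatformId], 2

cmp [ebp+VersionInformation.dwMajorVersion], 5

cmp performs the same subtraction as sub, but it also only sets the flags and leaves its operands unchanged.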
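As a rough model of what cmp and test leave behind, the sketch below computes only ZF and CF for 32-bit unsigned operands; the other flags are ignored. The function names and sample values are assumptions for illustration.

```python
MASK32 = 0xFFFFFFFF

def flags_after_cmp(dst: int, src: int) -> dict[str, int]:
    """ZF and CF after cmp dst, src (computes dst - src and discards the result)."""
    dst, src = dst & MASK32, src & MASK32
    return {"ZF": int(dst == src), "CF": int(dst < src)}

def flags_after_test(a: int, b: int) -> dict[str, int]:
    """ZF and CF after test a, b (computes a AND b and discards the result)."""
    return {"ZF": int(((a & b) & MASK32) == 0), "CF": 0}

equal = flags_after_cmp(3, 3)        # e.g. cmp [ebp+argc], 3 with argc == 3
below = flags_after_cmp(2, 3)        # destination smaller than source
zero = flags_after_test(0, 0)        # test eax, eax with EAX == 0
print(equal, below, zero)
```

A following jz or je is taken when ZF is 1, which is how the pair `test eax, eax` / `je` implements "if (eax == 0)".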

[Broken image link removed]

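###### Branch Instructions

Assembly has no if statement, only conditional jumps.
A conditional jump uses the flags to decide whether to jump or to continue with the next instruction.

jmp location
Unconditional jump to location.

jmp short
Short jump with an 8-bit relative displacement.

jmp near
Near jump with a 16-bit relative displacement (32-bit in 32-bit code).

jz
Jump if ZF = 1.

jnz
Jump if ZF = 0.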

[Broken image link removed]

###### Repeat Instructions

rep

repe, repz

repne, repnz

[Broken image link removed]

repe cmpsb
rep stosb
rep movsb
repne scasb

[Broken image link removed]

[Broken image link removed]

###### Other Instructions

http://www.intel.com/products/processor/manuals/index.htm

#### Memory (RAM)

[Broken image link removed]

x86 MEMORY MODEL
• x86 has various models for how we can access memory
• Segmented memory
• Address accessed built up from the value in a segment register, and the address specified in the instruction
• Fortunately, these days most operating systems use a flat memory model
• But the segment registers still exist…

(Including Windows)

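##### Stack

The stack is a last-in, first-out (LIFO) data structure operated by push and pop.

In memory the stack grows downward:
the highest address is used first,
and as more data is pushed, lower and lower addresses are used.

Stack diagrams are usually drawn with the lowest address at the top
and the highest address at the bottom,
which is why stack locations are addressed by adding to or subtracting from a register.

The x86 architecture has built-in support for the stack through ESP and EBP.
ESP is the stack pointer
and holds the address of the top of the stack.
Its value changes whenever data is pushed or popped.

EBP is the base pointer.
It stays constant within a function,
so the program can use it as a fixed reference point for locating local variables and parameters.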

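##### Stack instructions

push, pop, call, leave, enter, ret.

Function arguments are placed on the stack with push.
In the stack diagram, argument 1 is at the bottom, argument 2 sits above argument 1, and so on.

pop ebx
Pops the value at the top of the stack into EBX.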
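A toy model of push and pop in C shows the downward growth of the stack. Everything here is an illustrative assumption: memory is a small array of 32-bit words, ESP is a byte offset into it, and the struct and function names are made up.

```c
#include <assert.h>
#include <stdint.h>

#define STACK_WORDS 16u

/* Toy machine: stack memory plus a stack pointer (byte offset). */
struct machine {
    uint32_t mem[STACK_WORDS];
    uint32_t esp;
};

/* An empty stack starts just past the highest address. */
void machine_init(struct machine *m)
{
    m->esp = STACK_WORDS * 4u;
}

/* push value: ESP decreases by 4, then the value is stored at [ESP]. */
void emu_push(struct machine *m, uint32_t value)
{
    assert(m->esp >= 4u);  /* running out of room would be a stack overflow */
    m->esp -= 4u;
    m->mem[m->esp / 4u] = value;
}

/* pop: the value at [ESP] is read, then ESP increases by 4. */
uint32_t emu_pop(struct machine *m)
{
    uint32_t value;
    assert(m->esp < STACK_WORDS * 4u);  /* popping an empty stack */
    value = m->mem[m->esp / 4u];
    m->esp += 4u;
    return value;
}
```

Pushing 1 and then 2 moves ESP from 64 down to 56; the first pop returns 2 (last in, first out) and the second returns 1, leaving ESP back at 64.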

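##### Function calls

Every function call creates a new stack frame.
A function maintains its own stack frame until it returns,
at which point the caller's stack frame is restored and execution returns to the calling function.

Many functions contain a prologue:
a few lines of code at the start of the function
that save the stack and register state the function is going to use.

The epilogue at the end of the function
restores the stack and registers to their state before the call.

1. Arguments are placed on the stack with push instructions.
2. The function is called with call. The return address (the address of the instruction after the call, taken from EIP) is pushed onto the stack and is used to return to the calling code when the function finishes. EIP is then set to the start address of the function.
3. The function prologue pushes EBP (the base pointer) onto the stack, which saves it for the calling function, and allocates stack space for local variables.
4. The function does its work.
5. The function epilogue restores the stack. ESP is adjusted to free the local variables, and EBP is restored so that the calling function can address its own variables correctly. The leave instruction can serve as an epilogue because it sets ESP to EBP and then pops EBP off the stack.
6. The function returns with ret, which pops the return address off the stack into EIP, so the program continues from where the call was made.
7. The stack is adjusted to remove the arguments that were pushed, unless they will be used again.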
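The seven steps can be traced on a simulated stack. The sketch below assumes a cdecl-style call (arguments pushed right to left, caller cleans up) to a hypothetical two-argument add function; struct cpu, the helper names and the return address used by the caller are all invented for illustration.

```c
#include <assert.h>
#include <stdint.h>

enum { WORDS = 32 };

struct cpu {
    uint32_t mem[WORDS];    /* stack memory, one slot per 32-bit word */
    uint32_t esp, ebp, eax;
};

static void push(struct cpu *c, uint32_t v)
{
    assert(c->esp >= 4);
    c->esp -= 4;
    c->mem[c->esp / 4] = v;
}

static uint32_t pop(struct cpu *c)
{
    uint32_t v = c->mem[c->esp / 4];
    c->esp += 4;
    return v;
}

static uint32_t load(const struct cpu *c, uint32_t addr)
{
    return c->mem[addr / 4];
}

/* Callee: int add(int a, int b) { int sum = a + b; return sum; } */
static void callee_add(struct cpu *c)
{
    push(c, c->ebp);               /* step 3: push ebp                     */
    c->ebp = c->esp;               /*         mov ebp, esp                 */
    c->esp -= 4;                   /*         sub esp, 4 (one local)       */

    c->mem[(c->ebp - 4) / 4] =     /* step 4: [ebp-4] = [ebp+8] + [ebp+12] */
        load(c, c->ebp + 8) + load(c, c->ebp + 12);
    c->eax = load(c, c->ebp - 4);  /*         return value goes in EAX     */

    c->esp = c->ebp;               /* step 5: leave = mov esp, ebp         */
    c->ebp = pop(c);               /*                 pop ebp              */
}

/* Caller side; returns the address that ret loaded into EIP. */
uint32_t call_add(struct cpu *c, uint32_t a, uint32_t b, uint32_t return_addr)
{
    uint32_t eip;
    push(c, b);                    /* step 1: arguments, right to left     */
    push(c, a);
    push(c, return_addr);          /* step 2: call pushes the return address */
    callee_add(c);
    eip = pop(c);                  /* step 6: ret pops it into EIP         */
    c->esp += 8;                   /* step 7: caller removes the arguments */
    return eip;
}

void cpu_init(struct cpu *c)
{
    c->esp = WORDS * 4;
    c->ebp = WORDS * 4;
    c->eax = 0;
}
```

Inside the callee, [ebp+8] is the first argument and [ebp-4] is the first local, which is the positive and negative offset pattern that disassemblers label as arguments and local variables.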

FUNCTION CALLS
• Like almost all CPUs, x86 supports calling subroutines (function calls)
• Does this using the call instruction
• Address of next instruction can be specified
• Relative (to the current instruction)
• Absolute
• Indirectly (the address pointed to by…)

• Once the destination address is calculated call will then
• Current instruction pointer (eip/rip) pushed onto the stack
• Instruction pointer set to address of start of subroutine
• Can return from a subroutine using ret
• Pops the old instruction pointer from the stack
• Places it into eip/rip so next instruction carries on after call
• Optionally, can then add an offset to the stack pointer

FUNCTION CALLS AND THE STACK
• Note that ret fetches the return address off the stack
• The stack is also used for various other things such as local variables and arrays
• So it is possible to overwrite the return address on the stack
• If malware can control where ret returns to, it can cause a program to do ‘something else’…
• Often used as a vector for initially executing malware code…
• Two mechanisms used
• Cause the stack to be overwritten with the code we want to execute
• Return-oriented programming

Hello, buffer overflow!
There are ways to protect against code being executed on the stack, such as Data Execution Prevention (DEP), which relies on the CPU's no-execute bit.

FUNCTION CALL ARGUMENTS
• call enables us to call a subroutine, but how do we pass arguments?

• Several conventions used, most involve placing arguments onto the stack

• Caller and callee need to agree on the way arguments are passed…

• Return values passed in eax/rax

• Variations exist on whether it is the job of the caller or the callee to clean the arguments off the stack

• And the order the values are placed on the stack (left to right, or right to left)

• Other variants will use registers to pass values

Note that smaller data types can be promoted in size to larger sizes!

cdecl is the default for C/C++ on Windows. Argument lengths are given in bytes.
There is also clrcall for managed functions, thiscall is the default for C++ methods, and 64-bit code has its own calling convention, as does ARM.

FUNCTION PROLOGUES/EPILOGUE
• Most C compilers will compile a function prologue at the start of the function
• Pushes the current value of ebp
• Sets ebp to value of esp
• Allocate space for local variables on the stack (using sub)
• Preserve registers
• At the end of the function, an equivalent epilogue is generated to restore the stack/registers
• Side-effect of this is that ebp can be used to trace back up the call stack

##### Stack layout

[Broken image link removed]

[Broken image link removed]

##### Disassembly

• Need to also generate labels so we know where branches and subroutine calls go to
• If you come across a branch, subroutine call etc. then you make a note of that as being another place to start converting code from…
• Should (in theory) find all the code accessible in the program

But there is more than one way to make a jump, so the disassembler can miss portions of code. You might come across them later and need to go back and reanalyze some more code. (Now what would happen if we were to jump into the middle of an instruction…)

ASSEMBLY SYNTAX
Two syntaxes used for x86 assembly language
• Intel syntax
• AT&T syntax
• Usual to use Intel syntax for x86 assembly language in the DOS and Windows world
• Instructions tend to have two operands — the first is the destination
• So add eax, 2 would mean eax = eax + 2

AT&T syntax is common in the UNIX world (e.g. Linux).

### Ghidra (tool)

Window → Defined Strings

command + shift + F

https://www.sohu.com/a/299745429_120054144

https://ghidra-sre.org/

https://www.shogunlab.com/blog/2019/04/12/here-be-dragons-ghidra-0.html

https://www.shogunlab.com/blog/2019/12/22/here-be-dragons-ghidra-1.html

https://ghidra-sre.org/CheatSheet.html

### IDA Pro (tool)

https://www.cnblogs.com/sch01ar/p/9537760.html

https://blog.csdn.net/dyxcome/article/details/91345138

https://blog.csdn.net/wang010366/article/details/52505345

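When opening a file, choose manual load so that the PE header and all sections are loaded; malware often hides information there.

The IDA graph view shows the execution flow.
The Yes (condition true) arrow is green by default,
the No (condition false) arrow is red by default.

find all occurrences.

Auto comments

In the function list on the left you can filter by function length; complex functions tend to be longer. The L flag marks library functions, compiler-generated code that you can skip when looking for functions of interest. The start function is usually the program's entry point.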

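#### Shortcuts

Press G to jump directly to an address.

Shift + F12 opens the Strings window; it is also available under View → Open subviews → Strings.

#### Cross-references (xref)

type = r means read, type = w means write.

#### Analyzing functions

A var_ prefix, or a negative offset on the right, marks a local variable.
An arg_ prefix, or a positive offset on the right, marks a parameter.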

### Windows

#### PEB

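The PEB (Process Environment Block) is a structure that holds information about a process. It has many fields, including global context, startup parameters and data for the program image loader.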

• Part of the kernel’s data structures about each process

• Fortunately this one lives in user space, so we can access it

• Contains a field called Ldr
• Pointer to a PEB_LDR_DATA structure

In computing the Process Environment Block (abbreviated PEB) is a data structure in the Windows NT operating system family. It is an opaque data structure that is used by the operating system internally, most of whose fields are not intended for use by anything other than the operating system.[1] Microsoft notes, in its MSDN Library documentation — which documents only a few of the fields — that the structure “may be altered in future versions of Windows”.[2] The PEB contains data structures that apply across a whole process, including global context, startup parameters, data structures for the program image loader, the program image base address, and synchronization objects used to provide mutual exclusion for process-wide data structures.[1]

https://www.cnblogs.com/DeeLMind/p/6854986.html

https://blog.csdn.net/CSNN2019/article/details/113113347

https://blog.csdn.net/CSNN2019/article/details/113105811

https://www.jianshu.com/p/28c8689b22af

[Broken image link removed]

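/* Abridged 32-bit layout: the fields at offsets 00h, 01h and 08h (and a few later ones) are omitted. */
typedef struct _PEB
{
UCHAR BeingDebugged;                             // 02h    note: nonzero when a debugger is attached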
UCHAR Spare;                                     // 03h
PVOID Mutant;                                    // 04h
PPEB_LDR_DATA Ldr;                               // 0Ch
PRTL_USER_PROCESS_PARAMETERS ProcessParameters;  // 10h
PVOID SubSystemData;                             // 14h
PVOID ProcessHeap;                               // 18h
PVOID FastPebLock;                               // 1Ch
PPEBLOCKROUTINE FastPebLockRoutine;              // 20h
PPEBLOCKROUTINE FastPebUnlockRoutine;            // 24h
ULONG EnvironmentUpdateCount;                    // 28h
PVOID* KernelCallbackTable;                      // 2Ch
PVOID EventLogSection;                           // 30h
PVOID EventLog;                                  // 34h
PPEB_FREE_BLOCK FreeList;                        // 38h
ULONG TlsExpansionCounter;                       // 3Ch
PVOID TlsBitmap;                                 // 40h
ULONG TlsBitmapBits[0x2];                        // 44h
PVOID AnsiCodePageData;                          // 58h
PVOID OemCodePageData;                           // 5Ch
PVOID UnicodeCaseTableData;                      // 60h
ULONG NumberOfProcessors;                        // 64h
ULONG NtGlobalFlag;                              // 68h    note: also checked by anti-debugging code
UCHAR Spare2[0x4];                               // 6Ch
LARGE_INTEGER CriticalSectionTimeout;            // 70h
ULONG HeapSegmentReserve;                        // 78h
ULONG HeapSegmentCommit;                         // 7Ch
ULONG HeapDeCommitTotalFreeThreshold;            // 80h
ULONG HeapDeCommitFreeBlockThreshold;            // 84h
ULONG NumberOfHeaps;                             // 88h
ULONG MaximumNumberOfHeaps;                      // 8Ch
PVOID** ProcessHeaps;                            // 90h
PVOID GdiSharedHandleTable;                      // 94h
PVOID ProcessStarterHelper;                      // 98h
PVOID GdiDCAttributeList;                        // 9Ch
ULONG OSMajorVersion;                            // A4h
ULONG OSMinorVersion;                            // A8h
ULONG OSBuildNumber;                             // ACh
ULONG OSPlatformId;                              // B0h
ULONG ImageSubSystem;                            // B4h
ULONG ImageSubSystemMajorVersion;                // B8h
ULONG ImageSubSystemMinorVersion;                // C0h
ULONG GdiHandleBuffer[0x22];                     // C4h
PVOID ProcessWindowStation;                      // ???
} PEB;
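The two highlighted fields are what simple anti-debugging checks read. The sketch below works on a byte buffer that stands in for a raw copy of a 32-bit PEB, so it runs anywhere; on 32-bit Windows the real structure is reached through fs:[0x30]. The offsets come from the listing above, while the function names, PEB_SIZE_FOR_DEMO and the DEBUG_HEAP_FLAGS mask (the heap-checking bits 0x10 | 0x20 | 0x40 commonly described as set when a process is started under a debugger) are assumptions for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Offsets taken from the 32-bit listing. */
enum {
    PEB_BEING_DEBUGGED = 0x02,
    PEB_NT_GLOBAL_FLAG = 0x68,
    PEB_SIZE_FOR_DEMO  = 0x70  /* hypothetical: just enough bytes for this sketch */
};

/* Assumed mask: heap tail check, free check and parameter validation. */
#define DEBUG_HEAP_FLAGS 0x70u

/* peb points at a raw copy of a 32-bit PEB. */
bool peb_being_debugged(const uint8_t *peb)
{
    assert(peb != NULL);
    return peb[PEB_BEING_DEBUGGED] != 0;
}

/* Reads the 32-bit NtGlobalFlag field and tests the debug heap bits. */
bool peb_debug_heap_flags_set(const uint8_t *peb)
{
    uint32_t nt_global_flag;
    assert(peb != NULL);
    memcpy(&nt_global_flag, peb + PEB_NT_GLOBAL_FLAG, sizeof nt_global_flag);
    return (nt_global_flag & DEBUG_HEAP_FLAGS) != 0;
}
```

In a disassembly the same check typically appears as a load from fs:[30h] followed by a byte read at offset 2 or a dword read at offset 68h, so those constants are worth searching for.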



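### C language

#### The main function

int main(int argc, char ** argv)

argc is the number of command-line arguments, including the program name itself.
argv is a pointer to an array of strings holding all the command-line arguments.

xxx.exe -r filename.txt

argc = 3
argv[0] = xxx.exe
argv[1] = -r
argv[2] = filename.txt

#include <string.h>
#include <windows.h>

int main(int argc, char * argv[])
{
    /* Expect exactly: program name, option, file name. */
    if (argc != 3) {return 0;}
    /* Delete the named file only when the option starts with -r. */
    if (strncmp(argv[1], "-r", 2) == 0) {
        DeleteFileA(argv[2]);
    }
    return 0;
}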



004113CE cmp [ebp+argc], 3 ; is argc equal to 3?
004113D2 jz short loc_4113D8

004113D4 xor eax, eax
004113D6 jmp short loc_411414
004113D8 mov esi, esp

004113DA push 2 ; MaxCount
004113DC push offset Str2 ; "-r"
004113E1 mov eax, [ebp+argv] ; the start address of the argv array is loaded into eax
004113E4 mov ecx, [eax+4] ; eax plus 4 (the offset) gives argv[1]

004113E7 push ecx; Str1
004113E8 call strncmp
004113F8 test eax, eax
004113FA jnz short loc_411412

004113FC mov esi, esp ; executed only if -r was given on the command line
004113FE mov eax, [ebp+argv]
00411401 mov ecx, [eax+8] ; offset 8 into argv[] gives argv[2]
00411404 push ecx ; lpFileName
00411405 call DeleteFileA


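#### C functions

int strncmp(const char *str1, const char *str2, size_t n)

wsprintf
wsprintf() writes formatted characters and values into a buffer.

%d formats a signed decimal integer
%ld formats a signed decimal long integer
%i, %li are the same as %d, %ld
%u formats an unsigned decimal integer
%lu formats an unsigned decimal long integer
%s formats a string
%c formats a single character
%x formats an unsigned integer in hexadecimal (lowercase a-f)
%X formats an unsigned integer in hexadecimal (uppercase A-F)
%o formats an unsigned integer in octal
%p formats a pointer address in hexadecimal
%Ix formats an unsigned hexadecimal value that is 64 bits wide on 64-bit Windows and 32 bits wide on 32-bit Windows (lowercase a-f)
%IX is the same as %Ix with uppercase A-F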
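A short example of the common specifiers. Note the substitution: wsprintf exists only on Windows, so this sketch uses the standard snprintf as a portable stand-in, and the Windows-specific %Ix and %IX are left out. The function name format_demo is made up.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Formats n with several specifiers into out and returns the length
   reported by snprintf. */
int format_demo(char *out, size_t out_size, int n)
{
    assert(out != NULL && out_size > 0);
    return snprintf(out, out_size, "%d %u %x %X %o %c %s",
                    n, (unsigned)n, (unsigned)n, (unsigned)n, (unsigned)n,
                    'A', "str");
}
```

Called with n = 255 and a 64-byte buffer, this produces the string "255 255 ff FF 377 A str". Format strings like this are useful during analysis because they often survive in the binary and reveal how the malware builds file names, URLs or log lines.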

strcmp: string comparison
strcmp is short for "string compare". It compares two strings and returns an integer: strcmp(str1, str2) returns zero if str1 equals str2, a negative number if str1 < str2, and a positive number if str1 > str2.
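A small C sketch of the return-value convention, and of the difference between strcmp and the strncmp(..., 2) check used for the -r option. The helper names sign_of, is_dash_r_prefix and is_dash_r_exact are invented for this example.

```c
#include <assert.h>
#include <string.h>

/* Reduces a strcmp-style result to -1, 0 or 1. The C standard only
   promises the sign of the result, not a particular magnitude. */
int sign_of(int result)
{
    return (result > 0) - (result < 0);
}

/* strncmp with n = 2 looks only at the first two characters,
   so "-rf" matches as well as "-r". */
int is_dash_r_prefix(const char *arg)
{
    assert(arg != NULL);
    return strncmp(arg, "-r", 2) == 0;
}

/* strcmp compares the whole string, so only "-r" itself matches. */
int is_dash_r_exact(const char *arg)
{
    assert(arg != NULL);
    return strcmp(arg, "-r") == 0;
}
```

In disassembly both calls are followed by "test eax, eax", and the branch on ZF tells you whether the code continues on a match (result 0) or on a mismatch.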

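lstrcmp

lpString1: pointer to the first string to compare.
lpString2: pointer to the second string to compare.

memcmp: compares two blocks of memory

malloc: allocates memory

send: sends data on a connected socket

atoi

ASCII to integer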

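### ShellCode

Shellcode is a piece of code executed by exploiting a software vulnerability. It is raw machine code, usually shown as hexadecimal bytes, and takes its name from the fact that it often gives the attacker a shell.

Shellcode is usually written in machine language.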

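## Glossary

### Manifest

A manifest is an XML description file. Every DLL can have its own manifest, and so can every application.

Versions of Windows before XP run the exe as they always did: they look for the required DLLs and make no distinction based on the manifest, which is just an extra file or resource. The DLLs are looked up directly in the system32 directory and loaded from there.

How an EXE loads a DLL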

This is where your three options for requestedExecutionLevel start to come out:

asInvoker: The application will run with the same permissions as the process that started it. The application can be elevated to a higher permission level by selecting Run as Administrator.

highestAvailable: The application will run with the highest permission level that it can. If the user who starts the application is a member of the Administrators group, this option is the same as requireAdministrator. If the highest available permission level is higher than the level of the opening process, the system will prompt for credentials.

requireAdministrator: The application will run with administrator permissions. The user who starts the application must be a member of the Administrators group. If the opening process is not running with administrative permissions, the system will prompt for credentials.

https://docs.microsoft.com/en-us/windows/win32/sbscs/application-manifests

Dynamic analysis lets us see when the program calls these functions, in what sequence, and potentially with what parameters.

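### Software

#### General software

##### Sysinternals

Sysinternals was originally a set of free tools provided by Winternals, a company whose main products were system recovery and data protection software. Its engineers wrote many small utilities to solve the problems they ran into in their daily work.

The Sysinternals Suite bundles a range of free system tools, including the well-known Process Explorer, FileMon and RegMon. If system administrators are soldiers, the Sysinternals Suite is the weapon in their hands.

##### UPX

UPX (the Ultimate Packer for eXecutables) is a full-featured executable packer. It supports executables on almost every platform, including dos/exe, dos/com, dos/sys, djgpp2/coff, watcom/le, win32/pe, rtm32/pe, tmt/adam, atari/tos and linux/i386, achieves a very good compression ratio, and can also compare a file before and after compression.

#### Malware

##### WannaCry

WannaCry (also called Wanna Decryptor) is a worm-like ransomware program, 3.3 MB in size, that spreads by using EternalBlue, a dangerous exploit leaked from the NSA (National Security Agency) [1].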

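### Cobalt Strike

Cobalt Strike is a GUI penetration-testing framework built on Metasploit. It integrates port forwarding, service scanning, automated exploitation, multi-mode listeners, and generation of exe and PowerShell payloads,

as well as Java-based execution and automated browser attacks.

Cobalt Strike is designed mainly for team operations: several attackers can connect to the same team server at once and share attack resources, target information and sessions.

As a collaborative tool for penetrating internal networks and acting as a command-and-control console, Cobalt Strike has become a first choice of many APT groups.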

### SolarWinds

https://www.4hou.com/posts/pBk6

https://www.microsoft.com/security/blog/2021/01/20/deep-dive-into-the-solorigate-second-stage-activation-from-sunburst-to-teardrop-and-raindrop

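SolarWinds is a provider of IT management software.

Its products cover network and security management;
in its own words, SolarWinds is changing the way companies of all sizes monitor and manage their networks.

On 14 December 2020, Reuters and The Washington Post reported that the update servers for SolarWinds' Orion network monitoring software had been compromised and malicious code inserted, leaving users in several US government agencies, including the Treasury and Commerce departments, exposed to long-term intrusion and surveillance.

The SolarWinds supply-chain attack compromised many US government agencies and private companies.

Microsoft refers to this attack as Solorigate.

#### RainDrop

The discovery of Raindrop was an important step in understanding the SolarWinds attack.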

https://blog.csdn.net/smellycat000/article/details/112914568

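## Some resources

https://www.isolves.com/it/aq/hk/

tor networking
onion routing protocol

http://www.xinhuanet.com//2017-07/21/c_1121360325.htm

gnome

A DDoS attack is more effective when combined with a rootkit installed on the compromised server.

DDoS attacks, cryptocurrency mining, spreading viruses, online fraud, spam, phishing, theft of personal data leading to extortion or identity theft, renting out or selling bots, and so on.

1. A flaw in how the HTTP Content-Length limit is handled can lead to a denial-of-service attack.

2. To improve browsing performance, modern browsers also support concurrent access: they open several connections while loading a page so that multiple images on it can be fetched quickly and the whole page transfers faster. HTTP/1.1 provides persistent connections for this, and the proposed next-generation protocol, HTTP-NG, added support for session control and richer content negotiation to make connections more efficient.

[1] In practice, cybercriminals attack head-on by exploiting vulnerabilities in unpatched systems, and from the side by using trojans against people who lack security awareness. Either way they gain control of the computer, which is then called a "bot" or "zombie host". What does an attacker do after taking control of hundreds or thousands of zombie hosts? This is where the uses of a botnet come in, and the most common is distributed denial of service (DDoS). As a rough analogy, think of trying to grab face masks, red envelopes or coupons online and always missing out: resources are limited and there are too many competitors. Denial of service works the same way. It occupies resources until the service can no longer be used normally, and a denial-of-service attack launched from a botnet can take down the website of a small or medium-sized business in an instant.

10,000 bots can send 4.5 million packets.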

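# Some files

## exe

svchost.exe
svchost.exe is normally a child process of services.exe.

winlogon.exe
The Windows Logon Process (winlogon.exe) is the Windows NT logon program and manages user logon and logoff. Its normal location is C:\Windows\System32 and it runs as the SYSTEM user. A winlogon.exe that is in a different location or does not run as SYSTEM may be the W32.Netsky.D@mm worm, which spreads by email and infects the machine when its attachment is opened. winlogon.exe is a likely target for attackers.

explorer.exe

wupdmgr.exe
wupdmgr.exe stands for Windows Update Manager. It is the automatic update program, lives in c:\windows\system32, and is regenerated immediately if it is deleted or renamed.

winup.exe
winup.exe (infecter.undef.65501) is a file infected by a file infector; file infection is one of the classic virus techniques and is still widely used.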

## dll

Kernel32.dll
This is a very common DLL that contains core functionality, such as access and manipulation of memory, files, and hardware.

As the name suggests, Kernel32.dll provides kernel-related functionality.

Advapi32.dll

User32.dll
This DLL contains all the user-interface components, such as buttons, scroll bars, and components for controlling and responding to user actions.

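User32.dll contains the functions that carry out user-interface tasks, for example passing the user's mouse clicks to a window so that the window can run the handler for that click.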

Gdi32.dll
This DLL contains functions for displaying and manipulating graphics.

GDI stands for Graphics Device Interface. Gdi32.dll contains functions for drawing and for displaying text; showing a program window, for example, calls functions in this DLL to draw the window.

Ntdll.dll
This DLL is the interface to the Windows kernel. Executables generally do not import this file directly, although it is always imported indirectly by Kernel32.dll. If an executable imports this file, it means that the author intended to use functionality not normally available to Windows programs. Some tasks, such as hiding functionality or manipulating processes, will use this interface.

WSock32.dll and Ws2_32.dll
These are networking DLLs. A program that accesses either of these most likely connects to a network or performs network-related tasks.

Wininet.dll
This DLL contains higher-level networking functions that implement protocols such as FTP, HTTP, and NTP.

sfc_os.dll

psapi.dll
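psapi.dll is the Windows process status support module (Process Status API).

urlmon.dll
urlmon.dll is a Microsoft module related to object linking and embedding (OLE). It is normally created automatically when the operating system is installed and is essential for the system to run properly, so users should not modify it.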

# Common Windows functions

GetModuleFileName

GetModuleBaseName

IsProcessorFeaturePresent

lstrcpy(LPWSTR lpString1, LPCWSTR lpString2)

common Win32 API calls

CreateDirectory

CreateProcessA

CreateFile
WriteFile

MoveFile

CopyFile
DeleteFile

FindFirstFileW

FindNextFileW

FindFirstFile/FindNextFile
Used to search through a directory and enumerate the filesystem.

RegisterClassExW
SetWindowTextW
ShowWindow

SetWindowsHookExW

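LowLevelKeyboardProc, LowLevelMouseProc

The system calls them every time a new keyboard or mouse input event occurs.

LowLevelKeyboardProc and LowLevelMouseProc are callback functions; SetWindowsHookExW is used to register which function is called when a keyboard or mouse event occurs.

A hook procedure is something like an event listener: once registered, it is called whenever the specified event occurs. LowLevelKeyboardProc is a hook that responds to keyboard activity.

This works because the function is exported by the malware's DLL, so the system can find it and load it automatically.

Processes on Windows have separate address spaces, so code in one process cannot simply run in the address space of another. By installing a global hook, however, the DLL that contains the hook procedure can be loaded by the operating system into the address space of other processes, which amounts to DLL injection. Once injected, the DLL's code can do anything inside the other process's address space.

1. Call SetWindowsHookEx to install a system-wide hook.

2. Put the implementation of the hook procedure in a DLL.

A Windows hook is used to hook things too; more abstractly, it hooks Windows events or messages. The most common are mouse and keyboard hooks: once a hook is attached to the mouse or keyboard, every mouse or keyboard action can be observed through it. Several hook procedures can be attached to the same hook.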

http://blog.sina.com.cn/s/articlelist_1585708262_3_1.html

RegisterHotKey

RegCloseKey
RegDeleteValueW
RegOpenCurrentUser
RegOpenKeyExW
RegQueryValueExW

RegCreateKey

RegSetValue

Software\Microsoft\Windows\CurrentVersion\Run is commonly used by malware; it controls which programs are launched automatically when Windows starts.

accept
Used to listen for incoming connections. This function indicates that the program will listen for incoming connections on a socket.

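OpenProcessToken, LookupPrivilegeValue and AdjustTokenPrivileges are often called together, typically to enable SeDebugPrivilege. This is a classic sequence in malware, used to make sure the process has the privileges it needs to call certain functions.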

OpenProcessToken

AdjustTokenPrivileges
Used to enable or disable specific access privileges. Malware that performs process injection often calls this function to gain additional permissions.

SeDebugPrivilege权限
By default, users can debug only processes that they own. In order to debug processes owned by other users, you have to possess the SeDebugPrivilege privilege. But don’t grant this privilege casually, because once you do, you gave away the farm. If you let users debug processes owned by other users, then they can debug processes owned by System, at which point they can inject code into the process and perform the logical equivalent of net localgroup administrators anybody /add, thereby elevating themselves (or anybody else) to administrator.

LookupPrivilegeValue

AttachThreadInput
Attaches the input processing for one thread to another so that the second thread receives input events such as keyboard and mouse events. Keyloggers and other spyware use this function.

bind
Used to associate a local address to a socket in order to listen for incoming connections.

BitBlt
Used to copy graphic data from one device to another. Spyware sometimes uses this function to capture screenshots. This function is often added by the compiler as part of library code.

CallNextHookEx
Used within code that is hooking an event set by SetWindowsHookEx. CallNextHookEx calls the next hook in the chain. Analyze the function calling CallNextHookEx to determine the purpose of a hook set by SetWindowsHookEx.

CertOpenSystemStore
Used to access the certificates stored on the local system.

CheckRemoteDebuggerPresent
Checks to see if a specific process (including your own) is being debugged. This function is sometimes used as part of an anti-debugging technique.

CoCreateInstance
Creates a COM object. COM objects provide a wide variety of functionality. The class identifier (CLSID) will tell you which file contains the code that implements the COM object.

connect
Used to connect to a remote socket. Malware often uses low-level functionality to connect to a command-and-control server.

ConnectNamedPipe
Used to create a server pipe for interprocess communication that will wait for a client pipe to connect. Backdoors and reverse shells sometimes use ConnectNamedPipe to simplify connectivity to a command-and-control server.

ControlService
Used to start, stop, modify, or send a signal to a running service. If malware is using its own malicious service, you’ll need to analyze the code that implements the service in order to determine the purpose of the call.

CreateFile
Creates a new file or opens an existing file.

CreateFileMapping
Creates a handle to a file mapping that loads a file into memory and makes it accessible via memory addresses. Launchers, loaders, and injectors use this function to read and modify PE files.

CreateMutex
Creates a mutual exclusion object that can be used by malware to ensure that only a single instance of the malware is running on a system at any given time. Malware often uses fixed names for mutexes, which can be good host-based indicators to detect additional installations of the malware.

OpenMutex
Opens a handle to a mutual exclusion object that can be used by malware to ensure that only a single instance of malware is running on a system at any given time. Malware often uses fixed names for mutexes, which can be good host-based indicators.

The program creates a Mutex to ensure only one instance is running

CreateProcess
Creates and launches a new process. If malware creates a new process, you will need to analyze the new process as well.

CreateService
Creates a service that can be started at boot time. Malware uses CreateService for persistence, stealth, or to load kernel drivers.

CreateToolhelp32Snapshot
Used to create a snapshot of processes, heaps, threads, and modules. Malware often uses this function as part of code that iterates through processes or threads.

Process32First/Process32Next
Used to begin enumerating processes from a previous call to CreateToolhelp32Snapshot. Malware often enumerates through processes to find a process to inject into.

OpenProcess
Opens a handle to another process running on the system. This handle can be used to read and write to the other process memory or to inject code into the other process. In an injection sequence it is the call that opens the target (host) process.

GetCurrentProcess

GetProcessHeap

CreateRemoteThread
Used to start a thread in a remote process (one other than the calling process). Launchers and stealth malware use CreateRemoteThread to inject code or a DLL into a different process. It is the core of remote DLL loading: the new remote thread makes the target process call an API function.

WriteProcessMemory
Used to write data to a remote process. Malware uses WriteProcessMemory as part of process injection, for example to write the name of the DLL to be loaded into the target process.

VirtualAllocEx
A memory-allocation routine that can allocate memory in a remote process. Malware sometimes uses VirtualAllocEx as part of process injection, to allocate memory in the target process.

LocalAlloc

uBytes: [in] The number of bytes to allocate.
uFlags: [in] Specifies how to allocate memory. If zero is specified, the default is the LMEM_FIXED flag.

GlobalAlloc

LocalFree

CryptAcquireContext
Often the first function used by malware to initialize the use of Windows encryption. There are many other functions associated with encryption, most of which start with Crypt.

CryptGenKey
https://docs.microsoft.com/en-us/previous-versions/aa925731(v=msdn.10)

This function generates a random cryptographic session key or a public/private key pair for use with the cryptographic service provider (CSP). The function retrieves a handle to the key in the phKey parameter. This handle can then be used as needed with any of the other CryptoAPI functions requiring a key handle.

When calling this function, the application must specify the algorithm. Because this algorithm type is kept bundled with the key, the application does not need to specify the algorithm later when the actual cryptographic operations are performed.

CryptEncrypt
https://docs.microsoft.com/en-us/previous-versions/aa925235(v=msdn.10)
This function encrypts data. The key held by the cryptographic service provider (CSP) and referenced by the hKey parameter specifies the algorithm used to encrypt the data parameter.

CryptImportKey
https://docs.microsoft.com/en-us/previous-versions/aa919782(v=msdn.10)

CryptExportKey
https://docs.microsoft.com/en-us/previous-versions/windows/embedded/ms884452(v=msdn.10)
This function exports cryptographic keys out of a cryptographic service provider (CSP) in a secure manner.

The caller passes to the CryptExportKey function a handle to the key to be exported and gets a key binary large object (BLOB). This key BLOB can be sent over a nonsecure transport or stored in a nonsecure storage location. The key BLOB is useless until the intended recipient uses the CryptImportKey function, which imports the key into the recipient’s CSP.

DeviceIoControl
Sends a control message from user space to a device driver. DeviceIoControl is popular with kernel malware because it is an easy, flexible way to pass information between user space and kernel space.

DllCanUnloadNow
An exported function that indicates that the program implements a COM server.

DllGetClassObject
An exported function that indicates that the program implements a COM server.

DllInstall
An exported function that indicates that the program implements a COM server.

DllRegisterServer
An exported function that indicates that the program implements a COM server.

DllUnregisterServer
An exported function that indicates that the program implements a COM server.

EnableExecuteProtectionSupport
An undocumented API function used to modify the Data Execution Protection (DEP) settings of the host, making it more susceptible to attack.

EnumProcesses
Used to enumerate through running processes on the system. Malware often enumerates through processes to find a process to inject into.

EnumProcessModules
Used to enumerate the loaded modules (executables and DLLs) for a given process. Malware enumerates through modules when doing injection.

GetVolumeInformation

FindWindow
Searches for an open window on the desktop. Sometimes this function is used as an anti-debugging technique to search for OllyDbg windows.

FtpPutFile
A high-level function for uploading a file to a remote FTP server.

GetAdaptersInfo
Used to obtain information about the network adapters on the system. Backdoors sometimes call GetAdaptersInfo as part of a survey to gather information about infected machines. In some cases, it’s used to gather MAC addresses to check for VMware as part of anti-virtual machine techniques.

GetAsyncKeyState
Used to determine whether a particular key is being pressed. Malware sometimes uses this function to implement a keylogger.

GetDC
Returns a handle to a device context for a window or the whole screen. Spyware that takes screen captures often uses this function.

GetForegroundWindow
Returns a handle to the window currently in the foreground of the desktop. Keyloggers commonly use this function to determine in which window the user is entering his keystrokes.

gethostbyname
Used to perform a DNS lookup on a particular hostname prior to making an IP connection to a remote host. Hostnames that serve as command-and-control servers often make good network-based signatures.

gethostname
Retrieves the hostname of the computer. Backdoors sometimes use gethostname as part of a survey of the victim machine.

GetKeyState
Used by keyloggers to obtain the status of a particular key on the keyboard.

GetModuleFileName
Returns the filename of a module that is loaded in the current process. Malware can use this function to modify or copy files in the currently running process.

GetModuleHandle
Used to obtain a handle to an already loaded module. Malware may use GetModuleHandle to locate and modify code in a loaded module or to search for a good location to inject code.

GetStartupInfo
Retrieves a structure containing details about how the current process was configured to run, such as where the standard handles are directed.

GetSystemDefaultLangId
Returns the default language settings for the system. This can be used to customize displays and filenames, as part of a survey of an infected victim, or by “patriotic” malware that affects only systems from certain regions.

GetWindowsDirectory
Returns the file path to the Windows directory (usually C:\Windows). Malware sometimes uses this call to determine into which directory to install additional malicious programs.

GetTempPath
Returns the temporary file path. If you see malware call this function, check whether it reads or writes any files in the temporary file path.

SHGetSpecialFolderPath

GetThreadContext
Returns the context structure of a given thread. The context for a thread stores all the thread information, such as the register values and current state.

GetTickCount
Retrieves the number of milliseconds since bootup. This function is sometimes used to gather timing information as an anti-debugging technique. GetTickCount is often added by the compiler and is included in many executables, so simply seeing it as an imported function provides little information.

GetVersionEx
Returns information about which version of Windows is currently running. This can be used as part of a victim survey or to select between different offsets for undocumented structures that have changed between different versions of Windows.

IsDebuggerPresent
Checks to see if the current process is being debugged, often as part of an anti-debugging technique. This function is often added by the compiler and is included in many executables, so simply seeing it as an imported function provides little information.

IsNTAdmin
Checks if the user has administrator privileges.

IsWow64Process
Used by a 32-bit process to determine if it is running on a 64-bit operating system.

LdrLoadDll
Low-level function to load a DLL into a process, just like LoadLibrary. Normal programs use LoadLibrary, and the presence of this import may indicate a program that is attempting to be stealthy.

LoadLibrary
Loads a DLL into a process that may not have been loaded when the program started. Imported by nearly every Win32 program.

GetProcAddress
Retrieves the address of a function in a DLL loaded into memory. Used to import functions from other DLLs in addition to the functions imported in the PE file header.

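GetProcAddress together with LoadLibrary indicates that the program loads DLLs and resolves functions at run time. Ask which DLLs are loaded dynamically and which functions are then called.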

LoadResource
Loads a resource from a PE file into memory. Malware sometimes uses resources to store strings, configuration information, or other malicious files.

FindResource
Used to find a resource in an executable or loaded DLL. Malware sometimes uses resources to store strings, configuration information, or other malicious files. If you see this function used, check for a .rsrc section in the malware’s PE header.

SizeofResource
Returns the size, in bytes, of a resource located with FindResource.

LsaEnumerateLogonSessions
Enumerates through logon sessions on the current system, which can be used as part of a credential stealer.

MapViewOfFile
Maps a file into memory and makes the contents of the file accessible via memory addresses. Launchers, loaders, and injectors use this function to read and modify PE files. By using MapViewOfFile, the malware can avoid using WriteFile to modify the contents of a file.

MapVirtualKey
Translates a virtual-key code into a character value. It is often used by keylogging malware.

MmGetSystemRoutineAddress
Similar to GetProcAddress but used by kernel code. This function retrieves the address of a function from another module, but it can only get addresses from ntoskrnl.exe and hal.dll.

Module32First/Module32Next
Used to enumerate through modules loaded into a process. Injectors use this function to determine where to inject code.

NetScheduleJobAdd
Submits a request for a program to be run at a specified date and time. Malware can use NetScheduleJobAdd to run a different program. As a malware analyst, you’ll need to locate and analyze the program that will be run in the future.

NetShareEnum
Used to enumerate network shares.

NtQueryDirectoryFile
Returns information about files in a directory. Rootkits commonly hook this function in order to hide files.

NtQueryInformationProcess
Returns various information about a specified process. This function is sometimes used as an anti-debugging technique because it can return the same information as CheckRemoteDebuggerPresent.

NtSetInformationProcess
Can be used to change the privilege level of a program or to bypass Data Execution Prevention (DEP).

OleInitialize
Used to initialize the COM library. Programs that use COM objects must call OleInitialize prior to calling any other COM functions.

OpenSCManager
Opens a handle to the service control manager. Any program that installs, modifies, or controls a service must call this function before any other service-manipulation function.

OutputDebugString
Outputs a string to a debugger if one is attached. This can be used as an anti-debugging technique.

PeekNamedPipe
Used to copy data from a named pipe without removing data from the pipe. This function is popular with reverse shells.

QueryPerformanceCounter
Used to retrieve the value of the hardware-based performance counter. This function is sometimes used to gather timing information as part of an anti-debugging technique. It is often added by the compiler and is included in many executables, so simply seeing it as an imported function provides little information.

QueueUserAPC
Used to execute code for a different thread. Malware sometimes uses QueueUserAPC to inject code into another process.

ReadProcessMemory
Used to read the memory of a remote process.

recv
Receives data from a remote machine. Malware often uses this function to receive data from a remote command-and-control server.

RegisterHotKey
Used to register a handler to be notified anytime a user enters a particular key combination (like CTRL-ALT-J), regardless of which window is active when the user presses the key combination. This function is sometimes used by spyware that remains hidden from the user until the key combination is pressed.

RegOpenKey
Opens a handle to a registry key for reading and editing. Registry keys are sometimes written as a way for software to achieve persistence on a host. The registry also contains a whole host of operating system and application setting information.

ResumeThread
Resumes a previously suspended thread. ResumeThread is used as part of several injection techniques.

RtlCreateRegistryKey
Used to create a registry key from kernel-mode code.

RtlWriteRegistryValue
Used to write a value to the registry from kernel-mode code.

SamIConnect
Connects to the Security Account Manager (SAM) in order to make future calls that access credential information. Hash-dumping programs access the SAM database in order to retrieve the hash of users’ login passwords.

SamIGetPrivateData
Queries the private information about a specific user from the Security Account Manager (SAM) database. Hash-dumping programs access the SAM database in order to retrieve the hash of users’ login passwords.

SamQueryInformationUser
Queries information about a specific user in the Security Account Manager (SAM) database. Hash-dumping programs access the SAM database in order to retrieve the hash of users’ login passwords.

send
Sends data to a remote machine. Malware often uses this function to send data to a remote command-and-control server.

SetFileTime
Modifies the creation, access, or last modified time of a file. Malware often uses this function to conceal malicious activity.

SetThreadContext
Used to modify the context of a given thread. Some injection techniques use SetThreadContext.

SetWindowsHookEx
Sets a hook function to be called whenever a certain event is called. Commonly used with keyloggers and spyware, this function also provides an easy way to load a DLL into all GUI processes on the system. This function is sometimes added by the compiler.

SfcTerminateWatcherThread
Used to disable Windows file protection and modify files that otherwise would be protected. SfcFileException can also be used in this capacity.

ShellExecute
Used to execute another program. If malware creates a new process, you will need to analyze the new process as well.

StartServiceCtrlDispatcher
Used by a service to connect the main thread of the process to the service control manager. Any process that runs as a service must call this function within 30 seconds of startup. Locating this function in malware tells you that the function should be run as a service.

SuspendThread
Suspends a thread so that it stops running. Malware will sometimes suspend a thread in order to modify it by performing code injection.

system
Function to run another program provided by some C runtime libraries. On Windows, this function serves as a wrapper function to CreateProcess.

Thread32First/Thread32Next
Used to iterate through the threads of a process. Injectors use these functions to find an appropriate thread to inject into.

Toolhelp32ReadProcessMemory
Used to read the memory of a remote process.

URLDownloadToFile
A high-level call to download a file from a web server and save it to disk. This function is popular with downloaders because it implements all the functionality of a downloader in one function call. If this function is present, check the strings for a URL; if there is one, the file is very likely downloaded from it. Also consider where the downloaded file will be written.

VirtualProtectEx
Changes the protection on a region of memory. Malware may use this function to change a read-only section of memory to an executable.

WideCharToMultiByte
Used to convert a Unicode string into an ASCII string.

WinExec
Used to execute another program. If malware creates a new process, you will need to analyze the new process as well. Since it runs an executable, ask which process is being started.

WlxLoggedOnSAS (and other Wlx* functions)
A function that must be exported by DLLs that will act as authentication modules. Malware that exports many Wlx* functions might be performing Graphical Identification and Authentication (GINA) replacement, as discussed in Chapter 11.

Wow64DisableWow64FsRedirection
Disables file redirection that occurs in 32-bit files loaded on a 64-bit system. If a 32-bit application writes to C:\Windows\System32 after calling this function, then it will write to the real C:\Windows\System32 instead of being redirected to C:\Windows\SysWOW64.

WSAStartup
Used to initialize low-level network functionality. Finding calls to WSAStartup can often be an easy way to locate the start of network-related functionality.

GetCurrentProcessId

CloseHandle

## Networking

inet_addr
Converts an IP address string like 127.0.0.1 so that it can be used by functions such as connect. The string specified can sometimes be used as a network-based signature.

InternetOpen
Initializes the high-level Internet access functions from WinINet, such as InternetOpenUrl and InternetReadFile. Searching for InternetOpen is a good way to find the start of Internet access functionality. One of the parameters to InternetOpen is the User-Agent, which can sometimes make a good network-based signature.

InternetOpenUrl
Opens a specific URL for a connection using FTP, HTTP, or HTTPS. URLs, if fixed, can often be good network-based signatures.

InternetCloseHandle

InternetReadFile
Reads data from a previously opened URL.

InternetWriteFile
Writes data to a previously opened URL.

InternetGetConnectedState

ServiceMain

sc delete MySQL

# Other Resources

http://www.ruanyifeng.com/blog/2018/01/assembly-language-primer.html

https://moodle.nottingham.ac.uk/mod/page/view.php?id=4850840

https://moodle.nottingham.ac.uk/pluginfile.php/7236075/mod_resource/content/0/2020.COMP4101.04.Lab02.pdf

Bilibili courses ("Bilibili University")
https://www.bilibili.com/video/BV1e4411r7VP?p=1

https://blog.csdn.net/qq_44370676

https://search.bilibili.com/all?keyword=%E6%B1%87%E7%BC%96&from_source=nav_suggest_new

https://blog.csdn.net/baidu_41108490/article/details/80323492

https://blog.csdn.net/baidu_41108490/article/details/80298973

https://blog.csdn.net/m0_37442062/article/details/102926761

If xxx.DLL has several imported and exported functions and I want to run it during dynamic analysis, how do I decide which function to pass as the argument to rundll32.exe?

You would need to identify, from your static analysis (e.g. via Dependency Walker), which exported functions can be run.

32-bit file? 64-bit file?

What do the W/A suffixes mean?
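
As a short answer to the suffix question: Win32 functions that take strings usually come in an A variant (ANSI, one byte per character) and a W variant (wide, UTF-16). This matters when searching strings, because text meant for the W variants is normally stored as UTF-16LE, where each ASCII character is followed by a zero byte. A minimal sketch, with a hypothetical helper name, of finding such wide strings:

```python
import re

def wide_strings(data: bytes, min_len: int = 4) -> list[str]:
    """Return UTF-16LE runs of printable ASCII characters (heuristic)."""
    pattern = rb"(?:[\x20-\x7e]\x00){%d,}" % min_len
    return [m.decode("utf-16-le") for m in re.findall(pattern, data)]

ansi = "cmd.exe".encode("ascii")      # layout an A function would receive
wide = "cmd.exe".encode("utf-16-le")  # layout a W function would receive
print(len(ansi), len(wide))
print(wide_strings(b"\xff\xfe" + wide + b"\x00\x00"))
```

A strings tool that only looks for ASCII runs would miss the second form entirely.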

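Internal C/C++ runtime library functions usually begin with an underscore, for example:

__crtTerminateProcess

C Runtime library (CRT): used by programs written in C.

Practical Malware Analysis Labs 7, 9, 11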
https://blog.csdn.net/qq_35713009/article/details/88388609

https://malware-guide.com/blog/how-to-remove-guesswho-file-extension-ransomware

https://www.pcrisk.com/removal-guides/15384-guesswho-ransomware#creating-data-backups

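## Basic Static Analysis

【General】

1. First upload the sample to VirusTotal and Hybrid Analysis to look at its characteristics and the antivirus engines' descriptions, and make a rough guess at its type;
2. Inspect the file with PEview and PE Studio;
3. Function names do not always start with an uppercase letter; some are all lowercase;
4. CreateProcess and Sleep are common in backdoors. They often appear together with the strings exec and sleep: the exec string is a command sent to the backdoor over the network, and sleep tells the backdoor to go dormant.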

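【Is the file packed or encrypted?】

• Use PEiD and VirusTotal together as a double check and see whether either reports a packer; "Microsoft Visual C++" identifies the compiler and is not a sign of packing;
• Use PEview or VirusTotal to check whether the sections are named UPX0, UPX1, UPX2; if so, the file is packed. Sections with no name also suggest packing. In addition, a section whose virtual size is far larger than its raw size suggests packing;
• Very few imports, no more than about 10 (even Hello World has somewhat more), suggest the file may be packed; more than 10 imports but still relatively few suggests a small program;
• UPX unpack command: upx -o xxx.exe -d Lab01-02.exe
• Compare sizes (virtual size against raw size)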
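
The section checks in the list above can be sketched in a few lines. This is a hedged illustration rather than a replacement for PEview or PEiD: the thresholds (`ratio=4`, `min_imports=10`) and all helper names are assumptions, and `build_demo_pe` only fabricates header bytes for the demo.

```python
import struct

def parse_sections(pe: bytes) -> list[tuple[str, int, int]]:
    """Return (name, virtual_size, raw_size) for each PE section header."""
    e_lfanew = struct.unpack_from("<I", pe, 0x3C)[0]
    if pe[:2] != b"MZ" or pe[e_lfanew:e_lfanew + 4] != b"PE\x00\x00":
        raise ValueError("not a PE file")
    num_sections, = struct.unpack_from("<H", pe, e_lfanew + 6)
    opt_size, = struct.unpack_from("<H", pe, e_lfanew + 20)
    table = e_lfanew + 24 + opt_size  # section table follows the optional header
    sections = []
    for i in range(num_sections):
        name, vsize, _vaddr, raw = struct.unpack_from("<8sIII", pe, table + 40 * i)
        sections.append((name.rstrip(b"\x00").decode("ascii", "replace"), vsize, raw))
    return sections

def packing_hints(sections, import_count, ratio=4, min_imports=10):
    """Collect reasons to suspect packing (thresholds are assumptions)."""
    hints = []
    for name, vsize, raw in sections:
        if name.upper().startswith("UPX"):
            hints.append(f"{name!r} is a UPX section name")
        if not name:
            hints.append("unnamed section")
        if vsize > ratio * max(raw, 1):
            hints.append(f"{name!r}: virtual size {vsize:#x} far exceeds raw size {raw:#x}")
    if import_count < min_imports:
        hints.append(f"only {import_count} imports")
    return hints

def build_demo_pe(sections):
    """Build header bytes only, for demonstration (not a runnable executable)."""
    dos = b"MZ" + b"\x00" * 0x3A + struct.pack("<I", 0x40)
    coff = struct.pack("<HHIIIHH", 0x14C, len(sections), 0, 0, 0, 0, 0)
    table = b"".join(
        struct.pack("<8sIIIIIIHHI", n.encode(), vs, 0, rs, 0, 0, 0, 0, 0, 0)
        for n, vs, rs in sections)
    return dos + b"PE\x00\x00" + coff + table

demo = build_demo_pe([("UPX0", 0x8000, 0), ("UPX1", 0x3000, 0x2E00)])
for hint in packing_hints(parse_sections(demo), import_count=6):
    print(hint)
```

The import count is passed in separately because parsing the import table is beyond this sketch.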

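【What to look for in strings】

• A name such as xxx.exe suggests the malware operates on that .exe file. It may be the victim, that is, the target file, and calls such as OpenProcess may all be aimed at it;

• A public IP address or a URL suggests network-capable malware. Strings may use look-alike characters such as 1 and l to mislead us. Strings can also give you leads to check on an infected machine, for example custom keywords such as Malservice;

• Path-like strings may be registry keys;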
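
A small sketch of pulling those indicators out of a list of strings. The regular expressions, the short `KNOWN_DLLS` set, and the sample strings are all assumptions for illustration; real triage would use a longer list of known names.

```python
import ipaddress
import re

URL_RE = re.compile(r"https?://[^\s\"'<>]+", re.IGNORECASE)
IPV4_RE = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")
KNOWN_DLLS = {"kernel32.dll", "user32.dll", "advapi32.dll"}

def network_indicators(strings):
    """Collect URLs and globally routable IPv4 addresses."""
    urls, public_ips = [], []
    for s in strings:
        urls.extend(URL_RE.findall(s))
        for candidate in IPV4_RE.findall(s):
            try:
                ip = ipaddress.ip_address(candidate)
            except ValueError:
                continue  # not a valid address, e.g. 999.1.1.1
            if ip.is_global:
                public_ips.append(candidate)
    return {"urls": urls, "public_ips": public_ips}

def lookalike_names(strings):
    """Flag names that match a known DLL only after mapping '1' to 'l'."""
    canonical = {name.replace("1", "l") for name in KNOWN_DLLS}
    return [s for s in strings
            if s.lower() not in KNOWN_DLLS
            and s.lower().replace("1", "l") in canonical]

# Hypothetical strings output.
found = ["GET http://example.com/payload.bin HTTP/1.1",
         "127.0.0.1", "8.8.8.8", "kerne132.dll", "kernel32.dll"]
print(network_indicators(found))
print(lookalike_names(found))
```

Loopback and private addresses are filtered out because they say little about where the malware connects.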

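【Inspect the unpacked file】
Mainly look at the imports, exports, and strings.

• Infer functionality from the imports and exports, using Dependency Walker and VirusTotal to list them. kernel32.dll and msvcrt.dll are imported by almost every file. LoadResource, FindResource, and SizeofResource indicate that the program works with resources or extracts data from the .rsrc section. At that point, ask what kind of data it is and what it represents; the extracted data may be written out as a new file and run as a separate program (an embedded executable). Open the file in Resource Hacker: if a resource is binary but contains the line "This program cannot be run in DOS mode", an error message found in the header area of every PE file, then another file is embedded in this one. Save it with Action -> Save resource as binary file and analyze it with PEview and similar tools. WinExec runs an executable on disk. When you see registry-related functions, search the strings for matching registry keys and values;

Then check whether any function calls operate on the registry. (If the registry is never touched, the malware is not using the registry for persistence.)

sfc_os.dll: suspicious.

OpenProcessToken + AdjustTokenPrivileges + SeDebugPrivilege + LookupPrivilegeValue: these four are often used together in malware. It is a classic combination, used to make sure the process has the privileges needed to call certain functions.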
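
Spotting such combinations in an import list can be automated. The sketch below is an assumption-laden illustration: the capability labels are mine, the groups only cover functions mentioned in these notes, and the A/W suffix stripping is a heuristic.

```python
CAPABILITIES = {
    "privilege adjustment": {"OpenProcessToken", "LookupPrivilegeValue",
                             "AdjustTokenPrivileges"},
    "resource extraction": {"FindResource", "LoadResource", "SizeofResource"},
    "process enumeration": {"EnumProcesses", "EnumProcessModules",
                            "GetModuleBaseName"},
    "dynamic API resolution": {"LoadLibrary", "GetProcAddress"},
}

def base_name(api: str) -> str:
    """Strip a trailing A/W suffix, e.g. LoadLibraryA -> LoadLibrary."""
    if len(api) > 1 and api[-1] in "AW" and api[-2].islower():
        return api[:-1]
    return api

def detect_capabilities(imports):
    """Return the capability labels whose functions are all imported."""
    present = {base_name(name) for name in imports}
    return sorted(cap for cap, needed in CAPABILITIES.items()
                  if needed <= present)

# Hypothetical import list.
imports = ["OpenProcessToken", "LookupPrivilegeValueA", "AdjustTokenPrivileges",
           "LoadLibraryA", "GetProcAddress", "FindResourceA"]
print(detect_capabilities(imports))
```

FindResourceA alone does not trigger "resource extraction" here, because the other two functions of that group are absent.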

\system32\wupdmgr.exe

Related to GetTempPath; these 2 functions may be used to do something malicious with wupdmgr.exe under the system32 path.
%s%s: a printf-style format string.

EnumProcessModules
psapi.dll
GetModuleBaseNameA
psapi.dll
EnumProcesses
psapi.dll

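wupdmgr.exe: probably not,

winup.exe: probably not,

wupdmgrd.exe: probably not, for the same reason as above.

It is very odd for this line to appear twice in the strings: !This program cannot be run in DOS mode. The second occurrence is very likely followed by an entirely new executable, so one file may be nested inside another.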
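
A minimal sketch of that check: count the occurrences of the DOS stub message and, for every occurrence after the first, search backwards for an MZ signature. The 0x100-byte look-back window and the helper names are assumptions; the demo buffer is fabricated.

```python
DOS_STUB = b"This program cannot be run in DOS mode"

def find_dos_stubs(data: bytes) -> list[int]:
    """Return every offset at which the DOS stub message occurs."""
    offsets, start = [], 0
    while (pos := data.find(DOS_STUB, start)) != -1:
        offsets.append(pos)
        start = pos + 1
    return offsets

def embedded_pe_offsets(data: bytes, window: int = 0x100) -> list[int]:
    """For each stub after the first, look back up to `window` bytes for 'MZ'."""
    results = []
    for stub in find_dos_stubs(data)[1:]:
        mz = data.rfind(b"MZ", max(0, stub - window), stub)
        if mz != -1:
            results.append(mz)
    return results

# Fabricated demo: one fake PE header area followed by filler and a second one.
outer = b"MZ" + b"\x00" * 0x4C + DOS_STUB + b"\x00" * 32
data = outer + b"\xCC" * 64 + outer
print(find_dos_stubs(data), embedded_pe_offsets(data))
```

Carving from a reported offset to the end of the file gives a candidate embedded executable to open in PEview.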

The usual pattern: create a file first, then write it somewhere.

Replacing Windows Update Manager (wupdmgr.exe) with another executable can be used for persistence.

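## Basic Dynamic Analysis

【Preparation】
0. Do static analysis first;

1. Start Process Monitor and set a filter on the malware's name. Stop Capture Events and clear all events first, then start the run;
2. Start Process Explorer;
3. Take the first registry snapshot with Regshot;
4. Use xxx to simulate a virtual network (not set up yet);
5. Set up Wireshark to record network activity (not set up yet);

【Run the sample】

【After the run】

1. Stop event capture in Process Monitor;
2. Take the second registry snapshot with Regshot;
3. Check for DNS requests (not set up yet);
4. Review the Process Monitor results. Not every operation is meaningful. Double-click an entry for details; if a new file was created that is identical to the sample, the malware may have copied itself to a new location. You can filter by the PID found in Process Explorer;
5. Compare the two Regshot snapshots. An entry added under HKLM\SOFTWARE\Microsoft\Windows\CurrentVersion\Run means the program runs automatically at system startup.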
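
The snapshot comparison in the last step amounts to a dictionary diff. The sketch below assumes each snapshot has already been loaded into a dict mapping "key\value name" to data; that layout, the helper names, and the sample entries are hypothetical and are not Regshot's own report format.

```python
RUN_KEY = r"HKLM\SOFTWARE\Microsoft\Windows\CurrentVersion\Run"

def diff_snapshots(before: dict, after: dict) -> dict:
    """Return entries that were added, removed, or changed between snapshots."""
    added = {k: v for k, v in after.items() if k not in before}
    removed = {k: v for k, v in before.items() if k not in after}
    changed = {k: (before[k], after[k])
               for k in before.keys() & after.keys() if before[k] != after[k]}
    return {"added": added, "removed": removed, "changed": changed}

def autorun_entries(diff: dict) -> dict:
    """Pick out new or modified values under the Run key."""
    prefix = RUN_KEY.lower() + "\\"
    hits = dict(diff["added"])
    hits.update({k: new for k, (_old, new) in diff["changed"].items()})
    return {k: v for k, v in hits.items() if k.lower().startswith(prefix)}

# Hypothetical snapshots taken before and after running the sample.
before = {RUN_KEY + "\\Legit": r"C:\Program Files\Legit\legit.exe"}
after = dict(before)
after[RUN_KEY + "\\updater"] = r"C:\Temp\winup.exe"
print(autorun_entries(diff_snapshots(before, after)))
```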

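1. Use Process Explorer to inspect the process and determine whether it creates a mutex or listens on a port for incoming connections. Click the xxx.exe process and choose View -> Lower Pane View -> Handles to see the mutexes (mutants) it created; choose View -> Lower Pane View -> DLLs to see the DLLs the malware loaded dynamically;
2. Review the network traffic (not set up yet)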

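## Advanced Static Analysis

Turn on IDA Pro's auto comments and opcode bytes display; both are very useful.

Some programs are too large and complex.
Never hunt for clues one instruction at a time; it wastes a lot of time.
Keep the big picture in mind and abstract the code into blocks; it is often enough to look only at the call instructions.

When analyzing a DLL, the list of exported functions is very useful.

When you see a suspicious string or function, use IDA Pro's cross-references to see where it is used.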

.text:004012BC 52
.text:004012BD 6A 01
.text:004012BF 8B 85 FC FE FF

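### Anti-disassembly

Turn on the opcode bytes display on the left side.

Targets of the form +1 or +2 are invalid addresses in the listing, because they fall in the middle of an instruction.

jz short near ptr unk_40126D
jnz short near ptr unk_40126D
【Jump instructions with the same target】
Two back-to-back conditional jumps go to the same location, so together they act as an unconditional jump.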
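
A hedged sketch of searching raw bytes for this pattern. It assumes the short forms of the jumps (0x74 for jz, 0x75 for jnz, each followed by a signed 8-bit displacement) and scans linearly without disassembling, so it can report false positives; the byte sequence and base address in the demo are made up.

```python
def s8(b: int) -> int:
    """Interpret a byte as a signed 8-bit displacement."""
    return b - 256 if b >= 128 else b

def same_target_jump_pairs(code: bytes, base: int = 0):
    """Find adjacent short jz/jnz pairs that share one target address."""
    hits = []
    for i in range(len(code) - 3):
        op1, rel1, op2, rel2 = code[i:i + 4]
        if {op1, op2} == {0x74, 0x75}:
            target1 = i + 2 + s8(rel1)  # relative to the end of the first jump
            target2 = i + 4 + s8(rel2)  # relative to the end of the second jump
            if target1 == target2:
                hits.append((base + i, base + target1))
    return hits

# jz +3 ; jnz +1 ; then a rogue E8 byte that the jumps skip over.
code = bytes([0x74, 0x03, 0x75, 0x01, 0xE8, 0x58, 0xC3])
for addr, target in same_target_jump_pairs(code, base=0x401000):
    print(f"{addr:#x} -> {target:#x}")
```

The reported target is where the bytes should be redefined as code; the byte just before it is usually the decoy.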

call near ptr CB4C550XX
The call target looks absurd; it is not an address execution could jump to.

jmp short near ptr loc_401215+1
The jmp target is right next to the jmp itself.

test esp, esp
jnz short near ptr loc_401010+1

xor eax, eax
jz short near ptr loc_401010+1
【Jump instructions with a constant condition】
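
Both examples set the flags to a known value right before the conditional jump: xor eax, eax always sets ZF, and test esp, esp never does, because ESP is not zero in a running program. A rough byte-pattern sketch follows; the encodings listed (33 C0 or 31 C0 for xor eax, eax; 85 E4 for test esp, esp) are standard, while the helper name and demo bytes are assumptions, and a linear scan like this can produce false positives.

```python
# (instruction bytes, short-jump opcode that is always taken afterwards)
CONSTANT_CONDITIONS = [
    (b"\x33\xC0", 0x74, "xor eax, eax / jz"),
    (b"\x31\xC0", 0x74, "xor eax, eax / jz"),
    (b"\x85\xE4", 0x75, "test esp, esp / jnz"),
]

def constant_condition_jumps(code: bytes, base: int = 0):
    """Find short conditional jumps whose outcome is fixed by the previous instruction."""
    hits = []
    for i in range(len(code) - 3):
        for setup, jcc, label in CONSTANT_CONDITIONS:
            if code[i:i + 2] == setup and code[i + 2] == jcc:
                rel = code[i + 3]
                rel = rel - 256 if rel >= 128 else rel  # signed rel8
                hits.append((base + i, base + i + 4 + rel, label))
    return hits

# xor eax, eax ; jz +1 ; rogue E9 byte ; pop eax ; ret
code = bytes([0x33, 0xC0, 0x74, 0x01, 0xE9, 0x58, 0xC3])
print(constant_condition_jumps(code, base=0x401000))
```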

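db 0E8h (E8 is the opcode for call; the bytes right after E8 are read as its target. The stray E8 is a trap meant to fool the disassembler)
db 0E9h (jmp)
db 0EBh (short jmp)

db 8Bh
db 45h
db 0Ch
db 0Fh

db 824648Bh, 0A164h, and so on

jz short near ptr loc_4012E6+2

Convert the db bytes into code (the C key in IDA Pro).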

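### Some common instructions

Three instructions may combine to do a single thing.

EAX usually holds a function's return value; if EAX is used right after a function call, the code is probably working with the return value.

EBP is usually used to reference local variables and passed-in arguments.
EBP is the base pointer and stays unchanged within a function.

ESP is the stack pointer; it holds the memory address of the top of the stack.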

add esp, 12



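ascii_stricmp is the implementation; its documentation is the same as for stricmp.

cmp [ebp+argc], 3
; argc is a parameter; [ebp+argc] is where it is stored on the stack, and its value is compared with 3.

push 2
push offset Str2 ;"-r"

mov eax, [ebp+argv]
; the start address of the array is loaded into EAX

mov ecx, [eax+4]
; adds an offset of 4 to the address held in eax,
; which gives the second element (argv[1]); adding 8 would give the third (argv[2]).

push ecx ; Str1
; strncmp(argv[1], "-r", 2)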



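dword_40CF60 in an instruction means the variable is stored
at memory address 0x40CF60.

To add two global variables, x is first copied into eax,
then y is added to eax, and eax is copied back to x.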

int x = 1;
int y = 2;
void main() {
x = x+y;
printf("Total = %d\n", x);
}

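mov eax, dword_40CF60
# global variable x is labeled dword_40CF60, at memory location 0x40CF60.
add eax, dword_40C000
# global variable y is labeled dword_40C000; this performs x+y.
mov dword_40CF60, eax
# assigns x+y to x
mov ecx, dword_40CF60
# loads it as the argument, ready to be pushed onto the stack
push ecx
push offset aTotalD ;"total = %d\n"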

call printf


void main(){
int x = 1;
int y = 2;
x = x+y;
printf("Total = %d\n", x);
}

# Unlabeled
mov dword ptr [ebp-4], 1
mov dword ptr [ebp-8], 2
mov eax, [ebp-4]
add eax, [ebp-8]
mov [ebp-4], eax
mov ecx, [ebp-4]
push ecx
push offset aTotalD ; "total = %d\n"
call printf

# Labeled
mov [ebp+var_4], 1
mov [ebp+var_8], 2
mov eax, [ebp+var_4]
add eax, [ebp+var_8]
mov [ebp+var_4], eax
mov ecx, [ebp+var_4]
push ecx
push offset aTotalD ; "total = %d\n"
call printf



int a = 0;
int b = 1;
a = a + 11;
a = a - b;
a--;
b++;
b = a % 3;

mov [ebp+var_4], 0
mov [ebp+var_8], 1
mov eax, [ebp+var_4]
add eax, 0Bh
# 11 in hexadecimal is 0B.

mov [ebp+var_4], eax

mov ecx, [ebp+var_4]
sub ecx, [ebp+var_8]
mov [ebp+var_4], ecx

mov edx, [ebp+var_4]
sub edx,1
mov [ebp+var_4], edx

mov eax, [ebp+var_8]
add eax, 1
mov [ebp+var_8], eax

mov eax, [ebp+var_4]
cdq
mov ecx, 3
idiv ecx
mov [ebp+var_8], edx
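
The last five instructions implement `b = a % 3`: cdq sign-extends EAX into EDX:EAX, and idiv leaves the quotient in EAX and the remainder in EDX. The sketch below models that behavior (the function names are mine) and replays the listing. Note that idiv truncates toward zero, so its remainder takes the sign of the dividend, unlike Python's `%`.

```python
def to_signed32(value: int) -> int:
    """Interpret the low 32 bits of value as a signed integer."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value & 0x80000000 else value

def cdq_idiv(eax: int, divisor: int) -> tuple[int, int]:
    """Model cdq + idiv: returns (quotient for EAX, remainder for EDX)."""
    dividend = to_signed32(eax)
    divisor = to_signed32(divisor)
    quotient = abs(dividend) // abs(divisor)
    if (dividend < 0) != (divisor < 0):
        quotient = -quotient  # truncate toward zero, as idiv does
    remainder = dividend - quotient * divisor
    return quotient, remainder

# Replay of the listing: var_4 is a, var_8 is b.
a, b = 0, 1
a = a + 0x0B             # add eax, 0Bh
a = a - b                # sub ecx, [ebp+var_8]
a = a - 1                # sub edx, 1
b = b + 1                # add eax, 1
_, b = cdq_idiv(a, 3)    # cdq / idiv ecx, remainder stored in var_8
print(a, b)
print(cdq_idiv(-7, 3), -7 % 3)
```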



if jumps: for jumps it is best to use the graph view, which is easier to follow:

int x = 1;
int y = 2;
if(x == y){
printf("x equals y.\n");
}else{
printf("x is not equal to y.\n");
}

mov [ebp+var_8], 1
mov [ebp+var_4], 2
mov eax, [ebp+var_8]
cmp eax, [ebp+var_4]
jnz short loc_40102B
# if x != y the jump is taken; otherwise it is not.
push offset aXEqualsY_ ; "x equals y.\n"
call printf
jmp short loc_401038
# unconditional jump
loc_40102B:
push offset aXIsNotEqualToY ; "x is not equal to y.\n"
call printf



int x = 0;
int y = 1;
int z = 2;
if(x == y){
if(z==0){
printf("z is zero and x = y.\n");
}else{
printf("z is non-zero and x = y.\n");
}
}else{
if(z==0){
printf("z zero and x != y.\n");
}else{
printf("z non-zero and x != y.\n"); }
}

# Details omitted



int i;
for(i=0; i<100; i++) {
printf("i equals %d\n", i);
}

mov [ebp+var_4], 0
# initialize i
jmp short loc_401016
loc_40100D:
mov eax, [ebp+var_4]
add eax, 1
mov [ebp+var_4], eax
# increment i by 1
loc_401016:
cmp [ebp+var_4], 64h
# compare i with 100; hexadecimal 64 = decimal 100
jge short loc_40102F
# leave the loop once the condition no longer holds

mov ecx, [ebp+var_4]
push ecx
push offset aID ; "i equals %d\n"
call printf
# print i
jmp short loc_40100D
# jump back and continue the loop



while loop:

int status=0;
int result = 0;
while(status == 0){
result = performAction();
status = checkResult(result);
}

mov [ebp+var_4], 0
mov [ebp+var_8], 0
loc_401044:
cmp [ebp+var_4], 0
jnz short loc_401063
call performAction
mov [ebp+var_8], eax
mov eax, [ebp+var_8]
push eax
call checkResult
mov [ebp+var_4], eax
jmp short loc_401044



nop

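call
Pushes the address of the instruction after the call onto the stack and transfers control to the function. When the function finishes, it executes ret, which pops the return address off the top of the stack into the instruction pointer, so execution resumes at the instruction right after the call.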
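
A toy model of that mechanism, assuming a 32-bit stack that grows downward and a 5-byte near call (E8 plus a 32-bit displacement). The class, the addresses, and the initial ESP value are hypothetical.

```python
class Cpu:
    """Minimal model of EIP, ESP, and the stack for call/ret."""

    def __init__(self, esp: int = 0x0012FF00):
        self.esp = esp
        self.eip = 0
        self.stack = {}  # address -> 32-bit value

    def push(self, value: int) -> None:
        self.esp -= 4
        self.stack[self.esp] = value & 0xFFFFFFFF

    def pop(self) -> int:
        value = self.stack.pop(self.esp)
        self.esp += 4
        return value

    def call(self, target: int, instr_len: int = 5) -> None:
        self.push(self.eip + instr_len)  # return address = next instruction
        self.eip = target

    def ret(self) -> None:
        self.eip = self.pop()

cpu = Cpu()
cpu.eip = 0x401000
cpu.call(0x401050)
print(hex(cpu.eip), hex(cpu.stack[cpu.esp]))  # inside the function
cpu.ret()
print(hex(cpu.eip), hex(cpu.esp))             # back after the call
```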

pop edi

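The PTR operator overrides the declared size of an operand. It is required whenever you access an operand with a size attribute different from the one the assembler assumes.

lodsb: reads one byte from the source address pointed to by esi into AL (you can then check which character it is, such as 0dh or 0ah, and act accordingly);

(1) lodsb, lodsw: load the data at DS:SI into AL or AX, then increment or decrement SI according to the DF flag.

(2) stosb, stosw: store the data in AL or AX at ES:DI, then increment or decrement DI according to the DF flag.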
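
These two instructions often appear as a pair in copy or decode loops. The sketch below models the byte forms with a flat `bytearray` as memory and a dict as the register file; both are simplifications, and the XOR key 0x5A and the encoded string are made up for the demo.

```python
def lodsb(mem: bytearray, regs: dict) -> None:
    """AL = [ESI]; then ESI moves by 1 in the direction given by DF."""
    regs["al"] = mem[regs["esi"]]
    regs["esi"] += -1 if regs["df"] else 1

def stosb(mem: bytearray, regs: dict) -> None:
    """[EDI] = AL; then EDI moves by 1 in the direction given by DF."""
    mem[regs["edi"]] = regs["al"]
    regs["edi"] += -1 if regs["df"] else 1

# Decode loop: lodsb / xor al, 5Ah / stosb, repeated once per byte.
encoded = bytes(c ^ 0x5A for c in b"exec")
mem = bytearray(encoded + bytes(4))
regs = {"al": 0, "esi": 0, "edi": 4, "df": 0}
for _ in range(4):
    lodsb(mem, regs)
    regs["al"] ^= 0x5A
    stosb(mem, regs)
print(bytes(mem[4:8]), regs["esi"], regs["edi"])
```

With DF set (after std), the same loop would walk both pointers downward instead.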

## Practical Malware Analysis (print book)

www.practicalmalwareanalysis.com
https://nostarch.com/malware.htm

https://en.wikipedia.org/wiki/Virtual_machine_escape

