An Empirical Study of Real-world Polymorphic Code Injection Attacks_potentially polymorphic call. the code may be inop-CSDN博客

本文链接：https://blog.csdn.net/iiprogram/article/details/4792476

本文分析了超过20个月期间针对真实互联网主机检测到的120多万个多态攻击。研究集中在攻击活动、目标网络服务、多态壳代码结构及其实际有效载荷的不同操作上。尽管大多数攻击使用简单的加密或多态性，但我们也观察到了一些使用更复杂混淆策略的攻击。

摘要生成于 C知道，由 DeepSeek-R1 满血版支持，前往体验 >

An Empirical Study of Real-world Polymorphic Code Injection Attacks

Michalis Polychronakis

FORTH-ICS, Greece, email: mikepo@ics.forth.gr

Kostas G. Anagnostakis

I²R, Singapore, email: kostas@i2r.a-star.edu.sg

Evangelos P. Markatos

FORTH-ICS, Greece, email: markatos@ics.forth.gr

Abstract:

Remote code injection attacks against network services remain one of the most effective and widely used exploitation methods for malware propagation. In this paper, we present a study of more than 1.2 million polymorphic code injection attacks targeting production systems, captured using network-level emulation. We focus on the analysis of the structure and operation of the attack code, as well as the overall attack activity in relation to the targeted services. The observed attacks employ a highly diverse set of exploits, often against less widely used vulnerable services, while our results indicate limited use of sophisticated obfuscation schemes and extensive code reuse among different malware families.

1 Introduction

Despite considerable advances in host-level security hardening and network-level defenses, remote code injection attacks against network services persist as one of the most common methods for system compromise. Along with the more recently popularized client-side attacks that exploit vulnerabilities in users' software such as browsers and media players [15], remote code execution vulnerabilities continue to plague even the latest versions of popular OSes and server applications [2] and are effectively being exploited by malware, resulting in millions of infected hosts [3].

Motivated by the illicit financial gain against their victims, cyber-criminals constantly try to improve the effectiveness and evasiveness of their attacks, with the aim to compromise as many systems as possible and keep them under control for as long as possible. Code obfuscation and polymorphism [20] are among the most widely used evasion techniques employed by attackers to circumvent virus scanners and intrusion detection systems.

When polymorphism is applied to remote code injection attacks, the initial attack code is mutated so that every attack instance acquires a unique pattern, thereby making fingerprinting of the whole breed a challenge. The injected code--often dubbed shellcode--is the first piece of code that is executed after the instruction pointer of the vulnerable process has been hijacked, and carries out the first stage of the attack, which usually involves the download and execution of a malware binary on the compromised host. Polymorphic shellcode engines [10,7,21,16,4,1] create different mutations of the same initial shellcode by encrypting it with a different random key, and prepending to it a decryption routine that makes the code self-decrypting. Since the decryption code itself cannot be encrypted, advanced polymorphic encoders also mutate the exposed part of the shellcode using metamorphism [20].

Although the design and implementation of polymorphic shellcode has been covered extensively in the literature [8,18,7,16,6,13,14], and several research works have focused on the detection of polymorphic attacks [11,23,13,14], the actual prevalence and characteristics of real-world polymorphic attacks have not been studied to the same extent [12]. In this work, we present an analysis of more than 1.2 million polymorphic attacks against real Internet hosts--not honeypots--detected over the course of more than 20 months. The attacks were captured by monitoring the traffic of thousands of production systems in research and education networks using network-level emulation [13,14]. Nemu, our prototype implementation, uses a CPU emulator to dynamically analyze every potential instruction sequence in the inspected traffic and identify the execution behavior of self-decrypting shellcode.

Our study focuses on the attack activity in relation to the targeted network services, the structure of the polymorphic shellcode used, and the different operations performed by its actual payload. Besides common exploits against popular OS services associated with well known vulnerabilities, we witnessed sporadic attacks against a large number of less widely used services and third-party applications. At the same time, although the bulk of the attacks use naive encryption or polymorphism, and extensive sharing of code components is prevalent among different shellcode types, we observed a few attacks employing more sophisticated obfuscation schemes.

2 Network-level Emulation

We briefly describe the design and operation of nemu, the detector used for capturing the attacks. The interested reader is referred to our previous work [13,14] for a more thorough description and implementation details.

The principle behind network-level emulation is that the machine code interpretation of arbitrary data results to random code, which, when it is attempted to run on an actual CPU, usually crashes soon, e.g., due to an illegal instruction. In contrast, if a network request actually contains polymorphic shellcode, then the shellcode runs normally, exhibiting a certain detectable behavior.

**Figure 1:** A typical execution of a polymorphic shellcode using network-level emulation.
$/includegraphics[width=/columnwidth]{figs/payload_reads.eps}$

Nemu inspects the client-initiated data of each network flow, which may contain malicious requests towards vulnerable services. Each input is mapped to a random memory location in the virtual address space of an IA-32 emulator, as shown in Fig. 1. The execution of self-decrypting shellcode is identified by two key runtime behavioral characteristics: the execution of some form of GetPC code, and the occurrence of several self references, i.e., read operations from the memory addresses of the input stream itself, as illustrated in Fig 1. The GetPC code is used by the shellcode for finding the absolute address of the injected code, which is mandatory for subsequently decrypting the encrypted payload, and involves the execution of an instruction from the call or fstenv instruction groups [13].

We should note that for all captured attacks, nemu was able to successfully decrypt the original shellcode, while so far has resulted to zero false positives.