YARA for beginners: a how-to on Writing YARA rules, translated from https://yara.readthedocs.io/en/stable/writingrules.html


4pri1


Article link: https://blog.csdn.net/qq_20533167/article/details/116428579


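I have been using YARA lately but could not find any reasonably complete material on it, so I turned to the English documentation. My advisor pointed me to this page; it looked good, so I translated it along the way. Let's learn together. The original was written in Evernote, where it reads more comfortably than here.
My ability is limited: some passages are translated imperfectly or miss the point. I kept the original text, so please read it alongside my notes, and feedback on anything wrong or inaccurate is welcome.
Where the original felt complicated or messy, I paraphrased.

The translation took about eight or nine hours.
It is finally all done, although some parts are machine translated.
The corresponding PDF has been uploaded to my resources; grab it if you need it.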

Translated from https://yara.readthedocs.io/en/stable/writingrules.html
Evernote link:
https://app.yinxiang.com/shard/s28/nl/23053629/f8d5cebc-6c8b-420c-b906-bafe046fa712?title=Writing%20YARA%20rules%20%E2%80%94%20yara%204.1.0%20documentation

Writing YARA rules

YARA rules are easy to write and understand, and they have a syntax that resembles the C language. Here is the simplest rule that you can write for YARA, which does absolutely nothing:

In short: YARA rules are easy to write and look a lot like C. The rule below is the simplest one possible (it really does nothing :)).

rule dummy
{
    condition:
        false
}

Each rule in YARA starts with the keyword rule followed by a rule identifier. Identifiers must follow the same lexical conventions of the C programming language, they can contain any alphanumeric character and the underscore character, but the first character cannot be a digit. Rule identifiers are case sensitive and cannot exceed 128 characters. The following keywords are reserved and cannot be used as an identifier:

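Identifiers

In short: every rule starts with the keyword rule, followed by the rule's identifier. Identifiers follow C's lexical conventions: letters, digits and underscores, with no digit in first position. They are case sensitive and limited to 128 characters, and reserved keywords cannot be used as identifiers.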
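
To make the lexical rules concrete, here is a small Python sketch that checks a candidate rule name against them. The function name is my own invention, the check has nothing to do with how YARA itself parses rules, and it deliberately skips the reserved-keyword check.

```python
import re

# Letters, digits and underscores; the first character is not a digit;
# at most 128 characters in total (1 leading character + up to 127 more).
IDENTIFIER_RE = re.compile(r"[A-Za-z_][A-Za-z0-9_]{0,127}")


def is_valid_rule_identifier(name: str) -> bool:
    """Check only the lexical rules; reserved keywords are NOT checked here."""
    return IDENTIFIER_RE.fullmatch(name) is not None


if __name__ == "__main__":
    for candidate in ("ExampleRule", "_rule_2", "2fast", "bad-name"):
        print(candidate, is_valid_rule_identifier(candidate))
```

Because identifiers are case sensitive, ExampleRule and examplerule would be two different rules.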

Rules are generally composed of two sections: strings definition and condition. The strings definition section can be omitted if the rule doesn’t rely on any string, but the condition section is always required. The strings definition section is where the strings that will be part of the rule are defined. Each string has an identifier consisting of a $ character followed by a sequence of alphanumeric characters and underscores, these identifiers can be used in the condition section to refer to the corresponding string. Strings can be defined in text or hexadecimal form, as shown in the following example:

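In short: a rule normally has a strings section and a condition section. The strings section can be left out if the rule does not rely on any string, but the condition is always required. Each string gets an identifier made of a $ followed by letters, digits and underscores, and the condition refers to strings through these identifiers. Strings can be written as text or in hexadecimal form, as in this example: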

rule ExampleRule
{
    strings:
        $my_text_string = "text here"
        $my_hex_string = { E2 34 A1 C8 23 FB }
          // one text string and one hex string; they are two different patterns

    condition:
        $my_text_string or $my_hex_string
}

Text strings are enclosed in double quotes just like in the C language. Hex strings are enclosed by curly brackets, and they are composed of a sequence of hexadecimal numbers that can appear contiguously or separated by spaces. Decimal numbers are not allowed in hex strings.

In short: text goes in double quotes as in C; hex strings go in curly brackets and contain only hexadecimal digits, written contiguously or separated by spaces.

The condition section is where the logic of the rule resides. This section must contain a boolean expression telling under which circumstances a file or process satisfies the rule or not. Generally, the condition will refer to previously defined strings by using their identifiers. In this context the string identifier acts as a boolean variable which evaluate to true if the string was found in the file or process memory, or false if otherwise.

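In short: the condition section holds the rule's logic. It must contain a boolean expression stating when a file or process satisfies the rule (in other words, when the rule fires). Conditions usually refer to previously defined strings by identifier; for example, $condition1 and $condition2 requires both of those strings to be found.

A string identifier acts as a boolean variable: it evaluates to true if the string was found in the file or process memory, and to false otherwise.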
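
To see what "a string identifier acts as a boolean" means, here is a rough Python emulation of ExampleRule from above. It is only a sketch of the semantics under my own naming (example_rule_matches is not a YARA function), and it treats the scanned file simply as a bytes object.

```python
def example_rule_matches(data: bytes) -> bool:
    """Emulate: condition: $my_text_string or $my_hex_string"""
    # Each "string identifier" becomes a boolean: was the pattern found?
    my_text_string = b"text here" in data
    my_hex_string = bytes.fromhex("E2 34 A1 C8 23 FB") in data
    # The condition is an ordinary boolean expression over those flags.
    return my_text_string or my_hex_string


if __name__ == "__main__":
    print(example_rule_matches(b"some text here, nothing else"))
    print(example_rule_matches(b"no match in this buffer"))
```

Swapping the final or for and would give the stricter "both strings must be present" condition.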

Comments

You can add comments to your YARA rules just as if it was a C source file, both single-line and multi-line C-style comments are supported.

In short: you can write comments, and both // and /* */ work.

/*
    This is a multi-line comment ...
*/

rule CommentExample   // ... and this is single-line comment
{
    condition:
        false  // just a dummy rule, don't do this
}

Strings

There are three types of strings in YARA: hexadecimal strings, text strings and regular expressions. Hexadecimal strings are used for defining raw sequences of bytes, while text strings and regular expressions are useful for defining portions of legible text. However, text strings and regular expressions can also be used for representing raw bytes by means of escape sequences, as will be shown below.

In short: there are three kinds of strings. Hexadecimal strings describe raw byte sequences; text strings and regular expressions are mostly for readable text (my understanding: the kind of thing the strings command would print), although both can also express raw bytes through escape sequences.

Hexadecimal strings
Hexadecimal strings allow three special constructions that make them more flexible: wild-cards, jumps, and alternatives.

In short: hex strings support wild-cards, jumps, and alternatives (a choice between several byte sequences).

Wild-cards are just placeholders that you can put into the string indicating that some bytes are unknown and they should match anything. The placeholder character is the question mark (?). Here you have an example of a hexadecimal string with wild-cards:

In short: a wild-card is a placeholder for an unknown value, written as ?. See the example:

rule WildcardExample
{
    strings:
        $hex_string = { E2 34 ?? C8 A? FB }

    condition:
        $hex_string
}

As shown in the example the wild-cards are nibble-wise, which means that you can define just one nibble of the byte and leave the other unknown.

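In short: wild-cards work on nibbles (half bytes), so you can fix one hex digit of a byte and leave the other unknown.

In $hex_string = { E2 34 ?? C8 A? FB }, ?? matches any byte, while A? matches any byte whose high nibble is A (A0 through AF).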
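
To check my understanding of nibble-wise wild-cards, I sketched a tiny translator from a hex pattern to a Python bytes regex. This is only an emulation with Python's re module (the helper names are mine), not how YARA actually matches, and it handles plain bytes and ? wild-cards only, not jumps or alternatives.

```python
import re


def hex_token_to_regex(token: str) -> bytes:
    """Translate one two-character token ('E2', '??', 'A?', '?F') to a bytes regex."""
    hi, lo = token[0], token[1]
    if hi == "?" and lo == "?":
        return b"."  # any byte (the pattern is compiled with re.DOTALL)
    if hi == "?":
        # Low nibble fixed, high nibble free: 16 candidate bytes.
        candidates = [(h << 4) | int(lo, 16) for h in range(16)]
    elif lo == "?":
        # High nibble fixed, low nibble free: 16 candidate bytes.
        candidates = [(int(hi, 16) << 4) | l for l in range(16)]
    else:
        candidates = [int(token, 16)]
    return b"(?:" + b"|".join(re.escape(bytes([c])) for c in candidates) + b")"


def compile_hex_pattern(pattern: str) -> "re.Pattern[bytes]":
    """Compile a space-separated hex pattern such as 'E2 34 ?? C8 A? FB'."""
    body = b"".join(hex_token_to_regex(tok) for tok in pattern.split())
    return re.compile(body, re.DOTALL)


if __name__ == "__main__":
    pattern = compile_hex_pattern("E2 34 ?? C8 A? FB")
    print(pattern.search(bytes.fromhex("E2 34 77 C8 A5 FB")) is not None)
    print(pattern.search(bytes.fromhex("E2 34 77 C8 B5 FB")) is not None)
```

The second sample fails because B5 does not have A as its high nibble.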

Wild-cards are useful when defining strings whose content can vary but you know the length of the variable chunks, however, this is not always the case. In some circumstances you may need to define strings with chunks of variable content and length. In those situations you can use jumps instead of wild-cards:

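In short: wild-cards are very handy when the content varies but you know how long the varying chunk is.

When both the content and the length of a chunk can vary, use a jump instead of wild-cards.

So: jumps :)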

rule JumpExample
{
    strings:
        // [4-6] is a jump: any 4 to 6 bytes may appear at this position
        $hex_string = { F4 23 [4-6] 62 B4 }

    condition:
        $hex_string
}

In the example above we have a pair of numbers enclosed in square brackets and separated by a hyphen, that’s a jump. This jump is indicating that any arbitrary sequence from 4 to 6 bytes can occupy the position of the jump. Any of the following strings will match the pattern:

In short: the 4 to 6 bytes at the jump can be anything. For example:

F4 23 01 02 03 04 62 B4             (4-byte jump)
F4 23 00 00 00 00 00 62 B4          (5-byte jump)
F4 23 15 82 A3 04 45 22 62 B4       (6-byte jump)
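
A jump behaves much like a bounded repetition of "any byte" in a regular expression. As a sanity check, here is the JumpExample pattern rewritten as a Python bytes regex; this is just an analogy using Python's re module (names are mine), not YARA's own matching engine.

```python
import re

# { F4 23 [4-6] 62 B4 }: the jump [4-6] becomes .{4,6} ("4 to 6 arbitrary bytes").
# re.DOTALL makes '.' match every byte value, including 0x0A.
JUMP_EXAMPLE = re.compile(rb"\xF4\x23.{4,6}\x62\xB4", re.DOTALL)


def jump_example_matches(data: bytes) -> bool:
    """Return True if the emulated JumpExample pattern occurs anywhere in data."""
    return JUMP_EXAMPLE.search(data) is not None


if __name__ == "__main__":
    samples = (
        "F4 23 01 02 03 04 62 B4",        # 4-byte jump
        "F4 23 00 00 00 00 00 62 B4",     # 5-byte jump
        "F4 23 15 82 A3 04 45 22 62 B4",  # 6-byte jump
        "F4 23 01 02 03 62 B4",           # only 3 bytes in between
    )
    for sample in samples:
        print(sample, jump_example_matches(bytes.fromhex(sample)))
```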

Any jump [X-Y] must meet the condition 0 <= X <= Y. In previous versions of YARA both X and Y must be lower than 256, but starting with YARA 2.0 there is no limit for X and Y.

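In short: a jump [X-Y] skips between X and Y bytes, both ends included. The skipped bytes can be anything, and since X may be 0 the jump may also match no bytes at all.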

These are valid jumps:

These jumps are fine:

FE 39 45 [0-8] 89 00
FE 39 45 [23-45] 89 00
FE 39 45 [1000-2000] 89 00

This is invalid:

This one is not allowed, because the lower bound is greater than the upper bound:

FE 39 45 [10-7] 89 00

If the lower and higher bounds are equal you can write a single number enclosed in brackets, like this:

In short: when the lower and upper bounds are the same, a single number is enough:

FE 39 45 [6] 89 00

The above string is equivalent to both of these:

In other words, these three forms are all equivalent:

  FE 39 45 [6] 89 00
  FE 39 45 [6-6] 89 00
  FE 39 45 ?? ?? ?? ?? ?? ?? 89 00

Starting with YARA 2.0 you can also use unbounded jumps:

In short: from YARA 2.0 on, a jump can also have no upper bound:

FE 39 45 [10-] 89 00    (10 bytes or more)
FE 39 45 [-] 89 00      (0 bytes or more)
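
The jump forms described so far ([X-Y], [X], [X-] and [-]) can be summed up in a small Python parser. This is my own sketch of the rules as stated in the text, not YARA's parser; parse_jump is a made-up name, and a form like [-Y], which the text does not describe, is simply rejected here.

```python
import re
from typing import Optional, Tuple

JUMP_RE = re.compile(r"\[(\d*)(-?)(\d*)\]")


def parse_jump(text: str) -> Tuple[int, Optional[int]]:
    """Return (lower, upper); upper is None for an unbounded jump.

    Raises ValueError for anything that breaks 0 <= X <= Y.
    """
    match = JUMP_RE.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"not a jump: {text!r}")
    low_s, dash, high_s = match.groups()
    if not dash:
        # Single-number form [N], equivalent to [N-N].
        if not low_s:
            raise ValueError("empty jump")
        return int(low_s), int(low_s)
    if high_s and not low_s:
        # Assumption: [-Y] is not described in the text, so reject it.
        raise ValueError("a jump with only an upper bound is not handled here")
    low = int(low_s) if low_s else 0          # [-] starts at 0
    high = int(high_s) if high_s else None    # [X-] and [-] are unbounded
    if high is not None and low > high:
        raise ValueError("lower bound exceeds upper bound")
    return low, high


if __name__ == "__main__":
    for jump in ("[0-8]", "[6]", "[10-]", "[-]"):
        print(jump, parse_jump(jump))
```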

There are also situations in which you may want to provide different alternatives for a given fragment of your hex string. In those situations you can use a syntax which resembles a regular expression:

In short: sometimes a fragment of the hex string can take one of several forms. For that there is a syntax that looks like a regular expression:

rule AlternativesExample1
{
    strings:
        $hex_string = { F4 23 ( 62 B4 | 56 ) 45 }

    condition:
        $hex_string
}

This rule will match any file containing F4 23 62 B4 45 or F4 23 56 45.

So the rule covers both of these sequences:

F4 23 62 B4 45
F4 23 56 45

But more than two alternatives can be also expressed. In fact, there are no limits to the amount of alternative sequences you can provide, and neither to their lengths.

In short: you can list more than two alternatives; there is no limit on how many there are or on how long each one is.

rule AlternativesExample2
{
    strings:
        $hex_string = { F4 23 ( 62 B4 | 56 | 45 ?? 67 ) 45 }

    condition:
        $hex_string
}

As can be seen also in the above example, strings containing wild-cards are allowed as part of alternative sequences.

In short: an alternative may itself contain wild-cards, as the third option (45 ?? 67) shows.
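
Since the syntax is said to resemble a regular expression, here is AlternativesExample2 written as a Python bytes regex to show the correspondence. Again this is only an analogy built with Python's re module (the function name is mine), not a description of YARA internals.

```python
import re

# { F4 23 ( 62 B4 | 56 | 45 ?? 67 ) 45 }
# The parenthesised alternatives map onto a non-capturing group,
# and the ?? wild-card maps onto '.' under re.DOTALL.
ALTERNATIVES_EXAMPLE = re.compile(
    rb"\xF4\x23(?:\x62\xB4|\x56|\x45.\x67)\x45", re.DOTALL
)


def alternatives_example_matches(data: bytes) -> bool:
    """Return True if any of the alternative sequences occurs in data."""
    return ALTERNATIVES_EXAMPLE.search(data) is not None


if __name__ == "__main__":
    for sample in ("F4 23 62 B4 45", "F4 23 56 45", "F4 23 45 00 67 45", "F4 23 57 45"):
        print(sample, alternatives_example_matches(bytes.fromhex(sample)))
```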

Text strings

As shown in previous sections, text strings are generally defined like this:

A plain text string looks like this:

rule TextExample
{
    strings:
        $text_string = "foobar"

    condition:
        $text_string
}

This is the simplest case: an ASCII-encoded, case-sensitive string. However, text strings can be accompanied by some useful modifiers that alter the way in which the string will be interpreted. Those modifiers are appended at the end of the string definition separated by spaces, as will be discussed below.

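In short: this is the simplest case, an ASCII-encoded, case-sensitive string. Modifiers can change how the string is interpreted; they are written after the string definition, separated by spaces, and are covered below.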

Text strings can also contain the following subset of the escape sequences available in the C language:

In short: text strings accept a subset of C's escape sequences.

\"      Double quote
\\      Backslash
\r      Carriage return
\t      Horizontal tab
\n      New line
\xdd    Any byte in hexadecimal notation

In all versions of YARA before 4.1.0 text strings accepted any kind of unicode characters, regardless of their encoding. Those characters were interpreted by YARA as raw bytes, and therefore the final string was actually determined by the encoding format used by your text editor. This was never meant to be a feature; the original intention always was that YARA strings should be ASCII-only, and YARA 4.1.0 started to raise warnings about non-ASCII characters in strings. This limitation does not apply to strings in the metadata section or comments.

In short: before YARA 4.1.0, text strings accepted any unicode characters and YARA treated them as raw bytes, so the string actually searched for depended on the encoding your editor saved the file in. That was never intended as a feature: strings are supposed to be ASCII-only, and from 4.1.0 on YARA warns about non-ASCII characters in strings. The restriction does not apply to metadata strings or comments.

Case-insensitive strings
Text strings in YARA are case-sensitive by default, however you can turn your string into case-insensitive mode by appending the modifier nocase at the end of the string definition, in the same line:

In short: text strings are case-sensitive by default; appending the nocase modifier at the end of the string definition, on the same line, makes the string case-insensitive.

rule CaseInsensitiveTextExample
{
    strings:
        $text_string = "foobar" nocase  // nocase switches to case-insensitive matching

    condition:
        $text_string
}

With the nocase modifier the string foobar will match Foobar, FOOBAR, and fOoBaR. This modifier can be used in conjunction with any modifier, except base64 and base64wide.

In short: with nocase the match ignores letter case, so Foobar, FOOBAR and fOoBaR all match "foobar".
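
The effect of nocase can be imitated in Python by lower-casing both sides before comparing. This is only a rough sketch of the idea for ASCII text (the function name is mine); it says nothing about how YARA implements the modifier.

```python
def nocase_contains(needle: bytes, data: bytes) -> bool:
    """Case-insensitive substring search; bytes.lower() only touches ASCII letters."""
    return needle.lower() in data.lower()


if __name__ == "__main__":
    for sample in (b"Foobar", b"FOOBAR", b"fOoBaR", b"fo0bar"):
        print(sample, nocase_contains(b"foobar", sample))
```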

Wide-character strings

The wide modifier can be used to search for strings encoded with two bytes per character, something typical in many executable binaries.

In short: wide searches for strings stored with two bytes per character, which is common in executables.

For example, if the string “Borland” appears encoded as two bytes per character (i.e. B\x00o\x00r\x00l\x00a\x00n\x00d\x00), then the following rule will match:

For example, Borland stored with two bytes per character looks like this, each letter followed by a zero byte:

B\x00o\x00r\x00l\x00a\x00n\x00d\x00

rule WideCharTextExample1
{
    strings:
        $wide_string = "Borland" wide   // wide matches the two-bytes-per-character form

    condition:
        $wide_string
}

However, keep in mind that this modifier just interleaves the ASCII codes of the characters in the string with zeroes; it does not support true UTF-16 strings containing non-English characters. If you want to search for strings in both ASCII and wide form, you can use the ascii modifier in conjunction with wide, no matter the order in which they appear.

In short: wide is not real UTF-16 support, and writing wide ascii (in either order) searches for both forms.

rule WideCharTextExample2
{
    strings:
        $wide_and_ascii_string = "Borland" wide ascii

    condition:
        $wide_and_ascii_string
}
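
The "interleave with zeroes" description is easy to reproduce in Python. The sketch below builds the wide form by hand and compares it with Python's UTF-16 little-endian encoding, which gives the same bytes for pure-ASCII text. The helper names are mine, and matches_wide_ascii only imitates what the text says about wide ascii.

```python
def to_wide(text: str) -> bytes:
    """Interleave each ASCII byte with a zero byte, as described for 'wide'."""
    ascii_bytes = text.encode("ascii")  # raises UnicodeEncodeError for non-ASCII input
    return b"".join(bytes([b, 0]) for b in ascii_bytes)


def matches_wide_ascii(text: str, data: bytes) -> bool:
    """Imitate 'wide ascii': the plain form or the interleaved form may be present."""
    return text.encode("ascii") in data or to_wide(text) in data


if __name__ == "__main__":
    wide = to_wide("Borland")
    print(wide)
    # For ASCII-only text the interleaved form equals UTF-16 little-endian.
    print(wide == "Borland".encode("utf-16-le"))
    print(matches_wide_ascii("Borland", b"\x00\x00" + wide))
```

The ASCII-only encode step mirrors the limitation mentioned above: this scheme has nothing to interleave for characters outside ASCII.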

The ascii modifier can appear alone, without an accompanying wide modifier, but it’s not necessary to write it because in absence of wide the string is assumed to be ASCII by default.

In short: ascii may appear on its own, but there is no need to write it, because without wide a string is treated as ASCII by default.

XOR strings

The xor modifier can be used to search for strings with a single byte XOR applied to them.

The following rule will search for every single byte XOR applied to the string “This program cannot” (including the plaintext string):

In short: xor makes YARA look for the string XORed with a single-byte key.

With no range given, every key is tried, and the plaintext string itself (key 0x00) is included:

rule XorExample1
{
    strings:
        $xor_string = "This program cannot" xor

    condition:
        $xor_string
}

The above rule is logically equivalent to:

That is, it behaves like this expanded rule:

rule XorExample2
{
    strings:
        $xor_string_00 = "This program cannot"
        $xor_string_01 = "Uihr!qsnfs`l!b`oonu"
        $xor_string_02 = "Vjkq\"rpmepco\"acllmv"
        // Repeat for every single byte XOR
    condition:
        any of them
}
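
The strings in the expanded rule can be reproduced with a few lines of Python, which is a handy way to convince yourself what "single byte XOR" means here. This is only an illustration (xor_bytes is my own helper), not YARA code.

```python
def xor_bytes(data: bytes, key: int) -> bytes:
    """XOR every byte of data with the same single-byte key."""
    return bytes(b ^ key for b in data)


PLAINTEXT = b"This program cannot"

if __name__ == "__main__":
    # Keys 0x00, 0x01 and 0x02 give the three strings listed in XorExample2.
    for key in (0x00, 0x01, 0x02):
        print(hex(key), xor_bytes(PLAINTEXT, key))
```

XOR with the same key twice gives back the original bytes, so the same helper also decodes a candidate match.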

You can also combine the xor modifier with wide and ascii modifiers. For example, to search for the wide and ascii versions of a string after every single byte XOR has been applied you would use:

In short: xor can be combined with wide and ascii. To search for both the wide and the ASCII form of a string under every single-byte XOR, write:

rule XorExample3
{
    strings:
        $xor_string = "This program cannot" xor wide ascii
    condition:
        $xor_string
}

The xor modifier is applied after every other modifier. This means that using the xor and wide together results in the XOR applying to the interleaved zero bytes. For example, the following two rules are logically equivalent:

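In short: xor is applied after all the other modifiers, so with xor and wide together the interleaved zero bytes get XORed as well. The two rules below are logically equivalent: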

rule XorExample4
{
    strings:
        $xor_string = "This program cannot" xor wide
    condition:
        $xor_string
}

rule XorExample4
{
    strings:
        $xor_string_00 = "T\x00h\x00i\x00s\x00 \x00p\x00r\x00o\x00g\x00r\x00a\x00m\x00 \x00c\x00a\x00n\x00n\x00o\x00t\x00"
        $xor_string_01 = "U\x01i\x01h\x01r\x01!\x01q\x01s\x01n\x01f\x01s\x01`\x01l\x01!\x01b\x01`\x01o\x01o\x01n\x01u\x01"
        $xor_string_02 = "V\x02j\x02k\x02q\x02\"\x02r\x02p\x02m\x02e\x02p\x02c\x02o\x02\"\x02a\x02c\x02l\x02l\x02m\x02v\x02"
        // Repeat for every single byte XOR operation.
    condition:
        any of them
}
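
The ordering (wide first, then xor) can be checked with a short Python sketch. Both helpers are my own illustrations of what the text describes; the point is that the zero bytes added by wide turn into the key byte after the XOR.

```python
def to_wide(text: str) -> bytes:
    """Interleave each ASCII byte with a zero byte ('wide')."""
    return b"".join(bytes([b, 0]) for b in text.encode("ascii"))


def xor_bytes(data: bytes, key: int) -> bytes:
    """XOR every byte of data with the same single-byte key ('xor')."""
    return bytes(b ^ key for b in data)


def wide_then_xor(text: str, key: int) -> bytes:
    """Apply wide first and xor last, in the order the text describes."""
    return xor_bytes(to_wide(text), key)


if __name__ == "__main__":
    # With key 0x01 every interleaved 0x00 becomes 0x01: b'U\x01i\x01h\x01r\x01'
    print(wide_then_xor("This", 0x01))
```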

Since YARA 3.11, if you want more control over the range of bytes used with the xor modifier use:

In short: since YARA 3.11 you can restrict the range of XOR keys:

rule XorExample5
{
    strings:
        $xor_string = "This program cannot" xor(0x01-0xff)
    condition:
        $xor_string
}

The above example will apply the bytes from 0x01 to 0xff, inclusively, to the string when searching. The general syntax is xor(minimum-maximum).

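In short: the example tries every key from 0x01 to 0xff inclusive, so the plaintext form (key 0x00) is not searched. The general syntax is xor(minimum-maximum).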
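
To picture what a key range does, here is a brute-force Python sketch that tries each key in an inclusive range and reports which ones produce a hit. It only imitates the behaviour described above; find_xor_keys and its parameters are names I made up, and a real scanner would not work this naively.

```python
from typing import List


def find_xor_keys(plaintext: bytes, data: bytes,
                  minimum: int = 0x00, maximum: int = 0xFF) -> List[int]:
    """Return every key in [minimum, maximum] whose XORed plaintext occurs in data."""
    hits = []
    for key in range(minimum, maximum + 1):  # the range is inclusive at both ends
        encoded = bytes(b ^ key for b in plaintext)
        if encoded in data:
            hits.append(key)
    return hits


if __name__ == "__main__":
    plain = b"This program cannot"
    blob = b"junk" + bytes(b ^ 0x41 for b in plain) + b"junk"
    print(find_xor_keys(plain, blob, 0x01, 0xFF))          # key 0x41 is found
    print(find_xor_keys(plain, b"xx" + plain, 0x01, 0xFF))  # plaintext only: no hit
```

With the range starting at 0x01, a buffer that contains only the plaintext produces no hit, which is exactly the difference from a bare xor.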

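Not finished yet...
I will not keep updating this post here; the rest lives in Evernote, because the formatting here is too painful :(
If you need the remaining sections, please head over to:
Evernote https://app.yinxiang.com/shard/s28/nl/23053629/e79f33b2-5ce5-4375-84a1-24a51f67b27f?title=Writing%20YARA%20rules%20%E2%80%94%20yara%204.1.0%20documentation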
