Diving into YarGen

Original article: https://medium.com/bugbountywriteup/diving-into-yargen-9e8c00e18b65

Last time, we talked about how to detect malware using YARA, and how to find published YARA rules online.

But if you can’t find published YARA rules that suit your needs, you’ll need to write your own rules instead!

Intro to YarGen

YarGen is a tool that generates YARA rules from malware files. It identifies the strings found in a malware file and filters out known strings that also appear in non-malicious files. To do this, YarGen ships with a large database of strings and opcodes that are known to appear in non-malicious files.
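
The idea of extracting strings and filtering out known goodware strings can be sketched in a few lines of Python. This is only an illustration, not YarGen's actual implementation: the function names, the tiny GOODWARE_STRINGS set, and the sample bytes are all hypothetical, and the real tool uses a far larger database and a scoring step.

```python
import re

# Hypothetical stand-in for a goodware database: strings known to
# appear in non-malicious files.
GOODWARE_STRINGS = {b"kernel32.dll", b"GetProcAddress"}


def extract_strings(data: bytes, min_len: int = 8) -> set:
    """Return runs of printable ASCII that are at least min_len bytes long."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return set(pattern.findall(data))


def candidate_strings(data: bytes, goodware: set) -> set:
    """Keep only the extracted strings that are absent from the goodware set."""
    return extract_strings(data) - goodware


# Made-up sample content, used only to exercise the functions.
sample = b"\x00\x01kernel32.dll\x00GetProcAddress\x00evil-c2.example.com\x00"
print(sorted(candidate_strings(sample, GOODWARE_STRINGS)))
```

Only the string that is missing from the goodware set survives, which is the kind of string a generated rule would be built from.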

You can find YarGen on GitHub: https://github.com/Neo23x0/yarGen

Installing YarGen

First, download the latest version of YarGen from the release section of its GitHub page and extract the archive. The source code is available as a zip file or a tarball.

Next, make sure you have all the dependencies installed. You can run these commands:

sudo pip install pefile
sudo pip install scandir lxml naiveBayesClassifier

Finally, cd into the YarGen directory and run the following command to download the built-in databases. The databases are saved into the ./dbs subdirectory.

python yarGen.py --update

Running YarGen

YarGen has many options for rule generation. To see the command line parameters, you can run:

python yarGen.py --help

To use the included database for rule generation, you can simply run the command:

python yarGen.py -m PATH_TO_MALWARE_DIRECTORY

This command scans the malware files under PATH_TO_MALWARE_DIRECTORY and creates rules for them. The generated rules are written to a file named yargen_rules.yar in the current directory.
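
If you want to drive this step from a script, a minimal Python wrapper might look like the sketch below. The helper names build_yargen_command and run_yargen are hypothetical, and the sketch assumes that yarGen.py sits in yargen_dir and that it writes yargen_rules.yar into its working directory, as described above.

```python
import subprocess
import sys
from pathlib import Path


def build_yargen_command(malware_dir, extra_flags=()):
    """Build the argument list for: python yarGen.py -m <malware_dir>."""
    # Resolve to an absolute path so the command still works when it is
    # run from inside the yarGen directory.
    sample_path = str(Path(malware_dir).resolve())
    return [sys.executable, "yarGen.py", "-m", sample_path, *extra_flags]


def run_yargen(malware_dir, yargen_dir=".", extra_flags=()):
    """Run yarGen from yargen_dir and return the expected rule file path."""
    cmd = build_yargen_command(malware_dir, extra_flags)
    # check=True raises CalledProcessError if yarGen exits with an error.
    subprocess.run(cmd, cwd=yargen_dir, check=True)
    # Rules are written to the current directory, which is yargen_dir here.
    return Path(yargen_dir) / "yargen_rules.yar"
```

Calling run_yargen("samples", yargen_dir="yarGen") would then return the path where the generated rules are expected to be found; both directory names here are placeholders.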

Simple vs Super rules

A YarGen rule can be either a simple rule or a super rule.

If multiple sample files are used, YarGen will try to identify the similarities between the samples and combine the identified strings into a “super rule”.

Super rules can be identified by a line in the meta section of the rule:

super_rule = 1
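
To make the idea of "similarities between the samples" concrete, here is a rough Python sketch that flags pairs of samples sharing enough strings to justify a combined rule. It is an assumption-laden illustration, not YarGen's algorithm: the function name, the pairwise comparison, and the input format are hypothetical, and the default threshold of 5 simply mirrors the superrule-overlap default listed in YarGen's help output.

```python
from itertools import combinations


def super_rule_candidates(strings_by_file, min_overlap=5):
    """Return (file_a, file_b, shared_strings) for pairs with enough overlap.

    strings_by_file maps a sample name to the set of strings found in it.
    """
    candidates = []
    # Sort by file name so the output order is deterministic.
    items = sorted(strings_by_file.items(), key=lambda item: item[0])
    for (name_a, strs_a), (name_b, strs_b) in combinations(items, 2):
        shared = strs_a & strs_b
        if len(shared) >= min_overlap:
            candidates.append((name_a, name_b, shared))
    return candidates


# Made-up string sets for three hypothetical samples.
files = {
    "a.exe": {"s1", "s2", "s3"},
    "b.exe": {"s2", "s3", "s4"},
    "c.exe": {"s9"},
}
print(super_rule_candidates(files, min_overlap=2))
```

With a threshold of 2, only a.exe and b.exe qualify, since they share two strings; with the default of 5, none of these toy samples would.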

Combining multiple rules into a single super rule does not remove the simple rules generated for each file, so the simple rules and the super rule will share some of their strings. To skip the simple rules that are covered by a super rule, use the --nosimple flag in your YarGen command:

python yarGen.py -m PATH_TO_MALWARE_DIRECTORY --nosimple

You can also suppress super rule creation by using the --nosuper flag:

python yarGen.py -m PATH_TO_MALWARE_DIRECTORY --nosuper

Rule creation flags

In addition to --nosimple and --nosuper, there are plenty of other flags that you can use to customize the behavior of YarGen! In particular, let’s look at the flags that influence how YarGen approaches rule creation and output.

Here are all of them from the YarGen help page:

Rule Creation:
  -m M                  Path to scan for malware
  -y min-size           Minimum string length to consider (default=8)
  -z min-score          Minimum score to consider (default=0)
  -x high-scoring       Score required to set string as 'highly specific
                        string' (default: 30)
  -w superrule-overlap  Minimum number of strings that overlap to create a
                        super rule (default: 5)
  -s max-size           Maximum length to consider (default=128)
  -rc maxstrings        Maximum number of strings per rule (default=20,
                        intelligent filtering will be applied)
  --excludegood         Force the exclude all goodware strings

Rule Output:
  -o output_rule_file   Output rule file
  -e output_dir_strings
                        Output directory for string exports
  -a author             Author Name
  -r ref                Reference (can be string or text file)
  -l lic                License
  -p prefix             Prefix for the rule description
  -b identifier         Text file from which the identifier is read (default:
                        last folder name in the full path, e.g. "myRAT" if -m
                        points to /mnt/mal/myRAT)
  --score               Show the string scores as comments in the rules
  --strings             Show the string scores as comments in the rules
  --nosimple            Skip simple rule creation for files included in super
                        rules
  --nomagic             Don't include the magic header condition statement
  --nofilesize          Don't include the filesize condition statement
  -fm FM                Multiplier for the maximum 'filesize' condition value
                        (default: 3)
  --globalrule          Create global rules (improved rule set speed)
  --nosuper             Don't try to create super rules that match against
                        various files

Specifically, let’s talk about --excludegood, --score, -rc, and -z.

YarGen gives each string a “score” based on its ability to indicate a malware file. The higher the score of a string, the higher the probability that files that contain it are malware files.

By default, YarGen does not completely remove goodware strings from rules; it includes them with a very low score. The --excludegood flag forces YarGen to exclude all of the goodware strings found in the YarGen database.

By default, YarGen does not include these scores in the resulting rule file. To see how each string is scored, use the --score flag to output the scores as comments in the rule file.

The -rc (maxstrings) flag specifies the maximum number of strings to include in each rule. The default is 20, which means that each rule will include up to 20 of the highest-scoring strings.

The -z (min-score) flag, in contrast, sets the minimum score that a string needs in order to be included in a rule.
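
The combined effect of -z and -rc can be pictured as a filter followed by a top-N selection. The Python sketch below is a simplification under stated assumptions: select_strings is a hypothetical helper, the scores are made up, and it treats the minimum score as inclusive, which may not match YarGen exactly (the help text also mentions additional "intelligent filtering").

```python
def select_strings(scored, min_score=0, max_strings=20):
    """Pick the highest-scoring strings that meet the minimum score.

    scored maps each candidate string to its score.
    """
    # Drop strings below the minimum score (assumed inclusive here).
    kept = [(s, score) for s, score in scored.items() if score >= min_score]
    # Highest scores first, then cap the list at max_strings entries.
    kept.sort(key=lambda item: item[1], reverse=True)
    return kept[:max_strings]


# Made-up scores for four hypothetical strings.
scored = {"a": 40, "b": 5, "c": -10, "d": 25}
print(select_strings(scored, min_score=0, max_strings=2))
```

Here the string with a negative score is dropped by the minimum-score filter, and the cap of 2 keeps only the two highest-scoring strings that remain.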

Decoding the Output: yargen_rules.yar

Now that we’ve generated a few YARA rules using YarGen, let’s dive into the rules and learn how to read them!

Each YARA rule generated via YarGen is composed of three sections: meta, strings, and condition.

[Image: an example YARA rule generated by YarGen, from the YarGen documentation at https://github.com/Neo23x0/yarGen]

Meta Section

The “meta” section of a rule contains the description, author, reference, date, and hash of the rule. You can specify the author of the rule via the -a flag:

python yarGen.py -m PATH_TO_MALWARE_DIRECTORY -a "Vickie Li"

And you can specify the reference file or webpage of a rule via the -r flag:

python yarGen.py -m PATH_TO_MALWARE_DIRECTORY -r "https://github.com/Neo23x0/yarGen"

Strings Section

The “strings” section of a rule specifies the strings that are used to identify that particular strain of malware. YarGen categorizes these strings based on how likely they are to be indicators of malware. There are three categories of strings, marked by $s, $x, and $z.

Strings that start with $x (“Highly Specific Strings”) are very specific strings that will not appear in legitimate software. These strings can include malicious server addresses, the names of hacking tools and malware, hacking tool outputs, and typos in common strings. For example, malware files sometimes contain misspelled words like “Micorsoft” or “Monnitor” when they try to masquerade as legitimate software.

Strings that start with $s (“Specific Strings”) are likely to be indicators of malware files, but might also appear in legitimate files.

Lastly, strings that start with $z are likely to be ordinary strings that are simply not yet included in the goodware string database.
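
When reviewing a generated rule file, it can help to see how many strings fall into each category. The Python sketch below tallies string identifiers by prefix; it assumes identifiers of the form $s1, $x2, $z3 on their own lines, and the rule text is a made-up example rather than real YarGen output.

```python
import re
from collections import Counter

# Matches string definitions such as:   $s1 = "..."
STRING_DEF = re.compile(r"^\s*\$([sxz])\d*\s*=", re.MULTILINE)


def tally_prefixes(rule_text):
    """Count string definitions per category prefix (s, x, or z)."""
    return Counter(STRING_DEF.findall(rule_text))


# Hypothetical rule text, used only to exercise the function.
rule_text = """
rule Example_Sample {
   strings:
      $x1 = "example string one" fullword ascii
      $s1 = "example string two" fullword ascii
      $s2 = "example string three" fullword ascii
      $z1 = "example string four" fullword ascii
   condition:
      all of them
}
"""
print(tally_prefixes(rule_text))
```

A rule that consists almost entirely of $z strings is probably worth a closer manual look before you rely on it.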

Condition section

The condition of a YARA rule is a boolean expression that determines whether a file matches the rule.

YarGen uses a combination of a magic header, file size, and strings for the condition section. For example, the conditions in the rule above specify that a file also needs to satisfy the following conditions to be classified as a “backdoor”:

It has the magic header 0x5a4d,
it is smaller than 3785 KB,
and all the strings specified in the “strings” section are present.
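
The three checks listed above can be mimicked in plain Python to see what such a condition actually tests. This is a rough emulation rather than how YARA evaluates rules: matches_condition is a hypothetical helper, it ignores string modifiers such as fullword or wide, and it assumes YARA's convention that 1 KB is 1024 bytes.

```python
import struct


def matches_condition(data: bytes, required: list, max_kb: int = 3785) -> bool:
    """Check magic header, file size, and presence of all required strings."""
    # A little-endian 16-bit read of the bytes "MZ" at offset 0 gives 0x5A4D.
    has_magic = len(data) >= 2 and struct.unpack_from("<H", data, 0)[0] == 0x5A4D
    # The file must be smaller than max_kb kilobytes.
    small_enough = len(data) < max_kb * 1024
    # Every required byte string must occur somewhere in the file.
    all_present = all(s in data for s in required)
    return has_magic and small_enough and all_present


# Made-up file contents, used only to exercise the function.
sample = b"MZ" + b"\x00" * 16 + b"example-marker"
print(matches_condition(sample, [b"example-marker"]))
```

The magic header check is what restricts the rule to Windows executables, since those files begin with the bytes "MZ".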

To learn about the other types of conditions that can appear in YARA rules, please read the YARA documentation.

Good Luck!

You can also write YARA rules manually, but in doing that you risk writing rules that are either too specific or not specific enough. YarGen is a fast way of generating YARA rules that are both flexible and comprehensive.

Thanks for reading. Is there anything I missed? Feel free to let me know on Twitter: https://twitter.com/vickieli7.
