Writing Bazel rules: data and runfiles

Bazel has a neat feature that can simplify a lot of work with tests and executables: the ability to make data files available at run-time using data attributes. You may have seen these in rules like this:

cc_library(
    name = "server_lib",
    srcs = ["server.cc"],
    data = ["private.key"],
)

When a file is listed in a data attribute (or something that behaves like a data attribute), Bazel makes that file available at run-time to executables started with bazel run. This is useful for all kinds of things such as plugins, configuration files, certificates and keys, and resources.

In this article, we’ll add data attributes to the go_library and go_binary rules in rules_go_simple, the set of rules we’ve been working on. We’ll be working on the v3 branch. This won’t take long: we only need to add a few lines of code for each rule.

Data and runfiles

We can start by adding a data attribute to our rules. Here’s the new declaration for go_library. The attribute in go_binary is similar.

go_library = rule(
  implementation = _go_library_impl,
  attrs = {
      "srcs": attr.label_list(
          allow_files = [".go"],
          doc = "Source files to compile",
      ),
      "deps": attr.label_list(
          providers = [GoLibraryInfo],
          doc = "Direct dependencies of the library",
      ),
      "data": attr.label_list(
          allow_files = True,
          doc = "Data files available to binaries using this library",
      ),
      "importpath": attr.string(
          mandatory = True,
          doc = "Name by which the library may be imported",
      ),
      "_stdlib": attr.label(
          default = "//internal:stdlib",
          providers = [GoStdLibInfo],
          doc = "Hidden dependency on the Go standard library",
      ),
  },
  doc = "Compiles a Go archive from Go sources and dependencies",
)
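Since the go_binary declaration is only described as similar, here is a sketch of the one entry that would presumably be added to its attrs dictionary. The doc string is an assumption, not the wording used in rules_go_simple.

```starlark
# Hypothetical fragment of go_binary's attrs dictionary.
"data": attr.label_list(
    # Accept any kind of file, not just .go sources.
    allow_files = True,
    doc = "Data files available to this binary at run-time",
),
```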

Bazel tracks files that should be made available at run-time using runfiles objects. You can create new runfiles objects with ctx.runfiles. In order to actually make files available, you need to put one of these in the runfiles field in the DefaultInfo provider returned by your rule. Recall that DefaultInfo is used to list the output files and executables produced by a rule.

Here’s how we create the DefaultInfo provider for go_library. Again, go_binary is similar.

return [
    DefaultInfo(
        files = depset([archive]),
        runfiles = ctx.runfiles(collect_data = True),
    ),
    ...
]

The expression ctx.runfiles(collect_data = True) gathers the files listed in the data attribute and the runfiles returned by rules in the deps and srcs attributes. That means any library can have data files, and they will be available to tests and binaries run with bazel run that link that library. There are a few different ways to call ctx.runfiles. If you set collect_data = True, as we did above, Bazel will collect data runfiles from dependencies in the srcs, deps, and data attributes. If you set collect_default = True, Bazel will collect default runfiles from the same dependencies. I have no idea what the distinction is between data and default runfiles, but when you construct DefaultInfo, you can set the data_runfiles or default_runfiles fields explicitly. If you just set runfiles, your files will be treated as both data and default.
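For go_binary, the return value might look like the sketch below. The variable name executable is an assumption: it stands in for the linked output file declared earlier in the rule implementation. The executable field of DefaultInfo tells Bazel which file to start for bazel run.

```starlark
# Hypothetical tail of the go_binary implementation function.
# `executable` is assumed to be the File produced by the link action.
return [
    DefaultInfo(
        files = depset([executable]),
        # Same call as in go_library: gathers this rule's data files plus
        # data runfiles from the srcs, deps, and data dependencies.
        runfiles = ctx.runfiles(collect_data = True),
        # The file that `bazel run` will execute.
        executable = executable,
    ),
]
```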

What if you want to build the list of files explicitly? This is useful if you want to collect files from non-standard attributes, or if you create files within your rule. ctx.runfiles accepts a files argument, which is a simple list of files. You can access runfiles from your dependencies with an expression like dep[DefaultInfo].data_runfiles, where dep is a Target. You can combine runfiles objects using runfiles.merge, which returns a new runfiles object. So we could have implemented go_library like this:

# Gather runfiles.
runfiles = ctx.runfiles(files = ctx.files.data)
for dep in ctx.attr.deps:
    runfiles = runfiles.merge(dep[DefaultInfo].data_runfiles)

# Return the output file and metadata about the library.
return [
    DefaultInfo(
        files = depset([archive]),
        runfiles = runfiles,
    ),
    ...
]
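Recent Bazel releases also provide runfiles.merge_all, which merges a whole list of runfiles objects in one call instead of building a chain of intermediate objects in a loop. The sketch below is an alternative spelling of the gathering step, not the code used in rules_go_simple. It reads default_runfiles rather than data_runfiles; as noted above, a dependency that only sets the runfiles field exposes the same files through both.

```starlark
# Sketch only: assumes it runs inside the go_library implementation
# function, where ctx has the deps and data attributes declared earlier.
runfiles = ctx.runfiles(files = ctx.files.data)
runfiles = runfiles.merge_all([
    # One runfiles object per direct dependency.
    dep[DefaultInfo].default_runfiles
    for dep in ctx.attr.deps
])
```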

NOTE: When you have an attribute that is a label or label_list, you can access a list of all the files from all the labels using ctx.files (for example: ctx.files.data). This is almost always more convenient than going through ctx.attr (which gives you a Target or a list of Targets), since each target may have multiple files. If your label has allow_single_file = True set, you can also access the file through ctx.file. And if executable = True, you can access it through ctx.executable.
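To contrast the accessors mentioned in the note, here is a small rule written only for illustration. The rule name show_files and the attribute names config and tool are hypothetical and do not appear in rules_go_simple.

```starlark
# Hypothetical rule used only to compare ctx.attr, ctx.files, ctx.file,
# and ctx.executable. It prints at analysis time and creates no outputs.
def _show_files_impl(ctx):
    # ctx.attr.data is a list of Targets; each may provide several files.
    for target in ctx.attr.data:
        print(target.label, target[DefaultInfo].files.to_list())

    # ctx.files.data is the flattened list of Files from all those targets.
    print(ctx.files.data)

    # Available because "config" sets allow_single_file = True.
    print(ctx.file.config.path)

    # Available because "tool" sets executable = True.
    print(ctx.executable.tool.path)
    return []

show_files = rule(
    implementation = _show_files_impl,
    attrs = {
        "data": attr.label_list(allow_files = True),
        "config": attr.label(allow_single_file = True, mandatory = True),
        "tool": attr.label(
            executable = True,
            # Executable attributes must declare a configuration.
            cfg = "exec",
            mandatory = True,
        ),
    },
)
```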

Testing data and runfiles

We test our new support for runfiles with a simple binary that depends on a library. Both binary and library have data files, and the test verifies they are present.

sh_test(
    name = "data_test",
    srcs = ["data_test.sh"],
    args = ["$(location :list_data_bin)"],
    data = [":list_data_bin"],
)

go_binary(
    name = "list_data_bin",
    srcs = ["list_data_bin.go"],
    deps = [":list_data_lib"],
    data = ["foo.txt"],
)

go_library(
    name = "list_data_lib",
    srcs = ["list_data_lib.go"],
    data = ["bar.txt"],
    importpath = "rules_go_simple/tests/list_data_lib",
)

You can run this test with the command bazel test //tests/...

Accessing runfiles, cross-platform

You should use a library to find and open runfiles, especially in tests. When Bazel executes a binary on Unix platforms, it creates a tree of symbolic links to the binary’s runfiles. If your code only ever runs on Unix platforms, you can open a runfile by opening its relative path within the workspace.

This is not generally safe because Bazel handles runfiles differently on Windows. In versions of Windows before about 2019, you had to be an administrator to create symbolic links. Even now in consumer versions of Windows, you need to enable “Developer Mode” to create symbolic links, which requires administrator access. Creating a symbolic link on Windows is also surprisingly slow. To avoid these problems, Bazel uses another strategy: it creates a manifest file that maps logical runfile paths to absolute paths for the real files in Bazel’s cache. The manifest is pointed to by the RUNFILES_MANIFEST_FILE environment variable, which is set for tests. Nothing points to the manifest file for binaries run with bazel run, but you should find a file named MANIFEST in the initial working directory of the binary. (Incidentally, you can override this and force symbolic links with the Bazel flag --enable_runfiles).

It is best to use a library if one is available for your language, rather than parsing the manifest file on your own. Bazel’s runfile semantics change over time, as they are changing now with bzlmod, and using a library will keep your code working. Most languages provide such a library:

C++: @bazel_tools//tools/cpp/runfiles
Bash: @bazel_tools//tools/bash/runfiles
Java: @bazel_tools//tools/java/runfiles
Python: @rules_python//python/runfiles
Go: @io_bazel_rules_go//go/runfiles
Rust: @rules_rust//tools/runfiles
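Each of these libraries is an ordinary target that you add as a dependency. As a sketch, the sh_test from the previous section could pull in the Bash library from the list above as shown below. Whether the actual data_test in rules_go_simple does this is an assumption; the script itself would then source the library and look up files with its rlocation function instead of relying on relative paths.

```starlark
sh_test(
    name = "data_test",
    srcs = ["data_test.sh"],
    args = ["$(location :list_data_bin)"],
    data = [":list_data_bin"],
    # Makes the Bash runfiles library available to data_test.sh.
    deps = ["@bazel_tools//tools/bash/runfiles"],
)
```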