当我们在讨论OpenSSF的时候我们在讨论什么？

oe1019

已于 2023-06-12 22:49:48 修改

文章标签：开源 github

于 2023-06-04 19:43:46 首次发布

本文链接：https://blog.csdn.net/oe1019/article/details/131036089

综述

本文旨在对近日在日常聊天和开源社区工作时候的一些OpenSSF相关话题进行反思。
本文的创作目的并不带有批评，攻击等任何意味。
使用中英双语写作，方便开源社区成员阅读。

做一个计划

随着最近项目on board了CNCF sandbox，越来越多的人带着需求介入了我们的项目。在此之前，我们的项目仅仅依靠main branch来维护我们跨repo共享的github action CI/CD。现在人多了，项目壮大了，需求也多了。考虑pipeline as code我们至少需要开始使用stable version和development version来保证任何对于pipeline的新功能不会破坏现有的pipeline对外（其他repo）提供的CI/CD服务。因此，作为主要的Devops工程师之一，我不得不真正在代码层面面对，实现，考虑OpenSSF。
以及一个切实的问题：

你有在pipeline中结合OpenSSF的短期，中期，长期规划么？

从短期来讲

虽然现阶段我并没有特意为OpenSSF或某一种OpenSSF标准做计划，但是通过日常pipeline as code辅助以版本管理，我们可以满足OpenSSF badge多个子项的要求。因此，如果您想参考本文来进行您自己的OpenSSF实践，请先根据OpenSSF badge梳理您目前有哪些部分已经满足OpenSSF的要求。
在完成OpenSSF badge的过程中遇到的问题与思考

目前OpenSSF的分类标准并不是按照项目的使用目的建立的：
如果我的项目不是给生产使用的，纯工具类项目，我是否要严肃的通过所有OpenSSF问题清单？甚至拿到金牌？
比如KIND

kind is a tool for running local Kubernetes clusters using Docker container “nodes”.
kind was primarily designed for testing Kubernetes itself, but may be used for local development or CI.

OpenSSF badge在生产和消费端之间的Gap
从开源项目，作为生产端来说，很多项目使用github action，OpenSSF badge的表格目前并不自动扫描github action，同时很多脚本比如lint，可能使用make lint，make lints 或者合并在其他脚本中，我们目前还是需要手工回答这些问题。
但是作为消费者？

您上一次查看其他项目OpenSSF badge的细节是什么时候？
这些内容是否能满足您对于OpenSSF的理解？

从中期，长期来讲

SBOM在生产和消费端之间的Gap
谁来消费SBOM？我如何适配Sigstore？
从技术层面上讲，通过适配SBOM相关的一些github action，我们可以方便的生成SBOM。现阶段，我可以把SBOM的结果放哪儿就不管了。但是长远来看，我们需要更加健全的生态。来减少SBOM在生产和消费端之间的Gap。
因此目前我对于SBOM/SLSA或SigStore相关的PR持开放态度。但是对于新兴项目，我们确实有比SBOM/SLSA或SigStore更优先的功能需要在pipeline中实现。
特别的，我最近实验了一下SLSA相关的工程，这些工程由于image源在墙外，对于中国工程师并不方便。

周期，没必要天天盯着所有的依赖

软件版本号可以视作对于软件周期的描述。
考虑一段简单的pipeline脚本。

- uses: actions/checkout@v3

这段脚本的目的是clone我们的代码到pipeline的执行环境。
因此，以下写法都是可以满足的功能要求的。

- uses: actions/checkout@v2
- uses: actions/checkout@v3
- uses: actions/checkout@v4
- uses: actions/checkout@specific commit id

我最近看到一些讨论，考虑到

git上可以re-tag的功能。
nodejs或者container image的情况，使用某一个哈希ID而不是tag来指定依赖版本。

在这些情况下，我们希望使用哈希ID而不是tag来指定某个依赖的版本。
在我目前看来，指定tag可能足够应对大多数情况了。
使用tag：

满足了我clone代码的需要。
这段代码并不是我项目的构建或测试依赖。

因此，如果使用哈希

每次actions/checkout有一个新的release我都要更新么？
假设我有5个github action的依赖，每天我都要检查一遍？
每次提交一个git commit，仅仅是因为在这种情况下CI/CD的依赖更新？

你会使用tag还是哈希？为什么？
如果不考虑OpenSSF的规范，你会使用tag还是哈希？为什么？

纵向技术栈

类似区块链系统，供应链是条链，那么链必然有其头。
虽然我们使用统一的定制化github action的影响范围有限，因此作为一般性的思考和讨论，这里参考log4j那张经典的图，虽然log4j2是积木的第三层，但是我们依旧可以继续向下挖掘。
对于我遇到的实际问题
main repo依赖于custom github action依赖于github checkout action
java虚拟机依赖于jvm编译器依赖于操作系统
因为从纵向技术栈的角度我们总能不断追溯。
开源社区，OpenSSF生态在开源安全的作用是什么？

是否可以通过某一SBOM，SLSA，Sigstore生态或者合规从而认为这个开源组件是安全的？

总结

至此，我们从笔者的亲身经历，从规划，维护成本/更新频率，技术栈等不同视角出发。思考并讨论了现阶段开源社区Devops适配OpenSSF标准所面临的一些问题。
个人希望OpenSSF和其他开源社区协作，应该尽快完善，弥合（如SBOM,Sigstore）生产和消费之间的Gap从而构建完整生态。

在生产端，尽快提出针对项目用途而细化的分类分级标准。
在消费端，尽快提出一些合规标准和标准实践。
从而避免过度忧虑，更有效且有针对性的使用哈希ID。

What are we talking about when we talk about OpenSSF?

Introduction

This article reflects on some OpenSSF-related topics that came up recently in day-to-day conversations and in my open source community work.
It is not intended as criticism or as an attack of any kind.
It is written in both Chinese and English so that community members can read it easily.

Make a plan

Since our project was onboarded into the CNCF sandbox, more and more people have come to us with their own requirements. Until then, we relied on the main branch alone to maintain the custom GitHub Actions that serve as the shared CI/CD pipeline across our repos. Now that there are more people, a larger scope, and more requirements, treating the pipeline as code means we need at least a stable version and a development version, so that a new pipeline feature does not break the CI/CD service the pipeline provides to other repos. As one of the main DevOps engineers, I therefore have to deal with OpenSSF at the implementation level.
That raises a practical question:

Do you have a short-, medium-, and long-term plan for integrating OpenSSF into your pipeline?
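
The stable/development split mentioned above could look like this from a consuming repo. This is a minimal sketch: the repository `example-org/shared-pipeline`, its `build` action directory, and the `v1` tag are hypothetical names chosen for illustration, not the ones our project uses.

```yaml
# .github/workflows/ci.yml in a consuming repo (all names below are hypothetical)
name: ci
on:
  pull_request:

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      # Stable line: only moves when the pipeline maintainers cut a release.
      - uses: example-org/shared-pipeline/build@v1

  build-next:
    # Optional job that tries the development version without blocking the PR.
    runs-on: ubuntu-latest
    continue-on-error: true
    steps:
      - uses: actions/checkout@v3
      - uses: example-org/shared-pipeline/build@main
```

Consumers stay on the stable reference by default, while the second job gives early warning when a change on the development branch would break them.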

In the short term

Although I have not planned specifically for OpenSSF or for any particular OpenSSF standard at this stage, day-to-day pipeline as code combined with version management already lets us meet several criteria of the OpenSSF badge. So if you want to use this article as a reference for your own OpenSSF journey, start by going through the OpenSSF badge criteria and working out which of them you already meet.
Problems and thoughts that came up while working through the OpenSSF badge

Currently, the OpenSSF criteria are not categorized by a project's intended use.

If my project is purely a tool and is not meant for production use, do I really need to work through the entire OpenSSF checklist? Or even reach the gold level?
Take kind as an example:

kind is a tool for running local Kubernetes clusters using Docker container “nodes”.
kind was primarily designed for testing Kubernetes itself, but may be used for local development or CI.

The gap between the production side and the consumption side of the OpenSSF badge

Consider the production side first, that is, the open source project itself. Many projects use GitHub Actions, but the OpenSSF badge questionnaire does not currently scan GitHub Actions workflows automatically. Scripts such as linting may be invoked as 'make lint' or 'make lints', or folded into other scripts, so for now we still have to answer these questions by hand.
But what about the consumer side?

When did you last look at the details of another project's OpenSSF badge?
Does what you find there match your understanding of OpenSSF?

In the medium to long term

The gap between SBOM production and consumption

Who will consume the SBOM? How do I adopt Sigstore?
From a technical perspective, we can easily generate an SBOM by adopting some of the SBOM-related GitHub Actions. At this stage, I can simply attach the SBOM output to the release page and leave it there. In the long run, though, we need a more complete ecosystem to narrow the gap between SBOM production and consumption.
So for now I am open to PRs related to SBOM/SLSA or Sigstore. For a young project, however, we do have features to implement in the pipeline that take priority over SBOM/SLSA or Sigstore.
In particular, I recently experimented with some SLSA-related projects. Because their images are hosted on sources that are hard to reach from mainland China, they are not convenient for engineers in China.
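
A minimal sketch of the "generate it and leave it there" stage, assuming the `anchore/sbom-action` GitHub Action and SPDX JSON as the output format. Both are illustrative choices rather than a statement of what our project uses, and the output file name is hypothetical. How the file gets attached to the release page is left out here.

```yaml
# .github/workflows/sbom.yml (illustrative only)
name: sbom
on:
  release:
    types: [published]

permissions:
  contents: write   # assumed to be needed if the SBOM is later attached to the release

jobs:
  sbom:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      # Scans the checked-out workspace and writes an SBOM file.
      - uses: anchore/sbom-action@v0
        with:
          format: spdx-json
          output-file: sbom.spdx.json   # hypothetical file name
```

Producing the file is the easy part. Nothing in this workflow answers the question of who downloads the SBOM afterwards or what they check it against.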

Release cycles: there is no need to watch every dependency every day

A software version number can be seen as a description of the software's release cycle.
Consider a simple pipeline snippet.

- uses: actions/checkout@v3

The purpose of this step is to clone our code into the pipeline's execution environment.
Therefore, any of the following lines meets the functional requirement.

- uses: actions/checkout@v2
- uses: actions/checkout@v3
- uses: actions/checkout@v4
- uses: actions/checkout@<specific commit id>

I have seen some discussions recently that take the following into account:

Git allows a tag to be moved to a different commit (re-tagging).
For Node.js packages or container images, a hash ID rather than a tag is used to specify the dependency version.

In these cases, we want to use a hash ID instead of a tag to specify the version of a dependency.
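
For reference, the two pinning styles look like this side by side. The 40-character SHA below is a made-up placeholder, not a real actions/checkout commit. The trailing comment that records the human-readable version is a common convention, not something GitHub enforces.

```yaml
steps:
  # Pinned by tag: readable, but the tag can later be moved to a different commit.
  - uses: actions/checkout@v3

  # Pinned by full commit SHA: the reference cannot be moved.
  # Placeholder SHA for illustration only; look up the real commit of the release you want.
  - uses: actions/checkout@0123456789abcdef0123456789abcdef01234567 # v3
```

In a real workflow only one of the two lines would be used. They are shown together here to make the trade-off between readability and immutability visible.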
As I see it for now, specifying a tag is probably enough for most situations in pipeline code.
Using a tag:

It satisfies my need to clone the code.
The action is not a build or test dependency of my project.

So, if a hash is used:

Do I need to update it every time actions/checkout publishes a new release?
If I depend on 5 GitHub Actions, do I have to check them every day?
Do I create a git commit each time solely because a CI/CD dependency was updated?

Do you use tags or hashes? Why?
If you don’t consider the OpenSSF specification, would you use tags or hashes? Why?
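
One way to lower the maintenance cost behind these questions is to let a bot propose the updates. This is a minimal sketch, assuming the repository is hosted on GitHub and Dependabot version updates are enabled. The weekly interval and the group name `actions` are arbitrary choices for illustration.

```yaml
# .github/dependabot.yml
version: 2
updates:
  - package-ecosystem: "github-actions"
    directory: "/"          # Dependabot looks for workflows under .github/workflows
    schedule:
      interval: "weekly"    # check once a week instead of every day
    groups:
      actions:              # hypothetical group name: batch all action bumps into one PR
        patterns:
          - "*"
```

With a setup like this, updates should arrive as pull requests on a fixed cadence, so nobody has to check five actions by hand every day. Each bump still lands as a commit, so the third question above remains a matter of taste.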

Vertical technology stack

Like a blockchain, a supply chain is a chain, and a chain must have a head.
The impact of our shared custom GitHub Action is limited in scope, so as a more general line of thought, consider the classic log4j diagram: although log4j2 sits on the third layer of the building blocks, we can still keep digging further down.
For the practical problem I ran into:
The main repo depends on the custom GitHub Action, which depends on the GitHub checkout action.
The Java virtual machine depends on the JVM compiler, which depends on the operating system.
From the perspective of the vertical technology stack, we can always keep tracing back.
What role do the open source community and the OpenSSF ecosystem play in open source security?

Can an open source component be considered secure just because it has an SBOM, follows SLSA, is part of the Sigstore ecosystem, or meets some compliance requirement?

Summary

So far, drawing on my own experience, we have looked at the problem from several angles: planning, maintenance cost and update frequency, and the technology stack. Along the way we have considered and discussed some of the problems that DevOps engineers in open source communities currently face when adopting OpenSSF standards.
Personally, I hope that OpenSSF and other open source communities will work together to close the gap between production and consumption (for example, for SBOM and Sigstore) as soon as possible, and so build a complete ecosystem.

On the production side, propose classification and grading standards refined by project purpose as soon as possible.
On the consumption side, propose compliance standards and standard practices as soon as possible.
That way we can avoid excessive worry and use hash IDs more effectively and in a more targeted way.