Original article: https://www.freecodecamp.org/news/the-design-of-webassembly-81f1dcabaddd/

by Patrick Ferris

The Design of WebAssembly

I love the web. It is a modern-day superpower for the dissemination of information and empowerment of the individual. Of course, it has its downsides like trolling (largely possible through anonymity) and privacy issues, not to mention the problems of ownership and copyright infringement about to come into effect with the highly divisive article 13. But, let’s forget about that for just a moment and marvel at the technological innovation of the internet and the browsers which support it.

I first learnt to code in JavaScript and have since been ridiculed by many for liking it. Yes, I know there are weird bits like this gem: [] == ![] // true. But it has become one of the most ubiquitous languages on the planet thanks to the internet, browsers and the engines that run the code (Google’s V8 and Mozilla’s SpiderMonkey, to name two).

As I got more into web development, I noticed a new name on the block: WebAssembly. As a computer science student and a developer, I believe one of the best ways to learn something is to try and understand why the engineers who built it made those design choices. So here is a brief look at some of the interesting design principles in WebAssembly and also why I think everyone should be excited.

Why do we need WebAssembly?

Okay, so first of all, to all my JavaScript fans out there: no, you shouldn’t be worried. When JavaScript first came about, it was designed to be used in a lightweight way, but it has since gone on to do a lot of heavy lifting. Originally it might have been used for manipulating some DOM elements or for client-side form validation, not for everything that is being attempted on the web now, and certainly not for running fully-fledged games.

Why is JavaScript not as fast as it could be? One of the main reasons is that it is an interpreted language: the engine reads the source and executes it as it goes. Just-in-Time (JIT) compilers have improved efficiency massively, but there is only so much room left to improve. Even then, JavaScript’s dynamic typing puts another ceiling on performance.

Alex Danilo discussed the improvements WebAssembly could make in his Google I/O talk in 2017. What really brought home the inefficiencies was his example add(a, b) function and the complexity that JavaScript engines have to go through in order to make sense of it.
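
As a rough illustration of that complexity (a sketch, not code taken from the talk), the same one-line add function has to handle very different operations depending on what it is given, so the engine cannot know ahead of time which machine instructions it will need:

```javascript
// One function, several meanings: the engine must inspect the operand
// types at run time before it knows what "+" should do.
function add(a, b) {
  return a + b;
}

const results = [
  add(1, 2),    // integer addition          -> 3
  add(1.5, 2),  // floating-point addition   -> 3.5
  add("1", 2),  // string concatenation      -> "12"
  add([], {}),  // both converted to strings -> "[object Object]"
];

console.log(results);
```

A statically typed source language removes this guesswork, because the types of a and b are fixed before the code ever reaches the browser.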

WebAssembly opens the door to compilation, which opens another door to optimisation. Its ability to take C/C++ as a source language means type checking can be done statically, ahead of time, which helps improve speed. This is what the developers at Mozilla realised and wanted to fix. To summarise this great video: JavaScript was designed for humans, and browsers were left to try and make it fast; WebAssembly was designed as a target language for compilers that browsers could already run quickly.

The realisation that we could have two kinds of code running in the engines was an exciting prospect, and the four major browsers (Chrome, Safari, Firefox and Edge) all began plans to let their engines run both JavaScript and WebAssembly. Again, let me reiterate: WebAssembly is not replacing JavaScript.

Why compile code?

Compiling code really means taking it from one (source) language and translating it into another (target) language. This is an incredibly simplified understanding of compilation. Most modern day compilation pipelines involve many more stages that allow us to really fine-tune and optimise our code making it faster and more energy-efficient.

The first steps usually include lexical, syntactic and semantic analysers to get the code into some kind of intermediate language that is well suited to optimisation. Then we optimise in a machine-independent way, generate the target code, and then perhaps optimise further for the specific hardware or environment.
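
The stages above can be sketched in miniature. The toy compiler below is purely illustrative (all function names are made up, semantic analysis is omitted, and real toolchains such as LLVM are vastly more sophisticated). It handles only +, *, parentheses, integers and variable names, and emits WebAssembly-style stack instructions:

```javascript
// Lexical analysis: split the source text into tokens.
function tokenize(source) {
  return source.match(/\d+|[A-Za-z_]\w*|[+*()]/g) || [];
}

// Syntactic analysis: build a tree, giving * higher precedence than +.
function parse(tokens) {
  let pos = 0;
  function primary() {
    const tok = tokens[pos++];
    if (tok === "(") {
      const inner = sum();
      pos++; // skip the closing ")"
      return inner;
    }
    return /^\d/.test(tok)
      ? { kind: "const", value: Number(tok) }
      : { kind: "var", name: tok };
  }
  function product() {
    let left = primary();
    while (tokens[pos] === "*") {
      pos++;
      left = { kind: "*", left, right: primary() };
    }
    return left;
  }
  function sum() {
    let left = product();
    while (tokens[pos] === "+") {
      pos++;
      left = { kind: "+", left, right: product() };
    }
    return left;
  }
  return sum();
}

// Machine-independent optimisation: fold operations on two constants.
function fold(node) {
  if (node.kind === "const" || node.kind === "var") return node;
  const left = fold(node.left);
  const right = fold(node.right);
  if (left.kind === "const" && right.kind === "const") {
    const value =
      node.kind === "+" ? left.value + right.value : left.value * right.value;
    return { kind: "const", value };
  }
  return { kind: node.kind, left, right };
}

// Code generation: emit stack-machine instructions in post-order.
function emit(node, out = []) {
  if (node.kind === "const") out.push(`i32.const ${node.value}`);
  else if (node.kind === "var") out.push(`get_local $${node.name}`);
  else {
    emit(node.left, out);
    emit(node.right, out);
    out.push(node.kind === "+" ? "i32.add" : "i32.mul");
  }
  return out;
}

const code = emit(fold(parse(tokenize("x + 2 * 3"))));
console.log(code); // [ 'get_local $x', 'i32.const 6', 'i32.add' ]
```

Note how the optimisation step replaced 2 * 3 with the constant 6 before any target code was generated, so the work never has to be done at run time.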

All projects need to start small, and the engineers at Mozilla decided to begin with C/C++ as the source language, compiling it with an existing toolchain called LLVM (not an acronym).

Initially, the search for a better-performing web started with asm.js (at least in the WebAssembly narrative; see also PNaCl, Google’s earlier attempt). asm.js is a small subset of JavaScript that could serve as the compile target for C/C++ programs, using annotations and other clever tricks to improve JavaScript performance.
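
To give a flavour of those annotations, here is a minimal sketch in the asm.js style. The function name is made up, and a real asm.js module would also be wrapped in a module function carrying a "use asm" directive, which is left out here to keep the snippet plain JavaScript. The |0 coercions tell the engine that every value involved is a 32-bit integer:

```javascript
// asm.js-style type annotations expressed as ordinary JavaScript.
function addInt(a, b) {
  a = a | 0;           // parameter annotation: a is a 32-bit integer
  b = b | 0;           // parameter annotation: b is a 32-bit integer
  return (a + b) | 0;  // return annotation: the result is a 32-bit integer
}

console.log(addInt(2, 3));          // 5
console.log(addInt(2.7, 3.9));      // 5, the fractions are truncated first
console.log(addInt(2147483647, 1)); // -2147483648, wraps like a C int
```

Because this is still valid JavaScript, it runs in any engine; an engine that recognises the pattern can compile it ahead of time using the integer types.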

Unfortunately, it lacked one crucial design principle underlying what was wanted: portability. Performance varied between JavaScript engines, but the results were a clear indication that this might be a good approach.

The developers of WebAssembly decided their target representation would be a binary format that provided a “dense, linear encoding of the abstract syntax”… Which is a lot of words, so let’s unpack that.

The “dense” part refers to the high-level goal of achieving a format that is efficient in both size and load time. The internet is all about sending data along wires, and whilst there are lots of projects to improve latency, one foolproof way of speeding things up is to send less data. Another important aspect is faster decoding, because a binary format can use array indexing where a compressed text format would need dictionary lookups. Read more about this design choice here.
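
One concrete ingredient of that density, not mentioned in the article itself, is that the WebAssembly binary format stores integers in the variable-length LEB128 encoding, so small numbers (which are by far the most common) take a single byte. A minimal sketch of the unsigned variant, with a made-up function name and valid for values up to 32 bits:

```javascript
// Unsigned LEB128: 7 payload bits per byte, with the high bit set on
// every byte except the last one.
function encodeUnsignedLEB128(value) {
  const bytes = [];
  do {
    let byte = value & 0x7f; // take the low 7 bits
    value >>>= 7;            // shift them out
    if (value !== 0) byte |= 0x80; // more bytes follow
    bytes.push(byte);
  } while (value !== 0);
  return bytes;
}

console.log(encodeUnsignedLEB128(1));      // [ 1 ]
console.log(encodeUnsignedLEB128(128));    // [ 128, 1 ]
console.log(encodeUnsignedLEB128(624485)); // [ 229, 142, 38 ]
```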

What is wat?

C and C++ programs compile to the binary format, stored in .wasm files, which map 1:1 to a (somewhat) human-readable text format. These text files are labelled .wat. WasmExplorer is great for getting your head around the text representation and how it relates to the original code. Let’s take a simple example.
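
The listing the article walks through is not reproduced in this text. As a stand-in, here is a hypothetical module that contains every declaration discussed in this section; the function body (add 1 to the argument) is an assumption made purely for illustration. The text format appears in the comments using the MVP-era spellings from the article (current tools write funcref and local.get instead of anyfunc and get_local), and the byte array is a hand-assembled binary encoding of the same module so that the snippet can run as-is in Node.js or a browser:

```javascript
// Text format of the hypothetical module:
// (module
//   (type $type0 (func (param i32) (result i32)))
//   (table 0 anyfunc)
//   (memory 1)
//   (export "memory" memory)
//   (export "main" $func0)
//   (func $func0 (param $var0 i32) (result i32)
//     get_local $var0
//     i32.const 1
//     i32.add))
const bytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00, // "\0asm" magic, version 1
  0x01, 0x06, 0x01, 0x60, 0x01, 0x7f, 0x01, 0x7f, // type section: (i32) -> i32
  0x03, 0x02, 0x01, 0x00,                         // function section: func 0 has type 0
  0x04, 0x04, 0x01, 0x70, 0x00, 0x00,             // table section: 0 function references
  0x05, 0x03, 0x01, 0x00, 0x01,                   // memory section: 1 page minimum
  0x07, 0x11, 0x02,                               // export section: 2 exports
  0x06, 0x6d, 0x65, 0x6d, 0x6f, 0x72, 0x79, 0x02, 0x00, // "memory" -> memory 0
  0x04, 0x6d, 0x61, 0x69, 0x6e, 0x00, 0x00,             // "main" -> function 0
  0x0a, 0x09, 0x01, 0x07, 0x00,                   // code section: 1 body, no extra locals
  0x20, 0x00, 0x41, 0x01, 0x6a, 0x0b,             // get_local 0; i32.const 1; i32.add; end
]);

// Compile and instantiate synchronously (fine for a module this small).
const instance = new WebAssembly.Instance(new WebAssembly.Module(bytes));

console.log(instance.exports.main(41));                // 42
console.log(instance.exports.memory.buffer.byteLength); // 65536
console.log(bytes.length, "bytes in the binary encoding");
```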

There’s a lot going on here so let’s take it slowly and explain the concepts as they come.

First, there is this weird module word. Where did that come from? Mejin Leechor gave a great talk on modules in JavaScript and describes them as giving code “structure and boundaries”. This is very similar to the idea of WebAssembly modules (and there are plans to integrate them with ES6 modules in the future).

Straight from the docs, we have that the module is the “distributable, loadable, and executable unit of code in WebAssembly”. Modules can have the following sections each with their own unique responsibility: import, export, start, global, memory, data, table, elements, function and code. For now, let’s just look at what we have in our module.

The first declaration is (type $type0 (func (param i32) (result i32))). This is intimately linked to the table declaration on the next line. We are declaring a new type with a func signature that takes a 32-bit integer parameter and returns a 32-bit integer. If we wanted to call our function indirectly, we would make a call_indirect into the table, and the declared type would be checked against the function's actual signature to make sure everything was correct. As part of the minimum viable product (MVP) only one table is allowed, but there are plans to allow multiple tables that can be indexed.

The next declaration is (table 0 anyfunc). The table section is reserved for defining zero or more tables. A table is similar to a linear memory in that it is a resizable array, but it holds references rather than raw bytes. The 0 is the table's initial size: we have nothing in our table, but we still need to provide the element type, and anyfunc (a reference to any function) is the only one the MVP allows.
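
The same kind of table can be created from the JavaScript side with the WebAssembly.Table constructor, which makes the "resizable array of references" idea easy to poke at. This is only a sketch of the concept; the module in the article declares its table internally rather than importing one:

```javascript
// An empty table of function references, like (table 0 anyfunc).
const table = new WebAssembly.Table({ initial: 0, element: "anyfunc" });
console.log(table.length); // 0

// Tables can grow; grow() returns the previous length.
const previousLength = table.grow(2);
console.log(previousLength, table.length); // 0 2

// New slots hold no function yet.
console.log(table.get(0)); // null

// Indexing past the end is rejected rather than reading arbitrary memory.
let outOfBoundsRejected = false;
try {
  table.get(5);
} catch (error) {
  outOfBoundsRejected = error instanceof RangeError;
}
console.log(outOfBoundsRejected); // true
```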

The problem that the developers had was linked to security. If a function wanted to call another function, giving it direct access to a function stored in linear memory was unsafe. Instead functions are stored in the table ready to be indexed if needed. Lin Clark wrote a great article describing tables (as used in imports) in more detail and how they provide better security.

We then have a declaration of (memory 1). This is the linear memory used by the module, and we declare that we need 1 page of memory (64 KiB).
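
The page arithmetic can be checked directly with the JavaScript WebAssembly.Memory constructor. In this sketch the maximum of 2 pages is an arbitrary value chosen for the demonstration; the article's module declares no maximum:

```javascript
const PAGE_SIZE = 64 * 1024; // one WebAssembly page is 64 KiB

// Equivalent in spirit to (memory 1), plus an illustrative upper limit.
const memory = new WebAssembly.Memory({ initial: 1, maximum: 2 });
const bytesBefore = memory.buffer.byteLength;
console.log(bytesBefore === PAGE_SIZE); // true

// grow() takes a number of pages and returns the previous size in pages.
const previousPages = memory.grow(1);
const bytesAfter = memory.buffer.byteLength;
console.log(previousPages, bytesAfter); // 1 131072

// Growing past the declared maximum is refused.
let growRefused = false;
try {
  memory.grow(1);
} catch (error) {
  growRefused = error instanceof RangeError;
}
console.log(growRefused); // true
```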

The next declaration is (export "memory" memory) . An export is something that is returned to the host at instantiation time. Basically, the cool bits we want from the WebAssembly code.

The structure is quite simple: (export <name-of-export> (<type> <name/index>)). Here we are just exporting the memory we declared in the previous line. This allows direct memory access from our JavaScript code as an ArrayBuffer, which drastically improves efficiency because data does not have to be passed back and forth in calls across the WASM/JS border. Similarly, we then export our function with (export "main" $func0).
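
To see why an exported memory is useful, consider a second hypothetical module (not the one from the article) that exports its memory together with a store function. Whatever the WebAssembly code writes is immediately visible to JavaScript through the ArrayBuffer, with no copying. As before, the bytes are a hand-assembled encoding of the text format shown in the comments:

```javascript
// (module
//   (memory 1)
//   (export "memory" (memory 0))
//   (export "store" (func 0))
//   (func (param i32 i32)
//     local.get 0    ;; address
//     local.get 1    ;; value
//     i32.store))
const bytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00, // magic, version 1
  0x01, 0x06, 0x01, 0x60, 0x02, 0x7f, 0x7f, 0x00, // type section: (i32, i32) -> ()
  0x03, 0x02, 0x01, 0x00,                         // function section: func 0 has type 0
  0x05, 0x03, 0x01, 0x00, 0x01,                   // memory section: 1 page minimum
  0x07, 0x12, 0x02,                               // export section: 2 exports
  0x06, 0x6d, 0x65, 0x6d, 0x6f, 0x72, 0x79, 0x02, 0x00, // "memory" -> memory 0
  0x05, 0x73, 0x74, 0x6f, 0x72, 0x65, 0x00, 0x00,       // "store" -> function 0
  0x0a, 0x0b, 0x01, 0x09, 0x00,                   // code section: 1 body, no extra locals
  0x20, 0x00, 0x20, 0x01, 0x36, 0x02, 0x00, 0x0b, // local.get 0; local.get 1; i32.store; end
]);

const instance = new WebAssembly.Instance(new WebAssembly.Module(bytes));
const { memory, store } = instance.exports;

// WebAssembly writes the value 42 at byte offset 8...
store(8, 42);

// ...and JavaScript reads it straight out of the shared buffer.
// WebAssembly memory is little-endian, hence the `true` flag.
const view = new DataView(memory.buffer);
console.log(view.getUint32(8, true)); // 42
```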

Now to the slightly more interesting bit, our code and its representation.

Before moving on, this is the perfect opportunity to introduce yet another design component: the stack machine.

Register versus Stack Machines

Computers, at their simplest, consume inputs and produce outputs. A ‘machine’ can execute a program in multiple different ways. Two of the main approaches are register machines and stack machines. In a register machine, operands such as the parameters to functions are kept in named storage locations (registers) and are then manipulated depending on the program in execution.

A simple, but somewhat flawed, analogy could be a kitchen and making a recipe. The ingredients are stored in different locations, you get them and make something which you might put somewhere for another day or immediately consume (yum). It’s far from perfect but hopefully you get the idea.

Stack machines, on the other hand, employ a different model. Imagine you are a journalist or secretary, your job is to read and respond to letters. You ‘pop’ the top letter from your pile and begin writing a response whilst someone else comes along with more work and ‘pushes’ to the top of the pile. These are the ones you are going to have to do next. Again, grossly oversimplified but it should help visualise the mechanics.

WebAssembly uses a stack machine model for code execution. If you’re short of great reading material and are into programming semantics, the paper “Bringing the Web up to Speed with WebAssembly” is really good. It also indicates why they chose the stack machine representation: “The stack organization is merely a way to achieve a compact program representation, as it has been shown to be smaller than a register machine”, with reference to a paper which found “… the bytecode size of the register machine being only 26% larger than that of the corresponding stack one”.

Even though the stack machine approach isn’t necessarily faster, it offered smaller bytecode, an incredibly important design goal for internet-based transactions.

So how can we understand the text format as a stack machine? As we read the code line by line, we end up pushing arguments to the stack, then popping them off, doing some computation and pushing the result back. And repeat.
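
That push, pop, compute, push cycle fits in a few lines. The interpreter below is a toy written for illustration (it supports only four instructions, uses numeric local indices, and bears no relation to how real engines execute WebAssembly), but it shows the mechanics:

```javascript
// A toy stack machine for a handful of WebAssembly-style instructions.
function run(instructions, locals) {
  const stack = [];
  for (const instruction of instructions) {
    const [op, arg] = instruction.split(" ");
    switch (op) {
      case "get_local": // push a local variable, looked up by index
        stack.push(locals[Number(arg)]);
        break;
      case "i32.const": // push a constant
        stack.push(Number(arg) | 0);
        break;
      case "i32.add": { // pop two values, push their 32-bit sum
        const right = stack.pop();
        const left = stack.pop();
        stack.push((left + right) | 0);
        break;
      }
      case "i32.mul": { // pop two values, push their 32-bit product
        const right = stack.pop();
        const left = stack.pop();
        stack.push(Math.imul(left, right));
        break;
      }
      default:
        throw new Error(`unknown instruction: ${instruction}`);
    }
  }
  return stack.pop(); // whatever is left on top is the result
}

console.log(run(["get_local 0", "i32.const 1", "i32.add"], [41])); // 42
console.log(
  run(["get_local 0", "get_local 1", "i32.mul", "i32.const 2", "i32.add"], [3, 4])
); // 14
```

Notice that no instruction names where its operands come from or where its result goes; the stack makes that implicit, which is part of why the encoding is so compact.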

At first it might seem a little odd to have a text format if, in the end, it will be compiled to the binary format for compactness. But the web has always had a culture of being able to view the source, and that’s why the developers behind WebAssembly produced the text format. To go one step further and avoid conflicts of syntax, they used a Lisp-like s-expression style.

Safety and Sandboxing

One of the greatest sources of bugs (and exploits) in unsafe languages is buffer overflows. C and C++ are almost synonymous with this problem, and it is one of the first things you are taught when learning these languages. In exchange for a little overhead, WebAssembly adds a safety net by enforcing fixed-size, indexed memory (although certain memories can be grown).

The local variables of our function, for example $var0, are not referenced by address but are instead indexed, providing a layer of security. Access is granted via the get_local and set_local instructions, which operate entirely within the index space of the local variables.

Memory security was a top priority when designing WebAssembly. Straight from the documentation: “Linear memory is sandboxed; it does not alias other linear memories, the execution engine’s internal data structures, the execution stack, local variables, or other process memory.” Lin Clark, again, wrote a great article describing this.

The basic idea is comparable to the JavaScript ArrayBuffer object: resizable and bounds-checked. What we’re trying to achieve is program isolation, to prevent errors and malicious code from spreading and corrupting data they shouldn’t even have access to.
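
Bounds checking is easy to observe from the outside. The hypothetical module below (hand-assembled for this illustration, not taken from the article) exports a load function that reads a 32-bit integer from a one-page memory. A read that would run past the end of the 65536-byte memory does not return neighbouring data; it traps, which JavaScript sees as a WebAssembly.RuntimeError:

```javascript
// (module
//   (memory 1)
//   (export "load" (func 0))
//   (func (param i32) (result i32)
//     local.get 0
//     i32.load))
const bytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00, // magic, version 1
  0x01, 0x06, 0x01, 0x60, 0x01, 0x7f, 0x01, 0x7f, // type section: (i32) -> i32
  0x03, 0x02, 0x01, 0x00,                         // function section: func 0 has type 0
  0x05, 0x03, 0x01, 0x00, 0x01,                   // memory section: 1 page minimum
  0x07, 0x08, 0x01,                               // export section: 1 export
  0x04, 0x6c, 0x6f, 0x61, 0x64, 0x00, 0x00,       // "load" -> function 0
  0x0a, 0x09, 0x01, 0x07, 0x00,                   // code section: 1 body, no extra locals
  0x20, 0x00, 0x28, 0x02, 0x00, 0x0b,             // local.get 0; i32.load; end
]);

const { load } = new WebAssembly.Instance(new WebAssembly.Module(bytes)).exports;

// Returns true if reading 4 bytes at `address` traps.
function traps(address) {
  try {
    load(address);
    return false;
  } catch (error) {
    return error instanceof WebAssembly.RuntimeError;
  }
}

console.log(load(65532));  // 0, the last 4-byte word that fits in the page
console.log(traps(65533)); // true, the read would end at byte 65537
console.log(traps(-1));    // true, addresses are unsigned, so this is far out of range
```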

What can WebAssembly do?

One of the major end goals for WebAssembly was to revolutionise what was possible in terms of graphics on the web. The classic examples are Zen Garden by Epic Games and Tanks!

Thanks to its design, WebAssembly marks a pivotal moment in web development. The internet has a new tool in its arsenal to create amazing experiences and share information. WebAssembly provides smaller code sizes, faster execution, greater security and a lot of room for extensibility. With ideas like threads, single-instruction multiple-data (SIMD) primitives and zero-cost exception handling on the horizon, WebAssembly’s abilities look set only to expand.