Creating a programming language
If you write code, you do so in one or more programming languages. You probably enjoy some of them for their elegance or expressive power, and you have probably kept your distance from others because of features you consider badly implemented.
I think every curious developer has asked, at least once, how a programming language is created. It is normal to be fascinated by how programming languages work. Unfortunately, most of the answers we read are very academic or theoretical, and others contain too many implementation details. After reading them we still wonder how things work in practice.
Have you ever thought about how the programming languages we love and hate came to be? How the particular features we like (or dislike) were designed and implemented, and why? How those magic black boxes called compilers and interpreters work? How code written in JavaScript, Ruby, Python, etc. turns into an executable program? Or have you ever thought about building your own programming language?
Many people have difficulties or frustrations with the programming languages they use every day. Some want things to be handled more abstractly, while others dislike implementing features they wish were 'standard'. Whether you are an IT professional or just a hobbyist, many times you may find yourself wanting to create a new programming language.
Designing a programming language
If you just want to write your own compiler to learn how these things work, you can skip this phase: take a subset of an existing language, or come up with a simple variation of it, and get started. However, if you plan to create your very own programming language, you will have to give its design some thought.
I think of designing a programming language as divided into two phases:
The big-picture phase
The refinement phase
In the first phase we answer the fundamental questions about our language.
What execution paradigm do we want to use? Will it be imperative or functional? Or maybe based on state machines or business rules?
Do we want static typing or dynamic typing?
What sort of programs will this language be best at? Will it be used for small scripts or large systems?
What matters most to us: performance? Readability?
Do we want it to be similar to an existing programming language? Will it be aimed at C developers, or easy to learn for those coming from Python?
Do we want it to work on a specific platform (JVM, CLR)?
What sort of metaprogramming capabilities do we want to support, if any? Macros? Templates? Reflection?
In the second phase we keep evolving the language as we use it. We will run into issues, and into things that are very difficult or impossible to express in our language, and we will have to change it in response. The second phase might not be as glamorous as the first, but it is the phase in which we keep tuning our language to make it usable in practice, so we should not underestimate it.
Building a compiler
Building a compiler is the most exciting step in creating a programming language. Once we have a compiler we can actually bring our language to life. A compiler lets us start playing with the language, use it, and identify what we missed in the initial design. It lets us see the first results. It is hard to beat the joy of executing the first program written in our brand new programming language, no matter how simple that program may be.
But how do we build a compiler?
As with everything complex, we do it in steps:
We build a parser: the parser is the part of our compiler that takes the text of our programs and understands which commands they express. It recognizes the expressions, the statements, and the classes, and it creates internal data structures to represent them. The rest of the compiler will work with those data structures, not with the original text.
(optional) We translate the parse tree into an Abstract Syntax Tree. Typically the data structures produced by the parser are a bit low level, as they contain a lot of details which are not crucial for our compiler. Because of this we frequently want to rearrange the data structures into something slightly higher level.
We resolve symbols. In the code we write things like a + 1. Our compiler needs to figure out what a refers to. Is it a field? Is it a variable? Is it a method parameter? We examine the code to answer that.
We validate the tree. We need to check that the programmer did not make errors. Are they trying to sum a boolean and an int? Or accessing a non-existent field? We need to produce appropriate error messages.
We generate the machine code. At this point we translate the code into something the machine can execute. It could be proper machine code or bytecode for some virtual machine.
(optional) We perform the linking. In some cases we need to combine the machine code produced for our programs with the code of static libraries we want to include, in order to generate a single executable.
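The steps above can be sketched for a deliberately tiny language. This is only an illustration under stated assumptions: the language has integers, true/false, names and +; the parser builds the AST directly (so the optional parse-tree step is skipped); the symbol table is passed in by hand; and the output is bytecode for a hypothetical stack machine with PUSH, LOAD and ADD instructions. All names (parse, check, generate, compile_source) are invented for this sketch.

```python
import re
from dataclasses import dataclass

# AST node types produced by the parser.
@dataclass
class Num:
    value: int

@dataclass
class Bool:
    value: bool

@dataclass
class Name:
    ident: str

@dataclass
class Add:
    left: object
    right: object

TOKEN = re.compile(r"\s*(\d+|[A-Za-z_]\w*|\+|\(|\))")

def tokenize(text):
    text, pos, tokens = text.rstrip(), 0, []
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if not match:
            raise SyntaxError(f"unexpected character at position {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens

def parse(text):
    """Parsing step: turn source text into an AST (recursive descent)."""
    tokens, pos = tokenize(text), 0

    def atom():
        nonlocal pos
        if pos >= len(tokens):
            raise SyntaxError("unexpected end of input")
        tok = tokens[pos]
        pos += 1
        if tok.isdigit():
            return Num(int(tok))
        if tok in ("true", "false"):
            return Bool(tok == "true")
        if tok == "(":
            node = expr()
            if pos >= len(tokens) or tokens[pos] != ")":
                raise SyntaxError("expected ')'")
            pos += 1
            return node
        if tok in ("+", ")"):
            raise SyntaxError(f"unexpected token {tok!r}")
        return Name(tok)

    def expr():
        nonlocal pos
        node = atom()
        while pos < len(tokens) and tokens[pos] == "+":
            pos += 1
            node = Add(node, atom())
        return node

    tree = expr()
    if pos != len(tokens):
        raise SyntaxError(f"unexpected token {tokens[pos]!r}")
    return tree

def check(node, symbols):
    """Symbol resolution and validation: return the type of `node` or raise."""
    if isinstance(node, Num):
        return "int"
    if isinstance(node, Bool):
        return "bool"
    if isinstance(node, Name):
        if node.ident not in symbols:  # symbols maps a name to its type
            raise NameError(f"unknown symbol {node.ident!r}")
        return symbols[node.ident]
    left, right = check(node.left, symbols), check(node.right, symbols)
    if left != "int" or right != "int":
        raise TypeError(f"cannot add {left} and {right}")
    return "int"

def generate(node):
    """Code generation: emit instructions for a hypothetical stack machine."""
    if isinstance(node, (Num, Bool)):
        return [("PUSH", node.value)]
    if isinstance(node, Name):
        return [("LOAD", node.ident)]
    return generate(node.left) + generate(node.right) + [("ADD", None)]

def compile_source(text, symbols):
    tree = parse(text)
    check(tree, symbols)
    return generate(tree)
```

Calling compile_source("a + 1", {"a": "int"}) yields a LOAD of a, a PUSH of 1 and an ADD, while compile_source("true + 1", {}) is rejected during validation with a TypeError. A real compiler would also track source positions so that error messages can point at the offending line.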
Do we always need a compiler? No. We can replace it with other means to execute the code:
We can write an interpreter: an interpreter is essentially a program that performs steps 1-4 of a compiler and then directly executes what is specified by the Abstract Syntax Tree.
We can write a transpiler: a transpiler performs steps 1-4 and then outputs code in some language for which we already have a compiler (for example C++ or Java).
These two alternatives are perfectly valid, and it frequently makes sense to choose one of them because the effort required is typically smaller.
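To make the difference concrete, here is a minimal sketch of both alternatives working on the same tree. It assumes the AST has already been built and validated, and represents it with plain tuples such as ("add", ("name", "a"), ("num", 1)); that representation, and the function names evaluate, to_c and transpile, are assumptions made for this example. The transpiler emits C here simply because the output is short.

```python
def evaluate(node, env):
    """Interpreter: walk the AST and compute the result directly."""
    kind = node[0]
    if kind == "num":
        return node[1]
    if kind == "name":
        return env[node[1]]  # env maps variable names to runtime values
    if kind == "add":
        return evaluate(node[1], env) + evaluate(node[2], env)
    raise ValueError(f"unknown node kind {kind!r}")

def to_c(node):
    """Transpiler helper: turn an expression node into C source text."""
    kind = node[0]
    if kind == "num":
        return str(node[1])
    if kind == "name":
        return node[1]
    if kind == "add":
        return f"({to_c(node[1])} + {to_c(node[2])})"
    raise ValueError(f"unknown node kind {kind!r}")

def transpile(node, params):
    """Transpiler: wrap the expression in a C function taking int parameters."""
    args = ", ".join(f"int {p}" for p in params)
    return f"int compute({args}) {{ return {to_c(node)}; }}"
```

With tree = ("add", ("name", "a"), ("num", 1)), evaluate(tree, {"a": 41}) runs the program immediately, while transpile(tree, ["a"]) returns C text that an existing C compiler would then have to compile. Neither function needs a code generator for a real machine, which is where much of the saved effort comes from.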
A standard library for your programming language
Any programming language needs to be able to do a few things:
Printing to the screen
Accessing the filesystem
Using network connections
Creating GUIs
These are the basic functionalities to interact with the rest of the system. Without them a language is basically useless. How do we provide these functionalities? By creating a standard library. This will be a set of functions or classes that can be called in the programs written in our programming language but that will be written in some other language. For example, many languages have standard libraries written at least partially in C.
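One common way to wire this up, sketched here under assumptions, is a table that maps the names visible in our language to functions written in the host language. In this example the host language is Python rather than C, and the names make_stdlib, call_builtin, print, read_file and length are all invented for illustration.

```python
import sys

def make_stdlib(write=sys.stdout.write):
    """Build the table of built-in functions exposed to programs.

    `write` is the host-side output function; it is a parameter so that
    output can be redirected or captured.
    """
    def lang_print(*args):
        write(" ".join(str(a) for a in args) + "\n")

    def lang_read_file(path):
        with open(path, encoding="utf-8") as handle:
            return handle.read()

    return {
        "print": lang_print,          # printing on the screen
        "read_file": lang_read_file,  # accessing the filesystem
        "length": len,                # reuse a host function as-is
    }

def call_builtin(stdlib, name, args):
    """What an interpreter might do when a program calls a library function."""
    if name not in stdlib:
        raise NameError(f"no standard library function named {name!r}")
    return stdlib[name](*args)
```

An interpreter for our language would consult this table whenever a program calls a function that the program itself did not define, for example call_builtin(make_stdlib(), "print", ["hello", 42]).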
A standard library can then contain much more: for example, classes to represent the main collections such as lists and maps, or to process common formats such as JSON or XML. Often it will also contain advanced functionality for processing strings and regular expressions.
In other words, writing a standard library is a lot of work. It is not glamorous, and it is not conceptually as interesting as writing a compiler, but it is still a fundamental component in making a programming language viable.
There are ways to avoid this requirement. One is to make the language run on some platform and make it possible to reuse the standard library of another language. For example, all languages running on the JVM can simply reuse the Java standard library.
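The same idea can be sketched with Python standing in for the host platform: instead of writing our own math or JSON routines, the language resolves a qualified name such as math.sqrt to the host's existing implementation. The function name resolve_host_function and the ALLOWED_MODULES whitelist are assumptions made for this example; importlib.import_module is the standard Python call for loading a module by name.

```python
import importlib

# Hypothetical whitelist of host modules our language is allowed to reuse.
ALLOWED_MODULES = {"math", "json"}

def resolve_host_function(qualified_name):
    """Map a name like 'math.sqrt' in our language to the host function."""
    module_name, _, func_name = qualified_name.partition(".")
    if module_name not in ALLOWED_MODULES:
        raise ImportError(f"module {module_name!r} is not exposed to programs")
    module = importlib.import_module(module_name)
    if not hasattr(module, func_name):
        raise AttributeError(f"{module_name!r} has no function {func_name!r}")
    return getattr(module, func_name)
```

For instance, resolve_host_function("math.sqrt")(16) calls the host's square root directly. The trade-off is that the language inherits the host library's naming and behavior rather than defining its own.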
Recommended resources:
Programming Languages: An Interpreter-Based Approach - Samuel N. Kamin - this isn't a well-known book, but it totally re-ignited my interest. The book goes through various languages and their features and builds interpreters for them (in Pascal; it's an old book).
How to Create Your Own Programming Language: Want to create a programming language, but don't feel like going through one of those expensive and boring 1000-page books? Well, you're not alone ...
LLVM: Writing a Simple Programming Language - a step-by-step C++ tutorial on how to build a compiled language (using LLVM). You should basically use LLVM for the back-end, since it is open source and will save you hundreds of man-years of work. Code generation is a well-solved problem, and LLVM's solution is applicable to almost any language you might build. Clang is the C/C++/Objective-C front-end.
Types and Programming Languages - Benjamin C. Pierce - this is perhaps mathematically heavy, but you can skip those parts and still understand the concepts. If you want to understand type systems properly and implement them properly, this book has it all. Newer languages like Scala, Kotlin, and Swift (and others such as TypeScript) are beginning to include things like set-theoretic type systems (optionals, unions, intersections). This book has been incredibly influential for me, and will lead you down many good rabbit holes.
Translated from: https://vuejsexamples.com/how-would-i-creating-a-programming-language/