This is an interview between Oleg Podsechin and Ryan Dahl, the lead developer of NodeJS. It is not a NodeJS tutorial, but it reveals a good deal about the project and helps explain why NodeJS came to exist.
Frank.
OP: The first question is an introduction really. How did you arrive at Node? Did you have experience with JavaScript before? When did you get started with JavaScript, and with event-driven software?
Ry: I was a contractor doing various little C projects, usually involving servers and event-driven software, and I realized that I was writing the same code over and over. C is a nice language to work in, but I wanted something I could script in the same way that I was programming these servers.
OP: Had you done any front-end stuff in JavaScript?
Ry: A little. I used to work a lot with Ruby on Rails, so I’d often be dealing with front-end code. Back then I wrote a little Ruby web server called Ebb that was meant to be a faster Mongrel. That code was the starting point for Node.
OP: Ebb was mostly in C, right? So you went from writing it in Ruby, then writing it in C, and now you’re sort of ending up writing it in JavaScript?
Ry: Right. So what originally was Ruby turned into C. For a while I toyed with the idea of having a small web server C library, but it’s hard to get anything done in C. One day I had this epiphany: “JavaScript is exactly the language that I’m looking for here.” That happened shortly after V8 was released.
OP: You’ve said that there are two languages that will always be around: C and JavaScript. So, what are your thoughts on JavaScript as a general-purpose programming language?
Ry: JavaScript has certain characteristics that make it very different from other dynamic languages, namely that it has no concept of threads. Its model of concurrency is completely based around events. This makes it rather different from other general-purpose dynamic programming languages like Ruby and Python. At least for certain classes of problems, I’ve found JavaScript to be much easier to program in, for example when writing an IRC server.
OP: How do you see the future of JavaScript? Do you see JavaScript becoming increasingly prevalent, not only on servers but also on desktops?
Ry: JavaScript is already doing a great job describing GUIs. I think with a familiar browser-like API JavaScript could also make a good desktop application language.
OP: JavaScript is quite unstructured, so people just copy and paste JavaScript code …
Ry: Yeah, not having a module system doesn’t help. JavaScript really encourages people to dump everything into global variables. That’s a real detractor for JavaScript, but in the end better practices can overcome that sort of thing.
OP: So, did you follow the whole discussion around ECMAScript 4 and ECMAScript 5?
Ry: I like Crockford’s opinion that the language should be kept simple. One of the best things about JavaScript was its simplicity. It didn’t have many predefined ideas about how to do stuff, particularly for I/O. Although ECMAScript 4 didn’t define any I/O, it did define a lot. It did make a lot of breaking changes. That said, I wish ECMAScript 5 did have a few more features.
OP: Any particular ones in mind?
Ry: What’s this called? Destructive assignment? If you have an array on the right and a list of variables on the left, they can be defined that way. That would be nice to have.
OP: That’s included in Rhino, but not in V8.
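For readers who have not met the feature Ryan describes, the following is a minimal sketch of array destructuring, which was later standardized in ECMAScript 2015 and is usually called destructuring assignment. The function name `swapPair` and the sample values are illustrative assumptions, not something from the interview.

```javascript
// Illustrative helper (hypothetical name): unpack an array into two
// variables in one statement, then return them in reverse order.
function swapPair(pair) {
  const [first, second] = pair; // array on the right, variables on the left
  return [second, first];
}

// Destructuring can also swap two existing variables without a temporary.
let server = 'Ebb';
let successor = 'Node';
[server, successor] = [successor, server];

console.log(swapPair([1, 2]), server, successor);
```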
OP: So let’s move on to Node itself. What is the most difficult design decision you made with regards to the project?
Ry: Something that was very hard for me was … my original idea was that it was going to be a purely non-blocking system, and I’ve backed out of that a bit in the module system and a few other areas. In the browser, loading JavaScript from a script tag is non-blocking. You don’t really know when the scripts are completely evaluated until an onLoad callback is made. Originally Node was similar. You would load a bunch of module files and you wouldn’t know that they were fully interpreted, fully evaluated, until a “loaded” event was fired. This made things a bit complicated. You couldn’t just do “require” and start using that stuff right below it; you had to wait for the callback to do that.
OP: The hello world app would have one more level of indentation.
Ry: Right.
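The difference discussed above can be sketched as follows. `loadModule` is a hypothetical stand-in for an early callback-style loader, written here only for illustration; it is not a real Node API and simply defers to `require` on a later tick. `require` and the built-in `path` module are real.

```javascript
// Hypothetical callback-style loader (NOT a real Node API). It hands the
// module to the callback on a later tick, so callers cannot use the module
// on the line right after the call.
function loadModule(name, callback) {
  setImmediate(() => callback(require(name)));
}

// Callback style: everything that needs the module sits one level deeper.
loadModule('path', (loaded) => {
  console.log('async:', loaded.basename('/tmp/hello.js'));
});

// Synchronous CommonJS style: require blocks, so the module is usable
// on the very next line.
const path = require('path');
console.log('sync:', path.basename('/tmp/hello.js'));
```

The extra nesting in the callback version is the "one more level of indentation" mentioned in the exchange above.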
OP: But it’s funny because people say that one of the benefits JavaScript offers is that you can use it in the browser as well. You can run the same validation logic on the server and browser, but the CommonJS module spec doesn’t work within the browser, so there are these efforts to try and make frameworks with asynchronous module loading.
Ry: Right, so in terms of difficult design decisions, I wanted Node to be browser-like. Maybe it didn’t use the same methods, but the same structures could be ported easily, aliasing methods to the browser ones. Originally Node achieved that: it was totally browser-like. Originally, it even had a ‘window’ object. I slowly backed off that API as it became clear it wasn’t necessary to have the server-side environment be exactly the same. So I went with the CommonJS module system, which was rather reasonable; the CommonJS people had put a lot of thought into it and I didn’t really want to worry about modules so much. So yeah, require is blocking and there are some other minor things that are blocking in Node. Generally this pragmatic approach of being non-blocking 99% of the time, but allowing a few synchronous operations here and there, has worked out well. It probably doesn’t matter for a server-side program if you load modules synchronously.
OP: So on the topic of CommonJS, are you following any of the APIs or any of the discussions on the list?
Ry: Yeah, sure.
OP: And which ones are you most interested in?
Ry: CommonJS has some good specs and some less good specs. Some specs are rather prescriptive without any implementation, which I find wrong. I do like the idea of having a common server-side JavaScript interface; I just think it’s going to take some time to experiment with different APIs. A binary spec is quite important because JavaScript currently lacks a way of dealing with raw binary in any reasonable way. The module spec is good, the assert spec is good, the others are questionable.
OP: What about the package spec?
Ry: Oh yeah, it also looks good. I don’t want to work on a package system, so I’m not following it super closely, but I think that there are a lot of good ideas there.
OP: As a user you must use a package management system. Which ones do you use?
Ry: I’m playing around with NPM. It’s OK, kind of buggy, but you can use it.
OP: So with regards to packages, obviously there’s some stuff that’s going into the core of Node, but as for external packages, like XML parsing, are there any packages that you think are important that aren’t there already?
Ry: There needs to be a better MySQL solution. libmysql_client, the library that comes with MySQL, is blocking, so that is not a solution. There are other solutions, but they seem kind of buggy. A lot of people use MySQL and it would be a hindrance for them if they couldn’t access that easily. That’s one.
For a long time I was lusting after a good JavaScript HTML parser, but it seems that has been solved. I also wanted a DOM implementation and it seems like that’s been solved too. I would really like a way to access Cassandra, which uses Thrift; that’s not been done yet. (Translator’s note: Thrift is one of Facebook’s core technical frameworks; it lets systems written in different languages communicate with one another.)
OP: There aren’t really any decent JavaScript Thrift libraries.
Ry: Thrift is a piece of crap, but unfortunately some projects are using it so we’ve got to interface with it. Some sort of Thrift binding would be good.
OP: I think in the next release they’re looking to have a RESTful interface.
Ry: I’ve heard they’re introducing an interface based on Avro, a new message serialization RPC thing, but I’m not sure how good the Avro support is. Avro seems a lot better than Thrift, so just binding to Avro would be the best way to go for talking to Cassandra. I don’t know.
Being able to connect to databases is important for users. If it’s not there, then it’s a total roadblock for a lot of people. So, MySQL is a major one.
OP: A slight aside, but a big trend aside from server-side JavaScript is non-relational databases, and Node seems like the perfect glue, if you will, to connect these different data stores together. The CouchDB guys are using it for that purpose. What are your thoughts? Can you see an opportunity there?
Ry: Exactly, Node perfectly fills the proxy and authentication layer between the storage backend and the client. So yeah, I think it’s a good sort of glue and I agree with the CouchDB philosophy that the bulk of the application can kind of sit in the database. All the hard stuff can be back in Couch and Node can just proxy data back and forth.
OP: So talking about the packages you’d like in Node, moving into the core and looking at the way the project is being built, what are your thoughts on project leadership in open source projects? What do you think is the right way to do it, and which things shouldn’t you do? What’s your personal approach? You used to post little challenges for people to get them excited and get them contributing a little bit. Can you talk more about that?
Ry: I have a strong arm in the project. I’m the only committer and I dictate how things go, and I think that’s a good approach for Node at this stage. At some point, hopefully, Node will grow up and we’ll have a committee that decides on things. But at the moment, having somebody who is dedicated to the project and who will make sure that any changes that go in will be maintained is important. Part of that role is not accepting changes that I can’t maintain myself, and so it means rejecting a lot of good code, just because it doesn’t fit into my constrained idea of what “Node core” is. There are users who would have contributed, for example, package manager code, but it’s not something I have time to maintain.
OP: Nudging them a little bit in the right direction …
Ry: Yeah.
OP: Moving on, with regards to commitment to the project, you’re saying that you’re fully behind it (and so on and so forth), so you’re currently employed by Joyent and working 100% on Node?
Ry: Yeah, Node and projects based on Node. It’s great.
OP: I guess the question is more about the commercial nature of Node and the commercialization of Node. Clearly Joyent has an interest in it, being a hosting company, but do you see an ecosystem of businesses emerging around Node at some point, and if so, what types of businesses are these likely to be?
Ry: One obvious thing is hosting of applications in a simple way, like Heroku is doing. Node opens the door to independent contractors making little real-time websites for people, so there’s that ecosystem.
OP: You don’t have an interest in building a service on top of Node? Rather, you wish to maintain the core project?
Ry: I work for Joyent, so I work on products for them, but my main interest is making Node perform well and making users happy.
OP: So the last couple of questions are a bit more abstract. The first one is about the asynchronous nature of Node. Do you see event-driven webapps becoming more prevalent in the future? Not only Node, but asynchronous webapps in general.
Ry: Yeah, definitely. Not waiting for a database is a big win in terms of performance: the amount of baggage associated with each TCP stream is just much smaller. We need that for real-time applications where many mostly-idle connections are being held. But even for normal request-response websites I think we’ll see more asynchronous setups just because of the performance wins, even if it’s not necessary. It’s clear that asynchronous servers perform better in almost every way; it’s difficult to ignore that.
OP: I guess writing callbacks and indentation in JavaScript is one hurdle towards asynchronous programming, but JavaScript is much easier for doing such stuff than other languages.
Ry: There are of course efficient green thread and coroutine implementations which allow you to write asynchronous code in a synchronous-looking way, Eventlet for example. I’m not convinced that’s the right approach; I think it’s a leaky abstraction. There’s no abstraction with callbacks: it’s a rather direct translation from the interface the operating system gives.
Ry: Yeah, I think most of the programs, or a large part of the programs that we write, are just proxies of some form or another. We proxy data from a database to a web browser, but maybe run it through a template first and put some HTML around it or do some sort of logic with it. But largely, we’re just passing data from one place to the other. It’s important that Node is set up to pass data from one place to the other efficiently and with proper data throttling, so that when data is coming in too quickly from the database, you can stop that incoming flow. Suppose it’s over a TCP connection: you can just stop reading from that data source and not fill up your memory with the whole response. Start sending out the first part of the template that you’re sending to the web browser and then pull in more data from the DB. You know, it must properly shuffle the data through the process without blowing up the memory if one side is slower than the other. You shouldn’t have to pull down the entire table, put it into a template and then send it out. It should just be able to flow through your system, so creating an environment where it’s easy to set up these flows in the proper way is important. We’re not there yet, but that’s kind of my vision of what Node will be. Lots of shuffling of data from one file descriptor to the next, without having to buffer a ton of data.
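A minimal sketch of the throttling idea described above, using Node’s `stream` module as it exists today rather than the 2010 API. The helper name `pump`, the sample rows, and the artificially slow sink are assumptions made for illustration. The source is paused whenever `write()` reports a full buffer and resumed on the sink’s `drain` event, so a fast producer cannot fill memory while a slow consumer catches up.

```javascript
const { Readable, Writable } = require('stream');

// Hypothetical helper: move data from source to sink with throttling.
function pump(source, sink, done) {
  source.on('data', (chunk) => {
    // write() returns false once the sink's internal buffer is full.
    if (!sink.write(chunk)) {
      source.pause(); // stop the incoming flow
      sink.once('drain', () => source.resume()); // continue when drained
    }
  });
  source.on('end', () => sink.end());
  sink.on('finish', done);
}

// Demo with made-up data: a fast source feeding a deliberately slow sink.
function demo(done) {
  const received = [];
  const source = Readable.from(['row 1\n', 'row 2\n', 'row 3\n']);
  const slowSink = new Writable({
    highWaterMark: 1, // tiny buffer so throttling kicks in immediately
    write(chunk, encoding, callback) {
      received.push(chunk.toString());
      setTimeout(callback, 10); // simulate a slow consumer
    },
  });
  pump(source, slowSink, () => done(received));
}

demo((received) => console.log(received.join('')));
```

In current Node, `source.pipe(sink)` applies the same pause-and-resume logic automatically; the manual version is shown only to make the mechanism visible.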
OP: So in a way you can look at different Node instances talking to each other, forming a graph with directed edges between the different nodes? Is that where the name Node comes from? How did you come up with the name?
Ry: I used the name “Node” because I envision it as one part of a larger program. A program is not a process; a program is a database plus an application plus a load balancer, and Node is one node of that. It’s not necessarily a bunch of Node instances, but a couple of Node instances plus some other things.
OP: Sounds good! Thank you for taking the time to chat.
Source: http://dailyjs.com/2010/08/11/ryan-dahl-part-1 http://dailyjs.com/2010/08/11/ryan-dahl-part-2 The translation is slightly abridged.