A Database Written in Node.js from the Ground Up

Node is lightweight & scalable, allows us to develop quickly, and npm has incredible packages.

The founding team at HarperDB built the first and only database written in Node.js. A few months back, our CEO Stephen Goldberg was invited to speak at a Women Who Code meetup to share the story of this (what some called crazy) endeavor. Stephen discussed the architectural layers of the database, demonstrated how to build a highly scalable and distributed product in Node.js, and demoed the inner workings of HarperDB. You can watch his talk at the link above, and even read a post from back in 2017, but since we all love Node.js and it’s an interesting topic, I’ll summarize here.

The main (and simplest) reason we chose to build a database in Node is because we knew it really well. We got flak for not choosing Go, but people now accept that Go and Node are essentially head to head (in popularity & community support). Zach, one of our cofounders, recognized that with the time it would have taken to learn a new language, it would never be worth it.

Pros of Building a Database in Node.js

  • We already knew Node.js
  • Lightweight
  • Rapid development
  • Highly scalable
  • npm

The HarperDB team has a background in large-scale software development. The initial goal of our database was to create a tool that empowers developers to focus on coding, without having to devote time and effort to database maintenance, while still providing a powerful solution. We wanted people to feel comfortable and confident in the product they were using. Our team has extensive experience with languages outside the Node ecosystem, but we had great success programming in Node. (Coming from Java, Stephen thought Node was horrible at first, but after about 90 days he learned to love it.) Node is lightweight, allows us to develop quickly, and npm has incredible packages.

Cons of Building a Database in Node.js

  • At the time, Node was not accepted as an “enterprise grade language”
  • Does not have direct control of the operating system/file system
  • Not as performant as C/C++
  • Did not have native threading (now it does)

We did have some troubles. Being the first database written in Node.js, we didn’t have the option to follow in anyone’s footsteps. We’re probably one of the first enterprise products ever built in Node, or at least the most data-centric one. People questioned this. One guy told Stephen that he would rather cut his heart out with a spoon than program a database in Node.js. Now people have realized this was a great idea, because our product has all these incredible features that we didn’t have to build ourselves and that are inherent in what we do. We did run into challenges around not having direct control of the operating system and file system. Also, C/C++ are faster, but they can be more complicated and are not necessarily as easy to scale horizontally. It really depends on whether you’re looking for vertical or horizontal scaling.

Tech Stack

[Image: HarperDB tech stack diagram]

This is what our tech stack looks like. We consider our Management Studio to be part of the HarperDB stack, and that is built in React with a Node back end. The green box signifies any application built on top of HarperDB, for example our Node-RED node can be used to build custom workflows. The HarperDB technology is built entirely in Node.js, which encompasses our interfaces and HarperDB core.

Our product presents itself as a REST API which, under the hood, is essentially just an Express application. That’s the primary interface for interacting with HarperDB. Our NoSQL parser is a custom solution we built internally. For SQL parsing we use AlaSQL, an amazing npm package, and we extend its functionality with custom code on top. We offer drivers, like ODBC and JDBC, built by a partner of ours. Finally, we use SocketCluster for distributed computing and clustering, which our CTO will be presenting on in a couple of weeks.

The HarperDB core technology encompasses the “secret sauce.” This is what makes it possible for us to be fully indexed with no data duplication and offer various interface options to a single data model. Within the core there are numerous npm packages implemented to extend our functionality.

Finally we have various options for storage media. We bundle LMDB by default as it provides significant performance gains over the other options. HarperDB core contains extensible code that allows us to add additional storage media options in the future.

REST API

  • HarperDB is a set of microservices
  • A single endpoint
  • All operations are POST requests
  • Stateless/RESTful

[Image: sample HarperDB API request]

(Sample code found at https://docs.harperdb.io/)

At a former company our team dealt with the headache of hundreds of APIs with different endpoints, which was simply insane. People might think it’s weird that HarperDB is just one endpoint, but if you look at the sample code, all you ever have to change for each operation is the body, those first few lines. This is super simple, and it makes writing a REST-based application really straightforward. This is something you can take from us and use in any application! Basically you post a single message to the API, we see what operation you’re performing, and we handle it with a standard set of methods. We’ve rewritten a lot of our application over the last couple of years, but this part has stayed mostly the same.
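The single-endpoint idea can be sketched roughly as follows. This is a minimal illustration, not HarperDB's implementation: the operation names (`create_table`, `insert`, `search`), the body fields, and the in-memory `tables` map are all assumptions made up for the example; the real operations are listed in the HarperDB docs. The point is only that one function receives every request body and picks a handler from the `operation` field.

```javascript
// Hypothetical sketch: one entry point, with the operation named in the body.
// Operation names and body fields are illustrative, not HarperDB's real API.
const tables = new Map(); // table name -> array of records (in-memory stand-in)

const handlers = {
  create_table({ table }) {
    if (tables.has(table)) throw new Error(`table '${table}' already exists`);
    tables.set(table, []);
    return { message: `table '${table}' created` };
  },
  insert({ table, records }) {
    const rows = tables.get(table);
    if (!rows) throw new Error(`unknown table '${table}'`);
    rows.push(...records);
    return { message: `inserted ${records.length} records` };
  },
  search({ table, attribute, value }) {
    const rows = tables.get(table);
    if (!rows) throw new Error(`unknown table '${table}'`);
    return rows.filter((row) => row[attribute] === value);
  },
};

// Every request goes through this one function; only the body changes.
function handleRequest(body) {
  const known = Object.prototype.hasOwnProperty.call(handlers, body.operation);
  if (!known) {
    return { status: 400, body: { error: `unknown operation '${body.operation}'` } };
  }
  try {
    return { status: 200, body: handlers[body.operation](body) };
  } catch (err) {
    return { status: 400, body: { error: err.message } };
  }
}

// Example usage: three different operations, same entry point.
handleRequest({ operation: 'create_table', table: 'dog' });
handleRequest({ operation: 'insert', table: 'dog', records: [{ id: 1, name: 'Penny' }] });
console.log(handleRequest({ operation: 'search', table: 'dog', attribute: 'name', value: 'Penny' }));
```

In a web server, a function like `handleRequest` would sit behind a single POST route, so adding an operation means adding a handler rather than a new endpoint.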

Management Studio

  • Built on the HarperDB REST API
  • Written in React
  • Allows for control of your HarperDB instances via GUI

The HarperDB Management Studio is a React front end built on top of our microservices (so we eat our own dog food). One awesome thing about JavaScript is how lightweight it is, regardless of what framework you’re using (Node, React, etc.), and you can easily couple these different layers together. React is amazing: it has changed the quality of front end development and allows us to make our application more accessible. By building the Studio on top of our own REST API, we’re also testing our own APIs at the same time, which makes it really powerful. Jaxon, our VP of Product, chose React for the Studio, while Stephen wrote our back end reporting in Express.

AlaSQL

We chose AlaSQL for HarperDB’s back end SQL functionality. It has some great things in it that we don’t, and it allows us to wire in things like Math.js and GeoJSON, so it’s an incredible package. One amazing benefit of using Node for a project like this is that, as technology advances, most of the cool stuff that you want and need is on npm. If we had to build our own SQL parser we’d probably still be building HarperDB. It took one of our competitors, FaunaDB, about 4 years just to get to market, but we launched the beta of our product in 6 months, the original version in 12 months, and we just released our cloud product a few months ago (about 3 years later). We’re not saying we’re geniuses, but by developing in Node we got to stand on the shoulders of people like the AlaSQL developers, which is what we find amazing about the npm community.

Math.js

  • HarperDB uses math.js functions inside our SQL
  • Allows for enhanced math capability while leveraging the capabilities of the npm community

Math.js is another incredible package for things like averages, data science, etc., that we wired into our SQL capability. It’s not hard to use and is very powerful in combination with AlaSQL.

Clustering/Replication

  • Built on SocketCluster.io
  • Fault tolerant
  • Peer-to-peer
  • Table level replication
  • Globally shared schema
  • Distributed computing

Another very cool feature of building something in Node.js is that it’s stateless by nature, meaning it does not need to hold data in memory that is critical to serving clients across sessions, which is very resource efficient. Most enterprise grade applications have background processes and stateful variables that can become highly unstable. Node is stateless, designed for the web, and designed to scale horizontally and to be peer-to-peer. An amazing benefit of building on Node is that we were able to wire in SocketCluster to power our clustering and replication. HarperDB uses a simple pub-sub model: we replicate data by publishing it to different chat rooms, which different nodes subscribe to, so data can be distributed horizontally. Node can be horizontally scalable and less resource intensive than other languages, and its stateless nature makes it incredibly stable. By putting Node on lots of computers (horizontally scaling) you can make the system significantly more powerful while driving down costs, having easier development, and being part of an awesome community.
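As a rough sketch of the pub-sub idea, the example below uses Node's built-in `EventEmitter` as a stand-in for the message layer (the role SocketCluster plays in the text). Everything here is an assumption for illustration: the `DbNode` class, the one-channel-per-table naming, and the in-memory tables are hypothetical and do not reflect HarperDB's actual replication code. It shows table-level replication: only nodes subscribed to a table's channel receive its writes.

```javascript
const { EventEmitter } = require('events');

// Stand-in for the pub-sub layer; in a real cluster this would be a network broker.
const broker = new EventEmitter();

// Hypothetical database node that keeps tables in memory.
class DbNode {
  constructor(name) {
    this.name = name;
    this.tables = new Map(); // table name -> Map(id -> record)
  }

  // Subscribe to a table's channel so writes published by peers are applied locally.
  subscribe(table) {
    broker.on(`table:${table}`, ({ origin, record }) => {
      if (origin !== this.name) this.apply(table, record);
    });
  }

  // Store a record in the local copy of the table.
  apply(table, record) {
    if (!this.tables.has(table)) this.tables.set(table, new Map());
    this.tables.get(table).set(record.id, record);
  }

  // Write locally, then publish so subscribed nodes replicate the record.
  write(table, record) {
    this.apply(table, record);
    broker.emit(`table:${table}`, { origin: this.name, record });
  }
}

// Example usage: nodeB replicates the "dog" table, nodeC does not.
const nodeA = new DbNode('a');
const nodeB = new DbNode('b');
const nodeC = new DbNode('c');
nodeB.subscribe('dog');
nodeA.write('dog', { id: 1, name: 'Penny' });
console.log(nodeB.tables.get('dog').get(1)); // the record written on nodeA
console.log(nodeC.tables.has('dog'));        // false: nodeC never subscribed
```

Choosing one channel per table keeps replication opt-in at the table level: a node only pays for the data it subscribes to.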

LMDB & File System

  • Originally built our exploded data model on the file system
  • Problematic due to the generation of many files taking up inodes and excess disk space, and other issues
  • Rebuilt data model on LMDB
  • Massive performance gain

[Image: HarperDB data model]

Originally we used the file system directly with the HarperDB data model shown above, which is what makes the product unique. As data comes in, we map it to our data model; it’s not a SQL engine or a NoSQL engine. We exploded the data into individual attributes and stored them in a folder structure on the file system. We store each thing atomically, and you can query via SQL and NoSQL. We did run into some challenges at scale, so more recently we wired in LMDB, a key-value store that we operate on top of. We were able to implement our exact data model on top of it, and it has provided incredible performance gains. In a recent benchmark we were about 37 times faster than MongoDB, largely thanks to LMDB.
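To make the "exploded" idea more concrete, here is a small sketch of splitting a record into one entry per attribute on top of a key-value store. It is only an illustration under stated assumptions: a plain `Map` stands in for LMDB, and the `table/attribute/id` key format and the function names (`explode`, `readRecord`, `searchByValue`) are hypothetical, not HarperDB's actual layout.

```javascript
// Hypothetical sketch of an "exploded" record layout.
// A Map stands in for a key-value store such as LMDB; the key format is assumed.
const store = new Map();

// Split a record into one entry per attribute, keyed by table, attribute, and id.
function explode(table, record, hashAttribute = 'id') {
  const id = record[hashAttribute];
  for (const [attribute, value] of Object.entries(record)) {
    store.set(`${table}/${attribute}/${id}`, value);
  }
}

// Rebuild a (partial) record by reading only the requested attributes for one id.
function readRecord(table, id, attributes) {
  const record = {};
  for (const attribute of attributes) {
    const key = `${table}/${attribute}/${id}`;
    if (store.has(key)) record[attribute] = store.get(key);
  }
  return record;
}

// Find ids whose attribute equals a value by checking entries under that
// attribute's key prefix. This Map version iterates every entry; an ordered
// store could range-scan the prefix instead.
function searchByValue(table, attribute, value) {
  const prefix = `${table}/${attribute}/`;
  const ids = [];
  for (const [key, stored] of store) {
    if (key.startsWith(prefix) && stored === value) ids.push(key.slice(prefix.length));
  }
  return ids;
}

// Example usage
explode('dog', { id: 1, name: 'Penny', breed: 'Lab' });
explode('dog', { id: 2, name: 'Kato', breed: 'Lab' });
console.log(readRecord('dog', 1, ['name']));
console.log(searchByValue('dog', 'breed', 'Lab'));
```

Because each attribute value is stored once under its own key, any attribute can be searched without keeping a separate copy of the record in an index.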

Translated from: https://medium.com/the-programming-hub/a-database-written-in-node-js-from-the-ground-up-ecb84ae3ddc8
