A Flood of Data

by BerkeleyTrue

Free Code Camp’s data has been doubling each month, thanks to a flood of highly-active campers. This rising tide of data has exposed several weaknesses in our codebase.

What started out 15 months ago as a small effort has since grown into a vibrant open source community. Nearly 300 contributors have stepped in to help us rapidly build features.

As usual, maintaining that break-neck speed of development comes at a price. We’ve incurred a lot of technical debt.

Taking on technical debt is like playing Jenga — you can build your tower taller and taller, but at a cost of stability. Sooner or later, you have to pay down your technical debt, or your tower will come crashing down.

Last week, our technical debt came back to bite us in our back end — both literally and figuratively.

During peak times, our MongoDB servers maxed out their capacity, and the rate at which they sent data back and forth to our Node servers slowed to a crawl. We needed to fix this, and fast. But first, we had to figure out what was causing the issue to begin with.

We originally wrote most of our back end in a bleary-eyed crunch mode. We didn’t take the time to optimize our queries. Instead we chose to focus on features that we thought would more immediately impact our user experience.

We audited our codebase and found a lot of frequent, inefficient writes to our databases. For instance, every time a camper completed a challenge, we would make the appropriate changes to their user instance, then call the “save” action. This caused the entire user object to be sent from our Node servers to our MongoDB servers, which then had to reconcile all of the data.

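To make the cost concrete, here's a minimal sketch of why full-document saves got expensive for us. The user shape below is purely illustrative, not Free Code Camp's actual schema — the point is the size difference between serializing the whole user and sending only the delta:

```javascript
// Hypothetical user object whose completedChallenges array has grown large.
const user = {
  _id: "abc123",
  username: "camper",
  completedChallenges: Array.from({ length: 500 }, (_, i) => ({
    id: `challenge-${i}`,
    solution: "function solve() { /* ...camper's code... */ }",
    completedDate: 1438500000000 + i,
  })),
};

// A "save" serializes and ships the entire document on every write.
const savePayload = JSON.stringify(user);

// An "update" ships only the change (MongoDB-style $push shown for illustration).
const updatePayload = JSON.stringify({
  $push: { completedChallenges: user.completedChallenges[0] },
});

console.log(`save sends ~${savePayload.length} bytes`);
console.log(`update sends ~${updatePayload.length} bytes`);
```

With hundreds of completed challenges per user, the full-save payload dwarfs the update payload — and that gap is paid on every single write.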

This wasn’t a problem initially, because most of our user objects were small. But as we added features, user objects ballooned in size, causing lots more data to flow back and forth.

We also used to save every solution that a camper submitted. This resulted in even larger completedChallenge arrays, which further exacerbated the back-and-forth.

On top of that, this meant some campers had to search through multiple solutions for the same challenge to find the one they wanted to reference. While this may have been an interesting exercise for some, it was a distraction from actually coding and building projects.

Our fix involved taking two steps:

  1. finding the heavily trafficked API endpoints that caused the database write, and changing them from a “save” action to an “update” action (which minimizes the volume of data sent over the wire).

  2. transitioning the way we store completed challenges away from a giant array, and over to a key-value map.

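Step 1 might look like the sketch below: instead of mutating the user instance and calling `save()`, build a targeted update document so only the changed field crosses the wire. The field names and the `User` model are assumptions for illustration, not Free Code Camp's actual schema:

```javascript
// Build a MongoDB update document that sets a single entry in a
// key-value map of completed challenges. Dot notation targets one
// field, so the rest of the user document never leaves the server.
function buildChallengeUpdate(challengeId, solution) {
  return {
    $set: {
      [`completedChallenges.${challengeId}`]: {
        solution,
        completedDate: Date.now(),
      },
    },
  };
}

// Usage with a Mongoose-style model (hypothetical):
//   User.updateOne({ _id: userId }, buildChallengeUpdate(id, code));
const update = buildChallengeUpdate("bonfire-meet-bonfire", "function meet(){}");
console.log(Object.keys(update.$set)[0]);
// → "completedChallenges.bonfire-meet-bonfire"
```

The dot-notation path is what keeps the payload small: MongoDB reconciles one nested field rather than the entire user object.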

This way, a camper could only have one solution for each challenge. This dramatically reduced the size of the completedChallenges object.

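A migration from the old array to the new map might be sketched like this (field names are illustrative; the "keep the most recent solution" rule is an assumption about how duplicates were resolved):

```javascript
// Collapse an array of completed-challenge records into a key-value
// map keyed by challenge id, keeping one solution per challenge.
function toChallengeMap(completedChallenges) {
  const map = {};
  for (const challenge of completedChallenges) {
    const existing = map[challenge.id];
    // A later submission overwrites an earlier one for the same challenge.
    if (!existing || challenge.completedDate >= existing.completedDate) {
      map[challenge.id] = challenge;
    }
  }
  return map;
}

const history = [
  { id: "bonfire-1", solution: "v1", completedDate: 1 },
  { id: "bonfire-1", solution: "v2", completedDate: 2 },
  { id: "bonfire-2", solution: "v1", completedDate: 3 },
];
const map = toChallengeMap(history);
console.log(Object.keys(map).length); // → 2
console.log(map["bonfire-1"].solution); // → "v2"
```

Besides shrinking the stored object, the map makes lookups direct: one key per challenge, no scanning through duplicate entries.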

We pushed up our fix in the middle of a Thursday afternoon, even though we had about 400 concurrent campers at the time. It was a gamble, but it paid off. We immediately saw an improvement in our CPU usage.

The big takeaway is this: if your application seems to be getting slower, there’s a good chance that this is caused by inefficient database queries.

If you can find these and fix them, you’ll be able to postpone expensive expansions to your infrastructure, while maintaining the speed your users have come to expect.

Translated from: https://www.freecodecamp.org/news/a-flood-of-data-714f287d75a0/
