by Haseeb Qureshi

That time I had to crack my own Reddit password

(Kinda.)

I have no self-control.

Luckily, I know this about myself. This allows me to consciously engineer my life so that despite having the emotional maturity of a heroin-addicted lab rat, I’m occasionally able to get things done.

I waste a lot of time on Reddit. If I want to procrastinate on something, I’ll often open a new tab and dive down a Reddit-hole. But sometimes you need to turn on the blinders and dial down distractions. 2015 was one of these times — I was singularly focused on improving as a programmer, and Redditing was becoming a liability.

I needed an abstinence plan.

So it occurred to me: how about I lock myself out of my account?

Here’s what I did:

I set a random password on my account. Then I asked a friend to e-mail me this password on a certain date. With that, I’d have a foolproof way to lock myself out of Reddit. (Also changed the e-mail for password recovery to cover all the bases.)

This should have worked.

Unfortunately it turns out, friends are very susceptible to social engineering. The technical terminology for this is that they are “nice to you” and will give you back your password if you “beg them.”

After a few rounds of this failure mode, I needed a more robust solution. A little Google searching, and I came across LetterMeLater.com, a site that lets you schedule e-mails to be sent at any future date and time you choose.

Perfect — an automated, friend-less solution! (I’d alienated most of them by now, so that was a big selling point.)

A bit sketchy looking, but hey, any port in a storm.

For a while I set up this routine: during the week I’d e-mail myself my password, on the weekends I’d receive the password, load up on internet junk food, and then lock myself out again once the week began. It worked quite well from what I remember.

Eventually I got so busy with programming stuff, I completely forgot about it.

Cut to two years later.

I’m now gainfully employed at Airbnb. And Airbnb, it so happens, has a large test suite. This means waiting, and waiting of course means internet rabbit holes.

I decide to scrounge up my old account and find my Reddit password.

Oh no. That’s not good.

I didn’t remember doing this, but I must have gotten so fed up with myself that I locked myself out until 2018. I also set it to “hide,” so I couldn’t view the contents of the e-mail until it was sent.

What do I do? Do I just have to create a new Reddit account and start from scratch? But that’s so much work.

I could write in to LetterMeLater and explain that I didn’t mean to do this. But they would probably take a while to get back to me. We’ve already established I’m wildly impatient. Plus this site doesn’t look like it has a support team. Not to mention it would be an embarrassing e-mail exchange. I started brainstorming elaborate explanations involving dead relatives about why I needed access to the e-mail…

All of my options were messy. I was walking home that night from the office pondering my predicament, when suddenly it hit me.

The search bar.

I pulled up the app on my mobile phone and tried it:

Hmm.

Okay. So it’s indexing the subject for sure. What about the body?

I try a few letters, and voila. It’s definitely got the body indexed. Remember: the body consisted entirely of my password.

Essentially, I’ve been given an interface to perform substring queries. By entering in a string into the search bar, the search results will confirm whether my password contains this substring.

We’re in business.

I hurry into my apartment, drop my bag, and pull out my laptop.

Algorithms problem: you are given a function substring?(str), which returns true or false depending on whether a password contains any given substring. Given this function, write an algorithm that can deduce the hidden password.

The Algorithm

So let’s think about this. A few things I know about my password: I know it was a long string with some random characters, probably something along the lines of asgoihej2409g. I probably didn’t include any upper-case characters (and Reddit doesn’t enforce that as a password constraint) so let’s assume for now that I didn’t — in case I did, we can just expand the search space later if the initial algorithm fails.

We also have a subject line as part of the string we’re querying. And we know the subject is “password”.

Let’s pretend the body is 6 characters long. So we’ve got six slots of characters, some of which may appear in the subject line, some of which certainly don’t. So if we take all of the characters that aren’t in the subject and try searching for each of them, we know for sure we’ll hit a unique letter that’s in the password. Think like a game of Wheel of Fortune.

We keep trying letters one by one until we hit a match for something that’s not in our subject line. Say we hit it.

Once I’ve found my first letter, I don’t actually know where in this string I am. But I know I can start building out a bigger substring by appending different characters to the end of this until I hit another substring match.

We’ll potentially have to iterate through every character in our alphabet to find it. Any of those characters could be correct, so on average it’ll hit somewhere around the middle, so given an alphabet of size A, it should average out to A/2 guesses per letter (let’s assume the subject is small and there are no repeating patterns of 2+ characters).

I’ll keep building this substring until it eventually hits the end and no characters can extend it further.

But that’s not enough — most likely, there will be a prefix to the string that I missed, because I started in a random place. Easy enough: all I have to do is now repeat the process, except going backwards.

Once the process terminates, I should be able to reconstruct the password. In total, I’ll need to figure out L characters (where L is the length), and need to expend on average A/2 guesses per character (where A is the alphabet size), so total guesses = A/2 * L.

To be precise, I also have to add another 2A to the number of guesses for ascertaining that the string has terminated on each end. So the total is A/2 * L + 2A, which we can factor as A(L/2 + 2).

Let’s assume we have 20 characters in our password, and an alphabet consisting of a-z (26) and 0–9 (10), so a total alphabet size of 36. So we’re looking at an average of 36 * (20/2 + 2) = 36 * 12 = 432 iterations.
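
As a quick sanity check on that arithmetic, here is a minimal sketch of the estimate as a Ruby function. The name expected_guesses and its parameter names are made up for illustration; the formula itself is just the A/2 * L + 2A from above.

```ruby
# Expected number of oracle queries for the scheme described above.
# alphabet_size is A and password_length is L, following the article's symbols.
def expected_guesses(alphabet_size, password_length)
  per_character = alphabet_size / 2.0  # average guesses to find each character
  termination   = 2 * alphabet_size    # one full failed sweep at each end of the string
  per_character * password_length + termination
end

puts expected_guesses(36, 20)  # prints 432.0
```

The same function gives 468.0 for a 22-character password, so the estimate grows by A/2 = 18 queries for every extra character.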

Damn.

This is actually doable.

The Implementation

First things first: I need to write a client that can programmatically query the search box. This will serve as my substring oracle. Obviously this site has no API, so I’ll need to scrape the website directly.

Looks like the URL format for searching is just a simple query string, www.lettermelater.com/account.php?qe=#{query_here}. That’s easy enough.

Let’s start writing this script. I’m going to use the Faraday gem for making web requests, since it has a simple interface that I know well.

I’ll start by making an API class.

Of course, we don’t expect this to work yet, as our script won’t be authenticated into any account. As we can see, the response returns a 302 redirect with an error message provided in the cookie.

[10] pry(main)> Api.get("foo")
=> #<Faraday::Response:0x007fc01a5716d8
...
{"date"=>"Tue, 04 Apr 2017 15:35:07 GMT",
"server"=>"Apache",
"x-powered-by"=>"PHP/5.2.17",
"set-cookie"=>"msg_error=You+must+be+signed+in+to+see+this+page.",
"location"=>".?pg=account.php",
"content-length"=>"0",
"connection"=>"close",
"content-type"=>"text/html; charset=utf-8"},
status=302>

So how do we sign in? We need to send in our cookies in the header, of course. Using Chrome inspector we can trivially grab them.

(Not going to show my real cookie here, obviously. Interestingly, looks like it’s storing user_id client-side which is always a great sign.)

Through process of elimination, I realize that it needs both code and user_id to authenticate me… sigh.

So I add these to the script. (This is a fake cookie, just for illustration.)

[29] pry(main)> Api.get("foo")
=> "\n<!DOCTYPE HTML PUBLIC \"-//W3C//DTD HTML 4.01//EN\" \"http://www.w3.org/TR/html4/strict.dtd\">\n<html>\n<head>\n\t<meta http-equiv=\"content-type\" content=\"text/html; charset=UTF-8\" />\n\t<meta name=\"Description\" content=\"LetterMeLater.com allows you to send emails to anyone, with the ability to have them sent at any future date and time you choose.\" />\n\t<meta name=\"keywords\" content=\"schedule email, recurring, repeating, delayed, text messaging, delivery, later, future, reminder, date, time, capsule\" />\n\t<title>LetterMeLater.com — Account Information</title>…

[30] pry(main)> _.include?("Haseeb")
=> true

It’s got my name in there, so we’re definitely logged in!

We’ve got the scraping down, now we just have to parse the result. Luckily, this is pretty easy: we know it’s a hit if the e-mail result shows up on the page, so we just need to look for any string that’s unique when the result is present. The string “password” appears nowhere else, so that will do just nicely.

That’s all we need for our API class. We can now do substring queries entirely in Ruby.
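
For readers following along, here is a minimal sketch of what such a class could look like. It is not the original script: it swaps Faraday for Ruby's standard-library Net::HTTP so it has no gem dependency, it assumes plain http:// for the URL, and the cookie values, constant names, and the search_uri and hit? helpers are all placeholders invented for illustration.

```ruby
require 'net/http'
require 'uri'

class Api
  # Assumption: the article gives the host and path but no scheme; http is a guess.
  BASE_URL = 'http://www.lettermelater.com/account.php'
  # Placeholder cookie. Substitute the real `code` and `user_id` values
  # copied from the browser's inspector.
  COOKIE = 'code=FAKE_CODE; user_id=FAKE_USER_ID'
  # A string that only shows up on the page when the search finds the e-mail.
  HIT_MARKER = 'password'

  # Build the search URL, escaping the query for use in a query string.
  def self.search_uri(query)
    uri = URI(BASE_URL)
    uri.query = URI.encode_www_form(qe: query)
    uri
  end

  # Fetch the search results page as an HTML string, sending the session cookie.
  def self.get(query)
    uri = search_uri(query)
    request = Net::HTTP::Get.new(uri)
    request['Cookie'] = COOKIE
    response = Net::HTTP.start(uri.hostname, uri.port) { |http| http.request(request) }
    response.body
  end

  # Pure parsing step: does the returned page contain the marker string?
  def self.hit?(html)
    html.include?(HIT_MARKER)
  end

  # The substring oracle: true if searching for `query` turns up the e-mail.
  def self.include?(query)
    hit?(get(query))
  end
end
```

Keeping the parsing step (hit?) separate from the network call means the matching logic can be checked against a saved page without hitting the site.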

[31] pry(main)> Api.include?('password')
=> true
[32] pry(main)> Api.include?('f')
=> false
[33] pry(main)> Api.include?('g')
=> true

Now that we know that works, let’s stub out the API while we develop our algorithm. Making HTTP requests is going to be really slow and we might trigger some rate-limiting as we’re experimenting. If we assume our API is correct, once we get the rest of the algorithm working, everything should just work once we swap the real API back in.

So here’s the stubbed API, with a random secret string:
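
A stub along these lines would do the job. This is a sketch rather than the original code: the constant names are assumptions, the secret is simply the test string that shows up in the runs further down, and matching against the subject as well as the body is an assumption made to mimic how the real search behaves.

```ruby
class ApiStub
  # The real search indexes the subject line too, so the stub mimics that.
  SUBJECT = 'password'
  # Arbitrary test secret, not a real password.
  SECRET_PASSWORD = 'g420hpfjpefoj490rjgsd'

  # Same interface as the real oracle: true if the query appears in the e-mail.
  def self.include?(query)
    SUBJECT.include?(query) || SECRET_PASSWORD.include?(query)
  end
end

puts ApiStub.include?('password')  # true (matches the subject)
puts ApiStub.include?('490')       # true
puts ApiStub.include?('zz')        # false
```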

We’ll inject the stubbed API into the class while we’re testing. Then for the final run, we’ll use the real API to query for the real password.

So let’s get started with this class. From a high level, recalling my algorithm diagram, it goes in three steps:

1. Find the first letter that’s not in the subject but exists in the password. This is our starting-off point.
2. Build those letters forward until we fall off the end of the string.
3. Build that substring backwards until we hit the beginning of the string.

Then we’re done!

Let’s start with initialization. We’ll inject the API, and other than that we just need to initialize the current password chunk to be an empty string.

Now let’s write three methods, following the steps we outlined.

Perfect. Now the rest of the implementation can take place in private methods.

For finding the first letter, we need to iterate over each character in the alphabet that’s not contained in the subject. To construct this alphabet, we’re going to use a-z and 0–9. Ruby allows us to do this pretty easily with ranges:

ALPHABET = (('a'..'z').to_a + ('0'..'9').to_a).shuffle

I prefer to shuffle this to remove any bias in the password’s letter distribution. This will make our algorithm query A/2 times on average per character, even if the password is non-randomly distributed.

We also want to set the subject as a constant:

SUBJECT = 'password'

That’s all the setup we need. Now time to write find_starting_letter. This needs to iterate through each candidate letter (in the alphabet but not in the subject) until it finds a match.
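
Here is a rough standalone sketch of that step, written as a plain function rather than the private method on the cracker class. The oracle parameter is an assumption: it stands for anything that responds to include?, and a plain String is used as the stand-in because it already does.

```ruby
ALPHABET = (('a'..'z').to_a + ('0'..'9').to_a).shuffle
SUBJECT = 'password'

# Return the first candidate character that the oracle confirms, or nil if none match.
# Characters that appear in the subject are skipped, since a hit on them proves nothing.
def find_starting_letter(oracle)
  candidates = ALPHABET.reject { |char| SUBJECT.include?(char) }
  candidates.find { |char| oracle.include?(char) }
end

# Stand-in oracle: a String responds to include? just like the stub would.
secret = 'g420hpfjpefoj490rjgsd'
puts find_starting_letter(secret)
```

Because the alphabet is shuffled, the letter printed varies between runs, but it is always a character of the secret that does not occur in the subject.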

In testing, looks like this works perfectly:

PasswordCracker.new(ApiStub).send(:find_starting_letter!) # => 'f'

Now for the heavy lifting.

I’m going to do this recursively, because it makes the structure very elegant.
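
A minimal sketch of the recursive forward build, assuming the same shuffled alphabet. It is written as a free function that takes the oracle and the current chunk as arguments (an illustrative choice, not necessarily how the original method was shaped), and again uses a String as the stand-in oracle.

```ruby
ALPHABET = (('a'..'z').to_a + ('0'..'9').to_a).shuffle

# Recursively extend `chunk` to the right until no character extends it.
def build_forward(oracle, chunk)
  next_char = ALPHABET.find { |char| oracle.include?(chunk + char) }
  return chunk if next_char.nil?  # nothing matched: we fell off the end of the string
  build_forward(oracle, chunk + next_char)
end

secret = 'g420hpfjpefoj490rjgsd'
puts build_forward(secret, 'f')  # prints a suffix of the secret that starts with "f"
```

Since "f" occurs twice in this secret, the shuffle decides which occurrence gets extended; either way the result runs to the end of the string.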

The code is surprisingly straightforward. Let’s see if it works with our stub API.

[63] pry(main)> PasswordCracker.new(ApiStub).crack!
fj
fjp
fjpe
fjpef
fjpefo
fjpefoj
fjpefoj4
fjpefoj49
fjpefoj490
fjpefoj490r
fjpefoj490rj
fjpefoj490rjg
fjpefoj490rjgs
fjpefoj490rjgsd
=> "fjpefoj490rjgsd"

Awesome. We’ve got a suffix, now just to build backward and complete the string. This should look very similar.

In fact, there are only two lines of difference here: how we construct the guess, and the name of the recursive call. There’s an obvious refactoring here, so let’s do it.

Now these other calls simply reduce to:
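
Putting the pieces together, the whole class after the refactor might look roughly like this. It is a sketch under assumptions, not the original listing: the shared build! helper and its forward: keyword are illustrative names, and a plain String stands in for the stubbed API since it responds to include? the same way.

```ruby
class PasswordCracker
  ALPHABET = (('a'..'z').to_a + ('0'..'9').to_a).shuffle
  SUBJECT = 'password'

  # `api` is anything that responds to include?(string).
  def initialize(api)
    @api = api
    @password = ''
  end

  def crack!
    find_starting_letter!
    build_forward!
    build_backward!
    @password
  end

  private

  # Step 1: find a character of the password that is not in the subject.
  def find_starting_letter!
    candidates = ALPHABET.reject { |char| SUBJECT.include?(char) }
    letter = candidates.find { |char| @api.include?(char) }
    raise 'no character outside the subject matched' if letter.nil?
    @password = letter
  end

  # Shared recursive helper: the direction only changes how the guess is built.
  def build!(forward:)
    ALPHABET.each do |char|
      guess = forward ? @password + char : char + @password
      next unless @api.include?(guess)
      puts "Current password: #{guess}"
      @password = guess
      return build!(forward: forward)  # keep extending in the same direction
    end
  end

  # Steps 2 and 3 reduce to one-line calls.
  def build_forward!
    build!(forward: true)
  end

  def build_backward!
    build!(forward: false)
  end
end

puts PasswordCracker.new('g420hpfjpefoj490rjgsd').crack!  # last line printed: g420hpfjpefoj490rjgsd
```

The intermediate lines differ from run to run because of the shuffle, but the final result does not: every confirmed guess is a genuine substring, so the forward pass can only stop at the end of the string and the backward pass at its beginning.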

And let’s see how it works in action:

Apps-MacBook:password-recovery haseeb$ ruby letter_me_now.rb
Current password: 9
Current password: 90
Current password: 90r
Current password: 90rj
Current password: 90rjg
Current password: 90rjgs
Current password: 90rjgsd
Current password: 90rjgsd
Current password: 490rjgsd
Current password: j490rjgsd
Current password: oj490rjgsd
Current password: foj490rjgsd
Current password: efoj490rjgsd
Current password: pefoj490rjgsd
Current password: jpefoj490rjgsd
Current password: fjpefoj490rjgsd
Current password: pfjpefoj490rjgsd
Current password: hpfjpefoj490rjgsd
Current password: 0hpfjpefoj490rjgsd
Current password: 20hpfjpefoj490rjgsd
Current password: 420hpfjpefoj490rjgsd
Current password: g420hpfjpefoj490rjgsd
g420hpfjpefoj490rjgsd

Beautiful. Now let’s just add some more print statements and a bit of extra logging, and we’ll have our finished PasswordCracker.

And now… the magic moment. Let’s swap the stub with the real API and see what happens.

The Moment of Truth

Cross your fingers…

PasswordCracker.new(Api).crack!

Boom. 443 iterations.

Tried it out on Reddit, and login was successful.

Wow.

It… actually worked.

Recall our original formula for the number of iterations: A(L/2 + 2). The true password was 22 characters, so our formula would estimate 36 * (22/2 + 2) = 36 * 13 = 468 iterations. Our real password took 443 iterations, so our estimate was within about 6% of the observed runtime.

Math.

It works.

Embarrassing support e-mail averted. Reddit rabbit-holing restored. It’s now confirmed: programming is, indeed, magic.

(The downside is I am now going to have to find a new technique to lock myself out of my accounts.)

And with that, I’m gonna get back to my internet rabbit-holes. Thanks for reading, and give it a like if you enjoyed this!

—Haseeb