How to Be a Better Programmer: Bits and Bytes

In the past year, I've had two different developers come to me for help with problems that were actually quite simple. The underlying issue in both cases was that the developers had either never learned or had forgotten everything about working with the most basic unit of computing - the bit.

A lot of programming languages have gotten so high-level and easy to use that it's become easy to forget about things like bits. Why worry about those when your boss is asking you to fix a bug with e-mail address validation?

So you fix the bug and then your boss asks you to enforce IP filters so that only people with IP addresses within 71.23.45.0/24 can hit certain web pages, which prompts you to adopt a confused look. You've seen that kind of IP address notation before and you know it has something to do with masking, so you start looking for an explanation. When you find that people are talking about bitwise operators you get that dread in your stomach because dealing with bits is like a foreign language to you.

Here are three pieces of good news:

#1. Bits are a lot easier to understand than a foreign language.

#2. You probably already know a lot of the concepts but don't realize you know them.

#3. Understanding bits will make you a better programmer.

Let's dive in!

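Billions... Wasted!

Imagine that you're watching your favorite sport on TV and you glance at the scoreboard to see the score. It looks like your team has 122 points. Then you realize that the number "122" is split across 3 separate, large LCD screens: "1", "2", and "2". Wow, you think, what a waste of money. Each of those LCD screens probably costs hundreds of dollars, and they could easily have displayed the full number "122" on a single screen.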

Yet, this is precisely the kind of wasteful thing that many programmers do every day, and I'll give a more relevant example. Let's say that an average programmer named Matt is storing the ages of people in a database. Matt creates his database and creates a column that holds up to 3 characters (since most people will not live beyond 999 years old). All of his data seems to be stored perfectly well in the database and he can get it back out easily later on. He thinks everything is fine.

What he doesn't realize is that he's using up to 3 bytes of data for information that can be stored in 1 byte of data. That's up to 2 extra bytes of wasted data. It doesn't seem like much, but let's say that 5 years later, he has 10 million contacts in his database. Now that waste has multiplied into 20,000,000 bytes of wasted data! Now let's say that he has a backup system which stores up to 30 days of full backups. He now is storing 20 million wasted bytes * 30 days = 600,000,000 bytes of wasted data altogether - all because he didn't use the proper data type at the beginning.
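
The arithmetic above can be checked with a few lines of Python. This is only a sketch of the paragraph's own figures; it assumes one stored character costs one byte, and the variable names are made up for illustration.

```python
# Storage wasted by a 3-character text column versus a 1-byte integer.
# Assumption: each stored character takes exactly one byte.
BYTES_AS_TEXT = 3        # up to 3 characters per age
BYTES_AS_INTEGER = 1     # one byte holds 0..255
contacts = 10_000_000
backup_copies = 30

wasted_per_row = BYTES_AS_TEXT - BYTES_AS_INTEGER
wasted_in_table = wasted_per_row * contacts
wasted_in_backups = wasted_in_table * backup_copies

print(f"{wasted_in_table:,} bytes wasted in the table")
print(f"{wasted_in_backups:,} bytes wasted across the backups")
```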

Hopefully, most programmers will use an integer for an age column, but I've seen many people create 15-character text fields on their databases to store IP addresses like "123.145.167.189" - and that's just IPv4 addresses. We're entering the age of IPv6 addresses like "fe80:abcd:ef01:2345:6789:0abc:def0:1234", and I can already see people creating 39-character text fields to store that data. Programmers are doing this because they don't understand bits well enough to know how an IPv4 address can actually fit into 4 bytes instead of 15. 

Imagine Matt creating a database that stores access logs of a web server, and it stores the IP addresses as 15-character strings. Five years later, with 30 days of backups, we're talking about many gigabytes of wasted space and bandwidth.

Bits, Basements, and Bulbs

So let's talk about how Matt can improve his situation. The first step is to understand what a single byte can do, and you probably already know some or most of this.

A single byte can hold any number between 0 and 255. This is because a single byte is made up of 8 bits, and each bit is either a 0 (off) or a 1 (on).

An easier way to think about this is to imagine going into a completely dark basement. There are eight light switches on the wall, so you flip the one on the right, and a small lamp in the corner with one tiny lightbulb turns on. It makes the room a little brighter, but you want more brightness, so you flip the next switch (the second one from the right), and this time, a different lamp turns on. This new lamp has two lightbulbs in it, so now you have three lightbulbs turned on in the room.

You flip on the third light switch from the right, and an even bigger lamp turns on - this one has 4 lightbulbs in it, so now you have a total of 7 lightbulbs turned on.

You start realizing that the number of bulbs is doubling on each lamp as you move from the right to the left. So you flip the left-most light switch and suddenly the room gets really bright because a huge chandelier with (yes, you guessed it) 128 lightbulbs suddenly turns on. So now you have a total of 135 lightbulbs turned on.

You turn on all of the eight light switches, and you end up with every lamp and chandelier in the room turned on, with a total of 255 light bulbs (and a really bright basement).

This is precisely how a computer thinks about numbers. A byte is like a room with 8 light switches that goes from a total value of 0 (no switches turned on) all the way up to 255 (all switches turned on). So a value of zero looks like 00000000 while a value of 255 looks like 11111111. Or an easier way to look at it might be:

A byte value of zero has all bits set to off:

 0   0   0   0   0   0   0   0
___ ___ ___ ___ ___ ___ ___ ___
128  64  32  16  8   4   2   1

A byte value of 73 is made up of the bits for 64, 8, and 1 being turned on:

 0   1   0   0   1   0   0   1
___ ___ ___ ___ ___ ___ ___ ___
128  64  32  16  8   4   2   1

A byte value of 255 has all bits turned on:

 1   1   1   1   1   1   1   1
___ ___ ___ ___ ___ ___ ___ ___
128  64  32  16  8   4   2   1
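
The same three diagrams can be reproduced in code. The snippet below is a small Python sketch; `bits_of` and `switches_on` are hypothetical helper names chosen for this illustration, not part of any library.

```python
# The value of each "light switch", leftmost first, as in the diagrams.
PLACE_VALUES = [128, 64, 32, 16, 8, 4, 2, 1]

def bits_of(value):
    """Return the 8 bits of a byte value (0..255) as text, leftmost bit first."""
    if not 0 <= value <= 255:
        raise ValueError("a single byte holds 0..255")
    return format(value, "08b")

def switches_on(value):
    """List the place values whose bit is turned on."""
    return [place for place, bit in zip(PLACE_VALUES, bits_of(value)) if bit == "1"]

for number in (0, 73, 255):
    print(number, bits_of(number), switches_on(number))
```

Adding up the list returned by `switches_on` always gives back the original number, which is the lightbulb count from the basement.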

Simple enough, right? So what happens if we need to store the number 256?

We're Gonna Need a Bigger Boat

Great question, and I'm glad you asked! The easy answer is: use another byte! The harder question is how to use two bytes together. After all, each byte can only hold a value up to 255.

The answer is simply a matter of instructing the computer to treat both bytes as one, big number. If you think about it, we already do this every day with numbers. For example, the number 24 is technically two digits - 2 and 4. But when we read them together, the 2 isn't just a 2 - it represents 20. So when a computer needs to deal with numbers that are larger than a single byte, it simply needs to be told that "there are 2 bytes here, and they should be considered as one big number."

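Or Maybe Just Bigger Bits....

In programming languages, a single byte has 8 bits, so it's an 8-bit number. A two-byte number is a 16-bit number, commonly known as a "short". When a computer is told to treat 2 bytes as a 16-bit number, it keeps the same doubling pattern going, so one of the bytes suddenly takes on much bigger values. It looks like this:

_____ _____ _____ _____ _____ _____ _____ _____   ___ ___ ___ ___ ___ ___ ___ ___
32768 16384  8192  4096  2048  1024  512   256    128  64  32  16  8   4   2   1
                  Byte #1                                      Byte #2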

So by being told to use two bytes, a computer can handle numbers between 0 and 65,535. 
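
Here is one way to express "treat two bytes as one number" in Python. It is a sketch that follows the layout in the diagram, with Byte #1 carrying the larger place values; that byte order is an assumption of this example, and `combine` is a made-up name.

```python
def combine(byte1, byte2):
    """Treat two bytes as one 16-bit number; byte1 holds the larger place values."""
    for value in (byte1, byte2):
        if not 0 <= value <= 255:
            raise ValueError("each byte holds 0..255")
    # Moving byte1 eight places to the left makes its switches worth 256..32768.
    return (byte1 << 8) | byte2

smallest_two_byte_value = combine(1, 0)
largest_two_byte_value = combine(255, 255)

# The built-in int.from_bytes does the same job for any number of bytes.
same_thing = int.from_bytes(bytes([1, 0]), "big")

print(smallest_two_byte_value, largest_two_byte_value, same_thing)
```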

So there's really no difference in how the data is stored - every piece of data out there is just a series of 8-bit bytes. However, with a little bit of programming instructions, a computer can be told to sometimes use more than one byte in order to hold larger numbers.

This same concept extends into 32-bit numbers and 64-bit numbers. A 32-bit number, commonly known as a "long", is made up of four bytes and can hold numbers up to 4,294,967,295. A 64-bit number uses eight bytes and holds values up to 18,446,744,073,709,551,615.
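
Those maximums all come from the same rule: turn every switch on. A short Python sketch (the function name is hypothetical) shows the pattern:

```python
def largest_unsigned(bits):
    """Largest value that fits when all `bits` switches are turned on."""
    return (1 << bits) - 1

for bits in (8, 16, 32, 64):
    print(f"{bits:>2}-bit: 0 to {largest_unsigned(bits):,}")
```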

Bazillions: An IP Address Love Story

Now we're entering the age of 128-bit computing, and the first really common application of 128-bit numbers is the IPv6 standard. An IPv4 address is really a 32-bit number (again, a "long"), which allows up to about 4.2 billion IP addresses, whereas an IPv6 address is a 128-bit number. I'll let you ponder how many IP addresses IPv6 allows.

Some people might look at an IP address and say, "That's not a 32-bit number." What they are missing is that the whole "###.###.###.###" syntax is simply a visual representation of a long number. Each ### is simply the value of each byte. So "72.123.45.67" is actually:

Byte #1 = 72

Byte #2 = 123

Byte #3 = 45

Byte #4 = 67

If we actually take those 4 bytes and tell the computer to treat them as one big number, we get 1,216,032,067. Likewise, you can "convert" 1,216,032,067 back into that IP address at any time. So in reality, Average Matt could store all his IPv4 addresses as "long" numbers, taking up only 4 bytes of space each, instead of up to a whopping 15 bytes of text for each one.
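
The conversion in both directions takes only a few lines. The Python below is a minimal sketch; `ip_to_int` and `int_to_ip` are names invented for this example (Python's standard `ipaddress` module offers the same conversion if you would rather not write it yourself).

```python
def ip_to_int(address):
    """Pack a dotted IPv4 string into one 32-bit number."""
    parts = [int(part) for part in address.split(".")]
    if len(parts) != 4 or any(not 0 <= part <= 255 for part in parts):
        raise ValueError(f"not an IPv4 address: {address!r}")
    byte1, byte2, byte3, byte4 = parts
    return (byte1 << 24) | (byte2 << 16) | (byte3 << 8) | byte4

def int_to_ip(number):
    """Unpack a 32-bit number back into dotted notation."""
    return ".".join(str((number >> shift) & 0xFF) for shift in (24, 16, 8, 0))

as_number = ip_to_int("72.123.45.67")
print(as_number)
print(int_to_ip(as_number))
```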

An additional benefit to storing them as long numbers is that you can use math with numbers. Need to find out if 72.123.45.67 is between 70.1.1.1 and 82.1.2.3? You can either break up the strings into pieces and compare each set of numbers, or you can simply convert all three to long numbers and do simple less-than and greater-than comparisons.
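
As a sketch of that range check, the snippet below leans on Python's standard `ipaddress` module for the conversion and then uses plain comparisons. The `as_number` helper is a hypothetical name for this illustration.

```python
import ipaddress

def as_number(address):
    """Convert a dotted IPv4 string to its 32-bit integer value."""
    return int(ipaddress.IPv4Address(address))

low = as_number("70.1.1.1")
high = as_number("82.1.2.3")
candidate = as_number("72.123.45.67")

# Once everything is a number, the range test is ordinary arithmetic.
in_range = low <= candidate <= high
print(in_range)
```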

Just a Little Bit of Negativity

A point of interest: computers often need to work with negative numbers. So almost all numbers can be treated as either signed or unsigned. So far, we've been talking about unsigned numbers (numbers that have no possibility of a negative sign in front of them), so an 8-bit number goes from 0 to 255, a 16-bit number goes from 0 to 65,535, and so on.

When the computer is told to treat a number as a signed number, the maximum value is basically split in half, and you get half of it as negative, and half as positive. So a signed 8-bit number can go from -128 to 127. A signed 16-bit number can go from -32,768 to 32,767. One bit is used to indicate whether the number is negative or positive.

Some languages, like PHP, will default to using signed numbers all the time. So when integers are 32 bits wide (as on a 32-bit build of PHP) and you try to tell it to deal with a really large 32-bit number, like 3,000,000,000, it can only go up to 2,147,483,647 before it has nowhere else to go. So it will then "wrap around". So if you tell PHP to convert "127.255.255.255" to a long, it'll correctly tell you the result is 2147483647, which is the maximum value for a signed 32-bit number. However, if you just add 1 and tell it to convert "128.0.0.0" to a long, it will come back with -2147483648.
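
Python's own integers never wrap around, so the sketch below imitates the effect by reinterpreting the same 4 bytes as a signed value. It assumes the two's complement layout that virtually all current hardware uses; `as_signed_32` is a made-up helper name.

```python
def as_signed_32(number):
    """Reinterpret an unsigned 32-bit value as a signed (two's complement) one."""
    raw = number.to_bytes(4, "big")              # the same 4 bytes...
    return int.from_bytes(raw, "big", signed=True)  # ...read with a sign bit

top_of_range = as_signed_32(2_147_483_647)    # 127.255.255.255
wrapped = as_signed_32(2_147_483_648)         # 128.0.0.0
three_billion = as_signed_32(3_000_000_000)

print(top_of_range, wrapped, three_billion)
```

The bits are identical in both readings; only the instruction about how to interpret the leftmost bit has changed.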

So it can be helpful to understand what your programming language is doing for you whenever you ask it to deal with different types of numbers.

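Bit Maps and Other Creative Uses for Bits

Moving on, it can often be useful to work with bits directly. There are lots of applications and creative ways to use all 8 bits of a byte that aren't always about math. For example, let's say you wanted to record a complex daily work schedule in 5-minute increments, like this: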

My Billable Time

12:05 AM - 12:15 AM

12:30 AM - 12:35 AM

1:00 AM - 1:35 AM

Imagine you had a hundred entries in your log. Now, you could store the start and stop time for every entry. Or you could simply use bits to represent 5-minute blocks throughout the day:

Bit #1 = 12:00 AM - 12:05 AM

Bit #2 = 12:05 AM - 12:10 AM

Bit #3 = 12:10 AM - 12:15 AM

...etc...

So in the above example's first three entries, you could have a "bit map" that looks like this:

011000100000111111100000

The first bit is 0 because you didn't work from 12:00 AM to 12:05 AM. The next two bits are 1, indicating that you worked from 12:05 AM to 12:10 AM and from 12:10 AM to 12:15 AM, and so on.

So in 24 bits (3 bytes), we've described someone's activity in 5-minute increments for the first 2 hours of the day! You could do the same thing for an entire 24-hour period in only 36 bytes (288 five-minute blocks at 8 blocks per byte). For the purposes of comparison, that's the same number of bytes it takes to store the following part of this paragraph: "So in 24 bits (3 bytes), we've descr". So in the same few bytes that would be used to store that fragment of a sentence, you could store the entirety of someone's billable time with 5-minute accuracy!
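
To make the bit map concrete, here is a Python sketch that builds it from the three log entries. Times are written as minutes after midnight, which is a convention chosen for this example, and `billable_bitmap` is a hypothetical helper.

```python
BLOCK_MINUTES = 5

def billable_bitmap(intervals, total_minutes):
    """Return one character per 5-minute block: '1' if billed, '0' if not.

    intervals: (start, end) pairs in minutes after midnight, end exclusive.
    """
    bits = ["0"] * (total_minutes // BLOCK_MINUTES)
    for start, end in intervals:
        for block in range(start // BLOCK_MINUTES, end // BLOCK_MINUTES):
            bits[block] = "1"
    return "".join(bits)

# 12:05-12:15 AM, 12:30-12:35 AM, 1:00-1:35 AM
log = [(5, 15), (30, 35), (60, 95)]

bitmap = billable_bitmap(log, total_minutes=120)
packed = int(bitmap, 2).to_bytes(len(bitmap) // 8, "big")
full_day_bytes = len(billable_bitmap(log, total_minutes=24 * 60)) // 8

print(bitmap)
print(len(packed), "bytes for two hours,", full_day_bytes, "bytes for a full day")
```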

Of course, the results wouldn't really make any sense as numbers. You'd have all sorts of "random" numbers, but the total values of each byte would have no purpose here. You'd be using each byte to simply record eight 5-minute blocks of time.

This, This, and This, But Not That

Another common way to use bits is to let someone mix and match permissions. File permissions on a UNIX system can be pretty simple. You can read a file, write a file, and/or execute a file. So what if you want someone to be able to read and execute a file but not write a file? Bits to the rescue!

The common permission set looks like this:

Bit value of 1 = Execute

Bit value of 2 = Write

Bit value of 4 = Read

So if you want full permissions, you end up with 7 (1 + 2 + 4). If you want just read and execute permissions, you'd use a permission of 5 (1 + 4). If you want read and write, but not execute, then you'd have a permission of 6 (2 + 4). You can store any combination of up to 8 different permissions in a single byte!
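
In code, combining and checking these permissions might look like the Python sketch below. The constant names and the `can` helper are illustrative choices, not an existing API.

```python
EXECUTE = 1   # 001
WRITE = 2     # 010
READ = 4      # 100

def can(permissions, flag):
    """True if the bit for `flag` is switched on in `permissions`."""
    return (permissions & flag) != 0

full = READ | WRITE | EXECUTE
read_execute = READ | EXECUTE
read_write = READ | WRITE

print(full, read_execute, read_write)
print(can(read_execute, READ), can(read_execute, WRITE))
```

Because each permission owns its own bit, adding the values and OR-ing them give the same result, and no combination can be mistaken for another.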

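Bitwise Operators

Because of all these unique applications, most programming languages provide a set of functions and tools specifically for working with bits. These are usually called "bitwise operators," and they tend to have similar syntax across languages. Typically, you'll have tools for shifting bits to the left or right (for example, shifting bits to the left looks like going from 00000001 to 00000010, which is effectively the same as multiplying the number by 2). You'll also frequently have tools to flip bits (all 0s become 1s and all 1s become 0s), or to compare two different sets of bits (for example, "show me all the 1 bits in value A that are also set in value B").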
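
As an illustration, here is how shifting, flipping, and comparing bits look in Python; many other languages use the same symbols. The sample values are arbitrary, and the `& 0xFF` keeps the flipped result inside a single byte, since Python integers are not limited to 8 bits.

```python
def show(value):
    """Format a byte value as 8 binary digits."""
    return format(value, "08b")

shifted_left = 0b00000001 << 1       # same as multiplying by 2
shifted_right = 0b00001000 >> 1      # same as dividing by 2
flipped = ~0b01001001 & 0xFF         # every 0 becomes 1 and every 1 becomes 0
in_both = 0b01101100 & 0b00111010    # 1 bits of value A that are also set in value B

for name, value in [("shifted_left", shifted_left), ("shifted_right", shifted_right),
                    ("flipped", flipped), ("in_both", in_both)]:
    print(f"{name:>13}: {show(value)}")
```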

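Being Bitwise with IP Address Masks

The operators are simple and there are only a handful of them, but they do take a little extra thought to use effectively. For example, bitwise operators turn a complicated concept like an IP address netmask (such as our earlier CIDR example, "71.23.45.0/24") into a fairly simple matter of 1s and 0s. It's just a matter of figuring out how someone else decided to use the bits. In the CIDR example, the "/24" simply means "of the 32 bits that make up this IP address, keep the first 24 bits as they are, and the remaining 8 bits define a range from 0 to 255." So "71.23.45.0/24" literally means "keep the 71.23.45 part"; in other words, "71.23.45.0 through 71.23.45.255".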
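
Putting the pieces together, the boss's IP filter can be sketched in Python as follows. This is an illustration under simple assumptions (well-formed IPv4 input, hypothetical function names), not a hardened implementation; Python's standard `ipaddress` module provides a ready-made alternative.

```python
def ip_to_int(address):
    """Pack a dotted IPv4 string into one 32-bit number."""
    byte1, byte2, byte3, byte4 = (int(part) for part in address.split("."))
    return (byte1 << 24) | (byte2 << 16) | (byte3 << 8) | byte4

def in_cidr(address, cidr):
    """True if `address` falls inside a block written like '71.23.45.0/24'."""
    network, prefix_text = cidr.split("/")
    prefix = int(prefix_text)
    # A mask with the first `prefix` bits on and the remaining bits off.
    mask = (0xFFFFFFFF << (32 - prefix)) & 0xFFFFFFFF
    # Keep only the masked bits of both values and compare them.
    return (ip_to_int(address) & mask) == (ip_to_int(network) & mask)

print(in_cidr("71.23.45.200", "71.23.45.0/24"))
print(in_cidr("71.23.46.1", "71.23.45.0/24"))
```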


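Was this Helpful? Click a Button!

Hopefully this opens up some new ways of thinking about data and how to use every byte of it more efficiently. Thanks for reading, and if you enjoyed this article, please be sure to vote for it.

Translated from: https://www.experts-exchange.com/articles/21379/How-to-Be-a-Better-Programmer-Bits-and-Bytes.html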
