Why Is a String Called a String?

This is the editorial for the July 25th edition of the SitePoint PHP newsletter.

Why is a string called a string? Have you ever given this some thought? We never use such a word in contexts other than programming for a set of letters sticking together, and yet – in programming it’s as pervasive as the word “variable”. Why is that, and where does it come from?

To find out, we have to tackle some related terms first. History lesson time!

Abstract old calendar illustration

The word font is derived from the French fonte – something that has been melted; a casting. Given that letters for printing presses were literally made of metal and smelted at type foundries, that makes sense.

The metal letters of a font

The terms uppercase and lowercase refer to the literal part of the case in which the font was transported. So the printer (person) had a heavy case he lugged around or had set up at a printing press, and in this case were two “levels” – an upper case, and a lower case. The upper case contained only – you guessed it – UPPERCASE letters, while the lower case only contained lowercase ones.

Printing press case

You’ll notice that there were more lowercase letters than uppercase ones. This was to be expected – each piece of type could only be used once on a single page and, after all, a written body of text will have many more lowercase letters than uppercase ones, as there was no such thing as YouTube comments and CAPS LOCK yet.

So how does all this relate to strings?

Well, as printing became more mainstream and printing presses began offering their services to individuals, not just newspapers and publishers, it is said they decided to charge based on the length of the printed material – length in feet. Granted, a lot of this is speculative, but if they strung together the produced, printed material, they could easily estimate the costs and bill customers. So we can conclude with a reasonable degree of certainty that they used the word string in this context as a sequence of characters.

Edit, July 26th 2017: As pointed out in the comments below, it seems that there was an actual string in use to tie the character blocks together as they were transported to the press after being assembled! A Twitter follower even sent me the following video, demonstrating the process!

Still, how does this relate to the programming field? I mean, you could say a string of anything in regard to anything at all and it would make a degree of sense in the non-programming world. It’s a word that could easily be applied to things in general, even though it usually isn’t.

What if we look across academia for first references?

Vector illustration of a person looking through a telescope

In 1944’s Recursively enumerable sets of positive integers and their decision problems we have a mention that could vaguely resemble the modern definition:

For working purposes, we introduce the letter b, and consider “strings” of 1’s and b’s such as 11b1bb1.

In this paper, the term refers to a sequence of identical symbols, so a string of 1’s or a string of b’s. Not exactly our definition but it’s a start.

Then, a full 14 years later, in 1958’s A Programming Language for Mechanical Translation, the word is used thusly, and only once:

Each continuous string of letters between punctuation marks or spaces is looked up in the dictionary.

Okay, kiiiind of similar to our notion of strings, but it seems like he’s just describing, well, words. Obviously, that cannot apply – it’s too generic. For some reason, though, it seems to have stuck.
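To make the quoted sentence concrete, here is a minimal sketch in modern Python of what "look up each continuous string of letters" could look like. The toy dictionary, the function names, and the choice of ASCII letters are all assumptions made for illustration; none of it comes from the 1958 paper.

```python
import re

# Hypothetical toy dictionary; the paper's actual dictionary is not shown in this article.
TOY_DICTIONARY = {"the": "article", "cat": "noun", "sat": "verb"}


def letter_strings(text):
    """Return each continuous run of ASCII letters, split on spaces and punctuation."""
    return re.findall(r"[A-Za-z]+", text)


def look_up(text, dictionary):
    """Pair every run of letters with its dictionary entry (None if it is absent)."""
    return [(word, dictionary.get(word.lower())) for word in letter_strings(text)]


print(look_up("The cat sat, purring.", TOY_DICTIONARY))
```

Seen this way, a "string of letters" really is just a word, which is why the usage feels so generic.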

In 1958’s A command language for handling strings of symbols, the word string is used in exactly the same way we use it today, albeit not defined as such.

We find one more reference in 1959, The COMIT system for mechanical translation:

If we want to replace D SIN(F) by COS(F) D (F), where F is unrestricted and may be any arbitrary sequence of constituents, we use the notation $ to stand for this string.

Interesting! Here’s the dollar sign we all know from PHP, and which was (is?) actually the string symbol in BASIC.
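For a rough feel of the rewrite rule in that COMIT quote, here is a small Python sketch. It is not COMIT syntax: a regex capture group stands in for the `$` placeholder, and the names `RULE` and `rewrite` are made up for this example. It also assumes F contains no closing parenthesis, which is a simplification.

```python
import re

# Assumption: the capture group (.+?) plays the role of COMIT's "$",
# matching the shortest run of characters up to the first ")".
RULE = re.compile(r"D SIN\((.+?)\)")


def rewrite(expression):
    """Replace every D SIN(F) by COS(F) D (F), reusing whatever F matched."""
    return RULE.sub(r"COS(\1) D (\1)", expression)


print(rewrite("D SIN(X + Y)"))
```

The point is only that "this string" in the quote already means an arbitrary sequence of symbols that can be captured and reused, which is close to how we treat strings today.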

Again in 1959 we have a more direct definition in The Share 709 System: Machine Implementation of Symbolic Programming:

The text is a linearly ordered string of bits representing the rest of the information required in the loading and listing processes.

In fact, it was through ALGOL in April of 1960 that string seems to have taken its modern-day shorthand form “string” (up until then people said string of [something]). See this paper’s abstract.

Then finally, in May 1960, the Report on the Algorithmic Language Algol 60 mentions it in a form that hits home.

Definition of the string type in the paper

From there, it just takes off like a modern day meme.

In 1963, METEOR: A LISP Interpreter for String Transformations goes with the rather unspecific “[…] but certain simple transformations of linear lists (strings) are awkward to define in this notation.”

In 1964, On declaring arbitrarily coded alphabets mentions “character strings”.

Searching ACM reveals a bunch of other resources from the 60s and later, all of which use the term regularly, so it seems the 60s were a catalyst in the term’s evolution and slowly made it what it is today, through the needs of the systems it found itself in. Kind of funny that it ended up representing a concept similar to the one from the printing press days – a set of characters which has a meaning and carries with it some costs (only this time, in memory).

As a side note – consider all those papers from over 60 years ago. 60 years ago they had computer science problems they were solving on punch cards, and writing about in academic papers. And here we are in 2017 with 2017 JavaScript frameworks, fighting about who can have sex with whom in Drupal’s community and trying to redefine the word Facade over and over again. While we’re arguing about the rocket science of “stuff goes into a box, stuff comes out of a box” that is modern web development, those people back then shaped the entire world by translating the analog environments they found themselves in into digital, by essentially tricking a little bit of sand into remembering numbers.

Conclusion

So now we know – or at least think we know – where string comes from. Computer science has always been a dark space of mysteries and slow evolution, and just like we now know that the human eye has had half-stages and semi-eyes in its past, so too have terms in computer science evolved past and around their original meaning, until they gave us what we have today. The 1960s, in various locations all at once, gave birth to the same concept with the same name, until it evolved into one unified term that we all understand and use today and, most importantly, can agree on.

When you think about it, was there a better word we could have used? While string hardly feels natural due to the complete detachment from a similar term in the “real world” (we don’t call words on a book’s page “strings”), I fail to think of any term which would fit this popular data type better. Can you? Let me know.

Translated from: https://www.sitepoint.com/why-is-a-string-called-a-string/
