How Long Should Your Sentences Be?

Writers like to write. Shocking, I know. But how much are we supposed to write before people lose interest?

To write is to translate the billions of interrelated ideas in our minds into a few lines of prose, with the hope that whoever reads them will parse the same meaning. Long, complex writing is our way of making sure the text holds its message.

But it also makes reading frustrating.

When people talk about measuring good writing, sentence length often comes up. It makes sense, doesn’t it? Long sentences are boring, brief ones are “punchier” (whatever that means). So, let’s all write really short, right?

This doesn’t always work out, as Gary Provost demonstrates here:

This sentence has five words. Here are five more words. Five-word sentences are fine. But several together become monotonous. Listen to what is happening. The writing is getting boring. The sound of it drones. It’s like a stuck record. The ear demands some variety.

Quick sentences don’t work if we lack “flow”. A lot of people talk about writing lacking “flow” — but don’t understand what they mean by it. Luckily, Provost gives us the solution.

Now listen. I vary the sentence length, and I create music. Music. The writing sings. It has a pleasant rhythm, a lilt, a harmony. I use short sentences. And I use sentences of medium length. And sometimes, when I am certain the reader is rested, I will engage him with a sentence of considerable length, a sentence that burns with energy and builds with all the impetus of a crescendo, the roll of the drums, the crash of the cymbals–sounds that say listen to this, it is important.

Varying sentence length is what separates good writers from mediocre ones. Editors know this. Newer writers struggle with it. It defies a writer’s every instinct to shut up, be concise, and let words speak for themselves until the moment is right.

But now the problem is how you’re supposed to apply this lesson. How short is too short? How much is too much? Should I have three six-word sentences followed by a ten-word one? It’s an easy criticism to make: that a writer lacks “flow”. It’s harder to say exactly how to fix it.

There’s a mathematical way to measure how much a set of values varies, called “standard deviation”. Using a formula, we can calculate how much the values “deviate” (i.e. differ) from the mean. If we count how many words are in each sentence, we can find the standard deviation of sentence length for an entire text. The higher the value, the more variety in length we’ll find.
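For anyone curious what that formula actually does, here is a minimal sketch that computes it step by step. The two lists are made-up sentence lengths, not taken from any real text, and `sample_stdev` is a hypothetical helper name. It divides by n - 1 (the "sample" version), which is also what Python's `statistics.stdev` does.

```python
import math
import statistics


def sample_stdev(values):
    """Sample standard deviation, computed step by step."""
    n = len(values)
    mean = sum(values) / n
    # How far is each value from the mean? Square it so signs don't cancel out.
    squared_diffs = [(v - mean) ** 2 for v in values]
    # Average the squared differences (dividing by n - 1), then undo the squaring.
    return math.sqrt(sum(squared_diffs) / (n - 1))


# Hypothetical sentence lengths, in words, for two made-up passages.
steady = [5, 5, 5, 5, 5, 5]
varied = [2, 9, 1, 12, 4, 20]

steady_sd = sample_stdev(steady)
varied_sd = sample_stdev(varied)
print(steady_sd, varied_sd)
```

A passage where every sentence is the same length scores exactly zero. The more the lengths jump around, the higher the number climbs.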

Back in the day, getting metrics on longer texts took either hours of your time or an armada of interns. Neither was accessible to a starving writer. Today, technology can automate the work that tedium makes impossible. We can do this in a few lines of Python. Behold:

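import statistics
import nltk   # needs NLTK's Punkt tokenizer data downloaded beforehand
import pandas as pd

# `text` holds the passage to analyze as a single string
sent_tok = nltk.sent_tokenize(text)   # split the text into sentences
print(len(sent_tok))
sent_len = []
for sentence in sent_tok:
    # Split into tokens and drop punctuation-only tokens so only words are counted
    # (the article says punctuation is removed; this filter is one way to do it)
    sent_w = [w for w in nltk.word_tokenize(sentence) if any(ch.isalnum() for ch in w)]
    sent_len.append(len(sent_w))
print(sent_len)
df = pd.DataFrame(sent_len)   # kept in a DataFrame for plotting later
print("Mean Sentence Length: " + str(statistics.mean(sent_len)))
print("Standard Deviation of Sentence Length: " + str(statistics.stdev(sent_len)))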

(Note: I adapted this code from the one in this blog. It’s also got some great insights into academic writing if you want to read around!)

The code breaks up the text into sentences, then breaks each sentence into words (or, as the code calls them, “tokens”). After removing punctuation, it counts the number of words in each sentence and makes a list of these values. We can then run calculations on those lists. If you don’t understand this, don’t worry! All you need to know is we’re going to get three results: a list of sentence lengths, the mean average, and the standard deviation.
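If you'd like to try the idea without installing anything, here is a rough standard-library sketch of the same steps. It is not the NLTK tokenizer: the regular expressions are a crude approximation, and `sentence_lengths` is a hypothetical helper name. Abbreviations like "Dr." or decimals like "3.5" will trip up the sentence split.

```python
import re
import statistics


def sentence_lengths(text):
    """Return the number of words in each sentence of `text`."""
    # Crude sentence split: break after ., ! or ? when followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    lengths = []
    for sentence in sentences:
        # A word is a run of letters or digits, optionally joined by
        # apostrophes or hyphens; punctuation on its own is ignored.
        words = re.findall(r"[A-Za-z0-9]+(?:['’-][A-Za-z0-9]+)*", sentence)
        if words:
            lengths.append(len(words))
    return lengths


sample = ("This sentence has five words. Here are five more words. "
          "Five-word sentences are fine.")
lengths = sentence_lengths(sample)
print(lengths)  # [5, 5, 4]
print(statistics.mean(lengths), statistics.stdev(lengths))
```

Notice that "Five-word" counts as a single word here, which is why the third sentence comes out as 4.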

If we run this on the “boring” and “musical” parts of Provost’s quote, it spits out the following:

[5, 5, 4, 5, 5, 5, 5, 7, 5]
Mean Sentence Length: 5.111111111111111
Standard Deviation of Sentence Length: 0.7817359599705717

[2, 9, 1, 3, 9, 4, 7, 53]
Mean Sentence Length: 11
Standard Deviation of Sentence Length: 17.246117575517438

“Cool,” you say. “But why would I care that one number is bigger? I write. I don’t like doing maths.” Stay with me.

The point of this exercise is that, despite our assumption that a lower average length is better, the “boring” section is the one with the shorter sentences. What matters is how much the length changes. That’s the “standard deviation” part.
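To make that concrete, here is a small sketch using the two lists of lengths printed above. The third entry is my own addition for illustration: it is the “musical” passage with its 53-word finale removed, which leaves an average almost identical to the “boring” one.

```python
import statistics

# Sentence lengths reported above for the two halves of Provost's quote.
boring = [5, 5, 4, 5, 5, 5, 5, 7, 5]
musical = [2, 9, 1, 3, 9, 4, 7, 53]
# Illustration only: the musical passage without its final 53-word sentence.
musical_without_finale = musical[:-1]

results = {}
for name, lengths in (("boring", boring),
                      ("musical", musical),
                      ("musical without finale", musical_without_finale)):
    results[name] = (statistics.mean(lengths), statistics.stdev(lengths))
    mean, stdev = results[name]
    print(f"{name}: mean={mean:.2f}, stdev={stdev:.2f}")
```

Even without the long closing sentence, the musical passage averages about five words per sentence, just like the boring one, yet its deviation is roughly four times higher. The average tells you almost nothing about rhythm.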

We can also make distribution charts to illustrate this. Provost’s text is a bit too short to plot out, but we can do it with longer texts.

So, let’s run our code on some text in the wild. Here’s what I got when I ran it on a New York Times piece on COVID-19 in Europe:

[34, 11, 43, 3, 22, 20, 29, 19, 17, 31, 34, 35, 34, 10, 9, 10, 38, 33, 28, 8, 15, 15, 8, 11, 59, 7, 28, 16, 19, 37, 37, 34, 10, 55, 34, 15, 18, 22, 27, 22, 23, 19, 34, 23, 66, 28, 15, 22, 23, 28, 31, 26, 19, 32, 8, 45, 12, 26, 16, 43, 68, 15, 31, 28, 20, 13, 8]
Mean Sentence Length: 25.059701492537314
Standard Deviation of Sentence Length: 13.806981761561381

You’ll notice that although the average sentence length is longer, you still get that high standard deviation.

Using the Seaborn module we can make a distribution graph for sentence lengths (yep, programmers, that’s why we put everything in a dataframe!)

On this graph (and every following graph in this article), we’re plotting the distribution as a histogram of sentences against each length. With Seaborn we also get a Kernel Density Estimation line which is… err… complicated to explain. All you need to know for now is we’re going to be keeping it on for illustrative purposes. You’ll see why when we’re stacking graphs on top of each other.
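For the curious, the idea behind that line is less scary than the name. A kernel density estimate drops a small bell curve on every data point and adds them all up, which gives a smooth outline of the histogram. The sketch below is a hand-rolled Gaussian version for illustration only. It is not Seaborn's implementation, the bandwidth of 8 is picked by hand (plotting libraries typically choose one automatically), `gaussian_kde` is a hypothetical helper name, and the data is just the first ten lengths from the New York Times list above.

```python
import math


def gaussian_kde(data, bandwidth):
    """Return a function that estimates the density of `data` at a point x."""
    n = len(data)
    norm = 1.0 / (n * bandwidth * math.sqrt(2 * math.pi))

    def density(x):
        # One Gaussian bump per data point, each centred on that point.
        return norm * sum(math.exp(-0.5 * ((x - d) / bandwidth) ** 2) for d in data)

    return density


# First ten sentence lengths from the New York Times list above.
lengths = [34, 11, 43, 3, 22, 20, 29, 19, 17, 31]
density = gaussian_kde(lengths, bandwidth=8.0)  # bandwidth chosen by hand

# Print a rough text version of the curve.
for x in range(0, 61, 10):
    bar = "#" * round(density(x) * 1000)
    print(f"{x:3d} {density(x):.4f} {bar}")
```

A wider bandwidth gives a smoother, flatter curve, and a narrower one gives a spikier curve that hugs the individual data points.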

Look at how few of the sentences get longer than the mid-30s and how the peak is just under 20. Now think about how long yours are.

Furthermore, I’m convinced that the New York Times has specific rules for sentence length:

[Figure: sentence-length distributions for several New York Times articles]

The distribution shape is almost identical. Even though there are wordier articles, there are still a lot of sentences below 20 words in length. The two peaks are around 20 and around 35. It’s a very tight style. Even if there’s nobody in the office counting words, there are probably journalists who unconsciously know to keep their writing fluid.

For comparison, let’s look at an article in Surfer magazine. Surfing is a very niche sport with excellent writers, but you’d expect editorial guidelines to be a bit more lenient than for a front-page New York Times story.

[37, 46, 27, 39, 9, 34, 29, 19, 35, 34, 41, 4, 4, 37, 19, 25, 10, 18, 57, 21, 22, 37, 22, 53, 31, 9, 28, 24, 22]
Mean Sentence Length: 27.344827586206897
Standard Deviation of Sentence Length: 13.475575423889183

Although the deviation is similar, overall sentence length is longer. If we visualize this next to a New York Times article we can see the difference more clearly.

[Figure: sentence-length distribution of the Surfer article next to a New York Times article]

See how the peak has nudged to the right? How the curve is sharper? This isn’t to say the surf writer is worse than an editorial journalist (there’s plenty of lazy journalism in the world!). It’s that there’s a difference in style between a niche sports mag and a national publication. Niche writing tends to be wordier. People already interested in a topic already have enough context, and don’t need to digest their information in chunks. A more general audience needs hand-holding.

Certain kinds of writers watch their style hawkishly, like advertising copywriters. Watch what happens when we run our analyzer on Ogilvy’s “Rolls Royce” and “How to Create Advertising that Works” ads:

Rolls Royce:
[46, 20, 8, 26, 8, 10, 10, 9, 3, 12, 8, 11, 7, 16, 24, 23, 17, 13, 8, 28, 13, 8, 14, 5, 8, 6, 18, 12, 1, 8, 8, 25]
Mean Sentence Length: 13.53125
Standard Deviation of Sentence Length: 9.108607680154401

How to Create Advertising that Works:
[27, 15, 1, 4, 27, 12, 19, 20, 12, 3, 4, 1, 2, 13, 11, 18, 9, 4, 22, 1, 2, 13, 20, 23, 1, 2, 18, 24, 32, 5, 15, 1, 3, 13, 39, 1, 6, 8, 9, 6, 9, 3, 3, 1, 1, 5, 12, 8, 11, 4, 1, 4, 13, 13, 19, 10, 10, 8, 1, 2, 25, 19, 21, 1, 5, 22, 9, 17, 11, 1, 4, 6, 8, 9, 8, 19, 6, 4, 11, 8, 16, 7, 1, 5, 9, 10, 5, 14, 24, 1, 2, 12, 6, 3, 5, 3, 1, 3, 9, 12, 1, 2, 6, 20, 7, 8, 1, 2, 11, 1, 2, 5, 11, 6, 15, 1, 1, 13, 1, 3, 11, 15, 18, 18, 1, 3, 11, 6, 12, 6, 21, 12, 1, 2, 8, 9, 12, 1, 3, 10, 16, 15, 1, 1, 21, 6, 1, 16, 21, 15, 1, 3, 13, 1, 3, 15, 25, 11, 3, 1, 2, 11, 11, 1, 6, 28, 13, 14, 33, 2, 15, 1, 3, 29, 1, 5, 18, 10, 19, 11, 4, 13, 23, 26, 6, 1, 3, 11, 11, 1, 3, 12, 21, 1, 4, 17, 31, 1, 2, 16, 8, 1, 3, 14, 9, 5, 8, 4, 32, 14]
Mean Sentence Length: 9.476190476190476
Standard Deviation of Sentence Length: 8.03651384594596

Look at how big the deviation is compared to the average sentence length in both. Notice how small the average length is overall.
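One way to put a number on “how big the deviation is compared to the average” is to divide one by the other. Statisticians call this ratio the coefficient of variation. That framing is my addition rather than something the analyzer prints. The sketch below just reuses the means and standard deviations reported above.

```python
# (mean, standard deviation) pairs copied from the results above.
stats = {
    "New York Times": (25.059701492537314, 13.806981761561381),
    "Surfer": (27.344827586206897, 13.475575423889183),
    "Rolls Royce ad": (13.53125, 9.108607680154401),
    "How to Create Advertising that Works": (9.476190476190476, 8.03651384594596),
}

# Standard deviation as a fraction of the mean, for each text.
ratios = {name: stdev / mean for name, (mean, stdev) in stats.items()}

for name, ratio in sorted(ratios.items(), key=lambda item: item[1]):
    print(f"{name}: {ratio:.2f}")
```

The two articles land at roughly half (0.49 and 0.55), while the ads come out at 0.67 and 0.85. Relative to their length, the ad sentences vary far more.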

[Figure: sentence-length distributions of the two Ogilvy ads]

Copywriting is about persuasion, and good persuasion is about being deliberate. Being deliberate means knowing when to hold back and when to let loose. Hence why the length spikes just under 10 words. There’s very little activity over 30 — if you’re a budding advertiser, that’s an easy takeaway. Every single word in these ads was chosen on purpose. Nothing is wasted. It shows. These are some of the most successful written advertisements in history.

Here’s what they look like on top of that first New York Times article.

[Figure: the two Ogilvy ads plotted on top of the first New York Times article]

If you’re only going to look at one graph here, let it be this one.

Ogilvy’s writing doesn’t use the overt “sales” language we associate with advertising. Nor does it use editorial language. It’s readable. When you read Ogilvy’s writing it seems less like an article and more like someone is speaking to you directly. This builds trust. Editorial content, while more authoritative than advertising, struggles to be familiar. The wording is more convoluted. Arguments meander around before settling on a point. Copy does not.

We can also apply this method to fiction. Experienced fiction writers tend to be very deliberate with the pacing of writing. Here’s what we get with a sample chapter from George RR Martin’s (still unreleased!) Winds of Winter:

[27, 7, [... Cutting a few of these out for brevity! ...], 13]
Mean Sentence Length: 16.936675461741427
Standard Deviation of Sentence Length: 10.966094959175557
[Figure: sentence-length distribution of the Winds of Winter sample chapter]

Despite fantasy’s reputation for wordiness, this has a certain fluidity. I want to write an entire article applying these ideas to fiction writing — so I won’t get lost in the weeds too much. Just remember that, yes, authors need to mix it up too!

Here’s something interesting to finish off. Out of curiosity, I analyzed an interview from Surfer magazine. Interviews are a transcript of natural speech even if they are in a formal setting. By running our code on an interview we can compare conversational English with our written texts.

[50, 61, 68, 41, 31, 7, 26, 36, 11, 11, 14, 22, 11, 17, 39, 18, 13, 11, 13, 45, 16, 13, 9, 26, 8, 17, 15, 44, 19, 8, 21, 9, 14, 12, 10, 9, 11, 51, 4, 10, 16, 14, 9, 9, 14, 8, 26, 9, 13, 14, 33, 13, 10, 17, 4, 8, 11, 18, 27, 14, 17, 22, 14, 23, 23, 16, 4, 5, 37, 9, 19, 4, 10, 11, 5, 8, 10, 12, 8, 9, 6, 46, 12, 10, 29, 22, 15, 21, 5, 9, 16, 13, 25, 8, 32, 8, 5, 18, 13, 4, 20, 9, 24, 4, 10, 15, 7, 4, 4, 9]
Mean Sentence Length: 16.79090909090909
Standard Deviation of Sentence Length: 12.448127566020386

What’s striking is how similar speech “in the wild” is to copywriting. It’s even more apparent when you plot the distribution.

[Figure: sentence-length distribution of the Surfer interview]

Effective copywriting mimics human speech. To write well is to write how people talk.

Again, this doesn’t mean the way you write is wrong! What it does mean is certain kinds of writing demand a certain kind of style. Journalism and content need longer, nuanced writing, while copy needs to be fast and precise. This shouldn’t be news to you, but hopefully seeing the difference in a quantitative way reinforces the message.

In the end, the only way to develop your own effective style is by emulating the style of others and putting it into practice. If you see writing that “flows”, apply some of the lessons here. If an article is enlightening (or, on the contrary, boring) to you, look at how the author spaces out lengthier passages. When you see a longer Facebook ad, look at how short its sentences are.

And for those wondering, here are the metrics for this article. Try to think about why I’ve chosen to write this way!

[6, 4, 4, 3, 31, 13, 9, 12, 4, 11, 8, 16, 5, 4, 5, 5, 5, 5, 6, 5, 9, 19, 9, 2, 9, 1, 3, 9, 4, 7, 53, 11, 3, 3, 18, 13, 5, 5, 13, 13, 14, 11, 5, 24, 20, 12, 10, 10, 12, 16, 21, 19, 8, 9, 25, 22, 10, 2, 6, 3, 22, 7, 6, 9, 19, 12, 35, 24, 23, 16, 19, 14, 20, 21, 17, 9, 6, 21, 10, 29, 10, 18, 9, 5, 23, 18, 6, 20, 6, 12, 42, 8, 11, 13, 9, 19, 10, 3, 2, 11, 15, 16, 14, 6, 3, 21, 3, 11, 5, 8, 3, 8, 13, 16, 14, 24, 11, 8, 10, 16, 16, 13, 10, 5, 9, 14, 12, 6, 11, 15, 15, 19, 24, 13, 21, 13, 11]
Mean Sentence Length: 12.226277372262773
Standard Deviation of Sentence Length: 8.04398999320202
[Figure: sentence-length distribution of this article]

Translated from: https://medium.com/swlh/how-long-should-your-sentences-be-4f1b2dda2627
