The Shifting Definition of Newsworthiness

I read Bari Weiss’ resignation letter from the New York Times with some perplexity. In particular, I found her claim that she “was hired with the goal of bringing in voices that would not otherwise appear in your pages” a bit strange: Weiss is, after all, a wealthy graduate of Columbia who’s lived in the Northeast her entire life.

I’ve been playing around with New York Times archive data for some time and wanted to test Weiss’ claim. Has there been a (quantitative) change in the news the New York Times produces, especially since November 2016?

In this post, I’m going to take a look at what we can learn about “all the news that’s fit to print” from the metadata available in the archive — things like word counts, bylines, and news divisions. I’m looking for ways to approximate the “diversity” of the Times’ coverage since 2015. Essentially, we’re looking for real evidence of all those reporters sent to diners in Trump country in 2016 and after.

(For reference, here’s a map of all the IHOPs in the country. IHOP seemed like a good proxy for diners generally, and Waffle House is too concentrated in the south to be meaningful nationwide — it’s actually not that far off from what we have.)

The New York Times, as the largest and most successful American newspaper (and as far as I can tell, the biggest one with a publicly available API) offers an interesting case for analysis: the Times went to a paywall in 2011, and its emphasis on producing content to fuel subscriptions has accelerated since then; simultaneously, engagement with the Times since Trump’s election has increased steadily. Bari Weiss aside, how has the Times navigated its way through competing currents in media?

Fit to Print

Let’s first look at the Times’ output in broad terms. The Times itself reported a 42% increase in the number of paid (digital) subscriptions between December 2016 and December 2017. Not all of that is attributable to Trump (we can gesture at secular media trends and the effectiveness of the Times’ own strategy there), but some of it certainly is: the Times told CNBC in November 2016 that it had seen startling subscriber growth in the three weeks since the election, and the Times rode that increase to the 42% year-over-year subscriber growth between 2016 and 2017 cited in its 2017 10-K.

Technical Notes

I’ve used the Times’ archive API to download all the available data, going back to January of 2015. I wrote a class to handle this for me, but the script is relatively simple: call the API, parse the JSON, save as a CSV, and repeat for each year, for each month. Because we get a full month’s archive with each API call, we should fall well under the rate limit of 4,000 calls per day.
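
The loop described above (call the API, parse the JSON, save a CSV, repeat per month) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the class from the repo: the function names, the selection of columns, and the output file naming are all hypothetical, and the endpoint and `response.docs` layout reflect the publicly documented Archive API as best understood here.

```python
import csv
import json
import urllib.request

# Archive API endpoint: one call returns a full month of article metadata.
ARCHIVE_URL = "https://api.nytimes.com/svc/archive/v1/{year}/{month}.json?api-key={key}"

# Columns kept from each article record (an assumed selection for illustration).
FIELDS = ["pub_date", "section_name", "news_desk", "word_count",
          "headline", "byline", "web_url"]


def fetch_month(year, month, api_key):
    """Download one month of archive metadata and return the parsed JSON."""
    url = ARCHIVE_URL.format(year=year, month=month, key=api_key)
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)


def flatten_docs(payload):
    """Turn the nested archive response into flat rows, one per article."""
    rows = []
    for doc in payload.get("response", {}).get("docs", []):
        rows.append({
            "pub_date": doc.get("pub_date"),
            "section_name": doc.get("section_name"),
            "news_desk": doc.get("news_desk"),
            "word_count": doc.get("word_count"),
            "headline": (doc.get("headline") or {}).get("main"),
            "byline": (doc.get("byline") or {}).get("original"),
            "web_url": doc.get("web_url"),
        })
    return rows


def save_csv(rows, path):
    """Write the flattened rows to a CSV file with a header line."""
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=FIELDS)
        writer.writeheader()
        writer.writerows(rows)


def download_archive(api_key, start_year=2015, end_year=2020):
    """Repeat for each year, for each month (months not yet published will fail)."""
    for year in range(start_year, end_year + 1):
        for month in range(1, 13):
            rows = flatten_docs(fetch_month(year, month, api_key))
            save_csv(rows, f"nyt_{year}_{month:02d}.csv")
```

Six years at twelve calls per year is 72 requests, which is why the daily limit is not a concern here.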

The code I used to extract the archive is available on my GitHub. The full script, an exercise in data engineering, will also allow you to post the archive to a relational database (Redshift on AWS in this case), but that’s beyond the scope of this post.

We can see that the overall output (in terms of stories published) has been decreasing over time:

Image for post

Trump’s election does not appear to have had any effect here: the number of articles the Times publishes has fallen steadily since 2015.
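
For readers who want to reproduce a count like this, here is a minimal sketch. It assumes each article is a dict with a `pub_date` string beginning with the year and a `section_name` field; the function name `count_articles` is hypothetical.

```python
from collections import Counter


def count_articles(rows, by_section=False):
    """Count articles per year, or per (year, section) when by_section is True."""
    counts = Counter()
    for row in rows:
        year = (row.get("pub_date") or "")[:4]
        if len(year) < 4:
            continue  # skip records with no usable publication date
        if by_section:
            key = (year, row.get("section_name") or "Unknown")
        else:
            key = year
        counts[key] += 1
    return counts
```

Passing `by_section=True` gives the per-section breakdown needed for the next question.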

Is this true across all news desks at the paper? Or across all sections (i.e. politics, U.S. news, international news, etc.)?

Image for post

Nothing jumps out from the chart; rather, there’s a general decline across the different sections of the paper. In terms of real numbers, the last 10 months of aggregate data for the top sections of the paper (i.e. the sections under which the most articles are published) look like this (throw out the last row as potentially incomplete; the full table is available in the GitHub repo):

Image for post

But perhaps there’s more to the story than declining “output.” If we look not at the number of articles, but at the number of words devoted to each section, we can see where that output has gone. I’ve tallied the average word count by year in the following table:

Image for post

So there are fewer articles being published, but those that make it to print (or the website) tend to be longer — growing at an average of about 7.5% every year.
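
The average word count per year and its year-over-year growth can be computed along these lines. This is a sketch under assumptions: rows are dicts with `pub_date` and `word_count`, records with a missing or zero word count (for example multimedia posts) are skipped, and both function names are hypothetical.

```python
from collections import defaultdict


def average_words_by_year(rows):
    """Return {year: mean word count}, ignoring rows without a positive count."""
    totals = defaultdict(lambda: [0, 0])  # year -> [total words, article count]
    for row in rows:
        year = (row.get("pub_date") or "")[:4]
        try:
            words = int(row.get("word_count"))
        except (TypeError, ValueError):
            continue
        if len(year) < 4 or words <= 0:
            continue
        totals[year][0] += words
        totals[year][1] += 1
    return {year: total / n for year, (total, n) in sorted(totals.items())}


def year_over_year(series):
    """Return {year: percent change versus the previous year in the series}."""
    years = sorted(series)
    return {
        cur: (series[cur] - series[prev]) / series[prev] * 100
        for prev, cur in zip(years, years[1:])
    }
```

Averaging the values returned by `year_over_year` gives a single growth figure comparable to the one quoted above.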

The table below shows the top sections in every year since 2015, as well as the word count for all the articles published in each section. I’ve included the year over year change as well. (I should note that for 2020 articles I’ve prorated the output, so that although only 7 months of 2020 are fully available for analysis, we should be able to compare 2020 rates with other years available in the dataset.)
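
The prorating step can be illustrated with simple straight-line scaling. This is an assumption about the method, since the exact adjustment is not spelled out here, and the function name is hypothetical.

```python
def prorate_to_full_year(partial_total, months_observed, months_in_year=12):
    """Scale a partial-year total up to a full-year estimate (straight-line)."""
    if not 0 < months_observed <= months_in_year:
        raise ValueError("months_observed must be between 1 and months_in_year")
    return partial_total * months_in_year / months_observed
```

Straight-line scaling ignores seasonality (election months, for instance, may run heavier), so a prorated 2020 figure is best read as a rough comparison rather than a forecast.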

Image for post

We can look at this data graphically, too:

Image for post

And using the same analysis we did for the articles above:

Image for post
Let me count the words

Some things stand out: as we’ve seen, it looks like the decline in the paper’s “output” is general, rather than particular. Most of the sections are down, albeit modestly. In all, the Times publishes about 45.5 million words each year. This has fallen slightly since 2015, by about 2.5% each year.

What we see is generally in line with our previous findings: the Times is publishing fewer articles and fewer words overall, but the articles it does publish are longer, so those forces work against one another. The Times is producing less content, but more detailed content, as part of a secular shift in strategy from a business based on advertising to one based on digital subscriptions.

With these longer, more detailed pieces, the Times might have more room to send journalists to Trump country diners. Do we see that in their coverage?

The Politics of Intolerance

So, what was the Times covering in 2015? What was it covering then that it isn’t now, and what is it covering now that it wasn’t then?

We can look at this in a few ways. In this piece I’m going to look at the geography of coverage (at least, what I could find from some simple text mining), but some others include text analysis of both headlines and keywords, or looking at bylines and representation at the Times in more detail.

Technical notes

I’ve generated the data under analysis by parsing datelines in archived articles. This is, as far as I can tell, the easiest way to get geography from historical articles. The Times does offer a wire service that includes geographical information, but what I’m interested in here is historical data, not incoming stories.

The script I use to parse geography relies on the journalistic convention of the dateline — something like KABUL, Afghanistan; LONDON; or, as in one recent article, CENTER OF THE WORLD, Ohio (in an article about Ohio State football). Given the available data, this seemed like the best (/only) way to get the information we’re interested in.
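
A dateline parser in this spirit can be sketched with a regular expression. This is an illustration, not the script from the repo: it assumes the dateline opens the article text (an all-caps place name, an optional ", Region", then an em dash or a double hyphen), and the names `DATELINE_RE` and `parse_dateline` are hypothetical.

```python
import re

DATELINE_RE = re.compile(
    r"^(?P<city>[A-Z][A-Z.'\- ]*[A-Z.])"        # all-caps place, e.g. KABUL
    r"(?:,\s+(?P<region>[A-Z][A-Za-z. ]*?))?"   # optional ", Afghanistan"
    r"\s*(?:\u2014|--)\s+"                      # em dash (or "--") separator
)


def parse_dateline(text):
    """Return (city, region) if text opens with a dateline, else None.

    region is None when the dateline names only a city, as in LONDON.
    """
    match = DATELINE_RE.match(text or "")
    if not match:
        return None
    return match.group("city"), match.group("region")
```

Text that does not open with an all-caps place name simply returns `None`, which matches the caveat that many articles carry no extractable geography.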

A few caveats to this approach: articles that have a dateline like this are less and less common, not only at the Times but elsewhere in the news media as well. In many cases, we also miss out on geographic information from the dateline because there’s no geographic focus to the article — think opinion pieces, magazine posts, podcasts and other multimedia posts, as well as “geographically diffuse” stories about things like the coronavirus, or Biden’s VP pick.

More specifically, if we look at the number of articles with extractable geographic information by year, we have the following:

Image for post

Not surprising, given that we know the number of stories overall is decreasing. If we instead look at these kinds of stories as a percentage of all stories published, we get a clearer picture:

Image for post

Looks like the percentage of stories with identifiable geography in the dateline has remained roughly constant, with a dip in 2018.

This suggests a few things: first, that there may have been a shift to more named places after Trump’s election (2017 has the highest percentage of “geographic” datelines at 28.5%), but it wasn’t a major one (on average, 26% of articles carried geographic information). Second, though we’ve seen the Times drop the overall number of articles, they don’t seem to have shifted significantly away from “locatable” stories.
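
Both aggregate measures, the share of locatable stories and the number of distinct places per year, reduce to simple counting. The sketch below assumes each row carries a `pub_date` and a `place` value that is `None` when no dateline was found; the field and function names are hypothetical.

```python
from collections import defaultdict


def locatable_share_by_year(rows):
    """Return {year: percent of that year's articles with a known place}."""
    seen = defaultdict(int)
    located = defaultdict(int)
    for row in rows:
        year = (row.get("pub_date") or "")[:4]
        if len(year) < 4:
            continue
        seen[year] += 1
        if row.get("place"):
            located[year] += 1
    return {year: 100 * located[year] / seen[year] for year in sorted(seen)}


def unique_places_by_year(rows):
    """Return {year: number of distinct places appearing that year}."""
    places = defaultdict(set)
    for row in rows:
        year = (row.get("pub_date") or "")[:4]
        if len(year) == 4 and row.get("place"):
            places[year].add(row["place"])
    return {year: len(found) for year, found in sorted(places.items())}
```

Dividing by all articles in the year, rather than only those with a dateline, is what keeps the share comparable as total output falls.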

There’s one more thing we should look at with these aggregate measures: which sections are the most “diverse?” By “diversity” here I mean place diversity, without reference to the subjects or authors of the pieces in question (that material should be saved for a future analysis). I’m going to exclude foreign desk coverage here—it’s natural that a desk with a remit to cover stories from all over the world would feature articles from… all over the world—and focus on domestic stories. So, what does the diversity by news desk look like?

Image for post

And finally, what is the count of unique places by year?

Image for post

(Note that the 2020 number is low because we’re only halfway through the year. On a pro-rated basis, 451 places would translate to about 770 locations, putting 2020 in line with 2019.) It would seem that Weiss was wrong about the Times’ direction (or at least, her hire didn’t exactly presage a new era of geographic diversity in its pages).

First let’s look worldwide.

Image for post

On to the mapping. A first quick look at the distribution of (unique) dateline geography looks ecumenical enough. The distribution is, naturally enough, concentrated in the United States (although it looks like maybe there is no news made in Idaho or eastern New Mexico), Europe, and the Pacific Rim. (Note that this distribution is roughly the same one you might find from any other big news organization: this chart from a 2018 Forbes article shows the distribution of stories on CNN, Fox, and MSNBC.)

What happens if we zoom in on the United States, and look at year over year changes? For this part, I’ve filtered the data down to include the United States only. Let’s see what we have.

Image for post

Some things stand out here: we have lots of coverage in the big cities — New York, DC, Los Angeles, Chicago (though we might expect more news about places closer to New York — commuters from Connecticut would be interested in the Metro section, for example) but five red states, Ohio, Iowa, Pennsylvania, Georgia and Florida, get a lot of coverage as well.

What about on a per-capita basis? Here, I’m using 2019 state population estimates from Wikipedia.
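
The per-capita adjustment is a single division per state. A minimal sketch follows; the function name is hypothetical, and the caller supplies both the story counts and the population estimates.

```python
def stories_per_100k(story_counts, populations):
    """Return {state: stories per 100,000 residents}, highest rate first.

    States missing from `populations` are skipped rather than guessed at.
    """
    rates = {}
    for state, count in story_counts.items():
        population = populations.get(state)
        if population:
            rates[state] = count * 100_000 / population
    return dict(sorted(rates.items(), key=lambda item: item[1], reverse=True))
```

Small states with outsized political attention float to the top of a ranking like this, which is worth keeping in mind when reading the map.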

Image for post

I left D.C. out because it was such an outlier it threw the rest of the map off. This is what we’d expect from political coverage especially—Iowa and New Hampshire get tons of coverage in presidential election years, enough to make them stand out across the entire sample.

One more hex map. What if we exclude political coverage? Here I’m filtering out all stories from the politics news desk.

Image for post

Generally pretty similar: we see somewhat more emphasis on the Northeast, but still a lot of coverage of California and, strangely, New Hampshire.

For a closer look at the impact of Trump’s election, we can look at the data from 2015–2018. Are there any changes we can see in the map over time?

Image for post

We can see a lot more action in some of the swing states in election years. 2018 and 2016 in particular show a broader range of states covered in the Times’ pages.

Further considerations

Next we should look at what the Times is writing about in each of these places. In other words, what are the keywords most associated with each state?
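
One simple way to answer that question is to tally keywords per state. The sketch below assumes each article has already been reduced to a dict with a `state` and a flat list of `keywords` strings (the archive's nested keyword records would need flattening first); the function name is hypothetical.

```python
from collections import Counter, defaultdict


def top_keywords_by_state(rows, n=10):
    """Return {state: [(keyword, article_count), ...]} with the n most common."""
    counters = defaultdict(Counter)
    for row in rows:
        state = row.get("state")
        if not state:
            continue
        # Use a set so a keyword repeated within one article counts once.
        counters[state].update(set(row.get("keywords") or []))
    return {state: counter.most_common(n) for state, counter in counters.items()}
```

Raw counts favor keywords that are common everywhere, so a weighting scheme could be a reasonable refinement; plain counts are used here to keep the sketch short.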

Let’s start with Washington, D.C., which sees the most Times coverage.

Image for post

As we’d expect, politics, and in particular Trump, dominate Washington coverage. It’s also not surprising that the Republicans get more coverage than the Democrats, since they control the Senate, White House, and Supreme Court (all of which are also featured prominently).

And what if we look at “diner country,” that is, states in the middle of the country where we’d already identified lots of Times coverage? Here’s what we have for Ohio:

Image for post

And Iowa:

Image for post

And finally, Pennsylvania:

Image for post

It certainly does look like the Times invested heavily in coverage of 2016 swing states and covered the aftermath of the election. Curiously though, the Sports section also appears to be a major driver of “place diversity” in the Times’ coverage.

Summing up

So, have we found any evidence of diner journalism in the Times’ pages? Well, sure, some. It is true that, especially in election years, swing states in “flyover country” get more attention. In general, the Times looks more or less like any other big news organization, focusing on the major cities in the United States, especially in the Northeast. If you live in New Hampshire, you have the good fortune of having the most words per capita written about your state by the Times.

There may be some truth to what Weiss wrote in her resignation letter; the Times does, especially in election years, spread its coverage out to states across the country. We’d expect nothing less of a national news organization. However, we’ve also seen that fewer and fewer unique places are getting covered by the Times, even though the articles being produced now are longer than in years past.

In all, the New York Times really does produce news from all over the globe and all over the United States; I’d add that the Times has also committed itself to a diversity of voices on its op-ed page, and had done so before they hired Bari Weiss; the Times has long had (dishonest) conservatives on its payroll, elevating voices that would otherwise be heard in, say, the pages of the New York Post. (The Times also proclaims that it is committed to publishing a diversity of letters to the editor in its op-ed pages.)

I haven’t discussed another important aspect of diversity at the New York Times: the identities of the people producing and gatekeeping the stories that get written. This is, I would argue, just as important as any focus on geographic representation, and requires a deeper treatment than whatever we can get at from individual lines in individual stories. Weiss is right that the news business has the power to elevate people’s stories and voices; there may be a commitment to doing so in its coverage. We should encourage them to do the same in the boardroom.

Translated from: https://medium.com/swlh/the-shifting-definition-of-newsworthiness-1566c0430a14
