Build Better Text Extracts with PHP

Many developers spend time writing code designed to extract a very specific number of words or characters from a piece of text. This text sample, often drawn from a blog post or comment, is usually displayed with a link that leads the user to read more.

It’s possible to use JavaScript and server-side languages to create text extracts. For the purposes of this article, I’ll use PHP.

It doesn’t particularly matter how you gain your piece of text. For this example, I’ll use the opening lines of Wikipedia’s entry on Richard III as a corpus:

$string = "<p><strong>Richard III</strong> (2 October 1452 – 22 August 1485) was King of England for two years, from 1483 until his death in 1485 in the Battle of Bosworth Field. He was the last king of the House of York and the last of the Plantagenet dynasty. His defeat at Bosworth Field, the decisive battle of the Wars of the Roses, is sometimes regarded as the end of the Middle Ages in England. He is the subject of the play <cite>Richard III</cite> by <a href=//en.wikipedia.org/wiki/William_Shakespeare>William Shakespeare.</a>";

The first thing I’ll do to create a text extract is remove the HTML markup from the text contained in the variable $string:

$string = strip_tags($string);
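
One detail worth knowing at this step: strip_tags() removes tags but leaves HTML entities encoded, so an entity such as &amp;amp; would still count as five characters when the text is measured. The sketch below uses a hypothetical sample string (not from the article) and decodes entities with html_entity_decode() after stripping the tags.

```php
<?php
// Hypothetical sample markup, not taken from the article.
$html = '<p>Fish &amp; chips, <em>twice</em>.</p>';

// strip_tags() removes the tags but leaves entities encoded.
$stripped = strip_tags($html);   // 'Fish &amp; chips, twice.'

// Decode entities so that "&amp;" counts as one character, not five.
$text = html_entity_decode($stripped, ENT_QUOTES | ENT_HTML5, 'UTF-8');

echo $text, "\n";                // Fish & chips, twice.
```

If the decoded extract is later printed into a page, it would normally be escaped again with htmlspecialchars().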

With the sample now pure text, I’ll trim it down to a set number of characters:

$string = substr($string, 0, 200);
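
Note that substr() counts bytes, not characters. The sample contains an en dash, which occupies three bytes in UTF-8, so the 200-byte cut lands slightly earlier than 200 characters, and on other text a cut could fall in the middle of a multibyte character. The following is a minimal sketch of the difference, assuming UTF-8 text and that the mbstring extension is enabled; the sample string is hypothetical.

```php
<?php
// Assumption: the mbstring extension is enabled and the text is UTF-8.
// Hypothetical sample; "\u{00E9}" is "é", which takes two bytes in UTF-8.
$text = "Caf\u{00E9} au lait";

// substr() cuts after 4 bytes, splitting the two-byte "é" in half.
$byBytes = substr($text, 0, 4);

// mb_substr() cuts after 4 characters and keeps "é" intact.
$byChars = mb_substr($text, 0, 4, 'UTF-8');

var_dump(mb_check_encoding($byBytes, 'UTF-8')); // bool(false): broken character
var_dump(mb_check_encoding($byChars, 'UTF-8')); // bool(true)
```

In the article's flow the later step that cuts back to the last space usually discards a broken trailing character anyway, so this matters mainly for text without spaces near the cut.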

Next, I’ll ensure that the sample does not end with a comma, exclamation mark, or other punctuation:

$string = rtrim($string, "!,.-");

Finally, I’ll cut the extract back to its last space, so that the sample doesn’t end with a cut-off word, and append an ellipsis when the extract is printed on the screen:

$string = substr($string, 0, strrpos($string, ' '));
echo $string."… ";

The result will look something like this:

Richard III (2 October 1452 – 22 August 1485) was King of England for two years, from 1483 until his death in 1485 in the Battle of Bosworth Field. He was the last king of the House of York and the…
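
The steps above can be gathered into one reusable function. This is only a sketch: the name make_extract and its parameters are hypothetical, not from the original article. It adds two guards (text that is already short enough, and text with no space at all, where strrpos() returns false) and makes one deliberate change of order: punctuation is trimmed after the cut to the last space, because cutting at a space can expose a trailing comma again.

```php
<?php
/**
 * Sketch of the article's steps as one function.
 * make_extract() and its parameters are illustrative names.
 */
function make_extract(string $html, int $limit = 200, string $suffix = '…'): string
{
    $text = trim(strip_tags($html));

    // Short texts need no trimming and no ellipsis.
    if (strlen($text) <= $limit) {
        return $text;
    }

    // Cut to the byte limit, as substr() does in the article.
    $cut = substr($text, 0, $limit);

    // Step back to the last space so that no word is cut in half.
    // strrpos() returns false when the text contains no space.
    $lastSpace = strrpos($cut, ' ');
    if ($lastSpace !== false) {
        $cut = substr($cut, 0, $lastSpace);
    }

    // Trim trailing spaces and punctuation after the word cut.
    $cut = rtrim($cut, " !,.-");

    return $cut . $suffix;
}

// Shortened, hypothetical version of the sample text.
$html = '<p><strong>Richard III</strong> was King of England for two years, from 1483 until his death in 1485.</p>';
echo make_extract($html, 48), "\n"; // Richard III was King of England for two years…
```

With the article's original order, the same call would end in "years,…", since rtrim() runs while the string still ends in "f" and the comma only becomes the last character after the cut.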

While this method is sufficient for most purposes, the extract can be enhanced with other techniques, which I will demonstrate in future articles.

Original article: https://thenewcode.com/703/Build-Better-Text-Extracts-with-PHP
