python - Truncate a string without ending in the middle of a word

weixin_39755952

Published 2020-12-08 18:53:03

Tags: Python, string truncation, smart truncation, regular expressions, textwrap

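python - Truncate a string without ending in the middle of a word

I am looking for a way to truncate a string in Python that will not cut off the string in the middle of a word.

For example: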

Original: "This is really awesome."

"Dumb" truncate: "This is real..."

"Smart" truncate: "This is really..."

I'm looking for a way to accomplish the "smart" truncate from above.

Jack asked 2020-08-07T00:19:51Z

7 Answers

60 votes

I actually wrote a solution for this on a recent project of mine. I've compressed the majority of it down to be a little smaller.

def smart_truncate(content, length=100, suffix='...'):
    if len(content) <= length:
        return content
    else:
        return ' '.join(content[:length+1].split(' ')[0:-1]) + suffix

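The if statement checks whether your content is already under the cutoff point. If it is not, it truncates to the desired length, splits on spaces, removes the last element (so that a word is not cut off), and then joins the pieces back together (while appending the '...').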
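
To make those steps concrete, here is a small trace of the same slice/split/join sequence on the sentence from the question. The cutoff of 14 is an assumption picked for illustration, so that the limit lands exactly at the end of a word.

```python
content = "This is really awesome."
length = 14  # assumed cutoff, chosen only for this illustration

# Step 1: slice one character past the limit, so a word that ends exactly
# at the limit is followed by its trailing space and survives step 3.
head = content[:length + 1]

# Step 2: split on single spaces.
words = head.split(' ')

# Step 3: drop the last piece (possibly a partial word), rejoin, add the suffix.
truncated = ' '.join(words[:-1]) + '...'

print(repr(head))       # 'This is really '
print(words)            # ['This', 'is', 'really', '']
print(truncated)        # This is really...
```

The empty string at the end of `words` is what gets dropped here, which is why the extra character in the slice matters.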

Adam answered 2020-08-07T00:20:02Z

45 votes

Here's a slightly better version of the last line in Adam's solution:

return content[:length].rsplit(' ', 1)[0]+suffix

(This is slightly more efficient, and returns a more sensible result when there are no spaces in the front part of the string.)
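
A quick comparison may help show what "a more sensible result" means here. The sketch below wraps both last lines in functions whose names (`truncate_split_join`, `truncate_rsplit`) are made up for this illustration; the sample inputs are assumptions as well.

```python
def truncate_split_join(content, length=100, suffix='...'):
    # Last line from the first answer: slice, split on spaces, drop the last piece.
    if len(content) <= length:
        return content
    return ' '.join(content[:length + 1].split(' ')[0:-1]) + suffix


def truncate_rsplit(content, length=100, suffix='...'):
    # Last line from this answer: slice, then cut at the last space if there is one.
    if len(content) <= length:
        return content
    return content[:length].rsplit(' ', 1)[0] + suffix


no_spaces = "Supercalifragilisticexpialidocious"
print(truncate_split_join(no_spaces, 10))  # ...
print(truncate_rsplit(no_spaces, 10))      # Supercalif...

sentence = "This is really awesome."
print(truncate_split_join(sentence, 14))   # This is really...
print(truncate_rsplit(sentence, 14))       # This is...
```

Note that the rsplit variant slices at `length` rather than `length + 1`, so a word that ends exactly at the limit is dropped as well. Whether that matters depends on how strict the limit needs to be.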

bobince answered 2020-08-07T00:20:26Z

11 votes

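There are a few subtleties that may or may not be issues for you, such as the handling of tabs (for example, if you display them as 8 spaces but treat them as 1 character internally), the handling of various kinds of breaking and non-breaking whitespace, or allowing breaks at hyphens. If any of this matters to you, you may want to take a look at the textwrap module. For example: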

import textwrap

def truncate(text, max_size):
    if len(text) <= max_size:
        return text
    # Reserve 3 characters for the "..." and keep only the first wrapped line.
    return textwrap.wrap(text, max_size-3)[0] + "..."

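The default behaviour for words longer than max_size is to break them (making max_size a hard limit). You can switch to the soft limit used by some of the other solutions here by passing break_long_words=False to wrap(), in which case it will return the whole word. If you want this behaviour, change the last line to: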

    lines = textwrap.wrap(text, max_size-3, break_long_words=False)
    return lines[0] + ("..." if len(lines)>1 else "")

Depending on the exact behaviour you want, there may also be other options (such as expand_tabs) that are of interest.
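
As a rough illustration of those two options, the sketch below compares the hard and soft limit on an over-long word and shows the effect of `expand_tabs`. The sample strings and the width of 10 are assumptions chosen for the demonstration.

```python
import textwrap

long_word = "Pneumonoultramicroscopic dust"

# Default (break_long_words=True): the long word is split, so no line
# exceeds the width and the limit is hard.
hard = textwrap.wrap(long_word, 10)

# break_long_words=False: the long word is kept whole on its own line,
# so the limit is soft.
soft = textwrap.wrap(long_word, 10, break_long_words=False)

print(hard[0])  # Pneumonoul
print(soft[0])  # Pneumonoultramicroscopic

# expand_tabs=True (the default) expands a tab to spaces before wrapping;
# with expand_tabs=False the tab is simply replaced by one space.
expanded = textwrap.wrap("a\tb", 20)
collapsed = textwrap.wrap("a\tb", 20, expand_tabs=False)

print(len(expanded[0]))  # 9
print(collapsed[0])      # a b
```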

Brian answered 2020-08-07T00:20:56Z

7 votes

import re

def smart_truncate1(text, max_length=100, suffix='...'):
    """Returns a string of at most `max_length` characters, cutting
    only at word-boundaries. If the string was truncated, `suffix`
    will be appended.
    """
    if len(text) > max_length:
        pattern = r'^(.{0,%d}\S)\s.*' % (max_length-len(suffix)-1)
        return re.sub(pattern, r'\1' + suffix, text)
    else:
        return text

OR

def smart_truncate2(text, min_length=100, suffix='...'):
    """If the `text` is more than `min_length` characters long,
    it will be cut at the next word-boundary and `suffix` will
    be appended.
    """
    pattern = r'^(.{%d,}?\S)\s.*' % (min_length-1)
    return re.sub(pattern, r'\1' + suffix, text)

OR

def smart_truncate3(text, length=100, suffix='...'):
    """Truncates `text`, on a word boundary, as close to
    the target length as it can come.
    """
    slen = len(suffix)
    pattern = r'^(.{0,%d}\S)\s+\S+' % (length-slen-1)
    if len(text) > length:
        match = re.match(pattern, text)
        if match:
            length0 = match.end(0)
            length1 = match.end(1)
            if abs(length0+slen-length) < abs(length1+slen-length):
                return match.group(0) + suffix
            else:
                return match.group(1) + suffix
    return text
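
To see how the first pattern finds its cut point, here is a minimal trace using the same pattern shape as `smart_truncate1`. The sample text and the limit of 20 are assumptions picked for illustration.

```python
import re

text = "The quick brown fox jumps over the lazy dog"
max_length = 20
suffix = "..."

# Group 1: up to 16 arbitrary characters followed by one non-space character,
# so it is at most max_length - len(suffix) characters long. The \s after the
# group forces the cut to sit right before whitespace, i.e. at the end of a word.
pattern = r'^(.{0,%d}\S)\s.*' % (max_length - len(suffix) - 1)

result = re.sub(pattern, r'\1' + suffix, text)
print(result)       # The quick brown...
print(len(result))  # 18
```

Because `.` does not match a newline by default, text containing line breaks would likely need `re.DOTALL` or prior whitespace normalisation; that is outside what the answer covers.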

Markus Jarderot answered 2020-08-07T00:22:16Z

6 votes

>>> import textwrap

>>> textwrap.wrap('The quick brown fox jumps over the lazy dog', 12)

['The quick', 'brown fox', 'jumps over', 'the lazy dog']

You just take the first element of that and you're done...
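
One way to package that idea is sketched below. The helper name `first_wrapped_line` is hypothetical; the guard is there because `textwrap.wrap` returns an empty list for an empty string, so indexing `[0]` directly would fail.

```python
import textwrap


def first_wrapped_line(text, width):
    # Hypothetical helper: return the first wrapped line, or '' when
    # wrap() produces no lines at all (e.g. for an empty string).
    lines = textwrap.wrap(text, width)
    return lines[0] if lines else ''


print(first_wrapped_line('The quick brown fox jumps over the lazy dog', 12))
# The quick
```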

Antonio answered 2020-08-07T00:22:36Z

3 votes

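def smart_truncate(s, width):
    # Strings that already fit are returned unchanged; this also avoids
    # an IndexError on s[width] for short strings.
    if len(s) <= width:
        return s
    if s[width].isspace():
        return s[0:width]
    else:
        return s[0:width].rsplit(None, 1)[0]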

Testing it:

>>> smart_truncate('The quick brown fox jumped over the lazy dog.', 23) + "..."

'The quick brown fox...'

Vebjorn Ljosa answered 2020-08-07T00:22:55Z

1 votes

In Python 3.4 and later you can use textwrap.shorten. With the OP's example:

>>> import textwrap

>>> original = "This is really awesome."

>>> textwrap.shorten(original, width=20, placeholder="...")

'This is really...'

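textwrap.shorten(text, width, **kwargs)

Collapse and truncate the given text to fit in the given width.

First the whitespace in text is collapsed (all whitespace is replaced by single spaces). If the result fits in the width, it is returned. Otherwise, enough words are dropped from the end so that the remaining words plus the placeholder fit within width.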
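
Both branches of that description can be seen in a short sketch. The messy input string and the two widths are assumptions chosen for illustration.

```python
import textwrap

messy = "This   is\treally\n awesome."

# Whitespace is collapsed first. The collapsed text is 23 characters long,
# so it fits in width=40 and is returned whole.
fits = textwrap.shorten(messy, width=40, placeholder="...")

# With width=20 the collapsed text no longer fits, so trailing words are
# dropped until the remainder plus the placeholder fits.
cut = textwrap.shorten(messy, width=20, placeholder="...")

print(fits)  # This is really awesome.
print(cut)   # This is really...
```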

marcanuy answered 2020-08-07T00:23:29Z
