Python string end marker: removing emojis from a string in Python

I found this code for removing emojis in Python, but it is not working. Can you suggest other code or a fix for it?

I have observed that all my emojis start with \xf, but when I try str.startswith("\xf") I get an invalid character error.
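A likely reason for the invalid character error is that a `\x` escape in a Python string literal must be followed by exactly two hex digits, so `"\xf"` is rejected before `startswith` ever runs. The sequence `\xf0\x9f\x98\x82` shown in the question is the UTF-8 encoding of a single code point. The sketch below assumes Python 3 (the question itself uses Python 2.7), and `raw` is a hypothetical name:

```python
# The four bytes from the question are the UTF-8 encoding of one emoji.
raw = b'\xf0\x9f\x98\x82'

# A prefix test works on bytes once the escape has two hex digits.
starts_with_f0 = raw.startswith(b'\xf0')

# Decoding yields a single code point, U+1F602.
decoded = raw.decode('utf-8')
code_point = hex(ord(decoded))

print(starts_with_f0, len(decoded), code_point)
```

Testing for a leading `\xf0` byte only detects four-byte UTF-8 sequences in general, not emojis specifically, so it is a rough check at best.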

emoji_pattern = r'/[x{1F601}-x{1F64F}]/u'

re.sub(emoji_pattern, '', word)

Here's the error:

Traceback (most recent call last):
  File "test.py", line 52, in <module>
    re.sub(emoji_pattern,'',word)
  File "/usr/lib/python2.7/re.py", line 151, in sub
    return _compile(pattern, flags).sub(repl, string, count)
  File "/usr/lib/python2.7/re.py", line 244, in _compile
    raise error, v # invalid expression
sre_constants.error: bad character range

Each of the items in a list can be a word ['This', 'dog', '\xf0\x9f\x98\x82', 'https://t.co/5N86jYipOI']

UPDATE:

I used this other code:

emoji_pattern=re.compile(ur" " " [\U0001F600-\U0001F64F] # emoticons \

|\

[\U0001F300-\U0001F5FF] # symbols & pictographs\

|\

[\U0001F680-\U0001F6FF] # transport & map symbols\

|\

[\U0001F1E0-\U0001F1FF] # flags (iOS)\

" " ", re.VERBOSE)

emoji_pattern.sub('', word)

But this still doesn't remove the emojis; they are still shown. Any clue why that is?
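One plausible cause, assuming the words are stored as UTF-8 byte strings as in the list above, is that the pattern is written in terms of Unicode code points and so never matches raw bytes. A minimal sketch of decoding before substituting, written for Python 3 with the hypothetical names `raw_words` and `cleaned`:

```python
import re

# Same four ranges as in the question's pattern, as one character class.
emoji_pattern = re.compile(
    "["
    "\U0001F600-\U0001F64F"  # emoticons
    "\U0001F300-\U0001F5FF"  # symbols & pictographs
    "\U0001F680-\U0001F6FF"  # transport & map symbols
    "\U0001F1E0-\U0001F1FF"  # flags
    "]+"
)

# Hypothetical token list, stored as UTF-8 bytes as in the question.
raw_words = [b'This', b'dog', b'\xf0\x9f\x98\x82', b'https://t.co/5N86jYipOI']

# Decode first, so the regex sees code points rather than bytes.
words = [w.decode('utf-8') for w in raw_words]
cleaned = [emoji_pattern.sub('', w) for w in words]

# Drop tokens that became empty once the emoji was removed.
cleaned = [w for w in cleaned if w]
print(cleaned)
```

On Python 2.7 the equivalent step would be calling `.decode('utf-8')` on each `str` to obtain a `unicode` object before passing it to the compiled pattern.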


Solution

I am updating my answer to follow the one by @jfs, because my previous answer failed to account for other Unicode characters, such as Latin or Greek letters outside ASCII. Stack Overflow doesn't allow me to delete my previous answer, so I am updating it to match the most widely accepted answer to the question.

#!/usr/bin/env python

import re

text = u'This is a smiley face \U0001f602'

print(text) # with emoji

def deEmojify(text):
    regrex_pattern = re.compile(pattern = "["
        u"\U0001F600-\U0001F64F" # emoticons
        u"\U0001F300-\U0001F5FF" # symbols & pictographs
        u"\U0001F680-\U0001F6FF" # transport & map symbols
        u"\U0001F1E0-\U0001F1FF" # flags (iOS)
        "]+", flags = re.UNICODE)
    return regrex_pattern.sub(r'',text)

print(deEmojify(text))
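As a quick check of what this character class does and does not cover, the sketch below restates the function for Python 3 and runs it on two hypothetical inputs. The four ranges are not an exhaustive list of emoji: U+1F914, for example, sits in the 1F900-1F9FF block and is left untouched.

```python
import re

def deEmojify(text):
    # Same character class as in the answer above.
    pattern = re.compile(
        "["
        "\U0001F600-\U0001F64F"  # emoticons
        "\U0001F300-\U0001F5FF"  # symbols & pictographs
        "\U0001F680-\U0001F6FF"  # transport & map symbols
        "\U0001F1E0-\U0001F1FF"  # flags
        "]+",
        flags=re.UNICODE,
    )
    return pattern.sub('', text)

# Non-ASCII letters are outside the four ranges, so they survive.
kept = deEmojify('caf\u00e9 \u03b1\u03b2\u03b3 \U0001F602')

# U+1F914 lies in the 1F900-1F9FF block, which the class does not list.
missed = deEmojify('hmm \U0001F914')

print(repr(kept))
print(repr(missed))
```

If broader coverage is needed, further ranges can be appended to the class in the same way, at the cost of maintaining the list by hand.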

This was my previous answer. Do not use it, since it strips every non-ASCII character, not just emojis.

def deEmojify(inputString):
    return inputString.encode('ascii', 'ignore').decode('ascii')
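To illustrate why the ASCII round trip is too aggressive, here is a small sketch in Python 3. The function name `strip_non_ascii` and the sample strings are hypothetical; the technique is the same `encode('ascii', 'ignore')` call as above.

```python
def strip_non_ascii(input_string):
    # Same technique as the previous answer: drop everything that is not ASCII.
    return input_string.encode('ascii', 'ignore').decode('ascii')

# The emoji is removed, as intended.
no_emoji = strip_non_ascii('smile \U0001F602')

# Accented letters are removed too, which corrupts ordinary text.
damaged = strip_non_ascii('caf\u00e9 na\u00efve')

print(repr(no_emoji))
print(repr(damaged))
```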
