The 'M-BM-' character

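Convert it with the unicode command:

This gives the hexadecimal Unicode escape \u2013.

Or:

Copy the symbol to convert (–) into the upper box and click “转换成unicode” ("Convert to Unicode"); the lower box then shows the corresponding hexadecimal Unicode escape beginning with \u, here \u2013. With this code you can type the symbol: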

Press Ctrl+Shift+u, release, type 2013, and press Enter; the character – appears.

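Tool used: Unicode and Chinese conversion | hexadecimal Unicode encoding online conversion | backslash-u (\u) encoding | Java escape-character decoding (站长工具)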

On Linux, you can use this code to print the symbol:
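
The command itself did not survive at this point in the post. A minimal sketch of one way to do it, assuming bash 4.2 or later (whose printf builtin understands \uHHHH) and a UTF-8 locale; the helper name en_dash is made up for illustration, and its octal form writes the UTF-8 bytes e2 80 93 directly, so it does not depend on the locale:

```shell
# Assumption: bash >= 4.2 with a UTF-8 locale, otherwise \u2013 is not expanded.
printf '\u2013\n'

# Locale-independent alternative: emit the UTF-8 bytes of U+2013 (e2 80 93) in octal.
# en_dash is a hypothetical helper name.
en_dash() { printf '\342\200\223'; }
en_dash; echo
```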

It is used to replace:

The M-BM- characters are the ASCII representation (the notation printed by cat -v) of the byte sequence 0xc2 0xa0, which is the UTF-8 encoding of Unicode character U+00A0, the non-breaking space. This character can be inserted in both LibreOffice and Microsoft Word documents using the key sequence Ctrl+Shift+SPACE.
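
As a worked check of where 0xc2 0xa0 and the "M-B", "M- " notation come from, the arithmetic can be done in the shell. This is only a sketch: the variable names cp, b1 and b2 are illustrative, and the two-byte formula applies to code points U+0080 through U+07FF.

```shell
cp=0xA0   # code point U+00A0 (non-breaking space)

# Two-byte UTF-8 layout for U+0080..U+07FF: 110xxxxx 10xxxxxx
b1=$(( 0xC0 | (cp >> 6) ))      # leading byte: top bits of the code point
b2=$(( 0x80 | (cp & 0x3F) ))    # continuation byte: low six bits
printf '%02x %02x\n' "$b1" "$b2"                        # c2 a0

# cat -v writes a byte >= 0x80 as "M-" followed by the byte with its high bit cleared
printf '%02x %02x\n' $(( b1 & 0x7F )) $(( b2 & 0x7F ))  # 42 20, i.e. "B" and a space
```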

For example, if we create a new .odt document in LibreOffice, type ABC, press Ctrl+Shift+SPACE, type DEF, then choose Save As... Text (ignoring the warning that the document may contain features that cannot be saved in that format), and view the resulting .txt file with cat:

$ cat nbsp.txt 
ABC DEF

and then again with the -v switch to show non-printing characters

$ cat -v nbsp.txt 
M-oM-;M-?ABCM-BM- DEF

Note that we also get an initial sequence M-oM-;M-?, or hexadecimal 0xef 0xbb 0xbf, which is the UTF-8 byte order mark (BOM). This is consistent with the file type reported by the file command, i.e.

$ file nbsp.txt 
nbsp.txt: UTF-8 Unicode (with BOM) text

Using od to print the hexadecimal values in byte order we see

$ od -tx1 nbsp.txt
0000000 ef bb bf 41 42 43 c2 a0 44 45 46 0a
0000014
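
To reproduce this file without LibreOffice, the same twelve bytes can be written with printf. A minimal sketch using POSIX octal escapes (357 273 277 is the BOM, 302 240 the non-breaking space); note that it writes nbsp.txt in the current directory:

```shell
# BOM (ef bb bf), "ABC", NBSP (c2 a0), "DEF", newline
printf '\357\273\277ABC\302\240DEF\n' > nbsp.txt

cat -v nbsp.txt      # M-oM-;M-?ABCM-BM- DEF
od -tx1 nbsp.txt     # ef bb bf 41 42 43 c2 a0 44 45 46 0a
```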

It is possible to manipulate these characters using standard tools like sed or tr by specifying the byte values as escape sequences (GNU sed accepts \xHH hex escapes; tr takes octal escapes and works byte by byte), e.g. to replace the non-breaking space with a plain ASCII space

$ sed 's/\xc2\xa0/ /g' nbsp.txt
ABC DEF

Checking again with od confirms the replacement by an ordinary ASCII space 0x20 (decimal 32)

$ sed 's/\xc2\xa0/ /g' nbsp.txt | od -tx1
0000000 ef bb bf 41 42 43 20 44 45 46 0a
0000013
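
The output above still begins with the BOM. If that should go as well, a second expression can strip it from the first line. This is a sketch that assumes GNU sed (for the \xHH escapes); clean_text is a hypothetical helper name, and LC_ALL=C is set so that sed matches raw bytes regardless of the current locale:

```shell
# Assumption: GNU sed. clean_text is an illustrative name, not an existing tool.
clean_text() {
    # 1st expression: drop a leading BOM on line 1; 2nd: NBSP -> ASCII space
    LC_ALL=C sed -e '1s/^\xef\xbb\xbf//' -e 's/\xc2\xa0/ /g' "$@"
}

printf '\357\273\277ABC\302\240DEF\n' | clean_text | od -tx1
```

With no arguments the function filters standard input; file names can be passed through as with sed itself.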

In gnome-terminal (and maybe other UTF-8-aware terminal emulators), it's also possible to enter the Unicode code point value directly using the key sequence Ctrl+Shift+u followed by a hexadecimal value and then the Enter key. The sequence shows up initially as u̲.̲.̲.̲ but the character should compose when you hit Enter, e.g. for the same non-breaking space replacement we can do

$ sed 's/<Ctrl+Shift+u>a0

which displays as

$ sed 's/u̲a̲0̲

and then completes as (the first blank in the pattern is now a non-breaking space)

$ sed 's/ / /g' nbsp.txt
ABC DEF

Using cat -v we can confirm the M-BM- sequence has become an ordinary space

$ sed 's/ / /g' nbsp.txt | cat -v
M-oM-;M-?ABC DEF
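
In a script, where typing an invisible character is error-prone, the non-breaking space can instead be built from its bytes and spliced into the pattern. A minimal sketch, with nbsp and replace_nbsp as illustrative names; it relies only on printf octal escapes, so it does not need GNU-specific \xHH support in sed:

```shell
# Build the NBSP once from its UTF-8 bytes (octal 302 240)
nbsp=$(printf '\302\240')

# Hypothetical helper: replace every NBSP with an ASCII space
replace_nbsp() { sed "s/${nbsp}/ /g" "$@"; }

printf 'ABC\302\240DEF\n' | replace_nbsp | cat -v    # ABC DEF
```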

You may want to look at more generic encoding converters such as iconv and uconv as well.
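
As a small illustration of what iconv does with this character, the sketch below re-encodes UTF-8 input as ISO-8859-1, where U+00A0 is the single byte a0 rather than the pair c2 a0. The target encoding and the helper name to_latin1 are chosen only for the example:

```shell
# Hypothetical helper: re-encode UTF-8 text as ISO-8859-1
to_latin1() { iconv -f UTF-8 -t ISO-8859-1 "$@"; }

printf 'ABC\302\240DEF\n' | to_latin1 | od -tx1    # 41 42 43 a0 44 45 46 0a
```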
