Linux grep: Matching the World's Languages with Regular Expressions

pl在之心

Category: Common Linux Commands

Article link: https://blog.csdn.net/u010627840/article/details/103312037

The Unicode standard places each assigned code point (character) into one script. A script is a group of code points used by a particular human writing system. Some scripts like Thai correspond with a single human language. Other scripts like Latin span multiple languages.

Some languages are composed of multiple scripts. There is no Japanese Unicode script. Instead, Unicode offers the Hiragana, Katakana, Han, and Latin scripts that Japanese documents are usually composed of.
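
Because there is no single Japanese script, a pattern for Japanese text has to combine several script properties in one character class. The following is a minimal sketch, assuming GNU grep built with PCRE support and an installed UTF-8 locale named C.UTF-8; the sample lines and variable names are made up for illustration.

```shell
# Assumption: the C.UTF-8 locale is available; any other UTF-8 locale should also work.
export LC_ALL=C.UTF-8

# Hypothetical sample lines: English, Japanese (kana + kanji), katakana, Chinese.
sample_lines=$(printf '%s\n' 'hello world' 'こんにちは世界' 'カタカナ text' '你好世界')

# Match any of the three scripts that Japanese text is usually written in.
# Han also covers Chinese, so the Chinese line matches here too.
cjk_lines=$(printf '%s\n' "$sample_lines" | grep -P '[\p{Hiragana}\p{Katakana}\p{Han}]')

# Requiring at least one kana character is a rough way to keep only
# lines that look Japanese rather than Chinese.
kana_lines=$(printf '%s\n' "$sample_lines" | grep -P '[\p{Hiragana}\p{Katakana}]')

printf 'any Japanese-related script:\n%s\n' "$cjk_lines"
printf 'contains kana:\n%s\n' "$kana_lines"
```

The kana-only pattern is a heuristic: a Japanese line written entirely in kanji would be missed by it.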

A special script is the Common script. This script contains all sorts of characters that are common to a wide range of scripts. It includes all sorts of punctuation, whitespace and miscellaneous symbols.
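
One practical consequence is that punctuation inside Chinese text belongs to Common, not Han, so a pattern that requires a whole line to be Han will reject ordinary sentences. A small sketch, again assuming GNU grep with PCRE support and a C.UTF-8 locale (the sample line is invented):

```shell
# Assumption: the C.UTF-8 locale is available; any other UTF-8 locale should also work.
export LC_ALL=C.UTF-8

# A Chinese sentence containing a fullwidth comma and an ideographic full stop.
line='你好，世界。'

# Count lines made up of Han characters only: the punctuation makes this fail.
han_only=$(printf '%s\n' "$line" | grep -cP '^\p{Han}+$')

# Allowing Common as well accepts the punctuation.
han_or_common=$(printf '%s\n' "$line" | grep -cP '^[\p{Han}\p{Common}]+$')

printf 'Han only: %s\nHan or Common: %s\n' "$han_only" "$han_or_common"
```

Note that Common also includes ASCII digits, spaces and symbols, so the second pattern accepts those too.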

All assigned Unicode code points (those matched by \P{Cn}) are part of exactly one Unicode script. All unassigned Unicode code points (those matched by \p{Cn}) are not part of any Unicode script at all.

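The JGsoft engine, Perl, PCRE, PHP, Ruby 1.9, Delphi, and XRegExp can match Unicode scripts. GNU grep's -P option uses PCRE, which is why the \p{...} script syntax works in the grep commands below. The script names are listed in section 2.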

1. Matching Chinese

grep -P '\p{Han}' test.log
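
To try the command without a real log file, the sketch below builds a small throwaway file and runs the same pattern against it. It assumes GNU grep with PCRE support and a C.UTF-8 locale; the file contents and variable names are hypothetical.

```shell
# Assumption: the C.UTF-8 locale is available; substitute e.g. en_US.UTF-8 if it is not.
export LC_ALL=C.UTF-8

# Hypothetical sample data standing in for test.log.
sample_log=$(mktemp)
printf '%s\n' 'error: disk full' '错误: 磁盘已满' 'warn: 温度 high' > "$sample_log"

# Print every line that contains at least one Han character.
han_lines=$(grep -P '\p{Han}' "$sample_log")
printf '%s\n' "$han_lines"

# With -o, print only the matched runs of Han characters, one per line.
han_runs=$(grep -oP '\p{Han}+' "$sample_log")
printf '%s\n' "$han_runs"

rm -f "$sample_log"
```

If grep runs in a non-UTF-8 locale such as plain C, the input is treated as bytes and \p{Han} will generally not match, which is why the locale is set explicitly here.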

2. Matching other languages

\p{Common} \p{Arabic} \p{Armenian} \p{Bengali} \p{Bopomofo} 
\p{Braille} \p{Buhid} \p{Canadian_Aboriginal} \p{Cherokee} 
\p{Cyrillic} \p{Devanagari} \p{Ethiopic} \p{Georgian} \p{Greek} 
\p{Gujarati} \p{Gurmukhi} \p{Han} \p{Hangul} \p{Hanunoo} \p{Hebrew} 
\p{Hiragana} \p{Inherited} \p{Kannada} \p{Katakana} \p{Khmer} \p{Lao} 
\p{Latin} \p{Limbu} \p{Malayalam} \p{Mongolian} \p{Myanmar} \p{Ogham} 
\p{Oriya} \p{Runic} \p{Sinhala} \p{Syriac} \p{Tagalog} \p{Tagbanwa} 
\p{Tai_Le} \p{Tamil} \p{Telugu} \p{Thaana} \p{Thai} \p{Tibetan}
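
Any of these names can be dropped into the same grep -P command. As an illustration, the sketch below counts how many lines of a small sample file contain each of a few scripts. It assumes GNU grep with PCRE support and a C.UTF-8 locale; the sample text and variable names are invented.

```shell
# Assumption: the C.UTF-8 locale is available; any other UTF-8 locale should also work.
export LC_ALL=C.UTF-8

# Hypothetical sample file with one line per script.
sample=$(mktemp)
printf '%s\n' 'Привет, мир' 'Γειά σου Κόσμε' 'สวัสดี' 'hello' > "$sample"

# For each script name, count the lines containing at least one
# character of that script (grep -c counts matching lines).
script_counts=$(
    for script in Cyrillic Greek Thai Latin; do
        printf '%s: %s\n' "$script" "$(grep -cP "\\p{$script}" "$sample")"
    done
)
printf '%s\n' "$script_counts"

rm -f "$sample"
```

The doubled backslash is needed only because the pattern is inside double quotes so that $script can be expanded; grep still receives \p{Cyrillic} and so on.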
