ECMAScript5 P5 (Regular Expressions)

今天你四百冷了吗

Tags: regular expressions, javascript, front-end

Article link: https://blog.csdn.net/weixin_48931100/article/details/126503414

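What is a regular expression

A regular expression defines a rule (a pattern) for strings. The engine can use it to check whether a string conforms to the rule and to extract the parts of a string that do.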

Creating a regular expression

Literal syntax

Write the pattern between a pair of forward slashes
Syntax: /pattern/flags

// var reg = /pattern/flags;
var reg1 = /hello/g

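Constructor syntax

Create an instance with the RegExp constructor, e.g. new RegExp('abc')
The pattern argument is a string literal or a variable holding a string

// var reg2 = new RegExp('pattern', 'flags')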
var reg2 = new RegExp('hello','g')
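
The constructor form is mainly useful when the pattern is only known at run time. The sketch below is an illustration with made-up variable names, not part of the original article; it also shows that a backslash has to be doubled when the pattern is written as a string.

```javascript
// `keyword` is a hypothetical value used only for this illustration
var keyword = 'hello';
var fromVariable = new RegExp(keyword, 'g');

// Inside a string literal the backslash must itself be escaped:
// the string '\\d+' holds the pattern \d+
var digitsFromString = new RegExp('\\d+');
var digitsFromLiteral = /\d+/;

console.log(fromVariable.source);  // hello
console.log(fromVariable.global);  // true
console.log(digitsFromString.source === digitsFromLiteral.source); // true
```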

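Character categories

Ordinary characters

Letters, digits, underscores, Chinese characters, and symbols with no special meaning (, ; ! @ and so on)

In practice, any character that is not a special character is an ordinary character

Special characters

\: escapes a special character so that it is matched literally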
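
As a small illustration of escaping (the variable names are invented for this sketch), compare how + and . behave with and without a leading backslash:

```javascript
// Unescaped, + is a quantifier: /1+1/ needs at least two consecutive 1s
var unescapedPlus = /1+1/.test('1+1');   // false
// Escaped, \+ matches a literal plus sign
var escapedPlus = /1\+1/.test('1+1');    // true

// Unescaped, . matches any character except a line break
var unescapedDot = /a.c/.test('abc');    // true
// Escaped, \. matches only a literal dot
var escapedDot = /a\.c/.test('abc');     // false

console.log(unescapedPlus, escapedPlus, unescapedDot, escapedDot);
```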

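Flags (pattern modifiers)

i: ignoreCase, ignore case when matching

m: multiline, ^ and $ also match at the start and end of each line

g: global, find all matches rather than stopping at the first

In a literal, the flags are written after the closing slash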
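
A quick way to see what each flag changes is to run the same pattern with and without it. This is only a sketch; the sample text and variable names are assumptions made for the illustration.

```javascript
var sample = 'Hello\nhello';

// i: case is ignored
var withoutI = /hello/.test('HELLO');   // false
var withI = /hello/i.test('HELLO');     // true

// m: ^ also matches right after a line break
var withoutM = /^hello/.test(sample);   // false, the input starts with 'Hello'
var withM = /^hello/m.test(sample);     // true, the second line starts with 'hello'

// g: match() returns every match instead of only the first
var allMatches = sample.match(/hello/gi);

console.log(withoutI, withI, withoutM, withM, allMatches.length);
```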

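RegExp instance methods

exec

Searches a string for a match of the regular expression
Return value:
On a match, an array
Format of the returned array:
[matched text, index: start position of the match in str, input: the string that was searched, groups: undefined]
On no match, null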

var str = 'hello world hello';
var reg1 = /hello/;
var reg2 = /hello/g;
var reg3 = /exe/g;
console.log(reg1.exec(str)); 
//[ 'hello', index: 0, input: 'hello world hello', groups: undefined ]
console.log(reg2.exec(str)); 
//[ 'hello', index: 0, input: 'hello world hello', groups: undefined ]
console.log(reg3.exec(str)); // null

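// With the global flag, exec can be called in a loop to print every match
reg2.lastIndex = 0; // start from the beginning again, since reg2 was already used above
while(true){
	var result = reg2.exec(str);
	if(!result){
		break;
	}
	console.log(result[0],result["index"],reg2.lastIndex);
}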

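Notes:

If the regular expression has the "g" flag, the instance keeps its lastIndex property up to date with the position where the next search should start; a second call to exec resumes from lastIndex.
Without the "g" flag, lastIndex is not updated, and every call searches from the start of the string.

var str = 'hello html hello js hello css'
// No global flag: lastIndex stays 0, so every call starts matching again from position 0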
var reg1 = /hello/
// Global flag enabled
var reg2 = /hello/g

// Without the global flag
console.log(reg1.lastIndex);//0
console.log(reg1.exec(str));
//['hello',index: 0,input: 'hello html hello js hello css',groups: undefined]
console.log(reg1.lastIndex);//0

console.log(reg1.lastIndex);//0
console.log(reg1.exec(str));
//['hello',index: 0,input: 'hello html hello js hello css',groups: undefined]
console.log(reg1.lastIndex);//0


// With the global flag
console.log(reg2.lastIndex);//0
console.log(reg2.exec(str));
//['hello',index: 0,input: 'hello html hello js hello css',groups: undefined]
console.log(reg2.lastIndex);//5

console.log(reg2.lastIndex);//5
console.log(reg2.exec(str));
//['hello',index: 11,input: 'hello html hello js hello css',groups: undefined]
console.log(reg2.lastIndex);//16

console.log(reg2.lastIndex);//16
console.log(reg2.exec(str));
//['hello',index: 20,input: 'hello html hello js hello css',groups: undefined]
console.log(reg2.lastIndex);//25

console.log(reg2.lastIndex);//25
console.log(reg2.exec(str)); // null
console.log(reg2.lastIndex); // 0

test

Tests whether the given string contains text that matches the regular expression. Returns true if it does, false otherwise.

var str = 'hello world';
var reg1 = /world/;
var reg2 = /Regex/;
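console.log(reg1.test(str)); // true
console.log(reg2.test(str)); // false

Notes:

If the regular expression has the "g" flag, the instance keeps its lastIndex property up to date with the position where the next search should start; a second call to test resumes from lastIndex.
Without the "g" flag, lastIndex is not updated, and every call searches from the start of the string.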
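
This behaviour is a common source of surprises when a global regular expression is reused. The following sketch (variable names are assumptions) calls test three times on the same input:

```javascript
var globalReg = /world/g;
var input = 'hello world';

var first = globalReg.test(input);    // true
var afterFirst = globalReg.lastIndex; // 11, just past 'world'

// The second call starts at index 11, finds nothing and resets lastIndex to 0
var second = globalReg.test(input);   // false
var afterSecond = globalReg.lastIndex; // 0

var third = globalReg.test(input);    // true again

console.log(first, afterFirst, second, afterSecond, third);
```

If a plain yes/no answer is all that is needed, leaving out the g flag avoids the issue.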

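toString/toLocaleString

Return the regular expression as a string in literal form. toLocaleString is meant to return a locale-specific string, but for regular expressions in JS it gives the same result

var reg1 = /hello/;
console.log(reg1.toString()); // the string '/hello/'
console.log(reg1.toLocaleString()); // the string '/hello/'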

valueOf

Returns the regular expression object itself

var reg3 = /hello/;
console.log(reg3.valueOf());// /hello/

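RegExp instance properties

lastIndex

Without the global flag, this property stays 0 (unless it is assigned manually)

With the global flag, each exec/test call moves lastIndex to the position just after the matched text. Once nothing more can be matched from that position, the next exec returns null (test returns false), lastIndex is reset to 0, and matching starts over from the beginning of the string

In other words, lastIndex is the starting point of each search

ignoreCase, global, multiline

Indicate whether the regular expression has the ignore-case, global, and multiline flags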

var reg1 = /hello/i;
console.log(reg1.ignoreCase); //true
console.log(reg1.global); //false
console.log(reg1.multiline);  //false

source

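Returns the pattern text of the regular expression, without the enclosing slashes and flags (toString, by contrast, includes both)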

var reg1 = /hello/igm;
console.log(reg1.source); //hello

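Regular expression syntax: metacharacters

Literal characters

[Image not available: table of literal characters]

Character sets

Square brackets match any one character from a set or range:

[abc]	Matches any one of the characters between the brackets

var str = 'abc qwe abd'
var reg1 = /[abc]/;// true as long as the string contains a, b, or c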
console.log(reg1.test(str)); //true

[0-9]	Matches any digit from 0 to 9

var str = 'abc qwe abd1'
var reg1 = /[0-9]/igm;
console.log(reg1.test(str)); //true

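[^xyz] A negated (complemented) character set. It matches any character that is not listed inside the brackets. A hyphen '-' can be used here as well to specify a range of characters.

Note: ^ written inside [] negates the set

var str = 'abc qwe abd1,2'
console.log(str);
var reg1 = /[^abc ]/igm;
console.log(reg1.exec(str));
//['q',index: 4,input: 'abc qwe abd1,2',groups: undefined]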

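Boundaries

^ Matches the start of the input. If the multiline flag is set, it also matches immediately after a line break.
$ Matches the end of the input. If the multiline flag is set, it also matches immediately before a line break.
When ^ and $ are used together, the whole string must match the pattern exactly.

var rg = /abc/; 
// /abc/ returns true for any string that contains abc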
console.log(rg.test('abc'));  //true
console.log(rg.test('abcd')); //true
console.log(rg.test('aabcd'));//true
console.log('---------------------------');
// Only strings that start with abc match
var reg = /^abc/;
console.log(reg.test('abc')); // true
console.log(reg.test('abcd')); // true
console.log(reg.test('aabcd')); // false
console.log('---------------------------');
// Only strings that end with abc match
var reg = /abc$/;
console.log(reg.test('abc')); // true
console.log(reg.test('qweabc')); // true
console.log(reg.test('aabcd')); // false
console.log('---------------------------');
var reg1 = /^abc$/; // exact match: only the string abc is accepted
console.log(reg1.test('abc')); // true
console.log(reg1.test('abcd')); // false
console.log(reg1.test('aabcd')); // false
console.log(reg1.test('abcabc')); // false

Using character sets together with "^" and "$"

// One of three: true only for the single letter a, b, or c
var rg1 = /^[abc]$/; 
console.log(rg1.test('aa'));//false
console.log(rg1.test('a'));//true
console.log(rg1.test('b'));//true
console.log(rg1.test('c'));//true
console.log(rg1.test('abc'));//false
// true for any one of the 26 lowercase letters; - denotes the range a to z
var reg = /^[a-z]$/ 
console.log(reg.test('a'));//true
console.log(reg.test('z'));//true
console.log(reg.test('A'));//false
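// Combining ranges
// true for any single letter (upper or lower case) or digit
var reg1 = /^[a-zA-Z0-9]$/; 
// Negation: ^ inside the brackets negates the set, so any single character listed in the brackets returns false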
var reg2 = /^[^a-zA-Z0-9]$/;
console.log(reg2.test('a'));//false
console.log(reg2.test('B'));//false
console.log(reg2.test(8));//false
console.log(reg2.test('!'));//true

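\b Matches a zero-width word boundary: the position between a word character (\w) and a non-word character such as a space, or between a word character and the start or end of the string.
\B Matches a zero-width non-word boundary, the opposite of "\b".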

var str = 'Hello World Hello JavaScript';
var reg1 = /\bHello\b/g;
var reg2 = /\BScrip\B/g;
console.log(reg1.exec(str));
//['Hello',index: 0,input: 'Hello World Hello JavaScript',groups: undefined]
console.log(reg2.exec(str));
//['Scrip',index: 22,input: 'Hello World Hello JavaScript',groups: undefined]

Character classes

[Image not available: table of character classes]
Mnemonics from the English words:

d ==> digit
s ==> space (whitespace)
w ==> word

var str = '\nHello World Hello\r JavaScript';
console.log(str);
var reg1 = /./g;
console.log(reg1.exec(str));
//['H',index: 1,input: '\nHello World Hello\r JavaScript',groups: undefined]

var str = '123Hello World Hello 123JavaScript';
var reg1 = /^\d/g;
console.log(reg1.exec(str));
//['1',index: 0,input: '123Hello World Hello 123JavaScript',groups: undefined]

var str = 'Hello World Hello 123JavaScript';
console.log(str);
var reg1 = /^\D/g;
console.log(reg1.exec(str));
//['H',index: 0,input: 'Hello World Hello 123JavaScript',groups: undefined]

var str = '!Hello World Hello JavaScript';
// \w -> [a-zA-Z0-9_]
var reg1 = /^\w/;
console.log(reg1.test(str));
// false, because the string starts with '!'
// \W -> [^a-zA-Z0-9_]
var reg2 = /^\W/;
console.log(reg2.test(str));
// true

var str = '\nHello World Hello 123JavaScript';
console.log(str);
var reg1 = /^\s/g;
console.log(reg1.exec(str));
//['\n',index: 0,input: '\nHello World Hello 123JavaScript',groups: undefined]

Quantifiers

[Image not available: table of quantifiers]

var reg = /ab{2}/;
console.log(reg.test("ab"));//false

var reg = /a{3,}/;
console.log(reg.test("ab"));//false

var reg = /ab*/;
console.log(reg.test("ab"));//true
var reg1 = new RegExp(/^a*$/);
console.log(reg1.test("a")); // true
console.log(reg1.test("")); // true

var reg = /a?/;
console.log(reg.test("ab"));//true
var reg3 = new RegExp(/^a?$/);
console.log(reg3.test("a")); // true
console.log(reg3.test("")); // true
console.log(reg3.test("aaa")); // false

var reg2 = new RegExp(/^a+$/);
console.log(reg2.test("a")); // true
console.log(reg2.test("")); // false

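Repetition modes

Greedy mode (the default): a quantifier matches as many characters as possible. With the g flag, the next search continues on the remaining part of the string until nothing more matches.

// For the string "123456789", \d{3,6} matches 3 to 6 digits. It first takes six digits (123456), then searches the rest (789) and takes three digits; after that, fewer than three digits remain, so matching stops.
var str = "123456789";
var reg = /\d{3,6}/g;
console.log(reg.exec(str)); //[ '123456', index: 0, input: '123456789', groups: undefined ]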
console.log(reg.exec(str)); // [ '789', index: 6, input: '123456789', groups: undefined ]
console.log(reg.exec(str)); // null

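Non-greedy (lazy) mode: a quantifier matches as few characters as possible, and searching continues until nothing more matches
Usage: append ? to the quantifier

// For "123456789", \d{3,6}? takes only three digits each time: 123 first, then 456 and 789 from the remainder; once fewer than three digits remain, matching stops.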
var str = "123456789";
var reg = /\d{3,6}?/g;
console.log(reg.exec(str)); //[ '123', index: 0, input: '123456789', groups: undefined ]
console.log(reg.exec(str)); // [ '456', index: 3, input: '123456789', groups: undefined ]
console.log(reg.exec(str)); // [ '789', index: 6, input: '123456789', groups: undefined ]
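
The difference between the two modes is often easiest to see on markup-like text. The snippet below is a minimal sketch with an invented sample string:

```javascript
var html = '<b>bold</b> and <i>italic</i>';

// Greedy: .+ runs to the last '>' in the string
var greedy = html.match(/<.+>/)[0];

// Lazy: .+? stops at the first '>' it can reach
var lazy = html.match(/<.+?>/)[0];

console.log(greedy); // <b>bold</b> and <i>italic</i>
console.log(lazy);   // <b>
```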

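Alternation, grouping, and references

Alternation

The character "|" separates alternatives. Alternatives are tried from left to right until one matches; once an alternative on the left matches, those to its right are ignored, even if one of them would produce a better match.

var reg = /html|css|js/
console.log(reg.exec('qweqwehtmlcss')); // [ 'html', index: 6, input: 'qweqwehtmlcss', groups: undefined ]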

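Grouping

The following regular expression matches 'spylentspylentspylent'

/spylentspylentspylent/

With a group, it becomes:
A part of the pattern wrapped in parentheses is treated as a unit, called a group.

/(spylent){3}/

Alternatives within a group

A group can contain several alternative expressions, separated by |: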

var reg = /I Like (basketball|football|table tennis)/
console.log(reg.test('I Like basketball')); //true
console.log(reg.test('I Like football')); //true
console.log(reg.test('I Like table tennis')); //true

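Capturing and referencing

Strings matched (captured) by a regular expression are stored temporarily. Strings captured by groups are numbered starting from 1, so they can be referenced afterwards:

var reg = /(\d{4})-(\d{2})-(\d{2})/

var date = '2021-08-29'

reg.test(date)
// test/exec must run before the captures can be read
// (RegExp.$1 to RegExp.$9 are legacy static properties)
console.log(RegExp.$1); //2021
console.log(RegExp.$2); //08
console.log(RegExp.$3); //29

$1 refers to the first captured string, $2 to the second, and so on.

Capturing with nested groups

Groups are numbered in the order of their opening parentheses, from left to right: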

var reg = /((apple) is (a (fruit)))/
var str = "apple is a fruit"
reg.test(str) // true
RegExp.$1 // apple is a fruit
RegExp.$2 // apple
RegExp.$3 // a fruit
RegExp.$4 // fruit
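
The captured groups can also be read from the array returned by exec, which avoids relying on the global RegExp.$n properties. A minimal sketch, with variable names chosen only for this illustration:

```javascript
var dateReg = /(\d{4})-(\d{2})-(\d{2})/;
var parts = dateReg.exec('2021-08-29');

// parts[0] is the whole match, parts[1..3] are the groups
var year = parts[1];
var month = parts[2];
var day = parts[3];

console.log(parts[0]);          // 2021-08-29
console.log(year, month, day);  // 2021 08 29
```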

References

A captured group can also be referenced inside the regular expression itself; this is called a backreference:

var reg = /(\w{3}) is \1/
console.log(reg.test('kid is kid')); // true
console.log(reg.test('dik is dik')); // true
console.log(reg.test('kid is dik')); // false
console.log(reg.test('dik is kid')); // false

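\1 refers to the string captured by the first group; in other words, that part of the pattern is determined dynamically.

Note that if the number exceeds the number of groups, it is not treated as a backreference. In a pattern without the u flag, \6 is then read as a legacy octal escape, i.e. the character with code 6:

var reg = /(\w{3}) is \6/;
reg.test( 'kid is kid' ); // false
reg.test( 'kid is \6' );  // true (in non-strict code the string escape '\6' is also the character with code 6)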
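
One typical use of a backreference is spotting a word that was accidentally typed twice. This is a sketch under the assumption that words are separated by a single space; the names are invented:

```javascript
// (\w+) captures a word, \1 requires the same word again after one space
var doubledWord = /\b(\w+) \1\b/;

var hit = doubledWord.exec('this is is a typo');
var miss = doubledWord.test('this is a test');

console.log(hit[0], hit[1], hit.index); // is is is 5
console.log(miss);                      // false
```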

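String support for regular expressions

search

Searches the string for a match of the regular expression. Returns the index of the first match, or -1 if there is none

The global flag has no effect on the result

var str = 'hello world hello';
var reg = /hello/;
var reg2 = /hello/g;
console.log(str.search(reg)); // 0
console.log(str.search(reg2));// 0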
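
Because search reports a missing match with a number rather than null, callers usually compare the result against -1. A short sketch with assumed variable names:

```javascript
var sentence = 'hello world hello';

var foundAt = sentence.search(/world/);   // 6
var notFound = sentence.search(/regex/);  // -1

if (notFound === -1) {
	console.log('no match');
}
console.log(foundAt, notFound);
```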

match

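Finds the text that matches the regular expression and returns an array containing the matched text and its position

With the global flag, an array of all matching strings is returned at once

If the pattern contains groups, the array holds the matched text followed by the captured groups; with the global flag, however, the group captures are not included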

var str = 'hello world hello';
var reg1 = /hello/;
var reg2 = /hello/g;
var reg3 = /(he)llo/;
var reg4 = /(he)llo/g;
// Finds the matching text and returns an array with the matched text and its position
// [ 'hello', index: 0, input: 'hello world hello', groups: undefined ]
console.log(str.match(reg1));
// With the global flag, an array of all matching strings is returned at once
// [ 'hello', 'hello' ]
console.log(str.match(reg2));
// With a group in the pattern, the array holds the matched text and the captured group
// [
//   'hello',
//   'he',
//   index: 0,
//   input: 'hello world hello',
//   groups: undefined
// ]
console.log(str.match(reg3));
// With the global flag as well, the group captures are not included in the array
// [ 'hello', 'hello' ]
console.log(str.match(reg4));
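
When both every match and its groups are needed, one option that works in ES5 is to call exec in a loop on a global regular expression. The input string and names below are assumptions made for this sketch:

```javascript
var pairs = 'a=1, b=2, c=3';
var pairReg = /(\w)=(\d)/g;
var found = [];
var m;

// Each exec call resumes from lastIndex and returns the groups of one match
while ((m = pairReg.exec(pairs)) !== null) {
	found.push(m[1] + ':' + m[2]);
}

console.log(found); // [ 'a:1', 'b:2', 'c:3' ]
```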

split

// Split a string around a separator with split()
var str = "terry134briup156lisi zhangsan";
// Split wherever one or more digits occur
var reg = /\d+/;
var result = str.split(reg);
console.log(result); // [ 'terry', 'briup', 'lisi zhangsan' ]
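
To separate all four names, the separator pattern can accept either digits or whitespace. A sketch using alternation (the variable names are made up for this example):

```javascript
var raw = 'terry134briup156lisi zhangsan';

// Split on a run of digits or a run of whitespace
var names = raw.split(/\d+|\s+/);

console.log(names); // [ 'terry', 'briup', 'lisi', 'zhangsan' ]
```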

replace

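// Text that matches the regular expression is replaced
var str = 'javascript'
// Without the global flag only the first match is replaced; with it, all matches are replaced
var reg = /javascript/;
// replace(regular expression, replacement text)
var result = str.replace(reg, 'java');
console.log(result); //java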
console.log(str); //javascript
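
Two further points about replace can be sketched briefly: the effect of the g flag, and the use of $1, $2, ... in the replacement string to refer to captured groups. The sample strings and names are assumptions for this illustration.

```javascript
var text = 'java and javascript';

var firstOnly = text.replace(/java/, 'JAVA');  // only the first match
var everyOne = text.replace(/java/g, 'JAVA');  // all matches

// $1, $2, $3 in the replacement refer to the captured groups
var reordered = '2021-08-29'.replace(/(\d{4})-(\d{2})-(\d{2})/, '$3/$2/$1');

console.log(firstOnly);  // JAVA and javascript
console.log(everyOne);   // JAVA and JAVAscript
console.log(reordered);  // 29/08/2021
```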

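Lookahead

Regular expressions have a feature called lookahead, also known as a zero-width assertion:

Expression	Name	Description
(?=exp)	Positive lookahead	Matches a position that is followed by something matching exp
(?!exp)	Negative lookahead	Matches a position that is not followed by something matching exp

ES5 has no native support for lookbehind (it was added later, in ES2018), so it is not covered here. Let's look at what lookahead does:

var str = 'Hello, Hi, I am Hilary.';
// What must follow the match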
var reg = /H(?=i)/g;
var newStr = str.replace(reg, "T");
console.log(newStr);//Hello, Ti, I am Tilary.

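This demo shows the effect of positive lookahead: the character is "H" in every case, but only an "H" immediately followed by "i" is matched. Think of reg as a company with several "H" candidates applying; the company makes the skill "i" a hard requirement, so the "H" of "Hello" is naturally rejected.

What about negative lookahead? The principle is the same:

var str = 'Hello, Hi, I am Hilary.';
// What must not follow the match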
var reg = /H(?!i)/g;
var newStr = str.replace(reg, "T");
console.log(newStr);//Tello, Hi, I am Hilary.
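
Because a lookahead matches a position rather than characters, it is handy for inserting text without consuming anything. A commonly seen sketch is adding thousands separators to a string of digits; the function name below is hypothetical, and the input is assumed to contain digits only.

```javascript
// Insert a comma at every position that is followed by a multiple of three digits.
// \B keeps a comma from being placed at the very start of the number.
function addThousandsSeparators(digits) {
	return digits.replace(/\B(?=(\d{3})+(?!\d))/g, ',');
}

console.log(addThousandsSeparators('1234567')); // 1,234,567
console.log(addThousandsSeparators('123'));     // 123
```

Here the positive lookahead (?=(\d{3})+ ...) finds the insertion points, and the negative lookahead (?!\d) makes sure the groups of three are counted from the right-hand end.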