[Example] Extracting plain text from HTML with JS regular expressions, stripping styles and other tags

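Background

I recently had to filter rich text on paste, and since I had just been learning regular expressions, I used the problem as practice. The code below still has room for improvement, but I think it covers roughly 80% of paste-filtering cases.

Here is the code:

Implementation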

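// Strip every HTML tag from a fragment, keeping only the text.
const getParseText = (html: string): string => {
    const reg = /<.+?>/g;
    return html.replace(reg, '');
};

// Main function: rewrite each "<tag>content</tag>" run as a plain <p> paragraph.
const handlePasteText = (html: string): string => {
    // The lazy (.*?) stops at the first closing tag, and "." does not match newlines.
    const reg = /<.+?>(.*?)<\/.+?>/g;
    return html.replace(reg, (_all: string, content: string) => {
        const text = getParseText(content);
        return `<p>${text}</p>`;
    });
};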
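
To make the behavior concrete, here is a small self-contained run of the two functions. The sample HTML string is made up for illustration, and the functions are restated in compact form so the block works on its own.

```typescript
// Restated from the post so this block is self-contained.
const getParseText = (html: string): string => html.replace(/<.+?>/g, '');

const handlePasteText = (html: string): string =>
    html.replace(/<.+?>(.*?)<\/.+?>/g, (_all: string, content: string) =>
        `<p>${getParseText(content)}</p>`);

// Hypothetical pasted fragment: two flat elements, one with an inline style.
const flat = handlePasteText('<div style="color:red">Hello</div><span>World</span>');
console.log(flat); // <p>Hello</p><p>World</p>
```

Each flat element becomes its own paragraph, and the inline style disappears together with the original tag.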

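This is for reference only. There are surely cases it does not handle, but it should be good enough for basic paste filtering of text copied from web pages.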
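
Two cases that the first version does not handle are nested markup and content that spans several lines. The sketch below uses the same regular expressions as the post under a hypothetical name, `wrapInParagraphs`, with made-up inputs.

```typescript
// Same regexes as handlePasteText in the post, restated so this block runs on its own.
const wrapInParagraphs = (html: string): string =>
    html.replace(/<.+?>(.*?)<\/.+?>/g, (_all: string, content: string) =>
        `<p>${content.replace(/<.+?>/g, '')}</p>`);

// Nested markup: the lazy match ends at the first closing tag (</b>),
// so the trailing text and the outer </div> are left behind.
const nested = wrapInParagraphs('<div><b>Hi</b> there</div>');
console.log(nested); // <p>Hi</p> there</div>

// Multi-line content: "." does not match a newline, so nothing matches
// and the input comes back unchanged.
const multiline = wrapInParagraphs('<div>line1\nline2</div>');
console.log(multiline === '<div>line1\nline2</div>'); // true
```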

Feedback is welcome.

======== Update, evening of 2021/12/28 ========
A few more cases came up, so here is the updated code:

// Convert "<tag>content</tag>" runs inside a fragment.
const getParseText = (html: string): string => {
    const reg = /<(.+?)>(.*?)<\/.+?>/g;
    let flag = false; // set once any non-<strong> element has been wrapped in <p>
    const msg = html.replace(reg, (_all: string, tag: string, content: string) => {
        if (tag === 'strong') {
            // Only a bare <strong> (no attributes) matches; its content stays inline.
            return content;
        }
        const res = content.replace(/<.+?>/g, '');
        flag = true;
        return `<p>${res}</p>`;
    });
    // If nothing was wrapped, strip whatever tags are left and return plain text.
    return flag ? msg : msg.replace(/<.+?>/g, '');
};

// Main function
const handlePasteText = (html: string): string => {
    const reg = /<.+?>(.*?)<\/.+?>/g;
    const msg = html.replace(reg, (_all: string, content: string) => getParseText(content));
    // Every tag still present, opening or closing, becomes "<p>", so the output is only
    // usable where unclosed <p> tags are tolerated (HTML parsers close them automatically).
    return msg.replace(/<.+?>/g, '<p>');
};
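
As an alternative to matching tag pairs, here is a minimal sketch of a different technique, not the code above: turn block-level closing tags and `<br>` into line breaks, strip every remaining tag, and wrap each non-empty line in `<p>`. The name `htmlToParagraphs` and the list of block-level tags are assumptions chosen for illustration.

```typescript
// Hypothetical helper, not from the original post.
// Closing tags of common block elements, plus <br>, mark paragraph boundaries.
const BLOCK_BREAK = /<\/(?:p|div|li|h[1-6]|tr|blockquote)\s*>|<br\s*\/?>/gi;

const htmlToParagraphs = (html: string): string =>
    html
        .replace(BLOCK_BREAK, '\n')          // block boundaries become line breaks
        .replace(/<[^>]*>/g, '')             // drop every remaining tag; [^>] also spans newlines
        .split('\n')
        .map((line) => line.trim())
        .filter((line) => line.length > 0)   // skip empty lines
        .map((line) => `<p>${line}</p>`)
        .join('');

const result = htmlToParagraphs('<div><b>Hi</b> there</div><p style="x">Second<br>Third</p>');
console.log(result); // <p>Hi there</p><p>Second</p><p>Third</p>
```

Because this version never pairs opening and closing tags, nested markup does not leave stray tags behind. It is still regex-based: attribute values containing `>`, the bodies of `<script>` and `<style>` elements, and HTML entities are not handled, and raw newlines in the source text also start new paragraphs. Where a DOM is available, parsing with `DOMParser` and reading `textContent` is likely more robust.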