LeetCode 393. UTF-8 Validation (Simulation + Bit Manipulation)

xylitolz

Article link: https://blog.csdn.net/xylitolz/article/details/123466598

Problem Description

393. UTF-8 Validation

[Image in the original post; not reproduced here]

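Traversal + Bit Manipulation

For the $\text{UTF-8}$ character that starts at $\textit{data}[\textit{index}]$, the value of $\textit{data}[\textit{index}]$ determines the character's length $n$. If a next $\text{UTF-8}$ character exists, it starts at index $\textit{index} + n$.

Traverse the array $\textit{data}$ to obtain the start index and length of every character, and check that each character follows the $\text{UTF-8}$ encoding rules: a 1-byte character starts with bit 0, while an $n$-byte character ($2 \le n \le 4$) starts with $n$ one bits followed by a 0, and each of its remaining $n - 1$ bytes starts with the bits 10.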
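The lead-byte rule can also be written as a fixed set of mask comparisons. The sketch below is an illustration only: the class name `LeadByte` and the helpers `lengthOf` and `isContinuation` are hypothetical and do not appear in the original post, whose own code counts the leading 1 bits in a loop instead. It assumes, as the problem does, that only the low 8 bits of each integer carry data.

```java
final class LeadByte {
    // Hypothetical helper: byte length (1..4) announced by a lead byte,
    // or -1 if the byte cannot start a UTF-8 character.
    static int lengthOf(int value) {
        int b = value & 0xFF;               // keep only the low 8 bits
        if ((b & 0x80) == 0x00) return 1;   // 0xxxxxxx
        if ((b & 0xE0) == 0xC0) return 2;   // 110xxxxx
        if ((b & 0xF0) == 0xE0) return 3;   // 1110xxxx
        if ((b & 0xF8) == 0xF0) return 4;   // 11110xxx
        return -1;                          // 10xxxxxx or 11111xxx
    }

    // Hypothetical helper: true if the byte has the form 10xxxxxx.
    static boolean isContinuation(int value) {
        return (value & 0xC0) == 0x80;
    }
}
```

For example, 197 is `11000101` in binary, so it announces a 2-byte character, and 130 (`10000010`) is a continuation byte that cannot start a character on its own.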

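class Solution {
    public boolean validUtf8(int[] data) {
        int n = data.length;
        int k = 0; // start index of the current character
        while (k < n) {
            int num = data[k];
            int nByte = getNByte(num);
            // Reject an invalid lead byte or a character that runs past the end of the array
            if (nByte == -1 || nByte + k > n) {
                return false;
            }
            for (int i = 1; i <= nByte - 1; i++) {
                // Each continuation byte must have 10 as its two highest bits,
                // so both bits are checked (mask 3), not only the highest one
                if (((data[i + k] >>> 6) & 3) != 2) {
                    return false;
                }
            }
            k += nByte;
        }
        return true;
    }

    // Returns the byte length (1 to 4) announced by a lead byte, or -1 if it is invalid
    private int getNByte(int num) {
        if (((num >>> 7) & 1) == 0) {
            // 1-byte character
            return 1;
        }
        int res = 0;
        int move = 7;
        // Count the leading 1 bits
        while (((num >>> move) & 1) == 1) {
            res++;
            // A UTF-8 character consists of 1 to 4 bytes
            if (res > 4) {
                return -1;
            }
            move--;
        }
        // A single leading 1 (10xxxxxx) marks a continuation byte, not a lead byte
        return res <= 1 ? -1 : res;
    }
}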
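As a cross-check, the same bit-pattern rules can be enforced in a single pass with a counter of continuation bytes still expected. This is a different formulation from the index-jumping loop in the post, not the author's method, and the class name `Utf8Counter` is hypothetical. It again assumes only the low 8 bits of each integer are meaningful.

```java
final class Utf8Counter {
    // Sketch: validates UTF-8 bit patterns using a pending-continuation counter.
    static boolean validUtf8(int[] data) {
        int pending = 0; // continuation bytes still expected for the current character
        for (int value : data) {
            int b = value & 0xFF;
            if (pending > 0) {
                // Inside a multi-byte character: the byte must be 10xxxxxx
                if ((b & 0xC0) != 0x80) {
                    return false;
                }
                pending--;
            } else if ((b & 0x80) == 0x00) {
                pending = 0;                 // 0xxxxxxx: 1-byte character
            } else if ((b & 0xE0) == 0xC0) {
                pending = 1;                 // 110xxxxx: 2-byte character
            } else if ((b & 0xF0) == 0xE0) {
                pending = 2;                 // 1110xxxx: 3-byte character
            } else if ((b & 0xF8) == 0xF0) {
                pending = 3;                 // 11110xxx: 4-byte character
            } else {
                return false;                // 10xxxxxx or 11111xxx as a lead byte
            }
        }
        // A truncated trailing character leaves pending > 0
        return pending == 0;
    }
}
```

On the two sample inputs from the problem, `Utf8Counter.validUtf8(new int[]{197, 130, 1})` returns `true` and `Utf8Counter.validUtf8(new int[]{235, 140, 4})` returns `false`, because 235 announces a 3-byte character but 4 (`00000100`) is not a continuation byte. Like the code in the post, this sketch checks only the bit patterns the problem asks for; it does not reject overlong encodings or code points above U+10FFFF, which the full UTF-8 definition also forbids.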

Time complexity: $O(m)$, where $m$ is the length of the array $\textit{data}$. The array is traversed once, and each element takes $O(1)$ time to process.
Space complexity: $O(1)$.

Reference

UTF-8 Validation
