Redis: HyperLogLog in Practice

寿命齿轮

Article link: https://blog.csdn.net/lifetime_gear/article/details/132697074

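This article first covers the common statistics strategies for data at the hundred-million scale, then introduces the concepts behind HyperLogLog, and finally walks through an application code example.

Working with Data at the Hundred-Million Scale

HyperLogLog Concepts

Why Use HyperLogLog for Deduplicated Counting?

Application Code Example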

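Working with Data at the Hundred-Million Scale

Programs typically need the following kinds of statistics over their data:

Aggregate statistics: combine the elements of several sets (set operations such as intersection and union), e.g. mutual friends or multi-tag filtering.
Sorted statistics: order the data, e.g. a "latest items" list or a leaderboard.
Binary-state statistics: sets whose values are only 0 or 1, e.g. daily check-ins.
Cardinality statistics: count the distinct elements of a set, e.g. UV.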

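HyperLogLog is commonly used for cardinality statistics such as UV, DAU, and MAU, which help evaluate how much a feature is actually used. PV is defined here as well for contrast: it needs no deduplication, so a plain counter is enough.

UV: Unique Visitor. Usually identified by client IP; must be deduplicated.

PV: Page View. The number of page views; no deduplication needed.

DAU: Daily Active Users. Often used to reflect how a website or internet application is performing.

MAU: Monthly Active Users.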
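To make the difference between PV and UV concrete, here is a minimal sketch that counts both exactly in memory. The class name `PvUvDemo` and the sample IPs are made up for illustration; the `HashSet` stands in for exact deduplication, which is the part HyperLogLog replaces with an approximation at large scale.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class PvUvDemo {
    // PV: every request counts, duplicates included.
    public static long pageViews(List<String> requestIps) {
        return requestIps.size();
    }

    // UV: each distinct IP counts once (exact deduplication with a set).
    public static long uniqueVisitors(List<String> requestIps) {
        Set<String> distinct = new HashSet<>(requestIps);
        return distinct.size();
    }

    public static void main(String[] args) {
        // Hypothetical sample traffic: five requests from three distinct IPs.
        List<String> ips = Arrays.asList(
                "10.0.0.1", "10.0.0.2", "10.0.0.1", "10.0.0.3", "10.0.0.2");
        System.out.println("PV = " + pageViews(ips));      // PV = 5
        System.out.println("UV = " + uniqueVisitors(ips)); // UV = 3
    }
}
```

The set grows with the number of distinct visitors, which is exactly the cost that becomes a problem at the hundred-million scale.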

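HyperLogLog Concepts

HyperLogLog only supports counting and cannot return the elements that were added. It therefore fits scenarios whose main goal is to count huge volumes efficiently and that do not care much about the stored values themselves, such as the number of IPs registering per day, the number of IPs visiting per day, and unique visitors (UV). Page views (PV) need no deduplication, so a plain counter is enough for them.

Note also that HyperLogLog does not give exact counts: its standard error is about 0.81%. (Source: Redis new data structure: the HyperLogLog - <antirez>)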
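The 0.81% figure can be reproduced from the usual HyperLogLog error formula, standard error ≈ 1.04 / sqrt(m), where m is the number of registers. The sketch below assumes the parameters of the Redis implementation (16384 registers of 6 bits each); the class and method names are hypothetical and only illustrate the arithmetic.

```java
public class HllErrorDemo {
    // Standard error of a HyperLogLog with m registers: about 1.04 / sqrt(m).
    public static double standardError(int registers) {
        return 1.04 / Math.sqrt(registers);
    }

    // Bytes needed for a densely packed register array.
    public static long denseBytes(int registers, int bitsPerRegister) {
        return (long) registers * bitsPerRegister / 8;
    }

    public static void main(String[] args) {
        int m = 16384; // assumed register count (Redis implementation)
        // Prints a standard error of about 0.81%.
        System.out.println("standard error = " + standardError(m) * 100 + " %");
        System.out.println("register bytes = " + denseBytes(m, 6)); // register bytes = 12288
    }
}
```

The 12288 bytes of registers are where the roughly 12 KB per key comes from.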

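Why Use HyperLogLog for Deduplicated Counting?

MySQL: a common rule of thumb is that a single MySQL table needs sharding once it grows past roughly five million rows, so deduplicated counting over hundreds of millions of rows is impractical.
Redis hash: at the hundred-million scale, one IP string takes up to 15 bytes, so 100,000,000 addresses per day come to about 1.5 GB of raw data and a week to 7 × 15 B × 100,000,000 ≈ 10.5 GB, before any per-entry overhead. With more than a hundred million addresses or a month-long window the data no longer fits in memory, and a single key of that size also causes the BigKey problem.
HyperLogLog: needs at most 12 KB per key to count up to 2^64 distinct elements.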
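The memory comparison can be checked with a few lines of arithmetic. This sketch treats the 15 per address as bytes (the length of the longest dotted-quad string, "255.255.255.255"), ignores Redis's per-entry overhead, and assumes one HyperLogLog key per day; the class and method names are hypothetical.

```java
public class MemoryEstimate {
    // Raw payload bytes for storing every IP string, ignoring per-entry overhead.
    public static long rawIpBytes(long ipsPerDay, int bytesPerIp, int days) {
        return ipsPerDay * bytesPerIp * days;
    }

    // One HyperLogLog key per day, each at most 12 KB (12 * 1024 bytes).
    public static long hllBytes(int days) {
        return 12L * 1024 * days;
    }

    public static void main(String[] args) {
        long raw = rawIpBytes(100_000_000L, 15, 7);
        long hll = hllBytes(7);
        System.out.println("raw IP strings: " + raw + " bytes"); // 10500000000 bytes, about 10.5 GB
        System.out.println("HyperLogLog:    " + hll + " bytes"); // 86016 bytes, about 84 KB
    }
}
```

The price for the much smaller footprint is the approximate result: the count is an estimate, and the individual IPs cannot be read back.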

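Application Code Example

This example simulates users visiting a web page from different IPs: three threads keep sending random IP addresses to the server, which records them in a HyperLogLog so that the UV can be queried.

A separate Java program simulates the requests: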

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.Random;

public class Main2 {
    public static void main(String[] args) {
        // The sender threads are non-daemon, so the JVM keeps running until it is killed.
        new Main2().createSendThread();
    }

    public void createSendThread() {
        System.out.println("------ Three threads start sending requests, each request carrying a random IP address ------");
        for (int i = 0; i < 3; i++) {
            new Thread(new SendThread()).start();
        }
    }

    public class SendThread implements Runnable {
        @Override
        public void run() {
            Random r = new Random(); // one generator per thread, reused across requests
            while (!Thread.currentThread().isInterrupted()) {
                try {
                    String ip = r.nextInt(256) + "." + r.nextInt(256) + "." + r.nextInt(256) + "." + r.nextInt(256);
                    byte[] postData = ("ip=" + ip).getBytes(StandardCharsets.UTF_8);

                    URL url = new URL("http://192.168.146.1:8080/sendIP"); // target URL
                    HttpURLConnection connection = (HttpURLConnection) url.openConnection(); // open the connection

                    connection.setRequestMethod("POST"); // use the POST method
                    connection.setDoOutput(true); // allow a request body
                    // Set the request header; Content-Length is derived from the body by HttpURLConnection
                    connection.setRequestProperty("Content-Type", "application/x-www-form-urlencoded");
                    // Write the request body
                    try (OutputStream os = connection.getOutputStream()) {
                        os.write(postData);
                    }
                    System.out.println("Sending request " + ip);
                    // Reading the response code completes the request
                    int responseCode = connection.getResponseCode();
                    if (responseCode == HttpURLConnection.HTTP_OK) {
                        // Read the response
                        StringBuilder response = new StringBuilder();
                        try (BufferedReader in = new BufferedReader(
                                new InputStreamReader(connection.getInputStream(), StandardCharsets.UTF_8))) {
                            String inputLine;
                            while ((inputLine = in.readLine()) != null) {
                                response.append(inputLine);
                            }
                        }
                        // Print the response
                        System.out.println("Response: " + response);
                    } else {
                        System.out.println("POST request failed with response code: " + responseCode);
                    }
                    // Close the connection
                    connection.disconnect();
                    // Sleep between requests to control the request rate
                    Thread.sleep(10); // 10 ms, adjust as needed
                } catch (IOException e) {
                    e.printStackTrace();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt(); // restore the flag so the loop exits
                }
            }
        }
    }
}

A Spring Boot project receives the requests and records each IP in a Redis HyperLogLog:

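Controller:

import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RequestParam;
import org.springframework.web.bind.annotation.RestController;

@RestController
public class ReceiveIPController {

    @Autowired
    ReceiveIPService service;

    // Receives one simulated visit and records its IP
    @PostMapping("/sendIP")
    public void receiveIP(@RequestParam String ip){
        System.out.println("Received request: " + ip);
        service.saveIP(ip);
    }

    // Returns the current approximate UV
    @GetMapping("/countIP")
    public String countIP(){
        return service.countIP();
    }
}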

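Service:

import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.data.redis.core.StringRedisTemplate;
import org.springframework.stereotype.Service;

@Service
public class ReceiveIPService {

    // StringRedisTemplate stores keys and values as plain strings,
    // instead of the JDK-serialized form a raw RedisTemplate uses by default
    @Autowired
    StringRedisTemplate redisTemplate;

    // Adds the IP to the HyperLogLog stored under key "ip" (Redis PFADD)
    public void saveIP(String ip){
        redisTemplate.opsForHyperLogLog().add("ip", ip);
    }

    // Returns the approximate number of distinct IPs seen so far (Redis PFCOUNT)
    public String countIP() {
        Long count = redisTemplate.opsForHyperLogLog().size("ip");
        return String.valueOf(count);
    }
}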

After starting both programs, the result can be observed: