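Requirements
Find all short videos published in the user's city, based on the user's location. The feed must satisfy the following:
- It serves a very large number of videos
- The same video must never show up twice while the user keeps scrolling
Implementation
- Use ES geo queries to search same-city data
- Use ES scroll to avoid the deep pagination problem
- Use a Redis bitmap to filter out posts that have already been shown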
Creating the ES mapping
Add a location field of type geo_point to the post index mapping so that it can be used for geo search.
A typical mapping:
'posts_index' => [
    'settings' => [
        'refresh_interval' => '5s',
        'number_of_shards' => 3,
        'number_of_replicas' => 0,
    ],
    'mappings' => [
        'properties' => [
            'id' => [
                'type' => 'long',
            ],
            ...
            'location' => [
                'type' => 'geo_point',
            ],
            ...
        ],
    ],
],
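As a minimal sketch of what an indexed document might look like with this mapping: a geo_point field accepts an object with lat and lon keys. The helper name buildLocation and the sample values are hypothetical and not taken from the original code.

```php
<?php
declare(strict_types=1);

/**
 * Builds the value for the `location` geo_point field.
 * Hypothetical helper for illustration only.
 */
function buildLocation(float $lat, float $lon): array
{
    // Valid latitudes are within [-90, 90], valid longitudes within [-180, 180].
    if ($lat < -90.0 || $lat > 90.0) {
        throw new InvalidArgumentException("latitude out of range: $lat");
    }
    if ($lon < -180.0 || $lon > 180.0) {
        throw new InvalidArgumentException("longitude out of range: $lon");
    }
    return ['lat' => $lat, 'lon' => $lon];
}

// Example document body for posts_index (the values are made up).
$doc = [
    'id' => 1001,
    'location' => buildLocation(31.2304, 121.4737),
];
```

Validating the coordinates before indexing keeps a swapped lat/lon pair from being rejected (or silently misplaced) later.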
ES scroll query
- Search for same-city posts by geographic location
$query = [
    'bool' => [
        'must' => [
            [
                'term' => [
                    'city_code' => $cityCode,
                ],
            ],
            ...
        ],
        // Exclude the current user's own posts
        'must_not' => [
            [
                'term' => [
                    'user_id' => $userId,
                ],
            ],
        ],
        // Keep only posts within 200 km of the user's position
        'filter' => [
            'geo_distance' => [
                'distance' => '200km',
                'location' => [
                    'lat' => (float)$lat,
                    'lon' => (float)$lon,
                ],
            ],
        ],
    ],
];
// Sort by distance from the user, nearest first
$sort = [
    '_geo_distance' => [
        'location' => [
            'lat' => (float)$lat,
            'lon' => (float)$lon,
        ],
        'order' => 'asc',
        'unit' => 'km',
    ],
];
$option = ['scroll' => '5m']; // how long ES keeps the scroll snapshot (search context) alive
self::find()->query($query)->limit($limit)->orderBy($sort)->createCommand()->search($option);
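From the second request onward, call ES with the scroll_id returned by the previous response to read the next page from the snapshot: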
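$url = ['_search', 'scroll'];
$query = [
    'scroll' => '5m', // extend the snapshot lifetime by another 5 minutes
    'scroll_id' => $scrollId, // the _scroll_id returned by the previous response
];
self::getDb()->get($url, [], json_encode($query));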
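The two requests above can be combined into a loop. The following is a minimal sketch, assuming the ES responses are decoded into arrays with the standard `_scroll_id` and `hits.hits` keys; collectScrollHits and the injected $fetchNext callable are hypothetical names, and the demo uses canned pages instead of a real ES connection.

```php
<?php
declare(strict_types=1);

/**
 * Drains a scroll by calling $fetchNext until a page comes back empty.
 *
 * @param array    $firstPage response of the initial search request
 * @param callable $fetchNext function(string $scrollId): array, returns the next response
 * @param int      $maxPages  safety cap on the number of pages read
 * @return array all hits collected across pages
 */
function collectScrollHits(array $firstPage, callable $fetchNext, int $maxPages = 100): array
{
    $hits = [];
    $page = $firstPage;
    for ($i = 0; $i < $maxPages; $i++) {
        $pageHits = $page['hits']['hits'] ?? [];
        if ($pageHits === []) {
            break; // an empty page means the scroll is exhausted
        }
        array_push($hits, ...$pageHits);
        // Always use the most recently returned scroll id; it may change between calls.
        $scrollId = $page['_scroll_id'] ?? null;
        if ($scrollId === null) {
            break;
        }
        $page = $fetchNext($scrollId);
    }
    return $hits;
}

// Demo with canned responses standing in for ES.
$pages = [
    's1' => ['_scroll_id' => 's2', 'hits' => ['hits' => [['_id' => '3']]]],
    's2' => ['_scroll_id' => 's2', 'hits' => ['hits' => []]],
];
$first = ['_scroll_id' => 's1', 'hits' => ['hits' => [['_id' => '1'], ['_id' => '2']]]];
$all = collectScrollHits($first, fn(string $id): array => $pages[$id]);
$ids = array_column($all, '_id');
```

In a feed, the loop would normally stop as soon as one screen of posts is filled rather than draining the whole scroll. Once the scroll is no longer needed, the search context can be released early with the clear-scroll API instead of waiting for the 5m timeout.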
Filtering duplicates with a Redis bitmap
1. Use the post id as the bit offset and set that bit once the post has been shown
$this->redis()->setbit($this->bucket, $id, 1);
2. Read the bit at the post id to check whether the post has already been shown
$bit = $this->redis()->getbit($this->bucket, $id);
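The two steps can be illustrated with a small sketch that filters one page of post ids. It assumes one bitmap per user and uses an in-memory stand-in (InMemoryBitmap, a hypothetical class) for the SETBIT and GETBIT commands so that it runs without a Redis server; takeUnseen is also a hypothetical name.

```php
<?php
declare(strict_types=1);

/** In-memory stand-in for the two Redis commands used above (SETBIT / GETBIT). */
final class InMemoryBitmap
{
    /** @var array<int,int> byte index => byte value */
    private array $bytes = [];

    public function setbit(int $offset, int $value): int
    {
        $byte = intdiv($offset, 8);
        $mask = 1 << (7 - $offset % 8); // Redis numbers bits starting from the most significant bit
        $current = $this->bytes[$byte] ?? 0;
        $old = ($current & $mask) !== 0 ? 1 : 0;
        $this->bytes[$byte] = $value === 1 ? ($current | $mask) : ($current & ~$mask & 0xFF);
        return $old; // like SETBIT, return the previous value of the bit
    }

    public function getbit(int $offset): int
    {
        $current = $this->bytes[intdiv($offset, 8)] ?? 0;
        return ($current & (1 << (7 - $offset % 8))) !== 0 ? 1 : 0;
    }
}

/**
 * Returns the ids that have not been shown yet and marks them as shown.
 *
 * @param int[] $postIds
 * @return int[]
 */
function takeUnseen(InMemoryBitmap $seen, array $postIds): array
{
    $fresh = [];
    foreach ($postIds as $id) {
        // SETBIT returns the old bit, so a single call both tests and marks.
        if ($seen->setbit($id, 1) === 0) {
            $fresh[] = $id;
        }
    }
    return $fresh;
}

$seen = new InMemoryBitmap();
$firstPage = takeUnseen($seen, [5, 9, 12]);
$secondPage = takeUnseen($seen, [9, 12, 20]);
```

Because the bitmap grows up to its highest offset, a key whose largest post id is around 10,000,000 takes roughly 1.25 MB. Redis also requires the offset to be smaller than 2^32, so this approach fits numeric ids that stay well below that limit.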
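Summary
- For loyal users, a large share of the results consists of posts they have already seen. For now we simply keep fetching more results and deduplicate them; because posts are preloaded, this does not yet hurt the user experience.
- On that point: implementing filtering properly would require a design that keeps a full data set per user. We deduplicate today only because the number of posts is still small and doing so improves the experience. Once the data volume grows, we plan to drop deduplication and add a random factor to the sort instead.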
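One possible way to add a random factor is the ES function_score query with a random_score function. This is only a sketch of an option, not the approach the project has settled on; withRandomFactor is a hypothetical helper, and the choice of seed (for example a per-session number) is an assumption.

```php
<?php
declare(strict_types=1);

/**
 * Wraps an existing query so that hits are scored randomly.
 * Hypothetical helper for illustration only.
 */
function withRandomFactor(array $query, int $seed): array
{
    return [
        'function_score' => [
            'query' => $query,
            'functions' => [
                // The same seed yields the same order, so one session sees a stable shuffle.
                ['random_score' => ['seed' => $seed, 'field' => '_seq_no']],
            ],
            // Use the random value as the final score instead of combining it with the query score.
            'boost_mode' => 'replace',
        ],
    ];
}

$randomized = withRandomFactor(['term' => ['city_code' => '310000']], 42);
```

With this query the results would have to be ordered by _score rather than by _geo_distance, so distance would act only as a filter and no longer as the sort key.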