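This script is for non-technical staff: it counts how many times each filter key and filter value has been used within a category, for data analysis. Without further ado, here is the code.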
# Runs on Python 2 and 3; needs the third-party requests package.
from optparse import OptionParser
import time
import json
import requests

parser = OptionParser()
parser.add_option("-a", "--add_group", dest="group", help="get the count of each value of the chosen field (condition_key or condition_value)", default=None)
parser.add_option("-c", "--category_id", dest="category_id", help="you have to choose a category_id; if you don't know it, check the document", default=None)
parser.add_option("-k", "--condition_key", dest="condition_key", help="choose a condition; if you don't know it, check the document", default=None)
parser.add_option("-v", "--condition_value", dest="condition_value", help="choose a value; if you don't know it, check the document", default=None)
# Both time bounds are epoch timestamps in milliseconds, parsed as integers
parser.add_option("--gte", dest="gte", type="int", help="after this time, in milliseconds", default=1563854400000)
parser.add_option("--lte", dest="lte", type="int", help="before this time, in milliseconds", default=int(time.time()) * 1000)
(options, args) = parser.parse_args()
must_list = []
cate_id_term = {}
cond_key_term = {}
cond_val_term = {}
if options.category_id is None:
    # -c is required; warn and fall back to a default category
    print("category_id is empty")
    options.category_id = "DEFAULT_CAT_ID"
cate_id_term["term"] = {"category_id": options.category_id}
if options.condition_key is not None:
    cond_key_term["term"] = {"condition_key": options.condition_key}
if options.condition_value is not None:
    cond_val_term["term"] = {"condition_value": options.condition_value}
# Restrict matches to the requested time window
range_map = {}
range_map["range"] = {"time_stamp": {"gte": options.gte, "lte": options.lte}}
# Every clause in must_list has to match (logical AND)
must_list.append(cate_id_term)
if len(cond_key_term) > 0:
    must_list.append(cond_key_term)
if len(cond_val_term) > 0:
    must_list.append(cond_val_term)
must_list.append(range_map)
query_map = {}
query_map["query"] = {"bool": {"must": must_list}}
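if options.group is not None:
    # Bucket the matches by the chosen field (condition_key or condition_value)
    query_map["aggs"] = {"group_by_option": {"terms": {"field": options.group + ".keyword"}}}
# Only counts are needed, so do not return the matching documents themselves
query_map["size"] = 0
print(query_map)
headers = {"Content-Type": "application/json; charset=UTF-8"}
url = "http://ES_HOSTNAME/ES_INDEX/ES_TYPE/_search"
res = requests.post(url=url, headers=headers, auth=('ES_USERNAME', 'ES_PASSWORD'), data=json.dumps(query_map))
if options.group is None:
    # No grouping: print the total number of matching records
    print(res.json().get("hits").get("total"))
else:
    # Grouping: print one bucket (key plus doc_count) per distinct value
    print(res.json().get("aggregations").get("group_by_option").get("buckets"))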
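The query-building part of the script can also be pulled into a plain function, which makes it possible to inspect the request body without contacting Elasticsearch. The following is only a sketch under that assumption: build_query is a hypothetical helper name that does not appear in the original script, and its defaults simply mirror the script's option defaults.

```python
import json
import time


def build_query(category_id, condition_key=None, condition_value=None,
                gte=1563854400000, lte=None, group=None):
    """Hypothetical helper: build the same search body as the script."""
    if lte is None:
        # Same default as the script: "now", in milliseconds
        lte = int(time.time()) * 1000
    must = [{"term": {"category_id": category_id}}]
    if condition_key is not None:
        must.append({"term": {"condition_key": condition_key}})
    if condition_value is not None:
        must.append({"term": {"condition_value": condition_value}})
    must.append({"range": {"time_stamp": {"gte": gte, "lte": lte}}})
    query = {"query": {"bool": {"must": must}}, "size": 0}
    if group is not None:
        # Group by condition_key or condition_value, as with the -a option
        query["aggs"] = {"group_by_option": {"terms": {"field": group + ".keyword"}}}
    return query


if __name__ == "__main__":
    body = build_query("412592068762796032", condition_key="gender",
                       group="condition_value")
    print(json.dumps(body, indent=2, sort_keys=True))
```

Printing the body this way is a quick check that the options produce the query you expect before it is sent.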
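User manual:
Environment setup:
First install Python; macOS ships with Python 2. The script also needs the requests library (pip install requests).
Parameters:
-c Category id. Required!
-k The filter condition to count, e.g. level (rank) or gender
-v The value of the filter condition, e.g. 白银 (Silver) or 男 (male)
-a Group the results by the given field. Only condition_key or condition_value is allowed. The first returns how many times each filter key was used; the second returns how many times each concrete filter value was used.
--gte Only count records after this time. Use a timestamp converter to turn the time into milliseconds, e.g. 1563854400000. Online converter: https://tool.lu/timestamp/
--lte Only count records before this time. Defaults to the current time. Note that the unit is milliseconds, not seconds.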
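If you would rather not use the online converter, the millisecond values for --gte and --lte can be computed in Python. This is a minimal sketch: to_millis is a hypothetical helper name, and it assumes the times are given in Beijing time (UTC+8), which is what the example value 1563899400000 for July 24, 00:30 implies.

```python
import calendar
from datetime import datetime, timedelta


def to_millis(text, utc_offset_hours=8):
    """Convert 'YYYY-MM-DD HH:MM' local time to epoch milliseconds.

    Assumption: a fixed UTC offset (8 hours by default, i.e. Beijing time).
    """
    local = datetime.strptime(text, "%Y-%m-%d %H:%M")
    # Shift the local wall-clock time back to UTC before converting
    utc = local - timedelta(hours=utc_offset_hours)
    return calendar.timegm(utc.timetuple()) * 1000


if __name__ == "__main__":
    print(to_millis("2019-07-24 00:30"))  # 1563899400000
```

The multiplication by 1000 is the step that is easy to forget: the script expects milliseconds, while most tools print seconds.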
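Example runs:
python get_count_op.py -c 8efb76c4477637c4c70352b8ce2be686 -k level -v 永恒钻石 --gte 1563899400000
Returns the number of records, from 00:30 on July 24 until now, in which the 王者荣耀 category was filtered by the rank 永恒钻石 (Eternal Diamond). Strictly speaking, -k is not needed here.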
python get_count_op.py -c 412592068762796032 -a condition_value -k gender
Returns, for the 和平精英 category, how many times each value was selected under the gender filter. Result:
[{u'key': u'0', u'doc_count': 10449}, {u'key': u'1', u'doc_count': 3688}]
That is, users filtered for female 10449 times and for male 3688 times, so users do seem to prefer filtering for women.
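The raw bucket list is not very readable for non-technical readers, so it can help to turn it into labelled counts with percentages. The sketch below is illustrative only: summarize and GENDER_LABELS are hypothetical names, and the mapping of '0' to female and '1' to male is taken from the interpretation given above.

```python
# Result copied from the example run above
buckets = [{"key": "0", "doc_count": 10449}, {"key": "1", "doc_count": 3688}]

# Assumption: '0' means female and '1' means male, as interpreted in the text
GENDER_LABELS = {"0": "female", "1": "male"}


def summarize(buckets, labels):
    """Return (label, count, percentage) rows for a list of ES buckets."""
    total = sum(b["doc_count"] for b in buckets)
    if total == 0:
        return []
    rows = []
    for b in buckets:
        # Fall back to the raw key when no label is known
        name = labels.get(b["key"], b["key"])
        share = round(100.0 * b["doc_count"] / total, 1)
        rows.append((name, b["doc_count"], share))
    return rows


if __name__ == "__main__":
    for name, count, share in summarize(buckets, GENDER_LABELS):
        print("%s: %d (%.1f%%)" % (name, count, share))
```

For the result above this gives roughly 73.9% female and 26.1% male.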
The query itself is attached below, so you can run it yourself and compare:
{
  "size": 0,
  "query": {
    "bool": {
      "must": [
        {
          "term": {
            "category_id": "8efb76c4477637c4c70352b8ce2be686"
          }
        },
        {
          "term": {
            "condition_key": "level"
          }
        }
      ]
    }
  },
  "aggs": {
    "group_by_gender": {
      "terms": {
        "field": "condition_value.keyword"
      }
    }
  }
}