Once again, I had ChatGPT write the code for me~
The required input and output are:
# Input:
dataclasses = {'took': 1327, 'responses': [
{'took': 1327, 'aggregations': {
'2': {'doc_count_error_upper_bound': 0, 'sum_other_doc_count': 0,
'buckets': [{'key': 'AAA', 'doc_count': 111},
{'key': 'BBB', 'doc_count': 222},
{'key': 'CCC', 'doc_count': 333},
{'key': 'DDD', 'doc_count': 444}]}}, 'status': 200},
{'took': 1327, 'aggregations': {
'4': {'doc_count_error_upper_bound': 0, 'sum_other_doc_count': 0,
'buckets': [{'key': 'AAA', 'doc_count': 555},
{'key': 'BBB', 'doc_count': 666},
{'key': 'CCC', 'doc_count': 777},
{'key': 'DDD', 'doc_count': 888}]}}, 'status': 200},
]}
The output should be:
'AAA', 111, 555
'BBB', 222, 666
'CCC', 333, 777
'DDD', 444, 888
or
'AAA' 'BBB' 'CCC' 'DDD'
111 222 333 444
555 666 777 888
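After several rounds of prompting, I got an answer that met the requirement. The prompt was:
Provide Python code that treats each buckets list in the dictionary dataclasses as a separate sample, where key is the feature name and doc_count is the feature value, and finally outputs the feature table as an Excel file, where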
dataclasses = {'took': 1327, 'responses': [
{'took': 1327, 'aggregations': {
'2': {'doc_count_error_upper_bound': 0, 'sum_other_doc_count': 0,
'buckets': [{'key': 'AAA', 'doc_count': 111},
{'key': 'BBB', 'doc_count': 222},
{'key': 'CCC', 'doc_count': 333},
{'key': 'DDD', 'doc_count': 444}]}}, 'status': 200},
{'took': 1327, 'aggregations': {
'4': {'doc_count_error_upper_bound': 0, 'sum_other_doc_count': 0,
'buckets': [{'key': 'AAA', 'doc_count': 555},
{'key': 'BBB', 'doc_count': 666},
{'key': 'CCC', 'doc_count': 777},
{'key': 'DDD', 'doc_count': 888}]}}, 'status': 200},
]}
An example of the feature table:
'AAA' 'BBB' 'CCC' 'DDD'
111 222 333 444
555 666 777 888
The answer was:
import pandas as pd
dataclasses = {'took': 1327, 'responses': [
{'took': 1327, 'aggregations': {
'2': {'doc_count_error_upper_bound': 0, 'sum_other_doc_count': 0,
'buckets': [{'key': 'AAA', 'doc_count': 111},
{'key': 'BBB', 'doc_count': 222},
{'key': 'CCC', 'doc_count': 333},
{'key': 'DDD', 'doc_count': 444}]}}, 'status': 200},
{'took': 1327, 'aggregations': {
'4': {'doc_count_error_upper_bound': 0, 'sum_other_doc_count': 0,
'buckets': [{'key': 'AAA', 'doc_count': 555},
{'key': 'BBB', 'doc_count': 666},
{'key': 'CCC', 'doc_count': 777},
{'key': 'DDD', 'doc_count': 888}]}}, 'status': 200},
]}
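# Create an empty DataFrame
df = pd.DataFrame()
# Iterate over each response
for response in dataclasses['responses']:
    # Get the buckets of the first (and only) aggregation in this response
    buckets = response['aggregations'][str(list(response['aggregations'].keys())[0])]['buckets']
    # Convert the buckets into a dict of {key: doc_count}
    bucket_dict = {bucket['key']: bucket['doc_count'] for bucket in buckets}
    # Turn the dict into a single-row DataFrame
    row = pd.DataFrame(bucket_dict, index=[0])
    # Append the row to the DataFrame
    df = pd.concat([df, row])
# Write the DataFrame to an Excel file (writing .xlsx needs an engine such as openpyxl installed)
df.to_excel('特征表.xlsx', index=False)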
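The answer above only covers the second layout (one row per response). For the first layout listed in the requirement (one line per key, e.g. 'AAA', 111, 555), here is a minimal standard-library sketch. The helper name buckets_by_key and the shortened sample dict are my own and not part of the ChatGPT answer. It assumes every response contains exactly one aggregation and that all responses share the same keys.

```python
from collections import defaultdict


def buckets_by_key(data):
    """Group doc_count values by bucket key, one value per response, in order."""
    grouped = defaultdict(list)
    for response in data["responses"]:
        # Assumption: each response holds exactly one aggregation (e.g. '2' or '4').
        aggregation = next(iter(response["aggregations"].values()))
        for bucket in aggregation["buckets"]:
            grouped[bucket["key"]].append(bucket["doc_count"])
    return dict(grouped)


# Shortened sample with the same shape as dataclasses above (hypothetical subset)
sample = {"responses": [
    {"aggregations": {"2": {"buckets": [{"key": "AAA", "doc_count": 111},
                                         {"key": "BBB", "doc_count": 222}]}}},
    {"aggregations": {"4": {"buckets": [{"key": "AAA", "doc_count": 555},
                                         {"key": "BBB", "doc_count": 666}]}}},
]}

rows = buckets_by_key(sample)
for key, counts in rows.items():
    # Print the quoted key followed by its counts, comma-separated
    print(repr(key), *counts, sep=", ")
```

If a key is missing from one of the responses, its list simply ends up shorter, so the columns would no longer line up. In that case the pandas version above is likely the safer choice, since it fills the gap with NaN.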