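When working with a SQLite database, I need to pull out a number of fields and, after some processing, gather them into lists so that a DataFrame can be built easily for later steps.
In practice there are many fields. Starting from an empty list means adding the elements one at a time, that is, writing append over and over, which hurts readability. Defining a small helper function handles this instead.
Original code: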
# data is the DataFrame read from SQLite; comp holds the data.comp values to group by.
ret = []
for i in comp:
    tmp = []
    t = data[data.comp == i]
    total = len(t.comp)
    mid = len(t[t.bak == '2'])
    high = len(t[t.bak == '3'])
    ips_info = list(set(t.ips))
    # ips = '\n'.join(ips_info)
    ips_num = len(ips_info)
    # Keep the counts under their own name so the tmp list is not overwritten.
    ip_counts = t.ips.value_counts()
    # "次" means "times": each entry reads like 1.2.3.4(3次).
    ips = '\n'.join([ip + "(" + str(ip_counts[ip]) + "次)" for ip in ips_info])
    domains_info = list(set(t.domains))
    # domains = '\n'.join(domains_info)
    domains_num = len(domains_info)
    domain_counts = t.domains.value_counts()
    domains = '\n'.join([d + "(" + str(domain_counts[d]) + "次)" for d in domains_info])
    tmp.append(i)
    tmp.append(total)
    tmp.append(high)
    tmp.append(mid)
    tmp.append(ips_num)
    tmp.append(ips)
    tmp.append(domains_num)
    tmp.append(domains)
    ret.append(tmp)
As you can see, this way of writing it is rather clunky.
A demo after refactoring with *args:
def sum_tmp(*args):
    # *args arrives as a tuple of every positional argument; return it as a list.
    return list(args)
ret = []
for i in comp:
    t = data[data.comp == i]
    total = len(t.comp)
    mid = len(t[t.bak == '2'])
    high = len(t[t.bak == '3'])
    ips_info = list(set(t.ips))
    # ips = '\n'.join(ips_info)
    ips_num = len(ips_info)
    ip_counts = t.ips.value_counts()
    ips = '\n'.join([ip + "(" + str(ip_counts[ip]) + "次)" for ip in ips_info])
    domains_info = list(set(t.domains))
    # domains = '\n'.join(domains_info)
    domains_num = len(domains_info)
    domain_counts = t.domains.value_counts()
    domains = '\n'.join([d + "(" + str(domain_counts[d]) + "次)" for d in domains_info])
    # One call gathers every field for this row; collect the rows in ret.
    tmp_ret = sum_tmp(i, total, high, mid, ips_num, ips, domains_num, domains)
    ret.append(tmp_ret)
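This way, several values are gathered in a single call. If more elements are needed later, just pass additional arguments when calling the function, which makes the code much cheaper to maintain.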
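To try the idea without a database, here is a minimal, self-contained sketch. It is not the code from this post: the sample records, the `collect` and `format_counts` helpers, and the use of `collections.Counter` in place of pandas are all assumptions made so that the snippet runs on its own.

```python
from collections import Counter


def collect(*args):
    """Return the positional arguments as a list (same idea as sum_tmp)."""
    return list(args)


def format_counts(values):
    """Join each distinct value with its count, one per line, e.g. 'a(2次)'."""
    counts = Counter(values)
    # Sorted for a deterministic order; the original iterates over a set.
    return "\n".join(f"{value}({counts[value]}次)" for value in sorted(counts))


# Hypothetical sample records: (comp, bak, ips, domains).
records = [
    ("acme", "2", "10.0.0.1", "a.example"),
    ("acme", "3", "10.0.0.1", "b.example"),
    ("acme", "3", "10.0.0.2", "a.example"),
    ("beta", "2", "10.0.0.9", "c.example"),
]

rows = []
for comp in sorted({r[0] for r in records}):
    group = [r for r in records if r[0] == comp]
    total = len(group)
    mid = sum(1 for r in group if r[1] == "2")
    high = sum(1 for r in group if r[1] == "3")
    ips = [r[2] for r in group]
    domains = [r[3] for r in group]
    # A single call builds the whole row; adding a field means adding an argument.
    rows.append(collect(comp, total, high, mid,
                        len(set(ips)), format_counts(ips),
                        len(set(domains)), format_counts(domains)))

for row in rows:
    print(row)
```

With pandas installed, a list of rows like this can be passed to `pd.DataFrame(rows, columns=[...])`, supplying one column name per field.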