[Original] pandas set_value usage
Original code: df_valid_inventory_sel_tmp['col1'] = -9999 for index in df_valid_inventory_sel_tmp.index: time_stamp = time.time() df_valid_inventory_sel_tmp['col1'][index] = su.sta_mean(index) ## conventional copy, 40 ms...
2018-03-09 11:54:42 7421
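The loop in the excerpt writes through chained indexing (`df['col1'][index] = ...`), which is slow and may assign to a temporary copy. `DataFrame.set_value` was deprecated in pandas 0.21 and removed in 1.0, so the sketch below uses the label-based scalar accessor `.at` instead. The helper name `fill_column` and the lookup table standing in for `su.sta_mean` are assumptions made for illustration, not the post's code.

```python
import pandas as pd

# Hypothetical stand-in for su.sta_mean(index) from the excerpt.
hypothetical_means = {"a": 1.5, "b": 2.5, "c": 3.5}


def fill_column(df, column, compute):
    """Write compute(label) into df[column] row by row with the scalar accessor .at."""
    for label in df.index:
        # .at addresses exactly one cell by (row label, column label),
        # avoiding the intermediate Series created by df[column][label].
        df.at[label, column] = compute(label)
    return df


df = pd.DataFrame({"col1": [-9999.0] * 3}, index=["a", "b", "c"])
fill_column(df, "col1", hypothetical_means.get)
print(df)
```

If the per-row function can be vectorized, assigning a whole Series at once is usually faster than any per-cell loop.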
[Original] Fixing "pandas.parser.CParserError: Error tokenizing data. C error: Expected 2 fields in line 3, s" in pandas
df_status0_invertory = pd.read_csv(inventory_dir + inventory_status0_file_name, delimiter=',', header=None, error_bad_lines=False) Solution: add the parameter error_bad_lines=False...
2018-03-08 11:34:57 55349
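`error_bad_lines` was deprecated in pandas 1.3 and removed in 2.0. A minimal sketch of the same fix on a recent pandas, assuming version 1.3 or later, uses `on_bad_lines="skip"`. The inline CSV text is made-up sample data whose third line has one field too many.

```python
import io

import pandas as pd

# Made-up sample: line 3 has three fields while the others have two.
raw = "1,2\n3,4\n5,6,7\n8,9\n"

# on_bad_lines="skip" drops rows with too many fields instead of raising
# a tokenizing error (requires pandas >= 1.3).
df = pd.read_csv(io.StringIO(raw), delimiter=",", header=None, on_bad_lines="skip")
print(df)
```

Skipping silently discards data, so it is worth counting the dropped rows or inspecting the offending lines before relying on the result.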
[Original] Using groupby to speed up slow DataFrame filtering by group
Original code: for name in list_valid_perfor_inventory: time_stamp = time.time() df_tmp1 = df_all_performance[df_all_performance['res_ins_id'] == name] ### 1.7 million rows; this statement takes about 2 s if df_tmp1.empty: co...
2018-03-06 11:36:40 5035
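The excerpt rescans the whole frame with a boolean mask for every name. One possible way to apply the groupby idea from the title is to split the frame once and then look each name up in the result. The sample data, the `value` column, and the per-group sum are assumptions for illustration; the rest of the post's loop body is not shown in the excerpt.

```python
import pandas as pd

# Made-up sample data; only the res_ins_id column name comes from the excerpt.
df_all_performance = pd.DataFrame(
    {"res_ins_id": ["x", "y", "x", "z"], "value": [1, 2, 3, 4]}
)
list_valid_perfor_inventory = ["x", "z", "missing"]

# One pass over the frame: map each res_ins_id to its sub-frame.
groups = {name: group for name, group in df_all_performance.groupby("res_ins_id")}

sums = {}
for name in list_valid_perfor_inventory:
    df_tmp1 = groups.get(name)
    if df_tmp1 is None:  # plays the role of the df_tmp1.empty check
        continue
    sums[name] = int(df_tmp1["value"].sum())

print(sums)
```

The cost of splitting is paid once, so each lookup inside the loop is a dictionary access rather than a scan of every row.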
[Original] Converting Python timestamps to date strings and building a time-series file
######### Get all performance data of one object def get_one_object_perfor_data(object_id,dst_dir,src_file_name): df = pd.read_csv(src_file_name,delimiter=',',header=0) df_tmp1 = df[df['res_i
2018-01-26 14:42:17 1181
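The excerpt stops before the conversion itself, so the following is only a sketch of the step the title names. It assumes timestamps are epoch seconds and formats them in UTC; the function names, the format string, and the sample values are all hypothetical.

```python
from datetime import datetime, timezone


def epoch_to_string(epoch_seconds, fmt="%Y-%m-%d %H:%M:%S"):
    """Format an epoch timestamp (seconds) as a UTC date string."""
    return datetime.fromtimestamp(epoch_seconds, tz=timezone.utc).strftime(fmt)


def to_time_series(samples):
    """Sort (epoch_seconds, value) pairs by time and format each timestamp."""
    return [(epoch_to_string(ts), value) for ts, value in sorted(samples)]


series = to_time_series([(86400, 2.0), (0, 1.0)])
for stamp, value in series:
    print(f"{stamp},{value}")
```

If the source timestamps are in milliseconds or in local time, the division by 1000 or the `tz` argument would need to change accordingly.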
[Original] A Python exception-handling example
######### File is too big, read file line by line (if file is small, we can use pandas) def get_valid_inventory(src_dir,tmp_file,des_dir): if not os.path.exists(des_dir): os.mkdir(des_dir)
2018-01-26 11:01:47 2119
[Original] Extracting TAR files to a specified folder in Python
######### Extract all files from src_dir to des_dir def extract_tar_files(src_dir,des_dir): files = os.listdir(src_dir) for file in files: dir_tmp = os.path.join(src_dir, file)
2018-01-26 10:59:10 5621
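The excerpt ends before the extraction call. A minimal sketch of how the function might be completed with the standard `tarfile` module follows; the signature matches the excerpt, while the returned list and the `is_tarfile` check are assumptions added for illustration.

```python
import os
import tarfile


def extract_tar_files(src_dir, des_dir):
    """Extract every tar archive found directly in src_dir into des_dir."""
    os.makedirs(des_dir, exist_ok=True)
    extracted = []
    for name in sorted(os.listdir(src_dir)):
        path = os.path.join(src_dir, name)
        # Skip sub-directories and files that are not tar archives.
        if os.path.isfile(path) and tarfile.is_tarfile(path):
            with tarfile.open(path) as archive:
                # Only do this for archives you trust: a crafted archive
                # can try to write outside des_dir.
                archive.extractall(des_dir)
            extracted.append(name)
    return extracted
```

`tarfile.open` detects gzip, bzip2, and xz compression on its own, so `.tar.gz` files are handled without extra code.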
[Original] SQL: fuzzy matching, then grouping
#### Coverage quality statistics by hardware manufacturer create table Manufacture_STAT(select (case when DeviceId like '%CIOT%' then 'CIOT' when DeviceId like '%FHTT%' then 'FHTT' when DeviceId like '%HWTC%' then 'HWTC' when De
2017-11-21 15:51:19 2354
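The pattern in the excerpt maps each `DeviceId` to a manufacturer label with `CASE WHEN ... LIKE` and then groups on that label. The sketch below runs the same idea against an in-memory SQLite database from Python so it can be tried without a MySQL server. The table name `coverage`, the sample rows, the `ELSE 'OTHER'` branch, and the `DeviceCount` column are assumptions; the original query is truncated and targets MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE coverage (DeviceId TEXT)")  # hypothetical table
conn.executemany(
    "INSERT INTO coverage VALUES (?)",
    [("CIOT0001",), ("CIOT0002",), ("FHTT0001",), ("HWTC0001",), ("ZZZZ0001",)],
)

# Label each row by substring match, then group on the label.
rows = conn.execute(
    """
    SELECT CASE
             WHEN DeviceId LIKE '%CIOT%' THEN 'CIOT'
             WHEN DeviceId LIKE '%FHTT%' THEN 'FHTT'
             WHEN DeviceId LIKE '%HWTC%' THEN 'HWTC'
             ELSE 'OTHER'
           END AS Manufacturer,
           COUNT(*) AS DeviceCount
    FROM coverage
    GROUP BY Manufacturer
    ORDER BY Manufacturer
    """
).fetchall()
print(rows)
conn.close()
```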
[Original] SQL: group first, then count the rows in each range
#### Signal strength statistics per household create table Power_STAT(select DeviceId, count(DeviceId) as AllNum, sum(case when SubdeviceWlanRadioPower >= -67 then 1 else 0 end) as GoodNum, sum(case when SubdeviceWlanRadio
2017-11-20 21:31:33 3916
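The excerpt counts, per `DeviceId`, how many readings fall into each signal-strength range by summing a `CASE` expression that yields 1 or 0. A runnable sketch with an in-memory SQLite database is shown below. The table name `readings`, the sample rows, and the second bucket `WeakNum` (everything below -67) are assumptions, because the original query is cut off after the first bucket.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE readings (DeviceId TEXT, SubdeviceWlanRadioPower INTEGER)"
)  # hypothetical table
conn.executemany(
    "INSERT INTO readings VALUES (?, ?)",
    [("d1", -60), ("d1", -70), ("d1", -90), ("d2", -67), ("d2", -68)],
)

# Each SUM(CASE ...) counts the rows of the group that fall into one range.
rows = conn.execute(
    """
    SELECT DeviceId,
           COUNT(DeviceId) AS AllNum,
           SUM(CASE WHEN SubdeviceWlanRadioPower >= -67 THEN 1 ELSE 0 END) AS GoodNum,
           SUM(CASE WHEN SubdeviceWlanRadioPower < -67 THEN 1 ELSE 0 END) AS WeakNum
    FROM readings
    GROUP BY DeviceId
    ORDER BY DeviceId
    """
).fetchall()
print(rows)
conn.close()
```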
[Original] MySQL Workbench: resolving "Error Code: 2013. Lost connection to MySQL server during query"
Problem description: 1. The table has 1.5 million rows with 18 columns each. Query executed: select count(DeviceId) from (select DeviceId, count(distinct SubdeviceMac) as MACnum from wide_table_all_row where TIMESTAMPP LIKE "%2017-10-13%" group by Device
2017-11-20 11:47:55 9899 1
[Original] Walking a folder's compressed files and extracting them to a specified folder
# -*- coding:utf-8 -*- import Cons as cs import os import zipfile ######### Extract all files in Dir def extract_to(dir): #os.chdir(dir) files = os.listdir(dir) for file in files:
2017-11-08 17:41:04 2172
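The excerpt imports `zipfile` and starts listing a directory, then stops. One way the traversal could look is sketched below with `os.walk`, so archives in sub-folders are found as well. The function name `extract_zip_files`, its two-argument signature, and the returned list are assumptions; the post's own `extract_to(dir)` takes a single argument and its body is not shown.

```python
import os
import zipfile


def extract_zip_files(src_dir, des_dir):
    """Find every zip archive under src_dir and extract it into des_dir."""
    os.makedirs(des_dir, exist_ok=True)
    extracted = []
    for root, _dirs, files in os.walk(src_dir):
        for name in sorted(files):
            path = os.path.join(root, name)
            # is_zipfile checks the file contents, not just the extension.
            if zipfile.is_zipfile(path):
                with zipfile.ZipFile(path) as archive:
                    archive.extractall(des_dir)
                extracted.append(path)
    return extracted
```

Keeping `des_dir` outside `src_dir` avoids walking into freshly extracted files during the traversal.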