Original: Put a string into a DataFrame
Only a reminder... import pandas as pd; from io import StringIO; titleData = StringIO("""col1;col2;col3"""); dk = pd.read_csv(titleData, sep=";"); for i in all_instruments('CS').order_book_id: k = in...
2018-02-27 19:55:01 254
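The excerpt above is cut off, so the following is only a minimal sketch of the part that is visible: wrapping a string in `StringIO` and handing it to `pd.read_csv` with `sep=";"`. The header text comes from the excerpt; the two data rows and the names `title_data`, `rows`, `empty` and `df` are invented for illustration and are not from the original post.

```python
from io import StringIO

import pandas as pd

# Header-only text, as in the excerpt: yields an empty DataFrame
# that already has the three column names.
title_data = StringIO("col1;col2;col3")
empty = pd.read_csv(title_data, sep=";")

# Hypothetical data rows (not from the post) to show the same call
# parsing actual content.
rows = StringIO("col1;col2;col3\n1;a;2.5\n2;b;3.5\n")
df = pd.read_csv(rows, sep=";")

print(list(empty.columns))
print(df)
```

Starting from a header-only frame is one way to fix the column layout before any rows exist; how the original post fills it inside the loop is not recoverable from the excerpt.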
Original: An easy way to prevent accidental deletion under Ubuntu
It happens again and again: people working at the command line, me included, use 'rm -rf' far too freely... I finally realized it is better to make some adjustments. Most of the time, I see people alias rm='r...
2018-02-20 11:53:47 275
Original: MongoDB batch dump template
A memo for frequent use; this approach is actually rather inefficient and only suits small jobs. import pymongo; from pymongo import MongoClient; #import pprint; import pandas as pd; client = MongoClient('mongodb://root:333@127.0.0.1:27017/'); db = client.datahub; collection = db....
2018-02-13 17:07:54 256
原创 又升级啦,玩低配多进程啦~
觉得自己是老鼠掉进米缸。今天观察发现hdf5文件检查各种慢,各种内存外溢。请自动化谷溪同学指点。竟然开始了多进程操作...总之就是很开心。还额外告诉我一个笼统的处理手法,一般爬虫用多线程,关联不大的batch运行用多进程。def open_hdf5(path_file): hdf5_addr_list = os.listdir(path_file) abs_files = [] fo...
2018-02-08 15:37:35 208
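The rule of thumb in the excerpt above (threads for crawler-style work, processes for loosely related batch jobs) can be sketched with the standard library's `concurrent.futures`. This is not the post's code, which is truncated: `list_files`, `run_batch`, the `.h5` suffix and the `use_processes` flag are all hypothetical names and assumptions chosen for illustration.

```python
import os
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor


def list_files(folder, suffix=".h5"):
    """Return sorted absolute paths of files in folder ending with suffix.

    The ".h5" default is an assumption; the post only says "hdf5".
    """
    return sorted(
        os.path.abspath(os.path.join(folder, name))
        for name in os.listdir(folder)
        if name.endswith(suffix)
    )


def run_batch(func, items, use_processes=True, max_workers=None):
    """Apply func to every item and return the results in input order.

    use_processes=True  -> separate processes, for independent batch work
    use_processes=False -> threads, for I/O-bound work such as crawling
    """
    executor_cls = ProcessPoolExecutor if use_processes else ThreadPoolExecutor
    with executor_cls(max_workers=max_workers) as pool:
        return list(pool.map(func, items))
```

With processes, `func` has to be defined at module top level so it can be pickled, and the call should sit under an `if __name__ == "__main__":` guard; something like `run_batch(check_one_file, list_files(path_file))` would then fan the files out across workers, where `check_one_file` stands in for whatever per-file check is needed.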