Python
Everyday basic operations
捣蛋怪大眼萌
Study hard | Stay positive
【Anaconda】Setting up a Python virtual environment in Anaconda
Set up a virtual environment for Python using Anaconda.
Reposted 2022-08-10 14:22:19 · 4518 views · 0 comments
【Python】Comparing the functions split(), os.path.split(), and os.path.splitext()
Usage of the split(), os.path.splitext(), and os.path.split() functions.
Reposted 2022-07-20 10:11:02 · 116 views · 0 comments
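As a quick illustration of how the three functions differ, here is a minimal sketch. The path is a hypothetical example, not taken from the linked post.

```python
import os.path

path = "/data/reports/sales.2022.csv"  # hypothetical path, for illustration only

# str.split cuts the string at every occurrence of the separator.
parts = path.split("/")

# os.path.split cuts once, at the last path separator: (directory, file name).
head, tail = os.path.split(path)

# os.path.splitext cuts once, at the last dot of the name: (root, extension).
root, ext = os.path.splitext(tail)

print(parts)       # ['', 'data', 'reports', 'sales.2022.csv']
print(head, tail)  # /data/reports sales.2022.csv
print(root, ext)   # sales.2022 .csv
```

Note that only the final ".csv" is treated as the extension, so a dot earlier in the file name stays in the root.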
【mongo】Inserting data without duplicates
Insert / update millions of records into mongo without duplicates.
Reposted 2022-06-15 15:36:06 · 712 views · 1 comment
【Python】Add header to dataframe | Change column names of dataframe
Add a Header Row to a Pandas DataFrame (3 methods). Change Column Names of DataFrame (3 methods).
Reposted 2022-05-26 12:08:55 · 339 views · 1 comment
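The excerpt does not say which methods the post covers. The sketch below shows two common ways to do this in pandas, which are not necessarily the ones described there. The data and column names are hypothetical.

```python
import pandas as pd

# Hypothetical data with no header row (columns default to 0, 1).
df = pd.DataFrame([[1, "a"], [2, "b"]])

# Assigning to .columns sets the header for every column at once.
df.columns = ["id", "label"]

# rename() changes only the listed columns and returns a new DataFrame.
renamed = df.rename(columns={"label": "name"})

print(list(df.columns))       # ['id', 'label']
print(list(renamed.columns))  # ['id', 'name']
```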
【Python】Get all the files under a specific directory
Requirement: merge a series of csv files.
Method: put all the files in one folder, then read every file in that folder and process them in a batch.
    import os
    import pandas as pd
    # get the current absolute path
    os.getcwd()
    # change to the directory of the files that need editing
    path = 'c:\\Users\\username\\filepath'
    os.chdir(path)
    # get all the…
Reposted 2022-04-27 14:31:19 · 85 views · 0 comments
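The excerpt is cut off before the listing step. One way to collect the csv files, sketched here with the standard library only, is shown below. The helper name `list_csv_files` is hypothetical, and this is not necessarily how the original post does it.

```python
import glob
import os


def list_csv_files(directory):
    """Return the sorted paths of all .csv files directly inside `directory`."""
    # Build the pattern from the directory instead of calling os.chdir(),
    # so the current working directory is left unchanged.
    pattern = os.path.join(directory, "*.csv")
    return sorted(glob.glob(pattern))
```

Passing the directory explicitly avoids a side effect of `os.chdir()`: later relative paths, such as an output file name, keep resolving against the original working directory.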
【error solved】AttributeError: module 'time' has no attribute 'clock'
Code that raised the error:
    import time
    # start the timer
    start = time.clock()
    # the code whose running time is being tested goes here
    # calculate the time between start and end
    end = time.clock()
    print('Running time: %s seconds' % (end - start))
Error message: AttributeError module 'time' has no attribut…
Reposted 2022-04-27 11:41:34 · 285 views · 0 comments
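For context, `time.clock()` was deprecated in Python 3.3 and removed in Python 3.8, which is why the call fails on recent interpreters. A minimal sketch of the same timing pattern using `time.perf_counter()` follows. The helper name `time_call` and the timed workload are made up for illustration.

```python
import time


def time_call(func, *args, **kwargs):
    """Run func and return (result, elapsed seconds) via time.perf_counter()."""
    start = time.perf_counter()
    result = func(*args, **kwargs)
    elapsed = time.perf_counter() - start
    return result, elapsed


# Example workload: sum the integers below one million.
result, elapsed = time_call(sum, range(1_000_000))
print('Running time: %.6f seconds' % elapsed)
```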
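【Pandas】Processing large csv files in chunks with chunksize
- A mistaken operation saved a csv of more than 1 TB. To re-read and process it, calling read_csv() without any parameters would exhaust the RAM.
- So use chunksize: instead of reading the whole file into memory (RAM) at once, read it in several passes.
Official example: https://pandas.pydata.org/pandas-docs/stable/user_guide/io.html#io-chunking
    import pandas as pd
    import time
    start = time.perf_counter() # calcula…
Original 2022-04-27 11:21:01 · 2379 views · 0 comments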
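The excerpt stops before the chunked read itself. As a rough sketch of the idea, the function below iterates over a csv in fixed-size pieces. The name `count_rows_in_chunks` and the default chunk size are assumptions, not taken from the post.

```python
import pandas as pd


def count_rows_in_chunks(csv_path, chunksize=100_000):
    """Count the data rows of a csv without loading the whole file at once."""
    total = 0
    # With chunksize set, read_csv returns an iterator of DataFrames,
    # each holding at most `chunksize` rows.
    for chunk in pd.read_csv(csv_path, chunksize=chunksize):
        total += len(chunk)
    return total
```

Only one chunk is held in memory at a time, so peak memory depends on `chunksize` rather than on the size of the file. Real processing would replace the row count with whatever per-chunk work is needed.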
【Pandas】Counting rows per column value with groupby
To get row counts per group as a Series, call .size():
    df.groupby(['col_name']).size()
To return a DataFrame (instead of a Series):
    df.groupby(['col_name']).size().reset_index(name='counts')
Ref: stackoverflow: Get statistics for each group (…
Original 2022-04-27 10:04:46 · 2430 views · 0 comments
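To make the difference between the two calls concrete, here is a small runnable sketch. The sample values are hypothetical.

```python
import pandas as pd

# Hypothetical data: three rows with 'a' and one row with 'b'.
df = pd.DataFrame({"col_name": ["a", "b", "a", "a"]})

# Series of row counts, indexed by group.
counts_series = df.groupby(["col_name"]).size()

# Same counts as a DataFrame with columns 'col_name' and 'counts'.
counts_frame = df.groupby(["col_name"]).size().reset_index(name="counts")

print(counts_series)
print(counts_frame)
```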
【error solved】Pandas DataFrame.to_csv raising IOError: No such file or directory
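The error means that the target folder does not exist. Perhaps the path was changed in the previous step?
    os.chdir(path)
I used a method found online: first check whether the folder exists and, if it does not, create it.
    import os
    outname = 'filename.csv'
    outdir = './dir'
    if not os.path.exists(outdir):
        os.mkdir(outdir)
    fullname = os.path.join(outdir, outname)
    df.to_csv(fullname)
Still…
Original 2022-04-27 09:46:00 · 446 views · 0 comments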
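A variation on the check-then-create approach above, not the method used in the post, is `os.makedirs()` with `exist_ok=True`. It also creates missing parent folders and does not fail if the folder already exists. The helper name `ensure_output_path` is hypothetical.

```python
import os


def ensure_output_path(outdir, outname):
    """Create outdir (and any missing parents) and return the full file path."""
    # exist_ok=True makes this a no-op when the folder is already there.
    os.makedirs(outdir, exist_ok=True)
    return os.path.join(outdir, outname)
```

A call such as `df.to_csv(ensure_output_path('./dir', 'filename.csv'))` would then write into a folder that is guaranteed to exist.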
【Python】Open jupyter notebook through cmd
Open jupyter notebook from cmd.
Windows 10 - 'jupyter' is not recognized as an internal or external command, operable program or batch file
    $ python -m pip in…
Original 2021-09-27 15:37:14 · 381 views · 0 comments