Python pandas tips
How to convert string dates in a DataFrame to datetimes
Method 1
data['交易时间'] = pd.to_datetime(data['交易时间'])
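Method 2
From Python for Data Analysis (利用python进行数据分析), p. 304
Use the datetime class from Python's datetime module:
strptime parses a string into a datetime: datetime.strptime(value, '%Y/%m/%d')
strftime formats a datetime as a string: dt.strftime('%Y/%m/%d'), where dt is a datetime object
Note that the format string passed to strptime must match the layout of the original string exactly, otherwise parsing fails with a ValueError. A string in yyyy-mm-dd form needs '%Y-%m-%d', not '%Y/%m/%d'. The format codes are case-sensitive: %m is the month and %d is the day, whereas %M means minutes.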
from datetime import datetime
data['交易时间'] = data['交易时间'].apply(lambda x: datetime.strptime(x, '%Y-%m-%d %H:%M:%S'))
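The two methods can be compared side by side in a minimal sketch. The sample rows below are made up for illustration; only the column name comes from the snippets above. It also shows what happens when the format string does not match the data.

```python
from datetime import datetime

import pandas as pd

# Hypothetical sample data; only the column name is taken from the text above.
data = pd.DataFrame({'交易时间': ['2020-01-02 09:30:00', '2020-01-03 15:00:00']})

# Method 1: let pandas parse the whole column, with an explicit format.
method1 = pd.to_datetime(data['交易时间'], format='%Y-%m-%d %H:%M:%S')

# Method 2: parse each value with datetime.strptime.
method2 = data['交易时间'].apply(
    lambda x: datetime.strptime(x, '%Y-%m-%d %H:%M:%S')
)

# Both approaches yield the same points in time.
same = bool((method1 == method2).all())
print(same)

# A format string that does not match the data raises ValueError.
try:
    datetime.strptime('2020-01-02', '%Y/%m/%d')
    mismatch_raised = False
except ValueError:
    mismatch_raised = True
print(mismatch_raised)
```

On large columns Method 1 is usually the more convenient choice, since it parses the whole column in a single call instead of one Python function call per row.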
Dropping duplicate rows
data = data.drop_duplicates()
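As a small illustration of what drop_duplicates does, the sketch below uses a made-up frame (the column names id and value are assumptions). By default whole rows are compared; the subset and keep parameters narrow the comparison to chosen columns and choose which duplicate survives.

```python
import pandas as pd

# Hypothetical data: rows 0 and 1 are exact duplicates.
data = pd.DataFrame({'id': [1, 1, 2, 2], 'value': ['a', 'a', 'b', 'c']})

# Default: compare all columns and keep the first occurrence.
deduped = data.drop_duplicates()

# Compare only the 'id' column and keep the last occurrence of each id.
by_id = data.drop_duplicates(subset=['id'], keep='last')

# The original index labels are preserved; reset them if a clean 0..n-1 index is wanted.
deduped_reset = deduped.reset_index(drop=True)

print(deduped.index.tolist())
print(by_id['value'].tolist())
```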
read_csv
pandas.read_csv(filepath_or_buffer: Union[str, pathlib.Path, IO[~AnyStr]], sep=',', delimiter=None, header='infer', names=None, index_col=None, usecols=None, squeeze=False, prefix=None, mangle_dupe_cols=True, dtype=None, engine=None, converters=None, true_values=None, false_values=None, skipinitialspace=False, skiprows=None, skipfooter=0, nrows=None, na_values=None, keep_default_na=True, na_filter=True, verbose=False, skip_blank_lines=True, parse_dates=False, infer_datetime_format=False, keep_date_col=False, date_parser=None, dayfirst=False, cache_dates=True, iterator=False, chunksize=None, compression='infer', thousands=None, decimal: str = '.', lineterminator=None, quotechar='"', quoting=0, doublequote=True, escapechar=None, comment=None, encoding=None, dialect=None, error_bad_lines=True, warn_bad_lines=True, delim_whitespace=False, low_memory=True, memory_map=False, float_precision=None)
- sep: str, default ','
Delimiter to use. If sep is None, the C engine cannot automatically detect the separator, but the Python parsing engine can, meaning the latter will be used and automatically detect the separator by Python's builtin sniffer tool, csv.Sniffer. In addition, separators longer than 1 character and different from '\s+' will be interpreted as regular expressions and will also force the use of the Python parsing engine. Note that regex delimiters are prone to ignoring quoted data. Regex example: '\r\t'.
- skiprows: list-like, int or callable, optional
Line numbers to skip (0-indexed) or number of lines to skip (int) at the start of the file.
If callable, the callable function will be evaluated against the row indices, returning True if the row should be skipped and False otherwise. An example of a valid callable argument would be lambda x: x in [0, 2].
- encoding: str, optional
Encoding to use for UTF when reading/writing (ex. 'utf-8'). See the list of standard encodings in the Python codecs documentation.
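The three parameters described above can be tried together in a short sketch. The file contents are invented for illustration and held in memory rather than on disk: a banner line, then semicolon-separated data encoded as GBK.

```python
import io

import pandas as pd

# Hypothetical file contents: one banner line, then ';'-separated rows.
raw_text = "exported by a hypothetical tool\n姓名;分数\n张三;90\n李四;85\n"
raw_bytes = raw_text.encode('gbk')  # pretend the file was saved as GBK

# sep=';' sets the delimiter, skiprows=1 drops the banner line,
# encoding='gbk' tells pandas how to decode the bytes.
df = pd.read_csv(io.BytesIO(raw_bytes), sep=';', skiprows=1, encoding='gbk')

# skiprows as a callable: skip line 0 (banner) and line 2 (the first data row).
df_callable = pd.read_csv(
    io.BytesIO(raw_bytes),
    sep=';',
    skiprows=lambda i: i in [0, 2],
    encoding='gbk',
)

print(df.columns.tolist())
print(df_callable['姓名'].tolist())
```

With a real file, the io.BytesIO(...) argument would simply be replaced by the file path.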
Iterating over rows
https://includestdio.com/1419.html
for index, row in df.iterrows():
    print(row['c1'], row['c2'])
Output:
10 100
11 110
12 120
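The loop above assumes an existing df. A self-contained sketch follows, with a frame constructed to reproduce the output shown; the construction is an assumption, since the source does not show how df was built. It also includes itertuples, which is another way to walk the rows and is often faster than iterrows.

```python
import pandas as pd

# Assumed frame that reproduces the output shown above.
df = pd.DataFrame({'c1': [10, 11, 12], 'c2': [100, 110, 120]})

# iterrows yields (index label, row as a Series) pairs.
pairs = []
for index, row in df.iterrows():
    print(row['c1'], row['c2'])
    pairs.append((int(row['c1']), int(row['c2'])))

# itertuples yields one namedtuple per row; fields are accessed as attributes.
tuples = [(t.c1, t.c2) for t in df.itertuples(index=False)]
```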
Building a dict from two columns
df.set_index("A")["B"].to_dict()
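A minimal runnable version of the one-liner above, using made-up values for columns A and B. The dict(zip(...)) form is an alternative spelling that gives the same mapping here.

```python
import pandas as pd

# Hypothetical data: column A supplies the keys, column B the values.
df = pd.DataFrame({'A': ['x', 'y', 'z'], 'B': [1, 2, 3]})

# Make A the index, select B, and convert the resulting Series to a dict.
mapping = df.set_index('A')['B'].to_dict()

# Alternative: pair the two columns up directly.
mapping_zip = dict(zip(df['A'], df['B']))

print(mapping)
```

A dict holds only one value per key, so if column A contains repeated values, some rows will not appear in the result.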