Reading notes on "Python for Data Analysis", CH2 Introductory Examples

1 usa.gov data from bit.ly


1. read txt: open(path).readline()
2. converting json: json.loads(line)
3. list comprehension: records = [json.loads(line) for line in open(path)]
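
The three steps above can be sketched as follows. The sample lines and field values are made up for illustration (the real input is the bit.ly file at `path`), and an in-memory `io.StringIO` stands in for `open(path)` so the snippet runs on its own.

```python
import io
import json

# Hypothetical stand-in for open(path): one JSON object per line.
sample = io.StringIO(
    '{"tz": "America/New_York", "a": "Mozilla/5.0"}\n'
    '{"tz": "", "a": "GoogleMaps/RochesterNY"}\n'
    '{"a": "Mozilla/4.0"}\n'
)

# 1. read a single line of text
first_line = sample.readline()

# 2. convert that JSON string into a Python dict
first_record = json.loads(first_line)

# 3. rewind, then parse every line with a list comprehension
sample.seek(0)
records = [json.loads(line) for line in sample]
```

Note that a record may lack the `tz` key entirely, which is why the later counting steps have to handle missing values.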

1.2 Counting Time Zones in Pure Python


A Python standard library: collections

Creating a dict whose missing keys get a default value: 
collections.defaultdict

Getting counts:
collections.Counter
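
A minimal sketch of both approaches, assuming a small made-up list of time zone strings in place of the values pulled from the records:

```python
from collections import Counter, defaultdict

# Hypothetical time zone values; the empty string marks an unknown zone.
time_zones = [
    "America/New_York", "", "America/New_York",
    "Europe/London", "America/New_York", "",
]

# defaultdict(int) starts every unseen key at 0, so no key check is needed.
counts = defaultdict(int)
for tz in time_zones:
    counts[tz] += 1

# Counter does the counting itself and can return the most common entries.
top_two = Counter(time_zones).most_common(2)
```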

1.3 Counting Time Zones with pandas


Turn a list of dicts into a pandas data frame: DataFrame(records)

methods:
DataFrame.<col>: 
get column

DataFrame['col']: 
get column

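DataFrame.sum(): 
sum each column (DataFrame.sum(1) sums each row)

DataFrame.div(DataFrame.sum(1), axis=0): 
divide each row by its row total, so every row sums to 1

DataFrame.groupby([]): 
group rows by the listed keys

DataFrameGroupBy.size().unstack(): 
count rows per group, then pivot the inner key into columns to build a table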

Series.value_counts(): 
count how often each value occurs

Series.fillna('...'): 
fill all NA with given argument

Series.plot(kind='barh'): 
draw a horizontal bar plot (for a DataFrame, add stacked=True to stack the columns)

Series.argsort(): 
get the integer positions that would sort the values
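
The methods above fit together roughly as in the sketch below. The records, the `os` column, and the labels "Missing" and "Unknown" are assumptions made for illustration. Plotting is left as a comment so the snippet does not depend on matplotlib.

```python
import pandas as pd

# Hypothetical records; the real ones come from the bit.ly file.
records = [
    {"tz": "America/New_York", "os": "Windows"},
    {"tz": "America/New_York", "os": "Not Windows"},
    {"tz": "", "os": "Windows"},
    {"tz": "Europe/London", "os": "Windows"},
    {"os": "Not Windows"},  # no "tz" key, so pandas stores NaN
]
frame = pd.DataFrame(records)

# Attribute access and bracket access return the same column.
same_column = frame.tz.equals(frame["tz"])

# Fill NaN, relabel empty strings, then count each value.
clean_tz = frame["tz"].fillna("Missing").replace("", "Unknown")
tz_counts = clean_tz.value_counts()

# Group by two keys, count rows per group, and pivot "os" into columns.
agg_counts = (
    frame.assign(tz=clean_tz)
    .groupby(["tz", "os"])
    .size()
    .unstack()
    .fillna(0)
)

# argsort gives the row positions that sort the row totals in ascending order.
indexer = agg_counts.sum(axis=1).argsort()
sorted_counts = agg_counts.iloc[indexer.to_numpy()]

# Divide each row by its total so that every row sums to 1.
normed = agg_counts.div(agg_counts.sum(axis=1), axis=0)

# With matplotlib installed, normed.plot(kind="barh", stacked=True) draws it.
```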


2. MovieLens 1M Data Set


DataFrame methods:

read .dat files: 
pandas.read_table('file', sep='', header=None, names=)

merge(join) two and more datasets: 
pd.merge(pd.merge(ratings, users), movies)

get statistics grouped by variables: 
DataFrame.pivot_table('var', index=[], columns='', aggfunc='mean')

counts grouped by: 
DataFrame.groupby('title').size()

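select rows by label: 
DataFrame.loc[<labels>] (written as DataFrame.ix[...] in the book; .ix was removed in pandas 1.0)

sort by a column: 
DataFrame.sort_values(by='F', ascending=False) (written as sort_index(by=...) in the book; later pandas versions use sort_values)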

Series methods:

filter index labels by a condition on the values: 
Series.index[Series >= 250]
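
A small end-to-end sketch of the merge, pivot, filter, and sort steps. The three tiny tables are invented stand-ins for the MovieLens files, and the threshold of 3 ratings replaces the 250 used above so that the toy data has something to filter.

```python
import pandas as pd

# Hypothetical miniature versions of the users, movies, and ratings tables.
users = pd.DataFrame({"user_id": [1, 2, 3, 4], "gender": ["F", "M", "F", "M"]})
movies = pd.DataFrame({"movie_id": [10, 20], "title": ["Alpha", "Beta"]})
ratings = pd.DataFrame({
    "user_id":  [1, 2, 3, 4, 1, 2],
    "movie_id": [10, 10, 10, 10, 20, 20],
    "rating":   [5, 3, 4, 2, 1, 4],
})

# merge joins on the columns the two tables share (user_id, then movie_id).
data = pd.merge(pd.merge(ratings, users), movies)

# Mean rating per title, with one column per gender.
mean_ratings = data.pivot_table(
    values="rating", index="title", columns="gender", aggfunc="mean"
)

# Number of ratings per title, then the titles with at least 3 ratings.
ratings_by_title = data.groupby("title").size()
active_titles = ratings_by_title.index[ratings_by_title >= 3]

# Keep only the active titles, selected by label.
active_means = mean_ratings.loc[active_titles]

# Sort all titles by the mean rating among female viewers, highest first.
top_female = mean_ratings.sort_values(by="F", ascending=False)
```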

2.1 Measuring rating disagreement

DataFrame methods:


Add a column in data frame: 
mean_ratings['diff'] = mean_ratings['M'] - mean_ratings['F']

reverse the row order:
DataFrame[::-1]

Series methods:


get standard deviation: 
Series.std()

sort by value: 
Series.sort_values(ascending=False) (written as Series.order(...) in the book; later pandas versions use sort_values)
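
The disagreement measures above can be sketched as follows. Both tables are made-up toy data, and `sort_values` is used in place of the older `order` and `sort_index(by=...)` calls.

```python
import pandas as pd

# Hypothetical mean ratings per title, split by gender.
mean_ratings = pd.DataFrame(
    {"F": [4.5, 3.0, 2.0], "M": [2.5, 3.5, 4.0]},
    index=pd.Index(["Alpha", "Beta", "Gamma"], name="title"),
)

# Add a column: positive values mean men rated the title higher.
mean_ratings["diff"] = mean_ratings["M"] - mean_ratings["F"]

# Sort ascending by diff, then reverse the rows to put the largest diff first.
sorted_by_diff = mean_ratings.sort_values(by="diff")
reversed_rows = sorted_by_diff[::-1]

# Hypothetical raw ratings, used to measure spread regardless of gender.
ratings = pd.DataFrame({
    "title":  ["Alpha", "Alpha", "Beta", "Beta", "Gamma", "Gamma"],
    "rating": [5, 1, 3, 3, 4, 2],
})

# Standard deviation of the ratings per title, most divisive title first.
rating_std = ratings.groupby("title")["rating"].std()
most_divisive = rating_std.sort_values(ascending=False)
```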

