Notes: Python for Data Science and Machine Learning Bootcamp

matplotlib, seaborn

 

1. matplotlib

Install matplotlib with Anaconda: conda install matplotlib
Install matplotlib with pip: pip install matplotlib
import matplotlib.pyplot as plt
%matplotlib inline
import numpy as np
x=np.linspace(0,5,11)
y=x**2
# functional interface
plt.plot(x,y,'r-')
plt.xlabel('xlabel')
plt.ylabel('ylabel')
plt.title('title')
plt.show()  # needed only outside a Jupyter notebook
# several plots in one figure: plt.subplot(nrows, ncols, plot_number)
plt.subplot(1,2,1)
plt.plot(x,y,'r')

plt.subplot(1,2,2)
plt.plot(y,x,'b')

plt.show()
# figure object
fig=plt.figure()

axes=fig.add_axes([0.1,0.1,0.8,0.8])  # add the axes
axes.plot(x,y) #plot the axes
axes.set_xlabel('x label')
axes.set_ylabel('y label')
axes.set_title('set title')
# several plots, each on its own axes
fig=plt.figure()
axes1=fig.add_axes([0.1,0.1,0.8,0.8])
axes2=fig.add_axes([0.2,0.5,0.4,0.3])

axes1.plot(x,y)
axes2.plot(y,x)

axes1.set_title('larger plot')
axes2.set_title('smaller plot')
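The four numbers passed to add_axes are [left, bottom, width, height], each given as a fraction of the figure. The sketch below reads the rectangles back with get_position() to make that visible. The Agg backend and the variable names are assumptions made so the snippet runs outside a notebook.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: draw off-screen so no display is needed
import matplotlib.pyplot as plt

fig = plt.figure()
# [left, bottom, width, height], each a fraction of the figure size
main_ax = fig.add_axes([0.1, 0.1, 0.8, 0.8])
inset_ax = fig.add_axes([0.2, 0.5, 0.4, 0.3])

# get_position().bounds returns the same rectangle as (x0, y0, width, height)
main_bounds = tuple(round(float(v), 3) for v in main_ax.get_position().bounds)
inset_bounds = tuple(round(float(v), 3) for v in inset_ax.get_position().bounds)
print(main_bounds)
print(inset_bounds)
```

Because the second rectangle lies inside the first, the smaller plot is drawn on top of the larger one as an inset.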
# subplots
fig,axes=plt.subplots(nrows=1,ncols=2)  # 1 row, 2 columns

for current_ax in axes:
    current_ax.plot(x,y)

fig,axes=plt.subplots(nrows=1,ncols=2)
axes[0].plot(x,y)
axes[0].set_title('first plot')

axes[1].plot(y,x)
axes[1].set_title('second plot')
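With more than one row and more than one column, plt.subplots returns a 2-D array of Axes rather than a flat list. A minimal sketch, assuming a 2 by 2 grid and the off-screen Agg backend, shows one way to loop over every cell:

```python
import matplotlib
matplotlib.use("Agg")  # assumption: draw off-screen so no display is needed
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 5, 11)

fig, axes = plt.subplots(nrows=2, ncols=2)
grid_shape = axes.shape  # 2-D NumPy array holding one Axes per cell

# axes.flat visits every Axes in row-major order, whatever the grid shape
for i, ax in enumerate(axes.flat):
    ax.plot(x, x ** (i + 1))
    ax.set_title(f"x**{i + 1}")

titles = [ax.get_title() for ax in axes.flat]
plt.tight_layout()  # keep titles and tick labels from overlapping
print(grid_shape, titles)
```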
# figure size and dpi

fig=plt.figure(figsize=(8,2))

ax=fig.add_axes([0,0,1,1])
ax.plot(x,y)
# figure size with subplots
fig,axes=plt.subplots(nrows=2,figsize=(8,2))
axes[0].plot(x,y)
axes[1].plot(y,x)
plt.tight_layout()
# save the figure; dpi sets the resolution of the saved image
fig.savefig('my_picture.png',dpi=200)
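figsize is given in inches and dpi in dots per inch, so the saved image measures figsize times dpi in pixels. The sketch below checks this by saving to an in-memory buffer (an assumption, used here in place of 'my_picture.png') and reading the size from the PNG header.

```python
import io
import struct

import matplotlib
matplotlib.use("Agg")  # assumption: draw off-screen so no display is needed
import matplotlib.pyplot as plt

fig = plt.figure(figsize=(8, 2))  # width and height in inches
ax = fig.add_axes([0.1, 0.2, 0.8, 0.7])
ax.plot([0, 1, 2], [0, 1, 4])

buf = io.BytesIO()  # in-memory stand-in for a file on disk
fig.savefig(buf, format="png", dpi=200)

# A PNG file stores width and height as big-endian 32-bit integers in bytes 16..23
width_px, height_px = struct.unpack(">II", buf.getvalue()[16:24])
print(width_px, height_px)
```

Here 8 in x 200 dpi gives 1600 pixels of width and 2 in x 200 dpi gives 400 pixels of height.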
# title and axis labels
fig=plt.figure()
ax=fig.add_axes([0,0,1,1])
ax.plot(x,y)

ax.set_title('title')
ax.set_xlabel('x')
ax.set_ylabel('y')
# legend
fig=plt.figure()
ax=fig.add_axes([0,0,1,1])
ax.plot(x,x**2,label='x squared')
ax.plot(x,x**3,label='x cubed')
ax.legend(loc=(0.1,0.1))  # loc also accepts strings such as 'best' or 'upper left'
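The legend is built from the label given to each plot call, and lines without a label are left out. A small sketch of that behavior follows; the third, unlabeled line is added here only for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: draw off-screen so no display is needed
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 5, 11)

fig = plt.figure()
ax = fig.add_axes([0.1, 0.1, 0.8, 0.8])
ax.plot(x, x ** 2, label="x squared")
ax.plot(x, x ** 3, label="x cubed")
ax.plot(x, x)  # no label, so this line gets no legend entry

legend = ax.legend(loc="upper left")
legend_labels = [text.get_text() for text in legend.get_texts()]
print(legend_labels)
```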
# plot appearance: color
fig=plt.figure()
ax=fig.add_axes([0,0,1,1])
ax.plot(x,y,color='blue')  # same as the hex code: ax.plot(x,y,color='#0000FF')
# plot appearance: line and marker style
fig=plt.figure()
ax=fig.add_axes([0,0,1,1])
ax.plot(x,y,color='purple',linewidth=3,alpha=0.5)
#ax.plot(x,y,color='purple',lw=3,alpha=0.5)  # lw is short for linewidth
ax.plot(x,y,color='purple',lw=3,linestyle='--')
ax.plot(x,y,color='purple',lw=3,ls='--')  # ls is short for linestyle
ax.plot(x,y,color='purple',lw=3,ls='-',marker='o',markersize=10,markerfacecolor='yellow',markeredgewidth=3,markeredgecolor='green')  # marker and markersize
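The short format string used in the functional examples ('r-', 'b') and the keyword arguments set the same line properties. A minimal sketch, assuming the same x and y as in these notes, reads the properties back from the returned Line2D objects:

```python
import matplotlib
matplotlib.use("Agg")  # assumption: draw off-screen so no display is needed
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 5, 11)
y = x ** 2

fig, ax = plt.subplots()
# format string: 'r' is the color, '--' is the line style
(short_line,) = ax.plot(x, y, "r--")
# the same kind of settings spelled out as keyword arguments
(long_line,) = ax.plot(x, y, color="purple", lw=3, ls="--",
                       alpha=0.5, marker="o", markersize=10)

print(short_line.get_color(), short_line.get_linestyle())
print(long_line.get_linewidth(), long_line.get_alpha(), long_line.get_marker())
```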
# appearance: axis limits with set_xlim and set_ylim

fig=plt.figure()
ax=fig.add_axes([0,0,1,1])
ax.plot(x,y,color='purple',lw=2)
ax.set_xlim([0,1])
ax.set_ylim([0,1])
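Without explicit limits, matplotlib picks a range slightly wider than the data. set_xlim and set_ylim replace that range, which is how the zoom works. A short sketch comparing the two, with variable names chosen for illustration:

```python
import matplotlib
matplotlib.use("Agg")  # assumption: draw off-screen so no display is needed
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 5, 11)
y = x ** 2

fig, ax = plt.subplots()
ax.plot(x, y, color="purple", lw=2)

auto_xlim = tuple(float(v) for v in ax.get_xlim())  # chosen automatically, with a small margin

ax.set_xlim([0, 1])  # zoom in on the lower-left corner of the curve
ax.set_ylim([0, 1])
zoom_xlim = tuple(float(v) for v in ax.get_xlim())
zoom_ylim = tuple(float(v) for v in ax.get_ylim())
print(auto_xlim, zoom_xlim, zoom_ylim)
```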
# special plot types
plt.scatter(x,y)



from random import sample
data=sample(range(1,1000),100)
plt.hist(data)


data=[np.random.normal(0,std,100) for std in range(1,4)]

# rectangular box plot
plt.boxplot(data,vert=True,patch_artist=True);
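The hist call also returns what it computed: the count in each bin, the bin edges, and the drawn bars. A minimal sketch, assuming a seeded NumPy generator in place of random.sample so the run is repeatable:

```python
import matplotlib
matplotlib.use("Agg")  # assumption: draw off-screen so no display is needed
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)  # assumption: fixed seed for a repeatable run
data = rng.integers(1, 1000, size=100)

fig, ax = plt.subplots()
counts, bin_edges, patches = ax.hist(data)  # 10 equal-width bins by default

n_bins = len(counts)
n_edges = len(bin_edges)      # always one more edge than bins
total = int(counts.sum())     # every value falls into exactly one bin
print(n_bins, n_edges, total)
```

Passing bins=30, as in the seaborn and pandas examples in these notes, changes only the number of bins, not the total count.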

further reading

http://www.loria.fr/~rougier/teaching/matplotlib

# matplotlib exercises

# data
import numpy as np
x=np.arange(0,100)
y=x*2
z=x**2

import matplotlib.pyplot as plt
%matplotlib inline

# exercise 1
fig=plt.figure()
ax=fig.add_axes([0,0,1,1])
ax.plot(x,y)

ax.set_xlabel('x')
ax.set_ylabel('y')
ax.set_title('title')

##########################

# exercise 2

fig=plt.figure()
ax1=fig.add_axes([0,0,1,1])
ax2=fig.add_axes([0.2,0.5,0.2,0.2])

ax1.plot(x,y,color='red')
ax2.plot(x,y,color='red')



##########################

#exercise 3
fig=plt.figure()
ax=fig.add_axes([0,0,1,1])
ax2=fig.add_axes([0.2,0.5,0.4,0.4])

ax.plot(x,z)
ax.set_xlabel('x')
ax.set_ylabel('z')

ax2.plot(x,y)
ax2.set_title('zoom')
ax2.set_xlabel('x')
ax2.set_ylabel('y')
ax2.set_xlim([20,22])
ax2.set_ylim([30,50])

##########################

# exercise 4.01

fig,axes=plt.subplots(1,2)
axes[0].plot(x,y,ls='-',color='blue',lw=3)

axes[1].plot(x,z,color='red',lw=3)


# exercise 4.01, resized with figsize

fig,axes=plt.subplots(1,2,figsize=(12,2))
axes[0].plot(x,y,ls='-',color='blue',lw=3)

axes[1].plot(x,z,color='red',lw=3)


2. seaborn

Install seaborn with Anaconda: conda install seaborn
Install seaborn with pip: pip install seaborn
# Distribution Plots
import seaborn as sns
%matplotlib inline
tips=sns.load_dataset('tips')
tips.head()

sns.displot(tips['total_bill'])

sns.displot(tips['total_bill'],kde=False,bins=30)


################

sns.jointplot(x='total_bill',y='tip',data=tips)
sns.jointplot(x='total_bill',y='tip',data=tips,kind='hex')
sns.jointplot(x='total_bill',y='tip',data=tips,kind='reg')

#######################

sns.pairplot(tips)
sns.pairplot(tips,hue='sex',palette='coolwarm')

############################

sns.rugplot(tips['total_bill'])


##############

sns.kdeplot(tips['total_bill'])


###########################################

# Categorical plots

import seaborn as sns
import numpy as np
%matplotlib inline
tips=sns.load_dataset('tips')
tips.head()

sns.barplot(x='sex',y='total_bill',data=tips)  # bar height is the mean by default

sns.barplot(x='sex',y='total_bill',data=tips,estimator=np.std)  # bar height is the standard deviation
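A bar plot of this kind is an aggregation per category: each bar is the estimator applied to the y values of one group. The sketch below reproduces the numbers with a pandas groupby on a tiny hand-made frame. The frame tips_small and its values are hypothetical and stand in for the real tips dataset.

```python
import pandas as pd

# hypothetical miniature stand-in for the tips dataset
tips_small = pd.DataFrame({
    "sex": ["Male", "Female", "Male", "Female", "Male"],
    "total_bill": [10.0, 20.0, 30.0, 40.0, 20.0],
})

# default estimator: the mean of total_bill within each category
means = tips_small.groupby("sex")["total_bill"].mean()

# a spread-based estimator; pandas .std() uses the sample formula (ddof=1),
# so it can differ slightly from np.std, which defaults to ddof=0
spreads = tips_small.groupby("sex")["total_bill"].std()

print(means)
print(spreads)
```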

sns.countplot(x='sex',data=tips)

sns.boxplot(x='day',y='total_bill',data=tips)

sns.boxplot(x='day',y='total_bill',data=tips,hue='smoker')

sns.violinplot(x='day',y='total_bill',data=tips)

sns.violinplot(x='day',y='total_bill',data=tips,hue='sex',split=True)

sns.stripplot(x='day',y='total_bill',data=tips,jitter=True,hue='sex',dodge=True)  # dodge replaced the older split argument

#######
sns.violinplot(x='day',y='total_bill',data=tips)
sns.swarmplot(x='day',y='total_bill',data=tips,color='black')
##########

sns.catplot(x='day',y='total_bill',data=tips,kind='bar')  # catplot is the current name of factorplot

#####################################################

#Matrix Plots

import seaborn as sns
%matplotlib inline
tips=sns.load_dataset('tips')
flights=sns.load_dataset('flights')
tips.head()
flights.head()

tc=tips.corr(numeric_only=True)  # correlate numeric columns only (numeric_only needs pandas 1.5 or later)
sns.heatmap(tc)
sns.heatmap(tc,annot=True,cmap='coolwarm')

flights.pivot_table(index='month',columns='year',values='passengers')

fp=flights.pivot_table(index='month',columns='year',values='passengers')
sns.heatmap(fp)
sns.heatmap(fp,cmap='magma',linecolor='white',linewidths=1)

sns.clustermap(fp)
sns.clustermap(fp,cmap='coolwarm',standard_scale=1)
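heatmap and clustermap expect data in matrix form, which is why the long-form flights table is pivoted first. A minimal sketch of that reshaping follows, using a four-row frame typed in by hand for illustration rather than the real dataset.

```python
import pandas as pd

# hand-typed long-form sample: one row per (year, month) observation
long_df = pd.DataFrame({
    "year": [1949, 1949, 1950, 1950],
    "month": ["Jan", "Feb", "Jan", "Feb"],
    "passengers": [112, 118, 115, 126],
})

# rows become months, columns become years, cells hold the passenger counts
wide = long_df.pivot_table(index="month", columns="year", values="passengers")

print(wide)
```

Plain string months are sorted alphabetically by pivot_table. The seaborn flights dataset stores month as an ordered categorical, so there the rows come out in calendar order.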



############################################################
#### Grids

import seaborn as sns
import matplotlib.pyplot as plt
%matplotlib inline
iris=sns.load_dataset('iris')
iris.head()

sns.pairplot(iris)
sns.PairGrid(iris)
g=sns.PairGrid(iris)
g.map(plt.scatter)
g.map_diag(sns.histplot)  # axes-level histogram; displot is figure-level and cannot be mapped onto a grid
g.map_upper(plt.scatter)
g.map_lower(sns.kdeplot)


tips=sns.load_dataset('tips')
tips.head()
g=sns.FacetGrid(data=tips,col='time',row='smoker')
g.map(sns.histplot,'total_bill')
g.map(plt.scatter,'total_bill','tip')

##############################################
# regression plots
import seaborn as sns
%matplotlib inline
tips=sns.load_dataset('tips')
tips.head()
sns.lmplot(x='total_bill',y='tip',data=tips,hue='sex',markers=['o','v'],scatter_kws={'s':100})

sns.lmplot(x='total_bill',y='tip',data=tips,col='sex',row='time')
sns.lmplot(x='total_bill',y='tip',data=tips,col='day',hue='sex',aspect=0.6,height=8)  # height replaced the older size argument


#################################
# style and color
import seaborn as sns
import matplotlib.pyplot as plt
%matplotlib inline
tips=sns.load_dataset('tips')

sns.set_style('white')  # or 'whitegrid'
sns.countplot(x='sex',data=tips)

sns.set_style('ticks')
sns.despine(left=True,right=False)

plt.figure(figsize=(12,3))
sns.countplot(x='sex',data=tips)

sns.set_context('poster',font_scale=13)  # or 'notebook'
sns.countplot(x='sex',data=tips)


sns.lmplot(x='total_bill',y='tip',data=tips,hue='sex',palette='coolwarm')  # or palette='seismic'


######################################
#seaborn exercises

import seaborn as sns
import matplotlib.pyplot as plt
%matplotlib inline
sns.set_style('whitegrid')
titanic=sns.load_dataset('titanic')

sns.jointplot(x='fare',y='age',data=titanic)

sns.histplot(titanic['fare'],kde=False,color='red',bins=30)  # histplot replaces the deprecated distplot

sns.boxplot(x='class',y='age',data=titanic,palette='rainbow')

sns.swarmplot(x='class',y='age',data=titanic,palette='Set2')

sns.countplot(x='sex',data=titanic)

sns.heatmap(titanic.corr(numeric_only=True),cmap='coolwarm')
plt.title('titanic')

g=sns.FacetGrid(data=titanic,col='sex')
g.map(plt.hist,'age')

g.map(sns.histplot,'age')

pandas data visualization exercise solutions

import pandas as pd
import matplotlib.pyplot as plt
df3=pd.read_csv('df3')

%matplotlib inline

df3.plot.scatter(x='a',y='b',s=50,c='red',figsize=(10,10))

df3['a'].hist()

df3['a'].plot.hist()

plt.style.use('ggplot')

df3['a'].plot.hist(bins=20,alpha=0.5)

df3['d'].plot.kde(lw=5,ls='--')

df3.loc[0:30].plot.area()  # .loc replaces .ix, which was removed from pandas


#####
f=plt.figure()
df3.loc[0:30].plot.area(alpha=0.4)
plt.legend(loc='center left',bbox_to_anchor=(1.0,0.5))
plt.show()
#######


######################################
#pandas built-in data visualization

import numpy as np
import pandas as pd
import seaborn as sns
%matplotlib inline
df1=pd.read_csv('df1',index_col=0)
df2=pd.read_csv('df2')

df1['A'].hist(bins=30)

df1['A'].plot(kind='hist',bins=30)

df1['A'].hist()

df2.plot.bar()

df2.plot.bar(stacked=True)

df1['A'].plot.hist(bins=50)

df1.head()

df1.plot.line(y='B')  # the index is used for x by default
df2.plot.area()

df1.plot.line(y='B',figsize=(12,10))

df1.plot.scatter(x='A',y='B',c='C',cmap='coolwarm')

df1.plot.scatter(x='A',y='B',s=df1['C']*10)



df2.plot.box()

df=pd.DataFrame(np.random.randn(1000,2),columns=['a','b'])

df.plot.hexbin(x='a',y='b',gridsize=25,cmap='coolwarm')

df2['a'].plot.kde()

df2['a'].plot.density()

df2.plot.density()
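The pandas plot methods draw with matplotlib and return the Axes they drew on, so the matplotlib calls from the first part of these notes can be used to finish the figure. A minimal sketch with random data follows; the seed and the column names are assumptions made for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: draw off-screen so no display is needed
import matplotlib.pyplot as plt
from matplotlib.axes import Axes
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)  # assumption: fixed seed for a repeatable run
df = pd.DataFrame(rng.normal(size=(50, 2)), columns=["a", "b"])

# the pandas call returns a matplotlib Axes
ax = df["a"].plot.hist(bins=20, alpha=0.5)

# ordinary matplotlib methods work on it
ax.set_title("column a")
ax.set_xlabel("value")

is_axes = isinstance(ax, Axes)
n_bars = len(ax.patches)  # one rectangle per histogram bin
print(is_axes, n_bars)
```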


plotly and cufflinks

Install plotly and cufflinks with pip (after a pip install, both work normally in Jupyter Notebook):
pip install plotly
pip install cufflinks

Install plotly with Anaconda:
conda install -c https://conda.anaconda.org/plotly 
or conda install -c plotly plotly=3.6.0
import pandas as pd
import numpy as np
from plotly import __version__

print(__version__)

#######
import cufflinks as cf
from plotly.offline import download_plotlyjs,init_notebook_mode,plot,iplot

init_notebook_mode(connected=True)

cf.go_offline()

###DATA
df=pd.DataFrame(np.random.randn(100,4),columns='A B C D'.split())
df.head()

df2=pd.DataFrame({'Category':['A','B','C'],'Values':[32,43,50]})

df2.plot()

df.plot()

%matplotlib inline

df.iplot()

df.iplot(kind='scatter',x='A',y='B',mode='markers',size=20)

df2.iplot(kind='bar',x='Category',y='Values')

df.iplot(kind='bar')

df.count().iplot(kind='bar')

df.sum().iplot(kind='bar')

df.iplot(kind='box')

df3=pd.DataFrame({'x':[1,2,3,4,5],'y':[10,20,30,20,10],'z':[500,400,300,200,100]})

df3.iplot(kind='surface',colorscale='rdylbu')

df['A'].iplot(kind='hist',bins=50)

df.iplot(kind='hist')

df[['A','B']].iplot(kind='spread')

df.iplot(kind='bubble',x='A',y='B',size='C')


df.scatter_matrix()

Geographical plotting

 

method 1: plotly 
method 2: matplotlib basemap

#################

choropleth maps

#################################
import plotly.plotly as py  # plotly 3.x only; from plotly 4 on this module moved to the chart_studio package
import plotly.graph_objs as go
from plotly.offline import download_plotlyjs,init_notebook_mode,plot,iplot

init_notebook_mode(connected=True)
import pandas as pd

data=dict(type='choropleth',locations=['AZ','CA','WV'],
locationmode='USA-states',
colorscale='Portland',
text=['text 1','text 2','text 3'],
z=[1.0,2.0,3.0],
colorbar={'title':'Colorbar Title Goes Here'}
)

layout=dict(geo={'scope':'usa'})
choromap=go.Figure(data=[data],layout=layout)
iplot(choromap)

 
