Titanic survival 1

qq_35679961

Article link: https://blog.csdn.net/qq_35679961/article/details/52023453

###Python has a nice csv reader, which reads each line of a file into memory. You can read in each row and just append it to a list. From there, you can

####quickly turn it into an array. The first thing to do is to import the relevant packages that I will need for my script. These include numpy (for

#####maths and arrays) and csv for reading and writing csv files. If I want to use something from these, I need to call csv.[function] or np.[function].

###first

import csv as csv

import numpy as np

#######Open up the csv file into a Python object

csv_file_object=csv.reader(open('D:\udacity P2/train.csv','rb'))

header=csv_file_object.next() ####the next() command just skips the first line, which is a header

data=[] ######Create a variable called 'data'

for row in csv_file_object: ####run through each row in the csv file,

    data.append(row) ####adding each row to the data variable

data=np.array(data) ####then convert from a list to an array. Be aware that each item is currently a string in this format
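#####A minimal sketch of the same read-into-array step, written for Python 3 (where the header is skipped with next(reader) instead of reader.next()). The sample rows and the name sample_csv are made up for illustration, so the snippet runs without train.csv.

```python
import csv
import io

import numpy as np

# Hypothetical stand-in for train.csv (first five columns only, invented rows).
sample_csv = io.StringIO(
    "PassengerId,Survived,Pclass,Name,Sex\n"
    '1,0,3,"Doe, Mr. John",male\n'
    '2,1,1,"Roe, Mrs. Jane",female\n'
    '3,1,3,"Poe, Miss. Ann",female\n'
)

reader = csv.reader(sample_csv)
header = next(reader)                      # skip the header row
data = np.array([row for row in reader])   # every item is still a string

print(header)
print(data.shape)      # rows x columns
print(data[0::, 4])    # the gender column
```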

#######Now if you want to call a specific column of data, say, the gender column, I can just type data[0::,4], remembering that "0::" means all

######(from start to end), and Python starts indices from 0 (not 1). You should be aware that the csv reader works by default with strings, so you

####will need to convert to floats in order to do numerical calculations. For example, you can turn the Pclass variable into floats by using

#####data[0::,2].astype(np.float). Using this, we can calculate the proportion of survivors on the Titanic:

#####The size() function counts how many elements are in the array and sum() (as you would expect) sums up the elements in the array.

number_passengers=np.size(data[0::,1].astype(np.float))

number_survived=np.sum(data[0::,1].astype(np.float))

proportion_survivors=number_survived / number_passengers
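#####A small worked sketch of the same calculation on a made-up Survived column (the values are hypothetical, not taken from train.csv). It uses the builtin float for the conversion, because the np.float alias was removed in NumPy 1.24.

```python
import numpy as np

# Hypothetical Survived column, stored as strings the way csv.reader returns it.
survived_column = np.array(["0", "1", "1", "0"])

survived = survived_column.astype(float)   # strings -> floats
number_passengers = np.size(survived)      # how many elements
number_survived = np.sum(survived)         # the 1s add up to the survivor count
proportion_survivors = number_survived / number_passengers

print(proportion_survivors)
```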

######numpy has some lovely functions. For example, we can search the gender column, find where any elements equal female (and for males,

######'do not equal female'), and then use this to determine the number of females and males that survived:

women_only_stats=data[0::,4]=="female" ###this finds all the elements in the gender column that equal "female"

men_only_stats=data[0::,4]!="female" ####this finds all the elements that do not equal female (i.e. male)

########We use these two new variables as "masks" on our original train data, so we can select only those women, and only those men, on

########board, then calculate the proportion of those who survived:

######using the index from above we select the females and males separately

women_onboard=data[women_only_stats,1].astype(np.float)

men_onboard=data[men_only_stats,1].astype(np.float)

####then we find the proportions of them that survived

proportion_women_survived=\
    np.sum(women_onboard)/np.size(women_onboard)

proportion_men_survived=\
    np.sum(men_onboard)/np.size(men_onboard)

####and then print it out

print 'Proportion of women who survived is %s' % proportion_women_survived

print 'Proportion of men who survived is %s' % proportion_men_survived
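#####The masking steps above can be tried on a tiny made-up array. This is a Python 3 sketch under stated assumptions: the six rows are hypothetical and only mimic the column order of train.csv (Survived in column 1, gender in column 4).

```python
import numpy as np

# Hypothetical rows: [PassengerId, Survived, Pclass, Name, Sex]
data = np.array([
    ["1", "0", "3", "A", "male"],
    ["2", "1", "1", "B", "female"],
    ["3", "1", "3", "C", "female"],
    ["4", "0", "2", "D", "male"],
    ["5", "1", "1", "E", "male"],
    ["6", "0", "3", "F", "female"],
])

# Boolean masks: one True/False entry per row.
women_only_stats = data[0::, 4] == "female"
men_only_stats = data[0::, 4] != "female"

# Use the masks to pick the Survived values for each group.
women_onboard = data[women_only_stats, 1].astype(float)
men_onboard = data[men_only_stats, 1].astype(float)

proportion_women_survived = np.sum(women_onboard) / np.size(women_onboard)
proportion_men_survived = np.sum(men_onboard) / np.size(men_onboard)

print('Proportion of women who survived is %s' % proportion_women_survived)
print('Proportion of men who survived is %s' % proportion_men_survived)
```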

#####Now that I have my indication that women were much more likely to survive, I am done with the training set.

######Reading the test data and writing the gender model as a csv

######As before, we need to read in the test file by opening a Python object to read and another to write. First, we read in the test.csv file and

####skip the header line:

test_file=open('D:\udacity P2/test.csv','rb')

test_file_object=csv.reader(test_file)

header=test_file_object.next()

#####Now, let's open a pointer to a new file so we can write to it (this file does not exist yet). Call it something descriptive so that it is recognizable

#####when we upload it:

prediction_file=open("genderbasedmodel.csv","wb")

prediction_file_object=csv.writer(prediction_file)

#####We now want to read in the test file row by row, see if it is female or male, and write our survival prediction to a new file

prediction_file_object.writerow(["PassengerId","Survived"])

for row in test_file_object: #######for each row in test.csv

    if row[3]=='female': ############is it a female, if yes then

        prediction_file_object.writerow([row[0],'1']) #############predict 1

    else:

        prediction_file_object.writerow([row[0],'0']) #########predict 0

test_file.close()

prediction_file.close()
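#####A self-contained Python 3 sketch of the gender-based prediction step. It is only an illustration: in-memory io.StringIO buffers stand in for test.csv and genderbasedmodel.csv, and the three passenger rows are invented. On disk under Python 3, the output file would be opened in text mode with newline="" rather than "wb".

```python
import csv
import io

# Hypothetical stand-in for test.csv: [PassengerId, Pclass, Name, Sex]
test_file = io.StringIO(
    "PassengerId,Pclass,Name,Sex\n"
    "892,3,A,male\n"
    "893,3,B,female\n"
    "894,2,C,male\n"
)
# In-memory stand-in for the prediction file.
prediction_file = io.StringIO()

test_file_object = csv.reader(test_file)
header = next(test_file_object)  # skip the header line

# lineterminator="\n" keeps the captured text easy to read (the default is "\r\n").
prediction_file_object = csv.writer(prediction_file, lineterminator="\n")

prediction_file_object.writerow(["PassengerId", "Survived"])
for row in test_file_object:
    # Gender is column 3 in this layout: predict 1 for females, 0 otherwise.
    prediction = '1' if row[3] == 'female' else '0'
    prediction_file_object.writerow([row[0], prediction])

output = prediction_file.getvalue()
print(output)
```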
