Cleaning, Analyzing, and Visualizing the Twitter Dog Archive Dataset

Gathering Data

# Import the required modules
import requests
import os
import pandas as pd
import json
import numpy as np
import re

The twitter_archive_enhanced table

twitter_archive_enhanced = pd.read_csv("twitter-archive-enhanced.csv")  # read the CSV file into a DataFrame
#tweet_id = twitter_archive_enhanced["tweet_id"].astype(str).tolist()
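
Tweet IDs are 18-digit identifiers rather than quantities, so it can be convenient to read them as strings from the start. The following is a minimal sketch, assuming a tiny in-memory CSV in place of the real file; `sample_csv` and `sample_archive` are illustrative names, and only two columns are included.

```python
import io

import pandas as pd

# Illustrative stand-in for twitter-archive-enhanced.csv (two columns only).
sample_csv = io.StringIO(
    "tweet_id,rating_numerator\n"
    "892420643555336193,13\n"
    "892177421306343426,13\n"
)

# dtype={"tweet_id": str} keeps the IDs as text instead of parsing them as integers.
sample_archive = pd.read_csv(sample_csv, dtype={"tweet_id": str})
print(sample_archive["tweet_id"].tolist())
```

Reading the ID as text would also make it directly comparable with the `id_str` values taken from the JSON file.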

The extra_data table

# Read the text file into a DataFrame; each line holds one JSON object, so the json library is needed to parse it
tweet_list = []  # collect one dict per tweet first; a list of dicts converts easily to a DataFrame
with open("tweet_json.txt") as file:  # open the text file
    for line in file:  # loop over the file, reading one JSON line at a time
        tweet = json.loads(line)  # parse each line once instead of three times
        tweet_list.append({"tweet_id": tweet["id_str"],  # take "id_str": the numeric "id" would need a type conversion later
                           "retweet_count": tweet["retweet_count"],  # number of retweets
                           "favorite_count": tweet["favorite_count"]})  # number of likes
extra_data = pd.DataFrame(tweet_list, columns=["tweet_id", "retweet_count", "favorite_count"])  # convert to a DataFrame
len(tweet_list)
2352
extra_data.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 2352 entries, 0 to 2351
Data columns (total 3 columns):
tweet_id          2352 non-null object
retweet_count     2352 non-null int64
favorite_count    2352 non-null int64
dtypes: int64(2), object(1)
memory usage: 55.2+ KB
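
The loop above assumes every line is valid JSON with all three keys present. A more defensive variant is sketched below under that caveat: `parse_tweet_lines` is a hypothetical helper, not part of the original notebook, and the sample lines are illustrative stand-ins for `tweet_json.txt`.

```python
import io
import json


def parse_tweet_lines(lines):
    """Return (records, skipped) from an iterable of JSON-lines text."""
    records, skipped = [], 0
    for line in lines:
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        try:
            tweet = json.loads(line)
            records.append({
                "tweet_id": tweet["id_str"],
                "retweet_count": tweet["retweet_count"],
                "favorite_count": tweet["favorite_count"],
            })
        except (json.JSONDecodeError, KeyError):
            skipped += 1  # count malformed or incomplete lines instead of crashing
    return records, skipped


# Illustrative input: one valid line, one blank line, one truncated line.
sample = io.StringIO(
    '{"id_str": "892420643555336193", "retweet_count": 8842, "favorite_count": 39492}\n'
    "\n"
    '{"id_str": "broken"\n'
)
records, skipped = parse_tweet_lines(sample)
print(len(records), skipped)
```

The returned list of dicts can be passed to `pd.DataFrame` exactly as `tweet_list` is above.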

The image_predictions table

# Save the downloaded file in the folder_name directory; create the folder first if it does not exist.
folder_name = os.getcwd()  # use the current working directory
if not os.path.exists(folder_name):
    os.makedirs(folder_name)
# Call requests.get on the URL of the tweet image predictions file; it returns a response
url = "https://raw.githubusercontent.com/udacity/new-dand-advanced-china/master/%E6%95%B0%E6%8D%AE%E6%B8%85%E6%B4%97/WeRateDogs%E9%A1%B9%E7%9B%AE/image-predictions.tsv"
response = requests.get(url)
response.raise_for_status()  # stop here if the download failed
# Save the file to that folder
file_path = os.path.join(folder_name, url.split("/")[-1])
with open(file_path, mode="wb") as file:
    file.write(response.content)

image_predictions = pd.read_csv(file_path, sep="\t")  # read the TSV file into a DataFrame
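
The download step can be wrapped in a small reusable function. This is only a sketch: `local_path_for` and `download` are hypothetical helper names, the `example.com` URL is a placeholder, and the 30-second timeout is an arbitrary assumption.

```python
import os
import posixpath
from urllib.parse import unquote, urlsplit


def local_path_for(url, folder):
    """Build a local file path from the last path segment of a URL."""
    # urlsplit drops any query string; unquote decodes percent-escapes.
    file_name = unquote(posixpath.basename(urlsplit(url).path))
    return os.path.join(folder, file_name)


def download(url, folder, timeout=30):
    """Download url into folder and return the path of the saved file."""
    import requests  # imported lazily so local_path_for works without it

    os.makedirs(folder, exist_ok=True)  # create the folder only if it is missing
    response = requests.get(url, timeout=timeout)
    response.raise_for_status()  # raise an exception on HTTP error codes
    path = local_path_for(url, folder)
    with open(path, mode="wb") as file:
        file.write(response.content)
    return path


# Placeholder URL, used only to show how the file name is derived.
print(local_path_for("https://example.com/data/image-predictions.tsv?raw=1", "data"))
```

Using `os.makedirs(folder, exist_ok=True)` replaces the separate `os.path.exists` check with a single call.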

Assessing

# Show the full content of every cell
pd.set_option('display.max_colwidth', 1000)
twitter_archive_enhanced
index | tweet_id | in_reply_to_status_id | in_reply_to_user_id | timestamp | source | text | retweeted_status_id | retweeted_status_user_id | retweeted_status_timestamp | expanded_urls | rating_numerator | rating_denominator | name | doggo | floofer | pupper | puppo
0 | 892420643555336193 | NaN | NaN | 2017-08-01 16:23:56 +0000 | <a href="http://twitter.com/download/iphone" rel="nofollow">Twitter for iPhone</a> | This is Phineas. He's a mystical boy. Only ever appears in the hole of a donut. 13/10 https://t.co/MgUWQ76dJU | NaN | NaN | NaN | https://twitter.com/dog_rates/status/892420643555336193/photo/1 | 13 | 10 | Phineas | None | None | None | None
1 | 892177421306343426 | NaN | NaN | 2017-08-01 00:17:27 +0000 | <a href="http://twitter.com/download/iphone" rel="nofollow">Twitter for iPhone</a> | This is Tilly. She's just checking pup on you. Hopes you're doing ok. If not, she's available for pats, snugs, boops, the whole bit. 13/10 https://t.co/0Xxu71qeIV | NaN | NaN | NaN | https://twitter.com/dog_rates/status/892177421306343426/photo/1 | 13 | 10 | Tilly | None | None | None | None
2 | 891815181378084864 | NaN | NaN | 2017-07-31 00:18:03 +0000 | <a href="http://twitter.com/download/iphone" rel="nofollow">Twitter for iPhone</a> | This is Archie. He is a rare Norwegian Pouncing Corgo. Lives in the tall grass. You never know when one may strike. 12/10 https://t.co/wUnZnhtVJB | NaN | NaN | NaN | https://twitter.com/dog_rates/status/891815181378084864/photo/1 | 12 | 10 | Archie | None | None | None | None
3 | 891689557279858688 | NaN | NaN | 2017-07-30 15:58:51 +0000 | <a href="http://twitter.com/download/iphone" rel="nofollow">Twitter for iPhone</a> | This is Darla. She commenced a snooze mid meal. 13/10 happens to the best of us https://t.co/tD36da7qLQ | NaN | NaN | NaN | https://twitter.com/dog_rates/status/891689557279858688/photo/1 | 13 | 10 | Darla | None | None | None | None
4 | 891327558926688256 | NaN | NaN | 2017-07-29 16:00:24 +0000 | <a href="http://twitter.com/download/iphone" rel="nofollow">Twitter for iPhone</a> | This is Franklin. He would like you to stop calling him "cute." He is a very fierce shark and should be respected as such. 12/10 #BarkWeek https://t.co/AtUZn91f7f | NaN | NaN | NaN | https://twitter.com/dog_rates/status/891327558926688256/photo/1,https://twitter.com/dog_rates/status/891327558926688256/photo/1 | 12 | 10 | Franklin | None | None | None | None
5 | 891087950875897856 | NaN | NaN | 2017-07-29 00:08:17 +0000 | <a href="http://twitter.com/download/iphone" rel="nofollow">Twitter for iPhone</a> | Here we have a majestic great white breaching off South Africa's coast. Absolutely h*ckin breathtaking. 13/10 (IG: tucker_marlo) #BarkWeek https://t.co/kQ04fDDRmh | NaN | NaN | NaN | https://twitter.com/dog_rates/status/891087950875897856/photo/1 | 13 | 10 | None | None | None | None | None
6 | 890971913173991426 | NaN | NaN | 2017-07-28 16:27:12 +0000 | <a href="http://twitter.com/download/iphone" rel="nofollow">Twitter for iPhone</a> | Meet Jax. He enjoys ice cream so much he gets nervous around it. 13/10 help Jax enjoy more things by clicking below\n\nhttps://t.co/Zr4hWfAs1H https://t.co/tVJBRMnhxl | NaN | NaN | NaN | https://gofundme.com/ydvmve-surgery-for-jax,https://twitter.com/dog_rates/status/890971913173991426/photo/1 | 13 | 10 | Jax | None | None | None | None
7 | 890729181411237888 | NaN | NaN | 2017-07-28 00:22:40 +0000 | <a href="http://twitter.com/download/iphone" rel="nofollow">Twitter for iPhone</a> | When you watch your owner call another dog a good boy but then they turn back to you and say you're a great boy. 13/10 https://t.co/v0nONBcwxq | NaN | NaN | NaN | https://twitter.com/dog_rates/status/890729181411237888/photo/1,https://twitter.com/dog_rates/status/890729181411237888/photo/1 | 13 | 10 | None | None | None | None | None
8 | 890609185150312448 | NaN | NaN | 2017-07-27 16:25:51 +0000 | <a href="http://twitter.com/download/iphone" rel="nofollow">Twitter for iPhone</a> | This is Zoey. She doesn't want to be one of the scary sharks. Just wants to be a snuggly pettable boatpet. 13/10 #BarkWeek https://t.co/9TwLuAGH0b | NaN | NaN | NaN | https://twitter.com/dog_rates/status/890609185150312448/photo/1 | 13 | 10 | Zoey | None | None | None | None
9 | 890240255349198849 | NaN | NaN | 2017-07-26 15:59:51 +0000 | <a href="http://twitter.com/download/iphone" rel="nofollow">Twitter for iPhone</a> | This is Cassie. She is a college pup. Studying international doggo communication and stick theory. 14/10 so elegant much sophisticate https://t.co/t1bfwz5S2A | NaN | NaN | NaN | https://twitter.com/dog_rates/status/890240255349198849/photo/1 | 14 | 10 | Cassie | doggo | None | None | None
10 | 890006608113172480 | NaN | NaN | 2017-07-26 00:31:25 +0000 | <a href="http://twitter.com/download/iphone" rel="nofollow">Twitter for iPhone</a> | This is Koda. He is a South Australian deckshark. Deceptively deadly. Frighteningly majestic. 13/10 would risk a petting #BarkWeek https://t.co/dVPW0B0Mme | NaN | NaN | NaN | https://twitter.com/dog_rates/status/890006608113172480/photo/1,https://twitter.com/dog_rates/status/890006608113172480/photo/1 | 13 | 10 | Koda | None | None | None | None
11 | 889880896479866881 | NaN | NaN | 2017-07-25 16:11:53 +0000 | <a href="http://twitter.com/download/iphone" rel="nofollow">Twitter for iPhone</a> | This is Bruno. He is a service shark. Only gets out of the water to assist you. 13/10 terrifyingly good boy https://t.co/u1XPQMl29g | NaN | NaN | NaN | https://twitter.com/dog_rates/status/889880896479866881/photo/1 | 13 | 10 | Bruno | None | None | None | None
12 | 889665388333682689 | NaN | NaN | 2017-07-25 01:55:32 +0000 | <a href="http://twitter.com/download/iphone" rel="nofollow">Twitter for iPhone</a> | Here's a puppo that seems to be on the fence about something haha no but seriously someone help her. 13/10 https://t.co/BxvuXk0UCm | NaN | NaN | NaN | https://twitter.com/dog_rates/status/889665388333682689/photo/1 | 13 | 10 | None | None | None | None | puppo
13 | 889638837579907072 | NaN | NaN | 2017-07-25 00:10:02 +0000 | <a href="http://twitter.com/download/iphone" rel="nofollow">Twitter for iPhone</a> | This is Ted. He does his best. Sometimes that's not enough. But it's ok. 12/10 would assist https://t.co/f8dEDcrKSR | NaN | NaN | NaN | https://twitter.com/dog_rates/status/889638837579907072/photo/1,https://twitter.com/dog_rates/status/889638837579907072/photo/1 | 12 | 10 | Ted | None | None | None | None
14 | 889531135344209921 | NaN | NaN | 2017-07-24 17:02:04 +0000 | <a href="http://twitter.com/download/iphone" rel="nofollow">Twitter for iPhone</a> | This is Stuart. He's sporting his favorite fanny pack. Secretly filled with bones only. 13/10 puppared puppo #BarkWeek https://t.co/y70o6h3isq | NaN | NaN | NaN | https://twitter.com/dog_rates/status/889531135344209921/photo/1 | 13 | 10 | Stuart | None | None | None | puppo
15 | 889278841981685760 | NaN | NaN | 2017-07-24 00:19:32 +0000 | <a href="http://twitter.com/download/iphone" rel="nofollow">Twitter for iPhone</a> | This is Oliver. You're witnessing one of his many brutal attacks. Seems to be playing with his victim. 13/10 fr*ckin frightening #BarkWeek https://t.co/WpHvrQedPb | NaN | NaN | NaN | https://twitter.com/dog_rates/status/889278841981685760/video/1 | 13 | 10 | Oliver | None | None | None | None
16 | 888917238123831296 | NaN | NaN | 2017-07-23 00:22:39 +0000 | <a href="http://twitter.com/download/iphone" rel="nofollow">Twitter for iPhone</a> | This is Jim. He found a fren. Taught him how to sit like the good boys. 12/10 for both https://t.co/chxruIOUJN | NaN | NaN | NaN | https://twitter.com/dog_rates/status/888917238123831296/photo/1 | 12 | 10 | Jim | None | None | None | None
17 | 888804989199671297 | NaN | NaN | 2017-07-22 16:56:37 +0000 | <a href="http://twitter.com/download/iphone" rel="nofollow">Twitter for iPhone</a> | This is Zeke. He has a new stick. Very proud of it. Would like you to throw it for him without taking it. 13/10 would do my best https://t.co/HTQ77yNQ5K | NaN | NaN | NaN | https://twitter.com/dog_rates/status/888804989199671297/photo/1,https://twitter.com/dog_rates/status/888804989199671297/photo/1 | 13 | 10 | Zeke | None | None | None | None
18 | 888554962724278272 | NaN | NaN | 2017-07-22 00:23:06 +0000 | <a href="http://twitter.com/download/iphone" rel="nofollow">Twitter for iPhone</a> | This is Ralphus. He's powering up. Attempting maximum borkdrive. 13/10 inspirational af https://t.co/YnYAFCTTiK | NaN | NaN | NaN | https://twitter.com/dog_rates/status/888554962724278272/photo/1,https://twitter.com/dog_rates/status/888554962724278272/photo/1,https://twitter.com/dog_rates/status/888554962724278272/photo/1,https://twitter.com/dog_rates/status/888554962724278272/photo/1 | 13 | 10 | Ralphus | None | None | None | None
19 | 888202515573088257 | NaN | NaN | 2017-07-21 01:02:36 +0000 | <a href="http://twitter.com/download/iphone" rel="nofollow">Twitter for iPhone</a> | RT @dog_rates: This is Canela. She attempted some fancy porch pics. They were unsuccessful. 13/10 someone help her https://t.co/cLyzpcUcMX | 8.874740e+17 | 4.196984e+09 | 2017-07-19 00:47:34 +0000 | https://twitter.com/dog_rates/status/887473957103951883/photo/1,https://twitter.com/dog_rates/status/887473957103951883/photo/1,https://twitter.com/dog_rates/status/887473957103951883/photo/1,https://twitter.com/dog_rates/status/887473957103951883/photo/1 | 13 | 10 | Canela | None | None | None | None
20 | 888078434458587136 | NaN | NaN | 2017-07-20 16:49:33 +0000 | <a href="http://twitter.com/download/iphone" rel="nofollow">Twitter for iPhone</a> | This is Gerald. He was just told he didn't get the job he interviewed for. A h*ckin injustice. 12/10 didn't want the job anyway https://t.co/DK7iDPfuRX | NaN | NaN | NaN | https://twitter.com/dog_rates/status/888078434458587136/photo/1,https://twitter.com/dog_rates/status/888078434458587136/photo/1 | 12 | 10 | Gerald | None | None | None | None
21 | 887705289381826560 | NaN | NaN | 2017-07-19 16:06:48 +0000 | <a href="http://twitter.com/download/iphone" rel="nofollow">Twitter for iPhone</a> | This is Jeffrey. He has a monopoly on the pool noodles. Currently running a 'boop for two' midweek sale. 13/10 h*ckin strategic https://t.co/PhrUk20Q64 | NaN | NaN | NaN | https://twitter.com/dog_rates/status/887705289381826560/photo/1 | 13 | 10 | Jeffrey | None | None | None | None
22 | 887517139158093824 | NaN | NaN | 2017-07-19 03:39:09 +0000 | <a href="http://twitter.com/download/iphone" rel="nofollow">Twitter for iPhone</a> | I've yet to rate a Venezuelan Hover Wiener. This is such an honor. 14/10 paw-inspiring af (IG: roxy.thedoxy) https://t.co/20VrLAA8ba | NaN | NaN | NaN | https://twitter.com/dog_rates/status/887517139158093824/video/1 | 14 | 10 | such | None | None | None | None
23 | 887473957103951883 | NaN | NaN | 2017-07-19 00:47:34 +0000 | <a href="http://twitter.com/download/iphone" rel="nofollow">Twitter for iPhone</a> | This is Canela. She attempted some fancy porch pics. They were unsuccessful. 13/10 someone help her https://t.co/cLyzpcUcMX | NaN | NaN | NaN | https://twitter.com/dog_rates/status/887473957103951883/photo/1,https://twitter.com/dog_rates/status/887473957103951883/photo/1 | 13 | 10 | Canela | None | None | None | None
24 | 887343217045368832 | NaN | NaN | 2017-07-18 16:08:03 +0000 | <a href="http://twitter.com/download/iphone" rel="nofollow">Twitter for iPhone</a> | You may not have known you needed to see this today. 13/10 please enjoy (IG: emmylouroo) https://t.co/WZqNqygEyV | NaN | NaN | NaN | https://twitter.com/dog_rates/status/887343217045368832/video/1 | 13 | 10 | None | None | None | None | None
25 | 887101392804085760 | NaN | NaN | 2017-07-18 00:07:08 +0000 | <a href="http://twitter.com/download/iphone" rel="nofollow">Twitter for iPhone</a> | This... is a Jubilant Antarctic House Bear. We only rate dogs. Please only send dogs. Thank you... 12/10 would suffocate in floof https://t.co/4Ad1jzJSdp | NaN | NaN | NaN | https://twitter.com/dog_rates/status/887101392804085760/photo/1 | 12 | 10 | None | None | None | None | None
26 | 886983233522544640 | NaN | NaN | 2017-07-17 16:17:36 +0000 | <a href="http://twitter.com/download/iphone" rel="nofollow">Twitter for iPhone</a> | This is Maya. She's very shy. Rarely leaves her cup. 13/10 would find her an environment to thrive in https://t.co/I6oNy0CgiT | NaN | NaN | NaN | https://twitter.com/dog_rates/status/886983233522544640/photo/1,https://twitter.com/dog_rates/status/886983233522544640/photo/1 | 13 | 10 | Maya | None | None | None | None
27 | 886736880519319552 | NaN | NaN | 2017-07-16 23:58:41 +0000 | <a href="http://twitter.com/download/iphone" rel="nofollow">Twitter for iPhone</a> | This is Mingus. He's a wonderful father to his smol pup. Confirmed 13/10, but he needs your help\n\nhttps://t.co/bVi0Yr4Cff https://t.co/ISvKOSkd5b | NaN | NaN | NaN | https://www.gofundme.com/mingusneedsus,https://twitter.com/dog_rates/status/886736880519319552/photo/1,https://twitter.com/dog_rates/status/886736880519319552/photo/1 | 13 | 10 | Mingus | None | None | None | None
28 | 886680336477933568 | NaN | NaN | 2017-07-16 20:14:00 +0000 | <a href="http://twitter.com/download/iphone" rel="nofollow">Twitter for iPhone</a> | This is Derek. He's late for a dog meeting. 13/10 pet...al to the metal https://t.co/BCoWue0abA | NaN | NaN | NaN | https://twitter.com/dog_rates/status/886680336477933568/photo/1 | 13 | 10 | Derek | None | None | None | None
29 | 886366144734445568 | NaN | NaN | 2017-07-15 23:25:31 +0000 | <a href="http://twitter.com/download/iphone" rel="nofollow">Twitter for iPhone</a> | This is Roscoe. Another pupper fallen victim to spontaneous tongue ejections. Get the BlepiPen immediate. 12/10 deep breaths Roscoe https://t.co/RGE08MIJox | NaN | NaN | NaN | https://twitter.com/dog_rates/status/886366144734445568/photo/1,https://twitter.com/dog_rates/status/886366144734445568/photo/1 | 12 | 10 | Roscoe | None | None | pupper | None
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ...
2326 | 666411507551481857 | NaN | NaN | 2015-11-17 00:24:19 +0000 | <a href="http://twitter.com/download/iphone" rel="nofollow">Twitter for iPhone</a> | This is quite the dog. Gets really excited when not in water. Not very soft tho. Bad at fetch. Can't do tricks. 2/10 https://t.co/aMCTNWO94t | NaN | NaN | NaN | https://twitter.com/dog_rates/status/666411507551481857/photo/1 | 2 | 10 | quite | None | None | None | None
2327 | 666407126856765440 | NaN | NaN | 2015-11-17 00:06:54 +0000 | <a href="http://twitter.com/download/iphone" rel="nofollow">Twitter for iPhone</a> | This is a southern Vesuvius bumblegruff. Can drive a truck (wow). Made friends with 5 other nifty dogs (neat). 7/10 https://t.co/LopTBkKa8h | NaN | NaN | NaN | https://twitter.com/dog_rates/status/666407126856765440/photo/1 | 7 | 10 | a | None | None | None | None
2328 | 666396247373291520 | NaN | NaN | 2015-11-16 23:23:41 +0000 | <a href="http://twitter.com/download/iphone" rel="nofollow">Twitter for iPhone</a> | Oh goodness. A super rare northeast Qdoba kangaroo mix. Massive feet. No pouch (disappointing). Seems alert. 9/10 https://t.co/Dc7b0E8qFE | NaN | NaN | NaN | https://twitter.com/dog_rates/status/666396247373291520/photo/1 | 9 | 10 | None | None | None | None | None
2329 | 666373753744588802 | NaN | NaN | 2015-11-16 21:54:18 +0000 | <a href="http://twitter.com/download/iphone" rel="nofollow">Twitter for iPhone</a> | Those are sunglasses and a jean jacket. 11/10 dog cool af https://t.co/uHXrPkUEyl | NaN | NaN | NaN | https://twitter.com/dog_rates/status/666373753744588802/photo/1 | 11 | 10 | None | None | None | None | None
2330 | 666362758909284353 | NaN | NaN | 2015-11-16 21:10:36 +0000 | <a href="http://twitter.com/download/iphone" rel="nofollow">Twitter for iPhone</a> | Unique dog here. Very small. Lives in container of Frosted Flakes (?). Short legs. Must be rare 6/10 would still pet https://t.co/XMD9CwjEnM | NaN | NaN | NaN | https://twitter.com/dog_rates/status/666362758909284353/photo/1 | 6 | 10 | None | None | None | None | None
2331 | 666353288456101888 | NaN | NaN | 2015-11-16 20:32:58 +0000 | <a href="http://twitter.com/download/iphone" rel="nofollow">Twitter for iPhone</a> | Here we have a mixed Asiago from the Galápagos Islands. Only one ear working. Big fan of marijuana carpet. 8/10 https://t.co/tltQ5w9aUO | NaN | NaN | NaN | https://twitter.com/dog_rates/status/666353288456101888/photo/1 | 8 | 10 | None | None | None | None | None
2332 | 666345417576210432 | NaN | NaN | 2015-11-16 20:01:42 +0000 | <a href="http://twitter.com/download/iphone" rel="nofollow">Twitter for iPhone</a> | Look at this jokester thinking seat belt laws don't apply to him. Great tongue tho 10/10 https://t.co/VFKG1vxGjB | NaN | NaN | NaN | https://twitter.com/dog_rates/status/666345417576210432/photo/1 | 10 | 10 | None | None | None | None | None
2333 | 666337882303524864 | NaN | NaN | 2015-11-16 19:31:45 +0000 | <a href="http://twitter.com/download/iphone" rel="nofollow">Twitter for iPhone</a> | This is an extremely rare horned Parthenon. Not amused. Wears shoes. Overall very nice. 9/10 would pet aggressively https://t.co/QpRjllzWAL | NaN | NaN | NaN | https://twitter.com/dog_rates/status/666337882303524864/photo/1 | 9 | 10 | an | None | None | None | None
2334 | 666293911632134144 | NaN | NaN | 2015-11-16 16:37:02 +0000 | <a href="http://twitter.com/download/iphone" rel="nofollow">Twitter for iPhone</a> | This is a funny dog. Weird toes. Won't come down. Loves branch. Refuses to eat his food. Hard to cuddle with. 3/10 https://t.co/IIXis0zta0 | NaN | NaN | NaN | https://twitter.com/dog_rates/status/666293911632134144/photo/1 | 3 | 10 | a | None | None | None | None
2335 | 666287406224695296 | NaN | NaN | 2015-11-16 16:11:11 +0000 | <a href="http://twitter.com/download/iphone" rel="nofollow">Twitter for iPhone</a> | This is an Albanian 3 1/2 legged Episcopalian. Loves well-polished hardwood flooring. Penis on the collar. 9/10 https://t.co/d9NcXFKwLv | NaN | NaN | NaN | https://twitter.com/dog_rates/status/666287406224695296/photo/1 | 1 | 2 | an | None | None | None | None
2336 | 666273097616637952 | NaN | NaN | 2015-11-16 15:14:19 +0000 | <a href="http://twitter.com/download/iphone" rel="nofollow">Twitter for iPhone</a> | Can take selfies 11/10 https://t.co/ws2AMaNwPW | NaN | NaN | NaN | https://twitter.com/dog_rates/status/666273097616637952/photo/1 | 11 | 10 | None | None | None | None | None
2337 | 666268910803644416 | NaN | NaN | 2015-11-16 14:57:41 +0000 | <a href="http://twitter.com/download/iphone" rel="nofollow">Twitter for iPhone</a> | Very concerned about fellow dog trapped in computer. 10/10 https://t.co/0yxApIikpk | NaN | NaN | NaN | https://twitter.com/dog_rates/status/666268910803644416/photo/1 | 10 | 10 | None | None | None | None | None
2338 | 666104133288665088 | NaN | NaN | 2015-11-16 04:02:55 +0000 | <a href="http://twitter.com/download/iphone" rel="nofollow">Twitter for iPhone</a> | Not familiar with this breed. No tail (weird). Only 2 legs. Doesn't bark. Surprisingly quick. Shits eggs. 1/10 https://t.co/Asgdc6kuLX | NaN | NaN | NaN | https://twitter.com/dog_rates/status/666104133288665088/photo/1 | 1 | 10 | None | None | None | None | None
2339 | 666102155909144576 | NaN | NaN | 2015-11-16 03:55:04 +0000 | <a href="http://twitter.com/download/iphone" rel="nofollow">Twitter for iPhone</a> | Oh my. Here you are seeing an Adobe Setter giving birth to twins!!! The world is an amazing place. 11/10 https://t.co/11LvqN4WLq | NaN | NaN | NaN | https://twitter.com/dog_rates/status/666102155909144576/photo/1 | 11 | 10 | None | None | None | None | None
2340 | 666099513787052032 | NaN | NaN | 2015-11-16 03:44:34 +0000 | <a href="http://twitter.com/download/iphone" rel="nofollow">Twitter for iPhone</a> | Can stand on stump for what seems like a while. Built that birdhouse? Impressive. Made friends with a squirrel. 8/10 https://t.co/Ri4nMTLq5C | NaN | NaN | NaN | https://twitter.com/dog_rates/status/666099513787052032/photo/1 | 8 | 10 | None | None | None | None | None
2341 | 666094000022159362 | NaN | NaN | 2015-11-16 03:22:39 +0000 | <a href="http://twitter.com/download/iphone" rel="nofollow">Twitter for iPhone</a> | This appears to be a Mongolian Presbyterian mix. Very tired. Tongue slip confirmed. 9/10 would lie down with https://t.co/mnioXo3IfP | NaN | NaN | NaN | https://twitter.com/dog_rates/status/666094000022159362/photo/1 | 9 | 10 | None | None | None | None | None
2342 | 666082916733198337 | NaN | NaN | 2015-11-16 02:38:37 +0000 | <a href="http://twitter.com/download/iphone" rel="nofollow">Twitter for iPhone</a> | Here we have a well-established sunblockerspaniel. Lost his other flip-flop. 6/10 not very waterproof https://t.co/3RU6x0vHB7 | NaN | NaN | NaN | https://twitter.com/dog_rates/status/666082916733198337/photo/1 | 6 | 10 | None | None | None | None | None
2343 | 666073100786774016 | NaN | NaN | 2015-11-16 01:59:36 +0000 | <a href="http://twitter.com/download/iphone" rel="nofollow">Twitter for iPhone</a> | Let's hope this flight isn't Malaysian (lol). What a dog! Almost completely camouflaged. 10/10 I trust this pilot https://t.co/Yk6GHE9tOY | NaN | NaN | NaN | https://twitter.com/dog_rates/status/666073100786774016/photo/1 | 10 | 10 | None | None | None | None | None
2344 | 666071193221509120 | NaN | NaN | 2015-11-16 01:52:02 +0000 | <a href="http://twitter.com/download/iphone" rel="nofollow">Twitter for iPhone</a> | Here we have a northern speckled Rhododendron. Much sass. Gives 0 fucks. Good tongue. 9/10 would caress sensually https://t.co/ZoL8kq2XFx | NaN | NaN | NaN | https://twitter.com/dog_rates/status/666071193221509120/photo/1 | 9 | 10 | None | None | None | None | None
2345 | 666063827256086533 | NaN | NaN | 2015-11-16 01:22:45 +0000 | <a href="http://twitter.com/download/iphone" rel="nofollow">Twitter for iPhone</a> | This is the happiest dog you will ever see. Very committed owner. Nice couch. 10/10 https://t.co/RhUEAloehK | NaN | NaN | NaN | https://twitter.com/dog_rates/status/666063827256086533/photo/1 | 10 | 10 | the | None | None | None | None
2346 | 666058600524156928 | NaN | NaN | 2015-11-16 01:01:59 +0000 | <a href="http://twitter.com/download/iphone" rel="nofollow">Twitter for iPhone</a> | Here is the Rand Paul of retrievers folks! He's probably good at poker. Can drink beer (lol rad). 8/10 good dog https://t.co/pYAJkAe76p | NaN | NaN | NaN | https://twitter.com/dog_rates/status/666058600524156928/photo/1 | 8 | 10 | the | None | None | None | None
2347 | 666057090499244032 | NaN | NaN | 2015-11-16 00:55:59 +0000 | <a href="http://twitter.com/download/iphone" rel="nofollow">Twitter for iPhone</a> | My oh my. This is a rare blond Canadian terrier on wheels. Only $8.98. Rather docile. 9/10 very rare https://t.co/yWBqbrzy8O | NaN | NaN | NaN | https://twitter.com/dog_rates/status/666057090499244032/photo/1 | 9 | 10 | a | None | None | None | None
2348 | 666055525042405380 | NaN | NaN | 2015-11-16 00:49:46 +0000 | <a href="http://twitter.com/download/iphone" rel="nofollow">Twitter for iPhone</a> | Here is a Siberian heavily armored polar bear mix. Strong owner. 10/10 I would do unspeakable things to pet this dog https://t.co/rdivxLiqEt | NaN | NaN | NaN | https://twitter.com/dog_rates/status/666055525042405380/photo/1 | 10 | 10 | a | None | None | None | None
2349 | 666051853826850816 | NaN | NaN | 2015-11-16 00:35:11 +0000 | <a href="http://twitter.com/download/iphone" rel="nofollow">Twitter for iPhone</a> | This is an odd dog. Hard on the outside but loving on the inside. Petting still fun. Doesn't play catch well. 2/10 https://t.co/v5A4vzSDdc | NaN | NaN | NaN | https://twitter.com/dog_rates/status/666051853826850816/photo/1 | 2 | 10 | an | None | None | None | None
2350 | 666050758794694657 | NaN | NaN | 2015-11-16 00:30:50 +0000 | <a href="http://twitter.com/download/iphone" rel="nofollow">Twitter for iPhone</a> | This is a truly beautiful English Wilson Staff retriever. Has a nice phone. Privileged. 10/10 would trade lives with https://t.co/fvIbQfHjIe | NaN | NaN | NaN | https://twitter.com/dog_rates/status/666050758794694657/photo/1 | 10 | 10 | a | None | None | None | None
2351 | 666049248165822465 | NaN | NaN | 2015-11-16 00:24:50 +0000 | <a href="http://twitter.com/download/iphone" rel="nofollow">Twitter for iPhone</a> | Here we have a 1949 1st generation vulpix. Enjoys sweat tea and Fox News. Cannot be phased. 5/10 https://t.co/4B7cOc1EDq | NaN | NaN | NaN | https://twitter.com/dog_rates/status/666049248165822465/photo/1 | 5 | 10 | None | None | None | None | None
2352 | 666044226329800704 | NaN | NaN | 2015-11-16 00:04:52 +0000 | <a href="http://twitter.com/download/iphone" rel="nofollow">Twitter for iPhone</a> | This is a purebred Piers Morgan. Loves to Netflix and chill. Always looks like he forgot to unplug the iron. 6/10 https://t.co/DWnyCjf2mx | NaN | NaN | NaN | https://twitter.com/dog_rates/status/666044226329800704/photo/1 | 6 | 10 | a | None | None | None | None
2353 | 666033412701032449 | NaN | NaN | 2015-11-15 23:21:54 +0000 | <a href="http://twitter.com/download/iphone" rel="nofollow">Twitter for iPhone</a> | Here is a very happy pup. Big fan of well-maintained decks. Just look at that tongue. 9/10 would cuddle af https://t.co/y671yMhoiR | NaN | NaN | NaN | https://twitter.com/dog_rates/status/666033412701032449/photo/1 | 9 | 10 | a | None | None | None | None
2354 | 666029285002620928 | NaN | NaN | 2015-11-15 23:05:30 +0000 | <a href="http://twitter.com/download/iphone" rel="nofollow">Twitter for iPhone</a> | This is a western brown Mitsubishi terrier. Upset about leaf. Actually 2 dogs here. 7/10 would walk the shit out of https://t.co/r7mOb2m0UI | NaN | NaN | NaN | https://twitter.com/dog_rates/status/666029285002620928/photo/1 | 7 | 10 | a | None | None | None | None
2355 | 666020888022790149 | NaN | NaN | 2015-11-15 22:32:08 +0000 | <a href="http://twitter.com/download/iphone" rel="nofollow">Twitter for iPhone</a> | Here we have a Japanese Irish Setter. Lost eye in Vietnam (?). Big fan of relaxing on stair. 8/10 would pet https://t.co/BLDqew2Ijj | NaN | NaN | NaN | https://twitter.com/dog_rates/status/666020888022790149/photo/1 | 8 | 10 | None | None | None | None | None

2356 rows × 17 columns
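
The visual scan of this table can be complemented by programmatic checks. The following is a minimal sketch, assuming a four-row sample whose values are copied from rows 0, 19, 2335, and 2326 of the display (in practice the checks would run on the full DataFrame); the names `sample`, `is_retweet`, `odd_denominator`, and `lowercase_name` are illustrative.

```python
import pandas as pd

# Four rows copied from the display of twitter_archive_enhanced (subset of columns).
sample = pd.DataFrame({
    "tweet_id": ["892420643555336193", "888202515573088257",
                 "666287406224695296", "666411507551481857"],
    "retweeted_status_id": [None, 8.874740e+17, None, None],
    "rating_numerator": [13, 13, 1, 2],
    "rating_denominator": [10, 10, 2, 10],
    "name": ["Phineas", "Canela", "an", "quite"],
})

# Retweets carry a non-null retweeted_status_id.
is_retweet = sample["retweeted_status_id"].notna()
# Ratings are expected to be out of 10; anything else deserves a closer look.
odd_denominator = sample["rating_denominator"] != 10
# All-lowercase "names" such as "a", "an", or "quite" look like ordinary words, not dog names.
lowercase_name = sample["name"].str.islower()

print(sample.loc[is_retweet, "tweet_id"].tolist())
print(sample.loc[odd_denominator, "tweet_id"].tolist())
print(sample.loc[lowercase_name, "name"].tolist())
```

Each boolean mask flags one kind of issue, so the same masks could later drive the cleaning steps.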

extra_data
index | tweet_id | retweet_count | favorite_count
0 | 892420643555336193 | 8842 | 39492
1 | 892177421306343426 | 6480 | 33786
2 | 891815181378084864 | 4301 | 25445
3 | 891689557279858688 | 8925 | 42863
4 | 891327558926688256 | 9721 | 41016
5 | 891087950875897856 | 3240 | 20548
6 | 890971913173991426 | 2142 | 12053
7 | 890729181411237888 | 19548 | 66596
8 | 890609185150312448 | 4403 | 28187
9 | 890240255349198849 | 7684 | 32467
10 | 890006608113172480 | 7584 | 31127
11 | 889880896479866881 | 5116 | 28208
12 | 889665388333682689 | 8502 | 38745
13 | 889638837579907072 | 4705 | 27633
14 | 889531135344209921 | 2309 | 15329
15 | 889278841981685760 | 5635 | 25712
16 | 888917238123831296 | 4681 | 29555
17 | 888804989199671297 | 4535 | 26021
18 | 888554962724278272 | 3722 | 20267
19 | 888078434458587136 | 3637 | 22144
20 | 887705289381826560 | 5584 | 30690
21 | 887517139158093824 | 12053 | 46940
22 | 887473957103951883 | 18813 | 70007
23 | 887343217045368832 | 10713 | 34223
24 | 887101392804085760 | 6147 | 31045
25 | 886983233522544640 | 8045 | 35786
26 | 886736880519319552 | 3420 | 12286
27 | 886680336477933568 | 4597 | 22802
28 | 886366144734445568 | 3297 | 21488
29 | 886267009285017600 | 4 | 117
... | ... | ... | ...
2322 | 666411507551481857 | 337 | 457
2323 | 666407126856765440 | 43 | 113
2324 | 666396247373291520 | 91 | 171
2325 | 666373753744588802 | 99 | 194
2326 | 666362758909284353 | 590 | 801
2327 | 666353288456101888 | 76 | 228
2328 | 666345417576210432 | 146 | 308
2329 | 666337882303524864 | 96 | 203
2330 | 666293911632134144 | 365 | 519
2331 | 666287406224695296 | 71 | 152
2332 | 666273097616637952 | 81 | 183
2333 | 666268910803644416 | 37 | 108
2334 | 666104133288665088 | 6835 | 14703
2335 | 666102155909144576 | 15 | 81
2336 | 666099513787052032 | 73 | 160
2337 | 666094000022159362 | 78 | 168
2338 | 666082916733198337 | 47 | 121
2339 | 666073100786774016 | 173 | 334
2340 | 666071193221509120 | 67 | 154
2341 | 666063827256086533 | 230 | 494
2342 | 666058600524156928 | 61 | 117
2343 | 666057090499244032 | 146 | 304
2344 | 666055525042405380 | 261 | 449
2345 | 666051853826850816 | 877 | 1250
2346 | 666050758794694657 | 60 | 136
2347 | 666049248165822465 | 41 | 111
2348 | 666044226329800704 | 147 | 309
2349 | 666033412701032449 | 47 | 128
2350 | 666029285002620928 | 48 | 132
2351 | 666020888022790149 | 530 | 2528

2352 rows × 3 columns
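
The row counts differ between the two tables (2356 rows in twitter_archive_enhanced versus 2352 in extra_data), so some tweets have no retweet or favorite counts. A small sketch of how the missing IDs could be listed, assuming three archive IDs and two extra_data IDs taken from the displays in this section; `archive_ids`, `extra_ids`, and `missing` are illustrative names.

```python
import pandas as pd

# IDs of rows 18-20 of twitter_archive_enhanced as displayed in this section.
archive_ids = pd.Series(
    ["888554962724278272", "888202515573088257", "888078434458587136"],
    name="tweet_id",
)
# IDs of rows 18-19 of extra_data as displayed in this section.
extra_ids = pd.Series(
    ["888554962724278272", "888078434458587136"],
    name="tweet_id",
)

# Keep the archive IDs that do not appear in extra_data.
missing = archive_ids[~archive_ids.isin(extra_ids)]
print(missing.tolist())
```

On the full tables the same expression would be applied to the two `tweet_id` columns, after making sure both have the same dtype.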

image_predictions
index | tweet_id | jpg_url | img_num | p1 | p1_conf | p1_dog | p2 | p2_conf | p2_dog | p3 | p3_conf | p3_dog
0 | 666020888022790149 | https://pbs.twimg.com/media/CT4udn0WwAA0aMy.jpg | 1 | Welsh_springer_spaniel | 0.465074 | True | collie | 0.156665 | True | Shetland_sheepdog | 0.061428 | True
1 | 666029285002620928 | https://pbs.twimg.com/media/CT42GRgUYAA5iDo.jpg | 1 | redbone | 0.506826 | True | miniature_pinscher | 0.074192 | True | Rhodesian_ridgeback | 0.072010 | True
2 | 666033412701032449 | https://pbs.twimg.com/media/CT4521TWwAEvMyu.jpg | 1 | German_shepherd | 0.596461 | True | malinois | 0.138584 | True | bloodhound | 0.116197 | True
3 | 666044226329800704 | https://pbs.twimg.com/media/CT5Dr8HUEAA-lEu.jpg | 1 | Rhodesian_ridgeback | 0.408143 | True | redbone | 0.360687 | True | miniature_pinscher | 0.222752 | True
4 | 666049248165822465 | https://pbs.twimg.com/media/CT5IQmsXIAAKY4A.jpg | 1 | miniature_pinscher | 0.560311 | True | Rottweiler | 0.243682 | True | Doberman | 0.154629 | True
5 | 666050758794694657 | https://pbs.twimg.com/media/CT5Jof1WUAEuVxN.jpg | 1 | Bernese_mountain_dog | 0.651137 | True | English_springer | 0.263788 | True | Greater_Swiss_Mountain_dog | 0.016199 | True
6 | 666051853826850816 | https://pbs.twimg.com/media/CT5KoJ1WoAAJash.jpg | 1 | box_turtle | 0.933012 | False | mud_turtle | 0.045885 | False | terrapin | 0.017885 | False
7 | 666055525042405380 | https://pbs.twimg.com/media/CT5N9tpXIAAifs1.jpg | 1 | chow | 0.692517 | True | Tibetan_mastiff | 0.058279 | True | fur_coat | 0.054449 | False
8 | 666057090499244032 | https://pbs.twimg.com/media/CT5PY90WoAAQGLo.jpg | 1 | shopping_cart | 0.962465 | False | shopping_basket | 0.014594 | False | golden_retriever | 0.007959 | True
9 | 666058600524156928 | https://pbs.twimg.com/media/CT5Qw94XAAA_2dP.jpg | 1 | miniature_poodle | 0.201493 | True | komondor | 0.192305 | True | soft-coated_wheaten_terrier | 0.082086 | True
10 | 666063827256086533 | https://pbs.twimg.com/media/CT5Vg_wXIAAXfnj.jpg | 1 | golden_retriever | 0.775930 | True | Tibetan_mastiff | 0.093718 | True | Labrador_retriever | 0.072427 | True
11 | 666071193221509120 | https://pbs.twimg.com/media/CT5cN_3WEAAlOoZ.jpg | 1 | Gordon_setter | 0.503672 | True | Yorkshire_terrier | 0.174201 | True | Pekinese | 0.109454 | True
12 | 666073100786774016 | https://pbs.twimg.com/media/CT5d9DZXAAALcwe.jpg | 1 | Walker_hound | 0.260857 | True | English_foxhound | 0.175382 | True | Ibizan_hound | 0.097471 | True
13 | 666082916733198337 | https://pbs.twimg.com/media/CT5m4VGWEAAtKc8.jpg | 1 | pug | 0.489814 | True | bull_mastiff | 0.404722 | True | French_bulldog | 0.048960 | True
14 | 666094000022159362 | https://pbs.twimg.com/media/CT5w9gUW4AAsBNN.jpg | 1 | bloodhound | 0.195217 | True | German_shepherd | 0.078260 | True | malinois | 0.075628 | True
15 | 666099513787052032 | https://pbs.twimg.com/media/CT51-JJUEAA6hV8.jpg | 1 | Lhasa | 0.582330 | True | Shih-Tzu | 0.166192 | True | Dandie_Dinmont | 0.089688 | True
16 | 666102155909144576 | https://pbs.twimg.com/media/CT54YGiWUAEZnoK.jpg | 1 | English_setter | 0.298617 | True | Newfoundland | 0.149842 | True | borzoi | 0.133649 | True
17 | 666104133288665088 | https://pbs.twimg.com/media/CT56LSZWoAAlJj2.jpg | 1 | hen | 0.965932 | False | cock | 0.033919 | False | partridge | 0.000052 | False
18 | 666268910803644416 | https://pbs.twimg.com/media/CT8QCd1WEAADXws.jpg | 1 | desktop_computer | 0.086502 | False | desk | 0.085547 | False | bookcase | 0.079480 | False
19 | 666273097616637952 | https://pbs.twimg.com/media/CT8T1mtUwAA3aqm.jpg | 1 | Italian_greyhound | 0.176053 | True | toy_terrier | 0.111884 | True | basenji | 0.111152 | True
20 | 666287406224695296 | https://pbs.twimg.com/media/CT8g3BpUEAAuFjg.jpg | 1 | Maltese_dog | 0.857531 | True | toy_poodle | 0.063064 | True | miniature_poodle | 0.025581 | True
21 | 666293911632134144 | https://pbs.twimg.com/media/CT8mx7KW4AEQu8N.jpg | 1 | three-toed_sloth | 0.914671 | False | otter | 0.015250 | False | great_grey_owl | 0.013207 | False
22 | 666337882303524864 | https://pbs.twimg.com/media/CT9OwFIWEAMuRje.jpg | 1 | ox | 0.416669 | False | Newfoundland | 0.278407 | True | groenendael | 0.102643 | True
23 | 666345417576210432 | https://pbs.twimg.com/media/CT9Vn7PWoAA_ZCM.jpg | 1 | golden_retriever | 0.858744 | True | Chesapeake_Bay_retriever | 0.054787 | True | Labrador_retriever | 0.014241 | True
24 | 666353288456101888 | https://pbs.twimg.com/media/CT9cx0tUEAAhNN_.jpg | 1 | malamute | 0.336874 | True | Siberian_husky | 0.147655 | True | Eskimo_dog | 0.093412 | True
25 | 666362758909284353 | https://pbs.twimg.com/media/CT9lXGsUcAAyUFt.jpg | 1 | guinea_pig | 0.996496 | False | skunk | 0.002402 | False | hamster | 0.000461 | False
26 | 666373753744588802 | https://pbs.twimg.com/media/CT9vZEYWUAAlZ05.jpg | 1 | soft-coated_wheaten_terrier | 0.326467 | True | Afghan_hound | 0.259551 | True | briard | 0.206803 | True
27 | 666396247373291520 | https://pbs.twimg.com/media/CT-D2ZHWIAA3gK1.jpg | 1 | Chihuahua | 0.978108 | True | toy_terrier | 0.009397 | True | papillon | 0.004577 | True
28 | 666407126856765440 | https://pbs.twimg.com/media/CT-NvwmW4AAugGZ.jpg | 1 | black-and-tan_coonhound | 0.529139 | True | bloodhound | 0.244220 | True | flat-coated_retriever | 0.173810 | True
29 | 666411507551481857 | https://pbs.twimg.com/media/CT-RugiWIAELEaq.jpg | 1 | coho | 0.404640 | False | barracouta | 0.271485 | False | gar | 0.189945 | False
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ...
2045 | 886366144734445568 | https://pbs.twimg.com/media/DE0BTnQUwAApKEH.jpg | 1 | French_bulldog | 0.999201 | True | Chihuahua | 0.000361 | True | Boston_bull | 0.000076 | True
2046 | 886680336477933568 | https://pbs.twimg.com/media/DE4fEDzWAAAyHMM.jpg | 1 | convertible | 0.738995 | False | sports_car | 0.139952 | False | car_wheel | 0.044173 | False
2047 | 886736880519319552 | https://pbs.twimg.com/media/DE5Se8FXcAAJFx4.jpg | 1 | kuvasz | 0.309706 | True | Great_Pyrenees | 0.186136 | True | Dandie_Dinmont | 0.086346 | True
2048 | 886983233522544640 | https://pbs.twimg.com/media/DE8yicJW0AAAvBJ.jpg | 2 | Chihuahua | 0.793469 | True | toy_terrier | 0.143528 | True | can_opener | 0.032253 | False
2049 | 887101392804085760 | https://pbs.twimg.com/media/DE-eAq6UwAA-jaE.jpg | 1 | Samoyed | 0.733942 | True | Eskimo_dog | 0.035029 | True | Staffordshire_bullterrier | 0.029705 | True
2050 | 887343217045368832 | https://pbs.twimg.com/ext_tw_video_thumb/887343120832229379/pu/img/6HSuFrW1lzI_9Mht.jpg | 1 | Mexican_hairless | 0.330741 | True | sea_lion | 0.275645 | False | Weimaraner | 0.134203 | True
2051 | 887473957103951883 | https://pbs.twimg.com/media/DFDw2tyUQAAAFke.jpg | 2 | Pembroke | 0.809197 | True | Rhodesian_ridgeback | 0.054950 | True | beagle | 0.038915 | True
2052 | 887517139158093824 | https://pbs.twimg.com/ext_tw_video_thumb/887517108413886465/pu/img/WanJKwssZj4VJvL9.jpg | 1 | limousine | 0.130432 | False | tow_truck | 0.029175 | False | shopping_cart | 0.026321 | False
2053 | 887705289381826560 | https://pbs.twimg.com/media/DFHDQBbXgAEqY7t.jpg | 1 | basset | 0.821664 | True | redbone | 0.087582 | True | Weimaraner | 0.026236 | True
2054 | 888078434458587136 | https://pbs.twimg.com/media/DFMWn56WsAAkA7B.jpg | 1 | French_bulldog | 0.995026 | True | pug | 0.000932 | True | bull_mastiff | 0.000903 | True
2055 | 888202515573088257 | https://pbs.twimg.com/media/DFDw2tyUQAAAFke.jpg | 2 | Pembroke | 0.809197 | True | Rhodesian_ridgeback | 0.054950 | True | beagle | 0.038915 | True
2056 | 888554962724278272 | https://pbs.twimg.com/media/DFTH_O-UQAACu20.jpg | 3 | Siberian_husky | 0.700377 | True | Eskimo_dog | 0.166511 | True | malamute | 0.111411 | True
2057 | 888804989199671297 | https://pbs.twimg.com/media/DFWra-3VYAA2piG.jpg | 1 | golden_retriever | 0.469760 | True | Labrador_retriever | 0.184172 | True | English_setter | 0.073482 | True
2058 | 888917238123831296 | https://pbs.twimg.com/media/DFYRgsOUQAARGhO.jpg | 1 | golden_retriever | 0.714719 | True | Tibetan_mastiff | 0.120184 | True | Labrador_retriever | 0.105506 | True
2059 | 889278841981685760 | https://pbs.twimg.com/ext_tw_video_thumb/889278779352338437/pu/img/VlbFB3v8H8VwzVNY.jpg | 1 | whippet | 0.626152 | True | borzoi | 0.194742 | True | Saluki | 0.027351 | True
2060 | 889531135344209921 | https://pbs.twimg.com/media/DFg_2PVW0AEHN3p.jpg | 1 | golden_retriever | 0.953442 | True | Labrador_retriever | 0.013834 | True | redbone | 0.007958 | True
2061 | 889638837579907072 | https://pbs.twimg.com/media/DFihzFfXsAYGDPR.jpg | 1 | French_bulldog | 0.991650 | True | boxer | 0.002129 | True | Staffordshire_bullterrier | 0.001498 | True
2062 | 889665388333682689 | https://pbs.twimg.com/media/DFi579UWsAAatzw.jpg | 1 | Pembroke | 0.966327 | True | Cardigan | 0.027356 | True | basenji | 0.004633 | True
2063889880896479866881https://pbs.twimg.com/media/DFl99B1WsAITKsg.jpg1French_bulldog0.377417TrueLabrador_retriever0.151317Truemuzzle0.082981False
2064890006608113172480https://pbs.twimg.com/media/DFnwSY4WAAAMliS.jpg1Samoyed0.957979TruePomeranian0.013884Truechow0.008167True
2065890240255349198849https://pbs.twimg.com/media/DFrEyVuW0AAO3t9.jpg1Pembroke0.511319TrueCardigan0.451038TrueChihuahua0.029248True
2066890609185150312448https://pbs.twimg.com/media/DFwUU__XcAEpyXI.jpg1Irish_terrier0.487574TrueIrish_setter0.193054TrueChesapeake_Bay_retriever0.118184True
2067890729181411237888https://pbs.twimg.com/media/DFyBahAVwAAhUTd.jpg2Pomeranian0.566142TrueEskimo_dog0.178406TruePembroke0.076507True
2068890971913173991426https://pbs.twimg.com/media/DF1eOmZXUAALUcq.jpg1Appenzeller0.341703TrueBorder_collie0.199287Trueice_lolly0.193548False
2069891087950875897856https://pbs.twimg.com/media/DF3HwyEWsAABqE6.jpg1Chesapeake_Bay_retriever0.425595TrueIrish_terrier0.116317TrueIndian_elephant0.076902False
2070891327558926688256https://pbs.twimg.com/media/DF6hr6BUMAAzZgT.jpg2basset0.555712TrueEnglish_springer0.225770TrueGerman_short-haired_pointer0.175219True
2071891689557279858688https://pbs.twimg.com/media/DF_q7IAWsAEuuN8.jpg1paper_towel0.170278FalseLabrador_retriever0.168086Truespatula0.040836False
2072891815181378084864https://pbs.twimg.com/media/DGBdLU1WsAANxJ9.jpg1Chihuahua0.716012Truemalamute0.078253Truekelpie0.031379True
2073892177421306343426https://pbs.twimg.com/media/DGGmoV4XsAAUL6n.jpg1Chihuahua0.323581TruePekinese0.090647Truepapillon0.068957True
2074892420643555336193https://pbs.twimg.com/media/DGKD1-bXoAAIAUK.jpg1orange0.097049Falsebagel0.085851Falsebanana0.076110False

2075 rows × 12 columns

twitter_archive_enhanced.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 2356 entries, 0 to 2355
Data columns (total 17 columns):
tweet_id                      2356 non-null int64
in_reply_to_status_id         78 non-null float64
in_reply_to_user_id           78 non-null float64
timestamp                     2356 non-null object
source                        2356 non-null object
text                          2356 non-null object
retweeted_status_id           181 non-null float64
retweeted_status_user_id      181 non-null float64
retweeted_status_timestamp    181 non-null object
expanded_urls                 2297 non-null object
rating_numerator              2356 non-null int64
rating_denominator            2356 non-null int64
name                          2356 non-null object
doggo                         2356 non-null object
floofer                       2356 non-null object
pupper                        2356 non-null object
puppo                         2356 non-null object
dtypes: float64(4), int64(3), object(10)
memory usage: 313.0+ KB
extra_data.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 2352 entries, 0 to 2351
Data columns (total 3 columns):
tweet_id          2352 non-null object
retweet_count     2352 non-null int64
favorite_count    2352 non-null int64
dtypes: int64(2), object(1)
memory usage: 55.2+ KB
image_predictions.jpg_url.isnull().value_counts()
False    2075
Name: jpg_url, dtype: int64

twitter_archive_enhanced.describe()
           tweet_id  in_reply_to_status_id  in_reply_to_user_id  retweeted_status_id  retweeted_status_user_id  rating_numerator  rating_denominator
count  2.356000e+03           7.800000e+01         7.800000e+01         1.810000e+02              1.810000e+02       2356.000000         2356.000000
mean   7.427716e+17           7.455079e+17         2.014171e+16         7.720400e+17              1.241698e+16         13.126486           10.455433
std    6.856705e+16           7.582492e+16         1.252797e+17         6.236928e+16              9.599254e+16         45.876648            6.745237
min    6.660209e+17           6.658147e+17         1.185634e+07         6.661041e+17              7.832140e+05          0.000000            0.000000
25%    6.783989e+17           6.757419e+17         3.086374e+08         7.186315e+17              4.196984e+09         10.000000           10.000000
50%    7.196279e+17           7.038708e+17         4.196984e+09         7.804657e+17              4.196984e+09         11.000000           10.000000
75%    7.993373e+17           8.257804e+17         4.196984e+09         8.203146e+17              4.196984e+09         12.000000           10.000000
max    8.924206e+17           8.862664e+17         8.405479e+17         8.874740e+17              7.874618e+17       1776.000000          170.000000
twitter_archive_enhanced.rating_denominator.value_counts()
10     2333
11        3
50        3
80        2
20        2
2         1
16        1
40        1
70        1
15        1
90        1
110       1
120       1
130       1
150       1
170       1
7         1
0         1
Name: rating_denominator, dtype: int64
twitter_archive_enhanced.name.value_counts()
None            745
a                55
Charlie          12
Cooper           11
Lucy             11
Oliver           11
Penny            10
Tucker           10
Lola             10
Winston           9
Bo                9
Sadie             8
the               8
Toby              7
Daisy             7
Bailey            7
Buddy             7
an                7
Scout             6
Dave              6
Bella             6
Rusty             6
Koda              6
Oscar             6
Stanley           6
Jax               6
Jack              6
Leo               6
Milo              6
Phil              5
               ... 
Dunkin            1
Crimson           1
Brian             1
Randall           1
Snoopy            1
Puff              1
Sid               1
Huck              1
Pete              1
Antony            1
Stephanus         1
Striker           1
Rizzo             1
Marvin            1
Perry             1
Cleopatricia      1
Siba              1
Rontu             1
Boston            1
Filup             1
Deacon            1
Anna              1
Dudley            1
Jangle            1
Dallas            1
Emma              1
Izzy              1
Rascal            1
Willow            1
Alf               1
Name: name, Length: 957, dtype: int64
twitter_archive_enhanced[twitter_archive_enhanced.tweet_id.duplicated()]
Empty DataFrame
Columns: [tweet_id, in_reply_to_status_id, in_reply_to_user_id, timestamp, source, text, retweeted_status_id, retweeted_status_user_id, retweeted_status_timestamp, expanded_urls, rating_numerator, rating_denominator, name, doggo, floofer, pupper, puppo]
Index: []
extra_data.describe()
       retweet_count  favorite_count
count    2352.000000     2352.000000
mean     3134.932398     8109.198980
std      5237.846296    11980.795669
min         0.000000        0.000000
25%       618.000000     1417.000000
50%      1456.500000     3596.500000
75%      3628.750000    10118.000000
max     79116.000000   132318.000000
image_predictions.describe()
           tweet_id      img_num      p1_conf       p2_conf       p3_conf
count  2.075000e+03  2075.000000  2075.000000  2.075000e+03  2.075000e+03
mean   7.384514e+17     1.203855     0.594548  1.345886e-01  6.032417e-02
std    6.785203e+16     0.561875     0.271174  1.006657e-01  5.090593e-02
min    6.660209e+17     1.000000     0.044333  1.011300e-08  1.740170e-10
25%    6.764835e+17     1.000000     0.364412  5.388625e-02  1.622240e-02
50%    7.119988e+17     1.000000     0.588230  1.181810e-01  4.944380e-02
75%    7.932034e+17     1.000000     0.843855  1.955655e-01  9.180755e-02
max    8.924206e+17     4.000000     1.000000  4.880140e-01  2.734190e-01
image_predictions[image_predictions.tweet_id.duplicated()]
Empty DataFrame
Columns: [tweet_id, jpg_url, img_num, p1, p1_conf, p1_dog, p2, p2_conf, p2_dog, p3, p3_conf, p3_dog]
Index: []

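Quality

twitter_archive_enhanced table:

  • The dataset contains retweets and tweets without images (the project brief asks only for original ratings that have images, excluding retweets)
  • The following columns have missing values (in_reply_to_status_id, in_reply_to_user_id, retweeted_status_id, retweeted_status_user_id, retweeted_status_timestamp, expanded_urls)
  • The rating data are inconsistent: some denominators are not 10 (rating_denominator column)
  • Some values in the rating_numerator column were extracted incorrectly, e.g. for 11.27/10 the extracted numerator is 27, and for 11.26/10 it is 26
  • Some values in the text column contain more than one rating
  • Dog names are missing, and some "names" are "a", "the", "an", or other words starting with a lowercase letter, such as "quite"
  • Dog stage data are missing
  • The text column shows that some tweets mention two dog stages
  • Wrong data types (tweet_id and timestamp columns)
  • In the columns name through puppo, missing values are represented by the string None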
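
One way to surface the suspicious names noted in the list above is to flag values that are the placeholder "None" or that start with a lowercase letter. A minimal sketch, assuming genuine names are capitalized; the sample Series and the `suspect` mask are illustrative and not part of the original notebook:

```python
import pandas as pd

# Hypothetical sample standing in for twitter_archive_enhanced["name"]
names = pd.Series(["Charlie", "a", "None", "the", "quite", "Lucy", "an"])

# Assumption: real dog names start with an uppercase letter, so a lowercase
# first character (or the literal string "None") marks a value to review.
suspect = names.str.match(r"^[a-z]") | (names == "None")

print(names[suspect].tolist())
```

The same mask could be applied to the real column to count how many names need attention before deciding how to replace them.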
image_predictions table:
  • Wrong data type (tweet_id column)
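
The data type issues can be addressed with `astype` and `pd.to_datetime`. A small sketch on a hypothetical two-row frame that mimics the raw dtypes (integer IDs, timestamps stored as strings); it is not the notebook's own cleaning code:

```python
import pandas as pd

# Hypothetical sample with the same dtypes as the raw archive
df = pd.DataFrame({
    "tweet_id": [892420643555336193, 892177421306343426],
    "timestamp": ["2017-08-01 16:23:56 +0000", "2017-08-01 00:17:27 +0000"],
})

# IDs are labels, not quantities, so store them as strings
df["tweet_id"] = df["tweet_id"].astype(str)
# Parse the timestamp strings into datetimes
df["timestamp"] = pd.to_datetime(df["timestamp"])

print(df.dtypes)
```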

Tidiness

twitter_archive_enhanced table:

  • The four column headers doggo, floofer, pupper, and puppo are values, not variable names
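
One possible way to tidy the four stage columns is to collapse them into a single column. The sketch below runs on a hypothetical three-row sample; the column name `stage` and the comma-joined format for tweets with two stages are assumptions, not choices taken from the original notebook:

```python
import numpy as np
import pandas as pd

# Hypothetical sample in the same wide layout as the archive
df = pd.DataFrame({
    "tweet_id": ["1", "2", "3"],
    "doggo":   ["doggo", "None", "None"],
    "floofer": ["None", "None", "None"],
    "pupper":  ["None", "pupper", "None"],
    "puppo":   ["None", "None", "None"],
})

stage_cols = ["doggo", "floofer", "pupper", "puppo"]

# Treat the "None" placeholder as empty, then join whatever stages remain,
# so a tweet naming two stages would keep both (e.g. "doggo,pupper").
df["stage"] = (
    df[stage_cols]
    .replace("None", "")
    .apply(lambda row: ",".join(s for s in row if s), axis=1)
    .replace("", np.nan)
)
df = df.drop(columns=stage_cols)

print(df)
```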

All three datasets use tweet_id as the observational unit, yet they are not merged into a single table
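
Merging requires the key to have one dtype in every table: the `info()` output shows tweet_id as int64 in twitter_archive_enhanced but as object (string) in extra_data. A sketch with hypothetical miniature tables (`archive`, `extra`, and `preds` are illustrative names):

```python
import pandas as pd

# Hypothetical miniature versions of the three tables
archive = pd.DataFrame({"tweet_id": [1, 2, 3], "rating": [13, 12, 11]})
extra = pd.DataFrame({"tweet_id": ["1", "2"], "retweet_count": [10, 20]})
preds = pd.DataFrame({"tweet_id": [1, 3], "p1": ["pug", "chow"]})

# Cast the key to str everywhere so the joins compare like with like
for table in (archive, extra, preds):
    table["tweet_id"] = table["tweet_id"].astype(str)

# Inner joins keep only tweets present in all three tables
merged = archive.merge(extra, on="tweet_id").merge(preds, on="tweet_id")
print(merged)
```

Inner joins are used here as an assumption; a left join on the archive would keep tweets that lack predictions or counts.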

Cleaning

# Create a copy of each table
twitter_archive_enhanced_clean = twitter_archive_enhanced.copy()
extra_data_clean = extra_data.copy()
image_predictions_clean = image_predictions.copy()

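Redundant data

The twitter_archive_enhanced dataset contains retweets and tweets without images (the project brief asks only for original ratings that have images, excluding retweets)
Define

Use isnull() to keep the rows in which the three columns retweeted_status_id, retweeted_status_user_id, and retweeted_status_timestamp are null

Use notnull() to keep the rows in which the expanded_urls column is not null

Code
# Remove retweets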
twitter_archive_enhanced_clean = twitter_archive_enhanced_clean[twitter_archive_enhanced_clean.retweeted_status_id.isnull()&
                                                                twitter_archive_enhanced_clean.retweeted_status_user_id.isnull()&
                                                                twitter_archive_enhanced_clean.retweeted_status_timestamp.isnull()]
# Remove tweets without images
twitter_archive_enhanced_clean = twitter_archive_enhanced_clean[twitter_archive_enhanced_clean.expanded_urls.notnull()]
Test
twitter_archive_enhanced_clean.info()
<class 'pandas.core.frame.DataFrame'>
Int64Index: 2117 entries, 0 to 2355
Data columns (total 17 columns):
tweet_id                      2117 non-null int64
in_reply_to_status_id         23 non-null float64
in_reply_to_user_id           23 non-null float64
timestamp                     2117 non-null object
source                        2117 non-null object
text                          2117 non-null object
retweeted_status_id           0 non-null float64
retweeted_status_user_id      0 non-null float64
retweeted_status_timestamp    0 non-null object
expanded_urls                 2117 non-null object
rating_numerator              2117 non-null int64
rating_denominator            2117 non-null int64
name                          2117 non-null object
doggo                         2117 non-null object
floofer                       2117 non-null object
pupper                        2117 non-null object
puppo                         2117 non-null object
dtypes: float64(4), int64(3), object(10)
memory usage: 297.7+ KB

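Missing data

twitter_archive_enhanced: the following columns have missing values

(in_reply_to_status_id, in_reply_to_user_id, retweeted_status_id, retweeted_status_user_id, retweeted_status_timestamp, expanded_urls)

Define

Although these columns have missing values, they contribute little to the later analysis, so they can be removed with .drop; the source column can be dropped as well.

Code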
twitter_archive_enhanced_clean.drop(["in_reply_to_status_id","in_reply_to_user_id","retweeted_status_id","source",
                                     "retweeted_status_user_id","retweeted_status_timestamp","expanded_urls"],axis=1,inplace=True)
# Columns are being dropped, so axis=1; inplace=True modifies the table in place
Test
twitter_archive_enhanced_clean.head()
  | tweet_id | timestamp | text | rating_numerator | rating_denominator | name | doggo | floofer | pupper | puppo
0 | 892420643555336193 | 2017-08-01 16:23:56 +0000 | This is Phineas. He's a mystical boy. Only ever appears in the hole of a donut. 13/10 https://t.co/MgUWQ76dJU | 13 | 10 | Phineas | None | None | None | None
1 | 892177421306343426 | 2017-08-01 00:17:27 +0000 | This is Tilly. She's just checking pup on you. Hopes you're doing ok. If not, she's available for pats, snugs, boops, the whole bit. 13/10 https://t.co/0Xxu71qeIV | 13 | 10 | Tilly | None | None | None | None
2 | 891815181378084864 | 2017-07-31 00:18:03 +0000 | This is Archie. He is a rare Norwegian Pouncing Corgo. Lives in the tall grass. You never know when one may strike. 12/10 https://t.co/wUnZnhtVJB | 12 | 10 | Archie | None | None | None | None
3 | 891689557279858688 | 2017-07-30 15:58:51 +0000 | This is Darla. She commenced a snooze mid meal. 13/10 happens to the best of us https://t.co/tD36da7qLQ | 13 | 10 | Darla | None | None | None | None
4 | 891327558926688256 | 2017-07-29 16:00:24 +0000 | This is Franklin. He would like you to stop calling him "cute." He is a very fierce shark and should be respected as such. 12/10 #BarkWeek https://t.co/AtUZn91f7f | 12 | 10 | Franklin | None | None | None | None
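twitter_archive_enhanced: the rating data are inconsistent: some denominators are not 10 (rating_denominator column), and some values in the text column contain several ratings
Define

Re-extract:
Use str.findall with a regular expression to extract the dog ratings from the text column

Code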
#twitter_archive_enhanced_clean['rate'] = twitter_archive_enhanced_clean.text.str.extract('(\d+\.?\d+\/10)',expand=True)
#def extract_num(string):
   
    #rating = []
    #string = string.split()
    #for x in string:
        #match = re.search(r'\d+\.?\d+\/d+',x)
        #if match:
            #rating.append(match.group())
            
    #return rating

#twitter_archive_enhanced_clean.text.apply(extract_num)
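## How to join the values of several columns together

#for string in twitter_archive_enhanced_clean.text:
    #extract_num(string)
# Extract the dog ratings with a regular expression and assign them to the "rating" column
# The pattern '(\d+\.?\d+\/10)' misses single-digit numerators because each of its two \d+ needs at least one digit;
# '((?:\d+\.)?\d+\/10)' makes the integer-plus-dot prefix optional as a whole, so a single digit is enough
twitter_archive_enhanced_clean['rating'] = twitter_archive_enhanced_clean.text.str.findall(r'((?:\d+\.)?\d+\/10)')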
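
The difference between the two patterns mentioned in the comment can be checked on a few sample strings. This is a standalone sketch using the standard `re` module; the names `loose` and `fixed` are illustrative:

```python
import re

# Two consecutive \d+ groups: at least two digits are required before "/10"
loose = re.compile(r"(\d+\.?\d+/10)")
# The "digits plus dot" prefix is optional as a unit, so one digit suffices
fixed = re.compile(r"((?:\d+\.)?\d+/10)")

for text in ["9/10", "13/10", "11.27/10"]:
    print(text, loose.findall(text), fixed.findall(text))
```

Only the single-digit case differs: `loose` finds nothing in "9/10", while `fixed` returns it.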
twitter_archive_enhanced_clean.info()
<class 'pandas.core.frame.DataFrame'>
Int64Index: 2117 entries, 0 to 2355
Data columns (total 11 columns):
tweet_id              2117 non-null int64
timestamp             2117 non-null object
text                  2117 non-null object
rating_numerator      2117 non-null int64
rating_denominator    2117 non-null int64
name                  2117 non-null object
doggo                 2117 non-null object
floofer               2117 non-null object
pupper                2117 non-null object
puppo                 2117 non-null object
rating                2117 non-null object
dtypes: int64(3), object(8)
memory usage: 198.5+ KB
# Inspect the extraction; each result is a list
twitter_archive_enhanced_clean['rating']
0       [13/10]
1       [13/10]
2       [12/10]
3       [13/10]
4       [12/10]
5       [13/10]
6       [13/10]
7       [13/10]
8       [13/10]
9       [14/10]
10      [13/10]
11      [13/10]
12      [13/10]
13      [12/10]
14      [13/10]
15      [13/10]
16      [12/10]
17      [13/10]
18      [13/10]
20      [12/10]
21      [13/10]
22      [14/10]
23      [13/10]
24      [13/10]
25      [12/10]
26      [13/10]
27      [13/10]
28      [13/10]
29      [12/10]
31      [13/10]
         ...   
2326     [2/10]
2327     [7/10]
2328     [9/10]
2329    [11/10]
2330     [6/10]
2331     [8/10]
2332    [10/10]
2333     [9/10]
2334     [3/10]
2335     [9/10]
2336    [11/10]
2337    [10/10]
2338     [1/10]
2339    [11/10]
2340     [8/10]
2341     [9/10]
2342     [6/10]
2343    [10/10]
2344     [9/10]
2345    [10/10]
2346     [8/10]
2347     [9/10]
2348    [10/10]
2349     [2/10]
2350    [10/10]
2351     [5/10]
2352     [6/10]
2353     [9/10]
2354     [7/10]
2355     [8/10]
Name: rating, Length: 2117, dtype: object
# Check whether any tweet has more than one rating
for rate in twitter_archive_enhanced_clean.rating:
    if len(rate) >1:
        print(rate)
['12/10', '11/10']
['10/10', '7/10']
['10/10', '8/10']
['9/10', '2/10']
['4/10', '13/10']
['10/10', '5/10']
['5/10', '10/10']
['10/10', '6/10']
['11/10', '10/10']
['10/10', '11/10']
['10/10', '7/10']
['10/10', '4/10']
['5/10', '8/10']
['8/10', '11/10']
['10/10', '7/10', '12/10']
['11/10', '8/10']
['11/10', '8/10']
['10/10', '7/10']
['8/10', '1/10']
['10/10', '4/10']
['7/10', '8/10']
['10/10', '10/10']
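# Join multiple ratings with "&"; the logic is simple, so an anonymous lambda inside apply is enough
# The output above shows one case where the same rating was extracted twice, so use a set to remove duplicates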
twitter_archive_enhanced_clean['rating'] = twitter_archive_enhanced_clean['rating'].apply(lambda x:"&".join(set(x)))
twitter_archive_enhanced_clean['rating'].value_counts()
12/10               488
10/10               427
11/10               415
13/10               296
9/10                153
8/10                 96
7/10                 50
14/10                41
6/10                 32
5/10                 31
3/10                 19
4/10                 14
                     13
2/10                  9
1/10                  4
7/10&10/10            3
8/10&11/10            2
0/10                  2
10/10&11/10           2
5/10&10/10            2
10/10&4/10            2
12/10&11/10           1
1776/10               1
11/10&8/10            1
13/10&4/10            1
9.75/10               1
13.5/10               1
12/10&7/10&10/10      1
6/10&10/10            1
11.26/10              1
11.27/10              1
420/10                1
1/10&8/10             1
7/10&8/10             1
5/10&8/10             1
10/10&8/10            1
9/10&2/10             1
Name: rating, dtype: int64
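
Because a `set` does not keep insertion order, the same pair of ratings can appear in either order in the counts above (for example both 11/10&8/10 and 8/10&11/10). An order-preserving alternative, sketched with a hypothetical helper `join_unique`:

```python
def join_unique(ratings):
    """Join ratings with "&", dropping repeats but keeping first-seen order."""
    # dict keys are unique and keep insertion order (Python 3.7+)
    return "&".join(dict.fromkeys(ratings))

print(join_unique(["10/10", "7/10", "12/10"]))
print(join_unique(["10/10", "10/10"]))
```

It could be passed to `.apply(join_unique)` in place of the lambda if a stable order is wanted.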
# Print the rows with multiple ratings so the surplus ratings can be removed by hand, based on the "text" column
twitter_archive_enhanced_clean[twitter_archive_enhanced_clean['rating'].str.len()>7]
     | tweet_id | timestamp | text | rating_numerator | rating_denominator | name | doggo | floofer | pupper | puppo | rating
763  | 778027034220126208 | 2016-09-20 00:24:34 +0000 | This is Sophie. She's a Jubilant Bush Pupper. Super h*ckin rare. Appears at random just to smile at the locals. 11.27/10 would smile back https://t.co/QFaUiIHxHq | 27 | 10 | Sophie | None | None | pupper | None | 11.27/10
766  | 777684233540206592 | 2016-09-19 01:42:24 +0000 | "Yep... just as I suspected. You're not flossing." 12/10 and 11/10 for the pup not flossing https://t.co/SuXcI9B7pQ | 12 | 10 | None | None | None | None | None | 12/10&11/10
1007 | 747600769478692864 | 2016-06-28 01:21:27 +0000 | This is Bookstore and Seaweed. Bookstore is tired and Seaweed is an asshole. 10/10 and 7/10 respectively https://t.co/eUGjGjjFVJ | 10 | 10 | Bookstore | None | None | None | None | 7/10&10/10
1222 | 714258258790387713 | 2016-03-28 01:10:13 +0000 | Meet Travis and Flurp. Travis is pretty chill but Flurp can't lie down properly. 10/10 &amp; 8/10\nget it together Flurp https://t.co/Akzl5ynMmE | 10 | 10 | Travis | None | None | None | None | 10/10&8/10
1359 | 703356393781329922 | 2016-02-26 23:10:06 +0000 | This is Socks. That water pup w the super legs just splashed him. Socks did not appreciate that. 9/10 and 2/10 https://t.co/8rc5I22bBf | 9 | 10 | Socks | None | None | None | None | 9/10&2/10
1459 | 695064344191721472 | 2016-02-04 02:00:27 +0000 | This may be the greatest video I've ever been sent. 4/10 for Charles the puppy, 13/10 overall. (Vid by @stevenxx_) https://t.co/uaJmNgXR2P | 4 | 10 | None | None | None | None | None | 13/10&4/10
1465 | 694352839993344000 | 2016-02-02 02:53:12 +0000 | Meet Oliviér. He takes killer selfies. Has a dog of his own. It leaps at random &amp; can't bark for shit. 10/10 &amp; 5/10 https://t.co/6NgsQJuSBJ | 10 | 10 | Oliviér | None | None | None | None | 5/10&10/10
1508 | 691483041324204033 | 2016-01-25 04:49:38 +0000 | When bae says they can't go out but you see them with someone else that same night. 5/10 &amp; 10/10 for heartbroken pup https://t.co/aenk0KpoWM | 5 | 10 | None | None | None | None | None | 5/10&10/10
1525 | 690400367696297985 | 2016-01-22 05:07:29 +0000 | This is Eriq. His friend just reminded him of last year's super bowl. Not cool friend\n10/10 for Eriq\n6/10 for friend https://t.co/PlEXTofdpf | 10 | 10 | Eriq | None | None | None | None | 6/10&10/10
1538 | 689835978131935233 | 2016-01-20 15:44:48 +0000 | Meet Fynn &amp; Taco. Fynn is an all-powerful leaf lord and Taco is in the wrong place at the wrong time. 11/10 &amp; 10/10 https://t.co/MuqHPvtL8c | 11 | 10 | Fynn | None | None | None | None | 10/10&11/10
1712 | 680494726643068929 | 2015-12-25 21:06:00 +0000 | Here we have uncovered an entire battalion of holiday puppers. Average of 11.26/10 https://t.co/eNm2S6p9BD | 26 | 10 | None | None | None | None | None | 11.26/10
1795 | 677314812125323265 | 2015-12-17 02:30:09 +0000 | Meet Tassy &amp; Bee. Tassy is pretty chill, but Bee is convinced the Ruffles are haunted. 10/10 &amp; 11/10 respectively https://t.co/fgORpmTN9C | 10 | 10 | Tassy | None | None | None | None | 10/10&11/10
1832 | 676191832485810177 | 2015-12-14 00:07:50 +0000 | These two pups just met and have instantly bonded. Spectacular scene. Mesmerizing af. 10/10 and 7/10 for blue dog https://t.co/gwryaJO4tC | 10 | 10 | None | None | None | None | None | 7/10&10/10
1897 | 674737130913071104 | 2015-12-09 23:47:22 +0000 | Meet Rufio. He is unaware of the pink legless pupper wrapped around him. Might want to get that checked 10/10 &amp; 4/10 https://t.co/KNfLnYPmYh | 10 | 10 | Rufio | None | None | pupper | None | 10/10&4/10
1901 | 674646392044941312 | 2015-12-09 17:46:48 +0000 | Two gorgeous dogs here. Little waddling dog is a rebel. Refuses to look at camera. Must be a preteen. 5/10 &amp; 8/10 https://t.co/YPfw7oahbD | 5 | 10 | None | None | None | None | None | 5/10&8/10
1970 | 673295268553605120 | 2015-12-06 00:17:55 +0000 | Meet Eve. She's a raging alcoholic 8/10 (would b 11/10 but pupper alcoholism is a tragic issue that I can't condone) https://t.co/U36HYQIijg | 8 | 10 | Eve | None | None | pupper | None | 11/10&8/10
2010 | 672248013293752320 | 2015-12-03 02:56:30 +0000 | 10/10 for dog. 7/10 for cat. 12/10 for human. Much skill. Would pet all https://t.co/uhx5gfpx5k | 10 | 10 | None | None | None | None | None | 12/10&7/10&10/10
2064 | 671154572044468225 | 2015-11-30 02:31:34 +0000 | Meet Holly. She's trying to teach small human-like pup about blocks but he's not paying attention smh. 11/10 &amp; 8/10 https://t.co/RcksaUrGNu | 11 | 10 | Holly | None | None | None | None | 8/10&11/10
2113 | 670434127938719744 | 2015-11-28 02:48:46 +0000 | Meet Hank and Sully. Hank is very proud of the pumpkin they found and Sully doesn't give a shit. 11/10 and 8/10 https://t.co/cwoP1ftbrj | 11 | 10 | Hank | None | None | None | None | 8/10&11/10
2177 | 669037058363662336 | 2015-11-24 06:17:19 +0000 | Here we have Pancho and Peaches. Pancho is a Condoleezza Gryffindor, and Peaches is just an asshole. 10/10 &amp; 7/10 https://t.co/Lh1BsJrWPp | 10 | 10 | None | None | None | None | None | 7/10&10/10
2216 | 668537837512433665 | 2015-11-22 21:13:35 +0000 | This is Spark. He's nervous. Other dog hasn't moved in a while. Won't come when called. Doesn't fetch well 8/10&amp;1/10 https://t.co/stEodX9Aba | 8 | 10 | Spark | None | None | None | None | 1/10&8/10
2263 | 667544320556335104 | 2015-11-20 03:25:43 +0000 | This is Kial. Kial is either wearing a cape, which would be rad, or flashing us, which would be rude. 10/10 or 4/10 https://t.co/8zcwIoiuqR | 10 | 10 | Kial | None | None | None | None | 10/10&4/10
2272 | 667491009379606528 | 2015-11-19 23:53:52 +0000 | Two dogs in this one. Both are rare Jujitsu Pythagoreans. One slightly whiter than other. Long legs. 7/10 and 8/10 https://t.co/ITxxcc4v9y | 7 | 10 | None | None | None | None | None | 7/10&8/10
sum(twitter_archive_enhanced_clean['rating'].str.len()>7)
# Two of these rows hold a single decimal rating whose length exceeds 7, so only 21 rows actually need editing
23
# Manual edits to resolve the rows with multiple ratings
twitter_archive_enhanced_clean.loc[766,"rating"] = "11.5/10"  
twitter_archive_enhanced_clean.loc[1007,"rating"] = "8.6/10" 
twitter_archive_enhanced_clean.loc[1222,"rating"] = "9/10" 
twitter_archive_enhanced_clean.loc[1359,"rating"] = "9/10" 
twitter_archive_enhanced_clean.loc[1459,"rating"] = "4/10" 
twitter_archive_enhanced_clean.loc[1465,"rating"] = "10/10" 
twitter_archive_enhanced_clean.loc[1508,"rating"] = "5/10" 
twitter_archive_enhanced_clean.loc[1525,"rating"] = "6/10" 
twitter_archive_enhanced_clean.loc[1538,"rating"] = "10.5/10" 
twitter_archive_enhanced_clean.loc[1795,"rating"] = "10.5/10" 
twitter_archive_enhanced_clean.loc[1832,"rating"] = "8.5/10" 
twitter_archive_enhanced_clean.loc[1897,"rating"] = "4/10" 
twitter_archive_enhanced_clean.loc[1901,"rating"] = "6.5/10" 
twitter_archive_enhanced_clean.loc[1970,"rating"] = "8/10" 
twitter_archive_enhanced_clean.loc[2010,"rating"] = "10/10" 
twitter_archive_enhanced_clean.loc[2064,"rating"] = "8/10" 
twitter_archive_enhanced_clean.loc[2113,"rating"] = "9.5/10" 
twitter_archive_enhanced_clean.loc[2177,"rating"] = "8.5/10" 
twitter_archive_enhanced_clean.loc[2216,"rating"] = "8/10" 
twitter_archive_enhanced_clean.loc[2263,"rating"] = "7/10" 
twitter_archive_enhanced_clean.loc[2272,"rating"] = "7.5/10" 
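The manual corrections above could also be kept in a single mapping and applied in a loop, which makes the list easier to review. The following is only a minimal sketch on a toy frame: the `df` contents and the `manual_fixes` name are hypothetical, and only two of the corrections are reproduced.

```python
import pandas as pd

# Toy frame standing in for twitter_archive_enhanced_clean (hypothetical data).
df = pd.DataFrame(
    {"rating": ["11/10&8/10", "12/10", "7/10&8/10"]},
    index=[1970, 5, 2272],
)

# Hypothetical mapping: row label -> corrected rating string.
manual_fixes = {1970: "8/10", 2272: "7.5/10"}

# Apply every correction by row label.
for label, value in manual_fixes.items():
    df.loc[label, "rating"] = value

# Count the rows that still look like they hold more than one rating.
remaining = int((df["rating"].str.len() > 7).sum())
print(remaining)
```

Keeping the corrections in one dictionary leaves the behaviour unchanged; it simply separates the data (which row gets which value) from the assignment logic.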
twitter_archive_enhanced_clean[twitter_archive_enhanced_clean['rating'].str.len()>7]
(index) | tweet_id | timestamp | text | rating_numerator | rating_denominator | name | doggo | floofer | pupper | puppo | rating
763 | 778027034220126208 | 2016-09-20 00:24:34 +0000 | This is Sophie. She's a Jubilant Bush Pupper. Super h*ckin rare. Appears at random just to smile at the locals. 11.27/10 would smile back https://t.co/QFaUiIHxHq | 27 | 10 | Sophie | None | None | pupper | None | 11.27/10
1712 | 680494726643068929 | 2015-12-25 21:06:00 +0000 | Here we have uncovered an entire battalion of holiday puppers. Average of 11.26/10 https://t.co/eNm2S6p9BD | 26 | 10 | None | None | None | None | None | 11.26/10
# Split on the "/" separator and keep the numerator as the "rating" column
twitter_archive_enhanced_clean["rating"] = twitter_archive_enhanced_clean.rating.str.split("/").str[0]
# Convert the "rating" column to a numeric dtype
twitter_archive_enhanced_clean["rating"] = pd.to_numeric(twitter_archive_enhanced_clean["rating"],errors='coerce')
# Rows where "rating" is missing
twitter_archive_enhanced_clean[twitter_archive_enhanced_clean.rating.isnull()]
(index) | tweet_id | timestamp | text | rating_numerator | rating_denominator | name | doggo | floofer | pupper | puppo | rating
433 | 820690176645140481 | 2017-01-15 17:52:40 +0000 | The floofs have been released I repeat the floofs have been released. 84/70 https://t.co/NIYC820tmd | 84 | 70 | None | None | None | None | None | NaN
516 | 810984652412424192 | 2016-12-19 23:06:23 +0000 | Meet Sam. She smiles 24/7 &amp; secretly aspires to be a reindeer. \nKeep Sam smiling by clicking and sharing this link:\nhttps://t.co/98tB8y7y7t https://t.co/LouL5vdvxx | 24 | 7 | Sam | None | None | None | None | NaN
902 | 758467244762497024 | 2016-07-28 01:00:57 +0000 | Why does this never happen at my front door... 165/150 https://t.co/HmwrdfEfUE | 165 | 150 | None | None | None | None | None | NaN
1120 | 731156023742988288 | 2016-05-13 16:15:54 +0000 | Say hello to this unbelievably well behaved squad of doggos. 204/170 would try to pet all at once https://t.co/yGQI3He3xv | 204 | 170 | this | None | None | None | None | NaN
1228 | 713900603437621249 | 2016-03-27 01:29:02 +0000 | Happy Saturday here's 9 puppers on a bench. 99/90 good work everybody https://t.co/mpvaVxKmc1 | 99 | 90 | None | None | None | None | None | NaN
1254 | 710658690886586372 | 2016-03-18 02:46:49 +0000 | Here's a brigade of puppers. All look very prepared for whatever happens next. 80/80 https://t.co/0eb7R1Om12 | 80 | 80 | None | None | None | None | None | NaN
1274 | 709198395643068416 | 2016-03-14 02:04:08 +0000 | From left to right:\nCletus, Jerome, Alejandro, Burp, &amp; Titson\nNone know where camera is. 45/50 would hug all at once https://t.co/sedre1ivTK | 45 | 50 | None | None | None | None | None | NaN
1351 | 704054845121142784 | 2016-02-28 21:25:30 +0000 | Here is a whole flock of puppers. 60/50 I'll take the lot https://t.co/9dpcw6MdWa | 60 | 50 | a | None | None | None | None | NaN
1433 | 697463031882764288 | 2016-02-10 16:51:59 +0000 | Happy Wednesday here's a bucket of pups. 44/40 would pet all at once https://t.co/HppvrYuamZ | 44 | 40 | None | None | None | None | None | NaN
1634 | 684225744407494656 | 2016-01-05 04:11:44 +0000 | Two sneaky puppers were not initially seen, moving the rating to 143/130. Please forgive us. Thank you https://t.co/kRK51Y5ac3 | 143 | 130 | None | None | None | None | None | NaN
1635 | 684222868335505415 | 2016-01-05 04:00:18 +0000 | Someone help the girl is being mugged. Several are distracting her while two steal her shoes. Clever puppers 121/110 https://t.co/1zfnTJLt55 | 121 | 110 | None | None | None | None | None | NaN
1779 | 677716515794329600 | 2015-12-18 05:06:23 +0000 | IT'S PUPPERGEDDON. Total of 144/120 ...I think https://t.co/ZanVtAtvIq | 144 | 120 | None | None | None | None | None | NaN
1843 | 675853064436391936 | 2015-12-13 01:41:41 +0000 | Here we have an entire platoon of puppers. Total score: 88/80 would pet all at once https://t.co/y93p6FLvVw | 88 | 80 | None | None | None | None | None | NaN
## Drop the "rating_numerator" and "rating_denominator" columns
twitter_archive_enhanced_clean.drop(["rating_numerator","rating_denominator"],axis = 1,inplace=True)
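The split-then-convert step used for the "rating" column can be tried in isolation. This is a small sketch on a made-up Series (the values are illustrative, not taken from the dataset) showing how `str.split("/").str[0]` keeps the numerator and how `pd.to_numeric(..., errors="coerce")` turns anything unparseable into NaN.

```python
import pandas as pd

# Illustrative rating strings; the last two cannot be parsed as numbers.
ratings = pd.Series(["13/10", "11.5/10", "9.75/10", "not a rating", None])

# Keep the part before the first "/" (the numerator).
numerators = ratings.str.split("/").str[0]

# Convert to float; values that are not numeric become NaN instead of raising.
numeric = pd.to_numeric(numerators, errors="coerce")
print(numeric)
```

Because `errors="coerce"` hides bad values rather than reporting them, it is worth inspecting the NaN rows afterwards, as the notebook does with `isnull()`.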
Test
twitter_archive_enhanced_clean.head()
(index) | tweet_id | timestamp | text | name | doggo | floofer | pupper | puppo | rating
0 | 892420643555336193 | 2017-08-01 16:23:56 +0000 | This is Phineas. He's a mystical boy. Only ever appears in the hole of a donut. 13/10 https://t.co/MgUWQ76dJU | Phineas | None | None | None | None | 13.0
1 | 892177421306343426 | 2017-08-01 00:17:27 +0000 | This is Tilly. She's just checking pup on you. Hopes you're doing ok. If not, she's available for pats, snugs, boops, the whole bit. 13/10 https://t.co/0Xxu71qeIV | Tilly | None | None | None | None | 13.0
2 | 891815181378084864 | 2017-07-31 00:18:03 +0000 | This is Archie. He is a rare Norwegian Pouncing Corgo. Lives in the tall grass. You never know when one may strike. 12/10 https://t.co/wUnZnhtVJB | Archie | None | None | None | None | 12.0
3 | 891689557279858688 | 2017-07-30 15:58:51 +0000 | This is Darla. She commenced a snooze mid meal. 13/10 happens to the best of us https://t.co/tD36da7qLQ | Darla | None | None | None | None | 13.0
4 | 891327558926688256 | 2017-07-29 16:00:24 +0000 | This is Franklin. He would like you to stop calling him "cute." He is a very fierce shark and should be respected as such. 12/10 #BarkWeek https://t.co/AtUZn91f7f | Franklin | None | None | None | None | 12.0
twitter_archive_enhanced_clean.rating.value_counts()
12.00      488
10.00      429
11.00      415
13.00      296
9.00       155
8.00        99
7.00        51
14.00       41
6.00        33
5.00        32
3.00        19
4.00        16
2.00         9
1.00         4
8.50         2
10.50        2
0.00         2
11.27        1
420.00       1
13.50        1
8.60         1
11.26        1
11.50        1
7.50         1
9.75         1
6.50         1
9.50         1
1776.00      1
Name: rating, dtype: int64
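twitter_archive_enhanced: dog names are missing, and some entries are "a", "the", "an" or other words starting with a lowercase letter, such as "quite"
Define

Re-extract:
Use the str.extract function with a regular expression to pull dog names from the text column.

Code
Test
## Looking at where names appear, phrases such as "hello to", "Meet" and "This is" are usually followed by the dog's name.
## Names start with a capital letter, so the pattern combines a non-capturing group with a lookahead.
twitter_archive_enhanced_clean['name'] = twitter_archive_enhanced_clean.text.str.extract(r'(?:This is|named|Meet|hello to|name is|Here we have|Here is)\s([A-Z].*?(?=\.))',expand=False)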
twitter_archive_enhanced_clean.name.value_counts()
Charlie                                 11
Cooper                                  10
Lucy                                    10
Tucker                                   9
Oliver                                   9
Lola                                     8
Penny                                    8
Winston                                  8
Daisy                                    7
Toby                                     7
Bailey                                   6
Bella                                    6
Stanley                                  6
Koda                                     6
Sadie                                    6
Oscar                                    5
Buddy                                    5
Dave                                     5
Leo                                      5
Bo                                       5
Jax                                      5
Scout                                    5
Louis                                    5
Rusty                                    5
Chip                                     4
Cassie                                   4
Finn                                     4
Milo                                     4
Gus                                      4
Duke                                     4
                                        ..
Travis and Flurp                         1
Taco                                     1
Emma                                     1
Jangle                                   1
Jersey                                   1
Dudley                                   1
Moofasa                                  1
Hercules                                 1
Petrick                                  1
Yoda                                     1
Rooney                                   1
Fabio                                    1
Klein                                    1
Birf                                     1
Cheryl AKA Queen Pupper of the Skies     1
Huck                                     1
Antony                                   1
Stephanus                                1
Lassie                                   1
Howard                                   1
Striker                                  1
Cali                                     1
Marvin                                   1
Perry                                    1
Cleopatricia                             1
Siba                                     1
Pete                                     1
Boston                                   1
Deacon                                   1
Alf                                      1
Name: name, Length: 992, dtype: int64
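To see what the extraction pattern does and does not capture, it can be run on a handful of sample strings. The sketch below uses the same regular expression on shortened, illustrative tweets (the `tweets` Series is hypothetical); `expand=False` is used so that a Series is returned.

```python
import pandas as pd

# Shortened, illustrative tweets (not the full dataset rows).
tweets = pd.Series([
    "This is Phineas. He's a mystical boy. 13/10",
    "Meet Jax. He enjoys ice cream. 13/10",
    "Here we have a majestic great white. 13/10",
    "Say hello to Travis and Flurp. 10/10",
])

# Non-capturing group for the lead-in phrase, then a capitalised name that
# runs lazily up to (but not including) the next full stop.
pattern = r"(?:This is|named|Meet|hello to|name is|Here we have|Here is)\s([A-Z].*?(?=\.))"

names = tweets.str.extract(pattern, expand=False)
print(names)
```

The third tweet yields NaN because the word after "Here we have" starts with a lowercase letter, which is how entries such as "a" or "quite" are filtered out. The lazy match up to the first full stop also explains why multi-word captures such as "Travis and Flurp" appear in the counts.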
## For tweets that name two dogs, replace "and" with the connector "&"
twitter_archive_enhanced_clean["name"] = twitter_archive_enhanced_clean.name.str.replace(r'(\s)and(\s)',"&",regex=True)
## Likewise, replace "&amp;" with "&"
twitter_archive_enhanced_clean["name"] = twitter_archive_enhanced_clean.name.str.replace('&amp;',"&")
## Clean the following cases by hand
twitter_archive_enhanced_clean["name"] = twitter_archive_enhanced_clean.name.str.replace("Gary, Carrie Fisher's dog","Gary")
twitter_archive_enhanced_clean["name"] = twitter_archive_enhanced_clean.name.str.replace("her 2 pups"," ")
twitter_archive_enhanced_clean["name"] = twitter_archive_enhanced_clean.name.str.replace("his son"," ")
twitter_archive_enhanced_clean["name"] = twitter_archive_enhanced_clean.name.str.replace("her son"," ")
twitter_archive_enhanced_clean["name"] = twitter_archive_enhanced_clean.name.str.replace("Zeke the Wonder Dog","Zeke")
## Replace the literal string "None" with a real missing value (np.nan, not the string "np.nan")
twitter_archive_enhanced_clean["name"] = twitter_archive_enhanced_clean.name.replace("None",np.nan)
twitter_archive_enhanced_clean[twitter_archive_enhanced_clean.name.isnull()]
tweet_idtimestamptextnamedoggoflooferpupperpupporating
58910879508758978562017-07-29 00:08:17 +0000Here we have a majestic great white breaching off South Africa's coast. Absolutely h*ckin breathtaking. 13/10 (IG: tucker_marlo) #BarkWeek https://t.co/kQ04fDDRmhNaNNoneNoneNoneNone13.0
78907291814112378882017-07-28 00:22:40 +0000When you watch your owner call another dog a good boy but then they turn back to you and say you're a great boy. 13/10 https://t.co/v0nONBcwxqNaNNoneNoneNoneNone13.0
128896653883336826892017-07-25 01:55:32 +0000Here's a puppo that seems to be on the fence about something haha no but seriously someone help her. 13/10 https://t.co/BxvuXk0UCmNaNNoneNoneNonepuppo13.0
228875171391580938242017-07-19 03:39:09 +0000I've yet to rate a Venezuelan Hover Wiener. This is such an honor. 14/10 paw-inspiring af (IG: roxy.thedoxy) https://t.co/20VrLAA8baNaNNoneNoneNoneNone14.0
248873432170453688322017-07-18 16:08:03 +0000You may not have known you needed to see this today. 13/10 please enjoy (IG: emmylouroo) https://t.co/WZqNqygEyVNaNNoneNoneNoneNone13.0
258871013928040857602017-07-18 00:07:08 +0000This... is a Jubilant Antarctic House Bear. We only rate dogs. Please only send dogs. Thank you... 12/10 would suffocate in floof https://t.co/4Ad1jzJSdpNaNNoneNoneNoneNone12.0
378851676198836387842017-07-12 16:03:00 +0000Here we have a corgi undercover as a malamute. Pawbably doing important investigative work. Zero control over tongue happenings. 13/10 https://t.co/44ItaMubBfNaNNoneNoneNoneNone13.0
418844418053827174402017-07-10 15:58:53 +0000I present to you, Pup in Hat. Pup in Hat is great for all occasions. Extremely versatile. Compact as h*ck. 14/10 (IG: itselizabethgales) https://t.co/vvBOcC2VdCNaNNoneNoneNoneNone14.0
428842478788514938882017-07-10 03:08:17 +0000OMG HE DIDN'T MEAN TO HE WAS JUST TRYING A LITTLE BARKOUR HE'S SUPER SORRY 13/10 WOULD FORGIVE IMMEDIATE https://t.co/uF3pQ8WubjNaNNoneNoneNoneNone13.0
478831178360460861442017-07-07 00:17:54 +0000Please only send dogs. We don't rate mechanics, no matter how h*ckin good. Thank you... 13/10 would sneak a pat https://t.co/Se5fZ9wp5ENaNNoneNoneNoneNone13.0
568815360043808727062017-07-02 15:32:16 +0000Here is a pupper approaching maximum borkdrive. Zooming at never before seen speeds. 14/10 paw-inspiring af \n(IG: puffie_the_chow) https://t.co/ghXBIIeQZFNaNNoneNonepupperNone14.0
598808724488157716482017-06-30 19:35:32 +0000Ugh not again. We only rate dogs. Please don't send in well-dressed floppy-tongued street penguins. Dogs only please. Thank you... 12/10 https://t.co/WiAMbTkDPfNaNNoneNoneNoneNone12.0
628800957828708966412017-06-28 16:09:20 +0000Please don't send in photos without dogs in them. We're not @porch_rates. Insubordinate and churlish. Pretty good porch tho 11/10 https://t.co/HauE8M3Bu4NaNNoneNoneNoneNone11.0
728786047072117268522017-06-24 13:24:20 +0000Martha is stunning how h*ckin dare you. 13/10 https://t.co/9uABQXgjwaNaNNoneNoneNoneNone13.0
838765376660612218892017-06-18 20:30:39 +0000I can say with the pupmost confidence that the doggos who assisted with this search are heroic as h*ck. 14/10 for all https://t.co/8yoc1CNTsuNaNNoneNoneNoneNone14.0
888750971926120775682017-06-14 21:06:43 +0000You'll get your package when that precious man is done appreciating the pups. 13/10 for everyone https://t.co/PFp4MghzBWNaNNoneNoneNoneNone13.0
898750212112515973122017-06-14 16:04:48 +0000Guys please stop sending pictures without any dogs in th- oh never mind hello excuse me sir. 12/10 stealthy as h*ck https://t.co/brCQoqc8AWNaNNoneNoneNoneNone12.0
938740575629368115202017-06-12 00:15:36 +0000I can't believe this keeps happening. This, is a birb taking a bath. We only rate dogs. Please only send dogs. Thank you... 12/10 https://t.co/pwY9PQhtP2NaNNoneNoneNoneNone12.0
968735802838403440652017-06-10 16:39:04 +0000We usually don't rate Deck-bound Saskatoon Black Bears, but this one is h*ckin flawless. Sneaky tongue slip too. 13/10 would hug firmly https://t.co/mNuMH9400nNaNNoneNoneNoneNone13.0
998729671041477632002017-06-09 00:02:31 +0000Here's a very large dog. He has a date later. Politely asked this water person to check if his breath is bad. 12/10 good to go doggo https://t.co/EMYIdoblMRNaNdoggoNoneNoneNone12.0
1008728206835412377602017-06-08 14:20:41 +0000Here are my favorite #dogsatpollingstations \nMost voted for a more consistent walking schedule and to increase daily pats tenfold. All 13/10 https://t.co/17FVMl4VZ5NaNNoneNoneNoneNone13.0
1038724869791617966082017-06-07 16:14:40 +0000We. Only. Rate. Dogs. Do not send in other things like this fluffy floor shark clearly ready to attack. Get it together guys... 12/10 https://t.co/BZHiKx3FpQNaNNoneNoneNoneNone12.0
1108711025206382673922017-06-03 20:33:19 +0000Never doubt a doggo 14/10 https://t.co/AbBLh2FZCHNaNdoggoNoneNoneNone14.0
1128708043173678817282017-06-03 00:48:22 +0000Real funny guys. Sending in a pic without a dog in it. Hilarious. We'll rate the rug tho because it's giving off a very good vibe. 11/10 https://t.co/GCD1JccCyiNaNNoneNoneNoneNone11.0
1258686224954436321282017-05-28 00:18:35 +0000Here's a h*ckin peaceful boy. Unbothered by the comings and goings. 13/10 please reveal your wise ways https://t.co/yeaH8Ej5eMNaNNoneNoneNoneNone13.0
1278679004954106716162017-05-26 00:29:37 +0000Unbelievable. We only rate dogs. Please don't send in non-canines like the "I" from Pixar's opening credits. Thank you... 12/10 https://t.co/JMhDNv5wXZNaNNoneNoneNoneNone12.0
1318670515209021685762017-05-23 16:16:06 +0000Oh my this spooked me up. We only rate dogs, not happy ghosts. Please send dogs only. It's a very simple premise. Thank you... 13/10 https://t.co/M5Rz0R8SIQNaNNoneNoneNoneNone13.0
1338667206848730562602017-05-22 18:21:28 +0000He was providing for his family 13/10 how dare you https://t.co/Q8mVwWN3f4NaNNoneNoneNoneNone13.0
1418648732064984145922017-05-17 16:00:15 +0000We only rate dogs. Please don't send in Jesus. We're trying to remain professional and legitimate. Thank you... 14/10 https://t.co/wr3xsjeCIRNaNNoneNoneNoneNone14.0
1498630795471887851542017-05-12 17:12:53 +0000Ladies and gentlemen... I found Pipsy. He may have changed his name to Pablo, but he never changed his love for the sea. Pupgraded to 14/10 https://t.co/lVU5GyNFenNaNNoneNoneNoneNone14.0
..............................
23266664115075514818572015-11-17 00:24:19 +0000This is quite the dog. Gets really excited when not in water. Not very soft tho. Bad at fetch. Can't do tricks. 2/10 https://t.co/aMCTNWO94tNaNNoneNoneNoneNone2.0
23276664071268567654402015-11-17 00:06:54 +0000This is a southern Vesuvius bumblegruff. Can drive a truck (wow). Made friends with 5 other nifty dogs (neat). 7/10 https://t.co/LopTBkKa8hNaNNoneNoneNoneNone7.0
23286663962473732915202015-11-16 23:23:41 +0000Oh goodness. A super rare northeast Qdoba kangaroo mix. Massive feet. No pouch (disappointing). Seems alert. 9/10 https://t.co/Dc7b0E8qFENaNNoneNoneNoneNone9.0
23296663737537445888022015-11-16 21:54:18 +0000Those are sunglasses and a jean jacket. 11/10 dog cool af https://t.co/uHXrPkUEylNaNNoneNoneNoneNone11.0
23306663627589092843532015-11-16 21:10:36 +0000Unique dog here. Very small. Lives in container of Frosted Flakes (?). Short legs. Must be rare 6/10 would still pet https://t.co/XMD9CwjEnMNaNNoneNoneNoneNone6.0
23316663532884561018882015-11-16 20:32:58 +0000Here we have a mixed Asiago from the Galápagos Islands. Only one ear working. Big fan of marijuana carpet. 8/10 https://t.co/tltQ5w9aUONaNNoneNoneNoneNone8.0
23326663454175762104322015-11-16 20:01:42 +0000Look at this jokester thinking seat belt laws don't apply to him. Great tongue tho 10/10 https://t.co/VFKG1vxGjBNaNNoneNoneNoneNone10.0
23336663378823035248642015-11-16 19:31:45 +0000This is an extremely rare horned Parthenon. Not amused. Wears shoes. Overall very nice. 9/10 would pet aggressively https://t.co/QpRjllzWALNaNNoneNoneNoneNone9.0
23346662939116321341442015-11-16 16:37:02 +0000This is a funny dog. Weird toes. Won't come down. Loves branch. Refuses to eat his food. Hard to cuddle with. 3/10 https://t.co/IIXis0zta0NaNNoneNoneNoneNone3.0
23356662874062246952962015-11-16 16:11:11 +0000This is an Albanian 3 1/2 legged Episcopalian. Loves well-polished hardwood flooring. Penis on the collar. 9/10 https://t.co/d9NcXFKwLvNaNNoneNoneNoneNone9.0
23366662730976166379522015-11-16 15:14:19 +0000Can take selfies 11/10 https://t.co/ws2AMaNwPWNaNNoneNoneNoneNone11.0
23376662689108036444162015-11-16 14:57:41 +0000Very concerned about fellow dog trapped in computer. 10/10 https://t.co/0yxApIikpkNaNNoneNoneNoneNone10.0
23386661041332886650882015-11-16 04:02:55 +0000Not familiar with this breed. No tail (weird). Only 2 legs. Doesn't bark. Surprisingly quick. Shits eggs. 1/10 https://t.co/Asgdc6kuLXNaNNoneNoneNoneNone1.0
23396661021559091445762015-11-16 03:55:04 +0000Oh my. Here you are seeing an Adobe Setter giving birth to twins!!! The world is an amazing place. 11/10 https://t.co/11LvqN4WLqNaNNoneNoneNoneNone11.0
23406660995137870520322015-11-16 03:44:34 +0000Can stand on stump for what seems like a while. Built that birdhouse? Impressive. Made friends with a squirrel. 8/10 https://t.co/Ri4nMTLq5CNaNNoneNoneNoneNone8.0
23416660940000221593622015-11-16 03:22:39 +0000This appears to be a Mongolian Presbyterian mix. Very tired. Tongue slip confirmed. 9/10 would lie down with https://t.co/mnioXo3IfPNaNNoneNoneNoneNone9.0
23426660829167331983372015-11-16 02:38:37 +0000Here we have a well-established sunblockerspaniel. Lost his other flip-flop. 6/10 not very waterproof https://t.co/3RU6x0vHB7NaNNoneNoneNoneNone6.0
23436660731007867740162015-11-16 01:59:36 +0000Let's hope this flight isn't Malaysian (lol). What a dog! Almost completely camouflaged. 10/10 I trust this pilot https://t.co/Yk6GHE9tOYNaNNoneNoneNoneNone10.0
23446660711932215091202015-11-16 01:52:02 +0000Here we have a northern speckled Rhododendron. Much sass. Gives 0 fucks. Good tongue. 9/10 would caress sensually https://t.co/ZoL8kq2XFxNaNNoneNoneNoneNone9.0
23456660638272560865332015-11-16 01:22:45 +0000This is the happiest dog you will ever see. Very committed owner. Nice couch. 10/10 https://t.co/RhUEAloehKNaNNoneNoneNoneNone10.0
23466660586005241569282015-11-16 01:01:59 +0000Here is the Rand Paul of retrievers folks! He's probably good at poker. Can drink beer (lol rad). 8/10 good dog https://t.co/pYAJkAe76pNaNNoneNoneNoneNone8.0
23476660570904992440322015-11-16 00:55:59 +0000My oh my. This is a rare blond Canadian terrier on wheels. Only $8.98. Rather docile. 9/10 very rare https://t.co/yWBqbrzy8ONaNNoneNoneNoneNone9.0
23486660555250424053802015-11-16 00:49:46 +0000Here is a Siberian heavily armored polar bear mix. Strong owner. 10/10 I would do unspeakable things to pet this dog https://t.co/rdivxLiqEtNaNNoneNoneNoneNone10.0
23496660518538268508162015-11-16 00:35:11 +0000This is an odd dog. Hard on the outside but loving on the inside. Petting still fun. Doesn't play catch well. 2/10 https://t.co/v5A4vzSDdcNaNNoneNoneNoneNone2.0
23506660507587946946572015-11-16 00:30:50 +0000This is a truly beautiful English Wilson Staff retriever. Has a nice phone. Privileged. 10/10 would trade lives with https://t.co/fvIbQfHjIeNaNNoneNoneNoneNone10.0
23516660492481658224652015-11-16 00:24:50 +0000Here we have a 1949 1st generation vulpix. Enjoys sweat tea and Fox News. Cannot be phased. 5/10 https://t.co/4B7cOc1EDqNaNNoneNoneNoneNone5.0
23526660442263298007042015-11-16 00:04:52 +0000This is a purebred Piers Morgan. Loves to Netflix and chill. Always looks like he forgot to unplug the iron. 6/10 https://t.co/DWnyCjf2mxNaNNoneNoneNoneNone6.0
23536660334127010324492015-11-15 23:21:54 +0000Here is a very happy pup. Big fan of well-maintained decks. Just look at that tongue. 9/10 would cuddle af https://t.co/y671yMhoiRNaNNoneNoneNoneNone9.0
23546660292850026209282015-11-15 23:05:30 +0000This is a western brown Mitsubishi terrier. Upset about leaf. Actually 2 dogs here. 7/10 would walk the shit out of https://t.co/r7mOb2m0UINaNNoneNoneNoneNone7.0
23556660208880227901492015-11-15 22:32:08 +0000Here we have a Japanese Irish Setter. Lost eye in Vietnam (?). Big fan of relaxing on stair. 8/10 would pet https://t.co/BLDqew2IjjNaNNoneNoneNoneNone8.0

694 rows × 9 columns
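The name clean-up steps can be checked on a few sample values before touching the real column. This is a minimal sketch on a hypothetical `names` Series; unlike the notebook code it writes " & " with surrounding spaces so both kinds of pair end up formatted the same way, which is a choice made for this illustration only.

```python
import numpy as np
import pandas as pd

# Hypothetical sample of extracted names.
names = pd.Series([
    "Travis and Flurp",
    "Opie &amp; Clarkus",
    "Zeke the Wonder Dog",
    "None",
    "Charlie",
])

cleaned = (
    names
    .str.replace(r"\sand\s", " & ", regex=True)               # "and" between two names
    .str.replace("&amp;", "&", regex=False)                   # HTML-escaped ampersand
    .str.replace("Zeke the Wonder Dog", "Zeke", regex=False)  # one-off manual fix
    .replace("None", np.nan)                                  # literal "None" -> missing value
)
print(cleaned)
```

Passing `regex=` explicitly avoids depending on the default of `str.replace`, which has changed between pandas versions.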

Test
twitter_archive_enhanced_clean.name.value_counts()
Charlie                                 11
Cooper                                  10
Lucy                                    10
Oliver                                   9
Tucker                                   9
Winston                                  8
Lola                                     8
Penny                                    8
Daisy                                    7
Toby                                     7
Sadie                                    6
Koda                                     6
Bailey                                   6
Bella                                    6
Stanley                                  6
Bo                                       5
Scout                                    5
Louis                                    5
Leo                                      5
Buddy                                    5
Jax                                      5
Dave                                     5
Rusty                                    5
Oscar                                    5
Boomer                                   4
Chester                                  4
Sophie                                   4
Dexter                                   4
George                                   4
Winnie                                   4
                                        ..
Obi                                      1
Ester                                    1
Dallas                                   1
Beau & Wilbur                            1
Moofasa                                  1
Dudley                                   1
Hercules                                 1
Petrick                                  1
Yoda                                     1
Rooney                                   1
Fabio                                    1
Balto                                    1
Birf                                     1
Cheryl AKA Queen Pupper of the Skies     1
Huck                                     1
Antony                                   1
Stephanus                                1
Lassie                                   1
Howard                                   1
Striker                                  1
Cali                                     1
Marvin                                   1
Perry                                    1
Cleopatricia                             1
Siba                                     1
Opie&Clarkus                             1
Pete                                     1
Boston                                   1
Deacon                                   1
Alf                                      1
Name: name, Length: 990, dtype: int64
twitter_archive_enhanced_clean.name.sample(15)
1456                     Colin
1142                       NaN
279                  Sojourner
1782                       NaN
381                    Ralphie
122                      Gizmo
898     Lilli Bee & Honey Bear
873                      Bruce
2003                     Buddy
1489                     Wally
2082                      Sage
1569            Trooper & Maya
324                     Lipton
717                     Loomis
2163                     Billl
Name: name, dtype: object
twitter_archive_enhanced_clean[twitter_archive_enhanced_clean["text"].str.contains("Zeke")==True]
(index) | tweet_id | timestamp | text | name | doggo | floofer | pupper | puppo | rating
17 | 888804989199671297 | 2017-07-22 16:56:37 +0000 | This is Zeke. He has a new stick. Very proud of it. Would like you to throw it for him without taking it. 13/10 would do my best https://t.co/HTQ77yNQ5K | Zeke | None | None | None | None | 13.0
181 | 857029823797047296 | 2017-04-26 00:33:27 +0000 | This is Zeke. He performs group cheeky wink tutorials. Pawfect execution here. 12/10 would wink back https://t.co/uMH5CLjXJu | Zeke | None | None | None | None | 12.0
547 | 805520635690676224 | 2016-12-04 21:14:20 +0000 | This is Zeke the Wonder Dog. He never let that poor man keep his frisbees. One of the Spartans all time greatest receivers. 13/10 RIP Zeke https://t.co/zacX7S6GyJ | Zeke | None | None | None | None | 13.0
twitter_archive_enhanced: dog stage data is missing
Define

Re-extract:
As above, use the str.findall function with a regular expression to pull the dog stage from the text column.

Code
twitter_archive_enhanced_clean["stage"] = twitter_archive_enhanced_clean.text.str.lower().str.findall('doggo|floofer|pupper|puppo')
twitter_archive_enhanced_clean["stage"]
0             []
1             []
2             []
3             []
4             []
5             []
6             []
7             []
8             []
9        [doggo]
10            []
11            []
12       [puppo]
13            []
14       [puppo]
15            []
16            []
17            []
18            []
20            []
21            []
22            []
23            []
24            []
25            []
26            []
27            []
28            []
29      [pupper]
31            []
          ...   
2326          []
2327          []
2328          []
2329          []
2330          []
2331          []
2332          []
2333          []
2334          []
2335          []
2336          []
2337          []
2338          []
2339          []
2340          []
2341          []
2342          []
2343          []
2344          []
2345          []
2346          []
2347          []
2348          []
2349          []
2350          []
2351          []
2352          []
2353          []
2354          []
2355          []
Name: stage, Length: 2117, dtype: object
# Check for missing values
#twitter_archive_enhanced_clean.stage.isnull().sum()
# Print every row in which more than one stage keyword was found
for i in twitter_archive_enhanced_clean.stage:
    if len(i)>1:
        print(i)

['puppo', 'doggo', 'puppo']
['puppo', 'doggo']
['doggo', 'floofer']
['doggo', 'doggo']
['pupper', 'doggo']
['pupper', 'doggo', 'pupper', 'doggo']
['doggo', 'pupper']
['doggo', 'pupper']
['pupper', 'pupper']
['doggo', 'pupper']
['pupper', 'doggo']
['doggo', 'doggo']
['doggo', 'pupper']
['doggo', 'pupper']
['pupper', 'pupper']
['pupper', 'doggo']
['doggo', 'pupper']
['pupper', 'pupper']
['pupper', 'pupper']
['pupper', 'pupper']
['pupper', 'pupper']
['pupper', 'pupper']
['pupper', 'pupper']
['pupper', 'pupper']
['pupper', 'pupper', 'pupper']
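# Deduplicate the matches; sorting keeps the joined order stable (e.g. always "doggo,pupper")
twitter_archive_enhanced_clean["stage"] = twitter_archive_enhanced_clean.stage.apply(lambda x: ','.join(sorted(set(x))))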
twitter_archive_enhanced_clean["stage"].value_counts()
                 1740
pupper            248
doggo              79
puppo              28
doggo,pupper       10
floofer             9
doggo,puppo         2
doggo,floofer       1
Name: stage, dtype: int64
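The stage extraction can be reproduced on a few sample strings to check how repeated and mixed keywords are handled. The tweets below are illustrative only, and the `tweets`, `matches` and `stage` names are introduced just for this sketch.

```python
import numpy as np
import pandas as pd

# Illustrative tweets: one stage, two different stages (one repeated), and none.
tweets = pd.Series([
    "Here's a puppo that seems to be on the fence. 13/10",
    "This is a doggo and a pupper. Both doggo af. 12/10",
    "This is Phineas. He's a mystical boy. 13/10",
])

# Collect every stage keyword per tweet, ignoring case.
matches = tweets.str.lower().str.findall(r"doggo|floofer|pupper|puppo")

# Deduplicate, sort for a stable order, join, and mark "no stage" as NaN.
stage = matches.apply(lambda found: ",".join(sorted(set(found)))).replace("", np.nan)
print(stage)
```

Note that the pattern matches substrings, so a plural such as "puppers" is also counted as "pupper"; whether that is desirable depends on how strictly the stage should be defined.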
# Drop the "doggo", "floofer", "pupper" and "puppo" columns
twitter_archive_enhanced_clean.drop(["doggo","floofer","pupper","puppo"],axis=1,inplace = True)
twitter_archive_enhanced_clean["stage"] =twitter_archive_enhanced_clean["stage"].replace("",np.nan)
twitter_archive_enhanced_clean[twitter_archive_enhanced_clean["stage"].isnull()]
tweet_idtimestamptextnameratingstage
08924206435553361932017-08-01 16:23:56 +0000This is Phineas. He's a mystical boy. Only ever appears in the hole of a donut. 13/10 https://t.co/MgUWQ76dJUPhineas13.0NaN
18921774213063434262017-08-01 00:17:27 +0000This is Tilly. She's just checking pup on you. Hopes you're doing ok. If not, she's available for pats, snugs, boops, the whole bit. 13/10 https://t.co/0Xxu71qeIVTilly13.0NaN
28918151813780848642017-07-31 00:18:03 +0000This is Archie. He is a rare Norwegian Pouncing Corgo. Lives in the tall grass. You never know when one may strike. 12/10 https://t.co/wUnZnhtVJBArchie12.0NaN
38916895572798586882017-07-30 15:58:51 +0000This is Darla. She commenced a snooze mid meal. 13/10 happens to the best of us https://t.co/tD36da7qLQDarla13.0NaN
48913275589266882562017-07-29 16:00:24 +0000This is Franklin. He would like you to stop calling him "cute." He is a very fierce shark and should be respected as such. 12/10 #BarkWeek https://t.co/AtUZn91f7fFranklin12.0NaN
58910879508758978562017-07-29 00:08:17 +0000Here we have a majestic great white breaching off South Africa's coast. Absolutely h*ckin breathtaking. 13/10 (IG: tucker_marlo) #BarkWeek https://t.co/kQ04fDDRmhNaN13.0NaN
68909719131739914262017-07-28 16:27:12 +0000Meet Jax. He enjoys ice cream so much he gets nervous around it. 13/10 help Jax enjoy more things by clicking below\n\nhttps://t.co/Zr4hWfAs1H https://t.co/tVJBRMnhxlJax13.0NaN
78907291814112378882017-07-28 00:22:40 +0000When you watch your owner call another dog a good boy but then they turn back to you and say you're a great boy. 13/10 https://t.co/v0nONBcwxqNaN13.0NaN
88906091851503124482017-07-27 16:25:51 +0000This is Zoey. She doesn't want to be one of the scary sharks. Just wants to be a snuggly pettable boatpet. 13/10 #BarkWeek https://t.co/9TwLuAGH0bZoey13.0NaN
108900066081131724802017-07-26 00:31:25 +0000This is Koda. He is a South Australian deckshark. Deceptively deadly. Frighteningly majestic. 13/10 would risk a petting #BarkWeek https://t.co/dVPW0B0MmeKoda13.0NaN
118898808964798668812017-07-25 16:11:53 +0000This is Bruno. He is a service shark. Only gets out of the water to assist you. 13/10 terrifyingly good boy https://t.co/u1XPQMl29gBruno13.0NaN
138896388375799070722017-07-25 00:10:02 +0000This is Ted. He does his best. Sometimes that's not enough. But it's ok. 12/10 would assist https://t.co/f8dEDcrKSRTed12.0NaN
158892788419816857602017-07-24 00:19:32 +0000This is Oliver. You're witnessing one of his many brutal attacks. Seems to be playing with his victim. 13/10 fr*ckin frightening #BarkWeek https://t.co/WpHvrQedPbOliver13.0NaN
168889172381238312962017-07-23 00:22:39 +0000This is Jim. He found a fren. Taught him how to sit like the good boys. 12/10 for both https://t.co/chxruIOUJNJim12.0NaN
178888049891996712972017-07-22 16:56:37 +0000This is Zeke. He has a new stick. Very proud of it. Would like you to throw it for him without taking it. 13/10 would do my best https://t.co/HTQ77yNQ5KZeke13.0NaN
188885549627242782722017-07-22 00:23:06 +0000This is Ralphus. He's powering up. Attempting maximum borkdrive. 13/10 inspirational af https://t.co/YnYAFCTTiKRalphus13.0NaN
208880784344585871362017-07-20 16:49:33 +0000This is Gerald. He was just told he didn't get the job he interviewed for. A h*ckin injustice. 12/10 didn't want the job anyway https://t.co/DK7iDPfuRXGerald12.0NaN
218877052893818265602017-07-19 16:06:48 +0000This is Jeffrey. He has a monopoly on the pool noodles. Currently running a 'boop for two' midweek sale. 13/10 h*ckin strategic https://t.co/PhrUk20Q64Jeffrey13.0NaN
228875171391580938242017-07-19 03:39:09 +0000I've yet to rate a Venezuelan Hover Wiener. This is such an honor. 14/10 paw-inspiring af (IG: roxy.thedoxy) https://t.co/20VrLAA8baNaN14.0NaN
238874739571039518832017-07-19 00:47:34 +0000This is Canela. She attempted some fancy porch pics. They were unsuccessful. 13/10 someone help her https://t.co/cLyzpcUcMXCanela13.0NaN
248873432170453688322017-07-18 16:08:03 +0000You may not have known you needed to see this today. 13/10 please enjoy (IG: emmylouroo) https://t.co/WZqNqygEyVNaN13.0NaN
258871013928040857602017-07-18 00:07:08 +0000This... is a Jubilant Antarctic House Bear. We only rate dogs. Please only send dogs. Thank you... 12/10 would suffocate in floof https://t.co/4Ad1jzJSdpNaN12.0NaN
268869832335225446402017-07-17 16:17:36 +0000This is Maya. She's very shy. Rarely leaves her cup. 13/10 would find her an environment to thrive in https://t.co/I6oNy0CgiTMaya13.0NaN
278867368805193195522017-07-16 23:58:41 +0000This is Mingus. He's a wonderful father to his smol pup. Confirmed 13/10, but he needs your help\n\nhttps://t.co/bVi0Yr4Cff https://t.co/ISvKOSkd5bMingus13.0NaN
288866803364779335682017-07-16 20:14:00 +0000This is Derek. He's late for a dog meeting. 13/10 pet...al to the metal https://t.co/BCoWue0abADerek13.0NaN
318862583841518878732017-07-15 16:17:19 +0000This is Waffles. His doggles are pupside down. Unsure how to fix. 13/10 someone assist Waffles https://t.co/xZDA9Qsq1OWaffles13.0NaN
338859848000199475202017-07-14 22:10:11 +0000Viewer discretion advised. This is Jimbo. He will rip ur finger right h*ckin off. Other dog clearly an accessory. 12/10 pls pet with caution https://t.co/BuveP0uMF1Jimbo12.0NaN
348855289432054702082017-07-13 15:58:47 +0000This is Maisey. She fell asleep mid-excavation. Happens to the best of us. 13/10 would pat noggin approvingly https://t.co/tp1kQ8i9JFMaisey13.0NaN
358855189715287203852017-07-13 15:19:09 +0000I have a new hero and his name is Howard. 14/10 https://t.co/gzLHboL7SkHoward14.0NaN
378851676198836387842017-07-12 16:03:00 +0000Here we have a corgi undercover as a malamute. Pawbably doing important investigative work. Zero control over tongue happenings. 13/10 https://t.co/44ItaMubBfNaN13.0NaN
.....................
23266664115075514818572015-11-17 00:24:19 +0000This is quite the dog. Gets really excited when not in water. Not very soft tho. Bad at fetch. Can't do tricks. 2/10 https://t.co/aMCTNWO94tNaN2.0NaN
23276664071268567654402015-11-17 00:06:54 +0000This is a southern Vesuvius bumblegruff. Can drive a truck (wow). Made friends with 5 other nifty dogs (neat). 7/10 https://t.co/LopTBkKa8hNaN7.0NaN
23286663962473732915202015-11-16 23:23:41 +0000Oh goodness. A super rare northeast Qdoba kangaroo mix. Massive feet. No pouch (disappointing). Seems alert. 9/10 https://t.co/Dc7b0E8qFENaN9.0NaN
23296663737537445888022015-11-16 21:54:18 +0000Those are sunglasses and a jean jacket. 11/10 dog cool af https://t.co/uHXrPkUEylNaN11.0NaN
23306663627589092843532015-11-16 21:10:36 +0000Unique dog here. Very small. Lives in container of Frosted Flakes (?). Short legs. Must be rare 6/10 would still pet https://t.co/XMD9CwjEnMNaN6.0NaN
23316663532884561018882015-11-16 20:32:58 +0000Here we have a mixed Asiago from the Galápagos Islands. Only one ear working. Big fan of marijuana carpet. 8/10 https://t.co/tltQ5w9aUONaN8.0NaN
23326663454175762104322015-11-16 20:01:42 +0000Look at this jokester thinking seat belt laws don't apply to him. Great tongue tho 10/10 https://t.co/VFKG1vxGjBNaN10.0NaN
23336663378823035248642015-11-16 19:31:45 +0000This is an extremely rare horned Parthenon. Not amused. Wears shoes. Overall very nice. 9/10 would pet aggressively https://t.co/QpRjllzWALNaN9.0NaN
23346662939116321341442015-11-16 16:37:02 +0000This is a funny dog. Weird toes. Won't come down. Loves branch. Refuses to eat his food. Hard to cuddle with. 3/10 https://t.co/IIXis0zta0NaN3.0NaN
23356662874062246952962015-11-16 16:11:11 +0000This is an Albanian 3 1/2 legged Episcopalian. Loves well-polished hardwood flooring. Penis on the collar. 9/10 https://t.co/d9NcXFKwLvNaN9.0NaN
23366662730976166379522015-11-16 15:14:19 +0000Can take selfies 11/10 https://t.co/ws2AMaNwPWNaN11.0NaN
23376662689108036444162015-11-16 14:57:41 +0000Very concerned about fellow dog trapped in computer. 10/10 https://t.co/0yxApIikpkNaN10.0NaN
23386661041332886650882015-11-16 04:02:55 +0000Not familiar with this breed. No tail (weird). Only 2 legs. Doesn't bark. Surprisingly quick. Shits eggs. 1/10 https://t.co/Asgdc6kuLXNaN1.0NaN
23396661021559091445762015-11-16 03:55:04 +0000Oh my. Here you are seeing an Adobe Setter giving birth to twins!!! The world is an amazing place. 11/10 https://t.co/11LvqN4WLqNaN11.0NaN
23406660995137870520322015-11-16 03:44:34 +0000Can stand on stump for what seems like a while. Built that birdhouse? Impressive. Made friends with a squirrel. 8/10 https://t.co/Ri4nMTLq5CNaN8.0NaN
23416660940000221593622015-11-16 03:22:39 +0000This appears to be a Mongolian Presbyterian mix. Very tired. Tongue slip confirmed. 9/10 would lie down with https://t.co/mnioXo3IfPNaN9.0NaN
23426660829167331983372015-11-16 02:38:37 +0000Here we have a well-established sunblockerspaniel. Lost his other flip-flop. 6/10 not very waterproof https://t.co/3RU6x0vHB7NaN6.0NaN
23436660731007867740162015-11-16 01:59:36 +0000Let's hope this flight isn't Malaysian (lol). What a dog! Almost completely camouflaged. 10/10 I trust this pilot https://t.co/Yk6GHE9tOYNaN10.0NaN
23446660711932215091202015-11-16 01:52:02 +0000Here we have a northern speckled Rhododendron. Much sass. Gives 0 fucks. Good tongue. 9/10 would caress sensually https://t.co/ZoL8kq2XFxNaN9.0NaN
23456660638272560865332015-11-16 01:22:45 +0000This is the happiest dog you will ever see. Very committed owner. Nice couch. 10/10 https://t.co/RhUEAloehKNaN10.0NaN
23466660586005241569282015-11-16 01:01:59 +0000Here is the Rand Paul of retrievers folks! He's probably good at poker. Can drink beer (lol rad). 8/10 good dog https://t.co/pYAJkAe76pNaN8.0NaN
23476660570904992440322015-11-16 00:55:59 +0000My oh my. This is a rare blond Canadian terrier on wheels. Only $8.98. Rather docile. 9/10 very rare https://t.co/yWBqbrzy8ONaN9.0NaN
23486660555250424053802015-11-16 00:49:46 +0000Here is a Siberian heavily armored polar bear mix. Strong owner. 10/10 I would do unspeakable things to pet this dog https://t.co/rdivxLiqEtNaN10.0NaN
23496660518538268508162015-11-16 00:35:11 +0000This is an odd dog. Hard on the outside but loving on the inside. Petting still fun. Doesn't play catch well. 2/10 https://t.co/v5A4vzSDdcNaN2.0NaN
23506660507587946946572015-11-16 00:30:50 +0000This is a truly beautiful English Wilson Staff retriever. Has a nice phone. Privileged. 10/10 would trade lives with https://t.co/fvIbQfHjIeNaN10.0NaN
23516660492481658224652015-11-16 00:24:50 +0000Here we have a 1949 1st generation vulpix. Enjoys sweat tea and Fox News. Cannot be phased. 5/10 https://t.co/4B7cOc1EDqNaN5.0NaN
23526660442263298007042015-11-16 00:04:52 +0000This is a purebred Piers Morgan. Loves to Netflix and chill. Always looks like he forgot to unplug the iron. 6/10 https://t.co/DWnyCjf2mxNaN6.0NaN
23536660334127010324492015-11-15 23:21:54 +0000Here is a very happy pup. Big fan of well-maintained decks. Just look at that tongue. 9/10 would cuddle af https://t.co/y671yMhoiRNaN9.0NaN
23546660292850026209282015-11-15 23:05:30 +0000This is a western brown Mitsubishi terrier. Upset about leaf. Actually 2 dogs here. 7/10 would walk the shit out of https://t.co/r7mOb2m0UINaN7.0NaN
23556660208880227901492015-11-15 22:32:08 +0000Here we have a Japanese Irish Setter. Lost eye in Vietnam (?). Big fan of relaxing on stair. 8/10 would pet https://t.co/BLDqew2IjjNaN8.0NaN

1740 rows × 6 columns

Test
twitter_archive_enhanced_clean
tweet_idtimestamptextnameratingstage
08924206435553361932017-08-01 16:23:56 +0000This is Phineas. He's a mystical boy. Only ever appears in the hole of a donut. 13/10 https://t.co/MgUWQ76dJUPhineas13.0NaN
18921774213063434262017-08-01 00:17:27 +0000This is Tilly. She's just checking pup on you. Hopes you're doing ok. If not, she's available for pats, snugs, boops, the whole bit. 13/10 https://t.co/0Xxu71qeIVTilly13.0NaN
28918151813780848642017-07-31 00:18:03 +0000This is Archie. He is a rare Norwegian Pouncing Corgo. Lives in the tall grass. You never know when one may strike. 12/10 https://t.co/wUnZnhtVJBArchie12.0NaN
38916895572798586882017-07-30 15:58:51 +0000This is Darla. She commenced a snooze mid meal. 13/10 happens to the best of us https://t.co/tD36da7qLQDarla13.0NaN
48913275589266882562017-07-29 16:00:24 +0000This is Franklin. He would like you to stop calling him "cute." He is a very fierce shark and should be respected as such. 12/10 #BarkWeek https://t.co/AtUZn91f7fFranklin12.0NaN
58910879508758978562017-07-29 00:08:17 +0000Here we have a majestic great white breaching off South Africa's coast. Absolutely h*ckin breathtaking. 13/10 (IG: tucker_marlo) #BarkWeek https://t.co/kQ04fDDRmhNaN13.0NaN
68909719131739914262017-07-28 16:27:12 +0000Meet Jax. He enjoys ice cream so much he gets nervous around it. 13/10 help Jax enjoy more things by clicking below\n\nhttps://t.co/Zr4hWfAs1H https://t.co/tVJBRMnhxlJax13.0NaN
78907291814112378882017-07-28 00:22:40 +0000When you watch your owner call another dog a good boy but then they turn back to you and say you're a great boy. 13/10 https://t.co/v0nONBcwxqNaN13.0NaN
88906091851503124482017-07-27 16:25:51 +0000This is Zoey. She doesn't want to be one of the scary sharks. Just wants to be a snuggly pettable boatpet. 13/10 #BarkWeek https://t.co/9TwLuAGH0bZoey13.0NaN
98902402553491988492017-07-26 15:59:51 +0000This is Cassie. She is a college pup. Studying international doggo communication and stick theory. 14/10 so elegant much sophisticate https://t.co/t1bfwz5S2ACassie14.0doggo
108900066081131724802017-07-26 00:31:25 +0000This is Koda. He is a South Australian deckshark. Deceptively deadly. Frighteningly majestic. 13/10 would risk a petting #BarkWeek https://t.co/dVPW0B0MmeKoda13.0NaN
118898808964798668812017-07-25 16:11:53 +0000This is Bruno. He is a service shark. Only gets out of the water to assist you. 13/10 terrifyingly good boy https://t.co/u1XPQMl29gBruno13.0NaN
128896653883336826892017-07-25 01:55:32 +0000Here's a puppo that seems to be on the fence about something haha no but seriously someone help her. 13/10 https://t.co/BxvuXk0UCmNaN13.0puppo
138896388375799070722017-07-25 00:10:02 +0000This is Ted. He does his best. Sometimes that's not enough. But it's ok. 12/10 would assist https://t.co/f8dEDcrKSRTed12.0NaN
148895311353442099212017-07-24 17:02:04 +0000This is Stuart. He's sporting his favorite fanny pack. Secretly filled with bones only. 13/10 puppared puppo #BarkWeek https://t.co/y70o6h3isqStuart13.0puppo
158892788419816857602017-07-24 00:19:32 +0000This is Oliver. You're witnessing one of his many brutal attacks. Seems to be playing with his victim. 13/10 fr*ckin frightening #BarkWeek https://t.co/WpHvrQedPbOliver13.0NaN
168889172381238312962017-07-23 00:22:39 +0000This is Jim. He found a fren. Taught him how to sit like the good boys. 12/10 for both https://t.co/chxruIOUJNJim12.0NaN
178888049891996712972017-07-22 16:56:37 +0000This is Zeke. He has a new stick. Very proud of it. Would like you to throw it for him without taking it. 13/10 would do my best https://t.co/HTQ77yNQ5KZeke13.0NaN
188885549627242782722017-07-22 00:23:06 +0000This is Ralphus. He's powering up. Attempting maximum borkdrive. 13/10 inspirational af https://t.co/YnYAFCTTiKRalphus13.0NaN
208880784344585871362017-07-20 16:49:33 +0000This is Gerald. He was just told he didn't get the job he interviewed for. A h*ckin injustice. 12/10 didn't want the job anyway https://t.co/DK7iDPfuRXGerald12.0NaN
218877052893818265602017-07-19 16:06:48 +0000This is Jeffrey. He has a monopoly on the pool noodles. Currently running a 'boop for two' midweek sale. 13/10 h*ckin strategic https://t.co/PhrUk20Q64Jeffrey13.0NaN
228875171391580938242017-07-19 03:39:09 +0000I've yet to rate a Venezuelan Hover Wiener. This is such an honor. 14/10 paw-inspiring af (IG: roxy.thedoxy) https://t.co/20VrLAA8baNaN14.0NaN
238874739571039518832017-07-19 00:47:34 +0000This is Canela. She attempted some fancy porch pics. They were unsuccessful. 13/10 someone help her https://t.co/cLyzpcUcMXCanela13.0NaN
248873432170453688322017-07-18 16:08:03 +0000You may not have known you needed to see this today. 13/10 please enjoy (IG: emmylouroo) https://t.co/WZqNqygEyVNaN13.0NaN
258871013928040857602017-07-18 00:07:08 +0000This... is a Jubilant Antarctic House Bear. We only rate dogs. Please only send dogs. Thank you... 12/10 would suffocate in floof https://t.co/4Ad1jzJSdpNaN12.0NaN
268869832335225446402017-07-17 16:17:36 +0000This is Maya. She's very shy. Rarely leaves her cup. 13/10 would find her an environment to thrive in https://t.co/I6oNy0CgiTMaya13.0NaN
278867368805193195522017-07-16 23:58:41 +0000This is Mingus. He's a wonderful father to his smol pup. Confirmed 13/10, but he needs your help\n\nhttps://t.co/bVi0Yr4Cff https://t.co/ISvKOSkd5bMingus13.0NaN
288866803364779335682017-07-16 20:14:00 +0000This is Derek. He's late for a dog meeting. 13/10 pet...al to the metal https://t.co/BCoWue0abADerek13.0NaN
298863661447344455682017-07-15 23:25:31 +0000This is Roscoe. Another pupper fallen victim to spontaneous tongue ejections. Get the BlepiPen immediate. 12/10 deep breaths Roscoe https://t.co/RGE08MIJoxRoscoe12.0pupper
318862583841518878732017-07-15 16:17:19 +0000This is Waffles. His doggles are pupside down. Unsure how to fix. 13/10 someone assist Waffles https://t.co/xZDA9Qsq1OWaffles13.0NaN
.....................
23266664115075514818572015-11-17 00:24:19 +0000This is quite the dog. Gets really excited when not in water. Not very soft tho. Bad at fetch. Can't do tricks. 2/10 https://t.co/aMCTNWO94tNaN2.0NaN
23276664071268567654402015-11-17 00:06:54 +0000This is a southern Vesuvius bumblegruff. Can drive a truck (wow). Made friends with 5 other nifty dogs (neat). 7/10 https://t.co/LopTBkKa8hNaN7.0NaN
23286663962473732915202015-11-16 23:23:41 +0000Oh goodness. A super rare northeast Qdoba kangaroo mix. Massive feet. No pouch (disappointing). Seems alert. 9/10 https://t.co/Dc7b0E8qFENaN9.0NaN
23296663737537445888022015-11-16 21:54:18 +0000Those are sunglasses and a jean jacket. 11/10 dog cool af https://t.co/uHXrPkUEylNaN11.0NaN
23306663627589092843532015-11-16 21:10:36 +0000Unique dog here. Very small. Lives in container of Frosted Flakes (?). Short legs. Must be rare 6/10 would still pet https://t.co/XMD9CwjEnMNaN6.0NaN
23316663532884561018882015-11-16 20:32:58 +0000Here we have a mixed Asiago from the Galápagos Islands. Only one ear working. Big fan of marijuana carpet. 8/10 https://t.co/tltQ5w9aUONaN8.0NaN
23326663454175762104322015-11-16 20:01:42 +0000Look at this jokester thinking seat belt laws don't apply to him. Great tongue tho 10/10 https://t.co/VFKG1vxGjBNaN10.0NaN
23336663378823035248642015-11-16 19:31:45 +0000This is an extremely rare horned Parthenon. Not amused. Wears shoes. Overall very nice. 9/10 would pet aggressively https://t.co/QpRjllzWALNaN9.0NaN
23346662939116321341442015-11-16 16:37:02 +0000This is a funny dog. Weird toes. Won't come down. Loves branch. Refuses to eat his food. Hard to cuddle with. 3/10 https://t.co/IIXis0zta0NaN3.0NaN
23356662874062246952962015-11-16 16:11:11 +0000This is an Albanian 3 1/2 legged Episcopalian. Loves well-polished hardwood flooring. Penis on the collar. 9/10 https://t.co/d9NcXFKwLvNaN9.0NaN
23366662730976166379522015-11-16 15:14:19 +0000Can take selfies 11/10 https://t.co/ws2AMaNwPWNaN11.0NaN
23376662689108036444162015-11-16 14:57:41 +0000Very concerned about fellow dog trapped in computer. 10/10 https://t.co/0yxApIikpkNaN10.0NaN
23386661041332886650882015-11-16 04:02:55 +0000Not familiar with this breed. No tail (weird). Only 2 legs. Doesn't bark. Surprisingly quick. Shits eggs. 1/10 https://t.co/Asgdc6kuLXNaN1.0NaN
23396661021559091445762015-11-16 03:55:04 +0000Oh my. Here you are seeing an Adobe Setter giving birth to twins!!! The world is an amazing place. 11/10 https://t.co/11LvqN4WLqNaN11.0NaN
23406660995137870520322015-11-16 03:44:34 +0000Can stand on stump for what seems like a while. Built that birdhouse? Impressive. Made friends with a squirrel. 8/10 https://t.co/Ri4nMTLq5CNaN8.0NaN
23416660940000221593622015-11-16 03:22:39 +0000This appears to be a Mongolian Presbyterian mix. Very tired. Tongue slip confirmed. 9/10 would lie down with https://t.co/mnioXo3IfPNaN9.0NaN
23426660829167331983372015-11-16 02:38:37 +0000Here we have a well-established sunblockerspaniel. Lost his other flip-flop. 6/10 not very waterproof https://t.co/3RU6x0vHB7NaN6.0NaN
23436660731007867740162015-11-16 01:59:36 +0000Let's hope this flight isn't Malaysian (lol). What a dog! Almost completely camouflaged. 10/10 I trust this pilot https://t.co/Yk6GHE9tOYNaN10.0NaN
23446660711932215091202015-11-16 01:52:02 +0000Here we have a northern speckled Rhododendron. Much sass. Gives 0 fucks. Good tongue. 9/10 would caress sensually https://t.co/ZoL8kq2XFxNaN9.0NaN
23456660638272560865332015-11-16 01:22:45 +0000This is the happiest dog you will ever see. Very committed owner. Nice couch. 10/10 https://t.co/RhUEAloehKNaN10.0NaN
23466660586005241569282015-11-16 01:01:59 +0000Here is the Rand Paul of retrievers folks! He's probably good at poker. Can drink beer (lol rad). 8/10 good dog https://t.co/pYAJkAe76pNaN8.0NaN
23476660570904992440322015-11-16 00:55:59 +0000My oh my. This is a rare blond Canadian terrier on wheels. Only $8.98. Rather docile. 9/10 very rare https://t.co/yWBqbrzy8ONaN9.0NaN
23486660555250424053802015-11-16 00:49:46 +0000Here is a Siberian heavily armored polar bear mix. Strong owner. 10/10 I would do unspeakable things to pet this dog https://t.co/rdivxLiqEtNaN10.0NaN
23496660518538268508162015-11-16 00:35:11 +0000This is an odd dog. Hard on the outside but loving on the inside. Petting still fun. Doesn't play catch well. 2/10 https://t.co/v5A4vzSDdcNaN2.0NaN
23506660507587946946572015-11-16 00:30:50 +0000This is a truly beautiful English Wilson Staff retriever. Has a nice phone. Privileged. 10/10 would trade lives with https://t.co/fvIbQfHjIeNaN10.0NaN
23516660492481658224652015-11-16 00:24:50 +0000Here we have a 1949 1st generation vulpix. Enjoys sweat tea and Fox News. Cannot be phased. 5/10 https://t.co/4B7cOc1EDqNaN5.0NaN
23526660442263298007042015-11-16 00:04:52 +0000This is a purebred Piers Morgan. Loves to Netflix and chill. Always looks like he forgot to unplug the iron. 6/10 https://t.co/DWnyCjf2mxNaN6.0NaN
23536660334127010324492015-11-15 23:21:54 +0000Here is a very happy pup. Big fan of well-maintained decks. Just look at that tongue. 9/10 would cuddle af https://t.co/y671yMhoiRNaN9.0NaN
23546660292850026209282015-11-15 23:05:30 +0000This is a western brown Mitsubishi terrier. Upset about leaf. Actually 2 dogs here. 7/10 would walk the shit out of https://t.co/r7mOb2m0UINaN7.0NaN
23556660208880227901492015-11-15 22:32:08 +0000Here we have a Japanese Irish Setter. Lost eye in Vietnam (?). Big fan of relaxing on stair. 8/10 would pet https://t.co/BLDqew2IjjNaN8.0NaN

2117 rows × 6 columns

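Tidiness

In twitter_archive_enhanced, the four column headers doggo, floofer, pupper, and puppo are values rather than variable names.
All three datasets use tweet_id as the observational unit, yet they have not been merged into a single table.
Define
  • 1. The doggo, floofer, pupper, and puppo columns have already been cleaned.
  • 2. Merge the three tables on "tweet_id" with the merge function.
Code
# Convert "tweet_id" in twitter_archive_enhanced_clean to string
twitter_archive_enhanced_clean["tweet_id"] = twitter_archive_enhanced_clean["tweet_id"].astype(str)
# Convert "tweet_id" in image_predictions_clean from integer to string with astype
image_predictions_clean["tweet_id"] = image_predictions_clean["tweet_id"].astype(str)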
extra_data_clean.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 2352 entries, 0 to 2351
Data columns (total 3 columns):
tweet_id          2352 non-null object
retweet_count     2352 non-null int64
favorite_count    2352 non-null int64
dtypes: int64(2), object(1)
memory usage: 55.2+ KB
twitter_archive_enhanced_clean = pd.merge(twitter_archive_enhanced_clean,extra_data_clean,on=["tweet_id"],how="left" )
twitter_archive_master = pd.merge(twitter_archive_enhanced_clean,image_predictions_clean,on="tweet_id",how="inner")
#twitter_archive_master = twitter_archive_enhanced_clean.merge(image_predictions_clean,
 #                                                    how='inner',on='tweet_id').merge(extra_data_clean,how='left',on='tweet_id')
twitter_archive_master.info()
<class 'pandas.core.frame.DataFrame'>
Int64Index: 1994 entries, 0 to 1993
Data columns (total 19 columns):
tweet_id          1994 non-null object
timestamp         1994 non-null object
text              1994 non-null object
name              1381 non-null object
rating            1981 non-null float64
stage             342 non-null object
retweet_count     1994 non-null int64
favorite_count    1994 non-null int64
jpg_url           1994 non-null object
img_num           1994 non-null int64
p1                1994 non-null object
p1_conf           1994 non-null float64
p1_dog            1994 non-null bool
p2                1994 non-null object
p2_conf           1994 non-null float64
p2_dog            1994 non-null bool
p3                1994 non-null object
p3_conf           1994 non-null float64
p3_dog            1994 non-null bool
dtypes: bool(3), float64(4), int64(3), object(9)
memory usage: 270.7+ KB
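
The drop in row count after merging follows from the join types: a left join keeps every row of the left table, while an inner join keeps only ids found in both tables. A minimal sketch on toy data, assuming made-up ids, counts, and predictions (none of these values come from the real tables); `indicator=True` is one way to see which ids fail to match:

```python
import pandas as pd

# Toy tables; every id and value here is invented for illustration only.
archive = pd.DataFrame({"tweet_id": ["1", "2", "3"], "rating": [12.0, 13.0, 10.0]})
extra = pd.DataFrame({"tweet_id": ["1", "2"], "retweet_count": [5, 7]})
images = pd.DataFrame({"tweet_id": ["2", "3", "4"], "p1": ["pug", "corgi", "bagel"]})

# A left join keeps every archive row; rows without a match get NaN.
left = archive.merge(extra, on="tweet_id", how="left")

# An inner join keeps only the ids present in both tables.
master = left.merge(images, on="tweet_id", how="inner")

# indicator=True adds a _merge column recording where each row came from.
audit = archive.merge(images, on="tweet_id", how="outer", indicator=True)
only_in_archive = audit.loc[audit["_merge"] == "left_only", "tweet_id"].tolist()
```

Both key columns must share a dtype (string here) for the ids to match, which is why the "tweet_id" columns are converted before merging.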
twitter_archive_master.head()
tweet_idtimestamptextnameratingstageretweet_countfavorite_countjpg_urlimg_nump1p1_confp1_dogp2p2_confp2_dogp3p3_confp3_dog
08924206435553361932017-08-01 16:23:56 +0000This is Phineas. He's a mystical boy. Only ever appears in the hole of a donut. 13/10 https://t.co/MgUWQ76dJUPhineas13.0NaN884239492https://pbs.twimg.com/media/DGKD1-bXoAAIAUK.jpg1orange0.097049Falsebagel0.085851Falsebanana0.076110False
18921774213063434262017-08-01 00:17:27 +0000This is Tilly. She's just checking pup on you. Hopes you're doing ok. If not, she's available for pats, snugs, boops, the whole bit. 13/10 https://t.co/0Xxu71qeIVTilly13.0NaN648033786https://pbs.twimg.com/media/DGGmoV4XsAAUL6n.jpg1Chihuahua0.323581TruePekinese0.090647Truepapillon0.068957True
28918151813780848642017-07-31 00:18:03 +0000This is Archie. He is a rare Norwegian Pouncing Corgo. Lives in the tall grass. You never know when one may strike. 12/10 https://t.co/wUnZnhtVJBArchie12.0NaN430125445https://pbs.twimg.com/media/DGBdLU1WsAANxJ9.jpg1Chihuahua0.716012Truemalamute0.078253Truekelpie0.031379True
38916895572798586882017-07-30 15:58:51 +0000This is Darla. She commenced a snooze mid meal. 13/10 happens to the best of us https://t.co/tD36da7qLQDarla13.0NaN892542863https://pbs.twimg.com/media/DF_q7IAWsAEuuN8.jpg1paper_towel0.170278FalseLabrador_retriever0.168086Truespatula0.040836False
48913275589266882562017-07-29 16:00:24 +0000This is Franklin. He would like you to stop calling him "cute." He is a very fierce shark and should be respected as such. 12/10 #BarkWeek https://t.co/AtUZn91f7fFranklin12.0NaN972141016https://pbs.twimg.com/media/DF6hr6BUMAAzZgT.jpg2basset0.555712TrueEnglish_springer0.225770TrueGerman_short-haired_pointer0.175219True
twitter_archive_master.tail()
tweet_idtimestamptextnameratingstageretweet_countfavorite_countjpg_urlimg_nump1p1_confp1_dogp2p2_confp2_dogp3p3_confp3_dog
19896660492481658224652015-11-16 00:24:50 +0000Here we have a 1949 1st generation vulpix. Enjoys sweat tea and Fox News. Cannot be phased. 5/10 https://t.co/4B7cOc1EDqNaN5.0NaN41111https://pbs.twimg.com/media/CT5IQmsXIAAKY4A.jpg1miniature_pinscher0.560311TrueRottweiler0.243682TrueDoberman0.154629True
19906660442263298007042015-11-16 00:04:52 +0000This is a purebred Piers Morgan. Loves to Netflix and chill. Always looks like he forgot to unplug the iron. 6/10 https://t.co/DWnyCjf2mxNaN6.0NaN147309https://pbs.twimg.com/media/CT5Dr8HUEAA-lEu.jpg1Rhodesian_ridgeback0.408143Trueredbone0.360687Trueminiature_pinscher0.222752True
19916660334127010324492015-11-15 23:21:54 +0000Here is a very happy pup. Big fan of well-maintained decks. Just look at that tongue. 9/10 would cuddle af https://t.co/y671yMhoiRNaN9.0NaN47128https://pbs.twimg.com/media/CT4521TWwAEvMyu.jpg1German_shepherd0.596461Truemalinois0.138584Truebloodhound0.116197True
19926660292850026209282015-11-15 23:05:30 +0000This is a western brown Mitsubishi terrier. Upset about leaf. Actually 2 dogs here. 7/10 would walk the shit out of https://t.co/r7mOb2m0UINaN7.0NaN48132https://pbs.twimg.com/media/CT42GRgUYAA5iDo.jpg1redbone0.506826Trueminiature_pinscher0.074192TrueRhodesian_ridgeback0.072010True
19936660208880227901492015-11-15 22:32:08 +0000Here we have a Japanese Irish Setter. Lost eye in Vietnam (?). Big fan of relaxing on stair. 8/10 would pet https://t.co/BLDqew2IjjNaN8.0NaN5302528https://pbs.twimg.com/media/CT4udn0WwAA0aMy.jpg1Welsh_springer_spaniel0.465074Truecollie0.156665TrueShetland_sheepdog0.061428True

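Quality

The "tweet_id" column in image_predictions had the wrong data type; it was already converted in the Tidiness step.
In twitter_archive_enhanced, the "tweet_id" and "timestamp" columns both have the wrong data type; "tweet_id" was already converted because the merge required it.
Define

Use pd.to_datetime to convert the "timestamp" column to a pandas datetime type.

Code
# The raw strings look like "2017-08-01 16:23:56 +0000"; a date-only format does not match them, so let pandas infer the format
twitter_archive_master["timestamp"] = pd.to_datetime(twitter_archive_master["timestamp"])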
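
The timestamp strings in this dataset carry a time of day and a UTC offset in addition to the date. A minimal sketch, assuming two sample strings copied from the table output, of parsing them with a format string that spells out every field:

```python
import pandas as pd

# Two sample strings in the same layout as the "timestamp" column.
raw = pd.Series(["2017-08-01 16:23:56 +0000", "2015-11-15 22:32:08 +0000"])

# The format covers date, time, and the +0000 offset (%z).
parsed = pd.to_datetime(raw, format="%Y-%m-%d %H:%M:%S %z")

# Once parsed, date parts are available through the .dt accessor.
year_month = parsed.dt.strftime("%Y-%m")
```

Because of the `%z` field the result is timezone-aware; whether that is wanted depends on how the column is used later.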
Save the file
# Write the merged table to twitter_archive_master.csv with to_csv
twitter_archive_master.to_csv("twitter_archive_master.csv",index=False)

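Analysis & Visualization

Questions

1. Which dog names are most common?
2. How are the dog ratings distributed?
3. How do ratings, favorite counts, and retweet counts change over time?
4. Is there a relationship between favorite count and rating?
1. Because many names appear only once, we take the 25 most frequent names and their shares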
twitter_master = pd.read_csv("twitter_archive_master.csv")  # load the cleaned, merged csv file
%matplotlib inline
import matplotlib.pyplot as plt

name_counts = twitter_master["name"].value_counts()  # missing names are excluded by default
top25 = name_counts.head(25)

plt.figure()
ax1 = top25.plot(kind="bar",figsize=(12,5),color="#C0C0C0",legend=True,label="Number of dog's name")
# share of each name among all named dogs, drawn on a second (right-hand) y axis
ax2 = (top25 / name_counts.sum()).plot(secondary_y=True,legend=True,label="Ratio of dog's name",mark_right=False,style='r')

plt.title("Dog's name for TOP 25")

ax1.set_ylabel("Number of dog's name")
ax2.set_ylabel("Ratio of dog's name")
plt.gcf().autofmt_xdate()  # slant the x tick labels
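
Counts and shares can also be read off as numbers before plotting. A minimal sketch, assuming a small made-up list of names, that uses `value_counts(normalize=True)` to get each name's share of all non-missing names:

```python
import pandas as pd

# Toy name column; None stands in for tweets where no name was extracted.
names = pd.Series(["Charlie", "Lucy", "Charlie", None, "Cooper", "Lucy", "Charlie"])

# Absolute frequency per name; missing values are dropped by default.
counts = names.value_counts()

# normalize=True divides by the number of non-missing names.
shares = names.value_counts(normalize=True)

top2 = shares.head(2)
```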


Use a word cloud to see which dog names are most common
%matplotlib inline
from wordcloud import WordCloud
from PIL import Image

name = twitter_master.name.dropna()  # drop missing names
dog_mask = np.array(Image.open("timg.jpg"))  # image used as the word cloud mask
wc = WordCloud(background_color="white", max_words=2075, mask=dog_mask)
wc.generate(' '.join(name))

plt.figure(figsize=(12,6))
plt.imshow(wc, interpolation='bilinear')
plt.axis("off")
plt.show()


2. Explore the distribution of dog ratings
twitter_master.rating.describe()
count    1981.000000
mean       11.642797
std        40.772727
min         0.000000
25%        10.000000
50%        11.000000
75%        12.000000
max      1776.000000
Name: rating, dtype: float64
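describe shows a maximum rating of 1776 and a minimum of 0, both of which are outliers. To keep the later analysis meaningful, outliers are removed here with the 1.5 × IQR rule.
Q3 = twitter_master.rating.quantile(0.75)  # 12 in the output above
Q1 = twitter_master.rating.quantile(0.25)  # 10 in the output above
IQR = Q3 - Q1
Max = Q3 + 1.5*IQR
Min = Q1 - 1.5*IQR
# keep ratings strictly inside the fences; .copy() avoids SettingWithCopyWarning on later column assignments
twitter_master = twitter_master[(twitter_master.rating < Max)&(twitter_master.rating > Min)].copy()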
twitter_master.rating.describe()
count    1813.000000
mean       11.036338
std         1.437450
min         7.500000
25%        10.000000
50%        11.000000
75%        12.000000
max        14.000000
Name: rating, dtype: float64
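
The 1.5 × IQR rule used for the rating filter can be wrapped in a small helper so the fences are always derived from the data. A minimal sketch, assuming a hypothetical helper name `iqr_bounds` and a made-up list of ratings:

```python
import pandas as pd

def iqr_bounds(values, k=1.5):
    """Return the (lower, upper) fences of the k * IQR rule for a Series."""
    q1 = values.quantile(0.25)
    q3 = values.quantile(0.75)
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

# Toy ratings with two obvious outliers (0 and 1776).
ratings = pd.Series([0, 7.5, 10, 10, 11, 11, 12, 12, 13, 14, 1776], dtype=float)

low, high = iqr_bounds(ratings)

# Keep only the values strictly inside the fences.
kept = ratings[(ratings > low) & (ratings < high)]
```

Here the quartiles are 10 and 12.5, so the fences are 6.25 and 16.25, and only 0 and 1776 are dropped.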
With the outliers handled, draw a histogram of the ratings with seaborn to see the overall distribution
import seaborn as sns
%matplotlib inline
plt.figure(figsize=(9,5))
ratings = twitter_master.rating.dropna()
# histplot replaces distplot, which is deprecated in recent seaborn releases
sns.histplot(ratings, bins=8, kde=True, stat="density")

sns.kdeplot(ratings, fill=True, color='r')  # fill replaces the deprecated shade argument

sns.rugplot(ratings)
plt.title("Histogram and density diagram for rating")
plt.show()


3. How ratings, favorite counts, and retweet counts change over time
import matplotlib.dates as mdates
%matplotlib inline

# convert "timestamp" to a timezone-naive datetime column
twitter_master["timestamp"] = pd.to_datetime(twitter_master["timestamp"], utc=True).dt.tz_localize(None)

fig,axes = plt.subplots(3,1,sharex=True,sharey=False,figsize=(14,10))  # three stacked plots sharing the x axis

#plot
axes[0].plot(twitter_master["timestamp"],twitter_master['rating'],color="#1F77B4")
axes[1].plot(twitter_master["timestamp"],twitter_master['favorite_count'],color="#FF7F0E")
axes[2].plot(twitter_master["timestamp"],twitter_master['retweet_count'],color="#2CA02C")

# strpdate2num was removed from matplotlib, so pass timestamps to set_xlim directly
axes[0].set_xlim(pd.Timestamp('2016-07-01'),pd.Timestamp('2017-08-01'))  # set the x-axis range
axes[2].xaxis.set_major_locator(mdates.MonthLocator())  # one tick per month
axes[2].xaxis.set_major_formatter(mdates.DateFormatter('%Y-%m'))  # date format of the x tick labels

plt.xticks(rotation=45)  # rotate the date labels by 45 degrees

axes[0].set_title("rating with time")
axes[1].set_title("favorite count with time")
axes[2].set_title("retweet count with time")
plt.xlabel("date")
axes[0].set_ylabel("rating")
axes[1].set_ylabel("favorite count")
axes[2].set_ylabel("retweet count")
plt.subplots_adjust(hspace=0.2)
plt.grid(False)

plt.show()
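
Per-tweet lines can be noisy, so averaging by month is one way to make a trend easier to read. A minimal sketch, assuming a toy frame with invented dates and values, that resamples on the timestamp and takes monthly means:

```python
import pandas as pd

# Toy data; dates and values are invented for illustration only.
toy = pd.DataFrame({
    "timestamp": pd.to_datetime(["2017-01-05", "2017-01-20", "2017-02-03", "2017-02-25"]),
    "rating": [10.0, 12.0, 12.0, 14.0],
    "favorite_count": [100, 300, 500, 700],
})

# resample needs a datetime index; "MS" groups rows by calendar month (month start).
monthly = (toy.set_index("timestamp")
              .resample("MS")[["rating", "favorite_count"]]
              .mean())
```

The resulting `monthly` frame has one row per month and can be plotted the same way as the raw columns.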


4. Scatter plot of favorite count against rating
%matplotlib inline
plt.figure(figsize=(9,5))
# label gives the legend an entry to show
plt.scatter(twitter_master['rating'], twitter_master['favorite_count'], alpha=0.5, c="#17BECF", label="tweet")

plt.title("rating and favorite count")
plt.xlabel("rating")
plt.ylabel("favorite count")
plt.legend(loc='upper left')

plt.show()


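Conclusions

1. Ranked by frequency, the five most common dog names are Charlie, Tucker, Cooper, Lucy, and Oliver.
2. After removing outliers, the dog ratings are concentrated between 10 and 12.
3. In the data from July 2016 to August 2017, ratings below 10 become increasingly rare, and favorite and retweet counts also trend upward, which suggests the dogs are doing better and better.
4. The scatter plot shows that favorite counts tend to rise with rating, which suggests a positive association between the two.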
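
The association in conclusion 4 is read off a scatter plot; a correlation coefficient is one way to put a number on it. A minimal sketch, assuming a toy frame with invented values, that computes Pearson's r and a rank-based (Spearman) coefficient using pandas only:

```python
import pandas as pd

# Toy data; the values are invented for illustration only.
toy = pd.DataFrame({
    "rating": [8.0, 10.0, 11.0, 12.0, 13.0],
    "favorite_count": [500, 1500, 1400, 9000, 30000],
})

# Pearson's r measures linear association.
pearson = toy["rating"].corr(toy["favorite_count"])

# Spearman's rho is Pearson's r computed on ranks, so it is less
# sensitive to the heavy right tail typical of favorite counts.
spearman = toy["rating"].rank().corr(toy["favorite_count"].rank())
```

On real data the two coefficients can differ noticeably when a few tweets have very large favorite counts.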
