Pandas, Part 2

Grouping and Aggregating

import pandas as pd

df = pd.read_csv('data/survey_results_public.csv',index_col='Respondent')
pd.set_option('display.max_columns', 85)
pd.set_option('display.max_rows', 85)
df.head(3)
            MainBranch                                         Hobbyist  ...  SurveyLength           SurveyEase
Respondent
1           I am a student who is learning to code             Yes       ...  Appropriate in length  Neither easy nor difficult
2           I am a student who is learning to code             No        ...  Appropriate in length  Neither easy nor difficult
3           I am not primarily a developer, but I write co...  Yes       ...  Appropriate in length  Neither easy nor difficult

3 rows × 84 columns (columns between Hobbyist and SurveyLength omitted here)
df['ConvertedComp'].head(10)
Respondent
1          NaN
2          NaN
3       8820.0
4      61000.0
5          NaN
6     366420.0
7          NaN
8          NaN
9      95179.0
10     13293.0
Name: ConvertedComp, dtype: float64

Aggregate functions

  • Median salary
df['ConvertedComp'].median()
57287.0
df.median(numeric_only=True)  # restrict the reduction to numeric columns; relying on the old default is deprecated
CompTotal        62000.0
ConvertedComp    57287.0
WorkWeekHrs         40.0
CodeRevHrs           4.0
Age                 29.0
dtype: float64
df.describe()
          CompTotal  ConvertedComp   WorkWeekHrs    CodeRevHrs           Age
count  5.594500e+04   5.582300e+04  64503.000000  49790.000000  79210.000000
mean   5.519014e+11   1.271107e+05     42.127197      5.084308     30.336699
std    7.331926e+13   2.841523e+05     37.287610      5.513931      9.178390
min    0.000000e+00   0.000000e+00      1.000000      0.000000      1.000000
25%    2.000000e+04   2.577750e+04     40.000000      2.000000     24.000000
50%    6.200000e+04   5.728700e+04     40.000000      4.000000     29.000000
75%    1.200000e+05   1.000000e+05     44.750000      6.000000     35.000000
max    1.000000e+16   2.000000e+06   4850.000000     99.000000     99.000000
  • How many respondents reported a salary
df['ConvertedComp'].count()  # count() ignores NaN
55823
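To make the NaN handling above concrete, here is a minimal sketch on a made-up frame. The column names and values are invented for illustration and are not taken from the survey.

```python
import numpy as np
import pandas as pd

# Toy data, invented for illustration only.
toy = pd.DataFrame({
    "salary": [50000.0, np.nan, 70000.0, 90000.0],
    "country": ["A", "B", "A", "B"],
})

median_default = toy["salary"].median()             # NaN is skipped by default
median_strict = toy["salary"].median(skipna=False)  # NaN propagates into the result
numeric_medians = toy.median(numeric_only=True)     # only numeric columns are reduced
answered = toy["salary"].count()                    # number of non-missing salaries

print(median_default, median_strict, answered)
print(numeric_medians)
```

Passing skipna=False is one way to notice that a column has gaps before trusting its median.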
df['Hobbyist']
Respondent
1        Yes
2         No
3        Yes
4         No
5        Yes
        ... 
88377    Yes
88601     No
88802     No
88816     No
88863    Yes
Name: Hobbyist, Length: 88883, dtype: object
  • Count how many respondents answered Yes and how many answered No
df['Hobbyist'].value_counts()
Yes    71257
No     17626
Name: Hobbyist, dtype: int64
df['SocialMedia']
Respondent
1          Twitter
2        Instagram
3           Reddit
4           Reddit
5         Facebook
           ...    
88377      YouTube
88601          NaN
88802          NaN
88816          NaN
88863     WhatsApp
Name: SocialMedia, Length: 88883, dtype: object
schema_df = pd.read_csv('data/survey_results_schema.csv',index_col='Column')
schema_df.loc['SocialMedia','QuestionText']
'What social media site do you use the most?'
df['SocialMedia'].value_counts()
Reddit                      14374
YouTube                     13830
WhatsApp                    13347
Facebook                    13178
Twitter                     11398
Instagram                    6261
I don't use social media     5554
LinkedIn                     4501
WeChat 微信                     667
Snapchat                      628
VK ВКонта́кте                 603
Weibo 新浪微博                     56
Youku Tudou 优酷                 21
Hello                          19
Name: SocialMedia, dtype: int64
df['SocialMedia'].value_counts(normalize=True)  # proportions of the non-missing answers
Reddit                      0.170233
YouTube                     0.163791
WhatsApp                    0.158071
Facebook                    0.156069
Twitter                     0.134988
Instagram                   0.074150
I don't use social media    0.065777
LinkedIn                    0.053306
WeChat 微信                   0.007899
Snapchat                    0.007437
VK ВКонта́кте               0.007141
Weibo 新浪微博                  0.000663
Youku Tudou 优酷              0.000249
Hello                       0.000225
Name: SocialMedia, dtype: float64
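A small sketch of how value_counts treats missing answers, using an invented Series rather than the survey data. By default NaN is left out of both the counts and the denominator used by normalize=True.

```python
import numpy as np
import pandas as pd

# Toy answers, invented for illustration only.
answers = pd.Series(["Reddit", "YouTube", "Reddit", np.nan, "Reddit"])

counts = answers.value_counts()                    # NaN excluded by default
shares = answers.value_counts(normalize=True)      # fractions of the non-missing answers
with_missing = answers.value_counts(dropna=False)  # NaN gets its own entry

print(counts)
print(shares)
print(with_missing)
```

So the proportions shown above describe respondents who answered the question, not all 88883 rows.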

Grouping functions

df['Country'].value_counts()
United States        20949
India                 9061
Germany               5866
United Kingdom        5737
Canada                3395
                     ...  
Tonga                    1
Timor-Leste              1
North Korea              1
Brunei Darussalam        1
Chad                     1
Name: Country, Length: 179, dtype: int64
country_grp = df.groupby('Country')  # a plain string key, so get_group() accepts a plain string too
country_grp.get_group('United States')
            MainBranch                      Hobbyist  ...  SurveyLength           SurveyEase
Respondent
4           I am a developer by profession  No        ...  Appropriate in length  Easy
13          I am a developer by profession  Yes       ...  Appropriate in length  Easy
22          I am a developer by profession  Yes       ...  Appropriate in length  Easy
23          I am a developer by profession  Yes       ...  Appropriate in length  Easy
26          I am a developer by profession  Yes       ...  Appropriate in length  Easy
...         ...                             ...       ...  ...                    ...
78292       NaN                             No        ...  Too long               Neither easy nor difficult
82717       NaN                             No        ...  Appropriate in length  Neither easy nor difficult
83397       NaN                             Yes       ...  Appropriate in length  Easy
85642       NaN                             No        ...  Appropriate in length  Easy
88282       NaN                             Yes       ...  Too short              Neither easy nor difficult

20949 rows × 84 columns

filt = df['Country'] == 'India'
df.loc[filt,'SocialMedia'].value_counts()
WhatsApp                    2990
YouTube                     1820
LinkedIn                     955
Facebook                     841
Instagram                    822
Twitter                      542
Reddit                       473
I don't use social media     250
Snapchat                      23
Hello                          5
WeChat 微信                      5
VK ВКонта́кте                  4
Youku Tudou 优酷                 2
Weibo 新浪微博                     1
Name: SocialMedia, dtype: int64
  • Social media counts for each country
country_grp['SocialMedia'].value_counts()
Country      SocialMedia             
Afghanistan  Facebook                    15
             YouTube                      9
             I don't use social media     6
             WhatsApp                     4
             Instagram                    1
                                         ..
Zimbabwe     Facebook                     3
             YouTube                      3
             Instagram                    2
             LinkedIn                     2
             Reddit                       1
Name: SocialMedia, Length: 1220, dtype: int64
country_grp['SocialMedia'].value_counts().loc['India']
SocialMedia
WhatsApp                    2990
YouTube                     1820
LinkedIn                     955
Facebook                     841
Instagram                    822
Twitter                      542
Reddit                       473
I don't use social media     250
Snapchat                      23
Hello                          5
WeChat 微信                      5
VK ВКонта́кте                  4
Youku Tudou 优酷                 2
Weibo 新浪微博                     1
Name: SocialMedia, dtype: int64
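The grouped value_counts result is a Series with a two-level index (Country, then SocialMedia). The sketch below, on invented data, shows a few ways to pull pieces out of such a result.

```python
import pandas as pd

# Toy data, invented for illustration only.
toy = pd.DataFrame({
    "Country": ["India", "India", "India", "Chad", "Chad"],
    "SocialMedia": ["WhatsApp", "WhatsApp", "YouTube", "Facebook", "Facebook"],
})

per_country = toy.groupby("Country")["SocialMedia"].value_counts()

india = per_country.loc["India"]                   # all sites for one country
one_cell = per_country.loc[("India", "WhatsApp")]  # a single (country, site) pair

# normalize=True gives shares within each country instead of raw counts.
india_share = (
    toy.groupby("Country")["SocialMedia"].value_counts(normalize=True).loc["India"]
)

print(india)
print(one_cell)
print(india_share)
```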
  • Median salary for each country
country_grp['ConvertedComp'].median()
Country
Afghanistan                               6222.0
Albania                                  10818.0
Algeria                                   7878.0
Andorra                                 160931.0
Angola                                    7764.0
                                          ...   
Venezuela, Bolivarian Republic of...      6384.0
Viet Nam                                 11892.0
Yemen                                    11940.0
Zambia                                    5040.0
Zimbabwe                                 19200.0
Name: ConvertedComp, Length: 179, dtype: float64
country_grp['ConvertedComp'].agg(['median','mean'])
                                        median           mean
Country
Afghanistan                             6222.0  101953.333333
Albania                                10818.0   21833.700000
Algeria                                 7878.0   34924.047619
Andorra                               160931.0  160931.000000
Angola                                  7764.0    7764.000000
...                                        ...            ...
Venezuela, Bolivarian Republic of...    6384.0   14581.627907
Viet Nam                               11892.0   17233.436782
Yemen                                  11940.0   16909.166667
Zambia                                  5040.0   10075.375000
Zimbabwe                               19200.0   34046.666667

179 rows × 2 columns
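Rows such as Andorra and Angola, where the median equals the mean, are consistent with very few reported salaries. One way to check, sketched here on invented data, is to add 'count' to the list passed to agg so each statistic sits next to the number of observations behind it.

```python
import numpy as np
import pandas as pd

# Toy data, invented for illustration only.
toy = pd.DataFrame({
    "Country": ["A", "A", "A", "B", "B"],
    "ConvertedComp": [10.0, 20.0, 90.0, np.nan, 40.0],
})

# 'count' reports the non-missing salaries per country.
summary = toy.groupby("Country")["ConvertedComp"].agg(["median", "mean", "count"])
print(summary)
```

In this toy frame country B has a single salary, so its median and mean coincide.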

filt = df['Country'] == 'India'
df.loc[filt,'LanguageWorkedWith'].str.contains('Python')
Respondent
8         True
10        True
15       False
50        True
65       False
         ...  
77339    False
79795     True
83862    False
84299     True
86012    False
Name: LanguageWorkedWith, Length: 9061, dtype: object
df.loc[filt,'LanguageWorkedWith'].str.contains('Python').sum()  # number of respondents in India who use Python
3105
  • Number of Python users in each country
country_grp['LanguageWorkedWith'].apply(lambda x:x.str.contains('Python').sum())  # x is the Series for one country, like the India selection above
Country
Afghanistan                              8
Albania                                 23
Algeria                                 40
Andorra                                  0
Angola                                   2
                                        ..
Venezuela, Bolivarian Republic of...    28
Viet Nam                                78
Yemen                                    3
Zambia                                   4
Zimbabwe                                14
Name: LanguageWorkedWith, Length: 179, dtype: int64
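The per-group apply can be tried on a few invented rows to see what the lambda receives. This sketch also passes na=False, which is an addition to the original call: it makes a missing answer count as "does not use Python" and keeps the intermediate result a clean boolean Series.

```python
import numpy as np
import pandas as pd

# Toy data, invented for illustration only.
toy = pd.DataFrame({
    "Country": ["India", "India", "India", "Chad"],
    "LanguageWorkedWith": ["Python;SQL", "Java", np.nan, "C;Python"],
})

grp = toy.groupby("Country")["LanguageWorkedWith"]

# Each group arrives in the lambda as a Series holding one country's answers.
python_users = grp.apply(lambda s: s.str.contains("Python", na=False).sum())
print(python_users)
```

Without na=False the sum is the same here, because sum() skips the NaN produced for a missing answer.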
country_respondents = df['Country'].value_counts()
country_respondents
United States        20949
India                 9061
Germany               5866
United Kingdom        5737
Canada                3395
                     ...  
Tonga                    1
Timor-Leste              1
North Korea              1
Brunei Darussalam        1
Chad                     1
Name: Country, Length: 179, dtype: int64
country_use_python = country_grp['LanguageWorkedWith'].apply(lambda x:x.str.contains('Python').sum())
country_use_python
Country
Afghanistan                              8
Albania                                 23
Algeria                                 40
Andorra                                  0
Angola                                   2
                                        ..
Venezuela, Bolivarian Republic of...    28
Viet Nam                                78
Yemen                                    3
Zambia                                   4
Zimbabwe                                14
Name: LanguageWorkedWith, Length: 179, dtype: int64

concat

python_df = pd.concat([country_respondents,country_use_python],axis='columns')
python_df
                   Country  LanguageWorkedWith
United States        20949               10083
India                 9061                3105
Germany               5866                2451
United Kingdom        5737                2384
Canada                3395                1558
...                    ...                 ...
Tonga                    1                   0
Timor-Leste              1                   1
North Korea              1                   0
Brunei Darussalam        1                   0
Chad                     1                   0

179 rows × 2 columns

python_df.rename(columns={'Country':'NumRespondents','LanguageWorkedWith':'NumKnowPython'},inplace=True)
python_df
                   NumRespondents  NumKnowPython
United States               20949          10083
India                        9061           3105
Germany                      5866           2451
United Kingdom               5737           2384
Canada                       3395           1558
...                           ...            ...
Tonga                           1              0
Timor-Leste                     1              1
North Korea                     1              0
Brunei Darussalam               1              0
Chad                            1              0

179 rows × 2 columns

python_df['PctKnowPython'] = (python_df['NumKnowPython']/python_df['NumRespondents']) * 100
python_df
                   NumRespondents  NumKnowPython  PctKnowPython
United States               20949          10083      48.131176
India                        9061           3105      34.267741
Germany                      5866           2451      41.783157
United Kingdom               5737           2384      41.554820
Canada                       3395           1558      45.891016
...                           ...            ...            ...
Tonga                           1              0       0.000000
Timor-Leste                     1              1     100.000000
North Korea                     1              0       0.000000
Brunei Darussalam               1              0       0.000000
Chad                            1              0       0.000000

179 rows × 3 columns

python_df.sort_values(by='PctKnowPython',ascending=False,inplace=True)
python_df
                                  NumRespondents  NumKnowPython  PctKnowPython
Sao Tome and Principe                          1              1     100.000000
Timor-Leste                                    1              1     100.000000
Dominica                                       1              1     100.000000
Niger                                          1              1     100.000000
Turkmenistan                                   7              6      85.714286
...                                          ...            ...            ...
Cape Verde                                     3              0       0.000000
Lao People's Democratic Republic               3              0       0.000000
Malawi                                         2              0       0.000000
Liberia                                        2              0       0.000000
Chad                                           1              0       0.000000

179 rows × 3 columns
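The same three columns can also be produced without concat, by grouping a boolean Series: size() gives the respondents, sum() the Python users, and mean() the share. This is an alternative sketch on invented data, not the approach used above. The minimum-size filter at the end is likewise an assumption, added because the sorted table is topped by countries with a single respondent.

```python
import numpy as np
import pandas as pd

# Toy data, invented for illustration only.
toy = pd.DataFrame({
    "Country": ["India", "India", "India", "India", "Chad"],
    "LanguageWorkedWith": ["Python;SQL", "Java", np.nan, "C;Python", "Java"],
})

# True where the answer mentions Python; missing answers count as False.
knows_python = toy["LanguageWorkedWith"].str.contains("Python", na=False)

by_country = knows_python.groupby(toy["Country"])
python_pct = pd.DataFrame({
    "NumRespondents": by_country.size(),
    "NumKnowPython": by_country.sum(),
    "PctKnowPython": by_country.mean() * 100,
}).sort_values(by="PctKnowPython", ascending=False)

# Tiny groups give extreme percentages, so a minimum size can be applied.
min_respondents = 2  # arbitrary threshold chosen for this toy example
reliable = python_pct[python_pct["NumRespondents"] >= min_respondents]

print(python_pct)
print(reliable)
```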

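Notes

count(), median(), and sum() are aggregation methods, so they must be called with parentheses.

count() ignores missing values.

On a boolean Series, sum() counts the True values, because True is treated as 1 and False as 0.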
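A short sketch of these three points on an invented Series:

```python
import numpy as np
import pandas as pd

s = pd.Series([1.0, np.nan, 3.0])

method_object = s.count  # no parentheses: just the bound method, nothing is computed
non_missing = s.count()  # parentheses: the call runs and NaN is not counted
all_entries = s.size     # size is an attribute and includes NaN

flags = s > 2            # a comparison against NaN evaluates to False
num_true = flags.sum()   # True counts as 1, False as 0

print(non_missing, all_entries, num_true)
```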

Data cleaning

import pandas as pd
import numpy as np
people = {
    "first": ["Corey", 'Jane', 'John','Chris',np.nan,None,'NA'], 
    "last": ["Schafer", 'Doe', 'Doe','Schafer',np.nan,np.nan,'Missing'], 
    "email": ["CoreyMSchafer@gmail.com", 'JaneDoe@email.com', 'JohnDoe@email.com',None,np.nan,'Anony@email.com','NA'],
    'age':['33','55','63','36',None,None,'Missing']
}
df = pd.DataFrame(people)
df.replace('NA',np.nan,inplace=True)
df.replace('Missing',np.nan,inplace=True)
df
   first     last                    email   age
0  Corey  Schafer  CoreyMSchafer@gmail.com    33
1   Jane      Doe        JaneDoe@email.com    55
2   John      Doe        JohnDoe@email.com    63
3  Chris  Schafer                     None    36
4    NaN      NaN                      NaN  None
5   None      NaN          Anony@email.com  None
6    NaN      NaN                      NaN   NaN
df.dropna()  # drops rows by default
   first     last                    email age
0  Corey  Schafer  CoreyMSchafer@gmail.com  33
1   Jane      Doe        JaneDoe@email.com  55
2   John      Doe        JohnDoe@email.com  63
  • Drop rows with missing values
df.dropna(axis='index',how='any')  # axis='index' and how='any' are the defaults; 'any' drops a row if at least one value is missing
   first     last                    email age
0  Corey  Schafer  CoreyMSchafer@gmail.com  33
1   Jane      Doe        JaneDoe@email.com  55
2   John      Doe        JohnDoe@email.com  63
df.dropna(axis='index',how='all')  # drop a row only if every value in it is missing
   first     last                    email   age
0  Corey  Schafer  CoreyMSchafer@gmail.com    33
1   Jane      Doe        JaneDoe@email.com    55
2   John      Doe        JohnDoe@email.com    63
3  Chris  Schafer                     None    36
5   None      NaN          Anony@email.com  None
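  • Drop columns with missing values
df.dropna(axis='columns',how='all')  # drop a column only if every value in it is missing; none qualify here
   first     last                    email   age
0  Corey  Schafer  CoreyMSchafer@gmail.com    33
1   Jane      Doe        JaneDoe@email.com    55
2   John      Doe        JohnDoe@email.com    63
3  Chris  Schafer                     None    36
4    NaN      NaN                      NaN  None
5   None      NaN          Anony@email.com  None
6    NaN      NaN                      NaN   NaN
df.dropna(axis='columns',how='any')  # every column has at least one missing value, so all of them are dropped
Empty DataFrame
Columns: []
Index: [0, 1, 2, 3, 4, 5, 6]
df
   first     last                    email   age
0  Corey  Schafer  CoreyMSchafer@gmail.com    33
1   Jane      Doe        JaneDoe@email.com    55
2   John      Doe        JohnDoe@email.com    63
3  Chris  Schafer                     None    36
4    NaN      NaN                      NaN  None
5   None      NaN          Anony@email.com  None
6    NaN      NaN                      NaN   NaN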
  • Drop rows with missing values in the last or email column
df.dropna(axis='index', how='any', subset=['last', 'email'])  # drop a row if at least one of last and email is missing
   first     last                    email age
0  Corey  Schafer  CoreyMSchafer@gmail.com  33
1   Jane      Doe        JaneDoe@email.com  55
2   John      Doe        JohnDoe@email.com  63
df.dropna(axis='index', how='all', subset=['last', 'email'])  # drop a row only if both last and email are missing
   first     last                    email   age
0  Corey  Schafer  CoreyMSchafer@gmail.com    33
1   Jane      Doe        JaneDoe@email.com    55
2   John      Doe        JohnDoe@email.com    63
3  Chris  Schafer                     None    36
5   None      NaN          Anony@email.com  None
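
The interplay of how and subset is easier to check on a very small frame. The sketch below uses a made-up three-row table (the name toy and its values are illustrative assumptions, not part of the survey data) and compares which rows survive under how='any' and how='all'.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: row 1 lacks only an email, row 2 lacks both fields.
toy = pd.DataFrame({
    "last": ["Schafer", "Doe", np.nan],
    "email": ["a@example.com", np.nan, np.nan],
})

# how='any': drop a row if at least one of the subset columns is missing.
any_missing = toy.dropna(axis="index", how="any", subset=["last", "email"])

# how='all': drop a row only if every subset column is missing.
all_missing = toy.dropna(axis="index", how="all", subset=["last", "email"])

print(list(any_missing.index))  # [0]
print(list(all_missing.index))  # [0, 1]
```

Neither call changes toy itself, because inplace is left at its default of False.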
  • Check which values are missing
df.isna()
   first   last  email    age
0  False  False  False  False
1  False  False  False  False
2  False  False  False  False
3  False  False   True  False
4   True   True   True   True
5   True   True  False   True
6   True   True   True   True
  • Fill missing values
df.fillna('MISSING')
     first     last                    email      age
0    Corey  Schafer  CoreyMSchafer@gmail.com       33
1     Jane      Doe        JaneDoe@email.com       55
2     John      Doe        JohnDoe@email.com       63
3    Chris  Schafer                  MISSING       36
4  MISSING  MISSING                  MISSING  MISSING
5  MISSING  MISSING          Anony@email.com  MISSING
6  MISSING  MISSING                  MISSING  MISSING
df.fillna(0)
   first     last                    email age
0  Corey  Schafer  CoreyMSchafer@gmail.com  33
1   Jane      Doe        JaneDoe@email.com  55
2   John      Doe        JohnDoe@email.com  63
3  Chris  Schafer                        0  36
4      0        0                        0   0
5      0        0          Anony@email.com   0
6      0        0                        0   0
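
A single fill value such as 0 also ends up in text columns like email, which is rarely what is wanted. One possible alternative, sketched here on a made-up two-column frame (toy and its values are assumptions for illustration), is to pass fillna a dict that maps each column to its own placeholder.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data with one missing value per column.
toy = pd.DataFrame({
    "email": ["a@example.com", np.nan],
    "age": [33.0, np.nan],
})

# A dict fills each listed column with its own value; other columns are left alone.
filled = toy.fillna({"email": "MISSING", "age": 0})

print(filled.loc[1, "email"])  # MISSING
print(filled.loc[1, "age"])    # 0.0
```

As with the calls above, the result is a new DataFrame and toy keeps its missing values.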
  • Inspect the data types
df.dtypes
first    object
last     object
email    object
age      object
dtype: object
  • Convert the dtype of the age column
df['age'] = df['age'].astype(int)
---------------------------------------------------------------------------

TypeError                                 Traceback (most recent call last)

<ipython-input-14-9b0df8191b9d> in <module>
----> 1 df['age'] = df['age'].astype(int)


TypeError: int() argument must be a string, a bytes-like object or a number, not 'NoneType'
type(np.nan)
float
df['age'] = df['age'].astype(float)
df.dtypes
first     object
last      object
email     object
age      float64
dtype: object
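
The conversion to float works because NaN is itself a float. If whole-number ages are preferred, one option (an alternative to the astype(float) call above, not something the original notebook does) is pd.to_numeric followed by the nullable Int64 dtype. The Series ages below is made up for illustration.

```python
import pandas as pd

# Hypothetical column of numeric strings with one missing entry.
ages = pd.Series(["33", "55", None, "63"], dtype=object)

# to_numeric turns the strings into numbers; the missing entry becomes NaN (float64).
as_float = pd.to_numeric(ages, errors="coerce")

# The nullable Int64 dtype keeps integers and stores the gap as <NA>.
as_nullable_int = as_float.astype("Int64")

print(as_float.dtype)         # float64
print(as_nullable_int.dtype)  # Int64
```

Note the capital "I" in "Int64": the lowercase "int64" is the plain NumPy dtype, which cannot hold missing values.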

  • Declare custom missing-value markers when reading
na_vals = ['NA','Missing']
df = pd.read_csv('data/survey_results_public.csv',index_col='Respondent',na_values=na_vals)  # na_values: extra strings to treat as NaN while reading
schema_df = pd.read_csv('data/survey_results_schema.csv',index_col='Column')
pd.set_option('display.max_columns', 85)
pd.set_option('display.max_rows', 85)
df.head(3)
MainBranchHobbyistOpenSourcerOpenSourceEmploymentCountryStudentEdLevelUndergradMajorEduOtherOrgSizeDevTypeYearsCodeAge1stCodeYearsCodeProCareerSatJobSatMgrIdiotMgrMoneyMgrWantJobSeekLastHireDateLastIntFizzBuzzJobFactorsResumeUpdateCurrencySymbolCurrencyDescCompTotalCompFreqConvertedCompWorkWeekHrsWorkPlanWorkChallengeWorkRemoteWorkLocImpSynCodeRevCodeRevHrsUnitTestsPurchaseHowPurchaseWhatLanguageWorkedWithLanguageDesireNextYearDatabaseWorkedWithDatabaseDesireNextYearPlatformWorkedWithPlatformDesireNextYearWebFrameWorkedWithWebFrameDesireNextYearMiscTechWorkedWithMiscTechDesireNextYearDevEnvironOpSysContainersBlockchainOrgBlockchainIsBetterLifeITpersonOffOnSocialMediaExtraversionScreenNameSOVisit1stSOVisitFreqSOVisitToSOFindAnswerSOTimeSavedSOHowMuchTimeSOAccountSOPartFreqSOJobsEntTeamsSOCommWelcomeChangeSONewContentAgeGenderTransSexualityEthnicityDependentsSurveyLengthSurveyEase
Respondent
1I am a student who is learning to codeYesNeverThe quality of OSS and closed source software ...Not employed, and not looking for workUnited KingdomNoPrimary/elementary schoolNaNTaught yourself a new language, framework, or ...NaNNaN410NaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNHTML/CSS;Java;JavaScript;PythonC;C++;C#;Go;HTML/CSS;Java;JavaScript;Python;SQLSQLiteMySQLMacOS;WindowsAndroid;Arduino;WindowsDjango;FlaskFlask;jQueryNode.jsNode.jsIntelliJ;Notepad++;PyCharmWindowsI do not use containersNaNNaNYesFortunately, someone else has that titleYesTwitterOnlineUsername2017A few times per month or weeklyFind answers to specific questions;Learn how t...3-5 times per weekStack Overflow was much faster31-60 minutesNoNaNNo, I didn't know that Stack Overflow had a jo...No, and I don't know what those areNeutralJust as welcome now as I felt last yearTech articles written by other developers;Indu...14.0ManNoStraight / HeterosexualNaNNoAppropriate in lengthNeither easy nor difficult
2I am a student who is learning to codeNoLess than once per yearThe quality of OSS and closed source software ...Not employed, but looking for workBosnia and HerzegovinaYes, full-timeSecondary school (e.g. American high school, G...NaNTaken an online course in programming or softw...NaNDeveloper, desktop or enterprise applications;...NaN17NaNNaNNaNNaNNaNNaNI am actively looking for a jobI've never had a jobNaNNaNFinancial performance or funding status of the...Something else changed (education, award, medi...NaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNC++;HTML/CSS;PythonC++;HTML/CSS;JavaScript;SQLNaNMySQLWindowsWindowsDjangoDjangoNaNNaNAtom;PyCharmWindowsI do not use containersNaNUseful across many domains and could change ma...YesYesYesInstagramOnlineUsername2017Daily or almost dailyFind answers to specific questions;Learn how t...3-5 times per weekStack Overflow was much faster11-30 minutesYesA few times per month or weeklyNo, I knew that Stack Overflow had a job board...No, and I don't know what those areYes, somewhatJust as welcome now as I felt last yearTech articles written by other developers;Indu...19.0ManNoStraight / HeterosexualNaNNoAppropriate in lengthNeither easy nor difficult
3I am not primarily a developer, but I write co...YesNeverThe quality of OSS and closed source software ...Employed full-timeThailandNoBachelor’s degree (BA, BS, B.Eng., etc.)Web development or web designTaught yourself a new language, framework, or ...100 to 499 employeesDesigner;Developer, back-end;Developer, front-...3221Slightly satisfiedSlightly satisfiedNot at all confidentNot sureNot sureI’m not actively looking, but I am open to new...1-2 years agoInterview with people in peer rolesNoLanguages, frameworks, and other technologies ...I was preparing for a job searchTHBThai baht23000.0Monthly8820.040.0There's no schedule or spec; I work on what se...Distracting work environment;Inadequate access...Less than once per month / NeverHomeAverageNoNaNNo, but I think we shouldNot sureI have little or no influenceHTML/CSSElixir;HTML/CSSPostgreSQLPostgreSQLNaNNaNNaNOther(s):NaNNaNVim;Visual Studio CodeLinux-basedI do not use containersNaNNaNYesYesYesRedditIn real life (in person)Username2011A few times per weekFind answers to specific questions;Learn how t...6-10 times per weekThey were about the sameNaNYesLess than once per month or monthlyYesNo, I've heard of them, but I am not part of a...NeutralJust as welcome now as I felt last yearTech meetups or events in your area;Courses on...28.0ManNoStraight / HeterosexualNaNYesAppropriate in lengthNeither easy nor difficult
df['YearsCode'].head(10)
Respondent
1       4
2     NaN
3       3
4       3
5      16
6      13
7       6
8       8
9      12
10     12
Name: YearsCode, dtype: object
df['YearsCode'] = df['YearsCode'].astype(float)#
---------------------------------------------------------------------------

ValueError                                Traceback (most recent call last)

<ipython-input-21-245fa41f666c> in <module>
----> 1 df['YearsCode'] = df['YearsCode'].astype(float)#


ValueError: could not convert string to float: 'Less than 1 year'
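float('3')  # 3.0
float('A')  # ValueError: could not convert string to float: 'A'

The conversion fails for the same reason that float('A') does: the string does not represent a number.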

  • Inspect the unique values in the column
df['YearsCode'].unique()
array(['4', nan, '3', '16', '13', '6', '8', '12', '2', '5', '17', '10',
       '14', '35', '7', 'Less than 1 year', '30', '9', '26', '40', '19',
       '15', '20', '28', '25', '1', '22', '11', '33', '50', '41', '18',
       '34', '24', '23', '42', '27', '21', '36', '32', '39', '38', '31',
       '37', 'More than 50 years', '29', '44', '45', '48', '46', '43',
       '47', '49'], dtype=object)
df['YearsCode'] = df['YearsCode'].replace({'Less than 1 year':0,'More than 50 years':51})  # assign back instead of calling inplace=True on a selected column
df['YearsCode'].unique()
array(['4', nan, '3', '16', '13', '6', '8', '12', '2', '5', '17', '10',
       '14', '35', '7', 0, '30', '9', '26', '40', '19', '15', '20', '28',
       '25', '1', '22', '11', '33', '50', '41', '18', '34', '24', '23',
       '42', '27', '21', '36', '32', '39', '38', '31', '37', 51, '29',
       '44', '45', '48', '46', '43', '47', '49'], dtype=object)
df['YearsCode'] = df['YearsCode'].astype(float)
df['YearsCode'].mean()
11.662114216834588
df['YearsCode'].median()
9.0
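
The same clean-up can be replayed on a handful of values to see each step. This is a minimal sketch on a made-up sample (years is an assumption; the two sentinel strings are the ones found in YearsCode above).

```python
import numpy as np
import pandas as pd

# Hypothetical sample mixing numeric strings, a missing value, and the two text answers.
years = pd.Series(
    ["4", np.nan, "Less than 1 year", "More than 50 years", "12"], dtype=object
)

# Map the text answers to numbers first, then convert the whole column to float.
cleaned = years.replace({"Less than 1 year": 0, "More than 50 years": 51}).astype(float)

print(cleaned.mean())    # 16.75  (NaN is skipped)
print(cleaned.median())  # 8.0
```

Mapping "More than 50 years" to 51 is a modeling choice carried over from the notebook; it understates anyone with considerably more experience.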

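Summary

The how parameter of df.dropna():

any: drop the row as soon as one value is missing

all: drop the row only when all of its values are missing

drop, fillna, and replace all take an inplace parameter. It defaults to False because a change made in place, especially a deletion, is hard to undo.

Functions that only add data, such as concat, have no such parameter.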
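
The difference between the default behavior and inplace=True can be seen on a small frame. The sketch below uses a made-up one-column table (toy is an illustrative assumption).

```python
import numpy as np
import pandas as pd

toy = pd.DataFrame({"age": [33.0, np.nan, 63.0]})

# Default (inplace=False): a new object is returned and the original is untouched.
dropped = toy.dropna()
print(len(toy), len(dropped))  # 3 2

# inplace=True: the original is modified and the call itself returns None.
result = toy.dropna(inplace=True)
print(result, len(toy))        # None 2

# pd.concat has no inplace option; it always returns a new object.
combined = pd.concat([toy, toy], ignore_index=True)
print(len(combined))           # 4
```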

Time Series Data

import pandas as pd
df = pd.read_csv('E:/Pandas/data/ETH_1h.csv')
df.head()
               Date  Symbol    Open    High     Low   Close      Volume
0  2020-03-13 08-PM  ETHUSD  129.94  131.82  126.87  128.71  1940673.93
1  2020-03-13 07-PM  ETHUSD  119.51  132.02  117.10  129.94  7579741.09
2  2020-03-13 06-PM  ETHUSD  124.47  124.85  115.50  119.51  4898735.81
3  2020-03-13 05-PM  ETHUSD  124.08  127.42  121.63  124.47  2753450.92
4  2020-03-13 04-PM  ETHUSD  124.85  129.51  120.17  124.08  4461424.71
  • Convert the Date column to a datetime dtype
df['Date'] = pd.to_datetime(df['Date'],format='%Y-%m-%d %I-%p')
df['Date']
0       2020-03-13 20:00:00
1       2020-03-13 19:00:00
2       2020-03-13 18:00:00
3       2020-03-13 17:00:00
4       2020-03-13 16:00:00
                ...        
23669   2017-07-01 15:00:00
23670   2017-07-01 14:00:00
23671   2017-07-01 13:00:00
23672   2017-07-01 12:00:00
23673   2017-07-01 11:00:00
Name: Date, Length: 23674, dtype: datetime64[ns]
  • Get the day of the week for a given date
df.loc[0,'Date'].day_name()
'Friday'
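
The format string can be tried out without the CSV file. The sketch below parses two strings written in the file's style (the first and last timestamps shown above) and is only meant to show how %I and %p combine into a 24-hour time.

```python
import pandas as pd

# Two timestamps in the same layout as the Date column of ETH_1h.csv.
raw = pd.Series(["2020-03-13 08-PM", "2017-07-01 11-AM"])

# %I is the 12-hour clock hour and %p is AM/PM, so "08-PM" becomes 20:00.
parsed = pd.to_datetime(raw, format="%Y-%m-%d %I-%p")

print(parsed[0])             # 2020-03-13 20:00:00
print(parsed[0].day_name())  # Friday
print(parsed[1].day_name())  # Saturday
```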
  • Convert the Date column to datetime while reading the file
from datetime import datetime
d_parser = lambda x: datetime.strptime(x, '%Y-%m-%d %I-%p')  # pd.datetime is deprecated; use the datetime module instead
df = pd.read_csv('E:/Pandas/data/ETH_1h.csv', parse_dates=['Date'], date_parser=d_parser)
# parse_dates: the columns to parse as dates
# date_parser: the function that converts each string to a datetime
# (pandas 2.0 and later deprecate date_parser in favor of date_format='%Y-%m-%d %I-%p')
df.head()
                 Date  Symbol    Open    High     Low   Close      Volume
0 2020-03-13 20:00:00  ETHUSD  129.94  131.82  126.87  128.71  1940673.93
1 2020-03-13 19:00:00  ETHUSD  119.51  132.02  117.10  129.94  7579741.09
2 2020-03-13 18:00:00  ETHUSD  124.47  124.85  115.50  119.51  4898735.81
3 2020-03-13 17:00:00  ETHUSD  124.08  127.42  121.63  124.47  2753450.92
4 2020-03-13 16:00:00  ETHUSD  124.85  129.51  120.17  124.08  4461424.71
df['Date'].dt.day_name()  # a datetime Series exposes the .dt accessor, much as a string Series exposes .str
0          Friday
1          Friday
2          Friday
3          Friday
4          Friday
           ...   
23669    Saturday
23670    Saturday
23671    Saturday
23672    Saturday
23673    Saturday
Name: Date, Length: 23674, dtype: object
df['DayOfWeek'] = df['Date'].dt.day_name()  # add a day-of-week column
df
                     Date  Symbol    Open    High     Low   Close      Volume DayOfWeek
0     2020-03-13 20:00:00  ETHUSD  129.94  131.82  126.87  128.71  1940673.93    Friday
1     2020-03-13 19:00:00  ETHUSD  119.51  132.02  117.10  129.94  7579741.09    Friday
2     2020-03-13 18:00:00  ETHUSD  124.47  124.85  115.50  119.51  4898735.81    Friday
3     2020-03-13 17:00:00  ETHUSD  124.08  127.42  121.63  124.47  2753450.92    Friday
4     2020-03-13 16:00:00  ETHUSD  124.85  129.51  120.17  124.08  4461424.71    Friday
...                   ...     ...     ...     ...     ...     ...         ...       ...
23669 2017-07-01 15:00:00  ETHUSD  265.74  272.74  265.00  272.57  1500282.55  Saturday
23670 2017-07-01 14:00:00  ETHUSD  268.79  269.90  265.00  265.74  1702536.85  Saturday
23671 2017-07-01 13:00:00  ETHUSD  274.83  274.93  265.00  268.79  3010787.99  Saturday
23672 2017-07-01 12:00:00  ETHUSD  275.01  275.01  271.00  274.83   824362.87  Saturday
23673 2017-07-01 11:00:00  ETHUSD  279.98  279.99  272.10  275.01   679358.87  Saturday

23674 rows × 8 columns

  • Find the earliest and latest dates
df['Date'].min()
Timestamp('2017-07-01 11:00:00')
df['Date'].max()
Timestamp('2020-03-13 20:00:00')
  • Compute the time span
df['Date'].max() - df['Date'].min()
Timedelta('986 days 09:00:00')
  • Filter by date
filt = (df['Date'] >= '2019') & (df['Date'] < '2020')  # and: &; or: |; not: ~
df.loc[filt]
                     Date  Symbol    Open    High     Low   Close      Volume DayOfWeek
1749  2019-12-31 23:00:00  ETHUSD  128.33  128.69  128.14  128.54   440678.91   Tuesday
1750  2019-12-31 22:00:00  ETHUSD  128.38  128.69  127.95  128.33   554646.02   Tuesday
1751  2019-12-31 21:00:00  ETHUSD  127.86  128.43  127.72  128.38   350155.69   Tuesday
1752  2019-12-31 20:00:00  ETHUSD  127.84  128.34  127.71  127.86   428183.38   Tuesday
1753  2019-12-31 19:00:00  ETHUSD  128.69  128.69  127.60  127.84  1169847.84   Tuesday
...                   ...     ...     ...     ...     ...     ...         ...       ...
10504 2019-01-01 04:00:00  ETHUSD  130.75  133.96  130.74  131.96  2791135.37   Tuesday
10505 2019-01-01 03:00:00  ETHUSD  130.06  130.79  130.06  130.75   503732.63   Tuesday
10506 2019-01-01 02:00:00  ETHUSD  130.79  130.88  129.55  130.06   838183.43   Tuesday
10507 2019-01-01 01:00:00  ETHUSD  131.62  131.62  130.77  130.79   434917.99   Tuesday
10508 2019-01-01 00:00:00  ETHUSD  130.53  131.91  130.48  131.62  1067136.21   Tuesday

8760 rows × 8 columns

filt = (df['Date'] >= pd.to_datetime('2019-01-01')) & (df['Date'] < pd.to_datetime('2020-01-01'))  # and: &; or: |; not: ~
df.loc[filt]
                     Date  Symbol    Open    High     Low   Close      Volume DayOfWeek
1749  2019-12-31 23:00:00  ETHUSD  128.33  128.69  128.14  128.54   440678.91   Tuesday
1750  2019-12-31 22:00:00  ETHUSD  128.38  128.69  127.95  128.33   554646.02   Tuesday
1751  2019-12-31 21:00:00  ETHUSD  127.86  128.43  127.72  128.38   350155.69   Tuesday
1752  2019-12-31 20:00:00  ETHUSD  127.84  128.34  127.71  127.86   428183.38   Tuesday
1753  2019-12-31 19:00:00  ETHUSD  128.69  128.69  127.60  127.84  1169847.84   Tuesday
...                   ...     ...     ...     ...     ...     ...         ...       ...
10504 2019-01-01 04:00:00  ETHUSD  130.75  133.96  130.74  131.96  2791135.37   Tuesday
10505 2019-01-01 03:00:00  ETHUSD  130.06  130.79  130.06  130.75   503732.63   Tuesday
10506 2019-01-01 02:00:00  ETHUSD  130.79  130.88  129.55  130.06   838183.43   Tuesday
10507 2019-01-01 01:00:00  ETHUSD  131.62  131.62  130.77  130.79   434917.99   Tuesday
10508 2019-01-01 00:00:00  ETHUSD  130.53  131.91  130.48  131.62  1067136.21   Tuesday

8760 rows × 8 columns
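
Boolean masks are combined with the element-wise operators & (and), | (or), and ~ (not), each comparison wrapped in parentheses. A minimal sketch on four made-up timestamps (dates is an illustrative assumption) shows all three:

```python
import pandas as pd

# Hypothetical timestamps straddling both ends of 2019.
dates = pd.Series(pd.to_datetime([
    "2018-12-31 23:00", "2019-01-01 00:00",
    "2019-12-31 23:00", "2020-01-01 00:00",
]))

in_2019 = (dates >= "2019-01-01") & (dates < "2020-01-01")    # and
outside_2019 = ~in_2019                                        # not
edge_years = (dates < "2019-01-01") | (dates >= "2020-01-01")  # or

print(in_2019.tolist())       # [False, True, True, False]
print(outside_2019.tolist())  # [True, False, False, True]
```

Python's own and, or, and not keywords do not work here, because they try to reduce a whole Series to a single truth value.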

  • Setting the date as the index gives the same result with simpler syntax
df.set_index('Date',inplace=True)
df
                     Symbol    Open    High     Low   Close      Volume DayOfWeek
Date
2020-03-13 20:00:00  ETHUSD  129.94  131.82  126.87  128.71  1940673.93    Friday
2020-03-13 19:00:00  ETHUSD  119.51  132.02  117.10  129.94  7579741.09    Friday
2020-03-13 18:00:00  ETHUSD  124.47  124.85  115.50  119.51  4898735.81    Friday
2020-03-13 17:00:00  ETHUSD  124.08  127.42  121.63  124.47  2753450.92    Friday
2020-03-13 16:00:00  ETHUSD  124.85  129.51  120.17  124.08  4461424.71    Friday
...                     ...     ...     ...     ...     ...         ...       ...
2017-07-01 15:00:00  ETHUSD  265.74  272.74  265.00  272.57  1500282.55  Saturday
2017-07-01 14:00:00  ETHUSD  268.79  269.90  265.00  265.74  1702536.85  Saturday
2017-07-01 13:00:00  ETHUSD  274.83  274.93  265.00  268.79  3010787.99  Saturday
2017-07-01 12:00:00  ETHUSD  275.01  275.01  271.00  274.83   824362.87  Saturday
2017-07-01 11:00:00  ETHUSD  279.98  279.99  272.10  275.01   679358.87  Saturday

23674 rows × 7 columns

df.loc['2019']
                     Symbol    Open    High     Low   Close      Volume DayOfWeek
Date
2019-12-31 23:00:00  ETHUSD  128.33  128.69  128.14  128.54   440678.91   Tuesday
2019-12-31 22:00:00  ETHUSD  128.38  128.69  127.95  128.33   554646.02   Tuesday
2019-12-31 21:00:00  ETHUSD  127.86  128.43  127.72  128.38   350155.69   Tuesday
2019-12-31 20:00:00  ETHUSD  127.84  128.34  127.71  127.86   428183.38   Tuesday
2019-12-31 19:00:00  ETHUSD  128.69  128.69  127.60  127.84  1169847.84   Tuesday
...                     ...     ...     ...     ...     ...         ...       ...
2019-01-01 04:00:00  ETHUSD  130.75  133.96  130.74  131.96  2791135.37   Tuesday
2019-01-01 03:00:00  ETHUSD  130.06  130.79  130.06  130.75   503732.63   Tuesday
2019-01-01 02:00:00  ETHUSD  130.79  130.88  129.55  130.06   838183.43   Tuesday
2019-01-01 01:00:00  ETHUSD  131.62  131.62  130.77  130.79   434917.99   Tuesday
2019-01-01 00:00:00  ETHUSD  130.53  131.91  130.48  131.62  1067136.21   Tuesday

8760 rows × 7 columns

df.loc['2020-01':'2020-02']
                     Symbol    Open    High     Low   Close      Volume  DayOfWeek
Date
2020-02-29 23:00:00  ETHUSD  223.35  223.58  216.83  217.31  1927939.88   Saturday
2020-02-29 22:00:00  ETHUSD  223.48  223.59  222.14  223.35   535998.57   Saturday
2020-02-29 21:00:00  ETHUSD  224.63  225.14  222.74  223.48   561158.03   Saturday
2020-02-29 20:00:00  ETHUSD  225.31  225.33  223.50  224.63   511648.65   Saturday
2020-02-29 19:00:00  ETHUSD  225.09  225.85  223.87  225.31  1250856.20   Saturday
...                     ...     ...     ...     ...     ...         ...        ...
2020-01-01 04:00:00  ETHUSD  129.57  130.00  129.50  129.56   702786.82  Wednesday
2020-01-01 03:00:00  ETHUSD  130.37  130.44  129.38  129.57   496704.23  Wednesday
2020-01-01 02:00:00  ETHUSD  130.14  130.50  129.91  130.37   396315.72  Wednesday
2020-01-01 01:00:00  ETHUSD  128.34  130.14  128.32  130.14   635419.40  Wednesday
2020-01-01 00:00:00  ETHUSD  128.54  128.54  128.12  128.34   245119.91  Wednesday

1440 rows × 7 columns
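
With a DatetimeIndex, .loc accepts partial date strings: a year or a year-month selects every timestamp that falls inside it, and a slice includes both end periods in full. The sketch below uses a made-up four-row Series (toy is an assumption) to show where the boundaries fall.

```python
import pandas as pd

# Hypothetical series with timestamps on both sides of the Jan-Feb 2020 window.
idx = pd.to_datetime([
    "2019-12-31 23:00", "2020-01-01 00:00",
    "2020-02-29 23:00", "2020-03-01 00:00",
])
toy = pd.Series([1, 2, 3, 4], index=idx)

in_2020 = toy.loc["2020"]               # everything in the year 2020
jan_feb = toy.loc["2020-01":"2020-02"]  # through the end of February, inclusive

print(in_2020.tolist())  # [2, 3, 4]
print(jan_feb.tolist())  # [2, 3]
```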

df.loc['2020-01':'2020-02']['Close'].mean()
195.16559027777814
df.loc['2020-01-01']['High'].max()
132.68
df['High'].resample('D').max()
Date
2017-07-01    279.99
2017-07-02    293.73
2017-07-03    285.00
2017-07-04    282.83
2017-07-05    274.97
               ...  
2020-03-09    208.65
2020-03-10    206.28
2020-03-11    202.98
2020-03-12    195.64
2020-03-13    148.00
Freq: D, Name: High, Length: 987, dtype: float64
highs = df['High'].resample('D').max()
highs['2020-01-01']
132.68
%matplotlib inline
highs.plot()

[Figure: line plot of the daily High values produced by highs.plot()]

df.resample('W').mean()  # on pandas 2.0 and later, pass numeric_only=True; otherwise the text columns raise a TypeError
                  Open        High         Low       Close        Volume
Date
2017-07-02  268.066486  271.124595  264.819730  268.202162  2.185035e+06
2017-07-09  261.337024  262.872917  259.186190  261.062083  1.337349e+06
2017-07-16  196.193214  199.204405  192.722321  195.698393  2.986756e+06
2017-07-23  212.351429  215.779286  209.126310  212.783750  4.298593e+06
2017-07-30  203.496190  205.110357  201.714048  203.309524  1.581729e+06
...                ...         ...         ...         ...           ...
2020-02-16  255.021667  257.255238  252.679762  255.198452  2.329087e+06
2020-02-23  265.220833  267.263690  262.948512  265.321905  1.826094e+06
2020-03-01  236.720536  238.697500  234.208750  236.373988  2.198762e+06
2020-03-08  229.923571  231.284583  228.373810  229.817619  1.628910e+06
2020-03-15  176.937521  179.979487  172.936239  176.332821  4.259828e+06

142 rows × 5 columns

  • Apply a different aggregation function to each column
df.resample('W').agg({'Close':'mean','High':'max','Low':'min','Volume':'sum'})
                 Close    High     Low        Volume
Date
2017-07-02  268.202162  293.73  253.23  8.084631e+07
2017-07-09  261.062083  285.00  231.25  2.246746e+08
2017-07-16  195.698393  240.33  130.26  5.017750e+08
2017-07-23  212.783750  249.40  153.25  7.221637e+08
2017-07-30  203.309524  229.99  178.03  2.657305e+08
...                ...     ...     ...           ...
2020-02-16  255.198452  290.00  216.31  3.912867e+08
2020-02-23  265.321905  287.13  242.36  3.067838e+08
2020-03-01  236.373988  278.13  209.26  3.693920e+08
2020-03-08  229.817619  253.01  196.00  2.736569e+08
2020-03-15  176.332821  208.65   90.00  4.983998e+08

142 rows × 4 columns
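
Per-column aggregation is easy to verify by hand on synthetic data. The sketch below builds 48 hourly rows with made-up values (toy, its column values, and the two-day range are assumptions) and resamples them to daily rows.

```python
import pandas as pd

# 48 hourly timestamps covering 2020-01-01 and 2020-01-02.
idx = pd.date_range("2020-01-01", periods=48, freq=pd.Timedelta(hours=1))

# Hypothetical data: High counts up from 0 to 47, Volume is 1.0 every hour.
toy = pd.DataFrame({"High": range(48), "Volume": [1.0] * 48}, index=idx)

# Each column gets its own aggregation within every daily bin.
daily = toy.resample("D").agg({"High": "max", "Volume": "sum"})

print(daily["High"].tolist())    # [23, 47]
print(daily["Volume"].tolist())  # [24.0, 24.0]
```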

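Summary

resample is similar to groupby, except that it groups by time: for example, one group per day or one group per week.
The dt accessor of a Series exposes attributes such as year, month, and day.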
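
Both points can be checked on a small example. The sketch below uses three made-up rows (toy and its values are assumptions) to read the dt attributes and to compare a daily resample with a groupby on the calendar date.

```python
import pandas as pd

# Hypothetical rows on two consecutive days, deliberately not in sorted order.
toy = pd.DataFrame({
    "Date": pd.to_datetime(["2020-03-13 20:00", "2020-03-13 19:00", "2020-03-14 11:00"]),
    "High": [3.0, 5.0, 7.0],
})

print(toy["Date"].dt.year.tolist())   # [2020, 2020, 2020]
print(toy["Date"].dt.month.tolist())  # [3, 3, 3]
print(toy["Date"].dt.day.tolist())    # [13, 13, 14]

# resample needs a datetime index; groupby can use the dt accessor directly.
by_resample = toy.set_index("Date")["High"].resample("D").max()
by_groupby = toy.groupby(toy["Date"].dt.date)["High"].max()

print(by_resample.tolist())  # [5.0, 7.0]
print(by_groupby.tolist())   # [5.0, 7.0]
```

The two results agree here because the days are consecutive. When a day has no rows, resample still emits that day with NaN, while groupby simply has no group for it.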

Reading and Writing Data

df = pd.read_csv('data/survey_results_public.csv',index_col='Respondent',na_values=na_vals)  # na_values: extra strings to treat as NaN while reading
schema_df = pd.read_csv('data/survey_results_schema.csv',index_col='Column')
pd.set_option('display.max_columns', 85)
pd.set_option('display.max_rows', 85)
df.head(3)
filt = (df['Country']=='India')
India_df = df.loc[filt]
India_df.head()
MainBranchHobbyistOpenSourcerOpenSourceEmploymentCountryStudentEdLevelUndergradMajorEduOtherOrgSizeDevTypeYearsCodeAge1stCodeYearsCodeProCareerSatJobSatMgrIdiotMgrMoneyMgrWantJobSeekLastHireDateLastIntFizzBuzzJobFactorsResumeUpdateCurrencySymbolCurrencyDescCompTotalCompFreqConvertedCompWorkWeekHrsWorkPlanWorkChallengeWorkRemoteWorkLocImpSynCodeRevCodeRevHrsUnitTestsPurchaseHowPurchaseWhatLanguageWorkedWithLanguageDesireNextYearDatabaseWorkedWithDatabaseDesireNextYearPlatformWorkedWithPlatformDesireNextYearWebFrameWorkedWithWebFrameDesireNextYearMiscTechWorkedWithMiscTechDesireNextYearDevEnvironOpSysContainersBlockchainOrgBlockchainIsBetterLifeITpersonOffOnSocialMediaExtraversionScreenNameSOVisit1stSOVisitFreqSOVisitToSOFindAnswerSOTimeSavedSOHowMuchTimeSOAccountSOPartFreqSOJobsEntTeamsSOCommWelcomeChangeSONewContentAgeGenderTransSexualityEthnicityDependentsSurveyLengthSurveyEase
Respondent
8I code primarily as a hobbyYesLess than once per yearOSS is, on average, of HIGHER quality than pro...Not employed, but looking for workIndiaNaNBachelor’s degree (BA, BS, B.Eng., etc.)Computer science, computer engineering, or sof...Taught yourself a new language, framework, or ...NaNDeveloper, back-end;Engineer, site reliability816NaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNBash/Shell/PowerShell;C;C++;HTML/CSS;Java;Java...Bash/Shell/PowerShell;C;C++;Elixir;Erlang;Go;P...Cassandra;Elasticsearch;MongoDB;MySQL;Oracle;R...Cassandra;DynamoDB;Elasticsearch;Firebase;Mong...AWS;Docker;Heroku;Linux;MacOS;SlackAndroid;Arduino;AWS;Docker;Google Cloud Platfo...Express;Flask;React.js;SpringDjango;Express;Flask;React.js;Vue.jsHadoop;Node.js;PandasAnsible;Apache Spark;Chef;Hadoop;Node.js;Panda...Atom;IntelliJ;IPython / Jupyter;PyCharm;Visual...Linux-basedDevelopment;Testing;Production;Outside of work...NaNUseful across many domains and could change ma...YesSIGHYesYouTubeIn real life (in person)Handle2012A few times per weekFind answers to specific questions;Learn how t...Less than once per weekStack Overflow was slightly faster11-30 minutesYesLess than once per month or monthlyYesNo, and I don't know what those areYes, definitelyA lot more welcome now than last yearTech articles written by other developers;Indu...24.0ManNoStraight / HeterosexualNaNNaNAppropriate in lengthNeither easy nor difficult
10I am a developer by professionYesOnce a month or more oftenOSS is, on average, of HIGHER quality than pro...Employed full-timeIndiaNoMaster’s degree (MA, MS, M.Eng., MBA, etc.)NaNNaN10,000 or more employeesData or business analyst;Data scientist or mac...122010Slightly dissatisfiedSlightly dissatisfiedSomewhat confidentYesYesI’m not actively looking, but I am open to new...3-4 years agoNaNNoLanguages, frameworks, and other technologies ...NaNINRIndian rupee950000.0Yearly13293.070.0There's no schedule or spec; I work on what se...NaNA few days each monthHomeFar above averageYes, because I see value in code review4.0Yes, it's part of our processNaNNaNC#;Go;JavaScript;Python;R;SQLC#;Go;JavaScript;Kotlin;Python;R;SQLElasticsearch;MongoDB;Microsoft SQL Server;MyS...Elasticsearch;MongoDB;Microsoft SQL ServerLinux;WindowsAndroid;Linux;Raspberry Pi;WindowsAngular/Angular.js;ASP.NET;Django;Express;Flas...Angular/Angular.js;ASP.NET;Django;Express;Flas....NET;Node.js;Pandas;Torch/PyTorch.NET;Node.js;TensorFlow;Torch/PyTorchAndroid Studio;Eclipse;IPython / Jupyter;Notep...WindowsNaNNot at allUseful for immutable record keeping outside of...NoYesYesYouTubeNeitherScreen NameNaNMultiple times per dayFind answers to specific questions;Get a sense...3-5 times per weekThey were about the sameNaNYesA few times per month or weeklyYesNo, and I don't know what those areYes, somewhatSomewhat less welcome now than last yearTech articles written by other developers;Tech...NaNNaNNaNNaNNaNYesToo longDifficult
15I am a student who is learning to codeYesNeverOSS is, on average, of HIGHER quality than pro...Not employed, but looking for workIndiaYes, full-timeSecondary school (e.g. American high school, G...NaNTaken an online course in programming or softw...NaNStudent313NaNNaNNaNNaNNaNNaNI’m not actively looking, but I am open to new...I've never had a jobNaNNaNIndustry that I'd be working in;Languages, fra...Something else changed (education, award, medi...NaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNAssembly;Bash/Shell/PowerShell;C;C++;HTML/CSS;...Assembly;Bash/Shell/PowerShell;C;C++;C#;Go;HTM...MariaDB;MySQL;Oracle;SQLiteMariaDB;MongoDB;Microsoft SQL Server;MySQL;Ora...Linux;WindowsAndroid;Google Cloud Platform;iOS;Linux;MacOS;...NaNAngular/Angular.js;ASP.NET;Django;Drupal;jQuer...NaN.NET;.NET Core;Node.js;TensorFlow;Unity 3D;Unr...Atom;NetBeans;Notepad++;Sublime Text;VimLinux-basedDevelopmentNaNNaNYesYesWhat?YouTubeIn real life (in person)NaN2018Daily or almost dailyFind answers to specific questions;Learn how t...More than 10 times per weekThey were about the sameNaNYesLess than once per month or monthlyYesNo, I've heard of them, but I am not part of a...Yes, somewhatJust as welcome now as I felt last yearTech articles written by other developers;Indu...20.0ManNoNaNNaNYesToo longNeither easy nor difficult
50I am a developer by professionYesOnce a month or more oftenOSS is, on average, of LOWER quality than prop...Employed full-timeIndiaNoBachelor’s degree (BA, BS, B.Eng., etc.)Another engineering discipline (ex. civil, ele...Received on-the-job training in software devel...10,000 or more employeesDeveloper, back-end;DevOps specialist7152Slightly satisfiedVery satisfiedVery confidentNot sureYesI’m not actively looking, but I am open to new...1-2 years agoWrite code by hand (e.g., on a whiteboard);Int...NoSpecific department or team I'd be working on;...I was preparing for a job searchINRIndian rupee400000.0Yearly5597.07.0There is a schedule and/or spec (made by me or...Meetings;Time spent commutingLess than once per month / NeverOther place, such as a coworking space or cafeAverageNoNaNYes, it's not part of our process but the deve...The CTO, CIO, or other management purchase new...I have little or no influenceBash/Shell/PowerShell;C;C++;HTML/CSS;Java;Java...HTML/CSS;JavaScript;PythonElasticsearch;Firebase;MariaDB;MongoDB;MySQL;O...Firebase;PostgreSQL;Redis;Other(s):Arduino;AWS;Heroku;Linux;MacOS;Raspberry Pi;Wo...AWS;Docker;Heroku;Kubernetes;Linux;MacOS;WordP...Django;Express;Flask;jQueryExpress;Flask;jQuery;React.js;Vue.jsNode.jsNode.jsNotepad++;Visual Studio CodeMacOSTestingNot at allUseful for immutable record keeping outside of...YesAlso YesWhat?YouTubeIn real life (in person)Username2012Daily or almost dailyFind answers to specific questions;Learn how t...3-5 times per weekStack Overflow was slightly faster11-30 minutesYesLess than once per month or monthlyNo, I knew that Stack Overflow had a job board...No, and I don't know what those areYes, definitelyJust as welcome now as I felt last yearTech articles written by other developers;Tech...23.0ManNoNaNSouth AsianNoToo longEasy
65I am a developer by professionYesNeverNaNEmployed full-timeIndiaNoBachelor’s degree (BA, BS, B.Eng., etc.)Information systems, information technology, o...NaN20 to 99 employeesDeveloper, front-end;Developer, mobile2172Very satisfiedVery satisfiedVery confidentNoNot sureI’m not actively looking, but I am open to new...Less than a year agoWrite any code;Solve a brain-teaser style puzz...NoLanguages, frameworks, and other technologies ...My job status changed (promotion, new job, etc.)INRIndian rupeeNaNMonthlyNaN48.0There's no schedule or spec; I work on what se...NaNAbout half the timeOfficeAverageYes, because I see value in code reviewNaNYes, it's not part of our process but the deve...Not sureNaNAssembly;C;C++;C#;HTML/CSS;JavaKotlinFirebase;MySQL;Oracle;SQLiteFirebase;SQLiteAndroidAndroidASP.NETNaNNaNNaNAndroid Studio;IntelliJLinux-basedNaNNaNNaNYesYesWhat?WhatsAppIn real life (in person)NaN2017Multiple times per dayFind answers to specific questionsMore than 10 times per weekStack Overflow was slightly faster11-30 minutesYesA few times per weekNo, I knew that Stack Overflow had a job board...No, and I don't know what those areNot sureA lot more welcome now than last yearNaN21.0ManNoNaNNaNYesAppropriate in lengthNeither easy nor difficult
India_df.to_csv('data/modified.csv')
India_df.to_csv('data/modified.tsv',sep='\t')
India_df.to_excel('data/modified.xlsx')
test = pd.read_excel('data/modified.xlsx',index_col='Respondent') 
test.head()
MainBranchHobbyistOpenSourcerOpenSourceEmploymentCountryStudentEdLevelUndergradMajorEduOtherOrgSizeDevTypeYearsCodeAge1stCodeYearsCodeProCareerSatJobSatMgrIdiotMgrMoneyMgrWantJobSeekLastHireDateLastIntFizzBuzzJobFactorsResumeUpdateCurrencySymbolCurrencyDescCompTotalCompFreqConvertedCompWorkWeekHrsWorkPlanWorkChallengeWorkRemoteWorkLocImpSynCodeRevCodeRevHrsUnitTestsPurchaseHowPurchaseWhatLanguageWorkedWithLanguageDesireNextYearDatabaseWorkedWithDatabaseDesireNextYearPlatformWorkedWithPlatformDesireNextYearWebFrameWorkedWithWebFrameDesireNextYearMiscTechWorkedWithMiscTechDesireNextYearDevEnvironOpSysContainersBlockchainOrgBlockchainIsBetterLifeITpersonOffOnSocialMediaExtraversionScreenNameSOVisit1stSOVisitFreqSOVisitToSOFindAnswerSOTimeSavedSOHowMuchTimeSOAccountSOPartFreqSOJobsEntTeamsSOCommWelcomeChangeSONewContentAgeGenderTransSexualityEthnicityDependentsSurveyLengthSurveyEase
Respondent
8I code primarily as a hobbyYesLess than once per yearOSS is, on average, of HIGHER quality than pro...Not employed, but looking for workIndiaNaNBachelor’s degree (BA, BS, B.Eng., etc.)Computer science, computer engineering, or sof...Taught yourself a new language, framework, or ...NaNDeveloper, back-end;Engineer, site reliability816NaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNBash/Shell/PowerShell;C;C++;HTML/CSS;Java;Java...Bash/Shell/PowerShell;C;C++;Elixir;Erlang;Go;P...Cassandra;Elasticsearch;MongoDB;MySQL;Oracle;R...Cassandra;DynamoDB;Elasticsearch;Firebase;Mong...AWS;Docker;Heroku;Linux;MacOS;SlackAndroid;Arduino;AWS;Docker;Google Cloud Platfo...Express;Flask;React.js;SpringDjango;Express;Flask;React.js;Vue.jsHadoop;Node.js;PandasAnsible;Apache Spark;Chef;Hadoop;Node.js;Panda...Atom;IntelliJ;IPython / Jupyter;PyCharm;Visual...Linux-basedDevelopment;Testing;Production;Outside of work...NaNUseful across many domains and could change ma...YesSIGHYesYouTubeIn real life (in person)Handle2012A few times per weekFind answers to specific questions;Learn how t...Less than once per weekStack Overflow was slightly faster11-30 minutesYesLess than once per month or monthlyYesNo, and I don't know what those areYes, definitelyA lot more welcome now than last yearTech articles written by other developers;Indu...24.0ManNoStraight / HeterosexualNaNNaNAppropriate in lengthNeither easy nor difficult
10I am a developer by professionYesOnce a month or more oftenOSS is, on average, of HIGHER quality than pro...Employed full-timeIndiaNoMaster’s degree (MA, MS, M.Eng., MBA, etc.)NaNNaN10,000 or more employeesData or business analyst;Data scientist or mac...122010Slightly dissatisfiedSlightly dissatisfiedSomewhat confidentYesYesI’m not actively looking, but I am open to new...3-4 years agoNaNNoLanguages, frameworks, and other technologies ...NaNINRIndian rupee950000.0Yearly13293.070.0There's no schedule or spec; I work on what se...NaNA few days each monthHomeFar above averageYes, because I see value in code review4.0Yes, it's part of our processNaNNaNC#;Go;JavaScript;Python;R;SQLC#;Go;JavaScript;Kotlin;Python;R;SQLElasticsearch;MongoDB;Microsoft SQL Server;MyS...Elasticsearch;MongoDB;Microsoft SQL ServerLinux;WindowsAndroid;Linux;Raspberry Pi;WindowsAngular/Angular.js;ASP.NET;Django;Express;Flas...Angular/Angular.js;ASP.NET;Django;Express;Flas....NET;Node.js;Pandas;Torch/PyTorch.NET;Node.js;TensorFlow;Torch/PyTorchAndroid Studio;Eclipse;IPython / Jupyter;Notep...WindowsNaNNot at allUseful for immutable record keeping outside of...NoYesYesYouTubeNeitherScreen NameNaNMultiple times per dayFind answers to specific questions;Get a sense...3-5 times per weekThey were about the sameNaNYesA few times per month or weeklyYesNo, and I don't know what those areYes, somewhatSomewhat less welcome now than last yearTech articles written by other developers;Tech...NaNNaNNaNNaNNaNYesToo longDifficult
15I am a student who is learning to codeYesNeverOSS is, on average, of HIGHER quality than pro...Not employed, but looking for workIndiaYes, full-timeSecondary school (e.g. American high school, G...NaNTaken an online course in programming or softw...NaNStudent313NaNNaNNaNNaNNaNNaNI’m not actively looking, but I am open to new...I've never had a jobNaNNaNIndustry that I'd be working in;Languages, fra...Something else changed (education, award, medi...NaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNAssembly;Bash/Shell/PowerShell;C;C++;HTML/CSS;...Assembly;Bash/Shell/PowerShell;C;C++;C#;Go;HTM...MariaDB;MySQL;Oracle;SQLiteMariaDB;MongoDB;Microsoft SQL Server;MySQL;Ora...Linux;WindowsAndroid;Google Cloud Platform;iOS;Linux;MacOS;...NaNAngular/Angular.js;ASP.NET;Django;Drupal;jQuer...NaN.NET;.NET Core;Node.js;TensorFlow;Unity 3D;Unr...Atom;NetBeans;Notepad++;Sublime Text;VimLinux-basedDevelopmentNaNNaNYesYesWhat?YouTubeIn real life (in person)NaN2018Daily or almost dailyFind answers to specific questions;Learn how t...More than 10 times per weekThey were about the sameNaNYesLess than once per month or monthlyYesNo, I've heard of them, but I am not part of a...Yes, somewhatJust as welcome now as I felt last yearTech articles written by other developers;Indu...20.0ManNoNaNNaNYesToo longNeither easy nor difficult
50I am a developer by professionYesOnce a month or more oftenOSS is, on average, of LOWER quality than prop...Employed full-timeIndiaNoBachelor’s degree (BA, BS, B.Eng., etc.)Another engineering discipline (ex. civil, ele...Received on-the-job training in software devel...10,000 or more employeesDeveloper, back-end;DevOps specialist7152Slightly satisfiedVery satisfiedVery confidentNot sureYesI’m not actively looking, but I am open to new...1-2 years agoWrite code by hand (e.g., on a whiteboard);Int...NoSpecific department or team I'd be working on;...I was preparing for a job searchINRIndian rupee400000.0Yearly5597.07.0There is a schedule and/or spec (made by me or...Meetings;Time spent commutingLess than once per month / NeverOther place, such as a coworking space or cafeAverageNoNaNYes, it's not part of our process but the deve...The CTO, CIO, or other management purchase new...I have little or no influenceBash/Shell/PowerShell;C;C++;HTML/CSS;Java;Java...HTML/CSS;JavaScript;PythonElasticsearch;Firebase;MariaDB;MongoDB;MySQL;O...Firebase;PostgreSQL;Redis;Other(s):Arduino;AWS;Heroku;Linux;MacOS;Raspberry Pi;Wo...AWS;Docker;Heroku;Kubernetes;Linux;MacOS;WordP...Django;Express;Flask;jQueryExpress;Flask;jQuery;React.js;Vue.jsNode.jsNode.jsNotepad++;Visual Studio CodeMacOSTestingNot at allUseful for immutable record keeping outside of...YesAlso YesWhat?YouTubeIn real life (in person)Username2012Daily or almost dailyFind answers to specific questions;Learn how t...3-5 times per weekStack Overflow was slightly faster11-30 minutesYesLess than once per month or monthlyNo, I knew that Stack Overflow had a job board...No, and I don't know what those areYes, definitelyJust as welcome now as I felt last yearTech articles written by other developers;Tech...23.0ManNoNaNSouth AsianNoToo longEasy
65I am a developer by professionYesNeverNaNEmployed full-timeIndiaNoBachelor’s degree (BA, BS, B.Eng., etc.)Information systems, information technology, o...NaN20 to 99 employeesDeveloper, front-end;Developer, mobile2172Very satisfiedVery satisfiedVery confidentNoNot sureI’m not actively looking, but I am open to new...Less than a year agoWrite any code;Solve a brain-teaser style puzz...NoLanguages, frameworks, and other technologies ...My job status changed (promotion, new job, etc.)INRIndian rupeeNaNMonthlyNaN48.0There's no schedule or spec; I work on what se...NaNAbout half the timeOfficeAverageYes, because I see value in code reviewNaNYes, it's not part of our process but the deve...Not sureNaNAssembly;C;C++;C#;HTML/CSS;JavaKotlinFirebase;MySQL;Oracle;SQLiteFirebase;SQLiteAndroidAndroidASP.NETNaNNaNNaNAndroid Studio;IntelliJLinux-basedNaNNaNNaNYesYesWhat?WhatsAppIn real life (in person)NaN2017Multiple times per dayFind answers to specific questionsMore than 10 times per weekStack Overflow was slightly faster11-30 minutesYesA few times per weekNo, I knew that Stack Overflow had a job board...No, and I don't know what those areNot sureA lot more welcome now than last yearNaN21.0ManNoNaNNaNYesAppropriate in lengthNeither easy nor difficult
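The TSV file written above with sep='\t' is never read back in this notebook. A minimal sketch of the round trip, assuming a tiny made-up frame (`demo_df`, not the survey data) and an in-memory buffer instead of a file on disk:

```python
import io
import pandas as pd

# Small stand-in frame; column names mirror the survey data, values are illustrative only.
demo_df = pd.DataFrame(
    {"Country": ["India", "India"], "ConvertedComp": [13293.0, 5597.0]},
    index=pd.Index([10, 50], name="Respondent"),
)

# Write tab-separated text to an in-memory buffer instead of a file.
buffer = io.StringIO()
demo_df.to_csv(buffer, sep="\t")

# Read it back with the same separator and restore the index column.
buffer.seek(0)
tsv_df = pd.read_csv(buffer, sep="\t", index_col="Respondent")
print(tsv_df)
```

The same two arguments (sep='\t' and index_col='Respondent') should apply when reading 'data/modified.tsv' from disk.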
India_df.to_json('data/modified.json',orient='records',lines=True)
test = pd.read_json('data/modified.json',orient='records',lines=True)
test.head()
MainBranchHobbyistOpenSourcerOpenSourceEmploymentCountryStudentEdLevelUndergradMajorEduOtherOrgSizeDevTypeYearsCodeAge1stCodeYearsCodeProCareerSatJobSatMgrIdiotMgrMoneyMgrWantJobSeekLastHireDateLastIntFizzBuzzJobFactorsResumeUpdateCurrencySymbolCurrencyDescCompTotalCompFreqConvertedCompWorkWeekHrsWorkPlanWorkChallengeWorkRemoteWorkLocImpSynCodeRevCodeRevHrsUnitTestsPurchaseHowPurchaseWhatLanguageWorkedWithLanguageDesireNextYearDatabaseWorkedWithDatabaseDesireNextYearPlatformWorkedWithPlatformDesireNextYearWebFrameWorkedWithWebFrameDesireNextYearMiscTechWorkedWithMiscTechDesireNextYearDevEnvironOpSysContainersBlockchainOrgBlockchainIsBetterLifeITpersonOffOnSocialMediaExtraversionScreenNameSOVisit1stSOVisitFreqSOVisitToSOFindAnswerSOTimeSavedSOHowMuchTimeSOAccountSOPartFreqSOJobsEntTeamsSOCommWelcomeChangeSONewContentAgeGenderTransSexualityEthnicityDependentsSurveyLengthSurveyEase
0I code primarily as a hobbyYesLess than once per yearOSS is, on average, of HIGHER quality than pro...Not employed, but looking for workIndiaNoneBachelor’s degree (BA, BS, B.Eng., etc.)Computer science, computer engineering, or sof...Taught yourself a new language, framework, or ...NoneDeveloper, back-end;Engineer, site reliability816NoneNoneNoneNoneNoneNoneNoneNoneNoneNoneNoneNoneNoneNoneNaNNoneNaNNaNNoneNoneNoneNoneNoneNoneNaNNoneNoneNoneBash/Shell/PowerShell;C;C++;HTML/CSS;Java;Java...Bash/Shell/PowerShell;C;C++;Elixir;Erlang;Go;P...Cassandra;Elasticsearch;MongoDB;MySQL;Oracle;R...Cassandra;DynamoDB;Elasticsearch;Firebase;Mong...AWS;Docker;Heroku;Linux;MacOS;SlackAndroid;Arduino;AWS;Docker;Google Cloud Platfo...Express;Flask;React.js;SpringDjango;Express;Flask;React.js;Vue.jsHadoop;Node.js;PandasAnsible;Apache Spark;Chef;Hadoop;Node.js;Panda...Atom;IntelliJ;IPython / Jupyter;PyCharm;Visual...Linux-basedDevelopment;Testing;Production;Outside of work...NoneUseful across many domains and could change ma...YesSIGHYesYouTubeIn real life (in person)Handle2012A few times per weekFind answers to specific questions;Learn how t...Less than once per weekStack Overflow was slightly faster11-30 minutesYesLess than once per month or monthlyYesNo, and I don't know what those areYes, definitelyA lot more welcome now than last yearTech articles written by other developers;Indu...24.0ManNoStraight / HeterosexualNoneNoneAppropriate in lengthNeither easy nor difficult
1I am a developer by professionYesOnce a month or more oftenOSS is, on average, of HIGHER quality than pro...Employed full-timeIndiaNoMaster’s degree (MA, MS, M.Eng., MBA, etc.)NoneNone10,000 or more employeesData or business analyst;Data scientist or mac...122010Slightly dissatisfiedSlightly dissatisfiedSomewhat confidentYesYesI’m not actively looking, but I am open to new...3-4 years agoNoneNoLanguages, frameworks, and other technologies ...NoneINRIndian rupee950000.0Yearly13293.070.0There's no schedule or spec; I work on what se...NoneA few days each monthHomeFar above averageYes, because I see value in code review4.0Yes, it's part of our processNoneNoneC#;Go;JavaScript;Python;R;SQLC#;Go;JavaScript;Kotlin;Python;R;SQLElasticsearch;MongoDB;Microsoft SQL Server;MyS...Elasticsearch;MongoDB;Microsoft SQL ServerLinux;WindowsAndroid;Linux;Raspberry Pi;WindowsAngular/Angular.js;ASP.NET;Django;Express;Flas...Angular/Angular.js;ASP.NET;Django;Express;Flas....NET;Node.js;Pandas;Torch/PyTorch.NET;Node.js;TensorFlow;Torch/PyTorchAndroid Studio;Eclipse;IPython / Jupyter;Notep...WindowsNoneNot at allUseful for immutable record keeping outside of...NoYesYesYouTubeNeitherScreen NameNoneMultiple times per dayFind answers to specific questions;Get a sense...3-5 times per weekThey were about the sameNoneYesA few times per month or weeklyYesNo, and I don't know what those areYes, somewhatSomewhat less welcome now than last yearTech articles written by other developers;Tech...NaNNoneNoneNoneNoneYesToo longDifficult
2I am a student who is learning to codeYesNeverOSS is, on average, of HIGHER quality than pro...Not employed, but looking for workIndiaYes, full-timeSecondary school (e.g. American high school, G...NoneTaken an online course in programming or softw...NoneStudent313NoneNoneNoneNoneNoneNoneI’m not actively looking, but I am open to new...I've never had a jobNoneNoneIndustry that I'd be working in;Languages, fra...Something else changed (education, award, medi...NoneNoneNaNNoneNaNNaNNoneNoneNoneNoneNoneNoneNaNNoneNoneNoneAssembly;Bash/Shell/PowerShell;C;C++;HTML/CSS;...Assembly;Bash/Shell/PowerShell;C;C++;C#;Go;HTM...MariaDB;MySQL;Oracle;SQLiteMariaDB;MongoDB;Microsoft SQL Server;MySQL;Ora...Linux;WindowsAndroid;Google Cloud Platform;iOS;Linux;MacOS;...NoneAngular/Angular.js;ASP.NET;Django;Drupal;jQuer...None.NET;.NET Core;Node.js;TensorFlow;Unity 3D;Unr...Atom;NetBeans;Notepad++;Sublime Text;VimLinux-basedDevelopmentNoneNoneYesYesWhat?YouTubeIn real life (in person)None2018Daily or almost dailyFind answers to specific questions;Learn how t...More than 10 times per weekThey were about the sameNoneYesLess than once per month or monthlyYesNo, I've heard of them, but I am not part of a...Yes, somewhatJust as welcome now as I felt last yearTech articles written by other developers;Indu...20.0ManNoNoneNoneYesToo longNeither easy nor difficult
3I am a developer by professionYesOnce a month or more oftenOSS is, on average, of LOWER quality than prop...Employed full-timeIndiaNoBachelor’s degree (BA, BS, B.Eng., etc.)Another engineering discipline (ex. civil, ele...Received on-the-job training in software devel...10,000 or more employeesDeveloper, back-end;DevOps specialist7152Slightly satisfiedVery satisfiedVery confidentNot sureYesI’m not actively looking, but I am open to new...1-2 years agoWrite code by hand (e.g., on a whiteboard);Int...NoSpecific department or team I'd be working on;...I was preparing for a job searchINRIndian rupee400000.0Yearly5597.07.0There is a schedule and/or spec (made by me or...Meetings;Time spent commutingLess than once per month / NeverOther place, such as a coworking space or cafeAverageNoNaNYes, it's not part of our process but the deve...The CTO, CIO, or other management purchase new...I have little or no influenceBash/Shell/PowerShell;C;C++;HTML/CSS;Java;Java...HTML/CSS;JavaScript;PythonElasticsearch;Firebase;MariaDB;MongoDB;MySQL;O...Firebase;PostgreSQL;Redis;Other(s):Arduino;AWS;Heroku;Linux;MacOS;Raspberry Pi;Wo...AWS;Docker;Heroku;Kubernetes;Linux;MacOS;WordP...Django;Express;Flask;jQueryExpress;Flask;jQuery;React.js;Vue.jsNode.jsNode.jsNotepad++;Visual Studio CodeMacOSTestingNot at allUseful for immutable record keeping outside of...YesAlso YesWhat?YouTubeIn real life (in person)Username2012Daily or almost dailyFind answers to specific questions;Learn how t...3-5 times per weekStack Overflow was slightly faster11-30 minutesYesLess than once per month or monthlyNo, I knew that Stack Overflow had a job board...No, and I don't know what those areYes, definitelyJust as welcome now as I felt last yearTech articles written by other developers;Tech...23.0ManNoNoneSouth AsianNoToo longEasy
4I am a developer by professionYesNeverNoneEmployed full-timeIndiaNoBachelor’s degree (BA, BS, B.Eng., etc.)Information systems, information technology, o...None20 to 99 employeesDeveloper, front-end;Developer, mobile2172Very satisfiedVery satisfiedVery confidentNoNot sureI’m not actively looking, but I am open to new...Less than a year agoWrite any code;Solve a brain-teaser style puzz...NoLanguages, frameworks, and other technologies ...My job status changed (promotion, new job, etc.)INRIndian rupeeNaNMonthlyNaN48.0There's no schedule or spec; I work on what se...NoneAbout half the timeOfficeAverageYes, because I see value in code reviewNaNYes, it's not part of our process but the deve...Not sureNoneAssembly;C;C++;C#;HTML/CSS;JavaKotlinFirebase;MySQL;Oracle;SQLiteFirebase;SQLiteAndroidAndroidASP.NETNoneNoneNoneAndroid Studio;IntelliJLinux-basedNoneNoneNoneYesYesWhat?WhatsAppIn real life (in person)None2017Multiple times per dayFind answers to specific questionsMore than 10 times per weekStack Overflow was slightly faster11-30 minutesYesA few times per weekNo, I knew that Stack Overflow had a job board...No, and I don't know what those areNot sureA lot more welcome now than last yearNone21.0ManNoNoneNoneYesAppropriate in lengthNeither easy nor difficult
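Note that the rows read back from JSON are labelled 0 to 4 rather than by Respondent: orient='records' writes one object per row and leaves the index out. One possible way to keep it, sketched here on a small made-up frame (`demo_df` and its values are assumptions, not the survey data), is to turn the index into a column before writing and restore it after reading:

```python
import io
import pandas as pd

# Small stand-in frame with a named index and one missing value.
demo_df = pd.DataFrame(
    {"Country": ["India", "India"], "Age": [24.0, None]},
    index=pd.Index([8, 10], name="Respondent"),
)

# orient='records' drops the index, so make it a regular column first.
json_text = demo_df.reset_index().to_json(orient="records", lines=True)

# Read the line-delimited JSON back and restore the index.
restored = pd.read_json(io.StringIO(json_text), orient="records", lines=True)
restored = restored.set_index("Respondent")
print(restored)
```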
from sqlalchemy import create_engine
import psycopg2
engine = create_engine('postgresql://dbuser:dbpass@localhost:5432/sample_db')
India_df.to_sql('sample_table',engine,if_exists='replace')  # if_exists='replace': if the table already exists, drop it and write the new data
---------------------------------------------------------------------------

OperationalError                          Traceback (most recent call last)

D:\Anaconda\lib\site-packages\sqlalchemy\engine\base.py in _wrap_pool_connect(self, fn, connection)
   2261         try:
-> 2262             return fn()
   2263         except dialect.dbapi.Error as e:


D:\Anaconda\lib\site-packages\sqlalchemy\pool\base.py in connect(self)
    353         if not self._use_threadlocal:
--> 354             return _ConnectionFairy._checkout(self)
    355 


D:\Anaconda\lib\site-packages\sqlalchemy\pool\base.py in _checkout(cls, pool, threadconns, fairy)
    750         if not fairy:
--> 751             fairy = _ConnectionRecord.checkout(pool)
    752 


D:\Anaconda\lib\site-packages\sqlalchemy\pool\base.py in checkout(cls, pool)
    482     def checkout(cls, pool):
--> 483         rec = pool._do_get()
    484         try:


D:\Anaconda\lib\site-packages\sqlalchemy\pool\impl.py in _do_get(self)
    137                 with util.safe_reraise():
--> 138                     self._dec_overflow()
    139         else:


D:\Anaconda\lib\site-packages\sqlalchemy\util\langhelpers.py in __exit__(self, type_, value, traceback)
     67             if not self.warn_only:
---> 68                 compat.reraise(exc_type, exc_value, exc_tb)
     69         else:


D:\Anaconda\lib\site-packages\sqlalchemy\util\compat.py in reraise(tp, value, tb, cause)
    128             raise value.with_traceback(tb)
--> 129         raise value
    130 


D:\Anaconda\lib\site-packages\sqlalchemy\pool\impl.py in _do_get(self)
    134             try:
--> 135                 return self._create_connection()
    136             except:


D:\Anaconda\lib\site-packages\sqlalchemy\pool\base.py in _create_connection(self)
    298 
--> 299         return _ConnectionRecord(self)
    300 


D:\Anaconda\lib\site-packages\sqlalchemy\pool\base.py in __init__(self, pool, connect)
    427         if connect:
--> 428             self.__connect(first_connect_check=True)
    429         self.finalize_callback = deque()


D:\Anaconda\lib\site-packages\sqlalchemy\pool\base.py in __connect(self, first_connect_check)
    629             self.starttime = time.time()
--> 630             connection = pool._invoke_creator(self)
    631             pool.logger.debug("Created new connection %r", connection)


D:\Anaconda\lib\site-packages\sqlalchemy\engine\strategies.py in connect(connection_record)
    113                             return connection
--> 114                 return dialect.connect(*cargs, **cparams)
    115 


D:\Anaconda\lib\site-packages\sqlalchemy\engine\default.py in connect(self, *cargs, **cparams)
    452     def connect(self, *cargs, **cparams):
--> 453         return self.dbapi.connect(*cargs, **cparams)
    454 


D:\Anaconda\lib\site-packages\psycopg2\__init__.py in connect(dsn, connection_factory, cursor_factory, **kwargs)
    121     dsn = _ext.make_dsn(dsn, **kwargs)
--> 122     conn = _connect(dsn, connection_factory=connection_factory, **kwasync)
    123     if cursor_factory is not None:


OperationalError: connection to server at "localhost" (::1), port 5432 failed: Connection refused (0x0000274D/10061)
	Is the server running on that host and accepting TCP/IP connections?
connection to server at "localhost" (127.0.0.1), port 5432 failed: Connection refused (0x0000274D/10061)
	Is the server running on that host and accepting TCP/IP connections?


​ The above exception was the direct cause of the following exception:

OperationalError                          Traceback (most recent call last)

<ipython-input-63-3ed25a579e98> in <module>
----> 1 India_df.to_sql('samlpe_table',engine)


~\AppData\Roaming\Python\Python37\site-packages\pandas\core\generic.py in to_sql(self, name, con, schema, if_exists, index, index_label, chunksize, dtype, method)
   2880             chunksize=chunksize,
   2881             dtype=dtype,
-> 2882             method=method,
   2883         )
   2884 


~\AppData\Roaming\Python\Python37\site-packages\pandas\io\sql.py in to_sql(frame, name, con, schema, if_exists, index, index_label, chunksize, dtype, method, engine, **engine_kwargs)
    726         method=method,
    727         engine=engine,
--> 728         **engine_kwargs,
    729     )
    730 


~\AppData\Roaming\Python\Python37\site-packages\pandas\io\sql.py in to_sql(self, frame, name, if_exists, index, index_label, schema, chunksize, dtype, method, engine, **engine_kwargs)
   1756             index_label=index_label,
   1757             schema=schema,
-> 1758             dtype=dtype,
   1759         )
   1760 


~\AppData\Roaming\Python\Python37\site-packages\pandas\io\sql.py in prep_table(self, frame, name, if_exists, index, index_label, schema, dtype)
   1648             dtype=dtype,
   1649         )
-> 1650         table.create()
   1651         return table
   1652 


~\AppData\Roaming\Python\Python37\site-packages\pandas\io\sql.py in create(self)
    854 
    855     def create(self):
--> 856         if self.exists():
    857             if self.if_exists == "fail":
    858                 raise ValueError(f"Table '{self.name}' already exists.")


~\AppData\Roaming\Python\Python37\site-packages\pandas\io\sql.py in exists(self)
    838 
    839     def exists(self):
--> 840         return self.pd_sql.has_table(self.name, self.schema)
    841 
    842     def sql_schema(self):


~\AppData\Roaming\Python\Python37\site-packages\pandas\io\sql.py in has_table(self, name, schema)
   1785         else:
   1786             return self.connectable.run_callable(
-> 1787                 self.connectable.dialect.has_table, name, schema or self.meta.schema
   1788             )
   1789 


D:\Anaconda\lib\site-packages\sqlalchemy\engine\base.py in run_callable(self, callable_, *args, **kwargs)
   2144 
   2145         """
-> 2146         with self._contextual_connect() as conn:
   2147             return conn.run_callable(callable_, *args, **kwargs)
   2148 


D:\Anaconda\lib\site-packages\sqlalchemy\engine\base.py in _contextual_connect(self, close_with_result, **kwargs)
   2224         return self._connection_cls(
   2225             self,
-> 2226             self._wrap_pool_connect(self.pool.connect, None),
   2227             close_with_result=close_with_result,
   2228             **kwargs


D:\Anaconda\lib\site-packages\sqlalchemy\engine\base.py in _wrap_pool_connect(self, fn, connection)
   2264             if connection is None:
   2265                 Connection._handle_dbapi_exception_noconnection(
-> 2266                     e, dialect, self
   2267                 )
   2268             else:


D:\Anaconda\lib\site-packages\sqlalchemy\engine\base.py in _handle_dbapi_exception_noconnection(cls, e, dialect, engine)
   1534             util.raise_from_cause(newraise, exc_info)
   1535         elif should_wrap:
-> 1536             util.raise_from_cause(sqlalchemy_exception, exc_info)
   1537         else:
   1538             util.reraise(*exc_info)


D:\Anaconda\lib\site-packages\sqlalchemy\util\compat.py in raise_from_cause(exception, exc_info)
    381     exc_type, exc_value, exc_tb = exc_info
    382     cause = exc_value if exc_value is not exception else None
--> 383     reraise(type(exception), exception, tb=exc_tb, cause=cause)
    384 
    385 


D:\Anaconda\lib\site-packages\sqlalchemy\util\compat.py in reraise(tp, value, tb, cause)
    126             value.__cause__ = cause
    127         if value.__traceback__ is not tb:
--> 128             raise value.with_traceback(tb)
    129         raise value
    130 


D:\Anaconda\lib\site-packages\sqlalchemy\engine\base.py in _wrap_pool_connect(self, fn, connection)
   2260         dialect = self.dialect
   2261         try:
-> 2262             return fn()
   2263         except dialect.dbapi.Error as e:
   2264             if connection is None:


D:\Anaconda\lib\site-packages\sqlalchemy\pool\base.py in connect(self)
    352         """
    353         if not self._use_threadlocal:
--> 354             return _ConnectionFairy._checkout(self)
    355 
    356         try:


D:\Anaconda\lib\site-packages\sqlalchemy\pool\base.py in _checkout(cls, pool, threadconns, fairy)
    749     def _checkout(cls, pool, threadconns=None, fairy=None):
    750         if not fairy:
--> 751             fairy = _ConnectionRecord.checkout(pool)
    752 
    753             fairy._pool = pool


D:\Anaconda\lib\site-packages\sqlalchemy\pool\base.py in checkout(cls, pool)
    481     @classmethod
    482     def checkout(cls, pool):
--> 483         rec = pool._do_get()
    484         try:
    485             dbapi_connection = rec.get_connection()


D:\Anaconda\lib\site-packages\sqlalchemy\pool\impl.py in _do_get(self)
    136             except:
    137                 with util.safe_reraise():
--> 138                     self._dec_overflow()
    139         else:
    140             return self._do_get()


D:\Anaconda\lib\site-packages\sqlalchemy\util\langhelpers.py in __exit__(self, type_, value, traceback)
     66             self._exc_info = None  # remove potential circular references
     67             if not self.warn_only:
---> 68                 compat.reraise(exc_type, exc_value, exc_tb)
     69         else:
     70             if not compat.py3k and self._exc_info and self._exc_info[1]:


D:\Anaconda\lib\site-packages\sqlalchemy\util\compat.py in reraise(tp, value, tb, cause)
    127         if value.__traceback__ is not tb:
    128             raise value.with_traceback(tb)
--> 129         raise value
    130 
    131     def u(s):


D:\Anaconda\lib\site-packages\sqlalchemy\pool\impl.py in _do_get(self)
    133         if self._inc_overflow():
    134             try:
--> 135                 return self._create_connection()
    136             except:
    137                 with util.safe_reraise():


D:\Anaconda\lib\site-packages\sqlalchemy\pool\base.py in _create_connection(self)
    297         """Called by subclasses to create a new ConnectionRecord."""
    298 
--> 299         return _ConnectionRecord(self)
    300 
    301     def _invalidate(self, connection, exception=None, _checkin=True):


D:\Anaconda\lib\site-packages\sqlalchemy\pool\base.py in __init__(self, pool, connect)
    426         self.__pool = pool
    427         if connect:
--> 428             self.__connect(first_connect_check=True)
    429         self.finalize_callback = deque()
    430 


D:\Anaconda\lib\site-packages\sqlalchemy\pool\base.py in __connect(self, first_connect_check)
    628         try:
    629             self.starttime = time.time()
--> 630             connection = pool._invoke_creator(self)
    631             pool.logger.debug("Created new connection %r", connection)
    632             self.connection = connection


D:\Anaconda\lib\site-packages\sqlalchemy\engine\strategies.py in connect(connection_record)
    112                         if connection is not None:
    113                             return connection
--> 114                 return dialect.connect(*cargs, **cparams)
    115 
    116             creator = pop_kwarg("creator", connect)


D:\Anaconda\lib\site-packages\sqlalchemy\engine\default.py in connect(self, *cargs, **cparams)
    451 
    452     def connect(self, *cargs, **cparams):
--> 453         return self.dbapi.connect(*cargs, **cparams)
    454 
    455     def create_connect_args(self, url):


D:\Anaconda\lib\site-packages\psycopg2\__init__.py in connect(dsn, connection_factory, cursor_factory, **kwargs)
    120 
    121     dsn = _ext.make_dsn(dsn, **kwargs)
--> 122     conn = _connect(dsn, connection_factory=connection_factory, **kwasync)
    123     if cursor_factory is not None:
    124         conn.cursor_factory = cursor_factory


OperationalError: (psycopg2.OperationalError) connection to server at "localhost" (::1), port 5432 failed: Connection refused (0x0000274D/10061)
	Is the server running on that host and accepting TCP/IP connections?
connection to server at "localhost" (127.0.0.1), port 5432 failed: Connection refused (0x0000274D/10061)
	Is the server running on that host and accepting TCP/IP connections?

(Background on this error at: http://sqlalche.me/e/e3q8)
sql_df = pd.read_sql('sample_table',engine,index_col='Respondent')
sql_df.head()
sql_df = pd.read_sql_query('SELECT * FROM sample_table',engine,index_col='Respondent')
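The OperationalError above says the connection to localhost:5432 was refused, meaning no PostgreSQL server was accepting connections there. To try the same write and read steps without a server, here is a rough sketch that swaps in an in-memory SQLite database through the standard-library sqlite3 module. This is a substitute for the PostgreSQL engine, not the setup used above, and `demo_df` is a made-up frame:

```python
import sqlite3
import pandas as pd

# Small stand-in frame; values are illustrative only.
demo_df = pd.DataFrame(
    {"Country": ["India", "India"], "ConvertedComp": [13293.0, 5597.0]},
    index=pd.Index([10, 50], name="Respondent"),
)

# In-memory SQLite database: nothing to install or start.
conn = sqlite3.connect(":memory:")

# if_exists='replace' drops an existing table of that name and recreates it.
demo_df.to_sql("sample_table", conn, if_exists="replace")

# With a plain sqlite3 connection, query with SQL rather than by table name.
sql_df = pd.read_sql_query(
    "SELECT * FROM sample_table", conn, index_col="Respondent"
)
conn.close()
print(sql_df)
```

Reading a whole table by name, as in pd.read_sql('sample_table', engine, ...), needs an SQLAlchemy engine, so the sketch uses read_sql_query instead.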
posts_df = pd.read_json('https://raw.githubusercontent.com/CoreyMSchafer/code_snippets/master/Python/Flask_Blog/snippets/posts.json')
posts_df.head()
   title                         content                                             user_id
0  My Updated Post               My first updated post!\r\n\r\nThis is exciting!     1
1  A Second Post                 This is a post from a different user...             2
2  Top 5 Programming Lanaguages  Te melius apeirian postulant cum, labitur admo...   1
3  Sublime Text Tips and Tricks  Ea vix dico modus voluptatibus, mel iudico sua...   1
4  Best Python IDEs              Elit contentiones nam no, sea ut consul adipis...   1