by David Venturi

Developing Data Scientists and Engineers

Free Code Camp asked 15,000 people who they are, and how they’re learning to code. I isolated those focused on data science and data engineering.

More than 15,000 people responded to Free Code Camp’s 2016 New Coder Survey, granting researchers (like me!) an unprecedented glimpse into how people are learning to code. They released the entire dataset on Kaggle.

646 respondents answered “Data Scientist/Data Engineer” to the question: “Which one of these roles are you most interested in?”

Here are a few high-level statistics from this data-focused subset, which complements Free Code Camp’s exploration of new coders in general.

I’ve borrowed the structure of Free Code Camp’s announcement article for ease of comparison. I’ve also included my comments where findings differ notably. And a few bonus plots, too!

We asked 15,000 people who they are, and how they’re learning to code. More than 15,000 people responded to the 2016 New Coder Survey, granting researchers an unprecedented glimpse into how… (medium.freecodecamp.com)

Who participated?

Of the 646 developing data scientists and data engineers who responded to the survey:

  • 25% are women (4% more than new coders in general)

  • their median age is 26 years old (one year younger)

  • they started programming an average of 16 months ago (5 months earlier)
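As a rough illustration of how headline numbers like these can be derived (this is not the R code behind the analysis), the sketch below filters respondents by the role they chose and then summarizes the subset. The rows are made up, and the field names `role`, `gender`, `age`, and `months_programming` are assumptions for this example, not the dataset's confirmed column names.

```python
from statistics import mean, median

# Made-up sample rows. The field names are assumptions for this sketch and
# may not match the column names in the Kaggle dataset.
respondents = [
    {"role": "Data Scientist / Data Engineer", "gender": "female", "age": 24, "months_programming": 12},
    {"role": "Data Scientist / Data Engineer", "gender": "male", "age": 26, "months_programming": 16},
    {"role": "Data Scientist / Data Engineer", "gender": "male", "age": 26, "months_programming": 20},
    {"role": "Data Scientist / Data Engineer", "gender": "male", "age": 30, "months_programming": 16},
    {"role": "Full-Stack Web Developer", "gender": "female", "age": 31, "months_programming": 6},
]

DATA_ROLE = "Data Scientist / Data Engineer"


def summarize(rows, role=DATA_ROLE):
    """Return headline statistics for respondents interested in `role`."""
    subset = [r for r in rows if r["role"] == role]
    if not subset:
        raise ValueError(f"no respondents selected role {role!r}")
    women = sum(1 for r in subset if r["gender"] == "female")
    return {
        "count": len(subset),
        "pct_women": 100 * women / len(subset),
        "median_age": median([r["age"] for r in subset]),
        "mean_months_programming": mean([r["months_programming"] for r in subset]),
    }


summary = summarize(respondents)
print(summary)
```

Running the same function with a different `role` value, or over all rows, would give the comparison figures for other groups. A real analysis would also need to decide how to treat missing answers, which this sketch ignores.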

Learner goals and approaches

14 hours each week, on average, are spent learning.

This is one hour less than new coders in general.

0% want to freelance or start their own business.*

Compared to 40% for the full new coder survey, this is a bit shocking. I have a hunch these zero counts are caused by the survey’s design. Every respondent that answered the job role of interest question has zero counts for “start your own business” and “freelance.”
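One way to test a hunch like this is to split respondents by whether they answered the job role question and count the job preferences in each group. The sketch below does that on made-up rows. The field names `role` and `job_pref` and the preference labels are assumptions for illustration, not the dataset's confirmed schema.

```python
from collections import Counter

# Made-up sample rows. Field names and labels are assumptions for this sketch.
rows = [
    {"role": "Data Scientist / Data Engineer", "job_pref": "work for a medium-sized company"},
    {"role": "Full-Stack Web Developer", "job_pref": "work for a startup"},
    {"role": None, "job_pref": "freelance"},
    {"role": None, "job_pref": "start your own business"},
]


def job_pref_counts(rows, answered_role):
    """Count job preferences among rows that did (or did not) answer the role question."""
    return Counter(
        r["job_pref"] for r in rows if (r["role"] is not None) == answered_role
    )


with_role = job_pref_counts(rows, answered_role=True)
without_role = job_pref_counts(rows, answered_role=False)

for label in ("freelance", "start your own business"):
    # Counter returns 0 for labels that never appear in a group.
    print(label, with_role[label], without_role[label])
```

If the two preferences only ever appear among respondents who skipped the role question, the zero counts say more about how the survey branched than about what data-focused learners want.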

52% are already applying for jobs, or will start applying within the next year.

This is a longer time horizon than new coders in general, where 65% are applying within the next year.

Most of them want to work in an office, as opposed to remotely.

And a majority are willing to relocate.

Most of them have not yet attended any in-person coding events.

64% have used at least one of Coursera, edX, or Udacity.

Only 46% of new coders in general have used at least one of these resources. These companies have a wider range of subject areas than some of the coding-specific resources listed.

Of them, Partially Derivative, Becoming A Data Scientist, and Talking Machines are the only data-specific podcasts noted.

Only 1% have attended a bootcamp.

6% of new coders have attended a bootcamp.

Demographics and Socioeconomics

Data-focused respondents represent 166 countries.

More than 90% are from North America, Europe, and Asia.

The dominating percentage of North Americans should be expected because Free Code Camp is based in the United States.

Their cities span a wide range of urbanization levels.

Just under a quarter of respondents are ethnic minorities in their country.

And nearly half are non-native English speakers. They grew up speaking one of 148 languages.

67% have earned at least a bachelor’s degree.

Compared to 58% for new coders in general, the data-focused subset is more skewed towards post-secondary studies.

Diversity amongst majors is greater compared to the full survey, where Computer Science and Information Technology checked in at #1 and #2 with 17% and 5%, respectively.

Just over one-half are currently working.

Two-thirds of the new coder population are currently working.

A quarter work in the tech industry.

There is a higher variety of employment fields compared to the full dataset, where 50% of respondents work in software development and IT.

Median current salary is $44k.

The median current salary for the full dataset is $37k.

And they expect to earn a median of $60k with their new data science/engineering skills.

The median for the full survey dataset is $50k. With data science/engineering being notoriously lucrative in 2016, some respondents might be seeking higher wages.

7% have served in their country’s military.

13% have children, and another 3% financially support an elderly or disabled relative. And one-fifth are doing this without the help of a spouse.

47% consider themselves underemployed (working a job that is below their education level).

This is 5% higher than new coders in general.

If they have a home mortgage, they owe an average of $194k.

If they have student loans, they owe an average of $37k.

This average is $3k more than the full survey dataset.

14% don’t yet have high-speed internet at home.

And 3% are currently receiving disability benefits from their government.

These are the people who are learning data science and engineering. Free, self-paced learning resources are definitely important.

What’s next?

You can find a more detailed version of this analysis on Kaggle, where I outline my exploratory data analysis (EDA) process.

Be sure to check out my initial exploration of Free Code Camp’s dataset, where I dive deeper into the characteristics of new coders:

New Coders: How Salary and Time Spent Learning Vary by Demographic. I analyzed the 15,000 respondents to Free Code Camp’s New Coder Survey by continent, gender, and whether they’re an… (medium.freecodecamp.com)

The 6 most desirable coding jobs (and the types of people drawn to each). Free Code Camp asked 15,000 people who they are, and how they’re learning to code. I separated them by their job… (medium.freecodecamp.com)

If you have questions or concerns about this series or the R code that generated it, don’t hesitate to let me know.

David Venturi (@venturidb) | Twitter. The latest Tweets from David Venturi (@venturidb). Creating my own data science master's degree. @queensu chem eng/econ… (twitter.com)

Original article: https://www.freecodecamp.org/news/developing-data-scientists-engineers-710f4ef5a773/