First data science job: coronavirus doctor

Background

Three years ago, I had just finished medical school and started working full-time as a doctor in the UK’s National Health Service (NHS). Now I work full-time as a data scientist at dunnhumby, writing code for “Big Data” analytics with Python and Spark.

More and more people are making the transition towards data science, or related technical roles, from a variety of disciplines. So in this article I’m going to share my experiences and advice for making a (perhaps) unconventional career transition into a technical role. I can break these down into five main lessons:

(1) Find technical friends

Coming from a medical background, I didn’t have the first clue about how to develop coding skills or data science understanding. And neither did anybody around me.

This made it really important for me to branch out and find people who did. I quickly saw the benefits of doing so.

Early inefficiencies

When I initially started out, I’d have to resort to Google or StackOverflow to try and solve my problems. These are great resources, but it’s hard to find what you want when you don’t really know what you’re looking for.

One of the top skills a developer needs is knowing what to search for to find the solution to the problem at hand. Without that skill, I would spend ages stuck at a relatively simple hurdle, such as trying to manipulate a pandas dataframe in a particular way, or working out how to install and import the package I needed.
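
For the "install and import" hurdle in particular, a quick check like the one below can show whether a package is importable before the rest of a script runs. This is only a minimal sketch using Python's standard library: the helper name `is_installed` is hypothetical, and the suggested fix (`pip install <name>` from the command line) assumes a standard pip-based setup.

```python
import importlib.util


def is_installed(package_name: str) -> bool:
    """Return True if a top-level package can be imported in this environment."""
    # find_spec returns None when no top-level module of that name is found.
    return importlib.util.find_spec(package_name) is not None


# Check one standard-library module and one third-party package.
for name in ["json", "pandas"]:
    if is_installed(name):
        print(f"{name}: available")
    else:
        print(f"{name}: missing (try: pip install {name})")
```

If a package shows as missing even after installing it, a likely cause is that it was installed into a different Python environment from the one running the script.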

I’d heard that it’s best to think of your own projects as a means to learn. However, without insight into what it takes to build a project, and what’s possible, I would typically come up with over-ambitious projects with too many moving parts. I remember an early project idea was to build a chatbot patient for doctors to practice with, and I even started collecting transcripts from real conversations to help make this. In hindsight, this type of task was way too ambitious for someone of my technical level at that time (having just completed an Intro to Python course).

Getting by with a little help from some friends

Having technical friends is great for overcoming both of these sources of inefficiency.

If you have a relatively simple technical issue, but don’t know where to go to solve it, a technical friend can point you in the right direction pretty quickly. This saves a lot of time and frustration.

Likewise, if you come up with a project idea, you can run it by a technical friend. They’ll be able to break it down into stages and ultimately advise you whether it makes sense to do and how best to go about it. This can save you a lot of time barking up the wrong tree.

How to make technical friends

I don’t think there’s a “right” way to find technical friends and establish a relationship where you can ask them for advice. Here are a few principles that I find helpful.

Firstly, being open and honest about my intentions (“I’m learning to code and would love someone I could ping a message to when I get stuck”).

Secondly, being respectful of their time. I pushed myself to only ask for help if I’d truly searched for the solution and spent time trying to solve it myself. (To be honest, I think you also learn better this way.)

This wasn’t always easy. Sometimes I’d hit the initial wall of frustration and feel an urge to send off multiple messages, hoping for a quick solution. I tried to have a good crack myself first, but I’ll admit I sometimes caved in.

It was also helpful to have multiple people I could go to. I didn’t have to keep bugging the same person, reducing the risk of annoying them.

I met these friends in several places: at events that interested me (such as data science and machine learning meet-ups), through working on projects together (more on that in Sections 2 and 3), and through a smattering of formal ‘networking’, friends-of-friends and random LinkedIn messages.

At some point, when it felt comfortable, I’d reach out for advice on a specific problem or project I was working on. Sometimes the problems were simple or the project ideas were bad, so I had to put my pride to the side and seek out the constructive criticism.

If you’re starting out, and looking for a technical friend to help you get started, feel free to reach out (email at hi@chrislovejoy.me, Twitter @ChrisLovejoy_).

(2) Build a portfolio: the rule of 3

One of the first pieces of advice I received from a mentor is one that has stuck with me ever since:

Have a portfolio of three great projects.

Initially, for me, this just meant striving to get three projects under my belt. I went to hackathons, interned at companies and designed my own projects.

Once I achieved that, I kept looking at how I could build on them or replace them with new and cooler projects.

It’s so simple, but I’ve found it a useful way to frame things. You’re only as good as your top three projects.

To this day, the lower right-hand side of my CV is dedicated to my top three projects at that moment in time.

If a round of job applications comes up, or if somebody asks me about work I’ve done, I know what to talk about. These three projects are always in mind.

Getting projects

I’m a big advocate of designing our own projects, both for learning and for potentially contributing to the community. But it’s not always easy, particularly when starting out.

A good source of ‘ready-made’ projects to work on is Kaggle. They provide a dataset and often specific challenges. You can also see other people’s solutions, which is a great source of learning.

A great way to devise a project within a team is to attend hackathons. These are typically weekend sprints to develop a solution to a problem and are held in most major cities around the world.

One thing I found really helpful was attending a project-based course. So much so that I’ll devote the next section to it.

(3) Go on a project-based course or bootcamp if you can

Even with all the motivation in the world and a great team of technical mentors, it can still be challenging to build great projects and learn new skills off your own back. There’s a lot to be said for being in the right environment, and for having good projects defined for you.

The best place for this would be to get a full-time job in a technical role. However, it’s not always possible to jump straight into this.

A really great intermediate step can be to go on a project-based course.

These typically last from around 5 weeks to a few months and are centred around a group project that produces a tangible output. There’s typically a partnership with a commercial client who has a genuine interest in what you are building.

I went on the “Science to Data Science” (S2DS) virtual course. I found it really helpful having a defined project and having responsive technical mentors to go to for any problems that arose.

I learnt a huge amount during the project; in particular I learnt how to structure source code, became more familiar with GitHub, gained a better understanding of regression performance metrics and learnt the PEP 8 Python coding guidelines. (I’ll be sharing a full post on this in future.)
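
To make "regression performance metrics" concrete, here is a minimal sketch of three common ones: mean absolute error (MAE), root mean squared error (RMSE) and R². It is not the code from the S2DS project; the function name `regression_metrics` and the sample numbers are made up for illustration. In practice scikit-learn's `sklearn.metrics` module provides `mean_absolute_error`, `mean_squared_error` and `r2_score`, but writing them out by hand shows what each one measures.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return MAE, RMSE and R^2 for two equal-length lists of numbers."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and the same length")
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    # MAE: average size of the errors, in the same units as the target.
    mae = sum(abs(e) for e in errors) / n
    # RMSE: like MAE, but squaring penalises large errors more heavily.
    ss_res = sum(e * e for e in errors)
    rmse = math.sqrt(ss_res / n)
    # R^2: share of the variance in y_true explained by the predictions.
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    # R^2 is undefined when every true value is identical (ss_tot == 0).
    r2 = 1 - ss_res / ss_tot if ss_tot else float("nan")
    return {"mae": mae, "rmse": rmse, "r2": r2}


# Hypothetical values, purely for illustration.
actual = [3.0, 5.0, 7.0, 9.0]
predicted = [2.5, 5.5, 7.0, 8.0]
print(regression_metrics(actual, predicted))
```

On this toy data the MAE is 0.5 and R² is 0.925, while the RMSE (about 0.61) is higher than the MAE because the single error of 1.0 is weighted more heavily.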

Another course I’ve heard good things about is the ‘ASI Data Science’ course (which I think has now rebranded as ‘faculty.ai’).

Note: I’m based in the UK. S2DS is international. I’m not sure about ASI. I’m sure there are programs abroad that I’m not familiar with.

One word of warning is that there are a lot of courses which, in my opinion, overcharge. This is a reflection of the area having become popular, but it isn’t necessarily a reflection of the value that courses offer. The S2DS course I attended cost £800, which felt like fantastic value after attending, but I’ve seen many courses in the £5,000+ range.

(4) Nail down understanding of core concepts

Data science is a really big (and expanding) field. There’s a huge amount that you could learn and it’s easy to be overwhelmed when starting out.

My approach has been to work towards a solid understanding of (i) core concepts and (ii) my specific areas of interest.

So I guess the question is: what constitutes a ‘core concept’?

I can’t claim to be an authority on what you should know, but these pages appear useful:

If I were to suggest core concepts and skills, it would be something like:

PROGRAMMING:

  • comfortable with Python, pandas, NumPy and scikit-learn
  • familiar with GitHub
  • familiar with the command line
  • familiar with installing packages

THEORY:

  • classification algorithms (SVMs, random forest, logistic regression, AdaBoost): higher-level principles of how they work
  • regression algorithms (linear/OLS regression, lasso and ridge regression, regression trees): higher-level principles of how they work, and common considerations
  • performance measures for classification and for regression algorithms
  • familiar with neural networks and deep learning
  • clustering algorithms: k-means and hierarchical clustering
  • familiar with dimensionality reduction techniques such as PCA
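
As one small illustration of the "performance measures for classification" item above, the sketch below computes accuracy, precision, recall and F1 for a binary classifier from scratch. The function name `classification_metrics` and the label lists are hypothetical, chosen only for this example; it assumes binary labels with `1` as the positive class.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Return accuracy, precision, recall and F1 for binary labels."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and the same length")
    pairs = list(zip(y_true, y_pred))
    # Count the cells of the confusion matrix.
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn
    accuracy = (tp + tn) / len(pairs)
    # Precision: of everything predicted positive, how much really was?
    precision = tp / (tp + fp) if tp + fp else 0.0
    # Recall: of everything truly positive, how much did we find?
    recall = tp / (tp + fn) if tp + fn else 0.0
    # F1: harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical labels, purely for illustration.
labels = [1, 0, 1, 1, 0, 0, 1, 0]
predictions = [1, 0, 0, 1, 0, 1, 1, 0]
print(classification_metrics(labels, predictions))
```

Here there are three true positives, one false positive and one false negative, so all four measures come out at 0.75. On imbalanced data they typically diverge, which is why accuracy alone can mislead.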

I’d suggest using online courses to build up your understanding in each of these key areas. Ones I found helpful were Brilliant.org and Khan Academy for principles and maths, plus various courses on Coursera and Udemy for the more technical aspects.

As for the skills on top of this, I’d argue it depends on the industry of interest and type of work you’ll be doing. If working with Big Data, Apache Spark will be helpful. If working with time series, familiarity with ARIMA models will be helpful.

(5) A Master’s Degree (or other formal qualification) isn’t essential, but can help

There are a lot of data science roles that don’t specify a master’s degree as a formal requirement, and I think it’s possible to get a job without one.

However, coming from an unconventional background, I found it hard to get taken seriously until I started working towards one.

I think if my bachelor’s degree had been more directly relevant to data science (rather than Medicine), I may not have needed one. A good bachelor’s degree plus proof of practical experience is sufficient in many cases.

I ended up choosing a master’s degree in Data Science and Machine Learning at UCL in London. Even before starting it, I was pretty confident that I had a good grasp of key concepts from my own self-study.

But a lot of job applications led to straight-out rejection, so I never had the chance to prove myself. And I can completely understand why. If you have a lot of applications for a role, the one that says “I’m a full-time doctor, but have done loads of data science self-study” is a pretty easy one to remove from the pile.

The master’s didn’t guarantee I’d progress to an interview, but I definitely felt it helped me get a foot in the door.

(Whether or not I’m fully behind paying X thousand for a master’s degree in our current climate of remote courses is a matter for another day, however…)

Final thoughts

I’m absolutely loving work as a data scientist. It’s really satisfying to see the hard work pay off, even if the journey was tough at times. It feels great to have two skillsets in my portfolio (as both a doctor and a data scientist) and to be one step closer to a more ‘portfolio’ approach to work in future. I hope that the experiences and suggestions I’ve shared here help you with your transition, too.

Best of luck! :)

Many thanks to Abdel Mahmoud and Luke Harries for reviewing this article.

Translated from: https://towardsdatascience.com/first-data-science-job-coronavirus-doctor-b8cf074bae96
