Acing the Data Science Case Study Interview


The data science case study is often the most difficult part of the hiring process. After sending in a resume and passing the recruiter screening along with the initial interview, this final stage often makes or breaks an applicant’s hiring potential.

Designed to simulate a company’s current and past projects, case study problems rigorously examine how a candidate approaches prompts, communicates their findings, and works through roadblocks.


Why Do Case Studies Get Asked?

In order to understand how to pass the case study section, it’s important to first understand what interviewers are looking for when applicants work through these prompts. Often, at this point in the process, prospects have already demonstrated sufficient technical understanding and skills for the position, so this is no longer a question of whether or not they can perform job duties.

Instead, case studies look to understand the interviewee’s thought process: the ability to think on their feet through problems that don’t have a single solution. Real-life cases aren’t binary; there is no black-and-white, yes-or-no answer. Rather, due to all the ambiguities, candidates will need to demonstrate decisiveness in their investigations, as well as a capacity to consider impacts and topics from a variety of angles.

Perhaps even more importantly, the ability to effectively communicate conclusions will be heavily highlighted in data science case study problems. Real working conditions require a great deal of information exchange across teams and divisions, so part of the interviewer’s focus will be on the system through which a candidate processes and explains their answer, and consequently, exactly what details are falling through the cracks.

Types of Data Science Case Studies

There are three main types of data science case studies: product questions, modeling and machine learning questions, and business case questions.


Product Case Study Questions

This type of case study tackles a specific product or feature, often tied to the interviewing company. As such, it is extremely beneficial to research current projects and developments across different divisions, as one of them might end up as the case study topic!

In this type of data science case study, interviewers are generally looking for a sense of business intuition revolving around product mechanics. The most important part is to identify which metrics should be proposed to understand a product.

Check out our guide on how to tackle the product data science case interview.


Here’s an example product data science case study question:


Suppose you’re working as a data scientist at Facebook. How would you measure the success of private stories on Instagram, where only certain chosen friends can see the story?

Try solving a product case question on Interview Query here!


Modeling and ML Case Questions

Modeling case studies are more varied and are designed to probe how a candidate builds models around business problems. These questions can range from applying machine learning to solve a specific case scenario to assessing the validity of a hypothetical existing model. The modeling case study requires a candidate to evaluate and explain any given part of the model building process.

A common case study problem would be for a candidate to explain how they would build a model for a product that exists at the company or another company.


To get a better understanding of ML questions asked by companies such as Amazon, check out this article about Amazon Machine Learning Interview Questions and Solutions.

For example:


Describe how you would build a model to predict Uber ETAs after a rider requests a ride

Many times this can be scoped down to a specific portion of the model building process. Taking the example above, we could break it up into:

How would you evaluate the predictions of an Uber ETA model?

What features would you use to predict the Uber ETA for ride requests?
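
As one possible way to approach the evaluation question above, the sketch below computes a few common regression error measures for ETA predictions. The function name `evaluate_eta`, the 2-minute tolerance, and the sample numbers are all assumptions made for illustration; they are not taken from Uber or from any real dataset.

```python
import math


def evaluate_eta(actual_minutes, predicted_minutes, tolerance=2.0):
    """Summarize ETA prediction quality (hypothetical helper for illustration).

    Returns mean absolute error, root mean squared error, the share of trips
    predicted within `tolerance` minutes, and the mean signed error (bias).
    """
    if len(actual_minutes) != len(predicted_minutes) or not actual_minutes:
        raise ValueError("inputs must be non-empty and the same length")

    # Signed error: positive means the model predicted a longer wait than observed.
    errors = [p - a for a, p in zip(actual_minutes, predicted_minutes)]
    n = len(errors)

    return {
        "mae": sum(abs(e) for e in errors) / n,
        "rmse": math.sqrt(sum(e * e for e in errors) / n),
        "within_tolerance": sum(1 for e in errors if abs(e) <= tolerance) / n,
        "mean_bias": sum(errors) / n,
    }


# Made-up sample values, in minutes.
actual = [10.0, 12.0, 8.0, 15.0]
predicted = [11.0, 10.0, 8.0, 19.0]
metrics = evaluate_eta(actual, predicted)
print(metrics)
```

In an interview it may be worth explaining why more than one number is reported: RMSE penalizes large misses more heavily than MAE, while the signed bias shows whether the model tends to over- or under-estimate.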

Our recommended framework is to break a modeling and machine learning case study down to individual steps and tackle each one thoroughly.


In each full modeling case study, you’ll want to go over each part of:


  • Data processing

  • Feature Selection

  • Model Selection

  • Cross Validation

  • Evaluation Metrics

  • Testing and Roll Out
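
A few of the steps listed above (model fitting, cross validation, and an evaluation metric) can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions: the single feature (trip distance), the one-variable least-squares model, the helper names `fit_line` and `k_fold_mae`, and the toy data are all hypothetical, and a real ETA model would use many more features and likely a time-based split.

```python
def fit_line(xs, ys):
    """Ordinary least squares with one feature; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    var_x = sum((x - mean_x) ** 2 for x in xs)
    if var_x == 0:
        # Degenerate case: every x is identical, so fall back to the mean.
        return 0.0, mean_y
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = cov / var_x
    return slope, mean_y - slope * mean_x


def k_fold_mae(xs, ys, k=3):
    """Mean absolute error per fold, holding out every k-th row in turn."""
    rows = list(zip(xs, ys))
    fold_scores = []
    for fold in range(k):
        train = [row for i, row in enumerate(rows) if i % k != fold]
        held_out = [row for i, row in enumerate(rows) if i % k == fold]
        slope, intercept = fit_line([x for x, _ in train], [y for _, y in train])
        errors = [abs(slope * x + intercept - y) for x, y in held_out]
        fold_scores.append(sum(errors) / len(errors))
    return fold_scores


# Hypothetical toy data: trip distance in km and observed ETA in minutes.
distance_km = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
eta_minutes = [5.0, 7.0, 9.0, 11.0, 13.0, 15.0]
scores = k_fold_mae(distance_km, eta_minutes, k=3)
print(scores)
```

Reporting one score per fold, rather than a single average, makes it easier to discuss how stable the model is across different slices of the data.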


Try out solving a modeling question on Interview Query!


Business Case Questions

Similar to product questions, business case problems tackle a problem specific to the business. Common topics involve having candidates assess the best option for certain business plans and formulate a process for solving a specific problem. Other examples could include estimation and calculation, as well as applying problem solving to a larger case.

As with the product variant, it is helpful to read up on the interviewing company’s products and ventures beforehand to have some exposure to possible topics.


Example business case question:


You work as a data scientist for a ride-sharing company. An executive asks how you would evaluate whether a 50% rider discount promotion is a good or bad idea. How would you implement it? What metrics would you track?
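
One way to make the metrics part of this question concrete is to compare a control group with a group that received the discount. The sketch below is only an illustration: the helper names `summarize_group` and `promotion_lift`, the two chosen metrics, and every number are invented for the example, and a real analysis would also need to check statistical significance and longer-term retention.

```python
def summarize_group(riders, rides, gross_revenue, discount_cost=0.0):
    """Per-rider metrics for one experiment group (hypothetical helper)."""
    return {
        "rides_per_rider": rides / riders,
        "net_revenue_per_rider": (gross_revenue - discount_cost) / riders,
    }


def promotion_lift(control, treatment):
    """Relative change of each metric in the treatment group versus control."""
    return {key: treatment[key] / control[key] - 1.0 for key in control}


# Made-up numbers: gross revenue is measured before the discount is applied,
# and discount_cost is the amount the company gives back to riders.
control = summarize_group(riders=1000, rides=2000, gross_revenue=20000.0)
treatment = summarize_group(
    riders=1000, rides=3000, gross_revenue=30000.0, discount_cost=15000.0
)
lift = promotion_lift(control, treatment)
print(lift)
```

With these invented figures, rides per rider rise while net revenue per rider falls, which is the kind of trade-off a candidate would be expected to talk through rather than resolve with a single number.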

For an in-depth example of a case study question, go through the “Amazon Business Intelligence Case Question: Duplicate Products” article on Interview Query!

Framework for Data Science Case Studies

There are four main steps to tackling every data science case study problem, regardless of the type: clarify, make assumptions, gather context, and provide data points and analysis.


Clarify

The first step is used to gather more information. More often than not, these case studies are designed to be confusing! There will be unorganized data, intentionally padded with extraneous information or missing key details, so the candidate’s job in this step is to even out this inherent disadvantage. Interviewers will observe how candidates ask questions and carry the answers through into their solution.

For example, with a product question, you might take into consideration:


  • What is the product?

  • How does the product work?

  • How does the product align with the business itself?


Make Assumptions

The next step is where the thought process really starts to be outlined. With all the data provided, it’s important to start investigating and discarding possible hypotheses. Developing insights here is complementary to the ability to fine-tune and glean information from the previous step, and the understanding gained there is paramount to forming a successful hypothesis. For simplicity’s sake, let’s continue with the product line of questioning.

In this step, some important questions to evaluate and draw conclusions from include:


  • Who uses the product? Why?

  • What are the goals of the product?


The goal of this step is to reduce the scope of the problem at hand and to ask the interviewer questions upfront that allow you to tackle the meat of the problem instead of focusing on random edge cases.


Hypothesize and Propose a Solution

Now that a hypothesis is formed, gathering context is the next step towards fleshing out an answer. This is where the problem should be reframed given the new information gathered in the last two steps.

Remember that there isn’t an expected singular solution, and as such, there is a certain freedom here to determine the exact path for investigation. Consider how to define different metrics in the context of the problem.

Provide Data Points and Analysis

Finally, providing data points and analysis involves choosing and prioritizing a main metric. As with all prior factors, this step must be tied back to the hypothesis and the main goal of the problem. From there, it’s important to trace through and analyze different examples, starting from the main metric, in order to validate the hypothesis.

Final Breakdown + Tips

The last topic to touch upon would be the general format of these case studies. Unfortunately, this is company-specific: some prefer live settings, where candidates actively work through a prompt after receiving it, while others offer some period of time (say, a week) before settling in for a presentation of the findings.

Note: in some special cases, solutions will also be assessed on the ability to convey information in layman’s terms. Regardless of the structure, applicants should always be prepared to work through the framework outlined above in order to answer the prompt.

There have been multiple articles and discussions by interviewers who run the data science case study portion, and they all boil down success in this stage to one main factor: effective communication.


All the analysis in the world isn’t going to help if interviewees cannot verbally work through and highlight their thought process within the case study. Again, the main highlights in this section of the hiring process are well-developed “soft skills” and problem-solving capabilities. Demonstrating those traits is key to succeeding in this round.

To this end, the best advice possible would be to practice actively going through example case studies, such as those available in the Interview Query question bank. Exploring different topics with a friend in an interview-like setting with cold recall (no Googling in between!) will be uncomfortable and awkward, but it’ll also help reveal weaknesses in fleshing out the investigation.

Don’t worry if the first few times are terrible! Developing a rhythm will help with gaining confidence in assessing and learning through these sessions.

As always, feel free to check us out at Interview Query for more tips and practice!


Thanks for Reading

Originally published at https://www.interviewquery.com on July 23, 2020.


Translated from: https://towardsdatascience.com/acing-the-data-science-case-study-interview-a296e726ddb4
