Daily Practice | Data Scientist & Business Analyst & LeetCode Interview Questions 902...

Jun. 30

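Since June 15, 2017, Data Application Lab has shared and discussed one common interview question from data science (DS) and business analytics (BA) with you every day.

Since October 4, 2017, we have also shared one LeetCode algorithm problem every day.

If you are actively looking for work in these fields, we hope you will follow our questions each day and think them through with us. We post the answers the following day.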

Day 802

DS Interview Question

How can you ensure that you don’t analyse something that ends up producing meaningless results?

BA Interview Question

Trips and Users

The Trips table holds all taxi trips. Each trip has a unique Id, while Client_Id and Driver_Id are both foreign keys to Users_Id in the Users table. Status is an ENUM type of (‘completed’, ‘cancelled_by_driver’, ‘cancelled_by_client’).
+----+-----------+-----------+---------+--------------------+----------+
| Id | Client_Id | Driver_Id | City_Id |        Status      |Request_at|
+----+-----------+-----------+---------+--------------------+----------+
| 1  |     1     |    10     |    1    |     completed      |2013-10-01|
| 2  |     2     |    11     |    1    | cancelled_by_driver|2013-10-01|
| 3  |     3     |    12     |    6    |     completed      |2013-10-01|
| 4  |     4     |    13     |    6    | cancelled_by_client|2013-10-01|
| 5  |     1     |    10     |    1    |     completed      |2013-10-02|
| 6  |     2     |    11     |    6    |     completed      |2013-10-02|
| 7  |     3     |    12     |    6    |     completed      |2013-10-02|
| 8  |     2     |    12     |    12   |     completed      |2013-10-03|
| 9  |     3     |    10     |    12   |     completed      |2013-10-03|
| 10 |     4     |    13     |    12   | cancelled_by_driver|2013-10-03|
+----+-----------+-----------+---------+--------------------+----------+

The Users table holds all users. Each user has a unique Users_Id, and Role is an ENUM type of (‘client’, ‘driver’, ‘partner’).
+----------+--------+--------+
| Users_Id | Banned |  Role  |
+----------+--------+--------+
|    1     |   No   | client |
|    2     |   Yes  | client |
|    3     |   No   | client |
|    4     |   No   | client |
|    10    |   No   | driver |
|    11    |   No   | driver |
|    12    |   No   | driver |
|    13    |   No   | driver |
+----------+--------+--------+


Write a SQL query to find the cancellation rate of requests made by unbanned users between Oct 1, 2013 and Oct 3, 2013. For the above tables, your SQL query should return the following rows with the cancellation rate being rounded to two decimal places.

+------------+-------------------+
|     Day    | Cancellation Rate |
+------------+-------------------+
| 2013-10-01 |       0.33        |
| 2013-10-02 |       0.00        |
| 2013-10-03 |       0.50        |
+------------+-------------------+

LeetCode Question

Combinations

Description:

Given two integers n and k, return all possible combinations of k numbers out of 1 … n.

Input: n = 4 and k = 2

Output: [[2,4],[3,4],[2,3],[1,2],[1,3],[1,4]]

Day 801

Answers Revealed

DS Interview Question & Answer

How do data management procedures like missing data handling make selection bias worse?

Missing value treatment is one of the primary tasks a data scientist performs before starting data analysis, and there are multiple methods for it. If it is not done properly, it can introduce selection bias. Let's look at a few missing value treatments and their impact on selection:

Complete Case Treatment: Complete case treatment removes an entire row from the data even if only one value is missing. This introduces selection bias if the values are not missing at random and instead follow some pattern. Assume you are conducting a survey and a few people didn't specify their gender. Would you remove all those people? Couldn't their answers tell a different story?

Available Case Analysis: Say you are computing a correlation matrix. For each correlation coefficient you might drop only the rows with missing values in the two variables that coefficient needs. In this case the coefficients are not fully consistent with each other, because each one is computed from a different subset of the observations.

Mean Substitution: In this method missing values are replaced with the mean of the other available values. This can distort the distribution: the mean is preserved, but the standard deviation shrinks, and correlation and regression estimates that depend on the variables' spread are biased as a result.

Hence, data management procedures can introduce selection bias into your data if they are not chosen carefully.
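The effect of complete case treatment and mean substitution can be sketched with a few lines of Python. The income figures below are made up for illustration, and the example assumes that the three highest earners skipped the question, so the values are not missing at random. The helper names `complete_cases` and `mean_substitute` are hypothetical, not from any library.

```python
from statistics import mean, pstdev

# Hypothetical survey incomes (in thousands); the "true" values are
# normally unknown and are shown here only to measure the bias.
true_income = [20, 30, 40, 50, 60, 70, 80, 90]

# Assumption: the three highest earners did not answer (not missing at random).
observed = [20, 30, 40, 50, 60, None, None, None]


def complete_cases(values):
    """Complete case treatment: drop every missing entry."""
    return [v for v in values if v is not None]


def mean_substitute(values):
    """Mean substitution: replace missing entries with the observed mean."""
    fill = mean(complete_cases(values))
    return [fill if v is None else v for v in values]


for label, data in [
    ("true values", true_income),
    ("complete cases", complete_cases(observed)),
    ("mean substitution", mean_substitute(observed)),
]:
    print(f"{label:>18}: mean={mean(data):.1f}  sd={pstdev(data):.2f}")
```

Under this assumption both treatments underestimate the mean (40 instead of 55), and mean substitution shrinks the standard deviation further than dropping the rows does.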

BA Interview Question & Answer

Rising Temperature

Given a Weather table, write a SQL query to find the Ids of all dates whose temperature is higher than that of the previous day (yesterday).
+---------+------------------+------------------+
| Id(INT) | RecordDate(DATE) | Temperature(INT) |
+---------+------------------+------------------+
|       1 |       2015-01-01 |               10 |
|       2 |       2015-01-02 |               25 |
|       3 |       2015-01-03 |               20 |
|       4 |       2015-01-04 |               30 |
+---------+------------------+------------------+

For example, return the following Ids for the above Weather table:
+----+
| Id |
+----+
|  2 |
|  4 |
+----+

Answer:

SELECT DISTINCT w1.Id
FROM Weather AS w1, Weather AS w2
WHERE w1.Temperature > w2.Temperature
  AND w1.RecordDate = DATE_ADD(w2.RecordDate, INTERVAL 1 DAY);

Reference:

https://leetcode.com/problems/rising-temperature/description/
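To try the self-join idea locally, here is a minimal sketch using Python's built-in sqlite3 module. Note that SQLite has no DATE_ADD, so this adaptation uses SQLite's `date(..., '+1 day')` instead, and it assumes RecordDate is stored as ISO `YYYY-MM-DD` text. It is an equivalent query for SQLite, not the MySQL answer itself.

```python
import sqlite3

# In-memory database loaded with the sample Weather rows from the question.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Weather (Id INTEGER PRIMARY KEY, RecordDate TEXT, Temperature INTEGER)"
)
conn.executemany(
    "INSERT INTO Weather VALUES (?, ?, ?)",
    [
        (1, "2015-01-01", 10),
        (2, "2015-01-02", 25),
        (3, "2015-01-03", 20),
        (4, "2015-01-04", 30),
    ],
)

# Self-join: pair each day (w1) with the day before it (w2),
# then keep the pairs where the temperature went up.
query = """
SELECT w1.Id
FROM Weather AS w1
JOIN Weather AS w2
  ON w1.RecordDate = date(w2.RecordDate, '+1 day')
WHERE w1.Temperature > w2.Temperature
ORDER BY w1.Id
"""

rising_ids = [row[0] for row in conn.execute(query)]
print(rising_ids)
```

Joining on the date rather than on consecutive Ids matters because nothing in the table guarantees that Ids follow calendar order.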

LeetCode Question & Answer

Array DFS Subsets

Description:

Given a set of distinct integers, nums, return all possible subsets.

Input: [1,2,3]

Output: [[],[1],[1,2],[1,2,3],[1,3],[2],[2,3],[3]]

Assumptions:

The solution set must not contain duplicate subsets.

Solution:

The subsets problem is a depth-first search traversal of an implicit graph. Because the integers are distinct, no deduplication is needed.

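Time Complexity: O(2 ^ n) subsets are generated (O(n * 2 ^ n) if the cost of copying each subset is counted)

Space Complexity: O(n) for the recursion stack and the current path, excluding the output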
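The depth-first search described above can be sketched in Python as follows. This is one common backtracking formulation, offered as an illustration rather than as the original solution; the names `subsets`, `dfs`, and `path` are chosen here for the example.

```python
def subsets(nums):
    """Return all subsets of a list of distinct integers via DFS."""
    result = []
    path = []  # the subset being built along the current DFS branch

    def dfs(start):
        # Every node of the implicit graph is itself a valid subset.
        result.append(path[:])
        # Extend the subset only with elements after `start`,
        # so each subset is generated exactly once.
        for i in range(start, len(nums)):
            path.append(nums[i])
            dfs(i + 1)
            path.pop()  # backtrack

    dfs(0)
    return result


print(subsets([1, 2, 3]))
```

For the input [1,2,3] this visits the subsets in the same order as the sample output: [[],[1],[1,2],[1,2,3],[1,3],[2],[2,3],[3]].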
