16 Interview Questions Every Machine Learning Enthusiast Should Know

A Machine Learning Engineer has to cover a breadth of concepts in ML, DL, probability, statistics, and coding, with a good depth of understanding. A Machine Learning Engineer interview is less about 'what' and more about 'why' and 'how', so I have limited this discussion to 2–3 'what' questions and focused on 'why' and 'how' questions.

1) How do we use k-NN for classification and regression?

A) For classification, take a majority vote among the k nearest neighbors; for regression, take the mean or median of the k neighbors' target values.
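A minimal sketch with scikit-learn, on made-up toy data:

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier, KNeighborsRegressor

# Toy data, purely for illustration.
rng = np.random.default_rng(0)
X = rng.random((100, 2))
y_cls = (X[:, 0] > 0.5).astype(int)   # binary labels
y_reg = X.sum(axis=1)                 # continuous targets

# Classification: the predicted label is the majority vote of the 5 nearest neighbors.
clf = KNeighborsClassifier(n_neighbors=5).fit(X, y_cls)

# Regression: the predicted value is the mean of the 5 nearest neighbors' targets.
reg = KNeighborsRegressor(n_neighbors=5).fit(X, y_reg)

print(clf.predict([[0.9, 0.1]]), reg.predict([[0.9, 0.1]]))
```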

2) Why do we use the word 'Regression' in Logistic Regression even though we use it for classification?

A) Logistic Regression first predicts a continuous output between 0 and 1, which we can interpret as the probability of the point belonging to the positive class; we then threshold that output to get a class label.

  • For example, if sigmoid(W·Xq) > 0.5 we label the point as positive, else negative.
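A tiny numpy sketch of that decision rule (the weights W and query point Xq below are made-up values):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

W = np.array([0.8, -0.4])    # hypothetical learned weights
Xq = np.array([2.0, 1.0])    # hypothetical query point

p = sigmoid(W @ Xq)          # continuous output in (0, 1): P(y=1 | Xq)
label = 1 if p > 0.5 else 0  # thresholding turns the "regression" output into a class
print(p, label)
```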

3) Explain the intuition behind Boosting.

A) Train a base model h0(x) on the training data; this base model is deliberately chosen to have high bias, so it makes a relatively large number of training errors.

  • Then store the errors (residuals).

  • Train the next model on the errors made by the previous model.

  • If we keep doing this, each time we get residual errors and try to predict them with the next model, so the final model is Fi(x) = a0·h0(x) + a1·h1(x) + … + ai·hi(x). (A code sketch of this loop follows.)
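A minimal sketch of the residual-fitting loop, using depth-1 decision trees as the high-bias base models (toy data; a constant stage weight a_i = 0.1 is assumed for simplicity):

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

# Toy regression data: learn y = x^2.
rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(200, 1))
y = X[:, 0] ** 2

n_rounds, a = 50, 0.1           # number of stages and a constant stage weight a_i
models, residual = [], y.copy()

for _ in range(n_rounds):
    h = DecisionTreeRegressor(max_depth=1).fit(X, residual)  # fit the current errors
    models.append(h)
    residual -= a * h.predict(X)    # errors left for the next model to explain

# Final model F(x) = a0*h0(x) + a1*h1(x) + ... + ai*hi(x)
F = sum(a * h.predict(X) for h in models)
print("train MSE:", np.mean((y - F) ** 2))
```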

4) What does it mean for the precision of a model to equal zero? Is it possible to have precision equal to 0?

A) Precision represents, out of all predicted positives, how many are actually positive.

precision = (True positives) / (True positives + False positives)

  • Precision equals 0 when every point the model predicts as positive is a false positive, i.e., there are no true positives.
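A quick check of that degenerate case with scikit-learn (made-up labels):

```python
from sklearn.metrics import precision_score

y_true = [0, 0, 0, 1]   # one actual positive
y_pred = [1, 1, 1, 0]   # every predicted positive is a false positive
print(precision_score(y_true, y_pred))  # 0.0 -> no true positives
```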

5) Why do we need calibration?

  • Calibration is a must if a probabilistic class label is needed as output.

  • If the metric is log-loss, which needs the P(Y_i|X_i) values, then calibration is a must.

  • The probabilities output by models such as LR and naive Bayes are often NOT well calibrated, which can be observed in a calibration plot. Hence, we use calibration as a post-processing step to ensure that the final class probabilities are well calibrated. (See the sketch after this list.)
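A sketch with scikit-learn's CalibratedClassifierCV on synthetic data, assuming Platt (sigmoid) scaling as the calibration method:

```python
from sklearn.calibration import CalibratedClassifierCV
from sklearn.datasets import make_classification
from sklearn.metrics import log_loss
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

X, y = make_classification(n_samples=2000, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

raw = GaussianNB().fit(X_tr, y_tr)
# Calibration as a post-processing step wrapped around the base model.
cal = CalibratedClassifierCV(GaussianNB(), method="sigmoid", cv=5).fit(X_tr, y_tr)

print("raw log-loss:       ", log_loss(y_te, raw.predict_proba(X_te)))
print("calibrated log-loss:", log_loss(y_te, cal.predict_proba(X_te)))
```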

6) Where is parameter sharing seen in deep learning?

A) Parameter sharing is the sharing of weights by all neurons in a particular feature map. A CNN uses the same weight vector (kernel) to perform the convolution operation at every spatial location, and an RNN uses the same weights at every time step, as the snippet below illustrates for a CNN.
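A Keras sketch of this (kernel size and channel counts are arbitrary): because the 3×3 kernel is reused at every spatial position, the parameter count of a conv layer does not grow with the input's height and width.

```python
import tensorflow as tf

small = tf.keras.Sequential([tf.keras.Input(shape=(8, 8, 3)),
                             tf.keras.layers.Conv2D(16, (3, 3))])
large = tf.keras.Sequential([tf.keras.Input(shape=(64, 64, 3)),
                             tf.keras.layers.Conv2D(16, (3, 3))])

# Same parameter count either way: 3*3*3*16 weights + 16 biases = 448.
print(small.count_params(), large.count_params())
```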

7) How many parameters does an LSTM have?

A) 4(mn + m² + m), where n is the input dimension and m is the number of LSTM units: each of the four gates has an m×n input weight matrix, an m×m recurrent weight matrix, and an m-dimensional bias.
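A quick Keras check of the formula (n and m below are arbitrary; TensorFlow backend assumed):

```python
import tensorflow as tf

n, m = 32, 64  # input dimension and number of LSTM units
layer = tf.keras.layers.LSTM(m)
layer.build((None, 10, n))       # (batch, timesteps, features)

print(layer.count_params())      # as counted by Keras
print(4 * (m * n + m * m + m))   # 4(mn + m^2 + m) = 24832
```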

8) What is the Box-Cox transform? When can it be used?

  • The Box-Cox transform helps us convert non-Gaussian-distributed variables into approximately Gaussian-distributed variables.

  • It is a good idea to apply it if your model expects Gaussian-distributed features (e.g., Gaussian Naive Bayes), as sketched below.
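A minimal SciPy sketch on synthetic skewed data (note that Box-Cox requires strictly positive inputs):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
x = rng.exponential(scale=2.0, size=1000)   # right-skewed, strictly positive

x_bc, lam = stats.boxcox(x)                 # transformed data and fitted lambda
print("lambda:", lam)
print("skewness before:", stats.skew(x), "after:", stats.skew(x_bc))
```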

9) How do we use the K-S test to check whether two random variables X1 and X2 follow the same distribution?

  • Plot the empirical CDF of both random variables.

  • Assume the null hypothesis: the two random variables come from the same distribution.

  • Take the test statistic D = sup |CDF(X1) − CDF(X2)| over the whole CDF range.

  • The null hypothesis is rejected when D > c(α) · sqrt((n + m)/(n·m)),

  • where m and n are the numbers of observations behind CDF1 and CDF2 respectively. (A SciPy sketch follows.)
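In practice the whole procedure is one SciPy call (synthetic samples below):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
x1 = rng.normal(0.0, 1.0, size=500)
x2 = rng.normal(0.3, 1.0, size=400)   # slightly shifted distribution

# Two-sample K-S test: D = sup |ECDF(X1) - ECDF(X2)|.
D, p_value = stats.ks_2samp(x1, x2)
print(D, p_value)   # reject the null hypothesis if p_value < alpha (e.g., 0.05)
```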

10) Explain the backpropagation mechanism in dropout layers.

  • While training a neural network with dropout, the output is calculated without the neurons selected to be dropped; their weights keep the values they had in previous iterations, and those weights are not updated during backpropagation.

  • Note that the weights do not become zero; they are simply ignored for that iteration. (A toy sketch follows.)
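A toy numpy sketch of the idea, using inverted dropout (the variant common frameworks implement):

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.normal(size=5)             # activations of a hidden layer
p_keep = 0.8

mask = rng.random(5) < p_keep      # dropped units get mask 0 for this iteration
out = a * mask / p_keep            # inverted dropout rescales the survivors

# Backward pass: the same mask gates the gradient, so dropped units contribute
# nothing and the weights feeding them receive no update this iteration.
grad_out = np.ones(5)              # pretend upstream gradient
grad_a = grad_out * mask / p_keep
print(mask, grad_a)
```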

11) Find the output shape and number of parameters after the following operations:

(7,7,512) ⇒ Flatten ⇒ Dense(512)

(7,7,512) ⇒ Conv(512, (7,7))

  • For (7,7,512) ⇒ Flatten ⇒ Dense(512): trainable parameters = (7*7*512)*512 = 12,845,056 (weights only, excluding biases); output shape = 512.

  • For (7,7,512) ⇒ Conv(512, (7,7)): trainable parameters = (7*7*512)*512 = 12,845,056 (weights only, excluding biases); output shape = (1,1,512). Note that both configurations have exactly the same number of weights; a Keras check follows.
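A quick Keras check of both counts (biases disabled to match the 12,845,056 figure above):

```python
import tensorflow as tf

m1 = tf.keras.Sequential([
    tf.keras.Input(shape=(7, 7, 512)),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(512, use_bias=False),
])
m2 = tf.keras.Sequential([
    tf.keras.Input(shape=(7, 7, 512)),
    tf.keras.layers.Conv2D(512, (7, 7), use_bias=False),
])

print(m1.output_shape, m1.count_params())   # (None, 512)        12845056
print(m2.output_shape, m2.count_params())   # (None, 1, 1, 512)  12845056
```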

12) How will you calculate P(x|y=0) when x is a continuous random variable?

A) If x is a numerical feature, assume that the feature follows a normal distribution (as Gaussian Naive Bayes does). We can then obtain the likelihood from the probability density function (PDF), since the absolute probability of any exact value of a continuous variable is zero.
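A small SciPy sketch (the class-0 feature values below are made up):

```python
import numpy as np
from scipy import stats

x_y0 = np.array([1.2, 0.8, 1.5, 1.1, 0.9])   # hypothetical values of x where y = 0

mu, sigma = x_y0.mean(), x_y0.std(ddof=1)    # fit the class-conditional Gaussian

# Use the PDF value as the likelihood P(x | y=0): the probability of any
# exact value of a continuous variable is zero, so we work with densities.
print(stats.norm.pdf(1.0, loc=mu, scale=sigma))
```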

13) Explain correlation and covariance.

A) Covariance shows the direction of the linear relationship between two variables, but its magnitude is scale-dependent, so we cannot interpret how strongly they are related. Correlation normalizes covariance and gives both the strength and the direction of the linear relationship between the two variables.
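A numpy sketch of the difference (synthetic data):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=1000)
y = 2 * x + rng.normal(scale=0.5, size=1000)

print(np.cov(x, y)[0, 1], np.corrcoef(x, y)[0, 1])          # covariance, correlation
print(np.cov(x, 2 * y)[0, 1], np.corrcoef(x, 2 * y)[0, 1])  # rescaling y doubles the
                                                            # covariance; correlation
                                                            # is unchanged
```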

14) What is the problem with sigmoid during backpropagation?

A) The derivative of the sigmoid function lies between 0 and 0.25. When gradients are multiplied through the chain rule across many layers, the product tends to zero, which causes the vanishing-gradient problem and stalls weight updates in the earlier layers.
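A numpy sketch showing both the 0.25 bound and how quickly the chain-rule product shrinks:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

z = np.linspace(-10, 10, 1001)
d = sigmoid(z) * (1 - sigmoid(z))   # sigma'(z) = sigma(z) * (1 - sigma(z))
print(d.max())                      # 0.25, attained at z = 0

# Even in the best case, 20 stacked sigmoids scale the gradient by at most:
print(0.25 ** 20)                   # ~9.1e-13 -> vanishing gradient
```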

15) What is the difference between micro-average F1 and macro-average F1 for multiclass classification?

  • F1 score = 2 * precision * recall / (precision + recall)

  • For 3-class classification, each class has its own true positives, false positives, true negatives, and false negatives.

a) Micro-average method

  • Micro-average precision = (TP1 + TP2 + TP3) / (TP1 + TP2 + TP3 + FP1 + FP2 + FP3)

  • Micro-average recall = (TP1 + TP2 + TP3) / (TP1 + TP2 + TP3 + FN1 + FN2 + FN3)

  • Micro-average F1 = 2 * precision * recall / (precision + recall)

b) Macro-average method

  • Macro-average precision = (P1 + P2 + P3) / 3

  • Macro-average recall = (R1 + R2 + R3) / 3

  • where P1 = TP1/(TP1 + FP1) and R1 = TP1/(TP1 + FN1), and similarly for P2, P3, R2, R3.

  • Macro-average F1 = 2 * precision * recall / (precision + recall). (A scikit-learn check follows.)
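A scikit-learn check on made-up 3-class labels (note that scikit-learn's macro F1 is the unweighted mean of the per-class F1 scores, a common variant of averaging precision and recall first):

```python
from sklearn.metrics import f1_score

y_true = [0, 0, 1, 1, 2, 2, 2, 1]
y_pred = [0, 1, 1, 1, 2, 0, 2, 2]

# Micro: pool TP/FP/FN over all classes, then compute one F1 score.
print(f1_score(y_true, y_pred, average="micro"))

# Macro: average per-class scores so every class counts equally.
print(f1_score(y_true, y_pred, average="macro"))
```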

16) Why does image augmentation help in image classification tasks?

A) Image data augmentation creates or expands training data by artificially generating new images from transformed input images, for example by translation, scaling, mirroring, rotation, or zooming, so that the model becomes robust to such changes in the input.
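A sketch with Keras preprocessing layers, one common way to implement this (the transforms and factors below are arbitrary):

```python
import tensorflow as tf

augment = tf.keras.Sequential([
    tf.keras.layers.RandomFlip("horizontal"),      # mirroring
    tf.keras.layers.RandomTranslation(0.1, 0.1),   # translation
    tf.keras.layers.RandomRotation(0.1),           # rotation
    tf.keras.layers.RandomZoom(0.2),               # zoom
])

images = tf.random.uniform((8, 224, 224, 3))       # a fake batch for illustration
augmented = augment(images, training=True)         # new, randomly transformed images
print(augmented.shape)
```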

www.appliedaicourse.com

Translated from: https://medium.com/towards-artificial-intelligence/16-interview-questions-every-machine-learning-enthusiast-should-know-a4142d5e00cc
