Data science with Python: 8 ways to do linear regression and measure their speed

by Tirthajyoti Sarkar

In this article, we discuss 8 ways to perform simple linear regression using Python code and packages. We briefly go over their pros and cons and compare their relative computational cost.

For many data scientists, linear regression is the starting point of many statistical modeling and predictive analysis projects. The importance of fitting a linear model to a large data set, accurately and quickly, cannot be overstated. As pointed out in this article, the term "linear" in a linear regression model refers to the coefficients, not to the degree of the features.

Features (or independent variables) can be of any degree, or even transcendental functions such as exponentials, logarithms, and sinusoids. Thus, a large body of natural phenomena can be modeled (approximately) using these transformations and a linear model, even if the functional relationship between the output and the features is highly nonlinear.
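
To make the point above concrete, here is a minimal sketch, assuming NumPy is available. The data, the choice of a logarithmic and a sinusoidal feature, and the coefficient values are all hypothetical and chosen only for illustration. The relationship between x and y is nonlinear, yet the model is linear in its coefficients, so ordinary least squares can recover them.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the sketch is reproducible

# Hypothetical data: y is nonlinear in x but linear in the coefficients.
x = np.linspace(1.0, 10.0, 200)
true_coefs = np.array([2.0, 3.0, -1.5])  # intercept, log term, sine term
y = true_coefs[0] + true_coefs[1] * np.log(x) + true_coefs[2] * np.sin(x)
y_noisy = y + rng.normal(scale=0.05, size=x.size)

# Design matrix: each column is a (possibly nonlinear) transformation of x.
X = np.column_stack([np.ones_like(x), np.log(x), np.sin(x)])

# Ordinary least squares on the transformed features.
coefs, residuals, rank, singular_values = np.linalg.lstsq(X, y_noisy, rcond=None)

print("estimated coefficients:", coefs)
```

The estimated coefficients should land close to the hypothetical true values, because the fit only has to be linear in the unknowns, not in x itself.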

On the other hand, Python is fast emerging as the de facto programming language of choice for data scientists. It is therefore critical for a data scientist to be aware of the various methods available to quickly fit a linear model to a fairly large data set and assess the relative importance of each feature in the outcome.

However, is there only one way to perform linear regression analysis in Python? And if several options are available, how do you choose the most effective one?

Because of the wide popularity of the machine learning library scikit-learn, a common approach is to call the linear model class from that library and fit the data. This offers the additional advantage of access to other machine learning pipeline features (e.g. data normalization, regularization of the model coefficients, feeding the linear model to another downstream model). However, it is often not the fastest or cleanest method when a data analyst just needs a quick and easy way to determine the regression coefficients (and some basic associated statistics).
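
As one possible illustration of the "quick and easy" route mentioned above, the sketch below uses NumPy's `polyfit` with degree 1 to get the slope and intercept, then computes R² by hand as a basic associated statistic. The synthetic data and variable names are assumptions made for this example, and `polyfit` is only one of several lightweight options; the scikit-learn route would instead go through `sklearn.linear_model.LinearRegression`.

```python
import numpy as np

rng = np.random.default_rng(42)  # fixed seed for reproducibility

# Hypothetical data: a straight line with Gaussian noise.
x = rng.uniform(0.0, 10.0, size=1000)
y = 4.0 + 2.5 * x + rng.normal(scale=0.5, size=x.size)

# Degree-1 polynomial fit; coefficients are returned highest power first.
slope, intercept = np.polyfit(x, y, deg=1)

# Coefficient of determination (R^2), computed from the residuals.
y_pred = intercept + slope * x
ss_res = np.sum((y - y_pred) ** 2)
ss_tot = np.sum((y - y.mean()) ** 2)
r_squared = 1.0 - ss_res / ss_tot

print(f"slope={slope:.3f}, intercept={intercept:.3f}, R^2={r_squared:.4f}")
```

A single function call returns the coefficients here, with no estimator object to construct, which is the kind of convenience the paragraph above has in mind.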
