Distributed Implementation of Logistic Regression [Logistic Regression / Machine Learning / Spark]

weixin_30940783

Tags: big data, data structures and algorithms, python

Original link: http://www.cnblogs.com/freyr/p/4501039.html

1- Problem Statement

2- Logistic Regression

3- Theoretical Derivation
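
For reference, the batch gradient-descent update that the implementation in section 4 performs can be written as follows. The notation is an assumption made here: m samples, each feature vector x^{(i)} carrying a leading 1 for the intercept, label y^{(i)}, and learning rate \alpha.

```latex
\begin{aligned}
h_\theta(x) &= \frac{1}{1 + e^{-\theta^{T} x}} \\
J(\theta) &= -\sum_{i=1}^{m} \Big[ y^{(i)} \log h_\theta(x^{(i)}) + \big(1 - y^{(i)}\big) \log\big(1 - h_\theta(x^{(i)})\big) \Big] \\
\frac{\partial J}{\partial \theta_j} &= \sum_{i=1}^{m} \big( h_\theta(x^{(i)}) - y^{(i)} \big)\, x_j^{(i)} \\
\theta_j &:= \theta_j - \alpha \sum_{i=1}^{m} \big( h_\theta(x^{(i)}) - y^{(i)} \big)\, x_j^{(i)}
\end{aligned}
```

The gradient is a plain sum over samples, so each term can be computed independently (a map step) and the terms added together afterwards (a reduce step). That is what makes the update straightforward to distribute.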

4- Python/Spark Implementation

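# -*- coding: utf-8 -*-
from math import exp

from pyspark import SparkContext

theta = [0.0, 0.0, 0.0]    # initial theta: intercept plus two feature weights
alpha = 0.001              # learning rate


def inner(x, y):
    # Dot product; zip stops at the shorter sequence, so the trailing label in x is ignored.
    return sum(i * j for i, j in zip(x, y))


def func(lst):
    # lst = [1, x1, x2, label]; returns this sample's contribution to the gradient.
    h = 1.0 / (1.0 + exp(-inner(lst, theta)))
    return [(h - lst[-1]) * x for x in lst[:-1]]


sc = SparkContext('local')

# Parse each line into floats and prepend the constant 1 for the intercept term.
# The RDD is cached because every iteration below reuses it.
rdd = sc.textFile('/home/freyr/logisticRegression.txt')\
        .map(lambda line: [float(v) for v in line.strip().split(',')])\
        .map(lambda lst: [1.0] + lst)\
        .cache()

for _ in range(400):
    # Sum the per-sample gradient terms element-wise.
    partheta = rdd.map(func)\
                  .reduce(lambda x, y: [a + b for a, b in zip(x, y)])

    for j in range(3):
        theta[j] = theta[j] - alpha * partheta[j]

print('theta = %s' % theta)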
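
To sanity-check the Spark job without a cluster, the same update can be run on a single machine. The sketch below is only an illustration: `gradient_term`, `train`, and the four toy rows are hypothetical names and made-up data, not part of the original post. `functools.reduce` stands in for `rdd.map(...).reduce(...)`.

```python
from functools import reduce
from math import exp


def inner(x, y):
    # Dot product; zip stops at the shorter sequence, so a trailing label is ignored.
    return sum(i * j for i, j in zip(x, y))


def gradient_term(row, theta):
    # row = [1, x1, ..., xn, label]; one sample's contribution to the gradient.
    h = 1.0 / (1.0 + exp(-inner(row, theta)))
    return [(h - row[-1]) * x for x in row[:-1]]


def train(rows, alpha=0.001, iterations=400):
    # Batch gradient descent; theta starts at zero, one weight per non-label column.
    theta = [0.0] * (len(rows[0]) - 1)
    for _ in range(iterations):
        # Map each row to its gradient term, then sum the terms element-wise.
        grad = reduce(lambda a, b: [p + q for p, q in zip(a, b)],
                      (gradient_term(r, theta) for r in rows))
        theta = [t - alpha * g for t, g in zip(theta, grad)]
    return theta


# Made-up toy data (assumption): [1, x1, x2, label], linearly separable.
rows = [
    [1.0, 0.5, 1.0, 0.0],
    [1.0, 1.0, 0.5, 0.0],
    [1.0, 3.0, 3.5, 1.0],
    [1.0, 4.0, 3.0, 1.0],
]
theta = train(rows, alpha=0.05, iterations=1000)
print('theta = %s' % theta)
```

Because the reduce step is an element-wise sum, the result does not depend on how the rows are grouped, which is the property the Spark version relies on when it combines partial sums from different partitions.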

PS: data file: logisticRegression.txt

Reposted from: https://www.cnblogs.com/freyr/p/4501039.html
