A Worked Example of Big-Data Processing with Spark

weixin_51992731

Published 2022-01-05 14:55:39


Tags: spark, big data

Article link: https://blog.csdn.net/weixin_51992731/article/details/122323747


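    def get_debt_rank(self):
        '''
        Count debt severity per age bracket (DebtRatio < 1: not severe; >= 1: severe).
        :return: [[[not_severe, severe], ...]] with one pair per age bracket
        '''
        all_list = []
        # Renamed from `bin`, which shadows the Python builtin of the same name.
        bins = [0, 30, 45, 60, 75, 100]
        df_age_debt = df.select(df['age'], df['DebtRatio'])
        age_debt_y = []
        for i in range(len(bins) - 1):
            # Note: between() is inclusive at both ends, so an age that sits
            # exactly on an edge (30, 45, 60, 75) is counted in two brackets.
            in_bracket = df_age_debt.filter(df['age'].between(bins[i], bins[i + 1]))
            y0 = in_bracket.filter(df['DebtRatio'] < 1).count()
            print(y0)
            y1 = in_bracket.filter(df['DebtRatio'] >= 1).count()
            print(y1)
            age_debt_y.append([y0, y1])
        all_list.append(age_debt_y)
        # Data visualization is handled in data_web.py
        return all_list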

Counting debt cases by age bracket
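
The listing above selects each bracket with `between`, which includes both endpoints, so someone aged exactly 30 falls into both the 0-30 and the 30-45 bracket. Below is a minimal plain-Python sketch of one way to avoid that, assuming half-open brackets `[low, high)` are acceptable (under that rule, age 100 itself falls outside the last bracket). The helper names and the sample rows are hypothetical and are not taken from the article's dataset.

```python
from bisect import bisect_right

EDGES = [0, 30, 45, 60, 75, 100]  # same edges as the listing above


def bracket_index(age, edges=EDGES):
    """Index i of the half-open bracket [edges[i], edges[i+1]) containing age, else None."""
    if age < edges[0] or age >= edges[-1]:
        return None
    # bisect_right counts the edges <= age; subtracting 1 gives the bracket index.
    return bisect_right(edges, age) - 1


def count_debt_by_bracket(rows, edges=EDGES):
    """rows: iterable of (age, debt_ratio) pairs.

    Returns [[not_severe, severe], ...] with one pair per bracket,
    where "severe" means debt_ratio >= 1, as in the listing above.
    """
    counts = [[0, 0] for _ in range(len(edges) - 1)]
    for age, debt_ratio in rows:
        i = bracket_index(age, edges)
        if i is not None:
            counts[i][1 if debt_ratio >= 1 else 0] += 1
    return counts


# Hypothetical sample rows (age, DebtRatio) for illustration only.
sample = [(25, 0.3), (30, 1.2), (30, 0.5), (44, 2.0), (60, 0.9), (99, 1.0)]
result = count_debt_by_bracket(sample)
print(result)  # [[1, 0], [1, 2], [0, 0], [1, 0], [0, 1]]
```

Each row lands in exactly one bracket, so the counts add up to the number of in-range rows. In PySpark the same half-open rule can be written as `(df['age'] >= low) & (df['age'] < high)` in place of `between(low, high)`.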

    def XiangGuanXing(self):
        feature=['y','age','30-59days','DebtRatio&#
