Running Federated Learning Code on OpenBayes

Over the holidays, compute constraints forced me onto online GPU platforms, so here is a summary of the ones I have used. These are all personal impressions and highly subjective; if you disagree, then I am the one who is wrong :).

The compute platforms I have tried so far include:

  • openbayes: the GOAT, and the only platform I still use. It is genuinely pleasant to work with; the weekly free RTX 3090 promotion is gone, but registration still comes with free RTX 4090 compute, and the regular prices are honestly not expensive. Good value for money.

  • 九天必昇: early on I got a huge amount of free V100 compute through registration referrals, but it was laggy beyond belief, and later compute became impossible to grab at all, so in practice it was unusable (sorry, China Mobile).

  • 驱动云: another platform with generous free compute and regularly distributed coupons, but it is unstable. Connections drop frequently, so it really is not suited to training large models, and the compute specs are not clearly documented.

  • autodl: also one of the more popular compute platforms right now, and many classmates use it. I simply have not used it much because I am more used to openbayes, but it is a decent platform too.

Register for OpenBayes with my referral link and, as a new user, you get 4 hours of RTX 4090 plus 5 hours of CPU for free, with no expiration:
https://openbayes.com/console/signup?r=Fywoooo_6Yu9

Compute price comparison (pricing screenshots for each platform):

  • openbayes

  • autodl

  • 驱动云 (unclear exactly which cards these are, or what the quoted TFLOPS figures mean)

  • 九天必昇 (compute is simply impossible to grab)

Hands-on experience

These days I mainly use openbayes, so below I will walk through a federated learning example:

1. First, choose the compute type and the image version (PyTorch/TensorFlow, etc.)

Here you can pick a GPU image directly, which is quite friendly for beginners and avoids accidentally installing a CPU-only build.
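Once the container starts, a quick sanity check confirms the image really is a GPU build (this is generic PyTorch, nothing OpenBayes-specific):

```python
import torch

# On a GPU image this prints a CUDA-enabled version string and True
print(torch.__version__)
print(torch.cuda.is_available())
if torch.cuda.is_available():
    # e.g. the allocated RTX 4090
    print(torch.cuda.get_device_name(0))
```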

Then you can clone the repo directly with git (e.g. `git clone https://github.com/TsingZ0/PFLlib.git`). Here I am using the fairly classic PFLlib: TsingZ0/PFLlib: Personalized federated learning simulation platform with non-IID and unbalanced dataset (github.com)

2. Then just set up the environment following the README. One downside of openbayes is that the environment has to be reconfigured every time you reopen the workspace, and `conda create` tends to hit small issues there, so I personally prefer installing everything directly into the base environment.

3. Following the README, simply run `python generate_mnist.py iid - -`. Here I chose the MNIST dataset with an IID split; in this codebase, `dir` is the Practical non-IID split and `pat` is the Pathological non-IID split (a sketch of the `dir` idea follows the output below). The partition results:

```
(base) root@fywoooo-ubmarzkgq3wf-main:/openbayes/home/PFLlib/dataset# python generate_mnist.py iid - -
Client 0 Size of data: 1882 Labels: [0 1 2 3 4 5 6 7 8 9] Samples of labels: [(0, 71), (1, 280), (2, 83), (3, 300), (4, 336), (5, 67), (6, 137), (7, 253), (8, 111), (9, 244)]
--------------------------------------------------
Client 1 Size of data: 2922 Labels: [0 1 2 3 4 5 6 7 8 9] Samples of labels: [(0, 269), (1, 391), (2, 347), (3, 230), (4, 250), (5, 294), (6, 300), (7, 267), (8, 253), (9, 321)]
--------------------------------------------------
Client 2 Size of data: 1323 Labels: [0 1 2 3 4 5 6 7 8 9] Samples of labels: [(0, 106), (1, 125), (2, 37), (3, 161), (4, 185), (5, 88), (6, 216), (7, 132), (8, 211), (9, 62)]
--------------------------------------------------
Client 3 Size of data: 2471 Labels: [0 1 2 3 4 5 6 7 8 9] Samples of labels: [(0, 289), (1, 180), (2, 314), (3, 314), (4, 217), (5, 271), (6, 232), (7, 61), (8, 337), (9, 256)]
--------------------------------------------------
Client 4 Size of data: 1738 Labels: [0 1 2 3 4 5 6 7 8 9] Samples of labels: [(0, 237), (1, 46), (2, 77), (3, 190), (4, 293), (5, 241), (6, 228), (7, 177), (8, 79), (9, 170)]
--------------------------------------------------
Client 5 Size of data: 2176 Labels: [0 1 2 3 4 5 6 7 8 9] Samples of labels: [(0, 167), (1, 358), (2, 110), (3, 348), (4, 287), (5, 127), (6, 210), (7, 248), (8, 178), (9, 143)]
--------------------------------------------------
Client 6 Size of data: 1949 Labels: [0 1 2 3 4 5 6 7 8 9] Samples of labels: [(0, 178), (1, 356), (2, 60), (3, 118), (4, 296), (5, 300), (6, 88), (7, 152), (8, 268), (9, 133)]
--------------------------------------------------
Client 7 Size of data: 1423 Labels: [0 1 2 3 4 5 6 7 8 9] Samples of labels: [(0, 163), (1, 61), (2, 342), (3, 201), (4, 86), (5, 41), (6, 49), (7, 335), (8, 79), (9, 66)]
--------------------------------------------------
Client 8 Size of data: 1869 Labels: [0 1 2 3 4 5 6 7 8 9] Samples of labels: [(0, 105), (1, 352), (2, 143), (3, 171), (4, 36), (5, 310), (6, 204), (7, 170), (8, 336), (9, 42)]
--------------------------------------------------
Client 9 Size of data: 1768 Labels: [0 1 2 3 4 5 6 7 8 9] Samples of labels: [(0, 271), (1, 40), (2, 331), (3, 323), (4, 110), (5, 183), (6, 54), (7, 241), (8, 97), (9, 118)]
--------------------------------------------------
Client 10 Size of data: 1961 Labels: [0 1 2 3 4 5 6 7 8 9] Samples of labels: [(0, 315), (1, 355), (2, 49), (3, 301), (4, 183), (5, 233), (6, 152), (7, 220), (8, 69), (9, 84)]
--------------------------------------------------
Client 11 Size of data: 1840 Labels: [0 1 2 3 4 5 6 7 8 9] Samples of labels: [(0, 212), (1, 248), (2, 98), (3, 314), (4, 237), (5, 179), (6, 312), (7, 60), (8, 67), (9, 113)]
--------------------------------------------------
Client 12 Size of data: 2177 Labels: [0 1 2 3 4 5 6 7 8 9] Samples of labels: [(0, 310), (1, 303), (2, 230), (3, 178), (4, 297), (5, 171), (6, 148), (7, 173), (8, 164), (9, 203)]
--------------------------------------------------
Client 13 Size of data: 1836 Labels: [0 1 2 3 4 5 6 7 8 9] Samples of labels: [(0, 288), (1, 255), (2, 59), (3, 274), (4, 111), (5, 224), (6, 131), (7, 235), (8, 117), (9, 142)]
--------------------------------------------------
Client 14 Size of data: 2094 Labels: [0 1 2 3 4 5 6 7 8 9] Samples of labels: [(0, 286), (1, 180), (2, 260), (3, 122), (4, 234), (5, 125), (6, 215), (7, 345), (8, 82), (9, 245)]
--------------------------------------------------
Client 15 Size of data: 1961 Labels: [0 1 2 3 4 5 6 7 8 9] Samples of labels: [(0, 190), (1, 154), (2, 316), (3, 316), (4, 109), (5, 91), (6, 44), (7, 361), (8, 322), (9, 58)]
--------------------------------------------------
Client 16 Size of data: 1817 Labels: [0 1 2 3 4 5 6 7 8 9] Samples of labels: [(0, 84), (1, 160), (2, 187), (3, 278), (4, 77), (5, 183), (6, 130), (7, 284), (8, 287), (9, 147)]
--------------------------------------------------
Client 17 Size of data: 1407 Labels: [0 1 2 3 4 5 6 7 8 9] Samples of labels: [(0, 102), (1, 69), (2, 138), (3, 109), (4, 54), (5, 161), (6, 217), (7, 57), (8, 190), (9, 310)]
--------------------------------------------------
Client 18 Size of data: 1544 Labels: [0 1 2 3 4 5 6 7 8 9] Samples of labels: [(0, 249), (1, 110), (2, 56), (3, 225), (4, 64), (5, 251), (6, 90), (7, 332), (8, 89), (9, 78)]
--------------------------------------------------
Client 19 Size of data: 33842 Labels: [0 1 2 3 4 5 6 7 8 9] Samples of labels: [(0, 3011), (1, 3854), (2, 3753), (3, 2668), (4, 3362), (5, 2773), (6, 3719), (7, 3190), (8, 3489), (9, 4023)]
--------------------------------------------------
Total number of samples: 70000
The number of train samples: [1411, 2191, 992, 1853, 1303, 1632, 1461, 1067, 1401, 1326, 1470, 1380, 1632, 1377, 1570, 1470, 1362, 1055, 1158, 25381]
The number of test samples: [471, 731, 331, 618, 435, 544, 488, 356, 468, 442, 491, 460, 545, 459, 524, 491, 455, 352, 386, 8461]
Saving to disk.
Finish generating dataset.
```
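As promised above, here is the idea behind the `dir` (Practical non-IID) split: each class is divided among clients according to proportions drawn from a Dirichlet distribution, so a small `alpha` gives highly skewed per-client label distributions. The sketch below is a simplified illustration, not PFLlib's actual code; the function and variable names are my own.

```python
import numpy as np

def dirichlet_partition(labels, num_clients, alpha=0.1, seed=0):
    """Illustrative 'dir' split: per-class Dirichlet proportions across clients."""
    rng = np.random.default_rng(seed)
    client_idxs = [[] for _ in range(num_clients)]
    for c in np.unique(labels):
        idxs = np.where(labels == c)[0]
        rng.shuffle(idxs)
        # fraction of class c that each client receives
        props = rng.dirichlet(alpha * np.ones(num_clients))
        cuts = (np.cumsum(props)[:-1] * len(idxs)).astype(int)
        for client, part in enumerate(np.split(idxs, cuts)):
            client_idxs[client].extend(part.tolist())
    return client_idxs

# Fake 10-class labels for 70000 samples, just to show the skew
labels = np.random.default_rng(0).integers(0, 10, size=70000)
parts = dirichlet_partition(labels, num_clients=20)
print([len(p) for p in parts])
```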

4. With the dataset generated, go to the main function in the `system` folder, where the relevant arguments can be set as needed:

```python
parser = argparse.ArgumentParser()
# general
parser.add_argument('-go', "--goal", type=str, default="test",
                    help="The goal for this experiment")
parser.add_argument('-dev', "--device", type=str, default="cuda",
                    choices=["cpu", "cuda"])
parser.add_argument('-did', "--device_id", type=str, default="0")
parser.add_argument('-data', "--dataset", type=str, default="mnist")
parser.add_argument('-nb', "--num_classes", type=int, default=10)
parser.add_argument('-m', "--model", type=str, default="cnn")
parser.add_argument('-lbs', "--batch_size", type=int, default=10)
parser.add_argument('-lr', "--local_learning_rate", type=float, default=0.005,
                    help="Local learning rate")
parser.add_argument('-ld', "--learning_rate_decay", type=bool, default=False)
parser.add_argument('-ldg', "--learning_rate_decay_gamma", type=float, default=0.99)
parser.add_argument('-gr', "--global_rounds", type=int, default=2000)
parser.add_argument('-ls', "--local_epochs", type=int, default=1,
                    help="Multiple update steps in one local epoch.")
parser.add_argument('-algo', "--algorithm", type=str, default="FedAvg")
parser.add_argument('-jr', "--join_ratio", type=float, default=1.0,
                    help="Ratio of clients per round")
parser.add_argument('-rjr', "--random_join_ratio", type=bool, default=False,
                    help="Random ratio of clients per round")
parser.add_argument('-nc', "--num_clients", type=int, default=20,
                    help="Total number of clients")
parser.add_argument('-pv', "--prev", type=int, default=0,
                    help="Previous Running times")
parser.add_argument('-t', "--times", type=int, default=1,
                    help="Running times")
parser.add_argument('-eg', "--eval_gap", type=int, default=1,
                    help="Rounds gap for evaluation")
parser.add_argument('-dp', "--privacy", type=bool, default=False,
                    help="differential privacy")
...
```
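One general argparse caveat, not specific to PFLlib: options declared with `type=bool` (such as `--privacy` above) convert any non-empty string to `True`, so passing `-dp False` on the command line actually turns the flag on. A minimal demonstration:

```python
import argparse

p = argparse.ArgumentParser()
p.add_argument('-dp', "--privacy", type=bool, default=False)

# bool("False") is True in Python, so the flag is enabled either way
print(p.parse_args(['-dp', 'False']).privacy)  # True
print(p.parse_args([]).privacy)                # False (only the default is off)
```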
 

5. Run with the default arguments: `python main.py -data mnist -m cnn -algo FedAvg -gr 2000 -did 0`. This first prints the parsed arguments and the model architecture.
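For orientation, the core of the algorithm selected by `-algo FedAvg` is just a sample-weighted average of the clients' model parameters each round. The sketch below illustrates that aggregation step; `client_states` and `num_samples` are illustrative names, not PFLlib's actual API:

```python
import torch

def fedavg_aggregate(client_states, num_samples):
    """Weighted average of client state_dicts by local sample count
    (an illustrative sketch, not PFLlib's implementation)."""
    total = float(sum(num_samples))
    global_state = {}
    for key in client_states[0]:
        # each client's tensor weighted by its share of the training data
        global_state[key] = sum(
            (n / total) * state[key].float()
            for state, n in zip(client_states, num_samples)
        )
    return global_state
```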

The training process then looks like this:

```
-------------Round number: 0-------------
Evaluate global model
Averaged Train Loss: 2.3126
Averaged Test Accurancy: 0.0581
Averaged Test AUC: 0.4349
Std Test Accurancy: 0.0187
Std Test AUC: 0.0450
------------------------- time cost ------------------------- 12.401304006576538
-------------Round number: 1-------------
Evaluate global model
Averaged Train Loss: 0.9367
Averaged Test Accurancy: 0.9300
Averaged Test AUC: 0.9899
Std Test Accurancy: 0.0145
Std Test AUC: 0.0021
------------------------- time cost ------------------------- 11.002075910568237
-------------Round number: 2-------------
Evaluate global model
Averaged Train Loss: 0.1904
Averaged Test Accurancy: 0.9526
Averaged Test AUC: 0.9930
Std Test Accurancy: 0.0108
Std Test AUC: 0.0018
------------------------- time cost ------------------------- 11.484775066375732
-------------Round number: 3-------------
Evaluate global model
Averaged Train Loss: 0.1185
Averaged Test Accurancy: 0.9664
Averaged Test AUC: 0.9951
Std Test Accurancy: 0.0083
Std Test AUC: 0.0013
```
 
