1、Angel
https://github.com/Angel-ML
Angel is written in Java and Scala and runs on YARN and Kubernetes. Through its PS Service abstraction it provides two modules, Spark on Angel and PyTorch on Angel, which combine Spark or PyTorch with a Parameter Server for distributed training. Support for graph computing and deep learning frameworks is under development and is planned for a future release.
2、Kubeflow
a、Distributed training
b、Pipelines
c、Katib (hyperparameter tuning)
3、SQLFlow
4、ElasticDL
5、MLflow
6、AutoML
7、Applied machine learning
8、OAM
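
Note on item 1: the Parameter Server idea that Angel builds on can be illustrated with a small single-process sketch. This is only a toy under stated assumptions, not Angel's API: the class ParameterServer, the functions worker_gradient and train, and the linear-regression task are all hypothetical names chosen for illustration. Workers pull the current weights, compute a gradient on their own data shard, and push that gradient back to the server.

```python
from typing import List, Sequence, Tuple

Sample = Tuple[Sequence[float], float]  # (features, target)


class ParameterServer:
    """Toy in-memory parameter server (hypothetical, not Angel's API)."""

    def __init__(self, dim: int, lr: float) -> None:
        self.weights: List[float] = [0.0] * dim
        self.lr = lr

    def pull(self) -> List[float]:
        # Return a copy so a worker cannot mutate the shared state directly.
        return list(self.weights)

    def push(self, grad: Sequence[float]) -> None:
        # Apply one gradient-descent step with the pushed gradient.
        for i, g in enumerate(grad):
            self.weights[i] -= self.lr * g


def worker_gradient(weights: Sequence[float], shard: Sequence[Sample]) -> List[float]:
    """Mean-squared-error gradient of a linear model on one data shard."""
    grad = [0.0] * len(weights)
    for x, y in shard:
        err = sum(w * xi for w, xi in zip(weights, x)) - y
        for i, xi in enumerate(x):
            grad[i] += 2.0 * err * xi / len(shard)
    return grad


def train(shards: Sequence[Sequence[Sample]], dim: int,
          lr: float = 0.05, rounds: int = 500) -> List[float]:
    """Simulate workers taking turns: pull weights, compute, push gradient."""
    server = ParameterServer(dim, lr)
    for _ in range(rounds):
        for shard in shards:  # each shard stands in for one worker
            weights = server.pull()
            server.push(worker_gradient(weights, shard))
    return server.pull()


if __name__ == "__main__":
    # Two "workers", each holding part of the data for y = 2*x0 + 3*x1.
    shards = [
        [([1.0, 0.0], 2.0), ([0.0, 1.0], 3.0)],
        [([1.0, 1.0], 5.0), ([2.0, 1.0], 7.0)],
    ]
    print(train(shards, dim=2))
```

With the sample data in the main block, the learned weights should approach roughly 2 and 3. A real system such as Angel distributes the server and the workers across machines and handles sharding, synchronization, and fault tolerance, none of which this sketch attempts.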