[Algorithms] Java Algorithm Libraries

I. Summary Table of Java Algorithm Libraries

The table below lists common Java algorithm libraries and frameworks:

| Library/Framework | Breadth | Supported algorithms | Features | CPU/GPU | Learning curve | Dependencies | Typical applications | Strengths | Weaknesses | Outlook |
|---|---|---|---|---|---|---|---|---|---|---|
| Mahout | Medium | Recommendation, clustering, classification, etc. | Distributed machine-learning algorithms | CPU only, no GPU acceleration | Medium | Few | Recommender systems, big-data analytics | Distributed ML algorithms and tooling | No deep learning | Slow development, relatively low community activity |
| Spark MLlib | — | Classification, regression, clustering, etc. | Distributed machine-learning algorithms | CPU only, no GPU acceleration | Medium | Few | Big-data analytics | Rich set of distributed ML algorithms | No deep learning | Promising |
| FlinkML | — | Classification, regression, clustering, etc. | Streaming and batch ML algorithms | CPU only, no GPU acceleration | Medium | Few | Finance, telecom | Strong stream-processing support | No deep learning | Promising |
| Colt | — | Numerical computing, linear algebra, etc. | Math, statistics, and matrix operations | CPU only, no GPU acceleration | Easy | Few | Scientific computing, statistical analysis | Rich math/statistics routines, good performance | No ML or DL algorithms | Stable, actively maintained |
| Coffee | — | Deep learning | DL algorithms and tools | CPU only, no GPU acceleration | High | Few | Image recognition, NLP | Multiple DL algorithms, easy to use, supports Keras model migration | Relatively low community activity | New framework, outlook unclear |
| TensorFlow (Java API) | — | Deep learning | DL algorithms and tools | CPU and GPU | Fairly high | Few | Image recognition, NLP | Strong ecosystem, widely used | Steep for beginners | Broad prospects as AI adoption grows |
| DL4J | — | Deep learning | DL algorithms and tools | CPU and GPU | Fairly high | Requires extra dependency installation | Image recognition, NLP | Strong ecosystem, widely used | Steep for beginners | Growing fast, promising |
| Apache Commons Math | Medium | Numerical analysis, linear algebra, statistics, etc. | Interpolation, integration, matrix operations, and more | CPU only, no GPU acceleration | Easy | Somewhat redundant | Scientific computing, data analysis, finance | Broad set of math algorithms, good documentation | No ML or DL algorithms | Stable, actively maintained |
| Smile | — | Classification, regression, clustering, etc. | Wide range of ML algorithms | CPU only, no GPU acceleration | Easy | Few | Data mining, pattern recognition | Simple to use, good performance | No deep learning | Stable, active community |
| Weka | — | Machine learning | Classification, regression, clustering, and more | CPU only, no GPU acceleration | Medium | Somewhat redundant | Data mining, pattern recognition | Rich set of ML algorithms and tools, easy to use | No deep learning | Mature, widely used |
| H2O | — | Machine learning | Rich ML algorithms plus automated feature engineering | CPU only, no GPU acceleration | Medium | Few | Finance, telecom | Strong distributed processing, automated feature engineering | No deep learning | Promising |
| Deeplearning4j | — | Deep learning | DL algorithms and tools | CPU and GPU | — | Somewhat redundant | Image recognition, NLP | Distributed training and inference; integrates with Hadoop and Spark | Steep for beginners | Growing fast, promising |

Note: DL4J is the common abbreviation of Deeplearning4j, so both rows describe the same project, and MLlib ships with Spark rather than Hadoop.
II. Code Examples

1. Mahout example code:

This example shows user-based collaborative filtering with Mahout, using a movie-ratings dataset.

import org.apache.mahout.cf.taste.common.TasteException;
import org.apache.mahout.cf.taste.impl.model.file.FileDataModel;
import org.apache.mahout.cf.taste.impl.neighborhood.NearestNUserNeighborhood;
import org.apache.mahout.cf.taste.impl.recommender.GenericUserBasedRecommender;
import org.apache.mahout.cf.taste.impl.similarity.PearsonCorrelationSimilarity;
import org.apache.mahout.cf.taste.model.DataModel;
import org.apache.mahout.cf.taste.neighborhood.UserNeighborhood;
import org.apache.mahout.cf.taste.recommender.RecommendedItem;
import org.apache.mahout.cf.taste.recommender.UserBasedRecommender;
import org.apache.mahout.cf.taste.similarity.UserSimilarity;

import java.io.File;
import java.io.IOException;
import java.util.List;

public class MahoutDemo {
    public static void main(String[] args) throws IOException, TasteException {
        // Load "userID,itemID,rating" records from the ratings file.
        DataModel model = new FileDataModel(new File("data/movies.csv"));
        // Score user-user similarity with Pearson correlation.
        UserSimilarity similarity = new PearsonCorrelationSimilarity(model);
        // Use each user's 2 nearest neighbors.
        UserNeighborhood neighborhood = new NearestNUserNeighborhood(2, similarity, model);
        UserBasedRecommender recommender = new GenericUserBasedRecommender(model, neighborhood, similarity);
        // Recommend up to 3 items for user 1.
        List<RecommendedItem> recommendations = recommender.recommend(1, 3);
        for (RecommendedItem recommendation : recommendations) {
            System.out.println(recommendation);
        }
    }
}
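The `PearsonCorrelationSimilarity` above scores each pair of users by the Pearson correlation of their co-rated items. As a point of reference, here is a minimal plain-Java sketch of the statistic itself (a hypothetical helper for illustration, not a Mahout API):

```java
public class PearsonSketch {
    // Pearson correlation coefficient of two equal-length rating series.
    static double pearson(double[] x, double[] y) {
        int n = x.length;
        double mx = 0, my = 0;
        for (int i = 0; i < n; i++) { mx += x[i]; my += y[i]; }
        mx /= n;
        my /= n;
        double sxy = 0, sxx = 0, syy = 0;
        for (int i = 0; i < n; i++) {
            double dx = x[i] - mx, dy = y[i] - my;
            sxy += dx * dy;
            sxx += dx * dx;
            syy += dy * dy;
        }
        return sxy / Math.sqrt(sxx * syy);
    }

    public static void main(String[] args) {
        // Two users whose ratings move together correlate at +1.
        System.out.println(pearson(new double[]{1, 2, 3}, new double[]{2, 4, 6}));
        // Ratings that move in opposite directions correlate at -1.
        System.out.println(pearson(new double[]{1, 2, 3}, new double[]{6, 4, 2}));
    }
}
```

A correlation near +1 makes two users strong neighbors in the recommender above; near -1 makes them poor ones.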

2. Spark MLlib example code:

This example shows K-Means clustering with Spark's MLlib library (MLlib ships with Spark, not Hadoop), using a set of three-dimensional data points.

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.mllib.clustering.KMeans;
import org.apache.spark.mllib.clustering.KMeansModel;
import org.apache.spark.mllib.linalg.Vector;
import org.apache.spark.mllib.linalg.Vectors;

public class MLlibDemo {
    public static void main(String[] args) {
        SparkConf conf = new SparkConf().setAppName("KMeansExample").setMaster("local");
        JavaSparkContext sc = new JavaSparkContext(conf);

        // Load and parse data
        String path = "data/kmeans_data.txt";
        JavaRDD<String> data = sc.textFile(path);
        JavaRDD<Vector> parsedData = data.map(s -> {
            String[] sarray = s.split(" ");
            double[] values = new double[sarray.length];
            for (int i = 0; i < sarray.length; i++) {
                values[i] = Double.parseDouble(sarray[i]);
            }
            return Vectors.dense(values);
        });
        parsedData.cache();

        // Cluster the data into two classes using KMeans
        int numClusters = 2;
        int numIterations = 20;
        KMeansModel clusters = KMeans.train(parsedData.rdd(), numClusters, numIterations);

        // Evaluate clustering by computing Within Set Sum of Squared Errors
        double WSSSE = clusters.computeCost(parsedData.rdd());
        System.out.println("Within Set Sum of Squared Errors = " + WSSSE);

        // Save and reload the model (saving fails if the path already exists)
        clusters.save(sc.sc(), "data/KMeansModel");
        KMeansModel sameModel = KMeansModel.load(sc.sc(), "data/KMeansModel");

        sc.stop();
    }
}
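`computeCost` above returns the Within Set Sum of Squared Errors (WSSSE): each point's squared Euclidean distance to its nearest cluster center, summed over all points. A plain-Java sketch of the metric for one-dimensional points (illustrative only, not a Spark API):

```java
public class WssseSketch {
    // Sum over all points of the squared distance to the nearest center.
    static double wssse(double[] points, double[] centers) {
        double total = 0;
        for (double p : points) {
            double best = Double.POSITIVE_INFINITY;
            for (double c : centers) {
                double d = (p - c) * (p - c);
                if (d < best) best = d;
            }
            total += best;
        }
        return total;
    }

    public static void main(String[] args) {
        // Two tight clusters around 0.5 and 9.5: each point lies 0.5 away
        // from its center, so WSSSE = 4 * 0.25 = 1.0.
        double[] points = {0, 1, 9, 10};
        double[] centers = {0.5, 9.5};
        System.out.println(wssse(points, centers));
    }
}
```

A lower WSSSE means tighter clusters; it is the quantity K-Means training drives down.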

3. FlinkML example code:

This example sketches K-Means clustering over the same three-dimensional point set with FlinkML. Note that FlinkML was Flink's legacy batch ML library (Scala-first, later removed in favor of Flink ML 2.x), so the clustering calls below are a schematic sketch rather than a verified Java API.

import org.apache.flink.api.java.DataSet;
import org.apache.flink.api.java.ExecutionEnvironment;
import org.apache.flink.api.java.tuple.Tuple3;
import org.apache.flink.ml.clustering.KMeans;
import org.apache.flink.ml.math.DenseVector;
import org.apache.flink.ml.math.Vector;

public class FlinkMLDemo {
    public static void main(String[] args) throws Exception {
        ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

        // Read the space-separated 3-D points as (x, y, z) tuples.
        DataSet<Tuple3<Double, Double, Double>> csv = env
                .readCsvFile("data/kmeans_data.txt")
                .fieldDelimiter(" ")
                .ignoreComments("%")
                .types(Double.class, Double.class, Double.class);

        // Convert each tuple into a FlinkML dense vector.
        DataSet<Vector> points = csv
                .map(t -> (Vector) new DenseVector(new double[]{t.f0, t.f1, t.f2}))
                .returns(Vector.class);

        // Schematic use of the legacy KMeans API: cluster into 2 groups with
        // at most 20 iterations, then print the learned centroids. The
        // fit/centroids calls are illustrative, not a verified Java signature.
        KMeans kMeans = new KMeans()
                .setK(2)
                .setMaxIterations(20);

        for (Vector center : kMeans.fit(points).centroids()) {
            System.out.println(center);
        }
    }
}

4. Colt example code:

This example shows Colt's matrix and vector operations: it multiplies two 2x2 matrices and squares selected elements of a vector in place.

import cern.colt.matrix.DoubleFactory1D;
import cern.colt.matrix.DoubleFactory2D;
import cern.colt.matrix.DoubleMatrix1D;
import cern.colt.matrix.DoubleMatrix2D;
import cern.jet.math.Functions;

public class ColtDemo {
    public static void main(String[] args) {
        // 2x2 matrix product c = a * b.
        DoubleMatrix2D a = DoubleFactory2D.dense.make(new double[][]{{1, 2}, {3, 4}});
        DoubleMatrix2D b = DoubleFactory2D.dense.make(new double[][]{{5, 6}, {7, 8}});
        DoubleMatrix2D c = a.zMult(b, null);
        System.out.println(c); // {{19, 22}, {43, 50}}

        // Square the elements at indices 0, 2, and 4 in place via a view.
        // (Views and assign() work on matrix types such as DoubleMatrix1D.)
        DoubleMatrix1D z = DoubleFactory1D.dense.make(new double[]{1, 2, 3, 4, 5});
        z.viewSelection(new int[]{0, 2, 4}).assign(Functions.square);
        System.out.println(z); // 1, 2, 9, 4, 25
    }
}
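For reference, the 2x2 product computed by `zMult` above can be checked by hand with a naive triple loop in plain Java (no Colt), which produces the same values:

```java
public class MatMulCheck {
    // Naive dense matrix product: c[i][j] = sum over t of a[i][t] * b[t][j].
    static double[][] multiply(double[][] a, double[][] b) {
        int n = a.length, m = b[0].length, k = b.length;
        double[][] c = new double[n][m];
        for (int i = 0; i < n; i++)
            for (int j = 0; j < m; j++)
                for (int t = 0; t < k; t++)
                    c[i][j] += a[i][t] * b[t][j];
        return c;
    }

    public static void main(String[] args) {
        double[][] c = multiply(new double[][]{{1, 2}, {3, 4}},
                                new double[][]{{5, 6}, {7, 8}});
        // {{19, 22}, {43, 50}}
        System.out.println(java.util.Arrays.deepToString(c));
    }
}
```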

5. Coffee example code:

This example trains a logistic-regression classifier on the iris dataset. Note that the imports come from Alibaba's Alink library (com.alibaba.alink), so despite the "Coffee" heading this is Alink's pipeline API. Also, plain logistic regression is a binary classifier; for the three-class iris data, Alink's multiclass (softmax) estimator would be needed in practice. The schema string below is an assumption matching the columns used.

import com.alibaba.alink.operator.batch.source.CsvSourceBatchOp;
import com.alibaba.alink.pipeline.Pipeline;
import com.alibaba.alink.pipeline.classification.LogisticRegression;
import com.alibaba.alink.pipeline.dataproc.vector.VectorAssembler;

public class CoffeeDemo {
    public static void main(String[] args) throws Exception {
        // Alink's CSV source requires an explicit schema string
        // (assumed here to match the iris CSV columns).
        CsvSourceBatchOp source = new CsvSourceBatchOp()
                .setFilePath("data/iris.csv")
                .setFieldDelimiter(",")
                .setSchemaStr("sepal_length double, sepal_width double, "
                        + "petal_length double, petal_width double, class string");

        // Assemble the four numeric columns into a single vector column.
        VectorAssembler assembler = new VectorAssembler()
                .setSelectedCols(new String[]{"sepal_length", "sepal_width", "petal_length", "petal_width"})
                .setOutputCol("features");

        // Pipeline stages must be estimators/transformers, so use the
        // LogisticRegression estimator rather than the batch train op.
        // (Binary classifier: on 3-class iris, use a softmax model instead.)
        LogisticRegression lr = new LogisticRegression()
                .setVectorCol("features")
                .setLabelCol("class")
                .setPredictionCol("pred");

        Pipeline pipeline = new Pipeline().add(assembler).add(lr);
        pipeline.fit(source).transform(source).print();
    }
}

6. TensorFlow (Java API) example code:

This example shows how to build a minimal computation graph with the TensorFlow Java API (the "TensorFlow4j" entry above) and print the value of a string constant node.

import org.tensorflow.Graph;
import org.tensorflow.Session;
import org.tensorflow.Tensor;

public class TensorFlowJavaDemo {
    public static void main(String[] args) throws Exception {
        try (Graph graph = new Graph()) {
            // Define a string constant node in the graph.
            final String value = "Hello, TensorFlow!";
            try (Tensor tensor = Tensor.create(value.getBytes("UTF-8"))) {
                graph.opBuilder("Const", "MyConst")
                        .setAttr("dtype", tensor.dataType())
                        .setAttr("value", tensor)
                        .build();
            }

            // Run the graph in a session and fetch the constant's value.
            try (Session session = new Session(graph);
                 Tensor output = session.runner().fetch("MyConst").run().get(0)) {
                // A scalar DT_STRING tensor is read back with bytesValue().
                System.out.println(new String(output.bytesValue(), "UTF-8"));
            }
        }
    }
}

7. DL4J example code:

This example uses DL4J (Deeplearning4j) to build a simple multilayer perceptron (MLP) and classify handwritten digits from the MNIST dataset.

import org.deeplearning4j.datasets.iterator.impl.MnistDataSetIterator;
import org.deeplearning4j.nn.api.OptimizationAlgorithm;
import org.deeplearning4j.nn.conf.MultiLayerConfiguration;
import org.deeplearning4j.nn.conf.NeuralNetConfiguration;
import org.deeplearning4j.nn.conf.layers.DenseLayer;
import org.deeplearning4j.nn.conf.layers.OutputLayer;
import org.deeplearning4j.nn.weights.WeightInit;
import org.deeplearning4j.optimize.listeners.ScoreIterationListener;
import org.nd4j.linalg.activations.Activation;
import org.nd4j.linalg.dataset.api.iterator.DataSetIterator;
import org.nd4j.linalg.lossfunctions.LossFunctions;

public class DL4JDemo {
    public static void main(String[] args) throws Exception {
        int batchSize = 64;
        int numClasses = 10;
        int numEpochs = 10;

        DataSetIterator mnistTrain = new MnistDataSetIterator(batchSize, true, 12345);
        DataSetIterator mnistTest = new MnistDataSetIterator(batchSize, false, 12345);

        MultiLayerConfiguration conf = new NeuralNetConfiguration.Builder()
                .seed(12345)
                .optimizationAlgo(OptimizationAlgorithm.STOCHASTIC_GRADIENT_DESCENT)
                .weightInit(WeightInit.XAVIER)
                .list()
                .layer(0, new DenseLayer.Builder()
                        .nIn(784)
                        .nOut(256)
                        .activation(Activation.RELU)
                        .build())
                .layer(1, new OutputLayer.Builder(LossFunctions.LossFunction.NEGATIVELOGLIKELIHOOD)
                        .nIn(256)
                        .nOut(numClasses)
                        .activation(Activation.SOFTMAX)
                        .build())
                .build();

        org.deeplearning4j.nn.multilayer.MultiLayerNetwork model = new org.deeplearning4j.nn.multilayer.MultiLayerNetwork(conf);
        model.init();
        model.setListeners(new ScoreIterationListener(10));

        for (int i = 0; i < numEpochs; i++) {
            model.fit(mnistTrain);
        }

        org.nd4j.evaluation.classification.Evaluation evaluation = model.evaluate(mnistTest);
        System.out.println(evaluation.stats());
    }
}
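The output layer above applies `Activation.SOFTMAX`, which turns the ten raw outputs into a probability distribution over digit classes. A plain-Java sketch of the function (illustrative only, not DL4J's implementation):

```java
public class SoftmaxSketch {
    // Numerically stable softmax: subtract the max before exponentiating.
    static double[] softmax(double[] z) {
        double max = Double.NEGATIVE_INFINITY;
        for (double v : z) if (v > max) max = v;
        double sum = 0;
        double[] out = new double[z.length];
        for (int i = 0; i < z.length; i++) {
            out[i] = Math.exp(z[i] - max);
            sum += out[i];
        }
        for (int i = 0; i < z.length; i++) out[i] /= sum;
        return out;
    }

    public static void main(String[] args) {
        double[] p = softmax(new double[]{1.0, 2.0, 3.0});
        // The outputs are positive, preserve the input ordering, and sum
        // to 1 (up to rounding), so they can be read as class probabilities.
        for (double v : p) System.out.println(v);
    }
}
```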

8. Apache Commons Math example code:

This example uses Apache Commons Math to compute the mean, variance, maximum, and median of a small dataset.

import org.apache.commons.math3.stat.StatUtils;
import org.apache.commons.math3.stat.descriptive.DescriptiveStatistics;

public class ApacheCommonsMathDemo {
    public static void main(String[] args) {
        double[] data = {1.2, 2.3, 0.8, 3.9, 2.1};

        double mean = StatUtils.mean(data);
        System.out.println("Mean: " + mean);

        double variance = StatUtils.variance(data);
        System.out.println("Variance: " + variance);

        double max = StatUtils.max(data);
        System.out.println("Max: " + max);

        DescriptiveStatistics stats = new DescriptiveStatistics(data);
        double median = stats.getPercentile(50);
        System.out.println("Median: " + median);
    }
}
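The values returned above can be reproduced by hand. One detail worth knowing: `StatUtils.variance` is the bias-corrected sample variance (division by n - 1, not n). A small plain-Java check under that assumption:

```java
import java.util.Arrays;

public class StatsByHand {
    static double mean(double[] d) {
        double s = 0;
        for (double v : d) s += v;
        return s / d.length;
    }

    // Bias-corrected sample variance (divide by n - 1), which is what
    // Commons Math's StatUtils.variance computes.
    static double sampleVariance(double[] d) {
        double m = mean(d), s = 0;
        for (double v : d) s += (v - m) * (v - m);
        return s / (d.length - 1);
    }

    static double median(double[] d) {
        double[] s = d.clone();
        Arrays.sort(s);
        int n = s.length;
        return n % 2 == 1 ? s[n / 2] : (s[n / 2 - 1] + s[n / 2]) / 2.0;
    }

    public static void main(String[] args) {
        double[] data = {1.2, 2.3, 0.8, 3.9, 2.1};
        System.out.println(mean(data));           // ~2.06
        System.out.println(sampleVariance(data)); // ~1.443
        System.out.println(median(data));         // 2.1
    }
}
```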

9. Smile example code:

This example trains a random-forest classifier on the iris dataset with Smile (the 1.x API) and predicts the class of a new sample.

import smile.data.AttributeDataset;
import smile.data.parser.ArffParser;
import smile.classification.RandomForest;

public class SmileDemo {
    public static void main(String[] args) throws Exception {
        // Smile 1.x ARFF parser; column 4 holds the class label.
        ArffParser parser = new ArffParser();
        parser.setResponseIndex(4);

        AttributeDataset dataset = parser.parse("data/iris.arff");
        double[][] x = dataset.toArray(new double[dataset.size()][]);
        int[] y = dataset.toArray(new int[dataset.size()]);

        // Train a random forest with 100 trees (Smile 1.x constructor API).
        RandomForest model = new RandomForest(x, y, 100);
        // Predict the class of a new flower measurement.
        int prediction = model.predict(new double[]{5.1, 3.5, 1.4, 0.2});
        System.out.println("Prediction: " + prediction);
    }
}

10. Weka example code:

This example builds a linear-regression model with Weka on the iris dataset. Because LinearRegression requires a numeric class attribute and the iris class label is nominal, the model regresses the numeric petal-width attribute on the remaining columns and then prints the fitted coefficients.

import weka.core.Instances;
import weka.core.converters.ConverterUtils.DataSource;
import weka.classifiers.functions.LinearRegression;

public class WekaDemo {
    public static void main(String[] args) throws Exception {
        DataSource source = new DataSource("data/iris.arff");
        Instances dataset = source.getDataSet();
        // LinearRegression needs a numeric class attribute; the nominal iris
        // class (last attribute) cannot be used, so predict petal width.
        dataset.setClassIndex(3);

        LinearRegression model = new LinearRegression();
        model.buildClassifier(dataset);

        // Print the fitted regression coefficients.
        double[] coefficients = model.coefficients();
        System.out.println("Coefficients: ");
        for (double coefficient : coefficients) {
            System.out.println(coefficient);
        }
    }
}

11. H2O example code:

This example loads a gradient-boosting (GBM) model exported from H2O in MOJO format and makes a binary-classification prediction for a new sample.

import hex.genmodel.MojoModel;
import hex.genmodel.easy.EasyPredictModelWrapper;
import hex.genmodel.easy.RowData;
import hex.genmodel.easy.prediction.BinomialModelPrediction;

public class H2ODemo {
    public static void main(String[] args) throws Exception {
        // Load a MOJO exported from a previously trained H2O GBM model.
        MojoModel model = MojoModel.load("data/GBM_model.zip");

        EasyPredictModelWrapper wrapper = new EasyPredictModelWrapper(model);

        RowData row = new RowData();
        row.put("sepal_length", "5.1");
        row.put("sepal_width", "3.5");
        row.put("petal_length", "1.4");
        row.put("petal_width", "0.2");

        BinomialModelPrediction prediction = wrapper.predictBinomial(row);
        System.out.println("Prediction: " + prediction.label);
    }
}

12. Deeplearning4j example code:

This example builds a small multilayer perceptron (MLP) with Deeplearning4j (the full name of the DL4J project above), trains it on the iris dataset, and prints classification metrics. For simplicity it evaluates on the training data; a real experiment should hold out a test split.

import org.deeplearning4j.datasets.iterator.impl.IrisDataSetIterator;
import org.deeplearning4j.eval.Evaluation;
import org.deeplearning4j.nn.api.OptimizationAlgorithm;
import org.deeplearning4j.nn.conf.MultiLayerConfiguration;
import org.deeplearning4j.nn.conf.NeuralNetConfiguration;
import org.deeplearning4j.nn.conf.layers.DenseLayer;
import org.deeplearning4j.nn.conf.layers.OutputLayer;
import org.deeplearning4j.nn.multilayer.MultiLayerNetwork;
import org.deeplearning4j.nn.weights.WeightInit;
import org.deeplearning4j.optimize.listeners.ScoreIterationListener;
import org.nd4j.linalg.activations.Activation;
import org.nd4j.linalg.dataset.api.iterator.DataSetIterator;
import org.nd4j.linalg.lossfunctions.LossFunctions;

public class Deeplearning4jDemo {
    public static void main(String[] args) {
        int batchSize = 30;
        int numClasses = 3;
        int numEpochs = 50;
        int seed = 123;

        // IrisDataSetIterator takes (batchSize, totalExamples);
        // the iris dataset has 150 examples.
        DataSetIterator irisIterator = new IrisDataSetIterator(batchSize, 150);

        MultiLayerConfiguration conf = new NeuralNetConfiguration.Builder()
                .seed(seed)
                .optimizationAlgo(OptimizationAlgorithm.STOCHASTIC_GRADIENT_DESCENT)
                .weightInit(WeightInit.XAVIER)
                .list()
                .layer(0, new DenseLayer.Builder()
                        .nIn(4)              // 4 input features
                        .nOut(10)
                        .activation(Activation.RELU)
                        .build())
                .layer(1, new OutputLayer.Builder(LossFunctions.LossFunction.MCXENT)
                        .nIn(10)
                        .nOut(numClasses)
                        .activation(Activation.SOFTMAX)
                        .build())
                .build();

        MultiLayerNetwork model = new MultiLayerNetwork(conf);
        model.init();
        model.setListeners(new ScoreIterationListener(10));

        // Train for numEpochs passes over the data, resetting the iterator
        // between epochs.
        for (int i = 0; i < numEpochs; i++) {
            model.fit(irisIterator);
            irisIterator.reset();
        }

        // Evaluate on the same data (demo only; prefer a held-out split).
        irisIterator.reset();
        Evaluation evaluation = model.evaluate(irisIterator);
        System.out.println(evaluation.stats());
    }
}
