Machine Learning Tools and Methods: WEKA Practice 2

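Problem 1

Problem: Load Glass.arff and, in the Classify tab, evaluate IBk with 10-fold cross-validation. Select the meta-learner FilteredClassifier with IBk as its base classifier and the unsupervised attribute filter AddNoise, analyze the data, and plot the results to support the analysis.

Solution: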

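  1. Load the glass dataset. Among the classifiers, choose the meta-learner weka.classifiers.meta.FilteredClassifier, set its base classifier to weka.classifiers.lazy.IBk (the k-nearest-neighbor algorithm), and set its filter to weka.filters.unsupervised.attribute.AddNoise to inject noise into the data. The configuration screenshot:

    [Image: image-20211016162023588.png]

  2. Set K of the k-nearest-neighbor algorithm to 1, 2, and 3 in turn, raise the noise percentage from 0% to 100%, and record the classification accuracy: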

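| Noise percentage | K = 1 | K = 2 | K = 3 |
|------------------|-------|-------|-------|
| 0%               | 70.56 | 67.75 | 71.96 |
| 10%              | 61.21 | 66.82 | 70.56 |
| 20%              | 52.80 | 60.74 | 65.88 |
| 30%              | 45.79 | 55.14 | 61.24 |
| 40%              | 36.92 | 47.20 | 50.00 |
| 50%              | 33.18 | 41.12 | 43.46 |
| 60%              | 27.57 | 36.92 | 38.79 |
| 70%              | 20.56 | 28.97 | 29.91 |
| 80%              | 16.82 | 22.90 | 23.83 |
| 90%              | 12.62 | 17.29 | 19.63 |
| 100%             | 6.07  | 7.94  | 7.01  |

Each entry is the 10-fold cross-validation classification accuracy (%) of the k-nearest-neighbor algorithm after the noise has been added.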
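
To make the table easier to read before plotting, the following short Python sketch copies the accuracies into a dictionary and works out which K scores best at each noise level, plus how many percentage points each K loses between 0% and 50% noise. The names `NOISE_ACCURACY`, `best_k`, and `drop` are introduced here for illustration only; the numbers are the ones recorded in the table.

```python
# Accuracy (%) copied from the table: noise level (%) -> (K=1, K=2, K=3)
NOISE_ACCURACY = {
    0: (70.56, 67.75, 71.96),
    10: (61.21, 66.82, 70.56),
    20: (52.80, 60.74, 65.88),
    30: (45.79, 55.14, 61.24),
    40: (36.92, 47.20, 50.00),
    50: (33.18, 41.12, 43.46),
    60: (27.57, 36.92, 38.79),
    70: (20.56, 28.97, 29.91),
    80: (16.82, 22.90, 23.83),
    90: (12.62, 17.29, 19.63),
    100: (6.07, 7.94, 7.01),
}
K_VALUES = (1, 2, 3)


def best_k(row):
    """Return the K whose accuracy is highest in one table row."""
    return max(zip(row, K_VALUES))[1]


# Best K for every noise level.
best = {noise: best_k(row) for noise, row in NOISE_ACCURACY.items()}

# Accuracy lost between 0% and 50% noise for each K, in percentage points.
drop = {
    k: round(NOISE_ACCURACY[0][i] - NOISE_ACCURACY[50][i], 2)
    for i, k in enumerate(K_VALUES)
}

for noise, k in best.items():
    print(f"{noise:>3}% noise: best K = {k}")
print("drop from 0% to 50% noise:", drop)
```

On these figures K = 3 is the best setting at every noise level except 100%, where K = 2 is marginally ahead, and K = 1 loses the most accuracy between 0% and 50% noise.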

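  3. Plot analysis: the horizontal axis is the noise percentage and the vertical axis is the classification accuracy.

    [Image: image-20211016171312061.png]

    From the plot we observe:

    • As the noise increases, the classification accuracy falls; the decline is monotonic for all three values of K.

    • The effect of k depends on the noise level. Increasing k suppresses noise and raises accuracy: from 10% to 90% noise, K = 3 beats K = 2, which beats K = 1, in every row. With little noise a larger k does not always help: at 0% noise K = 2 (67.75%) scores below K = 1 (70.56%), and a k that is too large can lower accuracy.

    • The dataset is sensitive to noise, so k-nearest-neighbor learning needs a suitable k: large enough to suppress noise, yet not so large that it noticeably lowers accuracy.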
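
The noise-suppression effect of a larger k can be reproduced outside WEKA with a minimal pure-Python sketch. It is not WEKA's implementation: it uses a synthetic two-class, one-dimensional dataset instead of Glass, a plain (unstratified) 10-fold split, and a simplified imitation of AddNoise that changes the class label of a given percentage of the training rows only. All function names below are hypothetical.

```python
import random
from collections import Counter


def make_data(n_per_class, rng):
    """Two well-separated 1-D clusters (synthetic stand-in, not Glass)."""
    data = [(rng.uniform(0.0, 1.0), "a") for _ in range(n_per_class)]
    data += [(rng.uniform(10.0, 11.0), "b") for _ in range(n_per_class)]
    rng.shuffle(data)
    return data


def add_label_noise(train, percent, labels, rng):
    """Change the label of `percent`% of the rows to a different class."""
    noisy = list(train)
    n_changed = round(len(noisy) * percent / 100)
    for i in rng.sample(range(len(noisy)), n_changed):
        x, y = noisy[i]
        noisy[i] = (x, rng.choice([c for c in labels if c != y]))
    return noisy


def knn_predict(train, x, k):
    """Majority vote among the k training rows closest to x."""
    nearest = sorted(train, key=lambda row: abs(row[0] - x))[:k]
    return Counter(y for _, y in nearest).most_common(1)[0][0]


def cv_accuracy(data, k, percent, folds=10, seed=1):
    """Accuracy (%) under k-fold CV with noise added to training folds only."""
    rng = random.Random(seed)
    labels = sorted({y for _, y in data})
    correct = 0
    for f in range(folds):
        test = data[f::folds]
        train = [row for i, row in enumerate(data) if i % folds != f]
        noisy_train = add_label_noise(train, percent, labels, rng)
        correct += sum(knn_predict(noisy_train, x, k) == y for x, y in test)
    return 100.0 * correct / len(data)


data = make_data(100, random.Random(0))
for percent in (0, 20, 40, 100):
    print(percent, [cv_accuracy(data, k, percent) for k in (1, 3)])
```

With k = 1 a single mislabeled neighbor decides the prediction, whereas with k = 3 at least two of the three neighbors must be mislabeled, so one would expect the larger k to hold up better at moderate noise levels. At 100% noise every training label is wrong and neither setting can recover, which matches the collapse in the last row of the table.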

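Problem 2

Problem: Run the experiment with two classifiers, IBk and J48, on the glass dataset: wrap each in FilteredClassifier with the Resample filter and compare their classification accuracy at different sampling percentages.

Solution: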

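  1. Load the glass dataset and choose FilteredClassifier as the classifier. Set its classifier to IBk (k-nearest neighbor, K = 1) or J48 (decision tree), and set its filter to Resample with the sample size varied from 10% to 100%, as shown below:

    [Image: image-20211016230941808.png]

  2. Change the resampling percentage step by step and fill in the table: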

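| Training set percentage | IBk (k-nearest neighbor) | J48 (decision tree) |
|-------------------------|--------------------------|---------------------|
| 10%                     | 54.21                    | 45.33               |
| 20%                     | 56.07                    | 47.66               |
| 30%                     | 57.48                    | 57.01               |
| 40%                     | 62.62                    | 57.94               |
| 50%                     | 63.55                    | 61.22               |
| 60%                     | 64.49                    | 63.08               |
| 70%                     | 63.55                    | 64.49               |
| 80%                     | 66.82                    | 63.55               |
| 90%                     | 68.22                    | 63.55               |
| 100%                    | 66.82                    | 64.95               |

Each entry is the classification accuracy (%) of the algorithm at the corresponding resampling percentage.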
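
What a resampling percentage means can be sketched in a few lines of Python. This is an illustration under stated assumptions, not WEKA's Resample code: the function `resample` and its `replacement` parameter are hypothetical names, the write-up does not say which Resample variant or settings were used, and the 214 row indices merely stand in for the 214 instances of the Glass dataset (inside cross-validation the filter would see only the training fold).

```python
import random


def resample(rows, percent, rng, replacement=True):
    """Return a random subsample whose size is `percent`% of len(rows)."""
    size = round(len(rows) * percent / 100)
    if replacement:
        # Rows may be drawn more than once.
        return [rng.choice(rows) for _ in range(size)]
    # Each row is drawn at most once.
    return rng.sample(rows, size)


rng = random.Random(42)
rows = list(range(214))  # placeholder indices, one per Glass instance
for percent in (10, 50, 100):
    sample = resample(rows, percent, rng)
    print(percent, len(sample), len(set(sample)))
```

When sampling with replacement, even a 100% sample typically contains duplicates and therefore omits some of the original rows. If the experiment sampled that way, this may be one reason the 100% row for IBk (66.82%) sits below the unfiltered K = 1 result from Problem 1 (70.56%), but that is a conjecture rather than something the recorded data confirms.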

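  3. Plot analysis:

    [Image: image-20211016232836051.png]

    From the plot we observe:

    • As the amount of training data grows, the classification accuracy generally rises, though not strictly monotonically (for example, IBk drops from 68.22% at 90% to 66.82% at 100%).
    • Compared with IBk, growing the training data affects J48 more strongly: from 10% to 100%, J48 gains 19.62 percentage points (45.33% to 64.95%), while IBk gains 12.61 (54.21% to 66.82%).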
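
The comparison between the two classifiers can be checked directly from the table with a small Python sketch. The variable names are introduced here for illustration; the accuracies are the recorded ones.

```python
# Accuracy (%) copied from the table: sample size (%) -> (IBk, J48)
RESAMPLE_ACCURACY = {
    10: (54.21, 45.33),
    20: (56.07, 47.66),
    30: (57.48, 57.01),
    40: (62.62, 57.94),
    50: (63.55, 61.22),
    60: (64.49, 63.08),
    70: (63.55, 64.49),
    80: (66.82, 63.55),
    90: (68.22, 63.55),
    100: (66.82, 64.95),
}

# Gain in percentage points when going from a 10% to a 100% sample.
gain_ibk = round(RESAMPLE_ACCURACY[100][0] - RESAMPLE_ACCURACY[10][0], 2)
gain_j48 = round(RESAMPLE_ACCURACY[100][1] - RESAMPLE_ACCURACY[10][1], 2)

# Sample sizes at which J48 is more accurate than IBk.
j48_ahead = [p for p, (ibk, j48) in RESAMPLE_ACCURACY.items() if j48 > ibk]

print("IBk gain:", gain_ibk)
print("J48 gain:", gain_j48)
print("J48 ahead of IBk at:", j48_ahead)
```

J48 gains more from additional data than IBk does, yet IBk remains the more accurate of the two at every sample size except 70%.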