Implementing the C4.5 Algorithm and Plotting Probability Distributions in MATLAB

Plotting Probability Distribution Curves in MATLAB

Running `help pdf` in MATLAB prints:

PDF Density function for a specified distribution.

`Y = PDF(NAME,X,A)` returns an array of values of the probability density function for the one-parameter probability distribution specified by NAME with parameter values A, evaluated at the values in X.

`Y = PDF(NAME,X,A,B)` or `Y = PDF(NAME,X,A,B,C)` returns values of the probability density function for a two- or three-parameter probability distribution with parameter values A, B (and C).

The size of Y is the common size of the input arguments. A scalar input functions as a constant matrix of the same size as the other inputs. Each element of Y contains the probability density evaluated at the corresponding elements of the inputs.

Acceptable strings for NAME are:

- `'beta'` (Beta distribution)
- `'bino'` (Binomial distribution)
- `'chi2'` (Chi-square distribution)
- `'exp'` (Exponential distribution)
- `'ev'` (Extreme value distribution)
- `'f'` (F distribution)
- `'gam'` (Gamma distribution)
- `'gev'` (Generalized extreme value distribution)
- `'gp'` (Generalized Pareto distribution)
- `'geo'` (Geometric distribution)
- `'hyge'` (Hypergeometric distribution)
- `'logn'` (Lognormal distribution)
- `'nbin'` (Negative binomial distribution)
- `'ncf'` (Noncentral F distribution)
- `'nct'` (Noncentral t distribution)
- `'ncx2'` (Noncentral chi-square distribution)
- `'norm'` (Normal distribution)
- `'poiss'` (Poisson distribution)
- `'rayl'` (Rayleigh distribution)
- `'t'` (t distribution)
- `'unif'` (Uniform distribution)
- `'unid'` (Discrete uniform distribution)
- `'wbl'` (Weibull distribution)
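For readers working in Python rather than MATLAB, `scipy.stats` provides an analogue of the `pdf(NAME, X, A, B)` call above: each distribution is an object with a `pdf` method, and the resulting arrays can be passed to matplotlib to draw the same curves. A minimal sketch (SciPy's parameter names and order, not MATLAB's):

```python
import numpy as np
from scipy import stats

# Evaluate densities over a grid, analogous to pdf('norm', x, 0, 1) in MATLAB
x = np.linspace(-4, 4, 9)
normal = stats.norm.pdf(x, loc=0, scale=1)  # 'norm': mean 0, std 1
expon = stats.expon.pdf(x, scale=1)         # 'exp': mean 1; zero for x < 0

# The standard normal density peaks at x = 0 with value 1/sqrt(2*pi)
print(round(normal[4], 4))  # → 0.3989
```

Plotting is then a single `plt.plot(x, normal)` call per curve.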

A decision tree is a widely used classification model, and C4.5 is an improved decision-tree algorithm (a successor to ID3). Below is an example implementation of C4.5 in Python.

First, we define a node class that stores each node's split information and, for leaves, the predicted class:

```python
class Node:
    def __init__(self, attribute=None, threshold=None, label=None, left=None, right=None):
        self.attribute = attribute  # index of the feature used for the split
        self.threshold = threshold  # split threshold
        self.label = label          # class label (leaf nodes only)
        self.left = left            # left child
        self.right = right          # right child
```

Next, we define a decision-tree class with the following methods:

1. `__init__`: initialize the model.
2. `entropy`: compute the entropy of a label set.
3. `conditional_entropy`: compute the conditional entropy of the labels given a split on one feature.
4. `information_gain`: compute the information gain of a split.
5. `majority_vote`: return the most frequent class in a label set.
6. `build_tree`: recursively build the tree.
7. `predict`: predict the class of a new sample.

```python
import numpy as np
from collections import Counter

class C45DecisionTree:
    def __init__(self, max_depth=5, min_samples_split=2):
        self.max_depth = max_depth                  # maximum tree depth
        self.min_samples_split = min_samples_split  # minimum samples required to split
        self.root = None

    def entropy(self, y):
        """Entropy of a label set."""
        counter = Counter(y)
        probs = [count / len(y) for count in counter.values()]
        return -sum(p * np.log2(p) for p in probs)

    def conditional_entropy(self, X, y, feature_idx, threshold):
        """Conditional entropy of the labels after splitting one feature at a threshold."""
        left_mask = X[:, feature_idx] < threshold
        right_mask = X[:, feature_idx] >= threshold
        left_prob = len(y[left_mask]) / len(y)
        right_prob = len(y[right_mask]) / len(y)
        left_entropy = self.entropy(y[left_mask])
        right_entropy = self.entropy(y[right_mask])
        return left_prob * left_entropy + right_prob * right_entropy

    def information_gain(self, X, y, feature_idx, threshold):
        """Information gain of a split."""
        parent_entropy = self.entropy(y)
        child_entropy = self.conditional_entropy(X, y, feature_idx, threshold)
        return parent_entropy - child_entropy

    def majority_vote(self, y):
        """Most frequent class in a label set."""
        counter = Counter(y)
        return counter.most_common(1)[0][0]

    def build_tree(self, X, y, depth=0):
        """Recursively build the decision tree."""
        # Stop at the maximum depth or when too few samples remain
        if depth >= self.max_depth or len(y) < self.min_samples_split:
            return Node(label=self.majority_vote(y))
        n_features = X.shape[1]
        best_feature, best_threshold, best_gain = None, None, 0
        for feature_idx in range(n_features):
            # Evaluate the information gain of every candidate threshold
            thresholds = np.unique(X[:, feature_idx])
            for threshold in thresholds:
                gain = self.information_gain(X, y, feature_idx, threshold)
                if gain > best_gain:
                    best_feature, best_threshold, best_gain = feature_idx, threshold, gain
        # Split only if some threshold actually reduces entropy
        if best_gain > 0:
            left_mask = X[:, best_feature] < best_threshold
            right_mask = X[:, best_feature] >= best_threshold
            left_node = self.build_tree(X[left_mask], y[left_mask], depth + 1)
            right_node = self.build_tree(X[right_mask], y[right_mask], depth + 1)
            return Node(attribute=best_feature, threshold=best_threshold,
                        left=left_node, right=right_node)
        # No useful split: return a leaf
        return Node(label=self.majority_vote(y))

    def predict(self, X):
        """Predict the class of a single sample."""
        node = self.root
        while node.label is None:
            if X[node.attribute] < node.threshold:
                node = node.left
            else:
                node = node.right
        return node.label
```

Finally, we can use the model to classify a dataset:

```python
# Load the iris dataset
from sklearn.datasets import load_iris
iris = load_iris()
X = iris.data
y = iris.target

# Build the decision tree
model = C45DecisionTree(max_depth=5, min_samples_split=2)
model.root = model.build_tree(X, y)

# Classify a new sample
new_sample = [5.0, 3.6, 1.3, 0.25]
label = model.predict(new_sample)
print(label)
```

The code above prints `0`, meaning the new sample belongs to the first class.
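One caveat: the class above selects splits by information gain, which is ID3's criterion; C4.5 proper normalizes the gain by the split's intrinsic information to obtain the gain ratio, which avoids favoring splits with many small branches. A minimal sketch of that correction (the standalone `entropy` and `gain_ratio` helpers are illustrative, not part of the class above):

```python
import numpy as np
from collections import Counter

def entropy(y):
    """Entropy of a label sequence, in bits."""
    counts = Counter(y)
    probs = [c / len(y) for c in counts.values()]
    return -sum(p * np.log2(p) for p in probs)

def gain_ratio(X, y, feature_idx, threshold):
    """C4.5 gain ratio: information gain divided by the split's intrinsic information."""
    left = X[:, feature_idx] < threshold
    right = ~left
    p_left, p_right = left.mean(), right.mean()
    # Information gain, exactly as computed by the class above
    gain = entropy(y) - (p_left * entropy(y[left]) + p_right * entropy(y[right]))
    # Intrinsic information of the split itself (its "split entropy")
    split_info = -sum(p * np.log2(p) for p in (p_left, p_right) if p > 0)
    return gain / split_info if split_info > 0 else 0.0

# A perfectly separating split on a toy dataset: 1 bit of gain over 1 bit of split info
X = np.array([[1.0], [2.0], [10.0], [11.0]])
y = np.array([0, 0, 1, 1])
print(gain_ratio(X, y, 0, 5.0))  # → 1.0
```

Swapping `information_gain` for `gain_ratio` inside `build_tree` would turn the ID3-style tree above into a closer match for C4.5.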