Fuzzy-rough nearest neighbor classifier

The fuzzy-rough nearest neighbor classifier attempts to enhance the conventional K-nearest neighbor classifier by exploiting fuzzy-rough uncertainty. While preserving the advantages of the conventional K-nearest neighbor method, the fuzzy-rough counterpart does not require the optimal value of the K parameter to be known.
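The idea can be sketched as follows. This is a minimal illustration of an FRNN-style decision rule, not the paper's exact formulation: it assumes the Kleene-Dienes implicator for the lower approximation, the min t-norm for the upper approximation, crisp class memberships, and attributes scaled to [0, 1]; all names and the toy data are illustrative.

```python
def similarity(x, y):
    # mean per-attribute similarity; assumes attributes scaled to [0, 1]
    return sum(1 - abs(a - b) for a, b in zip(x, y)) / len(x)

def frnn_classify(X, labels, query, k=3):
    # k most similar training instances to the query
    sims = sorted(((similarity(x, query), c) for x, c in zip(X, labels)),
                  reverse=True)[:k]
    best_class, best_score = None, -1.0
    for cls in set(labels):
        member = [1.0 if c == cls else 0.0 for _, c in sims]
        # lower approximation: Kleene-Dienes implicator max(1 - R, C)
        lower = min(max(1 - s, m) for (s, _), m in zip(sims, member))
        # upper approximation: min t-norm min(R, C)
        upper = max(min(s, m) for (s, _), m in zip(sims, member))
        # assign the class whose averaged approximation membership is largest
        score = (lower + upper) / 2
        if score > best_score:
            best_class, best_score = cls, score
    return best_class

# Toy data: two well-separated clusters
X = [(0.1, 0.2), (0.2, 0.1), (0.8, 0.9), (0.9, 0.8)]
y = ["benign", "benign", "malignant", "malignant"]
print(frnn_classify(X, y, (0.15, 0.15)))  # "benign"
```

Note that k here only bounds the neighborhood considered; unlike plain K-nearest neighbor, the decision is driven by the approximation memberships rather than a majority vote, which is why the method is less sensitive to the choice of K.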

The method proposed in this paper combines the fuzzy-rough nearest neighbor classifier with other methods.

In this paper, the fuzzy-rough nearest neighbor classifier is combined with consistency-based subset evaluation and a fuzzy-rough instance selection method. The fuzzy-rough set based instance selection method is applied in order to significantly reduce the number of instances while maintaining high classification accuracy. This instance selection approach uses the weak gamma evaluator.

(Important) A previously proposed method

Chen (2014) ("A hybrid intelligent model of analyzing clinical breast cancer data using clustering techniques with feature selection") presented a hybrid intelligent model for breast cancer diagnosis that can work in the absence of labeled training data. Hence, this work studies feature selection methods in unsupervised learning models. The model integrates clustering and feature selection. The study indicates that selecting a subset of relevant features, instead of using all the features in the original data set, can enhance the interpretability of clustering results.

Another method (the indiscernibility relation method is used to handle missing values)

Nahato, Harichandran, and Arputharaj (2015) designed a medical classification model which combines the rough set indiscernibility relation method and the backpropagation neural network. The indiscernibility relation method is utilized for handling missing values and obtaining an appropriate feature subset. They reported a classification accuracy of 98.60% for breast cancer diagnosis.

All of these works fuse rough set or fuzzy set theory with other methods; it is worth noting what role the rough and fuzzy sets play in each.

Fuzzy set based methods have been successfully applied for

building robust classification systems in medical and other fields.
For instance, Ganji and Abadeh (2011) combined fuzzy logic and
ant colony optimization for diagnosis of diabetes disease. Liu
et al. (2012) designed an enhanced fuzzy K-nearest neighbor classification algorithm for thyroid disease classification. In this algorithm, the particle swarm optimization algorithm was utilized for
parameter tuning. Shang and Barnes (2013) developed a classification model which combines the fuzzy-rough feature selection and
support vector machine algorithm. The combined model was
applied to Mars terrain image classification. Chen et al. (2013) presented a hybrid intelligent classification model which consists of
fuzzy K-nearest neighbor and principal component analysis for
Parkinson’s disease diagnosis. Rodger (2014a) developed a statistical model based on fuzzy nearest neighbor, regression and fuzzy
logic for improving energy costs savings in buildings. Rodger
(2014b) developed a fuzzy feasibility Bayesian probabilistic estimation model for a supply chain. Lee, Anaraki, Ahn, and An
(2015) developed a classification model based on fuzzy-rough feature selection and multi-tree genetic programming for intention pattern recognition using brain signals.

In this paper, rough and fuzzy sets are used both for data preprocessing and in the classifier.

Finally, an optimal training set is obtained via fuzzy-rough instance selection and consistency-based feature selection, and this set is used to build a classification model based on the fuzzy-rough nearest neighbor classifier.

Feature selection

In individual feature selection methods, an evaluation metric, such as information gain, the signal-to-noise statistic, correlation coefficient, t-statistic, or chi-square statistic, is computed for each feature, and a ranking of attributes based on their individual evaluations is obtained. Correlation-based feature selection and consistency-based subset evaluation are the two common types of feature subset selection (MDPI: "A Feature Subset Selection Method Based On High-Dimensional Mutual Information").

Compare the feature selection part of the method proposed in this paper with those of other papers.

Liu and Setiono (1996) proposed a probabilistic approach to feature selection, which evaluates the worth of a subset of attributes
by the level of consistency in the class values. The
consistency-based subset evaluation method uses the consistency
metric given by Eq. (1) (Hall & Holmes, 2003):
Consistency_s = 1 − (Σ_{i=0}^{J} (|D_i| − |M_i|)) / N    (1)
where s denotes an attribute subset, J represents the number of distinct combinations of attribute values for s, |D_i| represents the number of occurrences of the ith attribute value combination, |M_i| represents the cardinality of the majority class for the ith attribute value combination, and N denotes the total number of instances in the data set (Hall & Holmes, 2003).
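Eq. (1) can be computed directly by grouping instances on their attribute-value combinations. The sketch below is illustrative; the function name and toy data are assumptions, not from the paper.

```python
from collections import Counter, defaultdict

def consistency(instances, subset, labels):
    """Consistency_s = 1 - (sum_i (|D_i| - |M_i|)) / N, where each i is a
    distinct value combination of the attributes in `subset`."""
    groups = defaultdict(Counter)
    for row, label in zip(instances, labels):
        key = tuple(row[a] for a in subset)  # ith attribute-value combination
        groups[key][label] += 1
    # instances outside the majority class of their group are inconsistent
    inconsistent = sum(sum(c.values()) - max(c.values())
                       for c in groups.values())
    return 1 - inconsistent / len(instances)

# Toy example: two binary attributes, binary class; the duplicated pattern
# (0, 0) carries two different labels, so one instance is inconsistent.
X = [(0, 0), (0, 0), (0, 1), (1, 1)]
y = ["a", "b", "a", "b"]
print(consistency(X, (0, 1), y))  # 0.75
```

A fully consistent subset (every value combination maps to a single class) yields a consistency of 1.0, which is why the search below can compare candidate subsets by this single number.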
The consistency-based subset evaluation method generates a random subset, S, from the feature subset space (N) in every round of the process. If the number of features (C) contained in S is less than that of the current best subset, the inconsistency rate of the data described by S is checked against the inconsistency rate of the current best subset. If S is at least as consistent as the best subset, the best subset is replaced by S. The general structure of the consistency-based feature selection method is outlined in Table 1.
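The random-subset loop described above can be sketched as an LVF-style (Las Vegas Filter) search in the spirit of Liu and Setiono (1996). This is a rough sketch under stated assumptions, not the exact procedure of Table 1; function names, the round budget, and the toy data are illustrative.

```python
import random
from collections import Counter, defaultdict

def inconsistency_rate(X, y, subset):
    # fraction of instances outside the majority class of their
    # attribute-value group (1 - Consistency_s from Eq. (1))
    groups = defaultdict(Counter)
    for row, label in zip(X, y):
        groups[tuple(row[a] for a in subset)][label] += 1
    return sum(sum(c.values()) - max(c.values())
               for c in groups.values()) / len(X)

def lvf_search(X, y, n_features, rounds=500, seed=1):
    best = tuple(range(n_features))          # start from the full feature set
    best_rate = inconsistency_rate(X, y, best)
    rng = random.Random(seed)
    for _ in range(rounds):
        k = rng.randint(1, n_features)       # random subset size
        s = tuple(sorted(rng.sample(range(n_features), k)))
        # only subsets no larger than the current best are candidates
        if len(s) <= len(best) and inconsistency_rate(X, y, s) <= best_rate:
            best, best_rate = s, inconsistency_rate(X, y, s)
    return best

# Toy data: attribute 0 fully determines the class, attribute 1 is noise,
# so the search should settle on the single-attribute subset (0,).
X = [(0, 0), (0, 1), (1, 0), (1, 1)]
y = ["a", "a", "b", "b"]
print(lvf_search(X, y, 2))
```

Because the loop only accepts subsets that are no larger and no less consistent than the current best, the search monotonically shrinks the selected subset while preserving the consistency level of the full feature set.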

Feature selection algorithm

Any suitable search algorithm can be utilized to search the attribute subset space. In addition, three different approximation methods are implemented ("Fast wrapper feature subset selection in high-dimensional datasets by means of filter re-ranking").

Ideas for further work

Moreover, the performance of metaheuristic methods, such as particle swarm optimization, ant colony optimization, and genetic algorithms, should be taken into consideration in conjunction with consistency-based feature selection.
