Papers by Keyword: Feature Selection


Authors: Kai Yang, Yong Long Jin, Zhi Jun He
Abstract: The concept lattice is the core data structure of formal concept analysis and represents the order relationship between concepts iconically. Feature selection has been a focus of research in machine learning, and it has been shown to be very effective in removing irrelevant and redundant features, increasing the efficiency of the learning process, and producing more intelligible learned results. Building on concept lattice theory, this paper proposes a new briefest feature subset selection algorithm based on a preference attribute. Users can put forward a preference attribute according to their subjective experience, and the algorithm discovers all the briefest feature subsets containing the given attribute. It first finds some special concept pairs and calculates their waned-value hypergraph, then obtains the minimal transversal of the hypergraph as the result. A practical example shows that the method is cogent and effective.
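The final step named in the abstract above, taking a minimal transversal of a hypergraph, can be illustrated with a small brute-force sketch. This is not the authors' algorithm: the function name `minimal_transversals` and the toy hypergraph are assumptions made for illustration, and the construction of the waned-value hypergraph from concept pairs is not shown because the abstract does not describe it.

```python
from itertools import combinations


def minimal_transversals(edges):
    """Return all minimal vertex sets that intersect every hyperedge.

    Brute force over subsets in order of increasing size; suitable only
    for small illustrative hypergraphs.
    """
    edges = [frozenset(e) for e in edges]
    vertices = sorted(set().union(*edges))
    found = []
    for size in range(len(vertices) + 1):
        for cand in combinations(vertices, size):
            c = frozenset(cand)
            # Skip supersets of an already found transversal (not minimal).
            if any(t <= c for t in found):
                continue
            # Keep the candidate if it hits every hyperedge.
            if all(c & e for e in edges):
                found.append(c)
    return found


# Hypothetical hypergraph over three attributes a, b, c.
toy_edges = [{"a", "b"}, {"b", "c"}, {"a", "c"}]
result = minimal_transversals(toy_edges)
print(sorted(sorted(t) for t in result))
```

Because candidates are enumerated by increasing size, any set that contains a previously found transversal is rejected, so every set returned is minimal.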
Authors: Hu Li, Peng Zou, Wei Hong Han
Abstract: The information explosion brings many challenges to text classification. The curse of dimensionality leads to a sharp increase in computational complexity and to lower classification accuracy. Therefore, it is critical to apply feature selection techniques before the actual classification. Automatic classification of English text has been researched for many years, but little work has addressed Chinese text. In this paper, several classic feature selection methods, namely TF, IG and CHI, are compared on the classification of Chinese text. We also take imbalanced data into consideration. Experimental results show that CHI performs better than IG and TF when the dataset is imbalanced, but there is no obvious difference on balanced data.
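As a rough illustration of the CHI criterion compared in the abstract above, the sketch below computes the standard chi-square statistic for a term and a class from a 2x2 document count table. The toy English documents, the labels, and the helper names `chi_square` and `chi_scores` are assumptions for illustration only; the paper's Chinese corpus and preprocessing are not reproduced here.

```python
def chi_square(a, b, c, d):
    """Chi-square statistic of a term for a class.

    a: documents in the class that contain the term
    b: documents outside the class that contain the term
    c: documents in the class without the term
    d: documents outside the class without the term
    """
    n = a + b + c + d
    denom = (a + c) * (b + d) * (a + b) * (c + d)
    if denom == 0:
        return 0.0
    return n * (a * d - c * b) ** 2 / denom


def chi_scores(docs, labels, target):
    """Score every term in the vocabulary against the target class."""
    vocab = set().union(*docs)
    scores = {}
    for term in vocab:
        a = sum(1 for doc, y in zip(docs, labels) if term in doc and y == target)
        b = sum(1 for doc, y in zip(docs, labels) if term in doc and y != target)
        c = sum(1 for doc, y in zip(docs, labels) if term not in doc and y == target)
        d = sum(1 for doc, y in zip(docs, labels) if term not in doc and y != target)
        scores[term] = chi_square(a, b, c, d)
    return scores


# Hypothetical tokenized documents, each represented as a set of terms.
docs = [{"goal", "match"}, {"goal", "team"}, {"vote", "party"}, {"vote", "match"}]
labels = ["sport", "sport", "politics", "politics"]
scores = chi_scores(docs, labels, "sport")
for term in sorted(scores, key=scores.get, reverse=True):
    print(term, round(scores[term], 3))
```

Terms that occur only in one class (here "goal" and "vote") receive the highest scores, while a term spread evenly across classes ("match") scores zero and would be dropped first.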
Authors: Xiao Yue Chen, Jian Zhong Zhou, Xiao Min Xu, Yong Chuan Zhang
Abstract: Fault diagnosis is very important for ensuring the safe operation of hydraulic generator units (HGU). Because of the complexity of HGU, the vast amount of measured data, and the redundant information it contains, the accuracy and timeliness of fault diagnosis are severely limited. At present, feature selection is an effective technique for breaking through this bottleneck. According to the specific characteristics of HGU faults, this paper puts forward a hierarchical feature selection method based on a classification tree (HFSMCT). HFSMCT selects the most effective feature for each branch node through filter evaluation criteria and a heuristic search strategy, and all the selected features constitute the final feature set. Moreover, HFSMCT is easy to design and implement, and it performs well in both computational efficiency and accuracy. The simulation results also show that HFSMCT is well suited to HGU fault diagnosis.
Authors: Y.H. Gai, Gang Yu
Abstract: This paper presents a novel hybrid feature selection algorithm based on Ant Colony Optimization (ACO) and Probabilistic Neural Networks (PNN). The wavelet packet transform (WPT) was used to process the bearing vibration signals and to generate vibration signal features. The hybrid feature selection algorithm was then used to select the most relevant features for diagnostic purposes. Experimental results for bearing fault diagnosis show that the proposed hybrid feature selection method greatly improves diagnostic performance.
Authors: Man Sheng Xiao, Zhe Xiao, Zhi Liu
Abstract: To address the problem that a large number of irrelevant and redundant features may reduce the performance of data classification in massive data sets, an automatic feature selection method based on mutual information and a fuzzy clustering algorithm is proposed. The method proceeds in two steps. First, the feature correlation is computed from mutual information, and the data are grouped according to the feature of maximum correlation. Second, the optimal number of features is determined automatically and the feature dimension is compressed by applying the fuzzy c-means clustering algorithm within the data groups. Theoretical analysis and experiments indicate that the method achieves higher efficiency in data classification.
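The first step in the abstract above relies on mutual information as a correlation measure. The sketch below shows one common way to estimate mutual information between two discrete variables from counts and to rank features against a label. The function name `mutual_information`, the toy features, and the ranking step are assumptions for illustration; the paper's grouping rule and its fuzzy c-means stage are not reproduced.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Empirical mutual information (in bits) of two discrete sequences."""
    n = len(xs)
    count_x = Counter(xs)
    count_y = Counter(ys)
    count_xy = Counter(zip(xs, ys))
    # Sum p(x, y) * log2( p(x, y) / (p(x) * p(y)) ) over observed pairs.
    return sum(
        (c / n) * log2(c * n / (count_x[x] * count_y[y]))
        for (x, y), c in count_xy.items()
    )


# Hypothetical discrete features and class label.
label = [0, 0, 1, 1]
features = {
    "f1": [0, 0, 1, 1],  # identical to the label
    "f2": [0, 1, 0, 1],  # independent of the label
}
ranked = sorted(
    features,
    key=lambda name: mutual_information(features[name], label),
    reverse=True,
)
print(ranked)
```

A feature that determines the label carries the full entropy of the label, while an independent feature carries none, so ranking by this score places the informative feature first.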
Authors: Si An, Xin Hua Fan
Abstract: Because the performance of audio category classification is closely related to the speech feature parameters selected, a systematic and practical feature parameter selection method is presented: an orthogonal experimental design method based on analysis of variance. This method has good properties [1]: the experiment is simple to construct, the results are easy to analyze, and it also provides guidance for subsequent experiments. We first select the feature parameters and their levels, then select an appropriate number of representative and typical points from a large number of experimental points to construct the orthogonal table, and perform an analysis of variance using mathematical statistics and the orthogonal principle to find the optimal feature combination. In this paper, the experimental corpus of voice content is divided into two categories, traffic and legal. A comparison of the experimental results shows that the method is not only effective for feature parameter selection in voice content classification but also provides direct guidance for subsequent experiment design and research.
Authors: Hao Yan Guo, Da Zheng Wang
Abstract: The traditional motivation behind feature selection algorithms is to find the best subset of features for a task using one particular learning algorithm. However, it has often been found that no single classifier is entirely satisfactory for a particular task. Therefore, how to further improve the performance of these single systems on the basis of the previous optimal feature subset is a very important issue. We investigate the notion of optimal feature selection and present a practical feature selection approach based on an optimal feature subset of a single CAD system, referred to in this paper as the multilevel optimal feature selection method (MOFS). Through MOFS, we select different optimal feature subsets in order to eliminate redundant or irrelevant features and obtain optimal features.
Authors: Ying Chieh Tsai, Ching Hsue Cheng, Jing Rong Chang
Abstract: The knowledge obtained from experience in monitoring a manufacturing process is critical to guaranteeing good products at the end of the manufacturing line. Recently, many methods have been developed for this purpose. In this paper, a new knowledge discovery model based on soft computing is proposed. The proposed model contains a new algorithm, Modified Correlation-based Feature Selection (MCFS); a second new algorithm, the Modified Minimum Entropy Principle Algorithm (MMEPA); and the Variable Precision Rough Set Model (VP-model). A real case study of monitoring the manufacturing process of an industrial conveyor belt shows several advantages of the proposed model: (1) MCFS can quickly identify and screen irrelevant, redundant, and noisy features for data reduction; (2) MMEPA can objectively construct membership functions of fuzzy sets for fuzzifying the reduced dataset; (3) the VP-model can extract causal relationship rules for controlling product quality; (4) the rules extracted by the proposed knowledge discovery model are easily understood and interpretable.
Authors: Hong Zhi Wang, Jian Ping Zhang, Zun Yi Shang
Abstract: In network traffic classification, the conventional PCA method still retains many features because most features have nearly uniform contribution rates. To overcome this problem, this paper proposes a novel feature selection method to reduce the data dimension of network traffic. The contribution rate of each feature in each component is calculated by a new weight criterion, and a maxima-order principle is proposed to determine which features are selected. Using three multi-class classification methods, a performance comparison is conducted on actual traffic data with 10-fold cross-validation. Experiments show that the proposed method achieves higher classification accuracy than the conventional PCA method.
Authors: Chun Mei Yu, Sheng Bo Yang
Abstract: To increase fault classification performance and reduce computational complexity, feature selection has been used for fault diagnosis. In this paper, we propose a feature selection method based on sparse representation and give the detailed procedure of the algorithm. Traditional selection methods based on wavelet packet decomposition and Bhattacharyya distance, as well as sparse methods including the sparse representation classifier, sparsity preserving projection, and sparse principal component analysis, were compared with the proposed method. Simulations showed that the proposed selection method gave better fault diagnosis performance on Tennessee Eastman Process data.