We present a training algorithm that creates a neural network (NN) ensemble for classification tasks. It combines a competitive decay of hidden nodes in the component NNs with a selective deletion of NNs from the ensemble, and is therefore named a pruning algorithm for NN ensembles (PNNE). A node cooperation function for the hidden nodes in each NN is introduced to support the decaying process. Training is based on negative correlation learning, which ensures diversity among the component NNs in the ensemble. The less important networks are deleted according to a criterion that indicates over-fitting. The PNNE has been tested extensively on a number of standard benchmark problems in machine learning, including the Australian credit card assessment, breast cancer, circle-in-the-square, diabetes, glass identification, ionosphere, iris identification, and soybean identification problems. The results show that the classification performance of the NN ensembles produced by the PNNE is better than or competitive with that of the conventional constructive and fixed-architecture algorithms. Furthermore, compared with the constructive algorithm, the NN ensemble produced by the PNNE consists of a smaller number of component NNs, and these are more diverse owing to the uniform training of all component NNs.
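As a rough illustration of the negative correlation learning step mentioned above, the sketch below computes the per-network error for a single training pattern, assuming the commonly used formulation in which each network's squared error is augmented by a penalty that correlates its deviation from the ensemble mean with the deviations of the other networks. The function name `ncl_errors`, the scalar-output setting, and the default value of `lam` are assumptions made for illustration; the abstract does not give the PNNE's exact error function, node cooperation function, or deletion criterion.

```python
def ncl_errors(outputs, target, lam=0.5):
    """Per-network negative-correlation-learning error for one pattern.

    outputs: list of scalar outputs F_i, one per component network
    target:  desired output d
    lam:     penalty strength (hypothetical default, not from the source)
    """
    m = len(outputs)
    ensemble = sum(outputs) / m  # simple-average ensemble output
    total_dev = sum(f_j - ensemble for f_j in outputs)
    errors = []
    for f_i in outputs:
        dev_i = f_i - ensemble
        # Penalty p_i = (F_i - F) * sum over j != i of (F_j - F)
        penalty = dev_i * (total_dev - dev_i)
        errors.append(0.5 * (f_i - target) ** 2 + lam * penalty)
    return ensemble, errors


if __name__ == "__main__":
    ensemble, errors = ncl_errors([0.2, 0.6, 1.0], target=1.0)
    print(ensemble, errors)
```

With `lam=0` each network is trained independently on its own squared error; larger values reward networks whose outputs deviate from the ensemble mean, which is how this formulation encourages diversity.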
This paper presents a dynamic node decaying method (DNDM) for layered artificial neural networks that is suitable for classification problems. Our purpose is not to minimize the total output error but to obtain high generalization ability with a minimal structure. Users of the conventional back propagation (BP) learning algorithm can convert their programs to the DNDM by simply inserting a few lines. This method extends a previously proposed method to more general classification problems, and its validity is tested on recent standard benchmark problems. In addition, we analyze the training process and the effects of various parameters. In the method, the nodes in a layer compete for survival in an automatic process governed by a criterion. Relatively less important nodes are decayed gradually during BP learning, while more important ones play larger roles, until the best performance under the given conditions is achieved. The criterion evaluates each node by its total influence on progress toward the upper layer, and it is used as the index for dynamic competitive decaying. Two additional criteria are used: Generalization Loss to measure over-fitting and Learning Progress to stop training. Determination of these criteria requires a few human interventions. We have applied this algorithm to several standard benchmark problems, such as the cancer, diabetes, heart disease, glass, and iris problems. The results show the effectiveness of the method. The classification error and the size of the generated networks are comparable to those obtained by other methods that generally require larger modifications, or a complete rewriting, of a program based on the conventional BP algorithm.
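The two auxiliary criteria named above can be sketched as follows, assuming they follow the widely used early-stopping definitions: generalization loss as the relative increase of the current validation error over the lowest validation error seen so far (in percent), and progress as how much the mean training error over a recent strip of epochs exceeds the minimum in that strip (per thousand). The abstract does not state the exact definitions used by the DNDM, so the formulas, the function names, and the strip length `k` are assumptions. The node-evaluation criterion itself is not specified in the abstract and is not sketched here.

```python
def generalization_loss(val_errors):
    """Generalization loss at the latest epoch, in percent.

    val_errors: validation errors per epoch (all assumed positive).
    Computes 100 * (E_va(t) / E_opt(t) - 1), where E_opt(t) is the
    lowest validation error observed up to epoch t.
    """
    e_opt = min(val_errors)
    return 100.0 * (val_errors[-1] / e_opt - 1.0)


def training_progress(train_errors, k=5):
    """Progress over the last k epochs, per thousand.

    Computes 1000 * (mean(strip) / min(strip) - 1) over the last k
    training errors (all assumed positive). Values near zero mean the
    training error has stopped changing within the strip.
    """
    strip = train_errors[-k:]
    return 1000.0 * (sum(strip) / (len(strip) * min(strip)) - 1.0)


if __name__ == "__main__":
    print(generalization_loss([0.5, 0.4, 0.44]))
    print(training_progress([0.30, 0.25, 0.22, 0.21, 0.20]))
```

In a training loop, a rising generalization loss would signal over-fitting, while a progress value close to zero would suggest that further epochs are unlikely to help; the thresholds for both would have to be chosen by the user.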
In this paper, we present a learning approach, positive correlation learning (PCL), that creates a multilayer neural network with good generalization ability. A correlation function is added to the standard error function of back propagation learning, and the resulting error function is minimized by a steepest-descent method. During training, all the unnecessary units in the hidden layer become positively correlated with the necessary ones. PCL can therefore create positively correlated activities of hidden units in response to input patterns. We show that PCL can reduce the information on the input patterns and decay the weights, both of which lead to improved generalization ability. Here, the information is defined with respect to hidden unit activity, since the hidden units play a crucial role in storing the information on the input patterns. That is, as previously proposed, the information is defined as the difference between the uncertainty of the hidden units at the initial stage of learning and their uncertainty at the final stage of learning. After deriving new weight update rules for PCL, we applied the method to several standard benchmark classification problems, such as the breast cancer, diabetes, and glass identification problems. Experimental results confirmed that PCL produces positively correlated hidden units and significantly reduces the amount of information, resulting in improved generalization ability.
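To make the idea of adding a correlation term to the error function concrete, here is a minimal sketch. The abstract does not give the actual correlation function used by PCL, so the form below (the sum of pairwise covariances between hidden-unit activations, subtracted from the standard error with a weight `lam`) is purely an illustrative assumption, as are the function names `pairwise_covariance_sum` and `pcl_style_error`.

```python
def pairwise_covariance_sum(hidden):
    """Sum of covariances over all pairs of hidden units.

    hidden: list of patterns, each a list of hidden-unit activations.
    This is an illustrative stand-in, not the paper's correlation function.
    """
    n = len(hidden)
    h = len(hidden[0])
    means = [sum(row[i] for row in hidden) / n for i in range(h)]
    total = 0.0
    for i in range(h):
        for j in range(i + 1, h):
            total += sum(
                (row[i] - means[i]) * (row[j] - means[j]) for row in hidden
            ) / n
    return total


def pcl_style_error(base_error, hidden, lam=0.1):
    """Standard error minus a weighted positive-correlation reward.

    Minimizing this value favors hidden units whose activations vary
    together across patterns (hypothetical form; lam is an assumption).
    """
    return base_error - lam * pairwise_covariance_sum(hidden)


if __name__ == "__main__":
    print(pcl_style_error(1.0, [[0.0, 0.0], [1.0, 1.0]]))
    print(pcl_style_error(1.0, [[0.0, 1.0], [1.0, 0.0]]))
```

Under this assumed form, hidden units that move together lower the objective and units that move in opposition raise it, so steepest descent on the combined function would push the hidden layer toward positively correlated activity.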