This paper shows that sum-of-products expression (SOP) minimization provides generalization ability. We show this in three steps. First, various classes of SOPs are generated. Second, minterms of each SOP are randomly selected to generate partially defined functions. Third, the original functions are reconstructed from the partially defined functions by SOP minimization. We consider Achilles heel functions, majority functions, monotone increasing cascade functions, functions generated from random SOPs, monotone increasing random SOPs, circle functions, and globe functions. As for generalization ability, the presented method is compared with Naive Bayes, multilayer perceptron, support vector machine, JRIP, J48, and random forest. For these functions, in many cases, only 10% of the input combinations are sufficient to reconstruct more than 90% of the truth tables of the original functions.
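A minimal sketch of the reconstruction experiment, using sympy's SOPform as a stand-in two-level minimizer and a 6-variable majority function as the target; the function choice and the 10% sampling rate are illustrative assumptions, not the paper's benchmark settings:

```python
# Sketch: reconstruct a function from ~10% of its truth table by SOP minimization.
import random
from itertools import product

from sympy import symbols, false
from sympy.logic.boolalg import SOPform

random.seed(0)
n = 6
xs = symbols(f'x0:{n}')

def target(bits):
    return int(sum(bits) > n // 2)          # illustrative target: majority function

all_inputs = list(product([0, 1], repeat=n))
sample = random.sample(all_inputs, len(all_inputs) // 10)      # ~10% of the 2^n inputs

minterms = [list(b) for b in sample if target(b) == 1]           # known ON-set points
dontcares = [list(b) for b in all_inputs if b not in sample]     # unknown points

# Minimize the partially defined function; unsampled points are don't-cares.
sop = SOPform(xs, minterms, dontcares) if minterms else false

# Fraction of the original truth table reproduced by the minimized SOP.
agree = sum(bool(sop.subs(dict(zip(xs, map(bool, b))))) == bool(target(b))
            for b in all_inputs)
print(f'reconstructed {agree}/{len(all_inputs)} truth-table entries')
```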
Min YOON Yeboon YUN Hirotaka NAKAYAMA
Support vector algorithms try to maximize the shortest distance between the sample points and the discrimination hyperplane. This paper suggests total margin algorithms, which consider the distances between all data points and the separating hyperplane. The method extends and modifies the existing algorithms. Experimental studies show that the total margin algorithms perform well compared with the existing support vector algorithms.
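As a hedged illustration of the idea (not necessarily the authors' exact formulation; the slack variables \(\xi_i\), surplus variables \(\eta_i\), and penalties \(C_1, C_2\) are notation introduced here), the standard soft-margin machine maximizes the shortest margin,

\[ \min_{w,b,\xi}\ \tfrac{1}{2}\lVert w\rVert^2 + C\sum_i \xi_i \quad \text{s.t.}\quad y_i(w^\top x_i + b) \ge 1 - \xi_i,\ \ \xi_i \ge 0, \]

whereas a total-margin objective also accounts for the distances of the correctly classified points by rewarding their surplus:

\[ \min_{w,b,\xi,\eta}\ \tfrac{1}{2}\lVert w\rVert^2 + C_1\sum_i \xi_i - C_2\sum_i \eta_i \quad \text{s.t.}\quad y_i(w^\top x_i + b) \ge 1 - \xi_i + \eta_i,\ \ \xi_i,\eta_i \ge 0. \]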
Naotake KAMIURA Teijiro ISOKAWA Yutaka HATA Nobuyuki MATSUI Kazuharu YAMATO
To enhance the fault tolerance of feedforward neural networks (NNs for short) implemented in hardware, we discuss a learning algorithm that converges without adding extra neurons or a large amount of extra learning time and cycles. Our algorithm, modified from the standard backpropagation algorithm (SBPA for short), limits the synaptic weights of neurons to a certain range during the learning phase. The upper and lower bounds of the weights are calculated from their average and standard deviation. Our algorithm then re-updates any weight beyond the calculated range to the upper or lower bound. Since this decreases the standard deviation of the weights, it is useful for enhancing fault tolerance. We apply NNs trained with other algorithms and with ours to a character recognition problem. It is shown that ours is superior to the others in reliability and in extra learning time and/or cycles. In addition, we clarify that our algorithm never degrades the generalization ability of NNs, although it coerces the weights into the calculated range.
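A minimal NumPy sketch of the weight-limiting step described here, assuming bounds of the form mean ± k·std with a tunable constant k (the abstract states only that the bounds are computed from the average and standard deviation):

```python
import numpy as np

def limit_weights(W, k=2.0):
    """Clamp weights outside [mean - k*std, mean + k*std] to the nearest bound.

    The bounds follow the abstract (average and standard deviation of the
    weights); the factor k is an assumed tuning constant.
    """
    mu, sigma = W.mean(), W.std()
    return np.clip(W, mu - k * sigma, mu + k * sigma)

# Inside an ordinary backpropagation loop, the limiting step simply follows
# each weight update, e.g.:
#     W_hidden -= lr * grad_hidden;  W_hidden = limit_weights(W_hidden)
#     W_output -= lr * grad_output;  W_output = limit_weights(W_output)
```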
For pattern classification/recognition tasks, the influence of the activation function on the fault tolerance of feedforward neural networks is empirically investigated. The simulation results show that the activation function largely influences both the fault tolerance and the generalization property of neural networks. It is found that neural networks with a symmetric sigmoid activation function are considerably more fault tolerant than networks with an asymmetric sigmoid function. However, no close relation between fault tolerance and generalization was observed, and networks with the asymmetric activation function generalize slightly better than networks with the symmetric one. First, the influence of the activation function on the fault tolerance of neural networks is investigated on the XOR problem; the results are then generalized by evaluating the fault tolerance of different NNs on different benchmark problems.
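For concreteness, the two activation functions usually meant by these terms (the exact parameterization used in the paper is not given in the abstract) are the asymmetric (logistic) sigmoid and its symmetric counterpart:

\[ f_{\mathrm{asym}}(x) = \frac{1}{1+e^{-x}} \in (0,1), \qquad f_{\mathrm{sym}}(x) = \frac{2}{1+e^{-x}} - 1 = \tanh\!\left(\frac{x}{2}\right) \in (-1,1). \]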
Joarder KAMRUZZAMAN Yukio KUMAGAI Hiromitsu HIKITA
We present an extension of the previously proposed 3-layer feedforward network called a cascaded network. Cascaded networks are trained to perform category classification using binary input vectors and locally represented binary target output vectors. To realize nonlinearly separable tasks, the extended cascaded network presented here is constructed by introducing higher-order cross-product inputs at the input layer. In the construction of the cascaded network, two 2-layer networks are first trained independently by the delta rule and then cascaded. After cascading, the intermediate layer can be regarded as a hidden layer that is trained to attain preassigned saturated outputs in response to the training set. In a cascaded network trained to categorize binary image patterns, the saturation of the hidden outputs reduces the effect of disturbances present in the input. We demonstrate that the extended cascaded network can realize nonlinearly separable tasks and yields better generalization ability than the Backpropagation network.
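A small sketch of the input-augmentation step, assuming the higher-order terms are pairwise products of the binary inputs; the XOR example is illustrative and not taken from the paper:

```python
import numpy as np
from itertools import combinations

def add_cross_products(X):
    """Append all pairwise products x_i * x_j to each binary input vector.

    With these higher-order terms a task such as XOR becomes linearly
    separable, so the first delta-rule-trained stage can handle it.
    """
    pairs = list(combinations(range(X.shape[1]), 2))
    cross = np.stack([X[:, i] * X[:, j] for i, j in pairs], axis=1)
    return np.hstack([X, cross])

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
print(add_cross_products(X))
# XOR(x0, x1) = x0 + x1 - 2*x0*x1 is linear in the augmented features.
```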
Kazuki ITO Masanori HAMAMOTO Joarder KAMRUZZAMAN Yukio KUMAGAI
A new neural network system for object recognition is proposed that is invariant to translation, scaling, and rotation. The system consists of two parts. The first is a preprocessor, which obtains a projection from the input image plane such that the projection features are translation and scale invariant, and then applies the Rapid Transform, which makes the transformed outputs rotation invariant. The second part is a neural network classifier, which receives the outputs of the preprocessing part as its input signals. The most attractive feature of this system is that, by using only a simple shift-invariant transformation (the Rapid Transform) in conjunction with the projection of the input image plane, invariance is achieved and the system remains reasonably small. Experiments with six geometrical objects at different degrees of scaling and rotation show that the proposed system performs excellently when the neural network classifier is trained by the Cascade-correlation learning algorithm proposed by Fahlman and Lebiere.
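A sketch of the Rapid (R-) Transform stage of the preprocessor, using one common formulation of the transform (Walsh-Hadamard-style butterflies with pairwise sums and absolute differences); the paper's exact variant and the projection step are not shown:

```python
import numpy as np

def rapid_transform(x):
    """Rapid (R-) Transform: at each stage, pairwise sums go to the first
    half and absolute pairwise differences to the second half, making the
    output invariant to cyclic shifts of the input.  (One common
    formulation; the paper's exact variant or normalization may differ.)"""
    x = np.asarray(x, dtype=float).copy()
    n = len(x)
    assert n and n & (n - 1) == 0, "length must be a power of two"
    h = n // 2
    while h >= 1:
        for start in range(0, n, 2 * h):
            a = x[start:start + h].copy()
            b = x[start + h:start + 2 * h].copy()
            x[start:start + h] = a + b
            x[start + h:start + 2 * h] = np.abs(a - b)
        h //= 2
    return x

# Shift invariance: a sequence and its cyclic shift transform identically.
v = np.array([3.0, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0, 6.0])
print(np.allclose(rapid_transform(v), rapid_transform(np.roll(v, 2))))   # True
```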
Joarder KAMRUZZAMAN Yukio KUMAGAI Hiromitsu HIKITA
It has been reported that the generalization performance of multilayer feedforward networks strongly depends on the attainment of saturated hidden outputs in response to the training set. A standard Backpropagation (BP) network usually uses intermediate values of the hidden units as the internal representation of the training patterns. In this letter, we propose the construction of a 3-layer cascaded network in which two 2-layer networks are first trained independently by the delta rule and then cascaded. After cascading, the intermediate layer can be viewed as a hidden layer that is trained to attain preassigned saturated outputs in response to the training set. This network is particularly easy to construct for a linearly separable training set, and it can also be constructed for nonlinearly separable tasks by using higher-order inputs at the input layer or by assigning proper codes at the intermediate layer, which can be obtained from a trained Fahlman and Lebiere network. Simulation results show that, at least when the training set is linearly separable, use of the proposed cascaded network significantly enhances the generalization performance compared to the BP network, and it also maintains high generalization ability for nonlinearly separable training sets. The dependence of the cascaded network's performance on the preassigned codes at the intermediate layer is discussed, and a suggestion about the preassigned coding is presented.
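A minimal sketch of one stage of this construction, assuming a single-weight-layer sigmoid network trained by the generalized delta rule toward preassigned near-saturated codes; the learning rate, epoch count, and target coding are illustrative:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_delta_rule(X, T, lr=0.5, epochs=2000):
    """Train one 2-layer (single weight layer) sigmoid network by the
    generalized delta rule toward the preassigned near-saturated codes T."""
    X1 = np.hstack([X, np.ones((len(X), 1))])          # append bias input
    W = np.zeros((X1.shape[1], T.shape[1]))
    for _ in range(epochs):
        Y = sigmoid(X1 @ W)
        W += lr * X1.T @ ((T - Y) * Y * (1 - Y))       # delta-rule update
    return W
```

A second network is trained in the same way from the code vectors to the final targets; feeding the first network's outputs into the second then yields the 3-layer cascaded network, with the code layer acting as the saturated hidden layer.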
Joarder KAMRUZZAMAN Yukio KUMAGAI Hiromitsu HIKITA
The most commonly used activation function in Backpropagation learning is the sigmoid, while a linear function is also sometimes used at the output layer, with the view that the choice between these activation functions does not make a considerable difference in the network's performance. In this letter, we show a distinct difference in performance between a network with linear output units and a similar network with sigmoid output units in terms of convergence behavior and generalization ability. We experimented with two types of cost functions, namely, the sum-squared error used in standard Backpropagation and the recently reported log-likelihood. We find that, with the sum-squared error cost function and hidden units with a non-steep sigmoid function, the use of linear units at the output layer instead of sigmoidal ones accelerates convergence considerably, while the generalization ability is slightly degraded. A network with sigmoid output units trained with the log-likelihood cost function yields even faster convergence and better generalization, but it does not converge at all with linear output units. It is also shown that a network with linear output units needs more hidden units for convergence.
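For reference, the two cost functions compared above, written with targets \(t_k\) and outputs \(o_k\) (notation introduced here):

\[ E_{\mathrm{sse}} = \frac{1}{2}\sum_k (t_k - o_k)^2, \qquad E_{\mathrm{ll}} = -\sum_k \bigl[t_k \ln o_k + (1-t_k)\ln(1-o_k)\bigr]. \]

With sigmoid output units \(o_k = \sigma(net_k)\), the output-layer error term for the log-likelihood cost reduces to \(o_k - t_k\), whereas for the sum-squared error it is \((o_k - t_k)\,o_k(1-o_k)\), which shrinks as the outputs saturate; this is consistent with the convergence behavior reported above. The log-likelihood cost also requires \(o_k \in (0,1)\), which linear output units do not guarantee.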
Masanori HAMAMOTO Joarder KAMRUZZAMAN Yukio KUMAGAI Hiromitsu HIKITA
We apply Fahlman and Lebiere's (FL) algorithm to network synthesis and incremental learning by making use of already-trained networks, each performing a specified task, to design a system that performs a global or extended task without destroying the information gained by the previously trained nets. Our investigation shows that the synthesized or expanded FL networks have generalization ability superior to that of Backpropagation (BP) networks, in which the number of newly added hidden units must be pre-specified.
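One plausible reading of this synthesis scheme, sketched below under that assumption (the wrapper class and function names are hypothetical): the previously trained networks are frozen, their outputs are concatenated into a feature vector, and the FL procedure then grows new hidden units on top of these features for the global task, so the earlier training is never overwritten.

```python
import numpy as np

class FrozenSubnet:
    """Wraps an already-trained sub-network whose weights are kept fixed."""
    def __init__(self, forward_fn):
        self.forward = forward_fn      # maps an input batch to this subnet's outputs

def synthesized_features(subnets, X):
    """Concatenate the frozen sub-networks' outputs for the global task.

    In this reading, FL/cascade-correlation training then adds new hidden
    units on top of these features, leaving the trained nets untouched.
    """
    return np.hstack([net.forward(X) for net in subnets])
```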
Masanori HAMAMOTO Joarder KAMRUZZAMAN Yukio KUMAGAI Hiromitsu HIKITA
Fahlman and Lebiere's (FL) learning algorithm begins with a two-layer network and, in the course of training, can construct various network architectures. We applied the FL algorithm to the same three-layer network architecture as a back propagation (BP) network and compared their generalization properties. Simulation results show that the FL algorithm yields excellent saturation of the hidden units, which cannot be achieved by the BP algorithm, and furthermore has better generalization ability than the BP algorithm.
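For reference, when the FL algorithm installs a hidden (candidate) unit it maximizes the covariance-magnitude score from Fahlman and Lebiere's cascade-correlation paper,

\[ S = \sum_{o} \Bigl|\, \sum_{p} (V_p - \bar{V})(E_{p,o} - \bar{E}_o) \Bigr|, \]

where \(V_p\) is the candidate unit's output on training pattern \(p\), \(E_{p,o}\) is the residual error at output unit \(o\), and the bars denote averages over the training patterns. Driving this score up pushes the unit's outputs toward extreme, decisive values on the patterns that still carry error, which is one way to read the strong hidden-unit saturation reported above.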