Joarder KAMRUZZAMAN Yukio KUMAGAI Hiromitsu HIKITA
It has been reported that the generalization performance of multilayer feedforward networks strongly depends on the attainment of saturated hidden outputs in response to the training set. A standard Backpropagation (BP) network usually uses intermediate values of the hidden units as the internal representation of the training patterns. In this letter, we propose the construction of a 3-layer cascaded network in which two 2-layer networks are first trained independently by the delta rule and then cascaded. After cascading, the intermediate layer can be viewed as a hidden layer which is trained to attain preassigned saturated outputs in response to the training set. This network is particularly easy to construct for a linearly separable training set, and it can also be constructed for nonlinearly separable tasks by using higher-order inputs at the input layer or by assigning proper codes at the intermediate layer, which can be obtained from a trained Fahlman and Lebiere network. Simulation results show that, at least when the training set is linearly separable, use of the proposed cascaded network significantly enhances generalization performance compared with a BP network, and that it also maintains high generalization ability for nonlinearly separable training sets. The dependence of the cascaded network's performance on the codes preassigned at the intermediate layer is discussed, and a suggestion about the preassigned coding is presented.
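A minimal sketch of the construction described above, not the authors' code: two 2-layer nets are trained independently by the delta rule and then cascaded. The linearly separable task (logical OR) and the 2-bit saturated intermediate code are illustrative assumptions.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_delta_rule(X, T, epochs=3000, lr=0.5, seed=0):
    """Train one 2-layer net (a single weight matrix plus bias) by the delta rule."""
    rng = np.random.default_rng(seed)
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])            # append bias input
    W = rng.normal(scale=0.1, size=(Xb.shape[1], T.shape[1]))
    for _ in range(epochs):
        Y = sigmoid(Xb @ W)
        W += lr * Xb.T @ ((T - Y) * Y * (1.0 - Y))           # delta-rule update
    return W

def forward(X, W):
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])
    return sigmoid(Xb @ W)

# Hypothetical linearly separable task (logical OR) with a preassigned,
# saturated 2-bit local code for the future hidden layer.
X     = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
codes = np.array([[1, 0], [0, 1], [0, 1], [0, 1]], dtype=float)   # preassigned
T     = np.array([[0], [1], [1], [1]], dtype=float)

W1 = train_delta_rule(X, codes)    # first 2-layer net: input -> intermediate codes
W2 = train_delta_rule(codes, T)    # second 2-layer net: codes -> target
hidden = forward(X, W1)            # after cascading, this layer acts as the hidden layer
print(np.round(hidden, 2), np.round(forward(hidden, W2), 2))
```

Because the hidden layer is trained toward the saturated 0/1 codes rather than settling at intermediate values, its outputs approach the preassigned code for each training pattern.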
Masanori HAMAMOTO Joarder KAMRUZZAMAN Yukio KUMAGAI Hiromitsu HIKITA
We apply Fahlman and Lebiere's (FL) algorithm to network synthesis and incremental learning by making use of already-trained networks, each performing a specified task, to design a system that performs a global or extended task without destroying the information gained by the previously trained nets. Investigation shows that the synthesized or expanded FL networks have generalization ability superior to that of Backpropagation (BP) networks, in which the number of newly added hidden units must be pre-specified.
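As a rough illustration only (not the authors' construction), the sketch below scores a candidate hidden unit with Fahlman and Lebiere's covariance criterion, the growth step that lets new units be added while previously trained subnets stay frozen. The data and variable names are placeholders.

```python
import numpy as np

def candidate_score(candidate_out, residual_err):
    """FL covariance criterion: S = sum_o | sum_p (V_p - mean V)(E_{p,o} - mean E_o) |."""
    V = candidate_out - candidate_out.mean()         # candidate activations, shape (P,)
    E = residual_err - residual_err.mean(axis=0)     # residual output errors, shape (P, n_out)
    return np.abs(V @ E).sum()

# Placeholder data standing in for patterns propagated through the frozen,
# already-trained subnets plus one candidate unit fed by their outputs.
rng = np.random.default_rng(0)
V = rng.normal(size=6)          # candidate activations over 6 patterns
E = rng.normal(size=(6, 2))     # residual errors at 2 output units
print(candidate_score(V, E))    # the best-scoring candidate would be installed and frozen
```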
Masanori HAMAMOTO Joarder KAMRUZZAMAN Yukio KUMAGAI Hiromitsu HIKITA
Fahlman and Lebiere's (FL) learning algorithm begins with a two-layer network and, in the course of training, can construct various network architectures. We applied the FL algorithm to the same three-layer network architecture as a Backpropagation (BP) network and compared their generalization properties. Simulation results show that the FL algorithm yields excellent saturation of the hidden units, which cannot be achieved by the BP algorithm, and furthermore provides better generalization ability than the BP algorithm.
Kazuki ITO Masanori HAMAMOTO Joarder KAMRUZZAMAN Yukio KUMAGAI
A new neural network system for object recognition is proposed which is invariant to translation, scaling and rotation. The system consists of two parts. The first is a preprocessor which obtains projections from the input image plane such that the projection features are translation and scale invariant, and then applies the Rapid Transform, which makes the transformed outputs rotation invariant. The second part is a neural net classifier which receives the outputs of the preprocessing part as its input signals. The most attractive feature of this system is that, by using only a simple shift-invariant transformation (the Rapid Transform) in conjunction with the projection of the input image plane, invariance is achieved and the system is of reasonably small size. Experiments with six geometrical objects under different degrees of scaling and rotation show that the proposed system performs excellently when the neural net classifier is trained by the Cascade-Correlation learning algorithm proposed by Fahlman and Lebiere.
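A small sketch of the rotation-invariance stage, assuming the standard butterfly formulation of the Rapid Transform (Walsh-Hadamard-like sums and differences with magnitudes); the projection step and the feature length used here are illustrative assumptions, not details from the paper.

```python
import numpy as np

def rapid_transform(x):
    """Rapid Transform: fast-WHT-style butterflies with magnitudes taken at each stage.

    The output is invariant to cyclic shifts of x (length must be 2**m), so a
    rotation of the angular projection maps to the same classifier input.
    """
    x = np.asarray(x, dtype=float)
    n = x.size
    assert n and n & (n - 1) == 0, "length must be a power of two"
    step = n // 2
    while step >= 1:
        y = np.empty_like(x)
        for start in range(0, n, 2 * step):
            a = x[start:start + step]
            b = x[start + step:start + 2 * step]
            y[start:start + step] = np.abs(a + b)
            y[start + step:start + 2 * step] = np.abs(a - b)
        x = y
        step //= 2
    return x

# A cyclic shift of a (hypothetical) projection feature vector leaves the
# transform unchanged.
f = np.array([3., 1., 4., 1., 5., 9., 2., 6.])
print(np.allclose(rapid_transform(f), rapid_transform(np.roll(f, 3))))   # True
```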
One of the reasons for slow convergence in Backpropagation learning is the diminishing value of the derivative of the commonly used activation functions as the nodes approach extreme values, namely 0 or 1. In this letter, we propose the arctangent activation function to accelerate learning. Simulation results indicate considerable improvement in convergence performance.
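A minimal sketch of the idea, assuming a particular scaling of the arctangent to the range (0, 1); the exact form used in the letter may differ. The point is that the arctangent derivative decays only polynomially, so error signals shrink more slowly for saturated nodes than with the logistic sigmoid.

```python
import numpy as np

def atan_act(z):
    return 0.5 + np.arctan(z) / np.pi         # squashes to (0, 1); assumed scaling

def atan_act_deriv(z):
    return 1.0 / (np.pi * (1.0 + z * z))      # decays polynomially in |z|

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sigmoid_deriv(z):
    s = sigmoid(z)
    return s * (1.0 - s)                      # decays exponentially in |z|

# At z = 5 the logistic derivative is about 0.0066 while the arctangent
# derivative is about 0.012, so the backpropagated error is attenuated less.
print(sigmoid_deriv(5.0), atan_act_deriv(5.0))
```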
Joarder KAMRUZZAMAN Yukio KUMAGAI Hiromitsu HIKITA
The most commonly used activation function in Backpropagation learning is the sigmoid, while a linear function is sometimes used at the output layer on the view that the choice between these activation functions makes little difference to a network's performance. In this letter, we show a distinct performance difference between a network with linear output units and a similar network with sigmoid output units in terms of convergence behavior and generalization ability. We experimented with two cost functions: the sum-squared error used in standard Backpropagation and the recently reported log-likelihood. We find that, with the sum-squared error cost function and hidden units with a nonsteep sigmoid function, the use of linear units at the output layer instead of sigmoidal ones accelerates convergence considerably, while generalization ability is slightly degraded. A network with sigmoid output units trained by the log-likelihood cost function yields even faster convergence and better generalization, but does not converge at all with linear output units. It is also shown that a network with linear output units needs more hidden units for convergence.
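A brief sketch of the output-layer error signals ("deltas") for the combinations compared, not taken from the letter itself: with sum-squared error and sigmoid outputs the delta carries a y(1-y) factor that vanishes for saturated units, with linear outputs it does not, and with the log-likelihood (cross-entropy) cost and sigmoid outputs the factor cancels. With linear outputs the log-likelihood cost is undefined outside (0, 1), consistent with the reported failure to converge for that combination.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def delta_sse_sigmoid(t, z):
    y = sigmoid(z)
    return (t - y) * y * (1.0 - y)   # y(1-y) factor shrinks for saturated units

def delta_sse_linear(t, z):
    y = z                            # linear output unit
    return t - y                     # no attenuation of the error signal

def delta_loglik_sigmoid(t, z):
    y = sigmoid(z)
    return t - y                     # y(1-y) cancels in the log-likelihood gradient

# For a badly saturated output unit (net input z = -6, target 1) the
# SSE/sigmoid delta is tiny while the other two remain large.
for f in (delta_sse_sigmoid, delta_sse_linear, delta_loglik_sigmoid):
    print(f.__name__, float(f(1.0, -6.0)))
```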
Yukio KUMAGAI Joarder KAMRUZZAMAN Hiromitsu HIKITA
In this letter, we present a distinct alternative formulation of the cross talk in an associative memory based on the outer product algorithm extended to higher order, together with a performance evaluation, based on this formulation, in terms of the probability of exact data recall. The significant feature of these formulations is that both the cross talk and the probability are explicitly represented as functions of the Hamming distance between the memorized keys and the applied input key, and of the degree of higher-order correlation. Simulation results show that the exact data-retrieval ability of the associative memory, using randomly generated data and keys, is in good agreement with our theoretical estimation.
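A minimal sketch of the outer product algorithm extended to higher order, under the assumptions of bipolar {-1, +1} keys and data and second-order correlation; the sizes and the storage tensor layout are illustrative, not the letter's.

```python
import numpy as np

rng = np.random.default_rng(0)
P, nk, nd = 5, 16, 8                       # patterns, key length, data length
keys = rng.choice([-1, 1], size=(P, nk))
data = rng.choice([-1, 1], size=(P, nd))

# Second-order outer product storage: W[i, j, k] = sum_p data_p[i] * key_p[j] * key_p[k]
W = np.einsum('pi,pj,pk->ijk', data, keys, keys)

def recall(key):
    """Second-order recall: sign of W contracted with key (x) key."""
    return np.sign(np.einsum('ijk,j,k->i', W, key, key))

# For a stored key the signal term scales with nk**2, while the cross talk from
# the other patterns grows with their overlap with the applied key, i.e. it
# falls as the Hamming distance between keys increases; exact recall is expected
# here with high probability.
print(np.array_equal(recall(keys[0]), data[0]))
```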
Joarder KAMRUZZAMAN Yukio KUMAGAI Hiromitsu HIKITA
We present an extension of the previously proposed 3-layer feedforward network called a cascaded network. Cascaded networks are trained to realize category classification employing binary input vectors and locally represented binary target output vectors. To realize nonlinearly separable tasks, the extended cascaded network presented here is constructed by introducing higher-order cross-product inputs at the input layer. In the construction of a cascaded network, two 2-layer networks are first trained independently by the delta rule and then cascaded. After cascading, the intermediate layer can be understood as a hidden layer which is trained to attain preassigned saturated outputs in response to the training set. In a cascaded network trained to categorize binary image patterns, the saturation of the hidden outputs reduces the effect of corrupting disturbances present in the input. We demonstrate that the extended cascaded network is able to realize nonlinearly separable tasks and yields better generalization ability than the Backpropagation network.
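A small sketch of the input-layer extension, assuming second-order cross products of a binary input vector; the helper name and the XOR example are illustrative, not the authors' construction.

```python
import numpy as np
from itertools import combinations

def add_cross_products(X):
    """Append the pairwise products x_i * x_j (i < j) to each binary input vector."""
    X = np.asarray(X, dtype=float)
    cross = [X[:, i] * X[:, j] for i, j in combinations(range(X.shape[1]), 2)]
    return np.hstack([X] + [c[:, None] for c in cross])

# XOR becomes linearly separable in the augmented space (x1, x2, x1*x2), so a
# single weight layer trained by the delta rule can map it onto the preassigned
# saturated intermediate codes of the cascaded network.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
print(add_cross_products(X))
```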