In the present paper, we attempt to show that the information about input patterns should be made as small as possible to improve generalization performance, under the condition that the network can still produce targets with appropriate accuracy. The information is defined with respect to hidden unit activity, on the assumption that the hidden units play a crucial role in storing information about input patterns. Specifically, the information is defined as the difference between the uncertainty of the hidden units at the initial stage of learning and their uncertainty at the final stage. After formulating an update rule for information minimization, we applied the method to a problem of language acquisition: inferring the past-tense forms of regular and irregular verbs. Experimental results confirmed that our method significantly decreased the information and greatly improved generalization performance.
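The information measure described above, the drop in hidden-unit uncertainty from the start to the end of learning, can be sketched as follows. This is a hedged illustration of one plausible reading, not the paper's exact formulation: it assumes "uncertainty" means the Shannon entropy of hidden activations normalized into a probability distribution over hidden units, and the function names and example data are invented for the sketch.

```python
import numpy as np

def hidden_entropy(activations):
    """Entropy of hidden-unit activity, treating activations
    normalized to sum to one as a probability distribution over
    hidden units (one plausible reading of 'uncertainty')."""
    p = activations / activations.sum()
    return -np.sum(p * np.log(p + 1e-12))  # small epsilon avoids log(0)

def information(initial_activations, final_activations):
    """Information stored in the hidden layer, defined as the
    decrease in uncertainty from the initial stage of learning
    to the final stage."""
    return hidden_entropy(initial_activations) - hidden_entropy(final_activations)

# Example: activity starts nearly uniform (maximal uncertainty)
# and ends concentrated on a few units (low uncertainty).
initial = np.full(10, 0.5)  # uniform sigmoid-like outputs
final = np.array([0.9, 0.05, 0.05, 0.05, 0.05,
                  0.05, 0.05, 0.05, 0.05, 0.05])
print(information(initial, final))  # positive: information was stored
```

Under this reading, "information minimization" would push the final-stage entropy back toward the initial uniform state, shrinking the difference computed here while the output error term keeps the targets accurate.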
The copyright of the original papers published on this site belongs to IEICE. Unauthorized use of the original or translated papers is prohibited. See IEICE Provisions on Copyright for details.
Ryotaro KAMIMURA, Toshiyuki TAKAGI, Shohachiro NAKANISHI, "Improving Generalization Performance by Information Minimization," IEICE TRANSACTIONS on Information, vol. E78-D, no. 2, pp. 163-173, February 1995.
URL: https://globals.ieice.org/en_transactions/information/10.1587/e78-d_2_163/_p
@ARTICLE{e78-d_2_163,
author={Ryotaro KAMIMURA and Toshiyuki TAKAGI and Shohachiro NAKANISHI},
journal={IEICE TRANSACTIONS on Information},
title={Improving Generalization Performance by Information Minimization},
year={1995},
volume={E78-D},
number={2},
pages={163-173},
month={February},
}
TY - JOUR
TI - Improving Generalization Performance by Information Minimization
T2 - IEICE TRANSACTIONS on Information
SP - 163
EP - 173
AU - Ryotaro KAMIMURA
AU - Toshiyuki TAKAGI
AU - Shohachiro NAKANISHI
PY - 1995
JO - IEICE TRANSACTIONS on Information
VL - E78-D
IS - 2
Y1 - February 1995
ER -