Author Search Result

[Author] Sumio WATANABE (5 hits)

Results 1-5 of 5
  • Equations of States in Statistical Learning for an Unrealizable and Regular Case

    Sumio WATANABE  

     
    PAPER-Neural Networks and Bioengineering

      Vol:
    E93-A No:3
      Page(s):
    617-626

    Many learning machines that have hierarchical structure or hidden variables are now being used in information science, artificial intelligence, and bioinformatics. However, several learning machines used in such fields are not regular but singular statistical models, hence their generalization performance remains unknown. To overcome this problem, in previous papers we proved new equations of states in statistical learning, by which the Bayes generalization loss can be estimated from the Bayes training loss and the functional variance, on the condition that the true distribution is a singularity contained in the learning machine. In this paper, we prove that the same equations hold even if the true distribution is not contained in the parametric model. We also prove that the proposed equations in the regular case are asymptotically equivalent to the Takeuchi information criterion. Therefore, the proposed equations are applicable without any condition on the unknown true distribution.
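
    For context, the equations referred to here take, at inverse temperature β = 1, the widely cited WAIC form below; this is a paraphrase of the standard statement in Watanabe's theory, not an excerpt from the paper.

    ```latex
    \mathbb{E}[G_n] \;=\; \mathbb{E}[T_n] \;+\; \frac{1}{n}\,\mathbb{E}[V_n] \;+\; o\!\left(\frac{1}{n}\right),
    \qquad
    V_n \;=\; \sum_{i=1}^{n}\Big\{\mathbb{E}_w\!\big[(\log p(X_i\mid w))^2\big]-\mathbb{E}_w\!\big[\log p(X_i\mid w)\big]^2\Big\},
    ```

    where G_n is the Bayes generalization loss, T_n the Bayes training loss, V_n the functional variance, and E_w the posterior average.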

  • Generalization Performance of Subspace Bayes Approach in Linear Neural Networks

    Shinichi NAKAJIMA  Sumio WATANABE  

     
    PAPER-Algorithm Theory

      Vol:
    E89-D No:3
      Page(s):
    1128-1138

    In unidentifiable models, Bayes estimation has an advantage in generalization performance over maximum likelihood estimation. However, accurate approximation of the posterior distribution requires huge computational costs. In this paper, we consider an alternative approximation method, which we call a subspace Bayes approach: an empirical Bayes approach in which part of the parameters are regarded as hyperparameters. Consequently, in some three-layer models, this approach requires far less computation than Markov chain Monte Carlo methods. We show that, in three-layer linear neural networks, the subspace Bayes approach is asymptotically equivalent to a positive-part James-Stein type shrinkage estimation, and we theoretically clarify its generalization error and training error. We also discuss its domination over the maximum likelihood estimation and its relation to the variational Bayes approach.
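
    For orientation, a positive-part James-Stein type shrinkage estimator has the standard textbook form sketched below. This is illustrative of that general estimator only; the paper derives a shrinkage of this type for three-layer linear neural networks, not for a single Gaussian mean as here.

    ```python
    import numpy as np

    def positive_part_james_stein(x, sigma2=1.0):
        """Positive-part James-Stein shrinkage of a d-dimensional
        observation x toward the origin (standard textbook form)."""
        x = np.asarray(x, dtype=float)
        d = x.size
        if d <= 2:
            return x.copy()  # shrinkage gives no gain for d <= 2
        factor = 1.0 - (d - 2) * sigma2 / np.dot(x, x)
        return max(factor, 0.0) * x  # positive part: never reverse the sign

    # Example: shrink a noisy 10-dimensional observation of a zero mean
    rng = np.random.default_rng(0)
    x = rng.normal(size=10)  # observation with unit noise around zero
    print(positive_part_james_stein(x))
    ```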

  • Statistical Learning Theory of Quasi-Regular Cases

    Koshi YAMADA  Sumio WATANABE  

     
    PAPER-General Fundamentals and Boundaries

      Vol:
    E95-A No:12
      Page(s):
    2479-2487

    Many learning machines, such as normal mixtures and layered neural networks, are not regular but singular statistical models, because the map from a parameter to a probability distribution is not one-to-one. The conventional statistical asymptotic theory cannot be applied to such learning machines, because the likelihood function cannot be approximated by any normal distribution. Recently, a new statistical theory has been established based on algebraic geometry, and it was clarified that the generalization and training errors are determined by two birational invariants, the real log canonical threshold and the singular fluctuation. However, their concrete values were left unknown. In the present paper, we propose a new concept, the quasi-regular case in statistical learning theory. A quasi-regular case is not a regular case but a singular one; however, it has the same property as a regular case. In fact, we prove that, in a quasi-regular case, the two birational invariants are equal to each other, so that the symmetry of the generalization and training errors holds. Moreover, the concrete values of the two birational invariants are obtained explicitly, hence the quasi-regular case is useful for studying statistical learning theory.
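
    For reference, singular learning theory gives the following standard asymptotics for the Bayes generalization error G_n and training error T_n (at inverse temperature β = 1), in terms of the real log canonical threshold λ and the singular fluctuation ν; this is quoted from the general theory for context, not from the paper itself.

    ```latex
    \mathbb{E}[G_n] = \frac{\lambda}{n} + o\!\left(\frac{1}{n}\right),
    \qquad
    \mathbb{E}[T_n] = \frac{\lambda - 2\nu}{n} + o\!\left(\frac{1}{n}\right).
    ```

    Hence λ = ν, as proved for the quasi-regular case, yields the symmetry E[T_n] ≈ -E[G_n]; in a regular model both invariants equal d/2 for a d-dimensional parameter.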

  • A Modified Information Criterion for Automatic Model and Parameter Selection in Neural Network Learning

    Sumio WATANABE  

     
    PAPER-Bio-Cybernetics and Neurocomputing

      Vol:
    E78-D No:4
      Page(s):
    490-499

    This paper proposes a practical training algorithm for artificial neural networks, by which both the optimally pruned model and the optimally trained parameter for the minimum prediction error can be found simultaneously. In the proposed algorithm, the conventional information criterion is modified into a differentiable function of the weight parameters, and then it is minimized while being controlled back to the conventional form. Since this method leaves several theoretical problems open, its effectiveness is examined by computer simulations and by an application to practical ultrasonic image reconstruction.
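
    The abstract does not spell out the algorithm, so the following is a hedged sketch of the general idea it describes: replace the non-differentiable parameter-count penalty of a conventional criterion (such as AIC's k) with a smooth surrogate, and anneal the surrogate back toward the conventional form during training. The surrogate, schedule, and toy problem below are placeholders for illustration, not the paper's method.

    ```python
    import numpy as np

    # Smooth surrogate for the parameter count:
    #   k_beta(w) = sum_j w_j^2 / (w_j^2 + beta)
    # tends to the number of nonzero weights as beta -> 0, so annealing
    # beta "controls the criterion back to its conventional form".

    rng = np.random.default_rng(1)
    X = rng.normal(size=(100, 8))
    true_w = np.array([2.0, -1.5, 0, 0, 0, 0, 0, 0])  # only 2 relevant weights
    y = X @ true_w + 0.1 * rng.normal(size=100)

    w = rng.normal(size=8) * 0.1
    lr, lam = 0.01, 0.05
    for step in range(3000):
        beta = max(1e-4, 1.0 - step / 3000)          # anneal beta toward 0
        resid = X @ w - y
        grad_err = 2 * X.T @ resid / len(y)          # gradient of training error
        grad_pen = lam * 2 * beta * w / (w**2 + beta) ** 2  # gradient of k_beta
        w -= lr * (grad_err + grad_pen)

    print(np.round(w, 2))  # irrelevant weights are driven near zero
    ```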

  • Testing Homogeneity for Normal Mixture Models: Variational Bayes Approach

    Natsuki KARIYA  Sumio WATANABE  

     
    PAPER-Information Theory

      Vol:
    E103-A No:11
      Page(s):
    1274-1282

    The test of homogeneity for normal mixtures has been used in various fields, but its theoretical understanding is limited because the parameter set of the null hypothesis corresponds to singular points in the parameter space. In this paper, we shed light on this issue from a new perspective, variational Bayes, and offer a theory for testing homogeneity based on it. The stochastic behavior of the variational free energy, which is necessary for constructing a hypothesis test, has remained unknown under conventional theory. We clarify it for the first time and construct a new test based on it. Numerical experiments show the validity of our results.
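
    To make the general shape of such a test concrete, the sketch below compares a variational fit of a two-component normal mixture (alternative) against a single normal (null) on simulated data. The statistic and any rejection threshold here are placeholders for illustration; the paper derives the actual stochastic behavior of the variational free energy and the resulting test.

    ```python
    import numpy as np
    from scipy import stats
    from sklearn.mixture import BayesianGaussianMixture

    rng = np.random.default_rng(0)
    x = rng.normal(0.0, 1.0, size=(200, 1))   # data generated under the null

    # Null model: single Gaussian, log-likelihood at the MLE
    mu, sd = x.mean(), x.std(ddof=0)
    loglik_null = stats.norm.logpdf(x, mu, sd).sum()

    # Alternative: variational Bayes fit of a 2-component mixture.
    # vb.lower_bound_ is sklearn's variational objective for the fit;
    # check its normalization before using it as a calibrated statistic.
    vb = BayesianGaussianMixture(n_components=2, max_iter=500, random_state=0)
    vb.fit(x)
    elbo_alt = vb.lower_bound_ * len(x)

    statistic = elbo_alt - loglik_null
    print(f"test statistic: {statistic:.2f}")  # compare to a calibrated threshold
    ```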
