Author Search Result

[Author] Ryohei SASANO(2hit)

  • Inference Discrepancy Based Curriculum Learning for Neural Machine Translation

    Lei ZHOU  Ryohei SASANO  Koichi TAKEDA  

     
    PAPER-Natural Language Processing

    Publicized: 2023/10/18
    Vol: E107-D No: 1
    Page(s): 135-143

    In practice, even a well-trained neural machine translation (NMT) model can still make biased inferences on the training set due to distribution shifts. In human learning, if we cannot reproduce something correctly after learning it multiple times, we consider it more difficult. Likewise, a training example that causes a large discrepancy between inference and reference implies higher learning difficulty for the MT model. Therefore, we propose adopting the inference discrepancy of each training example as the difficulty criterion and ranking training examples from easy to hard accordingly. In this way, a trained model can guide the curriculum learning process of an initial model identical to itself. This training scheme is analogous to a pretrained vanilla model guiding the learning process of a curriculum NMT model. In this paper, we assess the effectiveness of the proposed training scheme and examine the influence of translation direction, evaluation metrics, and different curriculum schedules. Experimental results on the translation benchmarks WMT14 English ⇒ German, WMT17 Chinese ⇒ English, and Multitarget TED Talks Task (MTTT) English ⇔ German, English ⇔ Chinese, and English ⇔ Russian demonstrate that our proposed method consistently improves translation performance over the advanced Transformer baseline.

  • The Effect of Corpus Size on Case Frame Acquisition for Predicate-Argument Structure Analysis

    Ryohei SASANO  Daisuke KAWAHARA  Sadao KUROHASHI  

     
    PAPER-Natural Language Processing

    Vol: E93-D No: 6
    Page(s): 1361-1368

    This paper reports the effect of corpus size on case frame acquisition for predicate-argument structure analysis in Japanese. For this study, we collect a Japanese corpus consisting of up to 100 billion words, and construct case frames from corpora of six different sizes. Then, we apply these case frames to syntactic and case structure analysis, and zero anaphora resolution, in order to investigate the relationship between the corpus size for case frame acquisition and the performance of predicate-argument structure analysis. We obtained better analyses by using case frames constructed from larger corpora; the performance was not saturated even with a corpus size of 100 billion words.
