Kensuke SUMOTO Kenta KANAKOGI Hironori WASHIZAKI Naohiko TSUDA Nobukazu YOSHIOKA Yoshiaki FUKAZAWA Hideyuki KANUKA
Security-related issues have become more significant with the proliferation of IT. Collating security-related information in a database helps improve security. For example, Common Vulnerabilities and Exposures (CVE) is a security knowledge repository containing descriptions of vulnerabilities in software or source code. Although the descriptions include various entities, they lack a uniform entity structure, which makes security analysis based on individual entities difficult. Developing a consistent entity structure would benefit the security field. Herein we propose a method to automatically label selected entities in CVE descriptions by applying Named Entity Recognition (NER). We manually labeled 3287 CVE descriptions and conducted experiments using the machine learning model BERT to compare the proposed method with labeling by regular expressions. The proposed machine learning method significantly improves labeling accuracy. It achieves an F1 score of about 0.93, precision of about 0.91, and recall of about 0.95, demonstrating that our method has the potential to automatically label selected entities in CVE descriptions.
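As a reading aid for the reported scores, the following minimal sketch shows how precision, recall, and F1 are commonly computed for entity labeling, and checks that an F1 of about 0.93 is consistent with a precision of about 0.91 and a recall of about 0.95. It assumes exact-match scoring over (start, end, label) spans, which the abstract does not specify; the function names and the example spans and labels are hypothetical and are not taken from the paper.

```python
def f1_from_precision_recall(precision, recall):
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def entity_scores(gold_spans, predicted_spans):
    """Exact-match precision, recall, and F1 over (start, end, label) spans.

    Assumption: a predicted entity counts as correct only if its span
    and label both match a gold entity exactly.
    """
    gold = set(gold_spans)
    predicted = set(predicted_spans)
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall, f1_from_precision_recall(precision, recall)


# Hypothetical spans for one description; the labels are illustrative only.
gold = [(0, 2, "PRODUCT"), (5, 6, "VERSION")]
predicted = [(0, 2, "PRODUCT"), (7, 8, "VERSION")]
example_scores = entity_scores(gold, predicted)

# Consistency check on the rounded figures quoted in the abstract.
reported_f1 = f1_from_precision_recall(0.91, 0.95)
```

With the rounded figures from the abstract, `reported_f1` evaluates to roughly 0.93, which matches the stated F1 score.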
Yuki NOYORI Hironori WASHIZAKI Yoshiaki FUKAZAWA Hideyuki KANUKA Keishi OOSHIMA Shuhei NOJIRI Ryosuke TSUCHIYA
Resource limitations require that bugs be resolved efficiently. The bug-fixing process relies on bug reports, which are generated from service user reports. Developers read these reports, fix the bugs, and discuss them by posting comments directly in the reports. Although several studies have investigated the initial report in bug reports, few have examined the comments. Our research focuses on the comments in bug reports. Currently, everyone is free to comment, but the bug fixing time may be affected by how comments are written. Herein we investigate the topics of comments in bug reports. Mixed topics do not affect the bug fixing time. However, the bug fixing time tends to be shorter when the discussion of the phenomenon is short.