The multinomial naive Bayes model has been widely used for probabilistic text classification. However, the parameter estimation for this model sometimes generates inappropriate probabilities. In this paper, we propose a topic document model for the multinomial naive Bayes text classification, where the parameters are estimated from normalized term frequencies of each training document. Experiments are conducted on Reuters 21578 and 20 Newsgroup collections, and our proposed approach obtained a significant improvement in performance compared to the traditional multinomial naive Bayes.
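As a rough illustration of the idea in the abstract, the sketch below contrasts the usual pooled-count estimate with an estimate built from per-document normalized term frequencies. It is only one plausible reading of "normalized term frequencies of each training document", not the authors' exact estimator: the function names, the Laplace-style smoothing constant `alpha`, and the toy data are all assumptions made for this example.

```python
from collections import Counter, defaultdict


def traditional_mnb(docs, labels, vocab, alpha=1.0):
    """Standard multinomial NB: pool raw term counts per class, then smooth."""
    counts = defaultdict(Counter)
    for tokens, label in zip(docs, labels):
        counts[label].update(t for t in tokens if t in vocab)
    params = {}
    for label, c in counts.items():
        total = sum(c.values())
        denom = total + alpha * len(vocab)
        params[label] = {t: (c[t] + alpha) / denom for t in vocab}
    return params


def per_document_normalized(docs, labels, vocab, alpha=1.0):
    """Assumed variant: normalize term frequencies inside each document,
    then sum those per-document distributions over the class and smooth.
    Each non-empty document contributes a total mass of 1, so long
    documents no longer dominate the class estimate."""
    sums = defaultdict(lambda: defaultdict(float))
    n_docs = Counter()  # non-empty documents seen per class
    for tokens, label in zip(docs, labels):
        tf = Counter(t for t in tokens if t in vocab)
        length = sum(tf.values())
        if length == 0:
            continue  # a document with no in-vocabulary terms adds nothing
        n_docs[label] += 1
        for t, n in tf.items():
            sums[label][t] += n / length
    params = {}
    for label, n in n_docs.items():
        denom = n + alpha * len(vocab)
        params[label] = {t: (sums[label][t] + alpha) / denom for t in vocab}
    return params


# Toy data (hypothetical): one long and one short document in the same class.
vocab = ["ball", "goal"]
docs = [["ball"] * 9 + ["goal"], ["goal", "goal"]]
labels = ["sports", "sports"]

pooled = traditional_mnb(docs, labels, vocab)
normalized = per_document_normalized(docs, labels, vocab)
print(pooled["sports"])
print(normalized["sports"])
```

With pooled counts the long document decides the estimate for "ball", whereas the normalized variant gives both documents equal weight. Whether this matches the paper's estimator in detail should be checked against the paper itself.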
The copyright of the original papers published on this site belongs to IEICE. Unauthorized use of the original or translated papers is prohibited. See IEICE Provisions on Copyright for details.
Sang-Bum KIM, Hae-Chang RIM, Jin-Dong KIM, "Topic Document Model Approach for Naive Bayes Text Classification" in IEICE TRANSACTIONS on Information, vol. E88-D, no. 5, pp. 1091-1094, May 2005, doi: 10.1093/ietisy/e88-d.5.1091.
URL: https://globals.ieice.org/en_transactions/information/10.1093/ietisy/e88-d.5.1091/_p
@ARTICLE{e88-d_5_1091,
author={Sang-Bum KIM and Hae-Chang RIM and Jin-Dong KIM},
journal={IEICE TRANSACTIONS on Information},
title={Topic Document Model Approach for Naive Bayes Text Classification},
year={2005},
volume={E88-D},
number={5},
pages={1091--1094},
abstract={The multinomial naive Bayes model has been widely used for probabilistic text classification. However, the parameter estimation for this model sometimes generates inappropriate probabilities. In this paper, we propose a topic document model for the multinomial naive Bayes text classification, where the parameters are estimated from normalized term frequencies of each training document. Experiments are conducted on Reuters 21578 and 20 Newsgroup collections, and our proposed approach obtained a significant improvement in performance compared to the traditional multinomial naive Bayes.},
keywords={},
doi={10.1093/ietisy/e88-d.5.1091},
ISSN={},
month={May},}
TY - JOUR
TI - Topic Document Model Approach for Naive Bayes Text Classification
T2 - IEICE TRANSACTIONS on Information
SP - 1091
EP - 1094
AU - Sang-Bum KIM
AU - Hae-Chang RIM
AU - Jin-Dong KIM
PY - 2005
DO - 10.1093/ietisy/e88-d.5.1091
JO - IEICE TRANSACTIONS on Information
SN -
VL - E88-D
IS - 5
JA - IEICE TRANSACTIONS on Information
Y1 - May 2005
AB - The multinomial naive Bayes model has been widely used for probabilistic text classification. However, the parameter estimation for this model sometimes generates inappropriate probabilities. In this paper, we propose a topic document model for the multinomial naive Bayes text classification, where the parameters are estimated from normalized term frequencies of each training document. Experiments are conducted on Reuters 21578 and 20 Newsgroup collections, and our proposed approach obtained a significant improvement in performance compared to the traditional multinomial naive Bayes.
ER -