Paper Title
An Efficient Approach to Reduce Text Dimension for Precise Text Classification
Abstract
Popular news websites such as Google and Sina serve information to many users every day. With the continuous
development of information technology, however, the volume of unorganized data keeps growing, and classifying and
organizing text has become a challenge. Traditional manual classification of news text consumes substantial human and
financial resources, and it is also slow. This paper studies news text classification and proposes a classification
model based on Latent Dirichlet Allocation (LDA) and Domain Word Filtering. Because the dimension of news texts is too
high, the model uses a topic model to reduce the text dimension and obtain good features. It reduces the feature
dimension of the news text effectively and achieves good classification results.
Keywords - Topic Model, LDA, Domain Word Filtering, News Website, Text Classification
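
The dimension reduction described in the abstract can be illustrated with a minimal sketch. It is not the implementation used in this paper: the toy corpus, the function name lda_gibbs, the hyperparameters alpha and beta, and the choice of collapsed Gibbs sampling for inference are all assumptions made for illustration. Domain Word Filtering is not shown, because the abstract does not define it. The sketch only shows how each document moves from a vocabulary-sized representation to a short vector of topic proportions.

```python
import random


def lda_gibbs(docs, num_topics, alpha=0.1, beta=0.01, iters=200, seed=0):
    """Fit LDA by collapsed Gibbs sampling; return (vocab, per-document topic proportions)."""
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    vocab_size = len(vocab)

    # Count tables maintained by the sampler.
    doc_topic = [[0] * num_topics for _ in docs]
    topic_word = [[0] * vocab_size for _ in range(num_topics)]
    topic_total = [0] * num_topics

    # Random initial topic assignment for every token.
    assignments = []
    for d, doc in enumerate(docs):
        z_doc = []
        for w in doc:
            z = rng.randrange(num_topics)
            z_doc.append(z)
            doc_topic[d][z] += 1
            topic_word[z][word_id[w]] += 1
            topic_total[z] += 1
        assignments.append(z_doc)

    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                v = word_id[w]
                z = assignments[d][i]
                # Remove the current token from the counts.
                doc_topic[d][z] -= 1
                topic_word[z][v] -= 1
                topic_total[z] -= 1
                # Conditional probability of each topic for this token (unnormalized).
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][v] + beta)
                    / (topic_total[k] + vocab_size * beta)
                    for k in range(num_topics)
                ]
                z = rng.choices(range(num_topics), weights=weights)[0]
                # Add the token back under its newly sampled topic.
                assignments[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][v] += 1
                topic_total[z] += 1

    # Smoothed topic proportions: one vector of length num_topics per document.
    theta = [
        [(doc_topic[d][k] + alpha) / (len(doc) + num_topics * alpha)
         for k in range(num_topics)]
        for d, doc in enumerate(docs)
    ]
    return vocab, theta


# Hypothetical, already tokenized toy corpus (not data from the paper).
docs = [
    "team wins match after late goal".split(),
    "coach praises team after match".split(),
    "market shares fall as bank rates rise".split(),
    "bank raises rates and market reacts".split(),
]
vocab, theta = lda_gibbs(docs, num_topics=2)
print("bag-of-words dimension:", len(vocab))
print("topic-feature dimension:", len(theta[0]))
```

Each row of theta can then be passed to a classifier in place of the much longer bag-of-words vector. With a real news corpus the vocabulary typically holds many thousands of words while the number of topics stays small, which is where the reduction matters.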