This paper aims to improve categorization performance on the classes with few samples in imbalanced datasets, approaching the problem at the data level through re-sampling. The main idea is to balance the number of texts across categories by adding texts to the under-represented categories. The experiment indicates that the system effectively improves text-categorization accuracy.
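The re-sampling idea can be illustrated with a minimal sketch. The abstract does not say how the additional texts are produced, so the code below assumes the simplest variant, random oversampling, in which texts from smaller categories are duplicated at random until every category matches the largest one. The function name `oversample` and its parameters are hypothetical and are not taken from the paper.

```python
import random
from collections import defaultdict


def oversample(texts, labels, seed=0):
    """Balance categories by randomly duplicating minority-class texts.

    Assumption: plain duplication with replacement. The paper may
    generate or select the added texts differently.
    """
    if not texts:
        return [], []

    rng = random.Random(seed)

    # Group the texts by their category label.
    by_label = defaultdict(list)
    for text, label in zip(texts, labels):
        by_label[label].append(text)

    # Every category is grown to the size of the largest one.
    target = max(len(group) for group in by_label.values())

    out_texts, out_labels = [], []
    for label, group in by_label.items():
        # Draw the missing number of texts from the category itself.
        extra = [rng.choice(group) for _ in range(target - len(group))]
        for text in group + extra:
            out_texts.append(text)
            out_labels.append(label)
    return out_texts, out_labels


if __name__ == "__main__":
    texts = ["goal scored", "match won", "team lost", "cup final", "rates rise"]
    labels = ["sport", "sport", "sport", "sport", "finance"]
    balanced_texts, balanced_labels = oversample(texts, labels)
    print(balanced_labels.count("sport"), balanced_labels.count("finance"))
```

In practice, re-sampling of this kind would normally be applied to the training split only, so that duplicated texts do not leak into the evaluation data.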