Search results: 1-12 of 12 records found for "Text categorization". Query time: 0.093 seconds.
Automatic Text Categorization in Terms of Genre and Author
Automatic Text Categorization Genre Author
2015/8/25
The two main factors that characterize a text are its content and its style, and both can be used as a means of categorization. In this paper we present an approach to text categorization in terms of ...
Feature Selection Based on Term Frequency and T-Test for Text Categorization
feature selection term frequency t-test text classification
2013/6/14
Much work has been done on feature selection. Existing methods, such as the Chi-Square statistic and Information Gain, are based on document frequency. However, these methods have two shortcomings: one is...
A QUICK SURVEY OF TEXT CATEGORIZATION ALGORITHMS
information retrieval algorithms machine learning text classification
2010/1/11
This paper gives an overview of basic formulations and approaches to text classification. It surveys the algorithms used in text categorization: handcrafted rules, decision trees, decision ...
Experimental Study on Representing Units in Chinese Text Categorization
byte 3-gram Experimental Study Chinese Text Categorization
2009/1/22
This paper is a comparative study on representing units in Chinese text categorization. Several kinds of representing units, including byte 3-gram, Chinese character, Chinese word, and Chinese word wi...
An Improved k-Nearest Neighbor Algorithm for Text Categorization
text categorization k-Nearest Neighbor algorithm Chinese computing and algorithm design
2009/1/22
k is the most important parameter in a text categorization system based on the k-Nearest Neighbor (kNN) algorithm. In the classification process, the k nearest documents to the test one in the training set are...
Scalable Term Selection for Text Categorization
Scalable Term Selection Text Categorization
2015/1/24
Scalable Term Selection for Text Categorization.
Exploiting Category Information and Document Information to Improve Term Weighting for Text Categorization (figure)
Category Information Document Information
2015/1/24
Traditional tfidf-like term weighting schemes use a rough statistic, idf, as the term weighting factor, which does not exploit the category information (category labels on documents) and intra-docume...
A Comparison and Semi-Quantitative Analysis of Words and Character-Bigrams as Features in Chinese Text Categorization
Semi-Quantitative Analysis Character-Bigrams
2015/1/24
A Comparison and Semi-Quantitative Analysis of Words and Character-Bigrams as Features in Chinese Text Categorization.
Raising High-Degree Overlapped Character Bigrams into Trigrams for Dimensionality Reduction in Chinese Text Categorization (figure)
Trigrams Dimensionality Reduction
2015/1/26
High dimensionality of feature space is a crucial obstacle for Automated Text Categorization. According to the characteristics of Chinese character N-grams, this paper reveals that there exists a kind...
Eliminating High-degree Biased Character Bigrams for Dimensionality Reduction in Chinese Text Categorization (figure)
Character Bigrams Dimensionality Reduction
2015/1/26
High dimensionality of feature space is a main obstacle for Text Categorization (TC). In a candidate feature set consisting of Chinese character bigrams, there exist a number of bigrams which are high...
A Study on Feature Weighting in Chinese Text Categorization (figure)
Feature Weighting Chinese Text Categorization
2015/1/26
In Text Categorization (TC) based on Vector Space Model, feature weighting and feature selection are major problems and difficulties. This paper proposes two methods of weighting features by combining...
Chinese Text Categorization Based on the Binary Weighting Model with Non-Binary Smoothing (figure)
Binary Weighting Non-binary Smoothing
2015/1/26
In Text Categorization (TC) based on the vector space model, feature weighting is vital for the categorization effectiveness. Various non-binary weighting schemes are widely used for this purpose. By ...