The Research and Application in Intelligent Document Retrieval Based on Text Quantification and Subject Mapping

Abstract:

Document retrieval is an important means of academic exchange and of acquiring new knowledge. The traditional retrieval method is to choose the corresponding database category and match the input keywords. This method returns a large number of documents, which makes it hard for users to find the most relevant one. This paper puts forward a text quantification method that mines the features of each element in a document: word concept, position-function weight value, improved weight characteristic value, text-distribution-function weight value, and text element length. The contribution of each word to the document is obtained by combining these five element characteristics, and every document in the database is stored in numeric form as the contributions of its elements. The paper also designs a subject mapping scheme. First, a similarity calculation method based on contribution and association rules is designed; using this method, the documents in the database are clustered, and a feature extraction method is then used to find the subject of each class. When a user searches for a document, the description the user inputs is quantified and automatically mapped to a class by subject mapping, and a ranked sequence of documents is then retrieved by computing the similarity between the description and the features of the other documents in that class. Experiments show that the scheme is intelligent and accurate and that it improves retrieval speed.
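The retrieval flow described in the abstract (quantify each text, find a subject for each class, map the query to a class, then rank the documents inside that class) can be illustrated with a minimal sketch. The abstract gives no formulas, so everything concrete here is an assumption rather than the authors' method: `contribution_vector` uses only a term frequency and a hypothetical title-position weight in place of the five element characteristics, cosine similarity stands in for the similarity based on contribution and association rules, a centroid stands in for the extracted class subject, and the classes are given by hand instead of being produced by clustering.

```python
import math
from collections import Counter, defaultdict


def contribution_vector(tokens, title_tokens=()):
    """Quantify a text as {term: contribution}.

    Assumption: contribution = position weight * relative frequency.
    This is a stand-in for the paper's five-feature combination.
    """
    counts = Counter(tokens)
    total = sum(counts.values()) or 1
    title = set(title_tokens)
    vec = {}
    for term, n in counts.items():
        position_weight = 2.0 if term in title else 1.0  # hypothetical weight
        vec[term] = position_weight * n / total
    return vec


def cosine(a, b):
    """Cosine similarity between two sparse vectors (assumed measure)."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def class_subject(vectors):
    """Represent a class subject as the centroid of its document vectors."""
    subject = defaultdict(float)
    for vec in vectors:
        for term, w in vec.items():
            subject[term] += w / len(vectors)
    return dict(subject)


def retrieve(query_tokens, classes, top_k=3):
    """Map the query to the closest class, then rank that class's documents.

    classes: {class_name: {doc_id: contribution vector}}
    """
    query = contribution_vector(query_tokens)
    subjects = {
        name: class_subject(list(docs.values())) for name, docs in classes.items()
    }
    best = max(subjects, key=lambda name: cosine(query, subjects[name]))
    ranked = sorted(
        classes[best].items(), key=lambda kv: cosine(query, kv[1]), reverse=True
    )
    return best, [doc_id for doc_id, _ in ranked[:top_k]]


# Toy data: two hand-made classes standing in for the clustering result.
classes = {
    "retrieval": {
        "d1": contribution_vector(
            ["document", "retrieval", "query", "retrieval"], ["retrieval"]
        ),
        "d2": contribution_vector(["text", "clustering", "document"]),
    },
    "traffic": {
        "d3": contribution_vector(["traffic", "graph", "prediction"]),
    },
}

best, docs = retrieve(["document", "retrieval"], classes)
print(best, docs)
```

Restricting the ranking step to a single class is what would account for the speed gain the abstract claims: the query is compared against one subject per class first, and only the documents of the winning class are scored individually.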

Info:

Periodical:

Advanced Materials Research (Volumes 605-607)

Pages:

2561-2568

DOI:

https://doi.org/10.4028/www.scientific.net/AMR.605-607.2561

Online since:

December 2012

Authors:

Qin Wang*, Shou Ning Qu, Tao Du, Ming Jing Zhang

Keywords:

Feature Extraction, Intelligent Retrieval, Quantification, Subject Mapping

Copyright:

© 2013 Trans Tech Publications Ltd. All Rights Reserved

* - Corresponding Author
