A Method of Recommendation the Most Used XML Tags

Abstract:

Processing large data sets, known today as big data processing, is still a problem without a well-defined solution. The data can be both structured and unstructured. For the structured part, eXtensible Markup Language (XML) is a major tool that lets document owners freely describe and organize their data using their own markup tags. One major problem behind this freedom, however, lies in the big data retrieval process. The same or similar information described with different tags or different structures may not be retrieved if the query statement contains keywords that differ from those used in the markup tags. The best way to solve this problem is to specify a standard set of markup tags for each problem domain. Creating such a standard set manually is hard work and time-consuming. In addition, it may be hard to define terms that are acceptable to everyone. This research proposes a model for a new technique, XML Tag Recommendation (XTR), that aims to solve this problem. The technique applies the idea of Case-Based Reasoning (CBR) by collecting the most used tags in each domain as a case. These tags come from the collection of related words in WordNet. WordCount, a web site that reports the frequency of words, is used to choose the most used one. The input (problem) to the XTR system is an XML document containing the tags specified by the document owner. The solution is a set of recommended tags, namely the most used tags for the problem domain of the document. Document owners are free to change or keep the tags in their documents and can provide feedback to the XTR system.
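The recommendation step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the related-word groups stand in for a WordNet lookup, the frequency ranks stand in for a WordCount lookup, and every value and name here (`RELATED_WORDS`, `FREQUENCY_RANK`, `recommend_tag`, `recommend_tags`) is hypothetical and chosen only for illustration. The case retrieval and feedback parts of CBR are not modelled.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for WordNet: groups of related words in one domain.
RELATED_WORDS = [
    {"author", "writer", "creator"},
    {"title", "name", "heading"},
    {"price", "cost", "charge"},
]

# Hypothetical stand-in for WordCount: a lower rank means a more frequently
# used word. These numbers are invented for the example.
FREQUENCY_RANK = {
    "author": 1500, "writer": 2500, "creator": 6000,
    "title": 900, "name": 300, "heading": 5000,
    "price": 800, "cost": 600, "charge": 1200,
}


def recommend_tag(tag):
    """Return the most used word among the words related to `tag`.

    Tags with no known related words are returned unchanged.
    """
    word = tag.lower()
    for group in RELATED_WORDS:
        if word in group:
            # Pick the related word with the best (lowest) frequency rank.
            return min(group, key=lambda w: FREQUENCY_RANK.get(w, float("inf")))
    return tag


def recommend_tags(xml_text):
    """Map each tag in the document to a recommended replacement, if any."""
    root = ET.fromstring(xml_text)
    recommendations = {}
    for element in root.iter():
        suggestion = recommend_tag(element.tag)
        if suggestion != element.tag:
            recommendations[element.tag] = suggestion
    return recommendations


if __name__ == "__main__":
    document = "<book><writer>A. N. Other</writer><heading>XML</heading></book>"
    print(recommend_tags(document))
```

The sketch only suggests replacements and leaves the document untouched, which mirrors the abstract's point that document owners remain free to accept or reject the recommended tags.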

Info:

Periodical:

Advanced Materials Research (Volumes 931-932)

Pages:

1353-1359

Online since:

May 2014

Copyright:

© 2014 Trans Tech Publications Ltd. All Rights Reserved
