An Architecture for Unstructured Data Management

Yao Hu Lin; Xue Lian Lin

doi:10.4028/www.scientific.net/AMR.756-759.1280

Abstract:

With the arrival of the information age, a vast amount of information is available on the Internet, and most data on the Web are unstructured. Significant data nevertheless need to be organized and stored in a suitable way for future use, and the management of unstructured data remains an unsolved problem. Unstructured data such as presentations, spreadsheets, text documents, memos, images and web pages are difficult to manage once the data grow to a large scale and users have different requirements and interests. In this paper, we propose an architecture for unstructured data management that integrates source query, data collection and data management to address these problems. The data collection layer extracts the data of interest: existing tools extract data automatically, and data can also be added to the repository manually. The data management layer manages all collected data by classifying them, selecting the nodes on which to store them, and managing them centrally through an index. The source query layer lets users query and retrieve diverse data through an adaptive query service and a recommendation service. Finally, we implemented a prototype system, OCourse, based on this architecture to show that it is feasible and efficient.
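The three layers named in the abstract can be illustrated with a minimal sketch. It is not the OCourse implementation, and every class, method and field name here is a hypothetical chosen for illustration. The classifier (category taken from the item kind), the node selection rule (least-loaded node) and the keyword match are placeholder assumptions, since the abstract does not specify them. The adaptive query and recommendation services are not modeled.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List, Optional, Tuple


@dataclass(frozen=True)
class Item:
    """One piece of unstructured data (hypothetical minimal shape)."""
    item_id: str
    kind: str   # e.g. "web page", "text", "image"
    text: str


class CollectionLayer:
    """Collects items via automatic extractors or manual additions."""

    def __init__(self) -> None:
        self.collected: List[Item] = []

    def extract(self, extractor: Callable[[str], Iterable[Item]], source: str) -> None:
        # Automatic path: an existing extraction tool is wrapped as a callable.
        self.collected.extend(extractor(source))

    def add_manually(self, item: Item) -> None:
        # Manual path: a user adds an item directly.
        self.collected.append(item)


class ManagementLayer:
    """Classifies items, picks a storage node, and keeps a central index."""

    def __init__(self, node_names: Iterable[str]) -> None:
        self.nodes: Dict[str, Dict[str, Item]] = {n: {} for n in node_names}
        self.index: Dict[str, Tuple[str, str]] = {}  # item_id -> (category, node)

    def classify(self, item: Item) -> str:
        # Assumption: the category is simply the item kind.
        return item.kind

    def select_node(self) -> str:
        # Assumption: least-loaded node, ties broken by node name.
        return min(self.nodes, key=lambda n: (len(self.nodes[n]), n))

    def store(self, item: Item) -> None:
        category = self.classify(item)
        node = self.select_node()
        self.nodes[node][item.item_id] = item
        self.index[item.item_id] = (category, node)


class QueryLayer:
    """Answers queries by consulting the central index."""

    def __init__(self, management: ManagementLayer) -> None:
        self.management = management

    def query(self, keyword: str, category: Optional[str] = None) -> List[Item]:
        hits: List[Item] = []
        for item_id, (cat, node) in self.management.index.items():
            if category is not None and cat != category:
                continue
            item = self.management.nodes[node][item_id]
            if keyword.lower() in item.text.lower():
                hits.append(item)
        return hits


def demo_extractor(source: str) -> Iterable[Item]:
    # Stand-in for an extraction tool: parses lines of the form "id|kind|text".
    for line in source.splitlines():
        if line.strip():
            item_id, kind, text = line.split("|", 2)
            yield Item(item_id, kind, text)


collection = CollectionLayer()
collection.extract(
    demo_extractor,
    "d1|web page|Intro to data storage\nd2|text|Lecture notes on storage nodes",
)
collection.add_manually(Item("d3", "image", "Diagram of the query layer"))

management = ManagementLayer(["node-a", "node-b"])
for collected_item in collection.collected:
    management.store(collected_item)

queries = QueryLayer(management)
storage_hits = [item.item_id for item in queries.query("storage")]
image_hits = [item.item_id for item in queries.query("query", category="image")]
```

Routing every lookup through the central index keeps the query layer independent of how items are distributed across nodes, which is the separation of concerns the abstract describes.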

Info:

Periodical:

Advanced Materials Research (Volumes 756-759)

Pages:

1280-1284

DOI:

https://doi.org/10.4028/www.scientific.net/AMR.756-759.1280

Online since:

September 2013

Authors:

Yao Hu Lin, Xue Lian Lin

Keywords:

Classification, Storage, Unstructured Data
