Automatically Extracting University Scholar Names Information and Classification

Chang Su; Wen Qiang Jia; Feng Jun Shang

doi:10.4028/www.scientific.net/AMM.496-500.2065

Abstract:

High-tech talent is an important social resource, alongside energy and materials, and attracting such talent is an important strategy for the development of national science and technology. This paper aims to extract information on high-tech talent across a variety of research fields from a large number of websites. First, we study the principles of Web crawlers and Web data extraction. Then, taking U.S. universities as an example, we propose an intelligent method and procedure for extracting scholars' name information from websites. Finally, we apply a classification algorithm to identify Chinese scholars working overseas and verify the validity of the method in an experimental system. The accuracy of the classification algorithm is higher than 90%, and the average accuracy of the resulting information is higher than 77%.
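The abstract does not spell out the extraction or classification steps, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes that names appear as plain text on a faculty page, that a candidate name is two or three capitalized words, and that a scholar can be roughly flagged as Chinese when the last word of the name is a common pinyin surname. The surname list, the title-word filter, the sample HTML, and all function names are hypothetical and chosen for illustration.

```python
import re
from html.parser import HTMLParser

# Hypothetical, deliberately tiny list of romanized (pinyin) Chinese surnames.
PINYIN_SURNAMES = {"wang", "li", "zhang", "liu", "chen", "yang", "huang",
                   "zhao", "wu", "zhou", "su", "jia", "shang"}

# Hypothetical words that mark a job title rather than a person's name.
TITLE_WORDS = {"professor", "associate", "assistant", "lecturer", "department"}

# Assumption: a candidate name is two or three capitalized words in a row.
NAME_PATTERN = re.compile(r"\b[A-Z][a-z]+(?: [A-Z][a-z]+){1,2}\b")


class TextExtractor(HTMLParser):
    """Collects the non-empty text chunks found between HTML tags."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        text = data.strip()
        if text:
            self.chunks.append(text)


def extract_candidate_names(html):
    """Return capitalized word sequences that look like personal names."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    names = []
    for chunk in parser.chunks:
        for match in NAME_PATTERN.findall(chunk):
            # Drop matches such as "Associate Professor".
            if not any(w.lower() in TITLE_WORDS for w in match.split()):
                names.append(match)
    return names


def looks_chinese(name):
    """Crude rule: the last word (assumed surname) is a known pinyin surname."""
    return name.split()[-1].lower() in PINYIN_SURNAMES


# Made-up faculty listing used only to exercise the sketch.
SAMPLE_HTML = """
<ul class="faculty">
  <li>Wei Zhang, Professor</li>
  <li>John Smith, Associate Professor</li>
  <li>Xiaoming Li, Assistant Professor</li>
</ul>
"""

candidates = extract_candidate_names(SAMPLE_HTML)
chinese_scholars = [n for n in candidates if looks_chinese(n)]
print(candidates)
print(chinese_scholars)
```

A lookup rule like this is easy to inspect but will misclassify ambiguous surnames such as "Lee" or "Long"; a real system would likely need a trained classifier and a much larger name resource.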

Info:

Periodical: Applied Mechanics and Materials (Volumes 496-500)
Pages: 2065-2068
DOI: https://doi.org/10.4028/www.scientific.net/AMM.496-500.2065
Online since: January 2014
Authors: Chang Su, Wen Qiang Jia*, Feng Jun Shang
Keywords: Name Classification, Name Recognition, Web Crawler, Web Data Extraction

* - Corresponding Author
