Formulate Neighborhood for Multi-Relational Data by Cell Accumulating

Abstract:

This paper presents an algorithm for constructing neighborhoods and, for the first time, applies it to multi-relational (MR) data. The proposed algorithm is inspired by Locality-Sensitive Hashing, and its central idea is cell accumulating. Heuristics for parameterization, customized to MR data, are given. Experiments demonstrate that the proposed method performs better than its peers on both MR data and common data.
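
The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general Locality-Sensitive Hashing idea it builds on, not the paper's method. It assumes Gaussian (p-stable) projections that partition the space into cells, and a simple vote count accumulated over several hash tables. The names `make_hash`, `build_tables`, `neighborhood` and the parameters `width`, `n_tables`, `n_hashes`, `min_votes` are all hypothetical.

```python
import math
import random
from collections import defaultdict


def make_hash(dim, width, rng):
    """One p-stable (Gaussian) LSH function: x -> floor((a.x + b) / width)."""
    a = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    b = rng.uniform(0.0, width)

    def h(x):
        return math.floor((sum(ai * xi for ai, xi in zip(a, x)) + b) / width)

    return h


def build_tables(points, n_tables=8, n_hashes=2, width=1.0, seed=0):
    """Hash every point into one cell per table (assumed scheme, not the paper's).

    A cell key is the tuple of hash values produced by that table's functions.
    """
    rng = random.Random(seed)
    dim = len(points[0])
    tables = []
    for _ in range(n_tables):
        hashes = [make_hash(dim, width, rng) for _ in range(n_hashes)]
        cells = defaultdict(list)
        for idx, p in enumerate(points):
            cells[tuple(h(p) for h in hashes)].append(idx)
        tables.append((hashes, cells))
    return tables


def neighborhood(query, tables, min_votes=1):
    """Accumulate, over all tables, how often each point shares the query's cell."""
    votes = defaultdict(int)
    for hashes, cells in tables:
        key = tuple(h(query) for h in hashes)
        for idx in cells.get(key, ()):
            votes[idx] += 1
    # Keep only points that collided with the query in at least min_votes tables.
    return {idx: v for idx, v in votes.items() if v >= min_votes}


points = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0)]
tables = build_tables(points, n_tables=8)
print(neighborhood(points[0], tables))
```

In this sketch, raising `min_votes` shrinks the neighborhood to points that fall in the same cell as the query in many tables, while `width` controls how coarse each cell is. Both would need tuning for real data.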

Info:

Periodical:

Key Engineering Materials (Volumes 460-461)

Pages:

165-171

Online since:

January 2011

Copyright:

© 2011 Trans Tech Publications Ltd. All Rights Reserved
