Rule Extraction from Privacy Preserving Neural Network: Application to Banking

Abstract:

Over the last two decades, privacy policies in areas such as banking, finance and medical research have restricted data owners from sharing their data for data mining purposes. This restriction has given rise to a new area of research, namely privacy-preserving data mining. In this paper, we propose a privacy preservation method that employs a Particle Swarm Optimization (PSO) trained Auto-Associative Neural Network (PSOAANN). The modified (privacy-preserved) input values are fed to a decision tree (DT) and to a rule induction algorithm, viz. Ripper, for rule extraction. The performance of the hybrid is tested on four benchmark datasets and four bankruptcy datasets using 10-fold cross-validation. The results are compared with those obtained on the original datasets, where privacy is not preserved. The proposed hybrid approach achieved good results on all datasets.
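The perturbation step described in the abstract can be illustrated with a minimal sketch. It assumes a three-layer auto-associative network with sigmoid units, inputs already scaled to [0, 1], and a basic global-best PSO; the function names (`aann_forward`, `train_psoaann`), the hidden-layer size and the PSO constants are illustrative assumptions, not values taken from the paper.

```python
import numpy as np


def sigmoid(z):
    # Clip the argument so np.exp never overflows for large weights.
    return 1.0 / (1.0 + np.exp(-np.clip(z, -50.0, 50.0)))


def aann_forward(weights, X, n_hidden):
    """Map inputs to outputs of the same width through one hidden layer."""
    n_features = X.shape[1]
    split = n_features * n_hidden
    w_in = weights[:split].reshape(n_features, n_hidden)
    w_out = weights[split:].reshape(n_hidden, n_features)
    return sigmoid(sigmoid(X @ w_in) @ w_out)


def reconstruction_error(weights, X, n_hidden):
    """Mean squared error between the network output and its own input."""
    return float(np.mean((aann_forward(weights, X, n_hidden) - X) ** 2))


def train_psoaann(X, n_hidden=3, n_particles=20, n_iters=100, seed=0):
    """Search the weight space with a global-best PSO (assumed constants)."""
    rng = np.random.default_rng(seed)
    dim = 2 * X.shape[1] * n_hidden  # input-to-hidden plus hidden-to-output weights
    pos = rng.uniform(-1.0, 1.0, size=(n_particles, dim))
    vel = np.zeros_like(pos)
    pbest = pos.copy()
    pbest_err = np.array([reconstruction_error(p, X, n_hidden) for p in pos])
    best = int(np.argmin(pbest_err))
    gbest, gbest_err = pbest[best].copy(), float(pbest_err[best])
    inertia, c_cog, c_soc = 0.7, 1.5, 1.5  # assumed, not from the paper
    for _ in range(n_iters):
        r1 = rng.random(pos.shape)
        r2 = rng.random(pos.shape)
        vel = inertia * vel + c_cog * r1 * (pbest - pos) + c_soc * r2 * (gbest - pos)
        pos = pos + vel
        err = np.array([reconstruction_error(p, X, n_hidden) for p in pos])
        improved = err < pbest_err  # particles that beat their personal best
        pbest[improved] = pos[improved]
        pbest_err[improved] = err[improved]
        best = int(np.argmin(pbest_err))
        if pbest_err[best] < gbest_err:
            gbest, gbest_err = pbest[best].copy(), float(pbest_err[best])
    return gbest, gbest_err
```

Under these assumptions, calling `train_psoaann(X)` and then `aann_forward(weights, X, n_hidden)` yields a matrix of the same shape as `X`; that output, rather than `X` itself, would be handed to the decision tree or rule learner.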

Info:

Periodical:

Advanced Materials Research (Volumes 403-408)

Pages:

920-928

DOI:

https://doi.org/10.4028/www.scientific.net/AMR.403-408.920

Online since:

November 2011

Authors:

Nekuri Naveen, V. Ravi, C. Raghavendra Rao

Keywords:

Auto-Associative Neural Network (AANN), Bankruptcy, Classification, Particle Swarm Optimization Algorithm (PSO), Particle Swarm Optimization Auto-Associative Neural Network (PSOAANN), Privacy Preservation, Rule Extraction from Privacy Preservation

Copyright:

© 2012 Trans Tech Publications Ltd. All Rights Reserved

References:

[1] R. Agrawal and R. Srikant, Preserving Privacy in Data Mining, ACM SIGMOD International Conference on Management of Data, May 2000.

DOI: 10.1145/342009.335438

[2] Y. Lindell and B. Pinkas, Privacy Preserving in Data Mining, Proceeding of the 20th annual cryptology conference in advances on Cryptology, 2000, pp.36-54.

DOI: 10.1007/3-540-44598-6_3

[3] Wu Xiao-dan, Yue Dian-min, Liu Feng-li, W. Yun-feng, and C. H. Chao-Hsien, Privacy Preserving Data Mining Algorithms by Data Distortion, Management Science and Engineering, 2006, pp.223-228.

DOI: 10.1109/icmse.2006.313871

[4] F. M. Behlen, S. B. Johnson, Multicenter Patient Records Research: Security Policies and Tools, J Am Med Inform Assoc. Vol. 6, No. 6, 1999, pp.435-43.

[5] J. J. Berman, Confidentiality Issues for Medical Data Miners, Artificial Intelligence in Medicine, Vol. 26, No. 1-2, 2002, pp.25-36.

[6] B. Thuraisingham, Web Data Mining and its Applications in Business Intelligence and Counter-terrorism, CRC Press, (2003).

DOI: 10.1201/9780203499511

[7] S. E. Fienberg, Homeland insecurity: Data mining, terrorism detection, and confidentiality, Australian Bureau of Statistics, 55th Session of the International Statistical Institute (ISI). Sydney, (2005).

[8] L. Sweeney, Privacy-Preserving Bio-terrorism Surveillance, AAAI Spring Symposium, AI Technologies for Homeland Security, (2005).

[9] S. R. M. Oliveira and O. R. Zaiane, A privacy-preserving clustering approach toward secure and effective data analysis for business collaboration, Computers & Security, Vol. 26, 2007, pp.81-83.

DOI: 10.1016/j.cose.2006.08.003

[10] C. Boyens, R. Krishnan and R. Padman, On privacy-preserving access to distributed heterogeneous healthcare information, System Sciences, 2004. Proceedings of the 37th Annual Hawaii International Conference on.

DOI: 10.1109/hicss.2004.1265352

[11] E. Bertino, A Framework for Evaluating Privacy Preserving Data Mining Algorithms, Data Mining and Knowledge Discovery, Vol. 11, 2005, p.121.

DOI: 10.1007/s10618-005-0006-6

[12] J. Vaidya, C. Clifton and M. Zhu, Privacy Preserving Data Mining, ISBN: 978-0-387-25886-7, Advances in Information Security, Springer, 19, (2006).

[13] G. Crises, Non-Perturbative Methods for Microdata Privacy in Statistical Databases, http://citeseer.ist.psu.edu/crises04nonperturbative.html, (2004).

[14] B. Pinkas, Cryptographic techniques for privacy-preserving data mining, SIGKDD Explorations, 4, (2002).

DOI: 10.1145/772862.772865

[15] K. Ramu and V. Ravi, Privacy preservation in data mining using hybrid perturbation methods: an application to bankruptcy prediction in banks, International Journal of Data Analysis Techniques and Strategies, Vol. 1, No. 4, 2009, pp.313-331.

DOI: 10.1504/ijdats.2009.027509

[16] Paramjeet, V. Ravi, N. Naveen and C. Raghavendra Rao, Privacy Preserving Data Mining using Particle Swarm Optimization trained Auto-Associative Neural Network: an Application to Bankruptcy Prediction in Banks, (Accepted International Journal of Data Mining Modeling and Management).

DOI: 10.1504/ijdmmm.2012.045135

[17] J. R. Quinlan, C4.5: Programs for Machine Learning, Morgan Kaufmann Publishers, San Mateo, (1992).

[18] W. W. Cohen, Fast Effective Rule Induction, From Machine Learning Proceedings of the Twelfth International Conference (ML95), (1995).

[19] J. Kennedy and R. C. Eberhart, Particle Swarm Optimization, Proceeding of IEEE International conference on Neural Networks, Piscataway, NJ, USA, 1995, pp.1942-1948.

[20] H. Hruschka and M. Natter, Comparing performance of feedforward neural nets and K-means for cluster-based market segmentation, European Journal of Operational Research, Vol. 114, 1999, pp.346-353.

DOI: 10.1016/s0377-2217(98)00170-2

[21] M. A. Kramer, Nonlinear principal component analysis using auto associative neural networks, AIChE Journal, Vol. 37, No. 2, 1991, pp.233-243.

DOI: 10.1002/aic.690370209

[22] V. Ravi and C. Pramodh, Non-linear principal component analysis-based hybrid classifiers: an application to bankruptcy prediction in banks, International Journal of Information and Decision Sciences, Vol. 2, No. 1, 2010, pp.50-67.

DOI: 10.1504/ijids.2010.029903

[23] S. Canbas, A. Cabuk and S. B. Kilic, Prediction of commercial bank failure via multivariate statistical analysis of financial structures: The Turkish case, European Journal of Operational Research, Vol. 166, 2005, pp.528-546.

DOI: 10.1016/j.ejor.2004.03.023

[24] Olmeda and E. Fernandez, Hybrid classifiers for financial multicriteria decision making: The case of Bankruptcy prediction, Computational Economics, Vol. 10, 1997, pp.317-335.

[25] M. J. Beynon and M. J. Peel, Variable precision rough set theory and data discretisation: an application to corporate failure prediction, Omega, Vol. 29, 2001, pp.561-576.

DOI: 10.1016/s0305-0483(01)00045-7

[26] E. Rahimian, S. Singh, T. Thammachote and R. Virmani, Bankruptcy prediction by Neural network, in R. R. Trippi and E. Turban (Eds.), Neural Networks in Finance and Investing, Irwin Professional Publishing, Burr Ridge, USA, 1996.

Appendix

Rules generated by Decision Tree (C4.5)

IRIS DATASET
Rule 1: If PW <= 0.505359 and SL <= 0.443342 then IRIS-VERSICOLOR (coverage = 100%)
Rule 2: If PW <= 0.505359 and SL > 0.443342 then IRIS-VIRGINICA (coverage = 90.90%)
Rule 3: If PW > 0.505359 then IRIS-SETOSA (coverage = 90.90%)

WBC DATASET
Rule 1: If clumpthickness <= 0.350595 then BENIGN (coverage = 100%)
Rule 2: If clumpthickness > 0.350595 then MALIGNANT (coverage = 94.00%)

NEW THYROID DATASET
Rule 1: If SThyroxin <= 0.307997 and TSH <= 0.160963 then NORMAL (coverage = 100%)
Rule 2: If SThyroxin <= 0.307997 and TSH > 0.160963 then HypoThyroid (coverage = 85.71%)
Rule 3: If SThyroxin > 0.307997 then HyperThyroid (coverage = 87.50%)

WINE DATASET
Rule 1: If Ash <= 0.538132 and Alcalinity of ash <= 0.455098 and Nonflavanoid phenols <= 0.402591 then CLASS B (coverage = 90.90%)
Rule 2: If Ash <= 0.538132 and Alcalinity of ash > 0.455098 and Ash <= 0.528514 and Alcalinity of ash <= 0.47219 and Hue <= 0.369992 then CLASS C (coverage = 100%)
Rule 3: If Ash <= 0.538132 and Alcalinity of ash > 0.455098 and Ash <= 0.528514 and Alcalinity of ash <= 0.47219 and Hue > 0.369992 then CLASS C (coverage = 0%)
Rule 4: If Ash <= 0.538132 and Alcalinity of ash > 0.455098 and Ash <= 0.528514 and Alcalinity of ash > 0.47219 then CLASS C (coverage = 80.00%)
Rule 5: If Ash <= 0.538132 and Alcalinity of ash > 0.455098 and Ash > 0.528514 then CLASS B (coverage = 0%)
Rule 6: If Ash > 0.538132 then CLASS A (coverage = 4.34%)

SPANISH DATASET
Rule 1: If (Current assets-cash/total assets) <= 0.431644 then NonBankrupt (coverage = 75.00%)
Rule 2: If (Current assets-cash/total assets) > 0.431644 then Bankrupt (coverage = 75.00%)

TURKISH DATASET
Rule 1: If (Shareholders' equity + total income)/(total assets + contingencies and commitments) <= 0.973129 then Bankrupt (coverage = 100%)
Rule 2: If (Shareholders' equity + total income)/(total assets + contingencies and commitments) > 0.973129 then NonBankrupt (coverage = 66.66%)

US DATASET
Rule 1: If (Earnings before interest and taxes/total assets) <= 0.794781 then Bankrupt (coverage = 78.57%)
Rule 2: If (Earnings before interest and taxes/total assets) > 0.794781 then NonBankrupt (coverage = 81.81%)

UK DATASET
Rule 1: If (Current assets/current liabilities) <= 0.204983 then NonBankrupt (coverage = 66.66%)
Rule 2: If (Current assets/current liabilities) > 0.204983 and (Current assets/current liabilities) <= 0.207137 then Bankrupt (coverage = 100%)
Rule 3: If (Current assets/current liabilities) > 0.204983 and (Current assets/current liabilities) > 0.207137 and (Funds flow/total liabilities) <= 0.326856 then NonBankrupt (coverage = 6.66%)
Rule 4: If (Current assets/current liabilities) > 0.204983 and (Current assets/current liabilities) > 0.207137 and (Funds flow/total liabilities) > 0.326856 then Bankrupt (coverage = 100%)

Rules generated by Ripper

IRIS DATASET
Rule 1: If PL <= 0.364484 then Iris-setosa (coverage = 100%)
Rule 2: If PL <= 0.422152 then Iris-versicolor (coverage = 90.00%)
Rule 3: else Iris-virginica (coverage = 90.90%)

WBC DATASET
Rule 1: If Clumpthickness >= 0.376957 then Malignant (coverage = 95.65%)
Rule 2: If Clumpthickness >= 0.351184 and Clumpthickness <= 0.368407 then Malignant (coverage = 100%)
Rule 3: else BENIGN (coverage = 34.32%)

WINE DATASET
Rule 1: If Alcalinity of ash >= 0.46734 and Proanthocyanins <= 0.328308 then Class C (coverage = 100%)
Rule 2: If Proanthocyanins <= 0.318861 then Class C (coverage = 100%)
Rule 3: If Ash >= 0.539347 and Hue >= 0.365639 then Class A (coverage = 100%)
Rule 4: If Proanthocyanins >= 0.359059 and Alcalinity of ash >= 0.446665 then Class A (coverage = 100%)
Rule 5: else Class B (coverage = 41.17%)

NEW THYROID DATASET
Rule 1: If TD >= 0.175793 and SThyroxin <= 0.296417 then HypoThyroid (coverage = 71.42%)
Rule 2: If SThyroxin >= 0.310244 then HyperThyroid (coverage = 100%)
Rule 3: else NORMAL (coverage = 81.08%)

SPANISH DATASET
Rule 1: If (Current assets-cash/total assets) <= 0.431934 then NonBankrupt (coverage = 60.00%)
Rule 2: else Bankrupt (coverage = 71.42%)

TURKISH DATASET
Rule 1: If (Interest income/interest expenses) >= 0.415229 then Bankrupt (coverage = 100%)
Rule 2: else NonBankrupt (coverage = 80%)

US DATASET
Rule 1: If (Earnings before interest and taxes/total assets) >= 0.794884 then NonBankrupt (coverage = 81.81%)
Rule 2: else Bankrupt (coverage = 78.57%)

UK DATASET
Rule 1: If (Current liabilities/total assets) <= 0.515514 then NonBankrupt (coverage = 83.33%)
Rule 2: else Bankrupt (coverage = 83.33%)
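Ripper produces an ordered rule list: rules are tried top to bottom, the first one whose condition holds decides the class, and the final "else" rule is the default. A minimal sketch of how such a list could be applied follows, using the Ripper rules reported for the Iris dataset. The `classify` helper and the dictionary record format are illustrative assumptions, `PL` is taken to denote petal length, and the thresholds apply to the privacy-preserved (transformed) values rather than to raw measurements.

```python
from typing import Callable, Dict, List, Optional, Tuple

Record = Dict[str, float]
# A rule is (condition, class label); a condition of None marks the default rule.
Rule = Tuple[Optional[Callable[[Record], bool]], str]

# Ripper rules for the Iris dataset, in the order listed in the appendix.
IRIS_RIPPER_RULES: List[Rule] = [
    (lambda r: r["PL"] <= 0.364484, "Iris-setosa"),
    (lambda r: r["PL"] <= 0.422152, "Iris-versicolor"),
    (None, "Iris-virginica"),
]


def classify(record: Record, rules: List[Rule]) -> str:
    """Return the label of the first rule that fires (hypothetical helper)."""
    for condition, label in rules:
        if condition is None or condition(record):
            return label
    raise ValueError("rule list has no default rule")
```

Because the list is ordered, Rule 2 only needs the upper bound on `PL`: any record reaching it has already failed Rule 1.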