Article
Protecting User’s Information Based on Clustering Method in Data Mining
حماية معلومات المستخدم بالاعتماد على طريقة التجمع في استخلاص البيانات

Author: Heba Adnan Raheem هبة عدنان رحيم
Journal: Albahir journal مجلة الباهر ISSN: 23125721 Year: 2015 Volume: 2 Issue: 3,4 Pages: 23-34
Publisher: AL-Abbas Holy Shrine العتبة العباسية المقدسة

Abstract

Privacy-preserving data mining (PPDM) is a recent research area in the field of data mining. It is defined as “protecting user’s information”. Protection of privacy has become important in data mining research because of the increasing ability to store personal data about users and the development of data mining algorithms that can infer this information. The main goal of privacy-preserving data mining is to develop a system that modifies the original data in some way, so that private data and knowledge remain private even after the mining process. In this paper we propose a system that applies the PAM (Partitioning Around Medoids) clustering algorithm to health datasets in order to generate a set of clusters; we then suggest protecting the sensitive attributes in each cluster in order to increase the privacy of users' information. The sensitive attributes are protected with privacy techniques that modify the data values (attributes) in the dataset. We suggest using a randomization technique and Data Copying (a new technique proposed in this paper) to prevent an attacker from inferring users' private information. After modification, the same clustering algorithm is applied to the modified dataset to verify whether the sensitive attributes are hidden. Experimental results with the proposed techniques show that the PAM algorithm clusters all of the datasets efficiently and that the selected clusters are protected efficiently by the Data Copying technique. The technique is applied to the Wisconsin breast cancer and diabetes datasets. Finally, the results of the proposed system show that data distortion can be reduced as the privacy ratio increases. These are important issues in PPDM; the proposed system is therefore highly successful in protecting privacy.
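
The pipeline described in the abstract (cluster with PAM, modify a sensitive attribute inside a cluster, then re-cluster the modified data) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not the paper's implementation: the medoids are initialized at random rather than with PAM's BUILD phase, the randomization step is plain additive Gaussian noise, the function names (`pam`, `perturb_cluster`, `same_partition`) are hypothetical, and the eight points are toy values, not the Wisconsin breast cancer or diabetes data. The Data Copying technique is not sketched because the abstract does not define it.

```python
import random


def dist(a, b):
    """Euclidean distance between two equal-length numeric records."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def total_cost(points, medoids):
    """Sum of each point's distance to its nearest medoid (medoids are indices)."""
    return sum(min(dist(p, points[m]) for m in medoids) for p in points)


def pam(points, k, seed=0):
    """Simplified PAM: random initial medoids, then greedy swaps until no swap helps."""
    rng = random.Random(seed)
    medoids = rng.sample(range(len(points)), k)
    best = total_cost(points, medoids)
    improved = True
    while improved:
        improved = False
        for i in range(k):
            for cand in range(len(points)):
                if cand in medoids:
                    continue
                trial = medoids[:i] + [cand] + medoids[i + 1:]
                cost = total_cost(points, trial)
                if cost < best - 1e-12:  # accept only strict improvements
                    medoids, best, improved = trial, cost, True
    # Label every point with the index of its nearest medoid.
    labels = [min(medoids, key=lambda m: dist(p, points[m])) for p in points]
    return medoids, labels


def perturb_cluster(points, labels, target, column, scale, seed=0):
    """Add Gaussian noise to one (assumed sensitive) column of one cluster's rows."""
    rng = random.Random(seed)
    out = [list(p) for p in points]
    for row, label in zip(out, labels):
        if label == target:
            row[column] += rng.gauss(0.0, scale)
    return out


def same_partition(labels_a, labels_b):
    """True if both labelings group the same rows together, whatever the label values."""
    def groups(labels):
        return {frozenset(i for i, l in enumerate(labels) if l == v)
                for v in set(labels)}
    return groups(labels_a) == groups(labels_b)


if __name__ == "__main__":
    # Toy data: two well-separated groups (illustrative values only).
    data = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0), (2.0, 2.0),
            (8.0, 8.0), (8.0, 9.0), (9.0, 8.0), (9.0, 9.0)]
    medoids, labels = pam(data, k=2)
    noisy = perturb_cluster(data, labels, target=labels[0], column=0, scale=0.3)
    _, noisy_labels = pam(noisy, k=2)
    print("medoid rows:", medoids)
    print("cluster structure preserved:", same_partition(labels, noisy_labels))
```

Comparing the partitions before and after modification is one simple way to check that the perturbed values still support the same clustering; how the paper measures privacy ratio and distortion is not stated in the abstract, so no such metric is computed here.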

Abstract (translated from Arabic): Privacy-preserving data mining is the newest research area in data mining. It is defined as “protecting user information”. Privacy protection has become important in data mining research because of the growing ability to store personal data about users and the development of data mining algorithms that can infer this information. The main goal of privacy-preserving data mining is to develop a system that modifies the original data in some way, so that private data and knowledge remain confidential even after the mining process ends. In this research we propose a system that uses the PAM clustering algorithm on medical datasets to generate a set of clusters, and we then propose protecting the sensitive information in each cluster to increase the confidentiality of users' information. The sensitive information is protected with privacy techniques that modify the data values (attributes) in the database. We then propose using random perturbation techniques and Data Copying (a new method proposed in this work) to prevent attackers from inferring information about individuals. After modification, the same clustering algorithm is applied to the updated database to verify whether the sensitive information is hidden. Experimental results with the proposed techniques showed that the PAM algorithm is effective for clustering in all of the datasets and that the selected cluster was protected efficiently using the Data Copying technique. These techniques were applied to the breast cancer dataset and the diabetes dataset. Finally, the results of the proposed system showed that data distortion can be reduced as the privacy ratio increases. These issues are important in privacy-preserving data mining, so the proposed system is highly successful in protecting privacy.
