Achieving High Data Utility K-Anonymization Using Similarity-Based Clustering Model

Mohammad Rasool SARRAFI AGHDAM, Noboru SONEHARA

Summary:

In data sharing, privacy has become one of the main concerns, particularly when the shared datasets contain private, sensitive information about individuals. A model widely used to protect the privacy of individuals when publishing microdata is k-anonymity. It reduces the confidence with which private sensitive information can be linked to a specific individual by generalizing each individual's identifier attributes so that the individual cannot be distinguished from at least k-1 others in the dataset. K-anonymity can also be defined as clustering under the constraint of a minimum of k tuples in each group. However, the accuracy of the data in a k-anonymous dataset decreases because of the large information loss caused by generalization and suppression. Moreover, most current approaches are designed for continuous numerical attributes; for categorical attributes they perform inefficiently and depend on hierarchical attribute taxonomies, which often do not exist. In this paper we propose a new model for k-anonymization called Similarity-Based Clustering (SBC). It is based on clustering, and it measures similarity and calculates distances between tuples containing numerical and categorical attributes without hierarchical taxonomies. Based on this model, a bottom-up greedy algorithm is proposed. Our extensive study on two real datasets shows that, compared with existing well-known algorithms, the proposed algorithm offers much higher data utility and significantly reduces information loss. Data utility is maintained above 80% over a wide range of k values.
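
The idea of k-anonymity as clustering with at least k tuples per group can be illustrated with a minimal sketch. The code below is not the SBC model or the authors' algorithm, whose similarity measure is not given in this summary. It assumes a simple stand-in distance (range-normalized difference for numerical attributes, 0/1 mismatch for categorical ones, so no taxonomy is needed) and a basic greedy grouping. All function names and the sample records are hypothetical.

```python
from collections import Counter


def attribute_ranges(records):
    """Range (max - min) of each numerical column; 0.0 for categorical columns."""
    ranges = []
    for col in zip(*records):
        if isinstance(col[0], str):
            ranges.append(0.0)  # not used for categorical attributes
        else:
            ranges.append(max(col) - min(col))
    return ranges


def mixed_distance(a, b, ranges):
    """Assumed stand-in distance in [0, 1], averaged over attributes."""
    total = 0.0
    for x, y, r in zip(a, b, ranges):
        if isinstance(x, str):
            total += 0.0 if x == y else 1.0      # categorical: simple mismatch
        else:
            total += abs(x - y) / r if r > 0 else 0.0  # numerical: normalized gap
    return total / len(a)


def greedy_k_clusters(records, k):
    """Group record indices so that every cluster has at least k members."""
    if k < 1 or len(records) < k:
        raise ValueError("need at least k records and k >= 1")
    ranges = attribute_ranges(records)
    unassigned = list(range(len(records)))
    clusters = []
    while len(unassigned) >= k:
        seed = unassigned[0]
        # Take the k-1 unassigned records closest to the seed.
        rest = sorted(unassigned[1:],
                      key=lambda i: mixed_distance(records[seed], records[i], ranges))
        members = [seed] + rest[:k - 1]
        chosen = set(members)
        unassigned = [i for i in unassigned if i not in chosen]
        clusters.append(members)
    # Fewer than k records are left: attach each to the cluster with the nearest seed.
    for i in unassigned:
        nearest = min(clusters,
                      key=lambda c: mixed_distance(records[c[0]], records[i], ranges))
        nearest.append(i)
    return clusters


def generalize(records, clusters):
    """Replace each record by its cluster summary: (min, max) or a value set."""
    out = [None] * len(records)
    for members in clusters:
        summary = []
        for col in zip(*(records[i] for i in members)):
            if isinstance(col[0], str):
                summary.append("|".join(sorted(set(col))))
            else:
                summary.append((min(col), max(col)))
        for i in members:
            out[i] = tuple(summary)
    return out


# Hypothetical (age, occupation) records, for illustration only.
records = [(25, "nurse"), (27, "nurse"), (26, "doctor"), (60, "teacher"),
           (62, "teacher"), (58, "teacher"), (61, "lawyer")]
clusters = greedy_k_clusters(records, k=3)
anonymized = generalize(records, clusters)
print(clusters)  # [[0, 1, 2], [3, 4, 5, 6]]
```

Every generalized row in `anonymized` is shared by at least k records, which is the k-anonymity condition. The wider the ranges and value sets in each cluster summary, the more information is lost, which is why the choice of distance measure matters for data utility.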

Publication
IEICE TRANSACTIONS on Information Vol.E99-D No.8 pp.2069-2078
Publication Date
2016/08/01
Publicized
2016/05/31
Online ISSN
1745-1361
DOI
10.1587/transinf.2015INP0019
Type of Manuscript
Special Section PAPER (Special Section on Security, Privacy and Anonymity of Internet of Things)

Authors

Mohammad Rasool SARRAFI AGHDAM
  School of Multidisciplinary, Informatics Department
Noboru SONEHARA
  School of Multidisciplinary, Informatics Department, National Institute of Informatics (NII)
