Manage the Tradeoff in Data Sanitization

Peng CHENG, Chun-Wei LIN, Jeng-Shyang PAN, Ivan LEE

Summary:

Sharing data risks disclosing the sensitive knowledge it contains. A data owner may therefore sanitize the data before sharing, modifying some items to hide that knowledge. This paper focuses on protecting sensitive knowledge in the form of frequent itemsets through data sanitization. Sanitization can cause side effects, namely distortion of the data and damage to non-sensitive frequent itemsets. Minimizing these side effects is a challenging problem for the research community, and there is a trade-off when trying to minimize both at once. In view of this, we propose a data sanitization method based on evolutionary multi-objective optimization (EMO). The method hides the specified sensitive itemsets completely while minimizing the accompanying side effects. Experiments on real datasets show that the proposed approach performs the hiding task effectively, with less damage to the original data and to non-sensitive knowledge.
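To make the side effects named in the summary concrete, the following is a minimal Python sketch on a toy transaction database. It is not the authors' EMO method: the function names (`support`, `frequent_itemsets`, `apply_deletions`, `side_effects`, `dominates`), the toy data, the support threshold, and the choice to treat any frequent itemset that does not contain a sensitive itemset as "non-sensitive" are all assumptions made for illustration. It only shows how two candidate sanitizations could be scored on hiding failures, data distortion, and lost non-sensitive itemsets, and how one candidate can dominate another in the Pareto sense that multi-objective optimizers rely on.

```python
from itertools import combinations

def support(db, itemset):
    """Number of transactions that contain every item of `itemset`."""
    return sum(1 for t in db if itemset <= t)

def frequent_itemsets(db, min_count):
    """Brute-force enumeration of frequent itemsets (toy data only)."""
    items = sorted(set().union(*db))
    found = set()
    for k in range(1, len(items) + 1):
        for combo in combinations(items, k):
            candidate = frozenset(combo)
            if support(db, candidate) >= min_count:
                found.add(candidate)
    return found

def apply_deletions(db, deletions):
    """Return a sanitized copy of `db`; `deletions` is a list of (index, item)."""
    sanitized = [set(t) for t in db]
    for index, item in deletions:
        sanitized[index].discard(item)
    return [frozenset(t) for t in sanitized]

def side_effects(original, sanitized, sensitive, min_count):
    """Score a sanitization as (hiding_failures, distortion, lost_non_sensitive)."""
    before = frequent_itemsets(original, min_count)
    after = frequent_itemsets(sanitized, min_count)
    # Assumption: itemsets containing a sensitive itemset are not "non-sensitive",
    # because hiding the sensitive itemset necessarily hides them too.
    non_sensitive = {f for f in before if not any(s <= f for s in sensitive)}
    hiding_failures = sum(1 for s in sensitive if s in after)
    distortion = sum(len(o - s) for o, s in zip(original, sanitized))
    lost = len(non_sensitive - after)
    return (hiding_failures, distortion, lost)

def dominates(x, y):
    """Pareto dominance for minimization: x is no worse everywhere, better somewhere."""
    return all(a <= b for a, b in zip(x, y)) and any(a < b for a, b in zip(x, y))

# Hypothetical toy database and parameters.
DB = [frozenset(t) for t in ("abc", "ab", "abd", "bc", "ac")]
SENSITIVE = [frozenset("ab")]
MIN_COUNT = 2

# Two candidate ways to push the support of {a, b} below the threshold.
plan_a = apply_deletions(DB, [(1, "b"), (2, "b")])
plan_b = apply_deletions(DB, [(0, "a"), (1, "a")])

score_a = side_effects(DB, plan_a, SENSITIVE, MIN_COUNT)
score_b = side_effects(DB, plan_b, SENSITIVE, MIN_COUNT)
print(score_a, score_b, dominates(score_a, score_b))
```

Both plans delete two items and hide the sensitive itemset, but they differ in how many non-sensitive frequent itemsets survive, which is why the choice of which items to modify matters. An evolutionary multi-objective search would explore many such candidates and keep the non-dominated ones.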

Publication
IEICE TRANSACTIONS on Information Vol.E98-D No.10 pp.1856-1860
Publication Date
2015/10/01
Publicized
2015/07/14
Online ISSN
1745-1361
DOI
10.1587/transinf.2014EDL8250
Type of Manuscript
LETTER
Category
Artificial Intelligence, Data Mining

Authors

Peng CHENG
  Harbin Institute of Technology
Chun-Wei LIN
  Harbin Institute of Technology
Jeng-Shyang PAN
  Harbin Institute of Technology
Ivan LEE
  University of South Australia
