Sharing data risks disclosing the sensitive knowledge it contains. To prevent this, the data owner may sanitize the data before sharing, modifying some items to hide the sensitive knowledge. This paper focuses on protecting sensitive knowledge in the form of frequent itemsets by data sanitization. Sanitization can cause two side effects: distortion of the data and damage to the non-sensitive frequent itemsets. Minimizing these side effects is a challenging problem, and there is a trade-off when trying to minimize both simultaneously. In view of this, we propose a data sanitization method based on evolutionary multi-objective optimization (EMO). This method can hide the specified sensitive itemsets completely while minimizing the accompanying side effects. Experiments on real datasets show that the proposed approach is very effective in performing the hiding task with less damage to the original data and non-sensitive knowledge.
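The two side effects named in the abstract can be made concrete with a minimal Python sketch. This is not the EMO method of the paper: the toy transaction database, the support threshold, the item-deletion strategy, and every function name below are assumptions chosen for illustration only. The sketch hides one sensitive itemset in two different ways and reports, for each, the number of deleted items (data distortion) and the number of non-sensitive frequent itemsets that are lost.

```python
from itertools import combinations

def support(db, itemset):
    """Count the transactions that contain every item of `itemset`."""
    return sum(1 for t in db if itemset <= t)

def frequent_itemsets(db, min_count):
    """Brute-force enumeration of all itemsets with support >= min_count."""
    items = sorted(set().union(*db))
    return {
        frozenset(c)
        for k in range(1, len(items) + 1)
        for c in combinations(items, k)
        if support(db, frozenset(c)) >= min_count
    }

def delete_item(db, item, rows):
    """Hypothetical sanitization step: drop `item` from the given rows."""
    return [t - {item} if i in rows else t for i, t in enumerate(db)]

def side_effects(original, sanitized, sensitive, min_count):
    """Return (all sensitive hidden?, items deleted, non-sensitive itemsets lost)."""
    before = frequent_itemsets(original, min_count)
    after = frequent_itemsets(sanitized, min_count)
    hidden = all(s not in after for s in sensitive)
    distortion = sum(len(o - s) for o, s in zip(original, sanitized))
    lost = len((before - set(sensitive)) - after)
    return hidden, distortion, lost

def dominates(a, b):
    """Pareto dominance for minimization: a is no worse everywhere, better somewhere."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

# Toy data (assumed): five transactions over items a, b, c.
db = [frozenset(t) for t in ("abc", "abc", "ab", "ac", "bc")]
sensitive = [frozenset("ab")]
MIN_COUNT = 2  # an itemset is frequent if it appears in at least 2 transactions

plan_a = delete_item(db, "a", rows={0, 1})
plan_b = delete_item(db, "a", rows={0, 2})
result_a = side_effects(db, plan_a, sensitive, MIN_COUNT)
result_b = side_effects(db, plan_b, sensitive, MIN_COUNT)

print(result_a)  # (True, 2, 2)
print(result_b)  # (True, 2, 1)
print(dominates(result_b[1:], result_a[1:]))  # True
```

Both plans hide the sensitive itemset with the same distortion, but the second loses fewer non-sensitive itemsets. A multi-objective optimizer would search over many such plans and keep the non-dominated ones.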
Peng CHENG
Harbin Institute of Technology
Chun-Wei LIN
Harbin Institute of Technology
Jeng-Shyang PAN
Harbin Institute of Technology
Ivan LEE
University of South Australia
The copyright of the original papers published on this site belongs to IEICE. Unauthorized use of the original or translated papers is prohibited. See IEICE Provisions on Copyright for details.
Peng CHENG, Chun-Wei LIN, Jeng-Shyang PAN, Ivan LEE, "Manage the Tradeoff in Data Sanitization" in IEICE TRANSACTIONS on Information, vol. E98-D, no. 10, pp. 1856-1860, October 2015, doi: 10.1587/transinf.2014EDL8250.
Abstract: Sharing data risks disclosing the sensitive knowledge it contains. To prevent this, the data owner may sanitize the data before sharing, modifying some items to hide the sensitive knowledge. This paper focuses on protecting sensitive knowledge in the form of frequent itemsets by data sanitization. Sanitization can cause two side effects: distortion of the data and damage to the non-sensitive frequent itemsets. Minimizing these side effects is a challenging problem, and there is a trade-off when trying to minimize both simultaneously. In view of this, we propose a data sanitization method based on evolutionary multi-objective optimization (EMO). This method can hide the specified sensitive itemsets completely while minimizing the accompanying side effects. Experiments on real datasets show that the proposed approach is very effective in performing the hiding task with less damage to the original data and non-sensitive knowledge.
URL: https://globals.ieice.org/en_transactions/information/10.1587/transinf.2014EDL8250/_p
@ARTICLE{e98-d_10_1856,
author={Peng CHENG and Chun-Wei LIN and Jeng-Shyang PAN and Ivan LEE},
journal={IEICE TRANSACTIONS on Information},
title={Manage the Tradeoff in Data Sanitization},
year={2015},
volume={E98-D},
number={10},
pages={1856--1860},
abstract={Sharing data risks disclosing the sensitive knowledge it contains. To prevent this, the data owner may sanitize the data before sharing, modifying some items to hide the sensitive knowledge. This paper focuses on protecting sensitive knowledge in the form of frequent itemsets by data sanitization. Sanitization can cause two side effects: distortion of the data and damage to the non-sensitive frequent itemsets. Minimizing these side effects is a challenging problem, and there is a trade-off when trying to minimize both simultaneously. In view of this, we propose a data sanitization method based on evolutionary multi-objective optimization (EMO). This method can hide the specified sensitive itemsets completely while minimizing the accompanying side effects. Experiments on real datasets show that the proposed approach is very effective in performing the hiding task with less damage to the original data and non-sensitive knowledge.},
keywords={},
doi={10.1587/transinf.2014EDL8250},
ISSN={1745-1361},
month={October},}
TY - JOUR
TI - Manage the Tradeoff in Data Sanitization
T2 - IEICE TRANSACTIONS on Information
SP - 1856
EP - 1860
AU - Peng CHENG
AU - Chun-Wei LIN
AU - Jeng-Shyang PAN
AU - Ivan LEE
PY - 2015
DO - 10.1587/transinf.2014EDL8250
JO - IEICE TRANSACTIONS on Information
SN - 1745-1361
VL - E98-D
IS - 10
JA - IEICE TRANSACTIONS on Information
Y1 - October 2015
AB - Sharing data risks disclosing the sensitive knowledge it contains. To prevent this, the data owner may sanitize the data before sharing, modifying some items to hide the sensitive knowledge. This paper focuses on protecting sensitive knowledge in the form of frequent itemsets by data sanitization. Sanitization can cause two side effects: distortion of the data and damage to the non-sensitive frequent itemsets. Minimizing these side effects is a challenging problem, and there is a trade-off when trying to minimize both simultaneously. In view of this, we propose a data sanitization method based on evolutionary multi-objective optimization (EMO). This method can hide the specified sensitive itemsets completely while minimizing the accompanying side effects. Experiments on real datasets show that the proposed approach is very effective in performing the hiding task with less damage to the original data and non-sensitive knowledge.
ER -