FEBRUARY 2014
INTRODUCTION
The authors are with the Data Intensive Analysis and Computing Lab,
Ohio Center of Excellence in Knowledge Enabled Computing, Department
of Computer Science and Engineering, Wright State University, Dayton,
OH 45435.
Manuscript received 12 Mar. 2012; revised 8 Oct. 2012; accepted 2 Dec. 2012;
published online 28 Dec. 2012.
Recommended for acceptance by E. Ferrari.
For information on obtaining reprints of this article, please send e-mail to:
tkde@computer.org, and reference IEEECS Log Number TKDE-2012-03-0167.
Digital Object Identifier no. 10.1109/TKDE.2012.251.
1041-4347/14/$31.00 © 2014 IEEE
XU ET AL.: BUILDING CONFIDENTIAL AND EFFICIENT QUERY SERVICES IN THE CLOUD WITH RASP DATA PERTURBATION
$$\frac{1}{n}\sum_{i=1}^{n}\left(x_{ij}-\hat{x}_{ij}\right)^{2},$$
which is equivalent to the variance $\mathrm{var}(X_j - \hat{X}_j)$. The
square root of the MSE (RMSE) represents the uncertainty of
the estimation: for an estimated value $\hat{x}$, the original value
$x$ could be in the range $(\hat{x} - \mathrm{RMSE},\ \hat{x} + \mathrm{RMSE})$. Thus, the
length of the range, $2 \times \mathrm{RMSE}$, also represents the accuracy
of the estimation.
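The measure above can be illustrated with a minimal sketch. The function names `column_rmse` and `uncertainty_range`, and the toy matrices `X` and `X_hat`, are hypothetical and chosen only for illustration; they are not part of the original evaluation.

```python
import math

def column_rmse(original, estimated, j):
    """RMSE between column j of the original rows and of an estimate of them."""
    if not original or len(original) != len(estimated):
        raise ValueError("need two non-empty data sets of equal length")
    n = len(original)
    # MSE = (1/n) * sum_i (x_ij - x_hat_ij)^2
    mse = sum((x[j] - x_hat[j]) ** 2 for x, x_hat in zip(original, estimated)) / n
    return math.sqrt(mse)

def uncertainty_range(x_hat, rmse):
    """Range (x_hat - RMSE, x_hat + RMSE) in which the original value may lie."""
    return (x_hat - rmse, x_hat + rmse)

# Toy data (assumed): 4 records, 2 dimensions; column 1 is estimated with error.
X = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0], [4.0, 40.0]]
X_hat = [[1.0, 13.0], [2.0, 16.0], [3.0, 33.0], [4.0, 36.0]]

rmse_col0 = column_rmse(X, X_hat, 0)   # column 0 is estimated exactly
rmse_col1 = column_rmse(X, X_hat, 1)   # errors are 3, -4, 3, -4
low, high = uncertainty_range(X_hat[2][1], rmse_col1)
```

Under this measure, a larger RMSE means a wider range around each estimated value, that is, a less accurate estimate of the original data.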
4.2
4.3
scan of the entire data set. The second stage returns
the exact range query result to the proxy server,
which significantly reduces the postprocessing cost that the
proxy server has to bear. This is very important for a cloud-based
service, because a low postprocessing cost requires only a low
in-house investment.
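The benefit of an exact second stage can be sketched as follows. This is a simplified illustration under stated assumptions, not the scheme's actual implementation: the first stage is assumed to fetch a superset of candidates through a bounding box (a list comprehension stands in for an index lookup), and the exact query conditions are assumed to be linear half-spaces of the form a·y <= b. All names (`bounding_box`, `stage_one`, `stage_two`) and the sample data are hypothetical.

```python
def bounding_box(vertices):
    """Axis-aligned bounding box (lows, highs) of a set of points."""
    dims = range(len(vertices[0]))
    lows = [min(v[d] for v in vertices) for d in dims]
    highs = [max(v[d] for v in vertices) for d in dims]
    return lows, highs

def stage_one(records, box):
    """Coarse stage: keep records inside the box (stand-in for an index lookup)."""
    lows, highs = box
    return [r for r in records
            if all(lo <= v <= hi for lo, v, hi in zip(lows, r, highs))]

def stage_two(candidates, half_spaces):
    """Exact stage: keep candidates that satisfy every condition a.y <= b."""
    def satisfies(r):
        return all(sum(a_d * v for a_d, v in zip(a, r)) <= b
                   for a, b in half_spaces)
    return [r for r in candidates if satisfies(r)]

# Assumed query region: the triangle with vertices (0,0), (4,0), (0,4).
vertices = [(0, 0), (4, 0), (0, 4)]
half_spaces = [((-1, 0), 0), ((0, -1), 0), ((1, 1), 4)]  # x>=0, y>=0, x+y<=4

records = [(1, 1), (3, 3), (5, 5), (0, 4), (2, 2)]
candidates = stage_one(records, bounding_box(vertices))  # superset of the answer
exact = stage_two(candidates, half_spaces)               # sent to the proxy
```

In this toy run, the first stage still includes the false positive (3, 3); filtering it on the server side means the proxy receives only records that belong to the result.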
Proposition 3. [Statement not recoverable; surviving terms: high, low, and mid (indexed by $i$), and $s_{j,\max}/2$.]
EXPERIMENTS
Fig. 6. The cost distribution of the full RASP scheme. Data: Adult
(20K records, 5-9 dimensions).
Fig. 8. Performance comparison on uniform data. Left: data size versus cost of query; Middle: data dimensionality versus cost of query; Right: query
range (percentage of the domain) versus cost of query.
Fig. 9. Performance comparison on Adult data. Left: data size versus cost of query. Middle: data dimensionality versus cost of query. Right: query
range (percentage of the domain) versus cost of query.
Fig. 10. Performance and result precision for different settings of the
(k, δ)-range algorithm for 2D data.
TABLE 1
Wall Clock Cost Distribution (Milliseconds) and Comparison
TABLE 2
Per-Query Performance Comparison (Milliseconds) between
Linear Scan on the Original Nonperturbed Data and Index-Aided
kNN-R Processing on Perturbed Data
Fig. 12. The impact of cloaking-box size on precision for Casper for the
NE data.
RELATED WORK
CONCLUSION