Secure Distributed Deduplication Systems with Improved Reliability (2015)
ABSTRACT:
Data deduplication is a technique for eliminating duplicate copies of data, and has been widely used in cloud storage to reduce storage space and upload bandwidth. However, only one copy of each file is stored in the cloud, even if the file is owned by a huge number of users. As a result, a deduplication system improves storage utilization while reducing reliability. Furthermore, the challenge of privacy for sensitive data also arises when the data is outsourced by users to the cloud. Aiming to address the above security challenges, this paper makes the first attempt to formalize the notion of a distributed reliable deduplication system. We propose new distributed deduplication systems with higher reliability in which the data chunks are distributed across multiple cloud servers. The security requirements of data confidentiality and tag consistency are also achieved by introducing a deterministic secret sharing scheme in distributed storage systems, instead of using convergent encryption as in previous deduplication systems. Security analysis demonstrates that our deduplication systems are secure in terms of the definitions specified in the proposed security model. As a proof of concept, we implement the proposed systems and demonstrate that the incurred overhead is very limited in realistic environments.
EXISTING SYSTEM:
- A number of deduplication systems have been proposed based on various deduplication strategies, such as client-side or server-side deduplication and file-level or block-level deduplication.
- Bellare et al. formalized convergent encryption as message-locked encryption, and explored its application in space-efficient secure outsourced storage. There are also several implementations of different convergent encryption variants for secure deduplication.
- Li et al. addressed the key-management issue in block-level deduplication by distributing the convergent keys across multiple servers after encrypting the files.
- Bellare et al. showed how to protect data confidentiality by transforming a predictable message into an unpredictable message.
DISADVANTAGES OF EXISTING SYSTEM:
- Data reliability is actually a very critical issue in a deduplication storage system because there is only one copy for each file stored in the server shared by all the owners.
- Most of the previous deduplication systems have only been considered in a single-server setting.
- The traditional deduplication methods cannot be directly extended and applied in distributed and multi-server systems.
PROPOSED SYSTEM:
- In this paper, we show how to design secure deduplication systems with higher reliability in cloud computing. We introduce the distributed cloud storage servers into deduplication systems to provide better fault tolerance.
- To further protect data confidentiality, the secret sharing technique is utilized, which is also compatible with distributed storage systems. In more detail, a file is first split and encoded into fragments by using secret sharing instead of encryption mechanisms. These shares are then distributed across multiple independent storage servers.
- Furthermore, to support deduplication, a short cryptographic hash value of the content will also be computed and sent to each storage server as the fingerprint of the fragment stored at each server.
- Only the data owner who first uploads the data is required to compute and distribute such secret shares, while all following users who own the same data copy do not need to compute and store these shares any more.
- To recover data copies, users must access a minimum number of storage servers through authentication and obtain the secret shares to reconstruct the data. In other words, the secret shares of data will only be accessible by the authorized users who own the corresponding data copy.
- Four new secure deduplication systems are proposed to provide efficient deduplication with high reliability for file-level and block-level deduplication, respectively. The secret splitting technique, instead of traditional encryption methods, is utilized to protect data confidentiality. Specifically, data are split into fragments by using secure secret sharing schemes and stored at different servers.
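The splitting and recovery steps described above can be sketched as follows. This is a minimal illustration only, not the paper's construction: it swaps in a plain (k, n) Shamir scheme for the Ramp scheme the authors use, and it makes the sharing deterministic by deriving the polynomial coefficients from a hash of the data, so that identical data always produces identical shares. The class name `DeterministicSharing`, the field modulus, and the coefficient derivation are all assumptions made for this example.

```java
import java.math.BigInteger;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: deterministic (k, n) Shamir sharing, standing in for the Ramp scheme.
final class DeterministicSharing {
    // Field modulus 2^521 - 1 (a Mersenne prime); the secret must be smaller than this.
    static final BigInteger P = BigInteger.ONE.shiftLeft(521).subtract(BigInteger.ONE);

    // Assumption: coefficient i is derived from the secret itself, so equal data gives equal shares.
    static BigInteger coefficient(byte[] secret, int i) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-512");
            md.update((byte) i);
            md.update(secret);
            return new BigInteger(1, md.digest()).mod(P);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    // Share: evaluate a degree-(k-1) polynomial whose constant term is the secret at x = 1..n.
    static Map<Integer, BigInteger> share(byte[] secret, int k, int n) {
        BigInteger s = new BigInteger(1, secret);
        if (s.compareTo(P) >= 0) throw new IllegalArgumentException("secret too large for field");
        if (k < 1 || k > n) throw new IllegalArgumentException("need 1 <= k <= n");
        Map<Integer, BigInteger> shares = new LinkedHashMap<>();
        for (int x = 1; x <= n; x++) {
            BigInteger bx = BigInteger.valueOf(x);
            BigInteger y = BigInteger.ZERO;
            // Horner evaluation of the non-constant terms, highest coefficient first.
            for (int i = k - 1; i >= 1; i--) {
                y = y.add(coefficient(secret, i)).multiply(bx).mod(P);
            }
            shares.put(x, y.add(s).mod(P));
        }
        return shares;
    }

    // Recover: Lagrange interpolation at x = 0 from at least k shares (server index -> share).
    static BigInteger recover(Map<Integer, BigInteger> shares) {
        BigInteger secret = BigInteger.ZERO;
        for (Map.Entry<Integer, BigInteger> a : shares.entrySet()) {
            BigInteger xa = BigInteger.valueOf(a.getKey());
            BigInteger num = BigInteger.ONE;
            BigInteger den = BigInteger.ONE;
            for (Integer other : shares.keySet()) {
                if (other.equals(a.getKey())) continue;
                BigInteger xb = BigInteger.valueOf(other);
                num = num.multiply(xb.negate()).mod(P);
                den = den.multiply(xa.subtract(xb)).mod(P);
            }
            BigInteger term = a.getValue().multiply(num).multiply(den.modInverse(P)).mod(P);
            secret = secret.add(term).mod(P);
        }
        return secret;
    }
}
```

With, say, k = 3 and n = 5, any three of the five servers suffice to recover the value, which is the fault-tolerance property the proposal relies on. A real system would share a file piece by piece rather than as one field element, and hash-derived coefficients only protect data that is hard to guess.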
ADVANTAGES OF PROPOSED SYSTEM:
- A distinguishing feature of our proposal is that data integrity, including tag consistency, can be achieved.
- To our knowledge, no existing work on secure deduplication can properly address the reliability and tag consistency problem in distributed storage systems.
- Our proposed constructions support both file-level and block-level deduplications.
- Security analysis demonstrates that the proposed deduplication systems are secure in terms of the definitions specified in the proposed security model. In more detail, confidentiality, reliability and integrity can be achieved in our proposed system. Two kinds of collusion attacks are considered in our solutions: the collusion attack on the data and the collusion attack against servers. In particular, the data remains secure even if the adversary controls a limited number of storage servers.
- We implement our deduplication systems using the Ramp secret sharing scheme, which enables high reliability and confidentiality levels. Our evaluation results demonstrate that the newly proposed constructions are efficient, and that their redundancies are optimized and comparable with those of other storage systems supporting the same level of reliability.
MODULES:
- System Model
- Data Deduplication
- File Level Deduplication Systems
- Block Level Deduplication Systems
MODULES DESCRIPTION:
System Model:
In this first module, we develop two entities: the User and the Secure Cloud Service Provider (S-CSP).
User: The user is an entity that wants to outsource data storage to the S-CSP and access the data later. In a storage system supporting deduplication, the user uploads only unique data and does not upload any duplicate data, which saves upload bandwidth. Furthermore, users require fault tolerance from the system to provide higher reliability.
S-CSP: The S-CSP is an entity that provides the outsourcing data storage service for the users. In the deduplication system, when users own and store the same content, the S-CSP will only store a single copy of these files and retain only unique data. A deduplication technique, on the other hand, can reduce the storage cost at the server side and save the upload bandwidth at the user side. For fault tolerance and confidentiality of data storage, we consider a quorum of S-CSPs, each being an independent entity. The user data is distributed across multiple S-CSPs.
Data Deduplication:
Data deduplication involves finding and removing duplicate data without compromising its fidelity.
The goal is to store more data while using less bandwidth.
Files are uploaded to the S-CSP, and only the data owners can view and download them.
The security requirements are also met by a secret sharing scheme.
The secret sharing scheme uses two algorithms, Share and Recover.
Data is uploaded at both the file level and the block level, and the duplicate check is performed as part of the same process.
This is made possible by finding duplicate chunks and keeping a single copy of each chunk.
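As a rough illustration of keeping a single copy per chunk while restricting downloads to data owners, the sketch below uses an in-memory store keyed by a SHA-256 fingerprint. The class `SingleCopyStore` and its methods are hypothetical names invented for this example. The sketch also simplifies heavily: it treats anyone who uploads the content as an owner, and it stores the raw bytes rather than secret shares.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical in-memory store: one copy per fingerprint, plus the set of owners.
final class SingleCopyStore {
    private final Map<String, byte[]> chunks = new HashMap<>();
    private final Map<String, Set<String>> owners = new HashMap<>();

    // Fingerprint stand-in: hex-encoded SHA-256 of the content.
    static String fingerprint(byte[] data) {
        try {
            byte[] digest = MessageDigest.getInstance("SHA-256").digest(data);
            StringBuilder sb = new StringBuilder();
            for (byte b : digest) sb.append(String.format("%02x", b));
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    // Registers the user as an owner; returns true only if the content was not stored before.
    boolean upload(String user, byte[] data) {
        String fp = fingerprint(data);
        owners.computeIfAbsent(fp, k -> new HashSet<>()).add(user);
        return chunks.putIfAbsent(fp, data.clone()) == null;
    }

    // Only users recorded as owners of the fingerprint may download the content.
    byte[] download(String user, String fp) {
        Set<String> allowed = owners.get(fp);
        if (allowed == null || !allowed.contains(user)) {
            throw new SecurityException("user does not own this data");
        }
        return chunks.get(fp).clone();
    }

    // Number of distinct chunks actually kept.
    int storedCopies() {
        return chunks.size();
    }
}
```

If two users upload the same bytes, `storedCopies()` stays at 1 and both can download, while a third user who never uploaded the content is refused.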
File Level Deduplication Systems:
To support an efficient duplicate check, a tag is computed for each file and sent to the S-CSPs.
To upload a file F, the user interacts with the S-CSPs to perform the deduplication.
More precisely, the user first computes the file tag φ_F = TagGen(F) and sends it to the S-CSPs for the file duplicate check.
If a duplicate is found, the user computes a server-specific tag and sends it to each server via a secure channel.
Otherwise, if no duplicate is found, the process continues: the secret sharing scheme runs and the user uploads the resulting shares of the file to the S-CSPs.
To download a file, the user retrieves the secret shares from the S-CSPs and reconstructs the file from them.
This approach provides fault tolerance and allows the data to remain accessible even if a limited subset of the storage servers fails.
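The upload flow above might look roughly like the following sketch. It assumes TagGen is a SHA-256 hash, models each S-CSP as an in-memory map, and takes the Share algorithm as a pluggable function so that shares are only computed when no duplicate exists. The names `FileLevelDedup`, `Server`, and `upload` are illustrative and not from the paper, and the step taken when a duplicate is found is omitted.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical sketch of the file-level duplicate check across several S-CSPs.
final class FileLevelDedup {
    // In-memory stand-in for one S-CSP: file tag -> the share held by this server.
    static final class Server {
        final Map<String, byte[]> shareByTag = new HashMap<>();
        boolean hasTag(String tag) { return shareByTag.containsKey(tag); }
        void store(String tag, byte[] share) { shareByTag.put(tag, share); }
    }

    // Assumption: TagGen is modeled as a hex-encoded SHA-256 hash of the file content.
    static String tagGen(byte[] file) {
        try {
            byte[] digest = MessageDigest.getInstance("SHA-256").digest(file);
            StringBuilder sb = new StringBuilder();
            for (byte b : digest) sb.append(String.format("%02x", b));
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    // Returns the number of share bytes uploaded; 0 means the file was a duplicate.
    static int upload(byte[] file, List<Server> servers,
                      Function<byte[], List<byte[]>> shareAlgorithm) {
        if (servers.isEmpty()) throw new IllegalArgumentException("no servers");
        String tag = tagGen(file);
        boolean duplicate = true;
        for (Server s : servers) duplicate &= s.hasTag(tag);
        if (duplicate) return 0; // shares are never computed or sent for duplicates

        List<byte[]> shares = shareAlgorithm.apply(file);
        if (shares.size() != servers.size()) {
            throw new IllegalArgumentException("need exactly one share per server");
        }
        int sent = 0;
        for (int j = 0; j < servers.size(); j++) {
            servers.get(j).store(tag, shares.get(j));
            sent += shares.get(j).length;
        }
        return sent;
    }
}
```

Passing the Share algorithm as a function keeps the sketch independent of any particular secret sharing scheme; the first uploader pays the cost of computing and sending shares, and later uploaders of the same file send only the tag.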
Block Level Deduplication Systems:
In this module, we show how to achieve fine-grained block-level distributed deduplication systems.
In a block-level deduplication system, the user first performs file-level deduplication before uploading a file.
If no duplicate is found, the user divides the file into blocks and performs block-level deduplication.
The system setup is similar to that of file-level deduplication, except for changes to the parameters.
To download a block, the user obtains the secret shares from the S-CSPs and recovers the block.
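The block-level step can be sketched as below, under the assumption of fixed-size blocks and a SHA-256 block tag. The source does not say how blocks are delimited, so the fixed size is a guess, and the names `BlockLevelDedup`, `split`, and `uploadBlocks` are hypothetical. In the full system each new block would then be secret-shared across the S-CSPs; here the servers' knowledge is reduced to a set of known block tags.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Set;

// Hypothetical sketch: fixed-size block splitting with a per-block duplicate check.
final class BlockLevelDedup {
    // Splits the file into blocks of blockSize bytes; the last block may be shorter.
    static List<byte[]> split(byte[] file, int blockSize) {
        if (blockSize <= 0) throw new IllegalArgumentException("blockSize must be positive");
        List<byte[]> blocks = new ArrayList<>();
        for (int off = 0; off < file.length; off += blockSize) {
            blocks.add(Arrays.copyOfRange(file, off, Math.min(file.length, off + blockSize)));
        }
        return blocks;
    }

    // Assumption: the block tag is a hex-encoded SHA-256 hash of the block content.
    static String tag(byte[] block) {
        try {
            byte[] digest = MessageDigest.getInstance("SHA-256").digest(block);
            StringBuilder sb = new StringBuilder();
            for (byte b : digest) sb.append(String.format("%02x", b));
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    // Returns the indices of blocks that were new and therefore need to be uploaded.
    // knownBlockTags models the tags the servers already hold and is updated in place.
    static List<Integer> uploadBlocks(byte[] file, int blockSize, Set<String> knownBlockTags) {
        List<Integer> uploaded = new ArrayList<>();
        List<byte[]> blocks = split(file, blockSize);
        for (int i = 0; i < blocks.size(); i++) {
            if (knownBlockTags.add(tag(blocks.get(i)))) {
                uploaded.add(i);
            }
        }
        return uploaded;
    }
}
```

For a 14-byte file `AAAABBBBAAAACC` with 4-byte blocks, only blocks 0, 1, and 3 are uploaded, because block 2 repeats block 0. A later file that shares the `BBBB` block uploads only its remaining blocks.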
SYSTEM REQUIREMENTS:
HARDWARE REQUIREMENTS:
System : Pentium IV 2.4 GHz.
Hard Disk : 40 GB.
Floppy Drive : 1.44 MB.
Monitor : 15 VGA Colour.
Mouse : Logitech.
RAM : 512 MB.
SOFTWARE REQUIREMENTS:
Operating system : Windows XP/7.
Coding Language : Java
IDE : NetBeans 7.4
Database : MySQL
REFERENCE:
Jin Li, Xiaofeng Chen, Xinyi Huang, Shaohua Tang, Yang Xiang, Mohammad Mehedi Hassan, and Abdulhameed Alelaiwi, “Secure Distributed Deduplication Systems with Improved Reliability”, IEEE Transactions on Computers, 2015.