As businesses continue to move towards D2D backup, VTL, and virtualization, the cost of rapidly growing data stores that require backup and remote replication for disaster recovery can quickly get out of control. Data de-duplication technology is one way IT departments can reduce and control the costs of keeping and backing up data. In addition, de-duplication helps reduce the bandwidth costs associated with replicating data over a WAN. Businesses that take advantage of data de-duplication:
As more organizations move to D2D (disk-to-disk) backup for nearline storage, remote replication for disaster recovery, and technologies such as VTL (virtual tape library), many are concerned about the cost of these new disk-based data protection solutions. Although low-cost disk technologies like SATA have reduced the cost of storing data on disk, rapidly growing capacity requirements continue to keep disk-based data protection out of reach for small- to medium-sized businesses with modest IT budgets.
Data de-duplication (dedupe) is a method of reducing or eliminating redundant files or blocks of data so that only unique data is stored on disk. This technology is also sometimes referred to as capacity-optimized protection. It addresses rapidly growing capacity needs caused by “capacity bloat” by reducing the capacity required at the backup site.
For example, if an employee emails a Word attachment to 10 co-workers, a copy is often saved for every recipient, increasing the capacity the file consumes on the messaging data volume by a factor of 10. De-duplication technology eliminates the redundant files, replacing them with ‘pointers’ to the original data after it has been confirmed that all copies are identical. Ideally, data de-duplication technology should operate transparently.
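The attachment example above can be sketched in a few lines of Python. This is a minimal illustration, not a description of how any particular product works: the `DedupStore` class, its method names, and the use of SHA-256 as the fingerprint are all assumptions made for the sketch. It keeps one copy of each unique file, records every other copy as a pointer, and compares bytes before treating two files as identical.

```python
import hashlib


class DedupStore:
    """Toy file-level store (hypothetical): unique payloads are kept once,
    and every file name is just a pointer to one of them."""

    def __init__(self):
        self.blobs = {}     # digest -> list of distinct payloads sharing that digest
        self.pointers = {}  # file name -> (digest, index into the list above)

    def put(self, name, data):
        """Store a file. Returns True if new data was written, False if deduplicated."""
        digest = hashlib.sha256(data).hexdigest()
        candidates = self.blobs.setdefault(digest, [])
        for index, existing in enumerate(candidates):
            if existing == data:  # confirm the copies really are identical
                self.pointers[name] = (digest, index)
                return False
        candidates.append(data)
        self.pointers[name] = (digest, len(candidates) - 1)
        return True

    def get(self, name):
        """Follow the pointer and return the file's contents."""
        digest, index = self.pointers[name]
        return self.blobs[digest][index]

    def stored_bytes(self):
        """Bytes physically held, counting each unique payload once."""
        return sum(len(b) for bucket in self.blobs.values() for b in bucket)


# One attachment delivered to 10 mailboxes (made-up sample data).
attachment = b"quarterly report " * 1000
store = DedupStore()
for i in range(10):
    store.put(f"mailbox{i}/report.doc", attachment)

logical_bytes = 10 * len(attachment)   # what 10 separate copies would consume
physical_bytes = store.stored_bytes()  # what the store actually holds
print(logical_bytes // physical_bytes)
```

In this sketch the ten mailbox entries all resolve to the same stored payload, so the physical footprint is one copy rather than ten, and readers of any mailbox entry still get the full file back through `get`.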
Benefits of data de-duplication include:
There are two main methods of implementing de-duplication: inline and offline. Inline de-duplication is performed at the host application or on an appliance sitting in the data path. Because redundant data is removed before it is written to disk, this method minimizes disk capacity requirements and maximizes the resulting cost savings. Its disadvantage is that performing de-duplication in the data path negatively impacts performance.
Offline de-duplication performs the process at the backup system or appliance after the backup job is complete. This requires more disk capacity, but maximizes performance because the process resides outside of the data path. Some of our data de-duplication solutions provide policy-based de-duplication so each backup job can utilize a de-duplication method appropriate for the workload requirements.
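The trade-off between the two methods, and the idea of choosing one per backup job, can be illustrated with a small Python sketch. Everything here is hypothetical: the function names, the `JOB_POLICY` table and its job names, and the simplification that a backup job is a list of byte chunks identified by SHA-256 alone (a real system would also guard against hash collisions). The peak figure for the offline case counts only the staging area.

```python
import hashlib


def _digest(chunk):
    return hashlib.sha256(chunk).hexdigest()


def inline_backup(chunks):
    """De-duplicate in the data path: each chunk is hashed before it is written."""
    store = {}
    peak_bytes = 0
    for chunk in chunks:
        store.setdefault(_digest(chunk), chunk)  # duplicates never reach disk
        peak_bytes = max(peak_bytes, sum(len(c) for c in store.values()))
    return store, peak_bytes


def offline_backup(chunks):
    """Land the raw job first, then de-duplicate after the job is complete."""
    staging = list(chunks)                    # no hashing while the job runs
    peak_bytes = sum(len(c) for c in staging)  # full, un-deduplicated size
    store = {}
    for chunk in staging:                     # post-process pass
        store.setdefault(_digest(chunk), chunk)
    return store, peak_bytes


# Hypothetical per-job policy table: job name -> de-duplication method.
JOB_POLICY = {"file-share": inline_backup, "nightly-db": offline_backup}


def run_job(name, chunks):
    """Run a backup job with the method its policy selects."""
    return JOB_POLICY[name](chunks)


# Made-up job: four 100-byte chunks, three of them identical.
job = [b"A" * 100, b"B" * 100, b"A" * 100, b"A" * 100]
inline_store, inline_peak = run_job("file-share", job)
offline_store, offline_peak = run_job("nightly-db", job)
print(inline_peak, offline_peak)
```

Both paths end with the same unique chunks stored. The difference in the sketch is where the work happens: the inline path hashes every chunk while the job is running and never holds more than the unique data, while the offline path defers hashing until the job has finished and needs room for the whole job first.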