Data deduplication

What is deduplication?

As businesses continue to move toward D2D backup, VTL, and virtualization, the cost of rapidly growing data stores that require backup and remote replication for disaster recovery can quickly get out of control. Data deduplication technology is one way IT departments can reduce and control the cost of keeping and backing up data. In addition, deduplication helps reduce the bandwidth costs associated with replicating data over a WAN. Businesses that take advantage of data deduplication:

  • experience a massive reduction in storage capacity requirements
  • reduce WAN resources dedicated to backup
  • dramatically shorten backup windows
  • extend data retention timelines

As more organizations adopt D2D (disk-to-disk) backup for nearline storage, remote replication for disaster recovery, and technologies such as VTL (virtual tape library), many are concerned about the cost of these new disk-based data protection solutions. Although low-cost disk technologies like SATA have reduced the cost of storing data on disk, rapidly growing capacity requirements continue to keep disk-based data protection out of reach for small and medium-sized businesses with modest IT budgets.

Data De-Duplication

Deduplication is a method of eliminating redundant files or blocks of data so that only unique data is stored on disk. Sometimes referred to as capacity-optimized protection, it addresses rapidly growing capacity needs caused by “capacity bloat” by reducing the capacity required at the backup site.
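To make the block-level case concrete, here is a minimal sketch in Python. It assumes fixed-size 4 KB blocks and SHA-256 digests as block identifiers. Both are illustrative choices, as are the names `dedupe_blocks` and `rebuild`; actual products may chunk and index data quite differently.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed fixed block size, chosen only for illustration


def dedupe_blocks(data: bytes, store: dict) -> list:
    """Split data into fixed-size blocks and keep each unique block once.

    Returns the ordered list of block digests (a "recipe") needed to
    rebuild the original data from the store.
    """
    recipe = []
    for offset in range(0, len(data), BLOCK_SIZE):
        block = data[offset:offset + BLOCK_SIZE]
        digest = hashlib.sha256(block).hexdigest()
        store.setdefault(digest, block)  # stored only if not seen before
        recipe.append(digest)
    return recipe


def rebuild(recipe: list, store: dict) -> bytes:
    """Reassemble the original data by following the recipe."""
    return b"".join(store[digest] for digest in recipe)


# Four logical blocks, but only two distinct ones reach the store.
data = b"A" * 8192 + b"B" * 4096 + b"A" * 4096
store = {}
recipe = dedupe_blocks(data, store)
print(len(recipe), len(store))  # 4 2
```

The recipe still describes all four blocks, so the data can be restored in full even though only two blocks are physically kept.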

For example, if an employee emails a Word attachment to 10 co-workers, a copy is often saved for every recipient, increasing the capacity the file consumes on the messaging data volume by a factor of 10. Deduplication technology eliminates the redundant copies, replacing them with ‘pointers’ to the original data after it has been confirmed that all copies are identical. Ideally, data deduplication technology should operate transparently.
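The email attachment scenario can be sketched as follows. This is a simplified illustration, not a description of any particular product: the `FileStore` class, its method names, and the mailbox paths are hypothetical, and SHA-256 is an assumed choice of fingerprint. The byte-for-byte comparison stands in for the step of confirming that copies are identical before a pointer replaces them.

```python
import hashlib


class FileStore:
    """Toy file-level deduplicating store (illustrative only)."""

    def __init__(self):
        self.blobs = {}     # digest -> file content, stored once
        self.pointers = {}  # logical path -> digest of the stored content

    def save(self, path: str, content: bytes) -> bool:
        """Save content under path. Returns True if new data was written."""
        digest = hashlib.sha256(content).hexdigest()
        existing = self.blobs.get(digest)
        if existing is not None:
            # Confirm the copies really are identical before pointing at them.
            if existing != content:
                raise ValueError("digest matched but content differs")
            self.pointers[path] = digest
            return False
        self.blobs[digest] = content
        self.pointers[path] = digest
        return True

    def read(self, path: str) -> bytes:
        """Follow the pointer, so callers see an ordinary file."""
        return self.blobs[self.pointers[path]]


store = FileStore()
attachment = b"quarterly report" * 1000
for i in range(10):
    store.save(f"mailbox{i}/report.doc", attachment)

print(len(store.pointers), len(store.blobs))  # 10 1
```

Ten mailboxes each hold a pointer, one copy of the attachment is stored, and `read` returns the full file for any of them, which is what transparent operation looks like from the user's side.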

Benefits of data de-duplication include:

  • reduced capacity requirements for cost-savings
  • increased capacity for other backup data, leading to longer retention periods with minimal media management
  • greater reliability and improved RTO (recovery time objective)
  • lower cost and bandwidth barriers for WAN-based remote replication

Inline de-duplication vs. offline de-duplication

There are two main methods of implementing deduplication: inline and offline. Inline deduplication is performed at the host application, or on an appliance sitting in the data path, so duplicate data is removed before it reaches disk. This minimizes disk capacity requirements and maximizes the resulting cost savings. The disadvantage is that performing deduplication in the data path negatively impacts performance.

Offline deduplication performs the process at the backup system or appliance after the backup job is complete. This requires more disk capacity, but maximizes performance by keeping the process outside of the data path. Some of our data deduplication solutions provide policy-based deduplication so each backup job can use the deduplication method appropriate for its workload requirements.
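The trade-off between the two methods, and the idea of choosing one per backup job, can be sketched roughly as below. This is a toy model under stated assumptions: blocks are treated as equal in size, "peak disk usage" is counted in blocks, and the function names, the `POLICY` table, and the job names are hypothetical rather than taken from any product.

```python
import hashlib


def _digest(block: bytes) -> str:
    return hashlib.sha256(block).hexdigest()


def inline_backup(blocks, store):
    """Hash every block on the data path; only unique blocks reach disk.

    Returns the peak number of blocks held on disk during the job.
    """
    for block in blocks:
        store.setdefault(_digest(block), block)
    return len(store)


def offline_backup(blocks, store):
    """Write every block to a landing area, then deduplicate afterwards.

    Returns the peak number of blocks held on disk during the job.
    """
    landing = list(blocks)  # raw writes, no hashing while the job runs
    for block in landing:   # post-process pass after the job completes
        store.setdefault(_digest(block), block)
    peak = len(landing) + len(store)  # landing area is freed only at the end
    landing.clear()
    return peak


# Hypothetical per-job policy table; the job names are made up.
POLICY = {"file-share": inline_backup, "nightly-db": offline_backup}


def run_job(job_name, blocks, store):
    """Run a backup job with the deduplication method its policy selects."""
    return POLICY[job_name](blocks, store)


blocks = [b"a", b"b", b"a", b"a"]
print(run_job("file-share", blocks, {}))  # 2
print(run_job("nightly-db", blocks, {}))  # 6
```

Both jobs end with the same two unique blocks stored. The inline job never needs more than that, but pays for hashing while data is being written. The offline job writes at full speed and temporarily needs room for all four raw blocks plus the deduplicated store.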

North American Systems has been providing IT solutions, services and hardware for over 15 years.

If you want to learn more about what we can do for your IT, please contact us at 800-927-7474, or send us an email to get in touch with one of our experienced account executives.
