Danny Moran

How to Set Up Data Deduplication on Windows Server

Published May 13, 2023 by Danny Moran

Introduction

Learn how to install and enable Data Deduplication on Windows Server. In this example, I show you how to install the feature, set up volumes so that they are deduplicated, and manage the automated deduplication tasks.

Enabling the Data Deduplication Role

You can enable the Data Deduplication feature by running the following PowerShell command:

Install-WindowsFeature -Name FS-Data-Deduplication

Or, you can follow these steps to install it using Server Manager:

  1. Open Server Manager.

  2. Press Manage and then Add Roles and Features.

  3. Select Next on the Before You Begin page.

  4. Select Role-based or feature-based installation and press Next.

  5. Select the server you want to install the feature onto and press Next.

  6. Tick Data Deduplication under File and Storage Services > File and iSCSI Services.

  7. Press Add Features on the popup window. Any additional features will be automatically selected if they are required.

  8. Press Next on the Server Roles page.

  9. Press Next on the Features page.

  10. Press Install on the Confirmation page to install Data Deduplication.
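Whichever method you use, you can confirm the result from PowerShell. The snippet below is a minimal sketch: it checks whether the feature is already present with Get-WindowsFeature and only installs it if it is missing.

```powershell
# Check the current install state of the Data Deduplication feature.
$feature = Get-WindowsFeature -Name FS-Data-Deduplication

if (-not $feature.Installed) {
    # Install the feature only when it is not already present.
    Install-WindowsFeature -Name FS-Data-Deduplication
}

# Show the final state of the feature.
Get-WindowsFeature -Name FS-Data-Deduplication |
    Select-Object Name, InstallState
```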

Evaluate Data Deduplication Benefits

After enabling the Data Deduplication feature, you can run the ddpeval command against a drive or a file share to calculate how much storage Data Deduplication would save.

You can analyse an entire drive by running the command below, replacing E with the letter of the drive you want to analyse.

ddpeval.exe E:\

You can analyse a specific share by running the command below, replacing E:\Shares with the file path of the share you want to analyse.

ddpeval.exe E:\Shares
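If you want to keep the evaluation results for later comparison, one option is to pipe the output to a file. This is a sketch only: the C:\Temp folder and the report file name are hypothetical, so change them to suit your server.

```powershell
# Hypothetical report location; change this to a folder that exists on your server.
$reportDir  = 'C:\Temp'
$reportFile = Join-Path $reportDir 'ddpeval-E-Shares.txt'

# Run the evaluation and write the output to both the console and the report file.
ddpeval.exe E:\Shares | Tee-Object -FilePath $reportFile
```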

Configuring Data Deduplication

Once Data Deduplication has been installed, you can enable it on your data volumes.

  1. Within Server Manager, press File and Storage Services on the left-hand panel.

  2. Select Volumes.

  3. Right-click the volume you want to enable Data Deduplication for and select Configure Data Deduplication.

  4. Change Data Deduplication from Disabled to General purpose file server.

  5. By default, files older than 3 days will be deduplicated if possible. You can amend this as required.

  6. Specific file types can be excluded from Data Deduplication if required. Enter the file types in a comma-separated format, such as pdf,csv,docx.

  7. Press Set Deduplication Schedule.

  8. Tick Enable background optimization and Enable throughput optimization.

  9. You can amend the time throughput optimization starts and how long it runs for if required. It is best to schedule it to run during the night. While it runs, its resource usage is not capped, so it can cause performance issues if files are being accessed frequently at the same time.

  10. Press OK.

  11. Press Apply to enable the Data Deduplication policy.
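The same volume settings can also be applied from PowerShell. The sketch below assumes the volume is E: and reuses the example values from the steps above (a 3 day minimum file age and the pdf, csv, and docx exclusions). The Default usage type corresponds to the General purpose file server option.

```powershell
# Enable deduplication on E: using the general purpose file server profile.
Enable-DedupVolume -Volume 'E:' -UsageType Default

# Only deduplicate files older than 3 days and skip the example file types.
Set-DedupVolume -Volume 'E:' -MinimumFileAgeDays 3 -ExcludeFileType 'pdf', 'csv', 'docx'

# Review the resulting settings for the volume.
Get-DedupVolume -Volume 'E:' | Format-List
```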

Viewing the Data Deduplication Service Status

You can view the status of Data Deduplication by running the following command. It returns the drive capacity, free space, used space, savings percentage, number of files deduplicated, and other information that shows how the Data Deduplication service is performing.

Get-DedupStatus

For a more detailed view, you can run:

Get-DedupStatus | fl
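If you only care about a few values, you can narrow the output down. The example below is a sketch that assumes the volume is E: and picks a handful of the properties shown in the detailed view. Adjust the property list to match what you see on your server.

```powershell
# Show a compact summary for a single volume.
Get-DedupStatus -Volume 'E:' |
    Select-Object Volume, FreeSpace, SavedSpace, OptimizedFilesCount,
                  InPolicyFilesCount, LastOptimizationTime
```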

Checking the Data Deduplication Schedules

Running the Get-DedupSchedule PowerShell command returns the Optimization, GarbageCollection, and Scrubbing schedules.

Get-DedupSchedule
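You can also filter the schedules by type, and change an existing one with Set-DedupSchedule. Treat the example below as a sketch: it assumes a schedule named ThroughputOptimization exists (check the Name column of your own output first), and the start time and duration are example values only.

```powershell
# List only the optimization schedules, with every property shown.
Get-DedupSchedule -Type Optimization | Format-List

# Example only: move the assumed ThroughputOptimization schedule to start at 01:00
# and run for 6 hours. This changes the configuration, so verify the name first.
Set-DedupSchedule -Name 'ThroughputOptimization' -Start '01:00' -DurationHours 6
```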

Scheduled Tasks

When you enable Data Deduplication, scheduled tasks are automatically created in Task Scheduler to run Background Optimization, Throughput Optimization, Weekly Garbage Collection, and Weekly Scrubbing. These tasks can be found within Task Scheduler:

Task Scheduler Library\Microsoft\Windows\Deduplication\BackgroundOptimization
Task Scheduler Library\Microsoft\Windows\Deduplication\ThroughputOptimization
Task Scheduler Library\Microsoft\Windows\Deduplication\ThroughputOptimization-2
Task Scheduler Library\Microsoft\Windows\Deduplication\WeeklyGarbageCollection
Task Scheduler Library\Microsoft\Windows\Deduplication\WeeklyScrubbing
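You can list the same tasks from PowerShell instead of opening Task Scheduler. This sketch uses the built-in ScheduledTasks cmdlets to show when each task last ran, its last result code, and when it is next due.

```powershell
# List the deduplication tasks along with their last and next run times.
Get-ScheduledTask -TaskPath '\Microsoft\Windows\Deduplication\' |
    Get-ScheduledTaskInfo |
    Select-Object TaskName, LastRunTime, LastTaskResult, NextRunTime
```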

Manually running Data Deduplication

You can manually initiate a Data Deduplication job by running the command below, replacing E with the drive letter of the volume you want to run the task for.

Start-DedupJob -Type Optimization -Volume E:\
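The -Type parameter also accepts the other job types that appear in the schedules. As a sketch, assuming the volume is E:, you could start a garbage collection or scrubbing job manually like this:

```powershell
# Reclaim space from data chunks that are no longer referenced.
Start-DedupJob -Type GarbageCollection -Volume 'E:'

# Check the deduplication chunk store for corruption.
Start-DedupJob -Type Scrubbing -Volume 'E:'
```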

View Currently Running Jobs

To view a list of active jobs (running or queued), you can run the following command. This command only returns results if there is a job currently active or a job that has finished within the last 10 seconds.

Get-DedupJob
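Because the command returns nothing once jobs have finished, a simple polling loop can be handy for watching a long-running job. The following is a rough sketch, assuming the volume is E: and a 30 second polling interval. Both values are arbitrary examples.

```powershell
# Example volume; change this to the volume you are monitoring.
$volume = 'E:'

# Keep polling while Get-DedupJob still reports jobs for the volume.
while ($jobs = Get-DedupJob -Volume $volume -ErrorAction SilentlyContinue) {
    $jobs | Select-Object Type, State, Progress, Volume |
        Format-Table -AutoSize | Out-Host
    Start-Sleep -Seconds 30
}

Write-Host "No deduplication jobs are currently reported for $volume."
```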