FreeBSD ZFS Deduplication

Time to prove or disprove the ZFS dedup capabilities. First, it’s important to understand that, as with compression, changing the setting on a pool or dataset only affects data written afterwards; existing data is not de-duped or compressed retroactively. So the first task is to clear down what we have running and check the initial settings and baseline with:

jtest# zpool list
NAME   SIZE  ALLOC   FREE    CAP  DEDUP  HEALTH  ALTROOT
DATA  5.44T  1.36M  5.44T     0%  1.00x  ONLINE  -

jtest# zfs get all DATA
NAME  PROPERTY              VALUE                  SOURCE
DATA  type                  filesystem             -
DATA  creation              Sun Oct 16 23:27 2011  -
DATA  used                  730K                   -
DATA  available             3.56T                  -
DATA  referenced            30.6K                  -
DATA  compressratio         1.00x                  -
DATA  mounted               yes                    -
DATA  quota                 none                   default
DATA  reservation           none                   default
DATA  recordsize            128K                   default
DATA  mountpoint            /DATA                  default
DATA  sharenfs              off                    default
DATA  checksum              on                     default
DATA  compression           gzip                   local
DATA  atime                 on                     default
DATA  devices               on                     default
DATA  exec                  on                     default
DATA  setuid                on                     default
DATA  readonly              off                    default
DATA  jailed                off                    default
DATA  snapdir               hidden                 default
DATA  aclmode               discard                default
DATA  aclinherit            restricted             default
DATA  canmount              on                     default
DATA  xattr                 off                    temporary
DATA  copies                1                      local
DATA  version               4                      -
DATA  utf8only              off                    -
DATA  normalization         none                   -
DATA  casesensitivity       sensitive              -
DATA  vscan                 off                    default
DATA  nbmand                off                    default
DATA  sharesmb              off                    default
DATA  refquota              none                   default
DATA  refreservation        none                   default
DATA  primarycache          all                    default
DATA  secondarycache        all                    default
DATA  usedbysnapshots       0                      -
DATA  usedbydataset         30.6K                  -
DATA  usedbychildren        699K                   -
DATA  usedbyrefreservation  0                      -
DATA  logbias               latency                default
DATA  dedup                 on                     local
DATA  mlslabel                                     -
DATA  sync                  standard               default
DATA  refcompressratio      1.00x                  -

Notice I’ve got dedup and compression on here. I’m going to switch off compression to see the difference dedup makes on its own.

jtest# zfs set compression=off DATA

jtest# zfs get compression,dedup DATA
NAME  PROPERTY     VALUE          SOURCE
DATA  compression  off            local
DATA  dedup        on             local
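
If you want to script this check, the property table can be filtered by property name. This is only a minimal sketch that assumes the four-column layout shown above; `prop_value` is a hypothetical helper name, not something ZFS provides.

```sh
# prop_value PROPERTY: read `zfs get` table output on stdin and print
# the VALUE column for the named property (hypothetical helper).
prop_value() {
    awk -v prop="$1" '$2 == prop { print $3 }'
}

# Example using the output captured above; on a live system you would
# pipe in `zfs get compression,dedup DATA` instead.
prop_value dedup <<'EOF'
NAME  PROPERTY     VALUE          SOURCE
DATA  compression  off            local
DATA  dedup        on             local
EOF
```

On a live system, `zfs get -H -o value dedup DATA` prints the bare value directly, so the helper is mainly useful when working from saved output.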

I have a 1GB test file in my home directory, which I’ll copy into 3 new directories:

jtest# mkdir /DATA/test1
jtest# mkdir /DATA/test2
jtest# mkdir /DATA/test3
jtest# cp /home/dan/FreeBSD-8.2-RELEASE-amd64-memstick.img /DATA/test1/
jtest# cp /home/dan/FreeBSD-8.2-RELEASE-amd64-memstick.img /DATA/test2/
jtest# cp /home/dan/FreeBSD-8.2-RELEASE-amd64-memstick.img /DATA/test3/

Let’s see what the (not so) trusty du command reports:

jtest# cd /DATA
jtest# du -d1 -h
1.0G	./test3
1.0G	./test2
1.0G	./test1
3.0G	.

Well, that’s not surprising really: du adds up each file’s own size and knows nothing about blocks shared at the pool level. To see the dedup performance, use this:

jtest# zpool list
NAME   SIZE  ALLOC   FREE    CAP  DEDUP  HEALTH  ALTROOT
DATA  5.44T  1.43G  5.44T     0%  3.21x  ONLINE  -
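
To track the ratio from a script, the DEDUP column can be pulled out of that output. The sketch below is an illustration only: `dedup_ratio` is a hypothetical helper, and it assumes the header row names the column `DEDUP` as shown above.

```sh
# dedup_ratio POOL: read `zpool list` output on stdin and print the
# DEDUP column for POOL. The column is found by its header name, so the
# sketch does not depend on the column order (hypothetical helper).
dedup_ratio() {
    awk -v pool="$1" '
        NR == 1 { for (i = 1; i <= NF; i++) if ($i == "DEDUP") col = i; next }
        $1 == pool && col { print $col }
    '
}

# Example using the output captured above.
dedup_ratio DATA <<'EOF'
NAME   SIZE  ALLOC   FREE    CAP  DEDUP  HEALTH  ALTROOT
DATA  5.44T  1.43G  5.44T     0%  3.21x  ONLINE  -
EOF
```

On a live system this would be `zpool list | dedup_ratio DATA`. The same figure is also exposed as the pool property `dedupratio`, so `zpool get dedupratio DATA` is an alternative.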

Just for kicks, let’s try with another 3 identical files.

jtest# mkdir /DATA/test4
jtest# mkdir /DATA/test5
jtest# mkdir /DATA/test6
jtest# cp /home/dan/FreeBSD-8.2-RELEASE-amd64-memstick.img /DATA/test4/
jtest# cp /home/dan/FreeBSD-8.2-RELEASE-amd64-memstick.img /DATA/test5/
jtest# cp /home/dan/FreeBSD-8.2-RELEASE-amd64-memstick.img /DATA/test6/
jtest# du -d1 -h
1.0G	./test4
1.0G	./test3
1.0G	./test2
1.0G	./test5
1.0G	./test1
1.0G	./test6
6.1G	.
jtest# zpool list
NAME   SIZE  ALLOC   FREE    CAP  DEDUP  HEALTH  ALTROOT
DATA  5.44T  1.43G  5.44T     0%  6.42x  ONLINE  -
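
ALLOC has not moved from 1.43G while the ratio has doubled, which suggests the ratio simply scales with the number of copies. A quick way to check that arithmetic, using only the two figures reported so far (`per_copy` is a hypothetical helper):

```sh
# per_copy: read "copies ratio" pairs on stdin and print the ratio
# divided by the number of copies (hypothetical helper).
per_copy() {
    awk '{ sub(/x$/, "", $2); printf "%d copies: %.2f per copy\n", $1, $2 / $1 }'
}

per_copy <<'EOF'
3 3.21x
6 6.42x
EOF
```

Both lines come out at 1.07 per copy. The test does not show why the figure sits slightly above 1; repeated blocks inside the image itself would be one possible explanation, but that is an assumption.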

Blimey! Looks like they are free now! I’ll put 3 more on, just to be sure:

jtest# du -d1 -h
1.0G	./test4
1.0G	./test9
1.0G	./test3
1.0G	./test7
1.0G	./test2
1.0G	./test8
1.0G	./test5
1.0G	./test1
1.0G	./test6
9.1G	.
jtest# zpool list
NAME   SIZE  ALLOC   FREE    CAP  DEDUP  HEALTH  ALTROOT
DATA  5.44T  1.43G  5.44T     0%  9.64x  ONLINE  -
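
The reason extra copies cost nothing is that dedup works on blocks: a block whose checksum has already been seen is referenced rather than stored again. The sketch below imitates that idea outside ZFS so you can estimate how well a set of files would dedup. It is a rough illustration under stated assumptions, not how ZFS is implemented: `estimate_dedup` is a hypothetical helper, it uses fixed-size blocks cut by `split`, and it uses the CRC from `cksum` as a portable stand-in for the much stronger checksum ZFS relies on.

```sh
# estimate_dedup BLOCKSIZE FILE...: cut each file into fixed-size
# blocks, checksum every block, and compare total vs distinct blocks.
# Hypothetical helper; it writes a temporary copy of the data, so try
# it on small files first.
estimate_dedup() (
    bs=$1; shift
    tmp=$(mktemp -d) || exit 1
    n=0
    for f in "$@"; do
        n=$((n + 1))
        # -a 6 allows enough suffixes for many blocks per file
        split -a 6 -b "$bs" "$f" "$tmp/f$n." || { rm -rf "$tmp"; exit 1; }
    done
    # cksum prints "<crc> <size> <name>"; crc plus size identifies a block
    result=$(find "$tmp" -type f -exec cksum {} + |
        awk '!seen[$1 " " $2]++ { unique++ }
             { total++ }
             END { if (unique) printf "blocks=%d unique=%d ratio=%.2fx", total, unique, total / unique }')
    rm -rf "$tmp"
    [ -n "$result" ] && printf '%s\n' "$result"
)
```

For example, `estimate_dedup 131072 /home/dan/FreeBSD-8.2-RELEASE-amd64-memstick.img` uses the 128K recordsize from the properties listed earlier and reports how many blocks repeat within the single image; passing several identical copies multiplies the ratio accordingly. Expect the numbers to differ from what `zpool list` reports, since the pool accounts for space in its own way.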