Time to prove or disprove the ZFS dedup capabilities. First, it's important to understand that, as with compression, if we change the settings on a dataset then only data written after the change will be de-duped or compressed. So the first task is to clear down what we have running and check the initial settings and baseline with:
jtest# zpool list
NAME   SIZE  ALLOC   FREE    CAP  DEDUP  HEALTH  ALTROOT
DATA  5.44T  1.36M  5.44T     0%  1.00x  ONLINE  -

jtest# zfs get all DATA
NAME  PROPERTY              VALUE                  SOURCE
DATA  type                  filesystem             -
DATA  creation              Sun Oct 16 23:27 2011  -
DATA  used                  730K                   -
DATA  available             3.56T                  -
DATA  referenced            30.6K                  -
DATA  compressratio         1.00x                  -
DATA  mounted               yes                    -
DATA  quota                 none                   default
DATA  reservation           none                   default
DATA  recordsize            128K                   default
DATA  mountpoint            /DATA                  default
DATA  sharenfs              off                    default
DATA  checksum              on                     default
DATA  compression           gzip                   local
DATA  atime                 on                     default
DATA  devices               on                     default
DATA  exec                  on                     default
DATA  setuid                on                     default
DATA  readonly              off                    default
DATA  jailed                off                    default
DATA  snapdir               hidden                 default
DATA  aclmode               discard                default
DATA  aclinherit            restricted             default
DATA  canmount              on                     default
DATA  xattr                 off                    temporary
DATA  copies                1                      local
DATA  version               4                      -
DATA  utf8only              off                    -
DATA  normalization         none                   -
DATA  casesensitivity       sensitive              -
DATA  vscan                 off                    default
DATA  nbmand                off                    default
DATA  sharesmb              off                    default
DATA  refquota              none                   default
DATA  refreservation        none                   default
DATA  primarycache          all                    default
DATA  secondarycache        all                    default
DATA  usedbysnapshots       0                      -
DATA  usedbydataset         30.6K                  -
DATA  usedbychildren        699K                   -
DATA  usedbyrefreservation  0                      -
DATA  logbias               latency                default
DATA  dedup                 on                     local
DATA  mlslabel                                     -
DATA  sync                  standard               default
DATA  refcompressratio      1.00x                  -
Notice I’ve got both dedup and compression on here. I’m going to switch off compression to see the difference dedup makes on its own.
jtest# zfs set compression=off DATA
jtest# zfs get compression,dedup DATA
NAME  PROPERTY     VALUE  SOURCE
DATA  compression  off    local
DATA  dedup        on     local
I have a 1GB test file in my home directory, which I’ll copy into 3 new directories:
jtest# mkdir /DATA/test1
jtest# mkdir /DATA/test2
jtest# mkdir /DATA/test3
jtest# cp /home/dan/FreeBSD-8.2-RELEASE-amd64-memstick.img /DATA/test1/
jtest# cp /home/dan/FreeBSD-8.2-RELEASE-amd64-memstick.img /DATA/test2/
jtest# cp /home/dan/FreeBSD-8.2-RELEASE-amd64-memstick.img /DATA/test3/
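Since the same mkdir/cp pair gets repeated for every batch of copies, a small loop saves some typing. This is only a sketch: copy_n is a helper name I made up for illustration, and it is written for plain sh rather than the root csh prompt shown here.

```shell
#!/bin/sh
# copy_n SRC DESTROOT FIRST LAST  (hypothetical helper, not part of ZFS)
# Creates DESTROOT/testFIRST .. DESTROOT/testLAST and copies SRC into each.
copy_n() {
    src=$1
    destroot=$2
    i=$3
    last=$4
    while [ "$i" -le "$last" ]; do
        mkdir -p "$destroot/test$i" || return 1
        cp "$src" "$destroot/test$i/" || return 1
        i=$((i + 1))
    done
}

# Equivalent to the six commands above:
# copy_n /home/dan/FreeBSD-8.2-RELEASE-amd64-memstick.img /DATA 1 3
```

Later batches would then just change the range, for example 4 6 or 7 9.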
Let’s see what the (not so) trusty du command reports:
jtest# cd /DATA
jtest# du -d1 -h
1.0G    ./test3
1.0G    ./test2
1.0G    ./test1
3.0G    .
Well, that’s not surprising really: du reports the logical size of each file and knows nothing about dedup. To see the dedup performance use this:
jtest# zpool list
NAME   SIZE  ALLOC   FREE    CAP  DEDUP  HEALTH  ALTROOT
DATA  5.44T  1.43G  5.44T     0%  3.21x  ONLINE  -
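If you want to track just the ratio between test runs, the DEDUP column can be picked out of that output with awk. A minimal sketch, assuming the header row is the first line as shown above; dedup_of is a name I invented for this example.

```shell
#!/bin/sh
# dedup_of POOL  (hypothetical helper)
# Reads `zpool list` output on stdin and prints the DEDUP column for POOL.
dedup_of() {
    awk -v pool="$1" '
        # Header row: find which column is labelled DEDUP.
        NR == 1 { for (i = 1; i <= NF; i++) if ($i == "DEDUP") col = i; next }
        # Data rows: print that column for the requested pool only.
        $1 == pool && col { print $col }
    '
}

# Usage on a live system:
# zpool list | dedup_of DATA
```

The same figure should also be available directly as the pool's dedupratio property (zpool get dedupratio DATA), which avoids parsing altogether.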
Just for kicks, let’s try with another 3 identical files.
jtest# mkdir /DATA/test4
jtest# mkdir /DATA/test5
jtest# mkdir /DATA/test6
jtest# cp /home/dan/FreeBSD-8.2-RELEASE-amd64-memstick.img /DATA/test4/
jtest# cp /home/dan/FreeBSD-8.2-RELEASE-amd64-memstick.img /DATA/test5/
jtest# cp /home/dan/FreeBSD-8.2-RELEASE-amd64-memstick.img /DATA/test6/
jtest# du -d1 -h
1.0G    ./test4
1.0G    ./test3
1.0G    ./test2
1.0G    ./test5
1.0G    ./test1
1.0G    ./test6
6.1G    .
jtest# zpool list
NAME   SIZE  ALLOC   FREE    CAP  DEDUP  HEALTH  ALTROOT
DATA  5.44T  1.43G  5.44T     0%  6.42x  ONLINE  -
Blimey! Looks like they are free now: ALLOC hasn’t moved from 1.43G. I’ll put 3 more on, just to be sure:
jtest# du -d1 -h
1.0G    ./test4
1.0G    ./test9
1.0G    ./test3
1.0G    ./test7
1.0G    ./test2
1.0G    ./test8
1.0G    ./test5
1.0G    ./test1
1.0G    ./test6
9.1G    .
jtest# zpool list
NAME   SIZE  ALLOC   FREE    CAP  DEDUP  HEALTH  ALTROOT
DATA  5.44T  1.43G  5.44T     0%  9.64x  ONLINE  -
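As a sanity check on those numbers, dividing each reported ratio by the number of copies gives the same figure every time, so the ratio is growing linearly with the copy count while ALLOC stays put. The per_copy function below is just my own bit of arithmetic, not a ZFS tool.

```shell
#!/bin/sh
# per_copy RATIO COPIES  (hypothetical helper)
# Divides a dedup ratio such as "3.21x" by the number of identical copies.
per_copy() {
    awk -v r="$1" -v n="$2" 'BEGIN {
        sub(/x$/, "", r)          # drop the trailing "x" from the ratio
        printf "%.2f\n", r / n
    }'
}

per_copy 3.21x 3   # prints 1.07
per_copy 6.42x 6   # prints 1.07
per_copy 9.64x 9   # prints 1.07
```

Each copy contributes about 1.07 rather than exactly 1.00. One possible reading, which I haven't verified, is that the image itself contains a few internally repeated blocks that dedup also collapses.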