Steve Modica, CTO, Small Tree
Disk fragmentation is a common issue when it comes to storage — especially if the quantity of media you need to keep on spinning disk is approaching the capacity of the storage volumes you have available. Fortunately, fragmentation can be fairly easily avoided, and even if it has become a problem, solutions don't have to be complicated. Small Tree CTO Steve Modica offered to fill us in on the subject.
 
What is disk fragmentation, and how does it impact post-production?
 
Disk fragmentation is a process whereby files (and free space) on your storage get broken up into small pieces rather than stored in large, linear chunks. It has a much larger impact on post-production facilities because their files tend to be very large. The larger the file, the more it can be affected by fragmentation. (Obviously, a one-block file can’t be fragmented at all.) Fragmentation has less of an impact on SSDs than on spinning disks, but it still has an impact and should be avoided when possible.
 
What causes disk fragmentation?
 
Disk fragmentation typically becomes a problem when a disk or storage system gets too full, usually somewhere in the 80-85% range, though this will depend on the size of the files you are writing. In a perfect world, our computers would simply write new files to the end of our disks and just keep going until they filled up. However, in the real world, we’re constantly deleting files, and that leaves holes all over the place. Our computers try to fill those holes when we write new data, using a “best-fit” algorithm. When our disks get full, there are no longer any holes large enough to hold a whole file, so our algorithms start to break up our writes into tinier and tinier chunks.
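To make the “best-fit” idea concrete, here is a minimal Python sketch. It is a toy model, not the allocator of any real filesystem: the function name best_fit_allocate, the (start, length) extent lists and the block counts are all made up for illustration. It shows the same 50-block write landing in one piece on a mostly empty disk and in four pieces on a nearly full one.

```python
def best_fit_allocate(free_extents, size):
    """Toy best-fit allocator (illustrative only, not a real filesystem's code).

    free_extents: list of (start_block, length) holes on the disk.
    size: number of blocks to write.
    Returns (fragments, remaining_free_extents).
    """
    free = sorted(free_extents)
    if sum(length for _, length in free) < size:
        raise OSError("not enough free space")

    # Best fit: pick the smallest hole that can take the whole write.
    candidates = [e for e in free if e[1] >= size]
    if candidates:
        start, length = min(candidates, key=lambda e: e[1])
        free.remove((start, length))
        if length > size:
            free.append((start + size, length - size))
        return [(start, size)], sorted(free)

    # No single hole is big enough: split the write across the
    # largest remaining holes, one fragment per hole.
    fragments = []
    remaining = size
    while remaining > 0:
        start, length = max(free, key=lambda e: e[1])
        free.remove((start, length))
        take = min(length, remaining)
        fragments.append((start, take))
        if length > take:
            free.append((start + take, length - take))
        remaining -= take
    return fragments, sorted(free)


# Hypothetical free-space maps, in blocks.
mostly_empty = [(0, 10), (40, 100)]
nearly_full = [(0, 10), (25, 5), (40, 20), (90, 15), (130, 8)]

frags_empty, _ = best_fit_allocate(mostly_empty, 50)
frags_full, _ = best_fit_allocate(nearly_full, 50)
print(len(frags_empty), frags_empty)  # 1 [(40, 50)]
print(len(frags_full), frags_full)    # 4 [(40, 20), (90, 15), (0, 10), (130, 5)]
```

Real allocators are far more sophisticated, but the pattern is the same: once the big holes are gone, a large write has nowhere to go in one piece.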
 
Why is disk fragmentation a problem?
 
When reading data or streaming video from spinning disks, it’s always best when the file blocks are in order. The disk heads can read a block and then read the next one as the disk spins underneath. There’s no seeking involved. Products designed to deliver good streaming performance will try to configure systems and filesystems for this purpose, so each read pulls in as much data as possible, as efficiently as possible. The result is large, fast, efficient IO that streams video smoothly. In the presence of disk fragmentation, those large IOs still get issued, but the underlying OS has to break them up to read in all the scattered blocks, and each scattered piece can cost a seek. Things slow down very quickly.
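A rough back-of-the-envelope model shows how quickly seeks add up. The Python sketch below assumes a spinning disk that streams at 200 MB/s and pays about 10 ms (seek plus rotational delay) for every fragment it has to jump to. Both figures are illustrative assumptions, not measurements of any particular drive, and estimated_read_seconds is a hypothetical helper.

```python
def estimated_read_seconds(file_mb, fragments,
                           throughput_mb_s=200.0, seek_ms=10.0):
    """Estimate time to read a file: pure transfer time plus one seek
    per fragment. The default throughput and seek cost are assumed,
    illustrative values."""
    transfer = file_mb / throughput_mb_s
    seeking = fragments * seek_ms / 1000.0
    return transfer + seeking


# A 10,000 MB clip stored as one contiguous run...
contiguous = estimated_read_seconds(10_000, fragments=1)
# ...versus the same clip scattered into 50,000 pieces of about 200 KB.
scattered = estimated_read_seconds(10_000, fragments=50_000)

print(f"contiguous: {contiguous:.2f} s")  # contiguous: 50.01 s
print(f"scattered:  {scattered:.2f} s")   # scattered:  550.00 s
```

Under these assumptions the fragmented read takes about eleven times as long, and nearly all of the extra time is spent moving the heads rather than moving data.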
 
What do I do to guard against the effects of disk fragmentation?
 
First and foremost, you should avoid filling up your disk subsystems. As long as you keep 15 to 20% free, fragmentation is unlikely to become a problem. If you’ve been above that mark but have since gotten back below it, any residual fragmentation will clear itself up over time if you’re using a modern filesystem. ZFS, for example, rewrites files into better configurations as they are modified. XFS has a tool called xfs_fsr (short for filesystem reorganizer) that should be run periodically and will clear up fragmentation as it goes. Apple’s HFS+ will attempt to rewrite fragmented files when you access them.
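Keeping an eye on free space is easy to automate. The Python sketch below uses the standard library’s shutil.disk_usage to check whether a volume has dropped below a free-space floor. The 20% default comes from the guideline above; the function names and the /Volumes/Media path in the comment are hypothetical.

```python
import shutil


def free_fraction(total_bytes, free_bytes):
    """Fraction of the volume that is still free (0.0 to 1.0)."""
    return free_bytes / total_bytes


def needs_attention(path, min_free=0.20):
    """Return True if the volume holding `path` has less than
    `min_free` of its capacity free. The 0.20 default follows the
    15 to 20% rule of thumb; adjust it for your own workload."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free
    return free_fraction(usage.total, usage.free) < min_free


# Example: check the volume that holds the current directory.
# Replace "." with your media volume, e.g. "/Volumes/Media" (hypothetical).
if needs_attention("."):
    print("Free space is below the floor: time to archive or expand.")
else:
    print("Free space looks healthy.")
```

Run from a scheduled job, a check like this gives you a warning before the volume gets full enough for fragmentation to set in.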
 
What do I do if disk fragmentation is creating performance problems?
 
If you have storage performance issues and you don’t have the option of deleting a lot of files, your best move would be to increase your storage capacity. If you copy or rsync your files to a larger storage device, that copy will automatically defragment everything, because each file is written fresh into open space. There are several tools that will measure and correct fragmentation but, in general, a copy and restore to a larger filesystem is good recovery/restore practice and probably just as fast.
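For readers who would rather script the copy than run rsync by hand, here is a minimal Python sketch using shutil.copytree, followed by a simple size check on every file. The helper name copy_and_verify and the paths in the usage comment are hypothetical. It assumes the destination directory does not exist yet, and it compares file sizes only, which is not a substitute for a checksum.

```python
import os
import shutil


def copy_and_verify(src, dst):
    """Copy the tree at `src` to a new directory `dst`, then return a
    list of relative paths whose copy is missing or differs in size.
    An empty list means every file arrived with the expected size."""
    shutil.copytree(src, dst)  # raises FileExistsError if dst exists

    mismatches = []
    for root, _dirs, files in os.walk(src):
        for name in files:
            src_file = os.path.join(root, name)
            rel = os.path.relpath(src_file, src)
            dst_file = os.path.join(dst, rel)
            if (not os.path.isfile(dst_file)
                    or os.path.getsize(dst_file) != os.path.getsize(src_file)):
                mismatches.append(rel)
    return mismatches


# Usage (hypothetical paths):
#   problems = copy_and_verify("/Volumes/OldRAID/Projects",
#                              "/Volumes/NewRAID/Projects")
#   print("all files copied" if not problems else problems)
```

How contiguous the new copies end up depends on the destination having plenty of free space, which is the reason for moving to a larger volume in the first place.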
 
Steve Modica is CTO of Small Tree: www.small-tree.com