Scaling large, parallel file system backups
Time: Tuesday, July 24, 6:30pm - 8:30pm
Description: High-performance, multi-petabyte file systems with large numbers of files present performance challenges for data backups and restores. For a file system where new or modified files are backed up daily, the rate at which new data is created must be matched by the rate at which the data can be backed up, so that each backup window is less than one day. Scanning metadata to identify files for backup is another limiting factor in backup performance on file systems with large numbers of files. Tape is still the most cost-effective storage medium for backups, but it exacerbates backup performance challenges. In this poster we describe the systems and software that support daily backups for a file system with 1.5 PB of data in 230 million files, restoration procedures in the case of a catastrophic file system failure, and methods for parallelizing backups, with diagrams showing hardware details and data movement for backups. We provide a performance analysis of the backups, describe the performance limitations of parallel backups, and present ideas for future work.
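The backup-window constraint described above can be illustrated with a little arithmetic. The following is a minimal sketch, not the poster's method: the function names, the assumption that the metadata scan and the data transfer run one after the other, and any rates passed in are hypothetical and chosen only for illustration. Only the 230 million file count comes from the abstract.

```python
PB = 10**15  # decimal petabyte, in bytes


def required_throughput(changed_bytes: float, window_hours: float = 24.0) -> float:
    """Minimum sustained bytes/second needed to copy changed_bytes within the window."""
    return changed_bytes / (window_hours * 3600)


def scan_hours(num_files: int, files_per_second: float) -> float:
    """Hours needed to scan the metadata of num_files at a given scan rate."""
    return num_files / files_per_second / 3600


def backup_fits(changed_bytes: float, num_files: int,
                files_per_second: float, bytes_per_second: float,
                window_hours: float = 24.0) -> bool:
    """True if scan time plus transfer time is shorter than the window.

    Assumption: the scan finishes before the transfer starts (no overlap).
    """
    transfer_hours = changed_bytes / bytes_per_second / 3600
    return scan_hours(num_files, files_per_second) + transfer_hours < window_hours


if __name__ == "__main__":
    files = 230_000_000          # file count from the abstract
    changed = 0.01 * 1.5 * PB    # hypothetical: 1% of 1.5 PB changes per day
    print(f"required rate: {required_throughput(changed) / 1e6:.1f} MB/s")
    print(f"scan at 10k files/s: {scan_hours(files, 10_000):.2f} h")
    print("fits in a day:", backup_fits(changed, files, 10_000, 1e9))
```

Under these assumptions, the scan time depends only on the file count and the scan rate, which is why metadata scanning can dominate the window even when little data has changed.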