A bit more than a year ago the folks behind the LTO tape format standards, primarily IBM with some contributions from HP and Quantum, added the Linear Tape File System (LTFS) to LTO’s feature list. While some niche markets, primarily the media and entertainment business, have adopted LTFS, it won’t live up to its promise without support from archiving and eDiscovery vendors.
LTFS divides a tape into two partitions: one that holds the file system metadata and another that holds file data. With a little bit of software on a server, LTFS tapes look to the server like disks, and any application can write files to a tape just as it can write to a disk. LTFS isn’t the first attempt to make tape look like disk; I remember the Backup Exec group at Seagate Software showing me a tape file system in the ’90s.
The difference is that LTFS is standardized, which makes LTFS tapes a standard data storage and exchange medium. Sure, you and I have switched from mailing floppy disks and DVD-R discs to using Dropbox, SugarSync and YouSendIt, but when you need to move many terabytes of data from one place to another, it’s hard to beat the effective bandwidth of a box of tapes.
A box of 20 LTO-5 tapes holding 24TB of data will take roughly 12 hours to get from New York to San Francisco via overnight courier. That works out to an effective transfer rate of 2TB/hr or 4.4Gbps. If we allow another 12 hours to spool the data to tape, which is about how long it would take to move the data from a disk staging area to tape using a 6-drive tape library, the whole trip takes 24 hours and the effective bandwidth is still 2.2Gbps.
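For anyone who wants to check the arithmetic or plug in their own numbers, here is a minimal sketch of the calculation. The function name and the use of decimal terabytes (10^12 bytes) are my own assumptions; the 24TB and 12-hour figures come from the paragraph above.

```python
TB = 10**12  # assumption: decimal terabytes, the way tape capacities are usually quoted


def effective_gbps(data_tb: float, hours: float) -> float:
    """Effective throughput in gigabits per second for moving data_tb terabytes in `hours`."""
    bits = data_tb * TB * 8
    seconds = hours * 3600
    return bits / seconds / 1e9


# 24TB in the box, 12 hours in the courier's hands
courier_only = effective_gbps(24, 12)
# Same box, plus 12 hours to spool the data to tape first
with_spooling = effective_gbps(24, 12 + 12)

print(f"courier only:  {courier_only:.1f} Gbps")   # 4.4 Gbps
print(f"with spooling: {with_spooling:.1f} Gbps")  # 2.2 Gbps
```

Change the capacity or the hours to match your own tapes and courier and the same function gives you the number to compare against your WAN link.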
Even if you were getting 20:1 data reduction through data deduplication and compression, you’d need a link of roughly 110Mbps (2.2Gbps divided by 20) to replicate that amount of data across a network as fast as that small box of tapes delivers it. Twenty-to-one data reduction may be achievable for backup data, but archives don’t have nearly as much duplicate data as backup repositories, since each archive has just one copy of each data object. Archives of rich media, be they check images, photos from insurance claims adjusters’ digital cameras, or medical images, don’t reduce much at all, making that FedEx box even more attractive.
Without LTFS, you’d have to be running the same backup application at both sites to send data via tape, because each backup or archiving application writes data to tape in its own proprietary format.
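The link-speed comparison is easy to sketch too. The function below is a hypothetical helper of mine; the 2.2Gbps figure is the box-of-tapes bandwidth from above, and the 2:1 ratio for rich media is purely an illustrative assumption, not a measured number.

```python
def required_link_mbps(tape_gbps: float, reduction_ratio: float) -> float:
    """Network link speed (Mbps) needed to match a tape shipment's effective
    bandwidth when replication shrinks the data by reduction_ratio:1."""
    return tape_gbps * 1000 / reduction_ratio


TAPE_GBPS = 2.2  # effective bandwidth of the box of tapes, spooling time included

backup_like = required_link_mbps(TAPE_GBPS, 20)  # 20:1, plausible for backup data
rich_media = required_link_mbps(TAPE_GBPS, 2)    # 2:1, an assumed ratio for illustration

print(f"20:1 reduction: {backup_like:.0f} Mbps")  # 110 Mbps
print(f" 2:1 reduction: {rich_media:.0f} Mbps")   # 1100 Mbps
```

The less your data reduces, the fatter the pipe you need to keep up with the courier.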
In addition to providing a standard interchange format, LTFS promises big advantages for keeping data in a standard format over a long period of time. If your archive program stores each object it archives as a native file in an LTFS file system, you’re not dependent on a single vendor for the data mover, indexer, search engine and litigation hold functions. If your archiving vendor discontinues your current product, like EMC did with DiskXtender a few years ago, you can switch to another product and have it index the existing data without having to regurgitate it to disk and ingest it into a new archive. If you have trouble locating data, you could point a Google appliance at the LTFS repository and use Google search to find the relevant data.
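To make the point concrete: once the archive is just native files, any tool that can walk a directory tree can build its own index. Here is a minimal sketch using only the Python standard library. The function names and the `/mnt/ltfs` mount point mentioned afterwards are hypothetical, and a real archive or search product would obviously do far more than this.

```python
import os
from pathlib import Path


def build_index(root):
    """Walk a mounted file system and return one record per file found.

    `root` is wherever the volume is mounted; nothing here is LTFS-specific,
    which is exactly the point of storing objects as native files.
    """
    root = Path(root)
    records = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = Path(dirpath) / name
            info = full.stat()
            records.append({
                "path": str(full.relative_to(root)),
                "size_bytes": info.st_size,
                "modified": info.st_mtime,
            })
    return records


def find_by_name(records, needle):
    """Case-insensitive substring match on the relative path of each record."""
    needle = needle.lower()
    return [r for r in records if needle in r["path"].lower()]
```

With a tape mounted at, say, `/mnt/ltfs`, calling `find_by_name(build_index("/mnt/ltfs"), "claim")` would list every file with "claim" in its path, and no archiving vendor’s software is involved.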
We as customers should start pressuring our archiving vendors to support native LTFS as a repository option. Some vendors will respond that they support LTFS, since they support any NAS storage, but most archiving solutions store their data on disk in proprietary container files. While compressed and single-instanced containers may have made sense on disk, the lower cost per GB of tape makes the flexibility of a standard storage format worth the extra storage space it takes up.