Hierarchical storage management (HSM) gives organizations a way to arrange data storage and retrieval into separate tiers, balancing cost against storage space efficiency. The technique is also occasionally called tiered storage. It works something like a cache, but at a much greater scale: frequently used data is kept on faster disk drives in the upper tier, while rarely used data is archived on slower media in the lower tiers. Files that are used often reside on the first tier and are moved to the lower tiers as they fall out of use.
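The movement of files between tiers can be illustrated with a minimal sketch. It assumes a simple idle-time rule, which is only one of many possible policies; the names FileRecord, demote_idle_files, touch and idle_limits are hypothetical and do not come from any particular HSM product.

```python
from dataclasses import dataclass


@dataclass
class FileRecord:
    name: str
    last_access: float  # seconds since some reference point
    tier: int = 0       # 0 is the fastest tier; higher numbers are slower


def demote_idle_files(records, now, idle_limits):
    """Move each file one tier down once it has been idle too long.

    idle_limits[i] is the assumed idle time (in seconds) tolerated on
    tier i. The lowest tier has no limit, so the list holds one entry
    fewer than the number of tiers.
    """
    moved = []
    for rec in records:
        on_lowest_tier = rec.tier >= len(idle_limits)
        if not on_lowest_tier and now - rec.last_access > idle_limits[rec.tier]:
            rec.tier += 1
            moved.append(rec.name)
    return moved


def touch(rec, now):
    """Record an access and bring the file back to the fastest tier."""
    rec.last_access = now
    rec.tier = 0
```

Running demote_idle_files on a schedule pushes idle files down one tier at a time, while touch models a file returning to the first tier when it is used again.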
The backbone of hierarchical storage management is the software. Detailed logic is required to catalog the data and to watch for frequently used files that should reside on the upper tier. The software is also responsible for managing requests to the library tier, where archived data is held, and for ensuring those requests are completed in a reasonably timely manner.
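As a rough illustration of the cataloging logic, the following sketch counts accesses per file and ranks the files that are candidates for the upper tier. The Catalog class and its methods are hypothetical, and ranking by raw access count is an assumption made for brevity; real HSM software would use more detailed rules.

```python
from collections import Counter


class Catalog:
    """Toy catalog: remembers where each file lives and how often it is read."""

    def __init__(self):
        self.hits = Counter()   # file name -> number of recorded accesses
        self.location = {}      # file name -> tier label

    def register(self, name, tier):
        self.location[name] = tier

    def record_access(self, name):
        self.hits[name] += 1

    def hot_files(self, capacity):
        """Return up to `capacity` file names, most frequently accessed first.

        Ties are broken alphabetically so the result is deterministic.
        """
        ranked = sorted(self.location, key=lambda n: (-self.hits[n], n))
        return ranked[:capacity]
```

Here capacity stands in for how many files the upper tier can hold; anything outside the returned list would be a candidate for the lower tiers.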
Hierarchical storage management software implementations differ in the features they offer. Some allow for a backup tier, where all of the data, whether frequently accessed or archived, is also sent to additional long-term storage media. Other features can include integration with the computer systems that use the HSM. Here, data is pulled from other servers or workstations on the network to the primary HSM and then organized down to the disk tier or storage tier, or into a full backup.
Hierarchical storage management implementations may vary by use case as well. In some situations, a portion of a large file will sit on a high-speed disk and be linked to the remainder of the file on the storage media. As a user request comes in, the first part of the file is read from disk, while the remainder is retrieved from the storage media. This technique is often used in large media streaming implementations, such as Internet video.
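The split-file approach can be sketched as a generator that serves the part of the file held on fast disk first and then continues with the remainder. This is a simplified illustration: head, open_tail and chunk_size are hypothetical names, and open_tail stands in for whatever mechanism recalls the rest of the file from the slower storage media.

```python
import io


def stream_file(head, open_tail, chunk_size=4):
    """Yield the file in chunks: the head from fast disk, then the tail.

    head      -- bytes of the first part of the file, already on fast disk
    open_tail -- hypothetical callable returning a readable binary stream
                 for the remainder held on slower media
    """
    # Serve the locally held first part immediately.
    for start in range(0, len(head), chunk_size):
        yield head[start:start + chunk_size]

    # Only now touch the slow tier. A real system would likely start this
    # recall in the background while the head is still being served.
    tail = open_tail()
    try:
        while True:
            chunk = tail.read(chunk_size)
            if not chunk:
                break
            yield chunk
    finally:
        tail.close()


# Usage: an in-memory stream stands in for the slow storage tier.
data = b"".join(stream_file(b"hello ", lambda: io.BytesIO(b"world")))
```

The user starts receiving data from the head right away, which is what makes the arrangement attractive for streaming.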
A few drawbacks to hierarchical storage management also exist. Most notable is the time it takes to retrieve less often used data from the storage tier. In the case of many small files, for example, it can take hours or even days for the robotics to pull together a request that may be spread across multiple disks in the jukebox. In these cases, systems administrators typically recommend that users bundle large quantities of smaller files into single archive files. The storage tier then only has to look for a single file, typically stored on a single piece of media in the library.
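Bundling many small files into one archive before they are migrated can be done with standard tools. The sketch below uses the tarfile module from the Python standard library; the function name bundle_directory and the choice of a gzip-compressed tar file are assumptions made for illustration, not a requirement of any HSM system.

```python
import tarfile
from pathlib import Path


def bundle_directory(src_dir, archive_path):
    """Pack every regular file under src_dir into one .tar.gz archive.

    archive_path should point outside src_dir, otherwise the archive
    could end up trying to include itself.
    """
    src = Path(src_dir)
    with tarfile.open(archive_path, "w:gz") as tar:
        for path in sorted(src.rglob("*")):
            if path.is_file():
                # Store paths relative to src_dir so the archive unpacks cleanly.
                tar.add(path, arcname=path.relative_to(src).as_posix())
    return archive_path
```

The storage tier then sees one large object instead of thousands of small ones, so a later recall involves a single lookup.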