Much of modern storage management focuses on efficient use and allocation of storage capacity. “Thin provisioning” is a primary mechanism for this, allocating just enough space to match each consumer’s current needs while promising more capacity when needed. Thin provisioning is an effective tool for allocating storage capacity. But when latency, bandwidth, and IOPS matter more than capacity efficiency, thin provisioning makes performance nondeterministic. That is, without knowing in advance which chunk of allocated storage will be accessed, or whether an access spans more than one such chunk, there is no way to predict how long that access will take.
Many servers are deployed with far more capacity than they will ever need, whether that capacity is direct-attached physical storage or virtual storage, local, on a share, or on a SAN. Since storage is expensive to purchase, manage, expand, and run, it often makes sense to promise physical or virtual servers more space than is currently allocated to them. While some servers will eventually need everything they originally requested, many will never use that much space, so, statistically, it is usually safe for a storage system to offer more capacity than is physically available. That is the basis of thin provisioning.
The Wikipedia article on Thin provisioning observes, “This allows flexibility in growth of storage volumes, without having to predict accurately how much a volume will grow. Instead, block growth becomes sequential. Physical storage capacity on the array is only dedicated when data is actually written by the application, not when the storage volume is initially allocated. The servers, and by extension the applications that reside on them, view a full size volume from the storage but the storage itself only allocates the blocks of data when they are written.”
As noted earlier, there is a cost to this convenience and efficiency; there is a cost when storage is provisioned “without having to predict accurately how much a volume will grow.” Consider an array (or multiple arrays) of disks making up a storage pool from which virtual disks are thinly provisioned. The various servers using that storage make requests that expand their physical allocations at various times. As a result, a single virtual volume in use by a server is not allocated contiguously on the physical disks, and later blocks may not even reside on the same physical disks as earlier blocks. Even for a long sequential access by an application on that server, latency and bandwidth are therefore nondeterministic. Even with performance measurement, it becomes quite difficult to predict the latency of any particular pattern of I/O.
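The effect of that scattered allocation can be sketched with a toy model. This is illustrative only, with assumed numbers (the chunk size, seek cost, and transfer time are made up, not measurements): it compares a sequential read over a contiguously allocated volume with the same read over a thin volume whose chunks landed at random positions in the pool.

```python
import random

# Toy model (assumed figures, not measurements): a spinning-disk pool divided
# into fixed-size chunks. A thinly provisioned volume's chunks may land
# anywhere in the pool; a sequential read that crosses a chunk boundary pays
# a seek whose cost grows with the distance between chunks.

POOL_CHUNKS = 10_000     # chunks in the backing pool (assumed)
SEEK_PER_CHUNK_US = 2    # assumed seek cost per unit of distance, microseconds
TRANSFER_US = 100        # assumed transfer time per chunk, microseconds

def read_latency_us(layout):
    """Total latency to read a volume's chunks in logical order."""
    total, pos = 0, layout[0]
    for chunk in layout:
        total += abs(chunk - pos) * SEEK_PER_CHUNK_US + TRANSFER_US
        pos = chunk
    return total

contiguous = list(range(100))                        # thick: one contiguous extent
fragmented = random.sample(range(POOL_CHUNKS), 100)  # thin: scattered over the pool

print("contiguous:", read_latency_us(contiguous), "us")
print("fragmented:", read_latency_us(fragmented), "us")
```

Because the fragmented layout is random, the fragmented result varies from run to run, which is exactly the nondeterminism described above: the same logical read costs a different amount depending on where the pool happened to place each chunk.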
What are the alternatives? The first is for application managers and storage managers to work together to accurately predict the storage requirements of performance-sensitive volumes and ensure they are fully provisioned from the outset. Full provisioning is often called “thick provisioning” or “fat provisioning”. Such storage can then be fully allocated on contiguous storage with known, deterministic performance.
The second choice is to locate such volumes on storage composed of arrays of SSDs. Solid-state drives are not mechanical devices with seek latencies measured in milliseconds. On an SSD (or an SSD array), a seek to a distant block, allocated later to a thinly provisioned volume, costs about the same as a seek to the “next” block of storage. On an SSD array, it does not matter whether capacity is allocated contiguously: latency, bandwidth, and IOPS (Input/Output Operations per Second) are predictable and deterministic across the entire array.
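In the same spirit as the spinning-disk sketch, an SSD can be modeled with a flat per-access cost (the 80-microsecond figure is an assumption for illustration, not a benchmark): total read time depends only on how many chunks are read, not on where the pool placed them.

```python
# Sketch under assumed figures: flash has no positional seek penalty, so the
# cost of reading a thin volume's chunks is independent of their placement.
SSD_ACCESS_US = 80  # assumed per-chunk flash access latency, microseconds

def ssd_read_latency_us(layout):
    # Distance between successive chunks is irrelevant on flash.
    return len(layout) * SSD_ACCESS_US

# A contiguous layout and a widely scattered one of the same length cost the same.
print(ssd_read_latency_us([0, 1, 2]) == ssd_read_latency_us([0, 5000, 9999]))
```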
The best choice might be to take both steps for performance-sensitive applications: fully provision the requirement on all-SSD storage. Since SSD storage is still more expensive than spinning disks, that step would require cost justification in some organizations. And accurately determining actual storage capacity requirements probably takes some heavy lifting in most groups, as well as teamwork from departments that do not always collaborate. The result, though, is worthwhile for some applications: low latency and high IOPS that are predictable across the volume.
One more nondeterministic aspect of thinly provisioned storage appears when a write requires more space than is currently allocated, forcing the storage subsystem and its underlying layers to allocate more storage to the volume. This can make the latency of that one I/O tens or hundreds of times the typical latency.
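That tail-latency effect can also be sketched with assumed numbers (the 200-microsecond write, the 20-millisecond allocation stall, and the 1% allocation rate are all illustrative assumptions, not measurements): most writes are fast, but the rare write that triggers an allocation is dramatically slower.

```python
import random

# Illustrative sketch (assumed figures): most writes land on already-allocated
# blocks, but a write that forces the pool to allocate a new chunk stalls
# while the storage subsystem extends the volume.
WRITE_US = 200              # assumed latency of a write to allocated space
ALLOCATE_STALL_US = 20_000  # assumed extra cost when a new chunk must be allocated
ALLOC_PROBABILITY = 0.01    # assumed fraction of writes that trigger allocation

def write_latency_us(triggers_allocation):
    return WRITE_US + (ALLOCATE_STALL_US if triggers_allocation else 0)

latencies = [write_latency_us(random.random() < ALLOC_PROBABILITY)
             for _ in range(10_000)]
print("typical:", min(latencies), "us, worst:", max(latencies), "us")
```

Under these assumptions the allocating write is roughly 100 times slower than the typical one, which is why a latency histogram of a thin volume shows a long tail rather than a single tight cluster.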
Such events are, one hopes, rare, but applications that must remain responsive in all situations have to account for them.