From magnetic to solid state, spin-free: What a long, strange storage trip it’s turning out to be

Flash-memory-based solid-state drives have recently stirred up the staid storage industry, and their initial success stories foretell a potentially stellar future. Consider, for example, how rapidly they’ve taken over the formerly robust market for 1.8-in. hard-disk drives. Also consider their significant influence on smaller-form-factor hard-disk drives’ lackluster initial unveilings. A notable percentage of netbook, tablet, and other alternative mobile computers, especially those running Linux operating-system variants, contain solid-state drives instead of hard-disk drives. Thin and light conventional notebook PCs running Windows and OS X are also well along the conversion path.

Enterprise-computing applications might at first glance seem to be poor candidates for solid-state technology, given its substantially higher cost than the magnetic alternative at the high capacities that this market segment requires. Yet, by virtue of its low energy consumption, increased reliability, and ultrafast read rates, the technology is making notable progress in conquering the corporate world. Consider that, to maximize hard drives’ performance, IT departments have long formatted only the platters’ fastest-access portion, wasting the rest of the drive and thereby blunting the argument that hard drives cost less than their nonmagnetic counterparts. Consider, too, that a number of applications, including smartphones, PDAs (personal digital assistants), digital still cameras, and videocameras, that might have formerly gone with—and, in some cases, in initial product generations, did go with—hard drives have migrated en masse to solid-state storage.

Proponents of both approaches dispute the extent of solid-state technology’s potential to obsolete its predecessor, and the industry has yet to determine a winner. These debates illustrate a number of fundamental misrepresentations of solid-state drives’ strengths and shortcomings. EDN readers’ feedback to past editorial coverage reveals similar misunderstanding (Reference 1). This article attempts to clear up at least some of that confusion.

Cost, power consumption, and other all-important comparative factors aside, the differences between hard-disk and solid-state drives boil down to a few fundamental points. First, solid-state drives, especially those in applications with random-access patterns, read data substantially faster than hard drives can, assuming the absence of any storage-to-system-interface bottlenecks. In contrast, solid-state drives, especially those in applications with random-access patterns, write data much more slowly than hard drives do. Also, once the solid-state drive has depleted its inventory of spare capacity, block-erase delays become a larger percentage of the total write latency. Further, unlike a hard-disk drive, flash memory is not fully bit-alterable. Although flash memory can change ones to zeros on a bit-by-bit basis, converting even a single zero back into a one requires erasing the entire block containing that bit.
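To make that asymmetry concrete, here is a minimal sketch in Python of flash-memory write semantics, using an assumed 512-kbyte block size; the FlashBlock class and its methods are illustrative inventions, not any vendor's interface:

```python
# Minimal model of flash write semantics (illustrative only): programming
# can only clear bits (1 -> 0); restoring any 0 back to a 1 requires
# erasing the entire block to all ones.

BLOCK_BYTES = 512 * 1024  # an assumed, typical NAND erase-block size

class FlashBlock:
    def __init__(self):
        self.data = bytearray(b"\xff" * BLOCK_BYTES)  # erased state: all ones

    def program(self, offset, new_bytes):
        for i, b in enumerate(new_bytes):
            old = self.data[offset + i]
            if b & ~old & 0xFF:
                raise ValueError("cannot set a 0 back to 1 without an erase")
            self.data[offset + i] = old & b  # AND: only 1 -> 0 transitions

    def erase(self):
        self.data = bytearray(b"\xff" * BLOCK_BYTES)  # whole block at once

block = FlashBlock()
block.program(0, b"\x0f")    # fine: only clears bits
# block.program(0, b"\xf0")  # would raise: needs a block erase first
```

The AND in program() captures the physics: programming can only add electrons to the floating gate, so restoring even one bit to a one drags the block's entire contents through an erase.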

Another difference is that flash memory eventually “wears out” after extended erase cycles. However, it tends to do so on a block-by-block basis and in a predictable manner that the media controller can easily detect far in advance and compensate for in a variety of ways. Many hard-drive failures, in contrast, are abrupt and systemic. Further, because solid-state drives are semiconductor-based, they are notably more immune to the effects of abrupt shock, sustained vibration, and other types of jarring. They’re also comparatively impervious to environmental interference, such as from magnetic fields.

Most of today’s flash memory uses a conceptually common floating-gate cell structure (Figure 1). Fast-random-read-access NOR and slower—albeit less expensive on a per-bit basis—NAND technologies differ predominantly in their cell-to-cell interconnect schemes. In its default erased state, the transistor turns on—that is, outputs a one—when the memory device’s integrated address-decoding circuitry activates the necessary array row and column lines to select it. With the transistor programmed by means of additional electrons stored on its floating gate, it cannot turn on when address inputs select it and therefore outputs a zero to read attempts.


Figure 1: Most single-transistor flash-memory cells operate in a conceptually similar fashion (a) regardless of whether they interconnect in a NOR (b) or a NAND (c) scheme. Bit-by-bit or more efficient page-by-page programming places incremental charge on the floating gate (d), thereby counteracting an applied turn-on voltage during subsequent reads. Block-by-block erasure (e) removes this charge surplus (courtesy of the Wikimedia Foundation).

Altering the stored value of a flash-memory transistor involves the application of higher-than-normal voltages to various transistor junctions, thereby creating the necessary electric fields to affect electron flow onto or off the floating gate. Initial flash-memory generations required off-chip generation of these voltages; nowadays, most devices employ on-die high-voltage pumps for this function. Theoretically, you could alter a transistor’s value—both for one-to-zero programming and for zero-to-one erasure—on a bit-by-bit basis, as is the case with EEPROMs (electrically erasable programmable read-only memories), FRAMs (ferroelectric random-access memories), MRAMs (magnetic RAMs), and battery-backed SRAMs (static RAMs). The necessary signal-routing, isolation, and other circuitry, however, would use too much die area and would therefore be too expensive at the IC capacities that bulk-storage applications require.

With modern NAND flash memory, on the other hand, you can erase blocks only in approximately 512-kbyte increments. Bit-by-bit programming is possible; from an efficiency standpoint, however, solid-state drives and the NAND chips within them prefer to write data in approximately 4-kbyte chunks. This preference reflects the cost-versus-performance sizing of the RAM buffers on the flash-memory die. The bulk-alteration requirement differentiates flash memory not only from other nonvolatile semiconductor-storage technologies but also from hard-disk drives. The repeated electron flow across the thin oxide layer between the flash-memory transistor's substrate and floating gate stresses that oxide a bit more with each incremental program and erase cycle. At first, electrons inadvertently become trapped in the oxide lattice, impeding the flow of other electrons and slowing subsequent program and erase operations. Eventually, the oxide breaks down, by rupturing, for example, leading to fundamental transistor failure. This demise tends to disrupt the function of the entire erase block containing the affected transistor.
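A quick back-of-the-envelope model, using the approximate page and block sizes above, shows why this bulk-alteration requirement matters for write performance:

```python
# Rough cost model (illustrative) of updating NAND data "in place,"
# using the approximate sizes the article cites.

PAGE_BYTES = 4 * 1024      # typical NAND program page
BLOCK_BYTES = 512 * 1024   # typical NAND erase block

pages_per_block = BLOCK_BYTES // PAGE_BYTES  # 128 pages

# Naive in-place update of a single page: read all 128 pages out, erase
# the block, then reprogram all 128 pages -- a 128x write amplification.
write_amplification = pages_per_block
print(pages_per_block, write_amplification)  # 128 128
```

Avoiding that naive 128-to-1 penalty is precisely the job of the controller techniques this article describes later.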

Modern flash memories come in both SLC (single-level-cell) and MLC (multilevel-cell) variants; today’s MLC variants are primarily 2-bit-per-cell devices (Figure 2). With SLC flash memories, the voltage-sensing circuitry that connects to the array transistors’ outputs can be relatively simple because it needs to discern only one voltage threshold and because the transistor’s one and zero output voltages have substantial margin to this threshold. However, with a 2-bit-per-cell MLC flash memory, three voltage thresholds—that is, four levels—require discernment during reads. The programming operation for placing the necessary amount of electron charge onto the transistor’s floating gate is similarly precise, and the effects of supply-voltage and operating-temperature variation and cycling further complicate this operation. Spansion claims that its MirrorBit MLC technology approach somewhat reduces the need for precise electron placement. Nonetheless, the concept remains largely relevant.
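The following sketch illustrates the multilevel-read idea; the reference voltages and the Gray-coded state assignment are assumptions for illustration only, not any vendor's specifications:

```python
# Illustrative 2-bit-per-cell read: the sense circuitry compares a cell's
# threshold voltage against three reference levels to recover one of four
# states. The voltages here are invented for illustration.

THRESHOLDS = (1.0, 2.0, 3.0)            # three references -> four levels
GRAY_STATES = ("11", "10", "00", "01")  # assumed mapping; erased reads as 11

def sense(cell_vt):
    level = sum(cell_vt > t for t in THRESHOLDS)  # count thresholds exceeded
    return GRAY_STATES[level]

print(sense(0.4))  # "11" (erased cell)
print(sense(2.5))  # "00" (between the second and third references)
```

The margin problem is visible in the numbers: an SLC device splits the same threshold-voltage envelope into only two regions, so each MLC state gets roughly a third of the margin that an SLC state enjoys, before accounting for temperature, supply-voltage, and cycling drift.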


Figure 2: Whereas conventional 1-bit-per-cell flash memory has plenty of margin within the threshold-voltage envelope between a sensed zero and a sensed one (a), 2-, 3-, and 4-bit-per-cell technologies are more challenging to reliably implement across supply-voltage, temperature, and erase-cycling ranges (b) (courtesy Micron Technology).

It’s probably no surprise that MLC reads and writes—that is, program operations—are substantially slower than their SLC counterparts and that the maximum block-cycling specifications for MLC memories are on average an order of magnitude lower than those of SLC chips. These fundamental trade-offs are necessary for obtaining MLC storage devices’ lower per-bit cost (Table 1).


Now consider Intel and Micron Technology’s new 3-bit-per-cell memories, along with the 4-bit-per-cell X4 devices that Sandisk recently began producing (Reference 2). With these new devices, the difference between any two sequential voltage levels and, hence, decoded-bit combinations is on the order of 100 electrons, or even fewer in some cases. This situation represents a profound challenge for semiconductor-process and -product engineers. By potentially hampering both performance and data dependability, it calls into question the chips’ suitability for applications requiring highly reliable storage. Then again, not too long ago, folks were saying the same thing about 2-bit-per-cell MLC flash memory.


Controller choices

The media controller may be a hardware-centric device, a software-fueled CPU, or any combination thereof. The hardware-versus-software choice of a controller involves trading off cost, performance, and power consumption versus flexibility and the ability to upgrade. Whatever its composition, the media controller acts as a bridge between the flash memories and the conventional hardware and software interfaces that the CPU, core-logic chip set, and other subsystems expect. The controller also manages the data stored in the single-component or multicomponent merged flash-memory array to avoid “hot-spot” overcycling of any erase blocks in the array, ideally as a background function that is invisible to the host both in access time and in any other regard. The controller leverages flash memory’s strengths and mitigates its read- and write-speed weaknesses. The result is, with any luck, at least on par with—and, ideally, much faster than—the hard-drive alternative.
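As a concrete illustration of the hot-spot-avoidance task, consider this minimal wear-leveling sketch, which always allocates the least-cycled free block; the WearLeveler class is a hypothetical construct, not any shipping controller's firmware:

```python
# Sketch of the wear-leveling idea: track per-block erase counts and
# steer new writes toward the least-cycled free block so no single
# erase block becomes a "hot spot."

import heapq

class WearLeveler:
    def __init__(self, num_blocks):
        # min-heap of (erase_count, block_id) tuples for free blocks
        self.free = [(0, b) for b in range(num_blocks)]
        heapq.heapify(self.free)

    def allocate(self):
        erases, block = heapq.heappop(self.free)  # least-worn free block
        return block, erases

    def retire(self, block, erases):
        # block has been erased once more; return it to the free pool
        heapq.heappush(self.free, (erases + 1, block))

wl = WearLeveler(1024)
blk, count = wl.allocate()
wl.retire(blk, count)
```

A real controller layers much more on top, such as periodically relocating static data out of barely cycled blocks, but the least-worn-first principle is the core of the idea.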

One perhaps obvious way of boosting effective solid-state-drive performance at the expense of incurring higher power consumption is to access multiple components in parallel using several address-, data-, and control-line channels between the controller and the flash memories. You can then not only simultaneously read, program, or erase multiple array elements, but also juggle multiple operations with different ICs. For example, you could read from one while writing or erasing another if the system’s access profiles justify this added level of controller complexity. In choosing a multichannel scheme, however, you also multiply the granularity of the solid-state drive’s capacity and the effective sizes of program pages and erase blocks. This situation might warrant the choice of a flexible controller design that can run in either single-channel or multichannel mode.
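A sketch of the striping arithmetic, assuming four channels and 4-kbyte pages, shows both the parallelism and the granularity multiplication:

```python
# Sketch of multichannel striping (parameters assumed for illustration):
# consecutive logical pages rotate across channels so the controller can
# keep several flash dies busy at once.

CHANNELS = 4
PAGE_BYTES = 4 * 1024

def locate(logical_page):
    channel = logical_page % CHANNELS         # which bus/die services it
    page_on_channel = logical_page // CHANNELS
    return channel, page_on_channel

# Pages 0..3 land on channels 0..3 and can transfer in parallel. Note
# that the effective stripe (CHANNELS * PAGE_BYTES = 16 kbytes here)
# also multiplies the drive's natural write and erase granularity.
for lp in range(6):
    print(lp, locate(lp))
```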

Modern flash memories exhibit significant disparities between program-page and erase-block sizes and between program and erase times. The controller should, therefore, manage the media in such a way that background-erase operations for wear-leveling purposes—which manufacturers also commonly call housekeeping, garbage collection, and merging—on a component, or on a block within that component, don’t collide with the foreground programming operations that system write requests initiate on that same component or block or, for that matter, with foreground system-read requests. Embedding a large RAM cache on the solid-state drive, much like the buffers on modern hard drives, can also effectively complement system-side buffering in mitigating any perceived performance decrease that these housekeeping tasks incur. The trade-off of this approach, however, is that it requires more parts.
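The following simplified sketch captures the garbage-collection decision; the data structures, and the deferral test that keeps housekeeping from colliding with foreground accesses, are illustrative assumptions:

```python
# Sketch of background "housekeeping" (garbage collection): reclaim the
# block with the most invalid pages, relocating its still-valid pages
# first, and defer entirely if the target component is busy servicing a
# foreground host request.

def relocate(valid_pages):
    pass  # copy still-valid pages to fresh blocks (stub)

def erase(block_id):
    pass  # issue the block-erase command (stub)

def collect(blocks, busy_components):
    # blocks: list of dicts like
    #   {"id": 3, "component": 0, "valid": [...], "invalid": 97}
    candidates = [b for b in blocks if b["component"] not in busy_components]
    if not candidates:
        return None  # defer rather than collide with foreground I/O
    victim = max(candidates, key=lambda b: b["invalid"])
    relocate(victim["valid"])  # live data moves out first...
    erase(victim["id"])        # ...then the whole erase block is reclaimed
    return victim["id"]
```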

Reads and writes were traditionally the only required storage functions because, unlike flash memory, hard-drive media is fully overwritable on a bit-by-bit basis. A file-deletion request causes the file system to update its internal tables accordingly, but the file system historically didn’t pass that information to the drive. Hence, the solid-state-drive controller is unaware that the deleted data is now invalid and that it could therefore do background cleanup to free the pages and blocks containing that data for future writes. The necessary erase and program operations instead occur only after the file system requests an explicit overwrite of the LBAs (logical-block addresses) associated with the drive’s now-invalid PBAs (physical-block addresses). These operations then unfortunately take place in the foreground, where they adversely affect perceived read and write speed.

Manufacturers typically ship solid-state drives from the factory with spare “fresh” capacity, which is invisible to the operating system. The controller uses this capacity to delay the inevitable onset of the noted performance-sapping scenarios. However, good news is on the way in the form of the “trim” command, which the T13 Technical Committee of INCITS (the International Committee for Information Technology Standards) is now standardizing as part of the ATA (advanced-technology-attachment) command set. At press time, the T10 Technical Committee had not yet revealed its plans for the SCSI (small-computer-system-interface) command set. Before a system uses the trim command, it interrogates the drive to determine its rotation speed. If the system encounters a 0-rpm response, it assumes that it is dealing with a solid-state drive and issues further queries to determine whether trim support exists, along with other relevant parameters. The trim command informs the drive that pages stored within the array are no longer valid and are therefore candidates for housekeeping. Deleting a file within a trim-cognizant operating system results in the system’s sending the corresponding LBA information to the drive’s controller.
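In rough terms, and with hypothetical data structures, a trim-cognizant controller's bookkeeping might look like this:

```python
# Sketch of trim handling (hypothetical structure): the host reports
# deleted LBA ranges, and the controller marks the corresponding
# physical pages invalid so background housekeeping can reclaim them
# long before any overwrite of those LBAs arrives.

lba_to_pba = {}       # logical-block address -> physical page
invalid_pages = set() # garbage-collection candidates

def trim(lba_start, lba_count):
    for lba in range(lba_start, lba_start + lba_count):
        pba = lba_to_pba.pop(lba, None)
        if pba is not None:
            invalid_pages.add(pba)  # now reclaimable in the background

lba_to_pba[100] = 7342
trim(100, 1)  # page 7342 becomes background-reclaimable immediately
```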

Although the trim command can dramatically improve sustained solid-state-drive performance in applications involving many file deletions, it’s ineffective in cases in which files are updated in place, such as when you open a document for editing and then save the updated version, or with Microsoft Outlook’s PST database format. More generally, it exposes the weakness inherent in a strong linkage between LBAs and PBAs in FFSs (flash file systems). Sandisk’s venerable FFS, along with the FTL (flash-translation-layer) technology the company obtained when it acquired M-Systems in mid-2006, strives to provide PBA independence for frequently updated files, such as the Windows Registry and the FAT (file-allocation table).

The company unveiled ExtremeFFS, the FFS’s successor, at January’s CES (Consumer Electronics Show); it both implements the technology in its own products and makes it available for licensing. ExtremeFFS further severs the explicit LBA-to-PBA linkage, thereby, Sandisk claims, boosting random-write speeds by a factor of as much as 100. ExtremeFFS makes less efficient use of the flash-memory media to accomplish this objective; Sandisk declines to provide specifics. Given the burgeoning capacities available with lithographies such as Intel and Micron’s latest 34-nm process, however, the incremental ExtremeFFS overhead will become less of a practical issue over time.
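The remap-on-update idea underlying such FFS and FTL approaches reduces to redirecting the logical-to-physical map on every write; the following sketch illustrates the general technique, not Sandisk's implementation:

```python
# Sketch of remap-on-update: an updated LBA is written to a fresh page
# and the map is redirected, so no in-place overwrite -- and no
# foreground block erase -- is ever needed.

lba_map = {}   # LBA -> physical page number
pages = {}     # physical page -> data (simulated medium)
stale = set()  # invalid pages awaiting background reclamation
next_free = 0

def write(lba, data):
    global next_free
    old = lba_map.get(lba)
    if old is not None:
        stale.add(old)       # the old copy becomes housekeeping work
    pages[next_free] = data  # new data always lands on a fresh page
    lba_map[lba] = next_free
    next_free += 1

write(42, b"v1")
write(42, b"v2")  # LBA 42 now points at a new page; the old one is stale
```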


System optimizations

Solid-state units are currently shoehorning themselves into legacy hard-drive designs to leverage that technology’s huge market, thereby jump-starting the solid-state-storage ramp-up (Figure 3). But as solid-state drives become more prevalent and presumably, therefore, a from-the-start implementation choice, a carefully crafted software/hardware implementation can optimally benefit from flash memory’s unique capabilities. Consider, for example, common file-system operations, such as periodic automatic disk defragmentation, prefetching, file-location optimization, and system-side caching. These features all aim to compensate for hard drives’ head-relocation and platter-rotation latencies, which cause slow random read accesses. Neither of these latencies is a factor with solid-state drives. Eliminating such workarounds can consequently improve system cost, power consumption, and other key variables and can reduce flash-media cycling.


Figure 3: Most of today’s solid-state drives, such as Intel’s X25 units, use conventional storage interfaces and will benefit from those interfaces’ evolutionary performance improvements (a). More revolutionary approaches migrate to alternative system interfaces with closer proximity to the CPU, such as Fusion-io’s approach with PCIe (b) and Spansion’s approach with DRAM (c).

Microsoft’s latest Windows 7 operating system makes such adjustments when it detects a solid-state drive’s presence in a system, using the same scheme it uses for assessing trim support (Reference 2). Trim cognizance extends beyond simple file-deletion operations to encompass the full range of related functions, such as partition formats and system snapshots. Avoiding random-location writes whenever possible to boost performance can benefit both hard-disk- and solid-state-drive technologies. Integrated file-compression support for flash-memory-housed data can reduce the per-bit cost gap between the technologies. Windows 7 and its peers are also more scrupulous about, for example, ensuring that partition- and file-location endpoints align with, rather than straddle, flash memory’s write-page boundaries. And solid-state drives may provide an opportunity to reduce the system-DRAM budget below what it was in the hard-drive-centric past, thanks to solid-state storage’s fast access, at least for reads, when it handles virtual-memory paging.
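The alignment point reduces to simple modular arithmetic; this check assumes 4-kbyte write pages and uses two common partition offsets as examples:

```python
# Sketch of a page-alignment check (4-kbyte pages assumed): a partition
# that starts mid-page forces every page-sized write to straddle two
# physical pages, doubling the program work.

PAGE_BYTES = 4 * 1024

def aligned(partition_offset_bytes):
    return partition_offset_bytes % PAGE_BYTES == 0

print(aligned(63 * 512))     # False: legacy 63-sector offset, misaligned
print(aligned(1024 * 1024))  # True: a 1-Mbyte offset, aligned
```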

System-hardware interfaces provide another opportunity for optimization. Except perhaps with extremely high-rotations-per-minute, enterprise-tailored units, hard drives tap the bandwidth capability of modern storage interfaces, such as 3-Gbps SATA (serial ATA) and SAS (serial-attached SCSI), only when doing transfers to and from the drive’s RAM buffer. Solid-state drives, conversely, can make more meaningful use of the performance potential of SATA and SAS, and the two technologies’ performance gap will only increase in the upcoming 6-Gbps serial-storage-interface generation (Reference 3 and Reference 4). Similar disparities are likely with the upcoming 4.8-Gbps USB (Universal Serial Bus) Version 3 and with Intel’s embryonic Light Peak optical-interface technology.
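For a sense of the numbers, SATA and SAS use 8b/10b encoding, in which 10 line bits carry eight data bits, so payload bandwidth works out to roughly one-tenth of the line rate in bytes per second:

```python
# Worked numbers for the serial-interface generations the article
# mentions, accounting for 8b/10b encoding overhead.

def payload_mbytes_per_s(line_rate_gbps):
    # 10 line bits carry 8 data bits; divide by 8 to get bytes
    return line_rate_gbps * 1e9 * 0.8 / 8 / 1e6

print(payload_mbytes_per_s(3.0))  # 300.0 -- today's 3-Gbps SATA/SAS
print(payload_mbytes_per_s(6.0))  # 600.0 -- the upcoming 6-Gbps generation
```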

But why restrict yourself to a legacy storage interface at all? Companies such as Fusion-io have figured out that PCIe (Peripheral Component Interconnect Express)-based add-in cards can boost performance by moving the solid-state drive closer to the CPU with which it’s interacting. Giving the flash memory, either on a module or directly attached to the system board, a dedicated interface to the chip set affords an even closer linkage. Intel uses this approach with its Turbo Memory cache, for example. The approach incurs a trade-off, however, in that it makes it more difficult for the end user to later alter the system-memory allocation. Alternatively, you can use the DRAM bus, as Intel’s 28F016XD flash memory attempted to do in the mid-1990s and as Spansion’s EcoRAM does today. In such a configuration, you might even be able to dispense with a dedicated flash-memory-controller chip, instead employing software running on the host CPU or circuitry within the chip set’s logic.
Author Information
By Brian Dipert, Senior Technical Editor, EDN
You can reach Senior Technical Editor Brian Dipert at 1-916-760-0159, bdipert@edn.com, and www.bdipert.com.

References
1. Dipert, Brian, “Solid-state drives challenge hard disks,” EDN, Nov 13, 2008, pg 25.
2. “Engineering Windows 7,” Microsoft Corp, 2009.
3. Dipert, Brian, “Speedy simplicity: serial-storage interfaces,” EDN, Jan 22, 2004, pg 33.
4. Dipert, Brian, “Interface overkill? Is eSATA necessary for your next system design?” EDN, May 10, 2007, pg 48.
