The Next Big Thing, or Not for Everybody?

Dozens of technologies have started off in high-end, expensive systems and servers only to trickle down to mainstream use. Someday, chipmakers are betting, it’ll happen with 64-bit computing — so far exclusively for enterprise servers and scientific supercomputers, but lately buoyed by increased Windows support and AMD’s announcement of a 64-bit desktop processor.

Could a 64-bit CPU be the next must-have PC feature? It’s possible. But before you bet on these chips, consider that for most users and applications, the deck is still stacked in favor of today’s 32-bit systems.

It sounds like an irresistible sales pitch: a CPU able to process 64 bits of data at one time (in one clock cycle), twice as many as a 32-bit processor. By itself, however, “Not Just 32 Bits — 64 Bits!” may not prove much more compelling an ad slogan than “Not Just 1GHz — 2GHz!” has been.
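As a rough illustration of what that wider data path means, consider the hedged C sketch below (the helper function is ours, purely for illustration): adding two 64-bit numbers is a single register operation for a 64-bit CPU, while a 32-bit CPU has to split the same work into a low-half add and a high-half add that carries.

```c
/* Minimal sketch: the same 64-bit addition, which a 64-bit CPU can
 * finish in one register operation but a 32-bit CPU must split into
 * two 32-bit halves plus a carry. Names here are illustrative only. */
#include <stdint.h>
#include <stdio.h>

/* Roughly what a 32-bit machine does under the hood. */
static uint64_t add64_on_32bit(uint32_t a_lo, uint32_t a_hi,
                               uint32_t b_lo, uint32_t b_hi)
{
    uint32_t lo    = a_lo + b_lo;            /* low-half add            */
    uint32_t carry = (lo < a_lo) ? 1 : 0;    /* did the low half wrap?  */
    uint32_t hi    = a_hi + b_hi + carry;    /* high-half add + carry   */
    return ((uint64_t)hi << 32) | lo;
}

int main(void)
{
    uint64_t a = 0x00000001FFFFFFFFULL, b = 5;
    uint64_t native = a + b;   /* one operation on a 64-bit CPU */
    uint64_t split  = add64_on_32bit((uint32_t)a, (uint32_t)(a >> 32),
                                     (uint32_t)b, (uint32_t)(b >> 32));
    printf("%llu %llu\n", (unsigned long long)native,
                          (unsigned long long)split);
    return 0;
}
```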

There’s no guarantee that a 64-bit chip will run any and all applications twice as fast as a 32-bit chip; in fact, studies have shown that recompiling 32-bit programs with relatively unwieldy 64-bit addresses actually bulks up code size and increases data cache miss rates. The main appeal of 64-bit computing isn’t that it helps existing applications, although it’s already working for large-scale Web hosting and computer-aided design. It’s that it permits entirely different types of applications.
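To see where that bulk comes from, here is a hedged C sketch (type sizes assumed for typical 32-bit ILP32 versus 64-bit LP64 targets): every pointer in a data structure doubles from 4 to 8 bytes when the same source is recompiled for 64 bits, so fewer nodes fit in each cache line and misses become more likely.

```c
/* Sketch of the "fatter pointers" effect: the struct below is built
 * entirely of pointers, so its size roughly doubles when recompiled
 * for a 64-bit target (assuming the usual ILP32 vs. LP64 models). */
#include <stdio.h>

struct list_node {
    struct list_node *next;     /* 4 bytes on ILP32, 8 on LP64 */
    struct list_node *prev;
    void             *payload;
};

int main(void)
{
    /* Typically prints 12 on a 32-bit build and 24 on a 64-bit build,
     * so half as many nodes fit in a 64-byte cache line. */
    printf("sizeof(struct list_node) = %zu bytes\n",
           sizeof(struct list_node));
    return 0;
}
```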

Monumentally Massive Memory

What’s universally known as x86 architecture began with the Intel 8086, introduced in 1978 and the ancestor of the 286, 386, 486, Pentium, and compatible CPUs. The 8086 was a 16-bit processor whose segmented addressing scheme let it reach up to 1MB of system memory, more than PC pioneers imagined would ever be necessary, which is why DOS stopped at 640K, leaving the remaining 384K for BIOS and video information.
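How a 16-bit chip reaches 1MB is simple arithmetic: the 8086 combined a 16-bit segment register with a 16-bit offset to form a 20-bit physical address, and 2^20 bytes is exactly 1MB. A minimal sketch (the function name is ours, not Intel's):

```c
/* Real-mode address arithmetic on the 8086: physical address =
 * segment * 16 + offset, giving a 20-bit (1MB) address space. */
#include <stdint.h>
#include <stdio.h>

static uint32_t physical_address(uint16_t segment, uint16_t offset)
{
    return ((uint32_t)segment << 4) + offset;  /* segment * 16 + offset */
}

int main(void)
{
    /* The highest combination, 0xFFFF:0xFFFF, lands just past 1MB and
     * wrapped back to low memory on the original 8086. */
    printf("0xFFFF:0xFFFF -> 0x%05X\n", physical_address(0xFFFF, 0xFFFF));
    printf("1MB = %u bytes\n", 1u << 20);
    return 0;
}
```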

In 1985, Intel’s 386 processor ushered in the 32-bit address bus used today, allowing theoretical access to up to 4GB of memory. At the time, that too seemed like more than any software could ever want — and today, the very top tier of the desktop market is just hitting the 1GB level.

Even in 2005, market researchers at Dataquest predict, the average PC will ship with 1GB of memory, as DRAM price pressure limits 2GB or 2.5GB systems to the elite workstation segment (although gigabit DRAM chips could cut costs by fitting 2GB on a single DIMM module by then).

But users accessing colossal databases or conducting massive simulations — the kind of modeling involved in forecasting global weather or the stock market — are already bumping into the 4GB ceiling. For these data-warehousing and scientific research applications, 64-bit addressing theoretically offers support for over 17 billion gigabytes (or, in more exotic lingo, almost 17 million terabytes, or 16 exabytes), although today’s platforms stop well short of such unimaginable spans. (Microsoft’s 64-bit Windows can address 64GB of physical and 16TB of virtual memory.)
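Those ceilings are just powers of two, as the back-of-the-envelope sketch below works out (the Windows figures are the limits cited above, not architectural maximums):

```c
/* Back-of-the-envelope address-space arithmetic for the limits
 * mentioned in the text. 2^64 is computed in floating point because
 * it does not fit in a 64-bit integer. */
#include <math.h>
#include <stdio.h>

int main(void)
{
    double GB = pow(2, 30), TB = pow(2, 40), EB = pow(2, 60);

    printf("32-bit addressing: 2^32 = %.0f GB\n", pow(2, 32) / GB);
    printf("64-bit addressing: 2^64 = %.1f billion GB (= %.0f EB)\n",
           pow(2, 64) / GB / 1e9, pow(2, 64) / EB);
    printf("64-bit Windows, virtual:  16 TB = %.0f GB\n", 16 * TB / GB);
    printf("64-bit Windows, physical: 64 GB\n");
    return 0;
}
```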

Today, Unix; Tomorrow, Two Very Different Windows?

Right now, most 64-bit computers use Alpha, HP PA-RISC, IBM Power4, or Sun UltraSparc CPUs and some variant of Unix (such as Sun’s Solaris, IBM’s AIX, or HP-UX). Sun, the current market leader, released its 1.05GHz UltraSparc III processor, which delivers about 15 percent more performance than its 900MHz predecessor, this past summer.

Built on a 0.15-micron process, the UltraSparc III features a 32K instruction and a 64K data Level 1 cache, with an onboard controller to support 1MB, 4MB, or 8MB of external Level 2 cache and address up to 16GB of memory per processor. The company offers the chip in one- and two-way workstations and workgroup servers, but like most 64-bit chips, it’s used mainly in midrange and high-end servers. For example, the Sun Fire 3800 server supports up to eight UltraSparc IIIs and 64GB of memory, while the Sun Fire 6800 scales to 24 CPUs and 192GB. The Sun Fire 15K datacenter server allows a mind-boggling 106 processors and over half a terabyte of RAM.

The most prominent, but still decidedly server- rather than desktop-oriented, alternative to 64-bit Unix solutions is Intel Corp.’s Itanium family, developed in cooperation with HP (which offers a version of HP-UX for it) and introduced last year after a good seven years’ anticipation. This year, Intel went from a market toehold to a foothold with the Itanium 2 (codenamed “McKinley”), which runs at 1GHz and boasts a larger cache: 32K of Level 1, 256K of Level 2, and either 1.5MB or 3MB of onboard Level 3 cache. A successor codenamed “Madison” will move the chip from a 0.18- to a 0.13-micron process and hike the L3 cache to 6MB.

Unrecognizable Under the Hood

Itanium makes a radical (Intel would say necessary and overdue) break with x86 (what Intel calls IA-32) architecture to introduce IA-64 and EPIC, or Explicitly Parallel Instruction Computing, Intel’s spin on VLIW (very long instruction word) program design. EPIC leans on several techniques to improve performance: speculation, which lets the compiler schedule load instructions ahead of branches and stores to hide memory latency; predicated execution, which eliminates branches and their misprediction penalties; and explicit parallelism, which lets the compiler tell the processor which operations can safely be executed simultaneously, boosting performance and scalability.
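Predication is easiest to see in a small sketch. The C below contrasts a branchy absolute-value routine with an equivalent branch-free form; an EPIC compiler performs a similar if-conversion, computing both outcomes under predicate registers and keeping only the one whose predicate turns out true. The C is only an analogy for the idea, not Itanium code.

```c
/* An analogy for predicated execution: the branchy version risks a
 * misprediction penalty; the predicated-style version computes both
 * candidate results and selects one, with no branch at all. */
#include <stdio.h>

static int abs_branchy(int x)
{
    if (x < 0)               /* a conditional branch the CPU must predict */
        return -x;
    return x;
}

static int abs_predicated(int x)
{
    int neg      = -x;
    int take_neg = (x < 0);  /* acts like a predicate bit                 */
    return take_neg ? neg : x;  /* compilers can lower this to a
                                   conditional move or select             */
}

int main(void)
{
    printf("%d %d\n", abs_branchy(-7), abs_predicated(-7));
    return 0;
}
```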

To take advantage of its 64-bit performance and ability to process scads of instructions simultaneously, however, software needs to be written specifically to run on Itanium. Although Intel’s 64-bit chip provides a compatibility mode to run 32-bit code, it’s a major shift (rather like flushing or rebooting the CPU in mid-program) that results in a substantial performance hit. This makes Itanium a bad bet for running a mix of new and legacy (64- and 32-bit) applications.

Although Intel has lined up a number of vendors to write IA-64 software, the selection is very enterprise-oriented. Red Hat, Turbolinux, and Caldera Systems each shipped Itanium-compatible versions of Linux when the chip debuted, and Microsoft offers both Windows Advanced Server, Limited Edition — a sort of prerelease version of the 64-bit Windows .Net Server 2003 — and a 64-bit edition of Windows XP that’s preinstalled on some Itanium workstations. Microsoft has also released a beta-test 64-bit version of its SQL Server 2000 database optimized for Itanium 2.

Among the other applications that support Itanium are such enterprise wares as BEA WebLogic, i2 Supply Chain and Factory Planner, IBM DB2 and WebSphere, Oracle9i Database and Application Server, Reuters’ financial services platforms, SAP R/3 and APO with LiveCache, and SAS v9.0. And while Itanium servers are readily available, workstations are relatively scarce: there are IBM’s IntelliStation Z Pro and HP’s Workstation i2000, but Dell, for example, quietly dropped its Itanium 1 workstation after a few months’ slow sales.

AMD’s Drive for the Desktop

Unlike Intel’s start-from-scratch IA-64, AMD’s x86-64 architecture reaps 64-bit rewards by tweaking the company’s 32-bit CPU designs and code base. By mid-2003, the company plans to offer not only multiprocessing, server-oriented versions of its eighth-generation “Hammer” processor (called Opteron), but also a single-processor “ClawHammer” to be marketed as the next Athlon CPU.

The 0.13-micron ClawHammer will feature a dual-channel DDR memory controller, dual Level 1 caches (reportedly 64K each), and an onboard L2 cache (reportedly 256K). It’s supposed to run 32-bit code 20 to 25 percent faster than today’s Athlon XP (that is, much faster than Intel’s Itanium handles 32-bit code) while also running new 64-bit programs seamlessly, without sacrificing performance in either mode.

While 64-bit mode also requires new software, AMD insists the transition will be far easier than writing code for Itanium. The x86-64 architecture simply extends the venerable x86 instruction set with a “long” mode that identifies code segments as 64- rather than 32-bit, adding 64-bit registers and addressing plus a handful of new instructions while keeping most opcodes and instruction sequences identical. Compared to EPIC’s clean slate, it’s more like the slipstream rewrites that software vendors made to take advantage of the 386 and its 32-bit addressing when that chip succeeded the 286.
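In practice that means the same C source can be built for either mode. Assuming a compiler with both 32- and 64-bit x86 targets (GCC’s -m32 and -m64 switches, for instance; the file name below is ours), only the long and pointer widths change, as this hedged sketch shows:

```c
/* The same source, two targets: under the common LP64 convention a
 * 64-bit x86 build widens longs and pointers, while ints and the
 * code itself stay the same. Build flags shown are GCC's; other
 * compilers use different switches.
 *
 *   gcc -m32 sizes.c -o sizes32    # 32-bit x86 build
 *   gcc -m64 sizes.c -o sizes64    # x86-64 "long mode" build
 */
#include <stdio.h>

int main(void)
{
    printf("int: %zu  long: %zu  pointer: %zu bytes\n",
           sizeof(int), sizeof(long), sizeof(void *));
    /* Typical output: "int: 4  long: 4  pointer: 4" from the 32-bit
     * build, "int: 4  long: 8  pointer: 8" from the 64-bit build. */
    return 0;
}
```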

32, 64, or Both?

Since nearly all existing 64-bit applications are big-iron-oriented, AMD is gambling that multimedia and game developers will create a new set of super-potent titles to take advantage of ClawHammer, while telling buyers and IT managers that x86-64 is the more compatible, future-proof, or painless way to go.

The biggest hurdle was cleared when Microsoft pledged to support Hammer with a 64-bit Windows XP last April, and in August Red Hat promised a version of its Linux Advanced Server with concurrent support for 32- and 64-bit applications.

So two questions remain undecided: one, whether the move to 64 bits will (so to speak) come from the top down or the bottom up; the other, just how quickly or slowly it will reach the mainstream, since 32-bit computing remains amply adequate for most tasks short of monster database jobs, analytic modeling, and scientific specialties. As with most things to do with computing, it will be users’ choice of applications and software that determines the fate of the hardware.
