Sorting Out the Opteron, Athlon 64, and Athlon 64 FX

In October 1999, AMD announced a bold alternative to the proprietary EPIC or Itanium 64-bit processor architecture chosen by Intel: The CPU underdog promised to extend the industry-standard x86 architecture to 64 bits, keeping native compatibility with existing 32-bit software while adding a 64-bit mode for new operating systems and applications to enjoy supercomputing-class memory addressing and performance. The extended architecture would be called x86-64, feature a fast new system bus dubbed Lightning Data Transport, and debut in AMD’s eighth-generation processor, codenamed “Hammer.”

In April 2003, AMD delivered on its promise, albeit with new nomenclature — AMD64 for the architecture, HyperTransport for the bus, and Opteron for the series of CPUs that challenge Intel’s Itanium family in the high-end server and workstation arenas.

In September, AMD completed its evolutionary shift into a new, yet backwards-compatible, realm of 64-bit computing with the Athlon 64 3200+ and Athlon 64 FX-51 processors for desktop computers. The architecture once reserved for servers is now available for enthusiast and mainstream PCs — and soon, at least in Athlon 64 guise, for mobile systems.

Significant platform advancement is always welcome news, but also brings some confusion about buying decisions and where new CPUs fit into the market matrix. The three AMD64 processor families share a common lineage, but have notable differences in design, features, and platform support, as well as the usual segmentation in price and performance levels. So let’s put the Opteron, Athlon 64, and Athlon 64 FX into context.

Core Design

All of the new AMD processors are built with 0.13-micron silicon-on-insulator (SOI) process technology — the move to 0.09-micron or 90-nanometer fabrication is expected sometime in 2004 — and feature 128K of Level 1 and 1MB of Level 2 on-chip cache. (AMD watchers have speculated about Athlon 64s with a smaller 512K L2 cache for the value and mobile market segments, but none have appeared yet.)

Compared to the seventh-generation Athlon and Athlon XP, each member of the Hammer clan features enhancements such as two more pipeline stages, enhanced branch-prediction algorithms, and support for the SSE2 streaming multimedia instructions that debuted in Intel’s Pentium 4 — as well as a memory controller integrated into the CPU instead of located in the Northbridge part of the system chipset.

The Opteron incorporates a robust dual-channel memory controller — a memory path that’s doubled from 64 to 128 bits. It’s not technically a true dual-channel design like that of Nvidia’s nForce2 chipset, with simultaneous data transfers along two distinct paths, but has the same effect of doubling the data pathways and creating a true 128-bit connection (plus a 16-bit error-correction code or ECC link) to system memory.

Offered in a 940-pin package, the Opteron is available at core speeds ranging from 1.4GHz to 2.0GHz in each of three configurations — the 100 Series for uniprocessor workstations, 200 Series for 2-way or dual-processor workstations and servers, and 800 Series for 4- or 8-way servers or colossal clusters. Its model numbers indicate series and clock speed, such as the 1.6GHz Opteron 142 and 242 and the 1.8GHz Opteron 844.
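The numbering scheme is regular enough to decode mechanically. The following is a minimal sketch in Python, not anything AMD publishes: the function name `decode_opteron` and both lookup tables are illustrative, and the speed codes are inferred from the examples in this article (x42 = 1.6GHz, x44 = 1.8GHz, x46 = 2.0GHz), with x40 = 1.4GHz assumed from the stated speed range.

```python
# Hypothetical helper: decode an Opteron model number into series and clock speed.
# The tables below are inferred from the examples in the article, not an official list.

# First digit -> maximum number of processors the series supports.
SERIES_MAX_CPUS = {1: 1, 2: 2, 8: 8}

# Last two digits -> core clock in GHz (x40 is assumed from the 1.4GHz low end).
SPEED_GHZ = {40: 1.4, 42: 1.6, 44: 1.8, 46: 2.0}


def decode_opteron(model: int) -> dict:
    """Split a three-digit model number such as 844 into its two parts."""
    series_digit, speed_code = divmod(model, 100)
    if series_digit not in SERIES_MAX_CPUS or speed_code not in SPEED_GHZ:
        raise ValueError(f"unrecognized Opteron model number: {model}")
    return {
        "series": series_digit * 100,
        "max_cpus": SERIES_MAX_CPUS[series_digit],
        "clock_ghz": SPEED_GHZ[speed_code],
    }


if __name__ == "__main__":
    for model in (142, 242, 844):
        print(model, decode_opteron(model))
```

Under these assumptions, the 142 and 242 decode to the same 1.6GHz clock and differ only in how many processors the series allows.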

The Athlon 64 FX is almost a straight transition of the 940-pin Opteron 100 Series to the performance desktop, bringing this high-end architecture to the enthusiast and gaming audience. The first model is the FX-51; its next-higher-speed successor will be the FX-53, with an FX-55 presumably after that. The announcement was really no surprise, since the Opteron 100 was designed for workstation use, and a few enterprising manufacturers have been offering borderline workstation/gamer Opteron systems for months.

One difference between the Opteron and Athlon 64 FX-51 that pays off in higher performance numbers is a higher clock speed — 2.2GHz, or a nice 200MHz jump from the fastest Opteron model. The Athlon 64 FX is also limited (on paper, at least) to 1-way configurations, as AMD is looking to keep its desktop and server business separate.

The most mainstream AMD64 processor, the Athlon 64 3200+, has the same 2.0GHz clock speed and L1 and L2 cache as the Opteron 146. However, the Athlon 64 uses a smaller 754-pin CPU package, and its integrated memory controller supports just a single, 64-bit path to system memory, yielding half the bandwidth of the Opteron or Athlon 64 FX.

Memory Support and Features

It’s new technology to desktop shoppers, but server managers are familiar with registered or buffered memory, and since the Opteron requires this higher-priced, higher-reliability type of DDR, so does the Athlon 64 FX-51. By contrast, the Athlon 64 works with standard, unbuffered DDR.

Memory speeds are another issue: The Opteron’s dual-channel controller supports registered DDR in a dual-module configuration at speeds up to 333MHz, yielding memory bandwidth of 5.3GB/sec, a nice increase over the bandwidth of the old Athlon MP. The Athlon 64 FX-51 adds support for DDR400, boosting bandwidth to a whopping 6.4GB/sec. The Athlon 64 also supports 400MHz memory speeds, but its single-channel memory controller cuts bandwidth to 3.2GB/sec with DDR400.
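These bandwidth figures follow from simple arithmetic: peak bandwidth is the width of the memory path in bytes multiplied by the effective transfer rate. The short Python sketch below reproduces the three numbers; the function name `peak_bandwidth_gb_per_sec` is illustrative, and the calculation treats "333MHz" and "400MHz" as effective transfer rates in millions of transfers per second, with 1GB counted as 1,000MB.

```python
# Illustrative arithmetic only: theoretical peak memory bandwidth,
# ignoring latency, refresh, and every other real-world overhead.

def peak_bandwidth_gb_per_sec(bus_width_bits: int, mega_transfers_per_sec: int) -> float:
    """Bytes per transfer times transfers per second, expressed in GB/sec."""
    bytes_per_transfer = bus_width_bits / 8
    return bytes_per_transfer * mega_transfers_per_sec / 1000


if __name__ == "__main__":
    configs = [
        ("Opteron, 128-bit, DDR333", 128, 333),
        ("Athlon 64 FX-51, 128-bit, DDR400", 128, 400),
        ("Athlon 64, 64-bit, DDR400", 64, 400),
    ]
    for label, width, rate in configs:
        print(f"{label}: {peak_bandwidth_gb_per_sec(width, rate):.1f} GB/sec")
```

Halving the path width halves the peak figure at the same memory speed, which is why the single-channel Athlon 64 lands at exactly half the Athlon 64 FX-51 number.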

Despite the twofold gap in peak bandwidth between the Athlon 64 and the Athlon 64 FX-51, the differences in actual performance aren’t that great, due in part to the incredibly low memory latencies offered by the integrated memory controller. This design bypasses many conventional system bottlenecks, making the CPU the traffic cop, and lowers the system cycles needed for memory access. This is borne out in system benchmarks, which place the Athlon 64 FX-51 at the top of the food chain, with the Athlon 64 not too far back.

Of course, peak performance doesn’t come cheap, and the reliance of the Athlon 64 FX-51 on server-style registered DDR adds some overhead to system cost. The move to 90 nanometers is widely expected to bring a revised, 939-pin Athlon 64 FX that works with conventional, unbuffered memory, but so far AMD isn’t talking.

The HyperTransport Link

HyperTransport is one of the most important features of the AMD64 processors, and represents a high-performance alternative to current system bus technologies. HyperTransport offers a series of data paths using point-to-point links and a high effective data rate of 1.6GHz (an 800MHz clock transferring data on both clock edges) for up to 6.4GB/sec of bandwidth per link, counting both directions. In current AMD64 designs, HyperTransport is used to link the CPU to the North- and Southbridge chipsets, and also to other CPUs in a multiprocessor Opteron configuration; system memory attaches directly to the CPU’s integrated memory controller rather than over HyperTransport.

While the technology is shared among AMD64 models, we once again see some differences in implementation — more or fewer HyperTransport links, akin to one or two memory channels, in the otherwise similar diagrams above. Opterons have three HyperTransport links, adding up to a maximum system bandwidth of 19.2GB/sec, with coherent HyperTransport links used for multiprocessing and shared memory — the 800 Series has three coherent links (uniting up to four CPUs), while the 200 Series combines one coherent (to join two CPUs) with two noncoherent links. The Opteron 100 Series has three noncoherent HyperTransport links.

As uniprocessor designs, the Athlon 64 and Athlon 64 FX have no need for coherent links. Instead, they incorporate just one HyperTransport link, limiting total system bandwidth to 6.4GB/sec peak. HyperTransport is one of the main advantages of the AMD64 platform, and even a single 6.4GB/sec link dwarfs current Intel desktop bus specifications and provides ample bandwidth for both current interconnects and upcoming technologies like PCI Express.
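The link counts and bandwidth totals quoted above can be tabulated in a few lines of Python. This is a sketch under stated assumptions: the per-link figure assumes a 16-bit path in each direction at 1,600 million transfers per second (the link width is not given in this article), and the names `LINKS` and `total_bandwidth_gb_per_sec` are illustrative.

```python
# Illustrative sketch of the HyperTransport figures; not an AMD specification.
# Assumption: each link is 16 bits wide in each direction at 1600 MT/sec.

LINK_WIDTH_BYTES = 2          # assumed 16-bit path per direction
MEGA_TRANSFERS_PER_SEC = 1600
DIRECTIONS = 2                # each link carries traffic both ways

# Peak bandwidth of one link in MB/sec, counting both directions.
PER_LINK_MB = LINK_WIDTH_BYTES * MEGA_TRANSFERS_PER_SEC * DIRECTIONS

# (coherent links, noncoherent links) per processor family, as described in the text.
LINKS = {
    "Opteron 800 Series": (3, 0),
    "Opteron 200 Series": (1, 2),
    "Opteron 100 Series": (0, 3),
    "Athlon 64 FX": (0, 1),
    "Athlon 64": (0, 1),
}


def total_bandwidth_gb_per_sec(family: str) -> float:
    """Sum the peak bandwidth of all links on one processor, in GB/sec."""
    coherent, noncoherent = LINKS[family]
    return (coherent + noncoherent) * PER_LINK_MB / 1000


if __name__ == "__main__":
    for family in LINKS:
        print(f"{family}: {total_bandwidth_gb_per_sec(family):.1f} GB/sec")
```

Every Opteron series reaches the same 19.2GB/sec total; what differs is how many of the three links are coherent and can therefore connect to another processor.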

Platform Options

The platform selection is quite standard, with mainstream options including motherboard chipsets from Nvidia and VIA as well as AMD. The Opteron and Athlon 64 FX-51 share the same 940-pin platform base (for uniprocessor motherboards, anyway), while the Athlon 64 requires a 754-pin motherboard design. The AMD 8000 series chipsets cover the high-end multiprocessor segment, while Nvidia’s nForce3 Pro (Opteron, Athlon 64 FX) and nForce3 (Athlon 64) and VIA’s K8T800 (all AMD64 chips) fight it out in the workstation and desktop markets.
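The pairing of processors, sockets, and chipsets described here amounts to a small lookup table. The Python sketch below encodes it as such; the names `SOCKET_PINS`, `CHIPSET_SUPPORT`, `chipsets_for`, and `shares_socket` are illustrative, and listing the AMD 8000 series under the Opteron alone is a simplifying assumption based on its multiprocessor positioning in the text.

```python
# Illustrative lookup tables built from the platform description in the article.

# Pin count of the CPU package each processor family requires.
SOCKET_PINS = {"Opteron": 940, "Athlon 64 FX": 940, "Athlon 64": 754}

# Chipset -> processor families it is paired with in the text.
# AMD 8000 is listed under Opteron only, which is an assumption.
CHIPSET_SUPPORT = {
    "AMD 8000": {"Opteron"},
    "nForce3 Pro": {"Opteron", "Athlon 64 FX"},
    "nForce3": {"Athlon 64"},
    "K8T800": {"Opteron", "Athlon 64 FX", "Athlon 64"},
}


def chipsets_for(cpu: str) -> list:
    """Return the chipsets listed for a processor family, sorted by name."""
    if cpu not in SOCKET_PINS:
        raise KeyError(f"unknown processor family: {cpu}")
    return sorted(name for name, cpus in CHIPSET_SUPPORT.items() if cpu in cpus)


def shares_socket(cpu_a: str, cpu_b: str) -> bool:
    """True when two processor families use the same pin count."""
    return SOCKET_PINS[cpu_a] == SOCKET_PINS[cpu_b]


if __name__ == "__main__":
    for cpu in SOCKET_PINS:
        print(f"{cpu} ({SOCKET_PINS[cpu]}-pin): {', '.join(chipsets_for(cpu))}")
```

Laid out this way, the VIA K8T800 stands out as the only chipset in the table that appears under all three processor families.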

The key difference between platforms is cost, with Opteron/Athlon 64 FX 940-pin motherboards being the most expensive, while 754-pin Athlon 64 motherboards are surprisingly inexpensive, even undercutting comparable Pentium 4 platforms. As evidenced by the VIA K8T800, the AMD64 processors’ shared, integrated functions make it easy to create a cross-platform chipset design. And given the CPUs’ integrated memory controllers, the new war will be fought on the system bus. The chipset that provides the fastest HyperTransport performance may find itself sitting on top of the sales chart.

A New 64-Bit World

It took AMD some time to finally get its AMD64 ducks in a row, but now the company has a nicely segmented line of processors. While 64-bit software is still scarce (with AMD64 versions of Windows XP and Windows Server 2003 stuck in beta until the second half of 2004), the Opteron serves the high-end workstation and server markets skillfully, while the Athlon 64 FX-51 hits the deep-pocketed enthusiast/gamer segment hard and the almost-as-fast, mainstream-priced Athlon 64 is arguably the real gem of the bunch. All good things come to those who wait, and that’s been proven once again with AMD64.
