Chip Giant Gambles on CPU Optimized for Multitasking

The standard practice with processor marketing has been to introduce a new CPU, then ratchet up the clock speed with future models. There are usually incremental benefits along the way, such as larger cache sizes or smaller cores, but for the most part, speed is king. That’s why AMD made a big noise when the Athlon was first to break the 1GHz barrier, and Intel did the same when the Pentium 4 reached 2GHz.

By contrast, this month’s release of the 3.06GHz Pentium 4 brought barely a squawk about passing the 3GHz mark. The main attraction of Intel’s new flagship isn’t that it has a high clock speed, but that it’s the first desktop processor to support what Intel calls Hyper-Threading (HT) technology — a more elegant, less brute-force approach that debuted in the company’s Xeon server CPUs last February.

The goal of Hyper-Threading is to make more efficient use of processor resources, allocating them in an intelligent manner to optimize performance with multithreaded software and operating systems. An earlier CPU Planet article provided an introduction to multithreading and multicore processors; in this story, we’ll take a closer look at how HT technology works and its real-world results in the Pentium 4/3.06.

Hyper-Threading 101

The first key to understanding Hyper-Threading is to understand what it isn’t. The 3.06GHz Pentium 4 does not incorporate two processor cores on a single CPU die, nor does it split resources right down the middle into dual 1.5GHz zones. Instead, HT creates two logical processors within one physical core, thereby providing a 3.06GHz processor for standard or single-threaded applications that also has the ability to intelligently partition its resources to handle dual instruction threads.

There are many ways of illustrating the Hyper-Threading process. One of Intel’s promotional videos represents the CPU as a factory with a single supply path for raw materials. With Hyper-Threading, a second supply channel (thread) is added, thereby increasing the plant’s overall efficiency: The factory’s physical resources, the number of workers, and their performance remain the same, but the added influx of raw materials allows unused work cycles to be more fully utilized.

Even boosted to 3.06GHz, the P4 still has a finite number of clock cycles, and the two logical processors must share an unchanged supply of physical resources such as registers and cache. Intel is not giving away free CPUs, but merely offering a segmented method of handling instructions.

Here is another Intel diagram, like the one above, that illustrates how single-threaded processing compares to Hyper-Threading and demonstrates the piggy-back effect of HT technology. The “time saved” graphic refers to how Hyper-Threading can compress the processing time, not via additional CPU power, but by making better use of the gigahertz that are already there.
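The “time saved” idea can be sketched as a toy model. What follows is only an illustration under loud assumptions, not a description of the Pentium 4’s actual pipeline: the core is imagined to offer a fixed number of execution slots per cycle (the hypothetical `WIDTH`), each thread is a made-up list of per-cycle slot demands, and a demand of 0 stands for a stalled cycle. The function names are invented for this sketch.

```python
def run_serial(threads):
    """Cycles needed when threads run one after another.

    Each step takes one full cycle, whether or not it fills every slot.
    """
    return sum(len(steps) for steps in threads)


def run_interleaved(threads, width):
    """Cycles needed when threads share the core's slots in every cycle."""
    for steps in threads:
        if any(demand > width for demand in steps):
            raise ValueError("a step demands more slots than the core has")
    position = [0] * len(threads)  # next step to issue, per thread
    cycles = 0
    while any(p < len(t) for p, t in zip(position, threads)):
        free = width
        first = cycles % len(threads)  # rotate priority so no thread starves
        for k in range(len(threads)):
            i = (first + k) % len(threads)
            steps = threads[i]
            # Issue the thread's next step only if it fits in the leftover slots.
            if position[i] < len(steps) and steps[position[i]] <= free:
                free -= steps[position[i]]
                position[i] += 1
        cycles += 1
    return cycles


# Hypothetical workload: slot demand per cycle, 0 = stalled cycle.
WIDTH = 3
thread_a = [2, 0, 1, 3, 0, 1]
thread_b = [1, 2, 0, 1, 0, 2]

print(run_serial([thread_a, thread_b]), run_interleaved([thread_a, thread_b], WIDTH))
```

With this invented workload the serial run takes 12 cycles and the interleaved run takes 7. A single thread run through `run_interleaved` finishes in exactly as many cycles as it has steps, which mirrors the point that nothing is taken away from single-threaded software.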

Hyper-Threading answers a common criticism of today’s high-end CPUs — do we really need this much power for running Word and Excel and surfing the Net? — by shifting the focus from single applications to multitasking environments like Windows. Without ever quite admitting that owners of, say, 750MHz systems deem their desktops perfectly fast enough for popular programs, Intel argues that (according to a survey) nearly half the users of three-year-old or older PCs say they don’t trust their systems to handle two demanding tasks at one time, such as burning a CD while playing a game.

With HT technology, Intel says, the system is less likely to bog down, because the CPU is built to handle multiple threads and partition resources according to demand. For example, when working in Word while processing a media file in the background, the Pentium 4/3.06 dynamically allocates resources to both applications, ensuring a smoother multitasking environment. When playing a demanding 3D game with no other programs running, the HT processor allocates resources differently, to provide performance equal to (or even higher than) a standard Pentium 4 at the same clock speed. (More on HT software optimization in a minute.)

Naturally, a higher clock speed is an asset to Hyper-Threading, but not a required element (although, realistically speaking, the chances of seeing a Celeron HT processor in the near future look slim). George Alfs of Intel’s desktop platform marketing group says, “Actually, HT Technology can help [at] any speed. We originally planned to launch it to the desktop with our Prescott processor, scheduled for the second half of 2003, but the infrastructure came together sooner than we expected and the performance boosts are significant. So the timing worked out to launch it with 3GHz.”

Hyper-Threading Requirements

There are three basic requirements for a Hyper-Threading desktop: a 3.06GHz (or faster) Pentium 4 CPU, an HT-compatible chipset, and an operating system that supports HT Technology. The processor component is easy to grasp, but things get a bit hazier with the other two. Currently, only Intel chipsets fully support Hyper-Threading; these include the recent 845PE/GE and 850E products as well as the older 845E and 845G platforms. Both SiS and VIA have HT-compatible Pentium 4 chipsets on the horizon, but for now it’s an Intel playground.

Along with a chipset for hardware support, an updated system BIOS is also required. The BIOS recognizes the processor’s HT capabilities and communicates them to the operating system. It performs two main duties related to Hyper-Threading: running the logical processor startup sequence, and enabling or disabling HT technology. The former is simply the process of sequencing the logical processors, thereby ensuring high performance and compatibility for both single- and multithreaded software, while the latter is a switch that turns HT support on or off.

As for operating-system support for Hyper-Threading, it’s limited to Microsoft Windows XP Home Edition and Windows XP Professional, along with a number of versions of Linux (using kernel 2.4 and above). The older Microsoft operating systems are not fully compatible — Windows 2000 supports multiple CPUs, but misidentifies the HT chip as two physical processors instead of merely two logical ones. There will be no patches adding HT support to Win 2000 or other older platforms, although Microsoft has posted a helpful white paper on how Win XP implements Hyper-Threading.

This is an important piece of the puzzle, since the OS plays a crucial role in thread scheduling, halting idle logical processors, and resolving data-contention issues. Hyper-Threading technology is highly dependent on the operating system to maintain high performance, and without proper support, one of the logical processors can go into an endless, performance-killing loop, or one thread can take over to the exclusion of the other.
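The distinction an operating system has to draw between logical and physical processors can be sketched in a few lines. This is a minimal illustration, assuming the text layout that x86 Linux kernels expose in /proc/cpuinfo (one blank-line-separated block per logical processor, with `processor`, `physical id`, and `core id` fields); older kernels may omit some of these fields. The function name and the sample text are hypothetical.

```python
def count_processors(cpuinfo_text):
    """Return (logical_processors, physical_cores) from cpuinfo-style text."""
    logical = 0
    cores = set()
    for block in cpuinfo_text.strip().split("\n\n"):
        fields = {}
        for line in block.splitlines():
            if ":" in line:
                key, _, value = line.partition(":")
                fields[key.strip()] = value.strip()
        if "processor" not in fields:
            continue  # not a per-processor block
        logical += 1
        # Logical processors that share a package and a core are siblings.
        # If "core id" is missing, fall back to treating each one as its own core.
        cores.add((fields.get("physical id", "0"),
                   fields.get("core id", fields["processor"])))
    return logical, len(cores)


# Hypothetical sample mimicking a single Hyper-Threaded chip.
SAMPLE = """processor : 0
physical id : 0
core id : 0

processor : 1
physical id : 0
core id : 0
"""

print(count_processors(SAMPLE))
```

On the sample this reports two logical processors sharing one physical core. An operating system that stopped at the first number would make the mistake described above and treat the chip as two physical CPUs.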

Application support for Hyper-Threading is another core requirement for the technology to really take off, and is still a work in progress — indeed, Intel’s “Featured Software” Web page touts the responsiveness and performance users can expect from the Pentium 4/3.06, but actually clicking on many of the listed games and other titles brings you to older ad copy about “the Intel Pentium 4 processor with speeds up to 2.8GHz.”

Intel’s Alfs says, “We have been working with the software and infrastructure industries for over two years getting the industry ready. Obviously threaded applications have been around for a while, but many applications were never written with multiprocessing in mind. Microsoft has been a great help with SP1 [Windows XP Service Pack 1], which takes care of some of these app issues. We will continue to work with the industry on SMP and SMT technologies, and will further refine HT technology with Prescott.”

Two Heads Aren’t Always Better Than One

In many ways, Hyper-Threading technology is superior to conventional multiprocessor systems in terms of getting the most value for your CPU dollar. At first glance, a dual-processor desktop with two 1.5GHz Pentium 4 chips would seem to be equivalent to the new CPU. But in reality, two 1.5GHz processors do not equal 3GHz — most of the time, only a single CPU would be active, and even for highly multithreaded applications, multiprocessing does not yield a 2X performance bonus over a single processor. (See CPU Planet’s “Multiprocessing 101” article.)
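One common way to see why a second processor does not double performance is Amdahl’s law, a standard textbook model that is not drawn from Intel’s figures or from our testing. The sketch below assumes a program in which only a fraction of the work can be spread across processors; the fractions used are made-up inputs for illustration.

```python
def amdahl_speedup(parallel_fraction, processors):
    """Speedup over one processor when only part of the work runs in parallel."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if processors < 1:
        raise ValueError("processors must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / processors)


# Illustrative fractions only: 0.0 = single-threaded, 1.0 = perfectly parallel.
for fraction in (0.0, 0.5, 0.8, 1.0):
    print(fraction, round(amdahl_speedup(fraction, 2), 2))
```

Under this model, a program that is 80 percent parallel runs about 1.67 times faster on two processors, and a single-threaded program gains nothing at all. Only the perfectly parallel case reaches the full 2X.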

With the new Pentium 4, the full 3.06GHz of power is available for standard applications and can be divided among threads for multithreaded programs, while dynamic load-handling can switch between single- and dual-threaded operation in multitasking environments. This flexibility allows an HT processor to answer virtually any demand, whether a high-end game, a multithreaded image- or video-editing application, or a desktop full of relatively simple multitasking programs.

Of course, there are circumstances where two physical processors can beat a single HT processor. Multiprocessing systems can really go to town on demanding applications with extensive multithreading support; two processors bring double the actual resources, such as cache, instruction units, and registers. In turn, however, there are additional costs associated with multiprocessor platforms like the servers for which Intel designed the Xeon MP. The Pentium 4 with HT is positioned as a one-size-fits-all solution, stressing overall flexibility and value (though, in Intel tradition, it carries a steep introductory price premium over slower Pentium 4s and the Athlon XP).

The Proof is in the Threading

The advent of Hyper-Threading will shake up the world of PC benchmark fanatics. While some speed tests simulate multitasking workloads, conventional benchmark scores could actually drop with HT enabled because of poor coding, inefficient CPU partitioning, or contention for resources such as processor cache.

In our hands-on testing, however, the 3.06GHz Pentium 4 exhibited none of these potential shortcomings: While improvements were minimal in some cases, every single application, subsystem, and game benchmark we tried performed at least fractionally better with HT enabled than with the feature switched off via the system BIOS.

One predictable but telling test result showed that HT technology works better when multitasking different types of programs, rather than similar applications that call on the same functions. Performing divergent tasks such as processing a video and audio stream simultaneously resulted in significant gains (in the area of 10 to 15 percent), but running multiple instances of the same media-encoding program, thereby putting a strain on the same CPU resources, showed only nominal performance gains.
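Why similar workloads gain less can be shown with another toy model. It is a deliberately crude sketch, not a measurement: the core is assumed to have exactly one execution unit of each kind, each thread is a hypothetical list naming the unit it needs at each step, and the unit names and workloads are invented. The model greatly exaggerates the size of the effect compared with the gains we measured; only the direction of the difference is the point.

```python
def cycles_with_sharing(thread_a, thread_b):
    """Cycles to finish two threads that share one unit of each kind."""
    a = b = cycles = 0
    while a < len(thread_a) or b < len(thread_b):
        a_ready = a < len(thread_a)
        b_ready = b < len(thread_b)
        if a_ready and b_ready and thread_a[a] == thread_b[b]:
            # Both want the same unit: only one advances, taking turns.
            if cycles % 2 == 0:
                a += 1
            else:
                b += 1
        else:
            # Different units (or only one thread left): no conflict.
            if a_ready:
                a += 1
            if b_ready:
                b += 1
        cycles += 1
    return cycles


# Hypothetical workloads: which unit each step needs.
video = ["fp", "fp", "mem", "fp"]
audio = ["int", "mem", "int", "int"]

print(cycles_with_sharing(video, audio))  # divergent tasks
print(cycles_with_sharing(video, video))  # two copies of the same task
```

In this invented case the video and audio threads never collide and finish together in 4 cycles, while two copies of the video thread fight over the same unit and need 6.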

Overall, the advent of Hyper-Threading technology should be welcome news for all levels of the PC market, as it not only costs existing programs nothing in performance but also promises many potential benefits for the future. By this time next year, the Intel-versus-AMD performance battle should pit a variety of Hyper-Threaded 32-bit applications against 64-bit versions running on AMD’s “Hammer,” and that’s a race where consumers are guaranteed to win.
