Chip czar Intel has once again set new benchmark records with its latest CPUs, the Core i7 series. CHIP reveals the technical innovations inside them.
Intel’s development model for processors is known as the “Tick Tock” cycle. Every alternate year, the focus is on miniaturizing the existing production technology for CPUs (a process shrink, or “Tick”), while in the following year a new architecture based on this process is introduced (“Tock”). The system has been functioning well for four years now. The Core i7 architecture, formerly known by its codename “Nehalem”, was introduced in November 2008, after the original Core architecture was shrunk to 45 nm around the end of 2007 (products codenamed “Penryn”). The new design brings a series of changes with it, all aimed at optimizing performance, power consumption and reliability.
The last time Intel changed its processor package was in 2004, when it went from 478 contact pins to 775 pads. Since then, the package and matching socket have remained the same despite many CPU refreshes, but Nehalem now requires a radical change. The new CPU needs about 600 more contacts for all its new functions, so Core i7 CPUs won’t fit into older motherboards: they have 1,366 contact pads instead of 775. Even if they did fit physically, nothing would work, since there are many new elements on the CPU which need to be connected to the motherboard and the rest of the computer’s components. The transition is understandable given how long the old socket has been in use and the genuine advantages of the new one, but anyone who wants to use the new Intel technology must buy a new motherboard.
The most significant innovation in the Nehalem architecture is the retirement of the Front Side Bus (FSB), which has been responsible for all communication between CPU and chipset so far. Its successor is known as the QuickPath Interconnect (QPI). The FSB was replaced mainly because its bandwidth was found to be inadequate: QPI provides 20-bit wide, bidirectional links resulting in a maximum data rate of 25.6 GB/s. This is twice what an FSB at its highest possible rating of 1,600 MHz could offer. QPI is very similar to the HyperTransport technology used by AMD since 2001, which is now at version 3.1 and achieves similar transfer rates.
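The bandwidth comparison can be checked with a little arithmetic. The following is only a back-of-the-envelope sketch, not an Intel formula. It assumes a QPI transfer rate of 6.4 GT/s, that 16 of the 20 lanes in each direction carry payload data, and a 64-bit wide FSB; none of these figures appears in the text above.

```python
def link_bandwidth_gb_s(transfers_per_s, payload_bits, directions=1):
    """Peak bandwidth in GB/s (1 GB = 10**9 bytes)."""
    return transfers_per_s * payload_bits * directions / 8 / 1e9

# Assumed values, not taken from the article:
QPI_TRANSFERS_PER_S = 6_400_000_000  # 6.4 GT/s per direction
QPI_PAYLOAD_BITS = 16                # 16 of the 20 lanes carry data
FSB_TRANSFERS_PER_S = 1_600_000_000  # "1,600 MHz" FSB rating
FSB_WIDTH_BITS = 64                  # FSB data width

# QPI sends and receives at the same time, so both directions count.
qpi_gb_s = link_bandwidth_gb_s(QPI_TRANSFERS_PER_S, QPI_PAYLOAD_BITS, directions=2)
# The FSB is a shared bus, so only one direction is active at a time.
fsb_gb_s = link_bandwidth_gb_s(FSB_TRANSFERS_PER_S, FSB_WIDTH_BITS)

print(f"QPI: {qpi_gb_s:.1f} GB/s, FSB: {fsb_gb_s:.1f} GB/s, "
      f"ratio: {qpi_gb_s / fsb_gb_s:.1f}x")
```

Under these assumptions the two figures quoted above, 25.6 GB/s and a factor of two over the fastest FSB, are consistent with each other.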
Intel has chosen to adopt another technique very successfully applied by AMD: a memory controller integrated in the processor package. Intel’s desktop architectures until now have placed the memory controller in the chipset. The specialty of the current high-end Core i7s is their triple-channel memory controller. Three memory modules can now be ganged up to achieve data transfer rates fast enough to keep the CPU fed with fresh data, so that its potential is used optimally. The result is that PCs which make use of this will have 3, 6 or 12 GB of RAM, which is unconventional compared to the 2, 4 and 8 GB progression we’re used to. However, lower-cost Nehalem CPUs which are yet to be launched will feature more traditional dual-channel memory controllers and a different, smaller socket with only 1,156 contact pads.
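The unusual RAM totals follow directly from filling every channel with an identical module. As a minimal sketch, assuming module sizes of 1, 2 and 4 GB (the sizes are an assumption, and the function name is purely illustrative):

```python
MODULE_SIZES_GB = (1, 2, 4)  # assumed module sizes, not from the article

def matched_capacities(channels, module_sizes_gb=MODULE_SIZES_GB):
    """Total RAM when each channel holds one module of the same size."""
    return [channels * size for size in module_sizes_gb]

triple_channel = matched_capacities(3)  # high-end Core i7
dual_channel = matched_capacities(2)    # conventional layout

print("Triple-channel totals (GB):", triple_channel)
print("Dual-channel totals (GB):  ", dual_channel)
```

With three channels the totals come out as 3, 6 and 12 GB; with two channels they fall back to the familiar 2, 4 and 8 GB.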
HyperThreading makes a comeback
After the end of the Pentium 4 generation, HyperThreading disappeared almost completely, but it is now making a comeback. Intel refers to a processor’s ability to process two program threads at the same time as Simultaneous Multi-Threading (SMT). Apart from producing the impressive figure of eight CPU cores in the Windows task manager (four real and four virtual), SMT allows the cores to be utilized more efficiently, with a promised increase in performance of up to 30 percent.
New clock speed tricks
Core i7 processors can run each individual core at a different clock speed. Turbo mode is especially interesting, because it allows some cores to be overclocked when a non-multithreaded task taxes one or two cores while the others are left idle. The busy cores then run above their rated clock speed, which can result in a performance increase of up to 10 percent. On the other hand, a new power saving mode switches idle cores to the C6 state (deep powerdown). In this state, the core is simply disconnected from the power supply. This is taken care of by microcontroller logic which monitors the temperature and power consumption of each core.
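To see where a figure of about 10 percent could come from, consider a simplified model in which the core clock is a base clock multiplied by a per-core multiplier, and Turbo mode raises that multiplier by one or two steps. The base clock of roughly 133 MHz, the default multiplier of 20 and the step policy below are all assumptions made for illustration; they are not taken from the text above and do not describe Intel’s actual control logic.

```python
BCLK_MHZ = 400 / 3     # assumed base clock, roughly 133.33 MHz
BASE_MULTIPLIER = 20   # assumed default multiplier (about 2.66 GHz)

def turbo_steps(active_cores):
    """Assumed policy: two extra steps if one core is busy, otherwise one."""
    return 2 if active_cores == 1 else 1

def core_clock_mhz(active_cores, turbo=True):
    """Clock of a busy core under the simplified model."""
    steps = turbo_steps(active_cores) if turbo else 0
    return BCLK_MHZ * (BASE_MULTIPLIER + steps)

rated = core_clock_mhz(4, turbo=False)
single = core_clock_mhz(1)
gain_percent = (single / rated - 1) * 100

print(f"Rated: {rated:.0f} MHz, one busy core with Turbo: {single:.0f} MHz "
      f"(+{gain_percent:.0f}%)")
```

In this model two extra multiplier steps on top of 20 amount to a 10 percent higher clock, which matches the upper bound quoted above if performance scales with clock speed.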
New design: Small L2 cache and large common L3 cache
One of the weak points of the cache design on Intel’s previous CPUs was that on a quad-core CPU, each pair of cores shared a 6 MB L2 cache which was exclusive to them. This was great for fast data exchanges between those two cores, but bad for exchanges between all four, which required the data to travel through the much slower Front Side Bus. In Core i7 CPUs, each core now has its own L2 cache, which is considerably downsized to 256 KB, but with its speed increased by 50 percent. As in AMD’s Phenom CPUs, a common L3 cache (8 MB on the current quad-core models) is added to enable data exchange between the cores. This cache holds a copy of all data in the cores’ L1 and L2 caches, which in turn considerably accelerates data processing. It also allows any core to be shut down without the risk of losing data that’s in transit between caches.
A CPU design for all applications
The scalability of the Core i7 architecture is remarkable. Nehalem is suitable for desktops, servers and notebooks alike. Thanks to the new cache design and the introduction of the QPI, two, four or eight cores can now be integrated in a single processor die. Furthermore, the high speed of the QPI enables quick communication between several CPUs on one motherboard for high-end and server configurations. When 8-core Nehalem chips are available, power users should be able to gang two of them up for a grand total of 16 cores and 32 virtual CPUs!
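The core counts quoted here are simple products of sockets, cores per chip and threads per core. A minimal sketch of that arithmetic, with a function name chosen purely for illustration:

```python
def cpu_counts(sockets, cores_per_chip, threads_per_core=2):
    """Return (physical cores, virtual CPUs seen by the operating system)."""
    physical = sockets * cores_per_chip
    return physical, physical * threads_per_core

desktop = cpu_counts(sockets=1, cores_per_chip=4)      # today's Core i7
dual_socket = cpu_counts(sockets=2, cores_per_chip=8)  # two 8-core chips

print("Quad-core Core i7:", desktop)
print("Two 8-core chips: ", dual_socket)
```

The first case gives the eight entries seen in the Windows task manager; the second gives the 16 cores and 32 virtual CPUs mentioned above.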
At present, three Core i7 models are available on the market, with more to come soon. By the end of 2009, lower-cost versions of Nehalem (codenamed Lynnfield and Havendale) will hit the market, with many more innovations and performance advantages in store for users.