Why Ivy Bridge Is Still Quad-Coreby Kristian Vättö on December 5, 2011 1:35 PM EST
- Posted in
- Ivy Bridge
A couple days ago, we published our Ivy Bridge Desktop Lineup Overview in which we mentioned that Ivy Bridge will remain a quad-core solution. There are dozens of forum posts with people asking why there's no hex-core Ivy Bridge, so now seems like a good time to address the question. Fundamentally, Ivy Bridge is a die shrink of Sandy Bridge (a "tick" in Intel's world), and that usually means either the core count or frequency is increased due to the lower power consumption of the smaller process node. Thus, instead of hex-core, we get a chip that looks much the same as a year-old Sandy Bridge, only with improved efficiency and some other moderate tweaks to the design. Let's go through some of the elements that influence the design of a new processor, and when we're done we will have hopefully clarified why Ivy Bridge remains a quad-core solution.
If we look at the situation from the marketing standpoint first, having a hex-core Ivy Bridge die would more or less kill the just released Sandy Bridge E. Sure, IVB is about five months away, but I doubt Intel wants to relive the Sandy Bridge vs. Nehalem (i7-9xx) situation--even Bloomfield vs. Lynnfield was quite bad. If Intel created a hex-core IVB die, they would have to also substantially cut the prices of SNB-E. The current cheapest hex-core SNB-E is $555, while IVB hex-core would most likely be priced at $300~$400 since it's aimed at the mainstream; otherwise very few SNB-E systems would be sold. Even then, most consumers would opt for the IVB platform due to cheaper motherboard costs and lower TDP. PCIe 3.0 should also make 16 lanes fine for dual-GPU setups, reducing the market for SNB-E even more.
Differentiating the lineup by keeping Ivy Bridge quad-core allows some market for SNB-E among enthusiast consumers. Ivy Bridge E isn't coming before H2 2012 anyway so SNB-E must please the high-end until IVB-E hits. In the end, we still recommend SNB-E primarily for servers and workstations where the extra memory channels, PCIe lanes, and dual-socket support are more important, but the lack of hex-core IVB parts at least gives the platform a bit more of an advantage.
Evolution from traditional CPU to SoC
There are more than just marketing reasons, though. If we look at the following die shots, we can see that CPUs are becoming increasing similar to SoCs.
Quad-core Kentsfield package (2006)
Quad-core Nehalem die (2008)
Quad core Lynnfield die (2009)
Quad-core Sandy Bridge die (2011)
These four (well, techically three because Kentsfield consists of two dual-core Conroe dies) chips are the only "real" quad-core CPUs from Intel. There are quad-core Gulftown Xeons, and there will soon be quad-core SNB-E CPUs, but they all have more cores on the actual die; some of them have just been disabled. Comparing the die shots, we notice that our definition of CPU has changed a lot in only five years or so. Kentsfield is a traditional CPU, consisting of processing cores and L2 cache. In 2008, Nehalem moved the memory controller onto the CPU die. In 2009, Lynnfield brought on-die PCIe controller, which allowed Intel to get rid of the Northbridge-Southbridge combination and replace it with their Platform Controller Hub. A year and a half later, Westmere (e.g. Arrandale and Clarkdale) brought us on-package graphics--note that it was on-package, not on-die as the GPU was on a separate die. It wasn't until Sandy Bridge that we got on-die graphics. The SNB graphics occupy roughly 25% of the total die area, or the space of three cores if you prefer to look at it that way, and IVB's graphics (a "tock" on the GPU side, as opposed to a "tick") will occupy even more space.
While we don't have a close-up die shot of Ivy Bridge (yet), we do know its approximate die size and the layout should be similar to the Sandy Bridge die as well. Anand estimated the die size to be around 162mm^2 for what appears to be the quad-core die (dual-core SNB with GT2 is 149mm^2, and even with the more complex IGP we wouldn't expect dual-core IVB to be larger). That's a 25% reduction in the die size when compared with quad-core SNB die (216mm^2). A 22nm quad-core SNB die would measure in at 102mm^2 with perfect scaling and assuming all the logic/architecture is the same; however, scaling is never perfect and we know there are a few new additions to IVB, so 162mm^2 for IVB die sounds right. Transistor wise, IVB counts in at around 1.4 billion, a 20.7% increase over quad-core SNB.
To the point, today's CPUs have much more than just CPU cores in them. We could easily have had a hex-core 32nm SNB die at the same die size if the graphics and memory controller were not on-die .We've actually got a pretty good reference point with SNB and Gulftown; accouting for the larger L3 cache and extra QPI link, Gulftown checks in at 240mm^2, though TDP is higher than SNB thanks to the extra cores. The same applies to Ivy Bridge. If Intel took away the graphics, or even kept the same die size as SNB, a hex-core would be more or less given. Instead, Intel has chosen to boost the graphics and decrease the die size.
Subjectively, this is not a bad decision. Intel needs to increase graphics performance, and will do just that in IVB. Intel's IGP solutions account for over 50% of the PC marketshare, yet the graphics are their Achilles' Heel. All modern laptops have integrated graphics (though many still opt to go discrete-only or use switchable graphics), and having more CPU cores isn't that useful if your system will be severely handicapped by a weak GPU. We've also shown in numerous articles how hex-core scaling over quad-core is largely unnecessary on desktop workloads (more on this below). Increasing the graphics' EU count and complexity while also adding CPU cores would have led to a larger than ideal die, not to mention the increased complexity and cost. Remember, Moore's Law was more an observation of the ideal size/complexity relationship of microprocessors rather than pure transistor count, and smaller die sizes generally improve yields in addition to being less expensive.
While six cores is obviously 50% more than four cores, the increase in cores isn't proportional to the increase in performance. More cores put off more heat and hence clock speeds must be lower, unless the TDP is increased. Intel couldn't have achieved the 77W TDP at reasonable clock speeds if Ivy Bridge was hex-core. On top of that, there is still plenty of software that is not fully multithreaded or fails to scale linearly with core count, so you would rarely be using all six cores (plus six more virtual cores thanks to Hyper-Threading). More cores will only help if you can actually use them, while higher frequencies universally improve performance (all other things being equal). We can give some clear examples of this with a few graphs from our Sandy Bridge E review.
Photoshop is a prime example of software that has limited multithreading. We used the older CS4 in our tests, but CS5 isn't any better, unfortunately. Photoshop can actively take advantage of four threads, and thus the hex-core i7-3960X isn't really faster than quad-core i7-2600K. The slight difference is most likely due to the difference in Turbo (3.9GHz vs 3.8GHz) or the quad-channel vs. dual-channel memory configuration. There are also a few peaks where more than four threads are used, thus i7-2600K is faster than i5-2500K thanks to Hyper-Threading, on top of the extra cache and higher Turbo of course.
In general, games are horribly multithreaded. DiRT 3 is an example of a typical game engine, and adding more cores and enabling Hyper-Threading actually hurts the performance. There are only a handful of games that benefit from more cores, although there are still obstacles to overcome even then (see below).
Civilization V fits in the handful of games that can scale across multiple cores. However, you will still be bottlenecked by your GPU in GPU bound scenarios (like in the second graph), which makes the usefulness of more cores questionable in this case. It's irrelevant whether you get 60 or 120 FPS in CPU bound scenarios if the real gaming performance is ultimately bound by your GPU speed.
The above graphs are biased in the sense that they are for tests where SNB-E is roughly on-par with regular quad-core SNB. However, keep in mind that we are comparing 130W hex-core and 95W quad-core; a 77W hex-core part might need lower clock speeds and could perform worse in limited-threaded tasks (depending on the Turbo speeds of course). In general, tasks like video encoding, 3D rendering, and archiving scale well with additional cores, but how many consumers run these tasks on a day-to-day basis? If you know you will be doing a lot of CPU intensive work that can benefit from additional cores, SNB-E (and later IVB-E) will always be an option--though you'll give up Quick Sync and the integrated graphics in the process. For most consumers, higher frequencies will likely prove far more useful due to the limited multithreading of everyday applications.
There is also the AMD point of view. Bulldozer hasn't exactly been a success story and there is no real competition in the high-end CPU market because of that. Intel could skip Ivy Bridge altogether and their position at the top of the performance charts would still hold. With no real competition, there's no need to push the performance much higher. Four cores is enough to keep the performance higher than AMD's, and reducing the TDP as a side effect is a big plus, especially when thinking about the future and ARM. As another point of comparison with AMD, look at Llano: it's a quad-core CPU that focuses more on improved graphics. For example, the now rather "old" Lynnfield i5-750 (quad-core, no Hyper-Threading) is able to surpass the CPU performance of Llano, but that hasn't stopped plenty of people from picking up Llano as an inexpensive solution that provides all the performance needed for most tasks.
When looking at the big picture, there really aren't any compelling reasons why Intel should have gone with hex-core design for Ivy Bridge. Just like the Sandy Bridge vs. Gulftown comparison, IVB vs. SNB-E looks like a good use of market segmentation. Sure, some enthusiasts will argue that having a quad-core CPU is so 2007, but don't let the number of cores fool you. The only thing that 2007 and 2012 quad-cores share is the core count; otherwise they are very different animals (see for example i7-2600K vs Q6600). It also appears that even without additional cores or clock speed improvements, Ivy Bridge will be around 15% faster clock for clock than Sandy Bridge (according to Intel's own tests; a deeper performance analysis will come soon).
Increasing the frequencies and boosting the clock for clock performance yields increased performance in every CPU bound task, and improving the quality of the on-die graphics helps in other areas. In contrast, increasing the core count only helps if the software has proper multithreading and can scale to additional cores--both of which are easier said than done. Given all of the possibilities, it would appear that Intel has done the right thing, and in the process there's no need to try and convince consumers into believing that they need more cores than they actually do.
Post Your CommentPlease log in or sign up to comment.
View All Comments
Hulk - Monday, December 5, 2011 - linkbut I still wish we'd be seeing a hex IB. Video editing and transcoding is one application where the additional cores would definitely be helpful. But of course as you mentioned, Intel doesn't want to savage it's higher end solutions.
I suppose we'll eventually get to the point where the high end parts will be 8 or more cores and then the mainstream parts can go to 6 cores.
Too bad BD wasn't more competitive...
euler007 - Monday, December 5, 2011 - linkThere probably will be a socket 2011 IB-E, but since they just launched the SNB-E cpus and the new chipset they're being hush about it so that the people with deep pockets that want the best and most expensive toys are not tempted to wait for it.
ncrubyguy - Monday, December 5, 2011 - linkYeah... The George Lucas Extreme Collectors Edition with those last 2 cores enabled.
DigitalFreak - Monday, December 5, 2011 - link"Video editing and transcoding is one application where the additional cores would definitely be helpful"
Maybe, but wouldn't you get the same or better performance for these tasks using Quick Sync?
JarredWalton - Monday, December 5, 2011 - linkPotentially, but there's also a risk of a loss in quality, and to my knowledge the major video editing applications (e.g. Adobe Premiere) don't use it.
plonk420 - Monday, December 5, 2011 - link*old* x264 (i.e. a GOOD encoder) docs say "The quality loss from multiple threads is mostly negligible unless using very high numbers of threads (say, above 16)."
6core/12thread may barely hit that limit, but i doubt it's notable. i'll have to try some tests if i feel motivated to do so later...
video *editing* probably doesn't matter.
Sivar - Wednesday, December 7, 2011 - linkx264's quality loss as threads increase is linear. 8 threads will lose about twice as much quality as 4. I suspect that documentation, if fleshed out, would say something like "When encoding video at a set bit rate, the quality loss from multiple threads is nearly imperceptible until one encodes with, say, 16 threads or so."
The loss is somewhat measurable. If you encode a video in quality mode, only the file size increases.
piroroadkill - Monday, December 5, 2011 - linkQuick Sync has to be the least supported piece of technology around.
I own a 2500K and a Z68 board, but only a couple of amateur level commercial products actually support it.
Tell me when regular x264 supports it.
MonkeyPaw - Monday, December 5, 2011 - linkYou mean an Intel's graphics engine has poor support? :-0
Maybe, just maybe, if Intel would leave QS enabled in ALL its CPUs, we might actually see real support for it. But that's not the Intel business model. It will probably take AMD's copying of the idea and using it across their lineup before we see Intel do it, too, just like we did with CnQ/Speedstep and 64bit.
Zink - Monday, December 5, 2011 - linkThere also isn't a huge demand for QS today. Video on mobile devices is becoming more common but cellular data and video acceleration are becoming better so you can either stream a video or play the full quality 1080p file from your PC. I think the % of consumer that has Quick Sync and transcoded videos or rips them from BD is very tiny. There are plenty of people who do video professionally but the quality issues with QS aren't worth it for professional applications. Maybe if QS could be integrated into video editing software to be used to generate the video previews and then use standard encoding for the final product.