Roboticles writes "Tweakers.net paid a visit to Intel's laboratories in the California town of Folsom, the birthplace of the 45nm CPU. We spoke to lead architect Stephen Fisher about the development of the Penryn chip and the day the first A0 version arrived. We were shown the machinery used to test and patch the 45nm processor, which is currently being manufactured in Arizona for release next month."
I thought the "TickTock" process of developing a technology two different ways was a really neat innovation. Few businesses would dare double their research just to reduce their risks. I wonder if a similar method is used in other industries.
Imagine if Microsoft did it? Maybe we wouldn't end up with things like ME or Vista:)
I wonder if there's a competitive spirit between the teams.
Windows ME *was* a sort of Microsoft "Tick-Tock" (annoying new buzzword) development: they had the conservative development line (Win 95, 98, ME) and the "New Tech" team (NT 3, NT 3.5, NT 4, 2000, XP, Vista).
Unfortunately ME development was hindered by the "Ballmer Peak".. http://xkcd.com/323/ [xkcd.com]
As a side thought, how far does light travel between clock ticks at 7 GHz? I make it about 4 cm.
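For anyone who wants to check that arithmetic, here is a quick sketch. It assumes the vacuum speed of light; signals inside a real chip propagate more slowly, so treat the result as an upper bound. The function name is purely illustrative.

```python
# Distance light covers in one clock period, assuming propagation in vacuum.
SPEED_OF_LIGHT_M_PER_S = 299_792_458  # exact, by SI definition


def light_distance_per_cycle_cm(clock_hz: float) -> float:
    """Return how far light travels (in cm) during one period of clock_hz."""
    metres_per_cycle = SPEED_OF_LIGHT_M_PER_S / clock_hz
    return metres_per_cycle * 100.0


distance_cm = light_distance_per_cycle_cm(7e9)  # 7 GHz
print(f"{distance_cm:.2f} cm per clock")
```

That comes out at roughly 4.3 cm, so "about 4 cm" holds up.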
Personally I'd have considered the 9X and ME teams to be the new technology folks, and the NT teams to be the conservative lot, given how NT had to be the stable, business-orientated one. Look at how long it took for DirectX to be supported on the NT platform. Games on NT? Sure...
We all know BillG said NT stands for New Technology, but that was purely a marketing term. Underneath it really is the home user that gets the raw end of the deal when it comes to trying out new technologies.
95/98/ME was just a natural progression and continuation of Win 1.0, Win 2.0, Win 3.1, Win 3.11... NT, on the other hand, _WAS_ new technology and is a real, fully preemptive, multi-CPU operating system. DirectX wasn't supported on NT because there was no business reason for it: NT wasn't targeted at the home market, and workstations and servers don't really need 3D.
> Few businesses would dare double their research just to reduce their risks.
It's a matter of affordability, and Intel can afford to have two rival design teams working in parallel on the same project. While this strategy proves to be a hedge on a risky bet, it has its disadvantages. Firstly, the design team can tend to slack off as it realizes that failure will not mean the death of the company. Secondly, they end up competing with each other instead of the competition, which can lead to some unhealthy internal politics, and it DOES! Intel managed to rise above the internal backbi
This has been used across the entire industry, just without a cute catch-phrase for it, and without pushing for this quick an improvement to process technology.
Think about it: you would see the overall design come out, then that same design would be released on an improved process (going from 90 to 65nm, for example). The design would be the same, just on an improved process that would allow for faster versions of that design.
AMD has done it as well to an extent, but the high-end processors in the K8 generation are still on 90nm while the lower-clocked chips are at 65nm. Intel has more resources, so it can throw more of them at fab process improvements while keeping the same number focused on the overall CPU design.
Now, there are some disadvantages to Intel's method of approaching CPU innovation, including not looking for other ways to improve system performance. Think about it, AMD was able to do well due to the integrated memory controller and HyperTransport with a much smaller amount of cache. Even with these elements, will Intel come out with anything really NEW that will improve overall system performance?
So, Intel may hold the lead in terms of performance, or the AMD K10 architecture may allow AMD to catch back up. Either is possible at this point, and AMD is also working on things like adding some GPU functionality to their processors (Fusion being the first example of this). Even if the GPU power on the CPU is limited, it may add to the graphics processing power of an add-in video card to give an edge in performance. Sure, Intel may be the platform for those who run MS Office, but for those who want some graphics power, AMD may end up with a clear advantage.
Tick-Tock is just an Intel way of saying they will do the same thing they always have, just pushing out improvements faster. AMD is focused more on figuring out ways to do things better because they can't keep up in a straight MHz competition, or on a straight fab process competition.
Tick-Tock isn't just a name for something everyone else already does. Previously there was no such regimentation in how semiconductor companies released their products, whereas Tick-Tock is very regimented and tightly scheduled. You might have had 2-3 major arch changes on the same process before a new process was used, or a new process and a new arch rolled out at the same time (or very close together).
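To make the regimentation concrete, here is a toy model of the cadence as I understand it: each process node gets a "tick" (the existing architecture shrunk onto the new process) followed by a "tock" (a new architecture on that same process). The `Release` type, the function name, and the generation numbers are all made up for illustration, and the node list is just an example input, not Intel's roadmap data.

```python
from dataclasses import dataclass
from typing import Iterator, List


@dataclass(frozen=True)
class Release:
    step: str             # "tick" = process shrink, "tock" = new architecture
    process_nm: int       # process node used for this release
    arch_generation: int  # hypothetical counter, bumped on every tock


def tick_tock(process_nodes: List[int], start_arch: int = 1) -> Iterator[Release]:
    """Yield releases that alternate between a shrink and a new design."""
    arch = start_arch
    for node in process_nodes:
        # Tick: the current architecture moves to the new process unchanged.
        yield Release("tick", node, arch)
        # Tock: a new architecture arrives on the now-proven process.
        arch += 1
        yield Release("tock", node, arch)


releases = list(tick_tock([65, 45, 32]))
for r in releases:
    print(f"{r.step}: {r.process_nm}nm, architecture generation {r.arch_generation}")
```

The point of the model is the invariant: no release changes both the process and the architecture at once, which is exactly the risk split being discussed here.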
AMD has introduced fab process improvements and applied them to the "current" designs. Across the industry, it has been unusual to release both a process improvement and a major design change at the same time. Sure it may happen, but you very often see "new design on current process technology", then later down the road the new process technology is used to improve the designs of current products, followed by a new design on the process technology of the time. Tick-Tock speeds the push toward the newer t
> Now, there are some disadvantages to Intel's method of approaching CPU innovation, including not looking for other ways to improve system performance.
Ummm... didn't read the article, did you? Tick-Tock was created precisely to address the issue of more aggressive CPU innovation. One ADVANTAGE of Intel's method is CPU innovation. The Nehalem team can look at all sorts of crazy new innovations without wondering if it will fail on actual silicon. The Penryn team will let the Nehalem team know where the trouble
It's not new, except for Intel actually giving it a name and using the name and the procedure for marketing purposes. The DEC Alpha team did it this way, for example.
This is nothing new at Intel or any comparable semiconductor manufacturer that owns its own fabrication plants. Bob Colwell discussed this in a presentation he gave at Stanford. Intel has separate design teams to handle new designs and refinements to existing designs. The latter teams are often linked with process technology or fabrication plants, because it is very expensive to have a new process become available for production while having nothing ready to take advantage of it.
That depends -- front seat or back seat, and were you naked or at least partially clothed, was it a threesome, and, most importantly, were there Eastern European women involved?
I hear the CPU coming, it's rolling off the press
And I ain't seen this performance since the 90nm process.
I'm stuck in Folsom Labs, and the clocks keep running faster,
But that deadline keeps on coming, from that Santa Clara.
When I was just a junior, my mentor told me, "Look,
Always be a good engineer, don't ever push your clock"
But I overclocked a CPU just to watch it die.
When I heard that core blowing, I hung my head and cried.
I bet those folks at AMD in their fancy die package
Are probably overclocking 'til
Wow. This all sounded very cool, and gave me a lot more faith in Intel. Until I realised that they hadn't once mentioned testing on Linux. Do they really ignore every real OS except Windows (and probably Mac, I guess)? :/
> Until I realised that they hadn't once mentioned testing on Linux.
Just because one article or press release was light on details, doesn't mean that it didn't happen. Here is what you seek. Intel did mention testing on Linux and some other operating systems.
No, Linux users run the gamut, including lots of purchasers of new servers and of very high-end clusters of many processors (hint: more than you can put into any Windows box).
> Merom team, that had managed to boot Windows on the A0 version of the Core 2 Duo in under thirty minutes... Penryn worked, but it took six hours to get Windows to boot properly on it
Quite obviously a software problem. Now if they had used Linux...
> Most people who visit the Californian town of Folsom, which lies at a two hour drive to the northeast of San Francisco, go there because it is situated close to the beautiful Lake Tahoe and some of the skiing areas in the Sierra Nevada mountain range.
Maybe it looks close if your home is in the Netherlands, but not in actual fact. No one goes to Folsom for the lake or the snow skiing (water skiing is another story). Folsom is almost at sea level; Lake Tahoe is at 6,220-something feet, and 120 miles away.
Intel processors aren't made in China. Look at them some time, they all cite their point of origin. It moves around depending on generation, they'll be upgrading some fabs and as such making no processors there, or they'll retask fabs to other things like embedded processors and so on. However they don't have a single fab in China. A good bit of them are in America, but they also have one in Ireland, a couple in Israel and so on. The one in question here, Fab 32, is located in Chandler, Arizona which is one
ASML in Veldhoven, the Netherlands, holds about 80% of the market and has been manufacturing 45 nm wafer steppers for some time already (news article, 2005 [nsti.org]). Intel is one of their customers, so actually Veldhoven is the birthplace of the 45nm processors... At the moment they are down to 32 nm [asml.com] already...
Scaling is not driven by lithography anymore. It is driven by advances in materials science. Intel introduced hafnium-based gate dielectrics at 45nm, which is an incredibly impressive feat given that this puts them 1-2 years ahead of all other companies.
Scaling by lithography was in the '80s and early '90s.
In non-benchmarks, it's a win because it's a compression for executable code.
It's only a win if your execution is bottlenecked by instruction bus bandwidth. That only happens if you're thrashing your L1 instruction cache, and THAT only happens with horribly bloated software and/or horribly small L1 caches.
While it's a good compression of executable code, it's good compression of x86 code. Other ISAs manage to pack way more into their instructions in the first place. Plus, the random alignment of x86 instructions means that the pipeline is elongated by a couple of stages just to f
"But it doesn't matter that you have to use 8 instructions to perform the same thing other arch's do in 1 opcode, because the microcode is really, really, really fast!!1"
Actually, you have it backwards. The x86 can do a handful of RISC instructions with a single instruction. That instruction might take longer to execute, but since you get more done for that one instruction, you get better instruction cache locality.
If you would like to troll on the failings of x86, there are well documented options for you. You must earn your troll-fu, young grasshoppa.
> Actually, you have it backwards. The x86 can do a handful of RISC instructions with a single instruction. That instruction might take longer to execute, but since you get more done for that one instruction, you get better instruction cache locality.
On all the x86 architectures I know, if you use any instruction which gets microcoded, you end up with a huge performance hit. You basically run single pipelined until the microcode ends. These days, both Intel and AMD datasheets highly recommend you use simple form instructions as much as possible.
Yes, there comes a point where instruction cache locality matters more than instructions per clock. Your average bloated GUI app would benefit more from optimise-for-size than optimise-for-speed, for example (
In any case, the point is that other architectures CAN perform far better than x86 without the variable opcode length, prefixing and other nonsense. They don't produce large amounts of executable either.
How many of them can natively run code written for the 16-bit variant of their line that was produced 30 years ago?
While most would claim that native 8086 and heck even 32-bit pmode support really isn't needed in the day and age of "long mode," dropping them would be a huge pain and at that point you might a
I'm not saying it always works. I'm not saying it is a good thing. If I had the free option, I would be using Alpha or Itanium.
Instruction length is, of course, only one factor, and I am guessing a minor factor, considering memory bandwidth, data cache associativity, floating point performance, and performance and flexibility of VM management seem to be more serious problems with current x86 implementations.
Oh stop it. The Grandparent Troll is right. The "CISC instructions running from the instruction cache into a RISC core runs really really fast" crowd always conveniently neglects to mention that the half-assed, non-orthogonal, non-aligned instructions required a bus cycle to do the instruction fetch. All those pipeline stages are there to cover for the lame instruction set.
I'm not saying it is good. I'm saying he is wrong. I'm all for a healthy bashing of x86, but let's keep our facts straight, please.
And besides, there is no level playing field for comparison when that RISC core is so lame, anyway. The classic example being that 800MHz Alphas used to wipe the floor with 2 GHz Pentium 4s on floating point.
I don't think there's an architecture out there that deals with unaligned instructions effectively. It's hardly a problem that solely exists with x86. The "encouraged to be used" instructions, however, do run very very fast and really, if the compiler is written to favor them, what's the problem?
AMD's new chip is out - it's nothing terribly special. Or do you mean Phenom? Logic would dictate that since Barcelona isn't anything special (considering it's competing with a 1 year+ older chip), that a desktop chip based on this arch will not dominate since it competes with the same Intel arch on the desktop. I'm not saying Barcelona sucks - it's competitive (at least), but generally you'd kind of want a chip with a 1 year newer arch to beat its competition.
Need a magnifying glass (Score:3, Funny)
Re: (Score:2, Funny)
TickTock (Score:4, Insightful)
Re:TickTock (Score:4, Interesting)
Re:TickTock (Score:5, Insightful)
wow, 2 ACs?! (Score:2, Funny)
An iPod nano is 0.27 in thick.
0.27 in * (0.0254 m / 1 in) * (1 nm / 10^-9 m) = 0.006858 * 10^9 nm = 6.9 * 10^6 nm = 1 iPod nano
45 nm * (1 iPod nano / 6.9 * 10^6 nm) = 6.5 * 10^-6 iPod nanos
That would be 0.0000065 iPod nanos to the 45 nanometer chip technology.
I knew those chemistry conversion factors would come in helpful some day!!!
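If you would rather let a computer do the conversion factors, here is a small sketch of the same calculation. The 0.27 in thickness is the figure quoted above, and the constant and function names are just illustrative.

```python
# Convert an iPod nano's thickness to nanometres, then express 45 nm in "iPod nanos".
METRES_PER_INCH = 0.0254  # exact, by definition of the inch
NM_PER_METRE = 1e9


def inches_to_nm(inches: float) -> float:
    """Convert a length in inches to nanometres."""
    return inches * METRES_PER_INCH * NM_PER_METRE


ipod_nm = inches_to_nm(0.27)    # thickness figure taken from the post above
ipod_nanos_per_45nm = 45 / ipod_nm

print(f"1 iPod nano = {ipod_nm:.3e} nm")
print(f"45 nm = {ipod_nanos_per_45nm:.2e} iPod nanos")
```

Without the intermediate rounding to 6.9 * 10^6 nm, the ratio comes out nearer 6.6 * 10^-6, which is close enough for a joke.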
new prison job (Score:2)
Intel's laboratories in the California town of Folsom, the birthplace of the 45nm CPU
So that's what they make those software CEOs do in prison after back-dating stock options...
No more making license plates I guess!
ObJohnnyCash (Score:2)
"I once overclocked a CPU / just to watch it die..."
Folsom Labs Blues (Score:2)
No Linux testing? (Score:2)
Re:No Linux testing? Keep looking.. (Score:3, Informative)
Just because one article or press release was light on details, doesn't mean that it didn't happen. Here is what you seek. Intel did mention testing on Linux and some other operating systems.
http://enthusiast.hardocp.com/article.html?art=MTI2OCwxLCxoZW50aHVzaWFzdA== [hardocp.com]
"During a press briefing earlier today, Intel stated that the very first 45nm processor was already up and running and used by the Intel validation team to successfully boot a te
Re: (Score:2)
heh (Score:2)
Postal Service (Score:1)
Get the clip and you'll understand why.
It's not the hardware... (Score:1)
Geographical correction (Score:2)
Manufacturing in Arizona? (Score:2)
Re: (Score:2)
There are a lot of wafer fabs in China. SMIC being the biggest company. The number is likely to increase.
nothing new (Score:2)
Dear Intel and/or AMD (Score:2)
However my new PC is still slow as hell and it doesn't feel any faster than the old one.
Re: (Score:2, Insightful)
Yeah, it only conquered the world. :)
Re: (Score:2)
Not that any of it is really on topic