Sun Unveils Direct Chip-to-Chip Interconnect
mfago writes "On Tuesday, September 23, Sun researchers R. Drost, R. Hopkins and I. Sutherland will present the paper "Proximity Communication" at the CICC conference in San Jose. According to an article published in the NYTimes, this breakthrough may eventually allow chips arranged in a checkerboard pattern to communicate directly with each other at over a terabit per second, using arrays of capacitively coupled transmitters and receivers located on the chip edges. Perhaps the beginning of a solution to the lag between memory and interconnect speed versus CPU frequency?"
No registration (Score:5, Informative)
Link via Google (no Reg. Required) (Score:5, Informative)
Re:IANAEE (I am not an electrical engineer) (Score:3, Informative)
The DEC PDP-11/03, aka LSI-11, was implemented as a multi-chip (4 + 1 ROM) CPU. The five chips were placed right next to each other.
This chip set was also set up by others with the UCSD Pascal "p-code" as the instruction set.
Other CPUs in the series had an MMU and additional instructions in additional chips.
Re:IANAEE (I am not an electrical engineer) (Score:2, Informative)
from the HyperTransport FAQ [hypertransport.org]
"6. What is the current specification release?
The current HyperTransport Technology Specification is Release 1.05. It is backward compatible to previous releases (1.01, 1.03, and 1.04) and adds 64-bit addressing, defines the HyperTransport switch function, increases the number of outstanding concurrent transactions, and enhances support for PCI-X 2.0 internetworking."
Hard to say what's new here (Score:2, Informative)
The article immediately made me think of multi-chip modules. Multi-chip modules are an idea that never really caught on in the industry (except at IBM), and I'm not sure how Sun's innovation isn't just a variation on that idea. Multi-chip modules have failed on cost, since so much has to go right to get a module that works.
Any practical chip-to-chip connectivity scheme had better have a good rework scheme. If it doesn't, it's just boutique technology that will not affect the industry overall.
Having worked on chips with multi-gigabit pins, a huge problem is resynchronizing the signals. Creating a receiver that aligns one pin's data with its 15 neighbors at 3 GHz takes far more logic space on the die than a small driver (or receiver). The auxiliary logic basically makes shrinking the final driver FET almost meaningless.
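The alignment step described above can be sketched in software. A toy Python model (the lane count, training word, and skew values are made up for illustration, not from any real receiver design) that deskews parallel lanes by locating a shared training pattern in each lane's stream:

```python
import random

# Each lane transmits the same training word, but arrives with its own skew.
TRAINING = [1, 0, 1, 1, 0, 0, 1, 0]  # 8-bit training word (illustrative)

def make_lane(skew, payload):
    """A lane's received stream: `skew` idle bits, then the training word, then data."""
    return [0] * skew + TRAINING + payload

def find_alignment(stream):
    """Return the offset where the training word begins in this lane, or None."""
    for i in range(len(stream) - len(TRAINING) + 1):
        if stream[i:i + len(TRAINING)] == TRAINING:
            return i
    return None

random.seed(0)
payload = [1, 1, 1, 1]
lanes = [make_lane(random.randint(0, 5), payload) for _ in range(16)]
offsets = [find_alignment(lane) for lane in lanes]

# After alignment, every lane yields the same payload despite differing skew.
recovered = [lane[off + len(TRAINING):off + len(TRAINING) + len(payload)]
             for lane, off in zip(lanes, offsets)]
assert all(r == payload for r in recovered)
```

In real silicon this search happens in per-lane elastic-buffer hardware running at line rate, which is exactly the auxiliary logic that dwarfs the driver itself.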
Modern chip design is a constant trade-off between features and cost. And what's cheap is what everyone has been doing for years (or is an evolution of that).
Re:FINALLY! (Score:3, Informative)
Another problem is that the speed of memory itself isn't that great unless you want to spend a _lot_ of money, to the tune of $50-$100 per megabyte as we see in advanced processor caches. And the faster it is, the more power-inefficient it becomes, perhaps to a sizeable fraction of a watt per megabyte.
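Scaling those numbers up is sobering. A back-of-envelope in Python (the $/MB range is from the comment above; the 0.25 W/MB figure is an assumption standing in for "a sizeable fraction of a watt"):

```python
# What 1 GB of main memory would cost and burn if built from cache-grade SRAM.
MB = 1024                          # megabytes in 1 GB
cost_low, cost_high = 50 * MB, 100 * MB
power = 0.25 * MB                  # watts, assuming 0.25 W per megabyte
print(f"1 GB of cache-speed memory: ${cost_low:,}-${cost_high:,}, ~{power:.0f} W")
```

That is fifty to a hundred thousand dollars and a couple hundred watts for a single gigabyte, which is why caches stay small.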
Re:IANAEE (I am not an electrical engineer) (Score:5, Informative)
Sun's technology is not simply soldering the pins directly together (as you suggest), which would be effectively the same thing as wiring through a circuit board. The high-speed, low-drive-strength, low-voltage internal signals have to go through pads that convert them into slower, high-drive-strength, high-voltage signals that will yield a reliable connection to the next chip. I'm not an expert in this area, but physics just gets in the way: there are capacitive issues and interconnect delay issues.
Sun is claiming to use capacitive coupling (put the pins really close together, but don't physically connect them). This way they don't have to drive the external load of the pin/board connection, and they claim they will be able to scale this down to a pad that can switch faster than existing physically wired pins. Which means they believe they can make this technology work with lower drive strengths.
They still have a ways to go. Notice that the P4 has faster connections using existing technology. Sun did a proof of concept and claims they can speed it up 100x, so they haven't _proved_ that this will operate faster yet. They still have many things to overcome to make this viable, including a mass production/assembly process. It's going to be a few years. At least.
Re:IANAEE (I am not an electrical engineer) (Score:2, Informative)
L1 cache typically found on today's processors and DRAM are two different things with different design targets. Pick up a VLSI book.
More on the broader project (Score:4, Informative)
Working prototype computer about six years away, according to the article.
Transputer dusted off and presented as new? (Score:2, Informative)
Re:IANAEE (I am not an electrical engineer) (Score:5, Informative)
Sending fast edges over a bus is difficult because the signal degrades along the way.
If your dataset fits into the cache well, which is often the case for PCs, then a cache can fix most of your problems. If you're dealing with datasets that span gigabytes or terabytes and your application can't be subdivided such that processing and memory can be constrained per cpu then your cache doesn't assist you very much.
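The effect is easy to demonstrate with a toy LRU cache model (the sizes and the uniform access pattern are invented for the demo):

```python
from collections import OrderedDict
import random

def hit_rate(cache_lines, working_set, accesses=20000, seed=1):
    """Simulate an LRU cache of `cache_lines` entries under uniform random
    accesses spread over `working_set` distinct addresses."""
    rng = random.Random(seed)
    cache = OrderedDict()
    hits = 0
    for _ in range(accesses):
        addr = rng.randrange(working_set)
        if addr in cache:
            hits += 1
            cache.move_to_end(addr)        # refresh LRU position
        else:
            if len(cache) >= cache_lines:
                cache.popitem(last=False)  # evict least-recently-used line
            cache[addr] = True
    return hits / accesses

small = hit_rate(cache_lines=1024, working_set=512)    # fits in cache
large = hit_rate(cache_lines=1024, working_set=65536)  # spills far past it
print(f"fits: {small:.2f}  spills: {large:.2f}")
```

Once the working set spills past the cache, the hit rate collapses to roughly cache_size/working_set, and every miss pays the full memory latency.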
Eerm... weren't they called Transputers back then? (Score:2, Informative)
I remember seeing the first Transputers on my very first CeBIT visit sometime in the early 90s. The Transputer workstations would crunch full-screen fractal graphics in seconds, which was an amazing feat back then. Just plain *everybody* was convinced they would put the then-ruling Amiga to rest or - also a popular theory back then - would be adopted by Commodore. There is this Transputer programming language, Occam, that, as far as I can tell, makes Java, C# and all the rest look like kiddiecrap. Everyone I know who knows Occam says it rules and usually also has the skills to prove it.
The overall concept - very much like the one Sun is talking about now - was to stick in a CPU, or 2 or 10, and make the box faster with nearly no decline in the performance/processor ratio. It actually did work that way.
Transputers never made it though; too expensive, and the required software development was too esoteric back then. It would be really nice to see this concept rise again. Maybe now they actually would be affordable.
Already been done with SERDES (Score:5, Informative)
In fact, multichannel SERDES is the next real interconnect technology. It's used in Infiniband, HyperTransport, PCI Express, Rambus RDRAM and in 10 Gb/s Ethernet (usually as 4x3.125Gbit/s channels as a XAUI interface between optical module and switch fabric silicon with 8b/10b conversion). There are even variants, such as LSI Logic's HyperPHY, that are deployed specifically for numerous high-bandwidth chip-to-chip interconnections. The problem that is cropping up is that the traditional laminate PCBs are becoming the limiting factor in increasing per-channel connectivity, to the extent that 10Gbit/s per channel speeds are next to impossible on these boards due to the lack of signal integrity. There has been some experimentation for very short hops on regular boards, as well as using PTFE resins to manufacture the boards themselves, but it's precarious at best.
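The XAUI numbers above check out with simple arithmetic:

```python
# Four 3.125 Gbit/s lanes with 8b/10b coding carry exactly 10 Gbit/s of
# payload, since 2 of every 10 line bits are coding overhead.
lanes = 4
line_rate = 3.125e9          # bits/s per lane on the wire
payload_fraction = 8 / 10    # 8 data bits per 10-bit 8b/10b symbol
payload = lanes * line_rate * payload_fraction
print(f"{payload / 1e9:.0f} Gbit/s")  # → 10 Gbit/s
```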
As for Sun's technology, it's interesting but I don't know how much it will catch on or how feasible it will be. It creates packaging issues and requires good thermal modelling and 3-D field modelling to account for expansion and contraction through the operating temperature range and the presence of nearby signals, which could affect the integrity of the signals.
Re:Hard to say what's new here (Score:3, Informative)
Definitely. That would be electromagnetic coupling. Sun's using capacitive coupling, using only the E field. Last week we saw an article on a company using inductive coupling (magnetism) for short-distance data links (in their first product, a wireless earset).
EM is long-range (drops according to the inverse square) but very hard to convert to and from electricity.
M (the magnetic field) is short-range (inverse sixth power), relatively easy to convert, not easy to interfere with, but bulky and directional.
E (the electric field) is short-range (inverse sixth), slightly harder to convert, not bulky, but easily interfered with.
Sun's choice here is perfect: this application doesn't need (or want) the range of EM, and can't afford the mass and volume of an inductor. OTOH, the ease of interference is easily dealt with because once we know the geometry and composition of the board, we know the shapes the e-fields will have.
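The falloff laws above are easy to tabulate (distances are arbitrary units relative to a reference gap, purely for illustration):

```python
# Radiated EM power drops as 1/r^2; near-field (capacitive or inductive)
# coupling drops roughly as 1/r^6, per the exponents in the parent comment.
def relative_strength(exponent, r, r0=1.0):
    """Coupling strength at distance r, relative to reference distance r0."""
    return (r0 / r) ** exponent

for r in (1, 2, 10):
    em = relative_strength(2, r)    # far-field, inverse square
    near = relative_strength(6, r)  # near-field, inverse sixth power
    print(f"r={r:>2}: EM {em:.6f}  near-field {near:.8f}")
```

Doubling the gap costs a near-field link 64x in strength versus 4x for a radiated one, which is why capacitive coupling only works when the pads are almost touching.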
I really like the fact that we've had two nicely orthogonal stories so close together.
-Billy
Future Applications (Score:1, Informative)
I/O limitations of traditional chip architectures prevent us from building truly large-scale hardware neural network systems. To achieve the connectivity required to model a net as complex as the human brain's, it's not enough to link up an array of small neural chips, because you hit a bandwidth bottleneck as soon as you try to go off-chip. This limits neural architectures to simple, regular block-structured models.
These chips of Sun's only meet up at the edges, but (assuming advances in reduced power usage and heat dissipation technologies) imagine if this was extended to provide connectivity on all exterior surfaces of the package? You could build neural networks of arbitrary size that weren't I/O bound.
This would enable truly "brute force" approaches to connectionist AI, and quite possibly something capable of human-level intelligence in real time.
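The off-chip bottleneck has a simple geometric root: compute scales with die area while edge I/O scales with the perimeter. A quick sketch (grid sizes are arbitrary, not from the Sun paper):

```python
# On an n x n grid of neural units, compute grows as n^2 but edge-only I/O
# grows as 4n, so bandwidth per unit of compute shrinks as the chip grows.
def io_per_compute(n):
    compute = n * n   # units on the die
    edge_io = 4 * n   # channels available along the four edges
    return edge_io / compute

for n in (10, 100, 1000):
    print(f"n={n:>4}: edge I/O per unit = {io_per_compute(n):.4f}")
```

Connectivity on all exterior surfaces would add a face term that scales with area rather than perimeter, which is what keeps a large network from being I/O bound.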
Re:Chip to Chip technology? (Score:3, Informative)
No, a trace is a flat wire stuck to (or etched from) a printed circuit board. This invention (process, really; see below) obviates the need for PCBs between (at least some of) the chips. A lead is a wire not stuck to a PCB, such as the input connections to most oscilloscopes and test equipment.
I don't get it either. You want to make memory access faster and faster, so you put it closer and closer to the cpu. Eventually the bus length reaches 0, as the two chips are physically adjacent. So what?
As with many great inventions, the difficulty is not so much thinking of what needs to be done, but actually doing it cost-effectively. System designers have long pursued the idea of optimized interconnect (sometimes called "integration", as in LSI, VLSI, etc.), but it has remained cost-prohibitive in most cases. Notable exceptions include the Pentium Pro and some ATI Mobility products, but those were more desperation moves than anything: margins drop on multi-chip "chips", and they had to do it to get the needed result even though the costs were higher than normally tolerable.
So, sure, light bulbs are obvious, as are cars, space shuttles, computers, etc. The hard part is making them possible technically and economically.
Hope that helps you two understand why the ASIC-design industry is pretty damn excited and anxious to license this technology (if we really can do this as cheaply as they claim).