Iphone Hardware Apple

iPhone 5 A6 SoC Teardown: ARM Cores Appear To Be Laid Out By Hand

Posted by Unknown Lamer
from the museum-of-modern-art-accepting-vlsi-layouts dept.
MrSeb writes "Reverse engineering company Chipworks has completed its initial microscopic analysis of Apple's new A6 SoC (found in the iPhone 5), and there are some rather interesting findings. First, there's a tri-core GPU — and then there's a custom, hand-made dual-core ARM CPU. Hand-made chips are very rare nowadays, with Chipworks reporting that it hasn't seen a non-Intel hand-made chip for 'years.' The advantage of hand-drawn chips is that they can be more efficient and capable of higher clock speeds — but they take a lot longer (and cost a lot more) to design. Perhaps this is finally the answer to what PA Semi's engineers have been doing at Apple since the company was acquired back in 2008..." Pretty picture of the chip after using an Ion Beam to remove the casing. The question I have is how it's less expensive (in the long run) to lay a chip out by hand once instead of improving your VLSI layout software forever. NP classification notwithstanding.

  • by Anonymous Coward on Tuesday September 25, 2012 @06:43PM (#41457051)

    That must be a very fine tipped resist pen...

    • by Anonymous Coward on Tuesday September 25, 2012 @06:51PM (#41457173)

      Yeah I bet their ARMs are tired after making that.

      • Wish I had mod points...

        Although I'm not sure which way I'd mod you - I laughed and groaned at the same time.

      • by Taco Cowboy (5327) on Tuesday September 25, 2012 @08:47PM (#41458343) Journal

        The question I have is how it's less expensive (in the long run) to lay a chip out by hand once instead of improving your VLSI layout software forever.

        No matter how much VLSI layout software improves, its output can't match a layout done by hand by people who know what they are doing.

        VLSI layout software is like a compiler. The final compiled code relies on two factors - the source-code input and the built-in "rules" of the compiler.

        A similar case is in software programming - The source code from a so-so programmer compiled by a very very good compiler will result in a "good-enough" result.

        It's good enough because it gets the job done.

        However, a similar program by an expert Assembly Language programmer would have left "good enough" behind because the assembly language programmer would know how to tweak his code using the most efficient commands, and cut out the "fat" by optimizing the loops and flows.

        • by CastrTroy (595695) on Tuesday September 25, 2012 @09:30PM (#41458789) Homepage
          I think you underestimate how good compilers have become. Also, the "expert assembly language programmer" probably would work at 1/100 the pace of a programmer in something more high level like C++. It would probably be next to impossible to write an entire modern operating system, web browser, or word processor in assembly language. Sure for some very small sections of code you can optimize at the assembly level, but you can't write a whole program in assembly. Also, if a person can recognize some optimization, then that optimization can be added to the compiler, which means that a compiler can probably always at least come within a very close margin of where the human could get, and it could probably do better, because a compiler can remember a lot more optimizations than any human can.
          • by Taco Cowboy (5327)

            I think you underestimate how good compilers have become

            Nope.

            I know how good compilers have become - especially compilers from the makers of the particular processor the program is supposed to run on.

            But I guess you may have missed the "so-so programmer" I've mentioned.

            Even the top-line compiler can't produce a top-notch program if it has been fed source code by a so-so programmer.

            There are a lot of ways to write programs.

            From the so-so programmers, the source code reads like a bowl of bland noodles.

            But from a top-notch programmer, he or she would know how to structure

          • by Pseudonym (62607) on Tuesday September 25, 2012 @10:16PM (#41459177)

            I think you underestimate how good compilers have become.

            I think you may have misunderstood the realities of what a modern expert assembly language programmer does.

            An expert assembly language programmer knows when to write assembly language and when not to write assembly language. Assuming that raw performance is the metric by which programmers are judged (which isn't necessarily the case), an expert assembly language programmer will still win over any high-level language programmer because they can also use the compiler.

            It's the same with hand-laid-out VLSI. It's not like some team of hardware designers placed every single transistor. That would cause just as much of an unmaintainable mess as writing a large application entirely in assembly language. Rather, the hand-layout designer worked in partnership with the automated tools.

            • by stevew (4845) on Wednesday September 26, 2012 @09:56AM (#41463473) Journal

              Okay - I'm stepping in here because I actually do chip design for a living. The difference between hand laid-out and machine generated chips can be as much as a 5X performance difference. The facts are that physical design isn't the same as compiler writing. It's a harder problem to crack - first it's a multi-dimensional problem. Next, it has to follow the laws of physics, themselves complicated ;-)

              Both processes DO rely on the quality of input. When my designs don't run fast enough, the likely fix is to go back to the source and fix it there instead of trying to come up with some fix within placement and routing. The other simple fact is that in timing a physical design - you have to consider EVERY path that the logic takes in parallel. There is no such thing as the "inner-most" loop of the algorithm for determining where the performance goes. Finally, once you have a good architecture for timing, the placement of the physical gates dominates the process.
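
              The "every path in parallel" point can be sketched with a toy timing pass. This is only an illustration under made-up assumptions: the gate struct, the delays and the netlist are hypothetical, and real timing tools also model wires, slew and much more.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_FANIN 2

/* Hypothetical gate: a fixed delay plus up to two driving gates.
 * Gates are listed in topological order, so inputs have lower indices;
 * -1 marks a pin that is fed by a primary input instead of a gate. */
struct gate {
    int delay_ps;
    int fanin[MAX_FANIN];
};

/* Fills arrival[] for every gate and returns the worst arrival time,
 * i.e. the length of the critical path through the whole netlist. */
int critical_path_ps(const struct gate *g, size_t n, int *arrival)
{
    int worst = 0;
    for (size_t i = 0; i < n; i++) {
        int latest_input = 0;
        for (int k = 0; k < MAX_FANIN; k++) {
            int src = g[i].fanin[k];
            if (src < 0)
                continue;
            assert((size_t)src < i);   /* topological order required */
            if (arrival[src] > latest_input)
                latest_input = arrival[src];
        }
        arrival[i] = latest_input + g[i].delay_ps;
        if (arrival[i] > worst)
            worst = arrival[i];
    }
    return worst;
}
```

              Every gate gets an arrival time, not just the ones in some hot loop, and the slowest chain anywhere sets the clock. Speeding up one gate helps only if it sits on that chain.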

              A human - with their common sense is always going to give better performance than an algorithm. I mentioned a 5X difference between hand-drawn & compiled hardware. That is about what I see on a daily basis between what my tools can do for me, and what Intel gets out of their hand-drawn designs for a given technology node.

              • by Theovon (109752) on Wednesday September 26, 2012 @12:53PM (#41465603)

                I'm a chip designer too (although probably not as good as you are), and one thing I wanted to mention for the benefit of others is that in today's chips, circuit delays are dominated by wires. Delay used to be dominated by transistors, but today a long interconnect in your circuit is something to avoid at all costs. So careful layout of transistors and careful arrangement of interconnects is of paramount importance. Automatic layout tools use AI techniques like simulated annealing to take a poorly laid-out circuit and try to improve it, but even now they're still poor at doing placement while taking routing delays into account. Placement and routing used to be done in two steps, but placement can have a huge effect on possible routing, which dominates circuit delay. Automatic routers try to do their jobs without a whole lot of high-level knowledge about the circuit, while a human can be a lot more intelligent about it, laying out transistors with a better understanding of the wires that will be required for that gate, along with the wires for gates not yet laid out.
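
                For anyone who hasn't seen the idea, here is a rough sketch of annealing-style placement. To stay in integer arithmetic it uses threshold accepting, a simplified cousin of simulated annealing, rather than the usual exp(-delta/T) rule. The netlist, the one-row placement model and all the names are made up for illustration; no real tool works on anything this small.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define N_CELLS 8
#define N_NETS  7

/* Hypothetical netlist: each net joins two cells, named by index. */
static const int nets[N_NETS][2] = {
    {0, 7}, {1, 6}, {2, 5}, {3, 4}, {0, 4}, {1, 5}, {2, 6}
};

/* Cost of a one-row placement: summed distance spanned by every net. */
static int wirelength(const int slot_of[N_CELLS])
{
    int total = 0;
    for (int i = 0; i < N_NETS; i++)
        total += abs(slot_of[nets[i][0]] - slot_of[nets[i][1]]);
    return total;
}

/* Tiny linear congruential generator so runs are reproducible. */
static unsigned next_random(unsigned *state)
{
    *state = *state * 1664525u + 1013904223u;
    return *state >> 16;
}

/* Improves slot_of in place and returns the best wirelength seen. */
int anneal_placement(int slot_of[N_CELLS], unsigned seed)
{
    int best_slots[N_CELLS];
    int cost = wirelength(slot_of);
    int best = cost;
    memcpy(best_slots, slot_of, sizeof best_slots);

    /* The threshold plays the role of temperature: it starts loose,
     * allowing uphill moves, and tightens to pure descent at zero. */
    for (int threshold = 6; threshold >= 0; threshold--) {
        for (int step = 0; step < 200; step++) {
            int a = (int)(next_random(&seed) % N_CELLS);
            int b = (int)(next_random(&seed) % N_CELLS);
            int tmp = slot_of[a]; slot_of[a] = slot_of[b]; slot_of[b] = tmp;

            int new_cost = wirelength(slot_of);
            if (new_cost - cost <= threshold) {
                cost = new_cost;           /* accept the swap */
                if (cost < best) {
                    best = cost;
                    memcpy(best_slots, slot_of, sizeof best_slots);
                }
            } else {                       /* reject: undo the swap */
                tmp = slot_of[a]; slot_of[a] = slot_of[b]; slot_of[b] = tmp;
            }
        }
    }
    memcpy(slot_of, best_slots, sizeof best_slots);
    assert(best == wirelength(slot_of));
    return best;
}
```

                Note that the cost function only knows about net spans. Anything a human designer knows beyond that (which nets are timing-critical, what still has to be routed later) is invisible to the search unless someone encodes it in the cost.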

                Circuit layout is an NP-hard problem, meaning that even if you had the optimal layout, you wouldn't be able to determine that in any simple manner. Computers use AI to solve this problem. There is no direct way for a computer to solve the problem. So until we either find that P=NP or find a way to capture human intelligence in a computer, circuit layout is precisely the sort of thing that humans will be better at than computers.

                Compilers for software are a different matter. While some aspects of compiling are NP-complete (e.g. register coloring), many optimizations that a compiler handles better are very localized (like instruction scheduling), making it feasible to consider a few hundred distinct instruction sequences, if that's even necessary. Mostly, where compilers beat humans is when it comes to keeping track of countless details. For instance, with static instruction scheduling, if you know something about the microarchitecture of the CPU that informs you about when instruction results will be available, then you won't schedule instructions to execute before their inputs are available (or else you'll get stalls). This is the sort of mind-numbing stuff that you WANT the computer to take care of for you. Compilers HAVE been getting a lot more sophisticated, offering higher-level optimizations, but in many ways, what the compiler has to work with is very bottom-up. You can get better results if the human programmer organizes his algorithms with knowledge of factors that affect performance (cache sizes, etc.). There is only so much whole-program optimization can do with bad algorithms.

                Interestingly, at near-threshold voltages (an increasingly popular power-saving technique), circuit delay becomes once again dominated by transistors. When lowering supply voltage, signal propagation in wires slows down, but transistors (in static CMOS at least) slow down a hell of a lot more.

              • by Pseudonym (62607)

                I'm not a hardware designer (obviously), but I am a compiler writer by trade, and I have put in a bit of research in the current literature of VLSI design, and share a commute with a VLSI designer with whom I talk about this stuff all the time.

                My assessment, which is worth exactly what you paid for it, is that while I agree that VLSI design isn't the same as compiler writing, I'm not convinced that it's necessarily a harder problem. To be clear, I'm not trying to get into a pissing contest here. My conjectu

          • by the_humeister (922869) on Wednesday September 26, 2012 @12:43AM (#41460113)

            It would probably be next to impossible to write an entire modern operating system, web browser, or word processor in assembly language.

            Here you go [menuetos.net]. It's pretty impressive for something written entirely in assembly.

          • It would probably be next to impossible to write an entire modern operating system, web browser, or word processor in assembly language.

            Very hard, but maybe not next to impossible. MenuetOS is written in assembly, and RollerCoaster Tycoon was too. If we had a large and competent development team in a big software house, I suppose it would be possible to make a full OS, browser or word proc in bare assembly. Not that it would be feasible or anything though.

          • by Quila (201335)

            Remember how people praised Woz's Apple I circuit board as a work of art? Circuits are an art form as much as a science, so while automation can do well, that artistic human touch still does better.

        • by hazydave (96747)

          Well, the real answer is that it's not an either/or scenario. Chip design teams design and lay out chips based on off-the-shelf tools, layout expertise, etc. Silicon compilers are also constantly improved, but a completely different set of people are involved. Not sure about today's Apple, but the 80s/90s Apple probably would have done it both ways. In fact, now that I think about it, and about the way the Intrinsity guys seem to work, it makes sense.

          This is sometimes done in PCB layout. Sure, some types of layou

        • by TheRaven64 (641858) on Wednesday September 26, 2012 @05:05AM (#41461381) Journal

          Compilers almost always do a much better job than humans if provided with the same input. The advantage that humans have is that they are often aware of extra information that is not encoded in the source language and so can apply extra invariants that the compiler is not aware of. A human is also typically more free to change data formats, for example for better cache usage, whereas a compiler for a language like C is required to take whatever layouts the programmer provided.
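
          A sketch of the data-format point, with hypothetical type and function names. A C compiler has to keep the first layout exactly as declared; only the programmer can decide to switch to the second. Whether it is actually faster depends on the access pattern, so measure before believing it.

```c
#include <assert.h>
#include <stddef.h>

/* Array-of-structs: summing x drags y, z and id through the cache too. */
struct particle {
    float x, y, z;
    int   id;
};

float sum_x_aos(const struct particle *p, size_t n)
{
    float sum = 0.0f;
    for (size_t i = 0; i < n; i++)
        sum += p[i].x;
    return sum;
}

/* Struct-of-arrays: the x values are contiguous, so each cache line
 * fetched is full of data the loop actually uses. */
struct particles {
    float *x, *y, *z;
    int   *id;
    size_t count;
};

float sum_x_soa(const struct particles *p)
{
    assert(p != NULL && p->x != NULL);
    float sum = 0.0f;
    for (size_t i = 0; i < p->count; i++)
        sum += p->x[i];
    return sum;
}
```

          Both functions compute the same thing; the difference is purely in how much memory traffic the loop generates.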

          The problem with place-and-route is that the search space is enormous and automated tools typically use purely deterministic algorithms, whereas humans use a lot more backtracking. A simulated annealing approach, for example, can often do a lot better (check the literature, there are a few research systems that do this).

          However, a similar program by an expert Assembly Language programmer would have left "good enough" behind because the assembly language programmer would know how to tweak his code using the most efficient commands, and cut out the "fat" by optimizing the loops and flows.

          This is, on a modern architecture, complete bullshit. Whoever is generating the assembly needs to be aware of pipeline behaviour, the latency and dispatch timings of every instruction and equivalences between them. Even if you just compare register allocation and use the same instruction selection, humans typically do significantly worse than even mediocre compilers. Instruction selection is just applying a (very large) set of rules: it's exactly the sort of task that computers do better than humans.

  • Costs (Score:5, Informative)

    by girlintraining (1395911) on Tuesday September 25, 2012 @06:54PM (#41457217)

    The question I have is how it's less expensive (in the long run) to lay a chip out by hand once instead of improving your VLSI layout software forever. NP classification notwithstanding.

    Coding in assembly still remains a superior method of squeezing extra performance out of software. It's just that few people do it because compilers are "good enough" at guessing which optimizations to apply, and where, and usually development costs are the primary concern for software development. But when you're shipping hundreds of millions of units of hardware, and you're trying to pack as much processing power in a small and efficient form factor, you don't go with VLSI for the same reason you don't go with a compiler for realtime code: You need that extra few percent.

    • Why assembly ... (Score:5, Insightful)

      by perpenso (1613749) on Tuesday September 25, 2012 @07:05PM (#41457353)

      The question I have is how it's less expensive (in the long run) to lay a chip out by hand once instead of improving your VLSI layout software forever. NP classification notwithstanding.

      Coding in assembly still remains a superior method of squeezing extra performance out of software. It's just that few people do it because compilers are "good enough" at guessing which optimizations to apply, and where, and usually development costs are the primary concern for software development. But when you're shipping hundreds of millions of units of hardware, and you're trying to pack as much processing power in a small and efficient form factor, you don't go with VLSI for the same reason you don't go with a compiler for realtime code: You need that extra few percent.

      I like to view things as a little more complicated than just applying optimizations. IMHO assembly gets some of its biggest wins when the human programmer has information that can't quite be expressed in the programming language. Specifically I recall such things in the bad old days when games and graphics code would use fixed point math. The programmer knew the goal was to multiply two 32-bit values, get a 64-bit result and right shift that result back down to 32 bits. The Intel assembly programmer knew this could be done in a single instruction. However there wasn't any real way to convey the bit twiddling details of this fixed point multiply to a C compiler so that it could do a comparable operation. C code could do the calculation but it needed to multiply two 64-bit operands to get the 64-bit result.
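
      For readers who never did fixed point, this is roughly what the operation looks like when spelled out in C. It is a sketch assuming a Q16.16 format (16 integer bits, 16 fraction bits), which is my choice for the example; the type and function names are hypothetical. Whether a given compiler turns the widened multiply into a single 32x32->64 instruction is not guaranteed.

```c
#include <assert.h>
#include <stdint.h>

typedef int32_t fix16;                 /* hypothetical Q16.16 type */
#define FIX16_ONE ((fix16)1 << 16)

static fix16 fix16_from_int(int n)
{
    assert(n >= -32768 && n <= 32767); /* integer part must fit in 16 bits */
    return (fix16)(n * 65536);
}

/* Widen, multiply, then shift back down. Right-shifting a negative value
 * is implementation-defined in C; mainstream compilers shift arithmetically. */
static fix16 fix16_mul(fix16 a, fix16 b)
{
    int64_t wide = (int64_t)a * (int64_t)b;
    return (fix16)(wide >> 16);
}
```

      The casts are the only way the source can say "I want the full 64-bit product of two 32-bit values"; the rest is up to the code generator.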

      • by loufoque (1400831)

        You don't need to use assembly for this, you can just use built-ins or intrinsics.

        Assembly is only useful if you want to control register usage.

        • by perpenso (1613749)

          You don't need to use assembly for this, you can just use built-ins or intrinsics. Assembly is only useful if you want to control register usage.

          Which are really just alternative ways to write a line or two of assembly code. They are just conveniences, you are still leaving the C/C++ language and dropping into architecture specific assembly.

          • by loufoque (1400831)

            Which are really just alternative ways to write a line or two of assembly code.

            it is vastly different since you leave register allocation to the compiler, which means you can still use inlining and constant propagation.

            • by perpenso (1613749)

              Which are really just alternative ways to write a line or two of assembly code.

              it is vastly different since you leave register allocation to the compiler, which means you can still use inlining and constant propagation.

              An alternative with an advantage but still fundamentally programming at the architecture specific assembly level rather than the C/C++ level.

              • by loufoque (1400831)

                It's very different. Clearly you haven't done a significant amount of work with the two to be able to tell.

                • by perpenso (1613749)

                  It's very different. Clearly you haven't done a significant amount of work with the two to be able to tell.

                  That is a very bad guess on your part. Intel x86 and PowerPC assembly were a specialty of mine for many years. Whether I was writing a larger function in a standalone .asm file, doing a smaller function or a moderate number of lines as inline assembly in C/C++ code, or only needed a line or two of assembly that could be implemented via a compiler intrinsic hardly matters. I was thinking and programming as an assembly programmer not a C/C++ programmer. Standalone .asm, inline assembly or intrinsic are just imp

                  • Standalone .asm, inline assembly or intrinsic are just implementation details

                    True, in the same way that using a quicksort or a bubblesort is just an implementation detail. Using stand-alone assembly means that the compiler has to create an entire call frame and ensure that all caller-safe registers are preserved, even if your asm code doesn't touch any of them. Using inline assembly is a bit better, but the compiler still often has to treat your asm as a full barrier and so can't reorder operations around it to make most efficient use of the pipeline. Using intrinsics means that

                    • by perpenso (1613749)
                      What is the point of all this? Have you jumped to the erroneous conclusion that I don't think any of these three implementation details have their time and place?

                      My original point has nothing to do with how one implements the needed assembly code. My point is that there are situations where the programmer has knowledge that cannot be communicated to the compiler, knowledge that allows the assembly language programmer to generate better code than the compiler could.

                      Using stand-alone assembly means that the compiler has to create an entire call frame and ensure that all caller-safe registers are preserved, even if your asm code doesn't touch any of them.

                      No. The responsibility to preserve re

      • The proper syntax for that is (using x64 types) something like:

        int a,b,z;

        z = (int)(((long long)a * b) >> 32);

        I'm assuming int is 32bit and long long is 64. Even though a is promoted to a larger type and also b, good compilers know that the upper half of those promoted variables are not relevant. They will then use the 32bit multiply, shift the 64bit result and store the part you need. I still do fixed point for control systems and find using 16bit signals and 32bit products is faster in C than f
        • by perpenso (1613749)

          The proper syntax for that is (using x64 types) something like:
          int a,b,z;
          z = (int)(((long long)a * b) >> 32);
          I'm assuming int is 32bit and long long is 64. Even though a is promoted to a larger type and also b, good compilers know that the upper half of those promoted variables are not relevant. They will then use the 32bit multiply, shift the 64bit result and store the part you need.

          Admittedly it's been a while since I did fixed point but back in the day when I checked popular 32-bit x86 compilers (MS and gcc) they did not generate the couple of instructions that an assembly language programmer would. My example may be dated.

        • Looks like gcc isn't a good compiler.
          Compiling this at -O3

          int mult(int a, int b)
          {
          return (int)(((long long)a * b) >> 32);
          }

          In x86-64 mode gives

          movslq %esi, %rax
          movslq %edi, %rdi
          imulq %rdi, %rax
          sarq $32, %rax
          ret

          and 32-bit mode gives

          pushl %ebp
          movl %esp, %ebp
          movl 12(%ebp), %eax
          imull 8(%ebp)
          popl %ebp
          movl %edx, %eax
          ret

          On powerpc the 64-bit version is very clean and obvious:

          mulld 4,4,3
          sradi 3,4,32
          blr

          the 32-bit version is a little bit more complicated

          mulhw 9,

    • by pclminion (145572)

      Coding in assembly still remains a superior method of squeezing extra performance out of software.

      I'd say it's more important to be able to read assembly than to write it. I do a lot of performance optimization of C++ code, and mostly it involves looking at what the compiler generates and figuring out how to change the source code to produce better output from the compiler. Judicious use of const, restrict and inline keywords can make a huge difference, as can loop restructuring (although the compiler can

      • by AaronW (33736)

        I agree. I often see output from the compiler and am amazed. It's rare now for me to find cases where I can do better than the compiler. It used to be the case where hand-tuned assembly made a lot of sense, but that's no longer the case, especially if you give hints to the compiler about things like likely/unlikely branch conditions in your code.
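
        For anyone who hasn't used the branch hints, a minimal sketch. The macros mirror the Linux kernel spelling and rely on __builtin_expect, a GCC/Clang extension; the frame-counting function and its parameters are hypothetical, not code from any real driver.

```c
#include <assert.h>
#include <stddef.h>

/* Same spelling as the Linux kernel macros; __builtin_expect is a
 * GCC/Clang extension, so other compilers get a plain fallback. */
#if defined(__GNUC__)
#define likely(x)   __builtin_expect(!!(x), 1)
#define unlikely(x) __builtin_expect(!!(x), 0)
#else
#define likely(x)   (x)
#define unlikely(x) (x)
#endif

/* Hypothetical receive path: malformed frames are rare, so that test is
 * marked unlikely and the compiler can keep the common path straight. */
int count_valid_frames(const int *frame_len, size_t n, int max_len)
{
    assert(frame_len != NULL || n == 0);
    int valid = 0;
    for (size_t i = 0; i < n; i++) {
        if (unlikely(frame_len[i] <= 0 || frame_len[i] > max_len))
            continue;                  /* drop malformed frame */
        valid++;
    }
    return valid;
}
```

        The hints never change the result, only how the generated code is laid out, and the gain (if any) varies by CPU and workload.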

        I worked on a Linux 802.11n wireless driver and was able to reduce CPU usage by 10% by adding likely/unlikely wrappers in the data path comparisons and analyzed wh

    • Re:Costs (Score:5, Funny)

      by SageMusings (463344) on Tuesday September 25, 2012 @11:04PM (#41459577) Journal

      You need that extra few percent.

      That's why our compilers go to 11.

    • by tlhIngan (30335)

      But when you're shipping hundreds of millions of units of hardware, and you're trying to pack as much processing power in a small and efficient form factor, you don't go with VLSI for the same reason you don't go with a compiler for realtime code: You need that extra few percent.

      The problem with hand-tuned assembly is the same as hand laying out transistors - it gets complicated quickly and if you're not careful, you end up with a horrible mess.

      You can argue if you're shipping millions of copies of things,

  • by whoever57 (658626) on Tuesday September 25, 2012 @07:04PM (#41457343) Journal

    Today, chips are nearly always laid out using advanced, CAD-like software - the designer says he wants X cache, Y FPUs, and Z cores, and the software automagically creates a chip. Hand-drawn processors, on the other hand, are painstakingly laid out by chip designers.

    There are a lot of layout methodologies that are between the (frankly mythical) "X cache, Y FPUs, and Z cores" and fully hand layout. The top level may have a greater or lesser amount of hand assembly, some blocks can be hand optimized, etc. Usually, there is lots of glue logic which must be designed in RTL, synthesized and only then laid out. And, for most blocks, the process to create the logic design (RTL or perhaps gates) is separate from the process of laying out these blocks. So there is room for manual involvement in each of the steps.

  • Looking closely (Score:5, Informative)

    by taniwha (70410) on Tuesday September 25, 2012 @07:08PM (#41457389) Homepage Journal

    Looking closely I see a bunch of RAM - probably half laid out by hand (caches) - and many, many small standard cell blocks almost certainly not laid out by hand - what I don't see is an obviously hand laid out datapath (the first part of your CPU you spend layout engineers on) - look for that diagonal where the barrel shifter(s) would be. There are some very regular structures (8 vertically) that I suspect are register blocks.

    Still what I see is probably someone managing timing by synthesizing small std cell blocks (not by hand), laying those blocks out by hand then letting their router hook them up on a second pass - it's probably a great way to spend a little extra time guiding your tools into doing a better job to squeeze that extra 20% out of your timing budget and give you a greater gate density (and lower resulting wire delays)

    So - a little bit of stuff being done by hand but almost all the gates being laid out by machine

  • by queazocotal (915608) on Tuesday September 25, 2012 @07:09PM (#41457401)

    This is not by hand.
    To take a programming analogy, it's looking at what the compiler generated, and then giving it hints so the resultant code/chip is laid out as you expect.

    Chips stopped being able to be laid out 'properly' by hand some time ago.

    Doing this has much the same benefits as doing it with code.
    You know stuff the compiler does not.
    You can spot silly stuff it's doing, that is not wrong, but suboptimal, and hold its hand.

    • by Sulphur (1548251) on Tuesday September 25, 2012 @07:44PM (#41457761)

      This is not by hand.
      To take a programming analogy, it's looking at what the compiler generated, and then giving it hints so the resultant code/chip is laid out as you expect.

      Chips stopped being able to be laid out 'properly' by hand some time ago.

      Doing this has much the same benefits as doing it with code.
      You know stuff the compiler does not.
      You can spot silly stuff it's doing, that is not wrong, but suboptimal, and hold its hand.

      Or grab its ARM.

    • Not being a chip expert, the following made me think twice over whether some dextrous East Asian factory workers used tweezers to lay out the circuits of each and every chip rolling down the assembly line:

      "Hand-made chips are very rare nowadays, with Chipworks reporting that it hasn't seen a non-Intel hand-made chip for 'years.'"

      The phrase "hand-made chips" is misleading because it gives the impression that, similar to the way motherboards are still assembled by hand, the production of CPUs involve human fi

    • I always considered the day people stopped using Rubylith [wikipedia.org] to be when we stopped doing layouts "by hand".
    • by adolf (21054)

      This is not by hand.

      Indeed. I used to have a relatively high-end CD player whose analog section was obviously put together by hand: The PCB traces had the gentle arcs and occasional aberrations of a steady and practiced (but somewhat imperfect) human hand aided by simple drawing tools. (The digital section's traces resembled that of any computer part from a few years prior: Obviously done by machine.)

      My example is on a much, much larger physical scale than anything discussed in TFA, but having actually

      • by Alioth (221270)

        In the context of circuits, "by hand" means not autorouted. I laid out my last PCB "by hand" - no, I didn't draw it with a pen, but I placed each trace "by hand" in the CAD program instead of just autorouting it.

  • by Kjella (173770) on Tuesday September 25, 2012 @07:18PM (#41457483) Homepage

    The question I have is how it's less expensive (in the long run) to lay a chip out by hand once instead of improving your VLSI layout software forever.

    You can teach a small kid to ride a bicycle. The same kid has no chance to program a robot into doing the same motion and balancing. It's the same order of magnitude in difference with VLSI layout, a person can lay out the circuits but it's almost impossible to describe to the computer all the reasons why he'd lay it out that way. It's not easy controlling anything well through a level of indirection, that's true for most things.

    As for being "less expensive", companies don't just have expenses but they have income too. If you can increase revenue because you got a better chip that sells more, they're willing to pay a higher cost. Companies care about profits, not expenses in isolation. Those tiny improvements to the compiler, how valuable are they to Apple in 10 years? 20 years? Compare that with an optimized chip whose worth they know right now.

    • by v1 (525388) on Tuesday September 25, 2012 @07:38PM (#41457695) Homepage Journal

      I don't think bicycle riding is a very good analogy to this problem. How about cooking, which is a procedural step-by-step operation? Little hints the recipe can give you like "preheat oven to 350 degrees" can be a tremendous time-saver later. If you didn't know to do that, you'd get your dish ready and then look at the oven (off) and click it on and sit back and wait 20 minutes before placing it in the oven. A dish that was supposed to be 60 minutes start to serve is now going to take 80 minutes due to a lack of process optimization.

      Compilers have the same problem of not knowing what the expectations are down the road, and aren't good at timing things. Good experienced cooks can manage a 4 course meal and time it so all the dishes are done at the right time and don't dirty as many dishes. Inexperienced cooks are much like compilers, they can get the job done but their timing and efficiency usually have much room for improvement.

      • by mspohr (589790)

        I think we need a car analogy.
        Following iLost maps while drunk driving is like using a compiler.
        On the other hand, following the directions from your mother in law in the back seat is like a fish.
        YMMV

      • by magarity (164372)

        Time to buy a new oven if it takes 20 minutes to heat to 350.

  • News For This Nerd (Score:5, Interesting)

    by History's Coming To (1059484) on Tuesday September 25, 2012 @07:24PM (#41457547) Journal
    Brilliant, this is what I love about Slashdot, I can be the biggest geek in whatever field I pick and I will still get outgeeked! I enjoyed reading the comments above mostly because I have absolutely no idea what the detail is, and I'd never even realised that hand-drawn vs machine was an issue.

    Can anyone supply a concise explanation of the differences and how it's all done? I'm guessing we're talking about people drawing circuits on acetate or similar and then it's scaled down photo-style to produce a mask for the actual chip?

    Yes, I know I can just Google it, and I will, but as the question came up here I thought I'd add something to a real conversation, it beats a pointless click of some vague "like" button any day :)
    • by oji-sama (1151023)
      Sorry, not an expert, but you might find this article (about AMD Steamroller) interesting. At least check the short "Looking Forward: High Density Libraries". They are rebuilding hand-drawn diagrams to be more efficient. http://www.anandtech.com/show/6201/amd-details-its-3rd-gen-steamroller-architecture/2 [anandtech.com]
    • by lexman098 (1983842) on Tuesday September 25, 2012 @08:59PM (#41458453)

      The headline is attention-grabbing bullshit.

      I'd believe that Intel may have in the past done manual placing and routing of custom made cells in certain key parts of their CPUs, but I can almost assure you that Apple did not place all of the standard cells in their ARM cores and then route them together manually, which is what the headline implies.

      What I'm talking about here is literally placing down a hundred thousand rectangles in a CAD tool and then connecting them correctly with more rectangles, which is way beyond what Apple would have considered worth the investment for a single iPhone iteration. What's more probable (and pretty standard for digital chip design) is that they placed all of the large blocks in the chip by hand (or at least by coordinates hand-placed in a script), and they probably "guided" their place and route tool as to which general areas to place the various components of the ARM cores. They might have even gone in after the tool and fixed things up here and there.

      Modern chips are almost literally impossible to "lay out by hand".
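
      As a rough illustration of "coordinates hand-placed in a script", here is a minimal Python sketch. Everything in it is hypothetical: the block names, sizes, and die dimensions are made up, and real flows use the vendor tool's own scripting rather than anything like this. It only shows the idea of a human typing in block positions and a script sanity-checking them.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Block:
    """A large rectangular block; (x, y) is the lower-left corner in micrometres."""
    name: str
    x: float
    y: float
    w: float
    h: float


def overlaps(a, b):
    """True if the two rectangles share any interior area (touching edges are fine)."""
    return (a.x < b.x + b.w and b.x < a.x + a.w and
            a.y < b.y + b.h and b.y < a.y + a.h)


def check_floorplan(blocks, die_w, die_h):
    """Return a list of human-readable problems; an empty list means the plan is legal."""
    problems = []
    for b in blocks:
        if b.x < 0 or b.y < 0 or b.x + b.w > die_w or b.y + b.h > die_h:
            problems.append(f"{b.name} is outside the die")
    for a, b in combinations(blocks, 2):
        if overlaps(a, b):
            problems.append(f"{a.name} overlaps {b.name}")
    return problems


# Hypothetical hand-entered coordinates; not taken from any real chip.
DIE_W, DIE_H = 3500, 2300
FLOORPLAN = [
    Block("cpu0", 0, 0, 1000, 1500),
    Block("cpu1", 1000, 0, 1000, 1500),
    Block("l2_cache", 0, 1500, 2000, 800),
    Block("gpu", 2000, 0, 1500, 2300),
]

if __name__ == "__main__":
    for line in check_floorplan(FLOORPLAN, DIE_W, DIE_H) or ["floorplan is legal"]:
        print(line)
```

      The place and route tool would then fill in the standard cells inside and around these blocks.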

      • What I'm talking about here is literally placing down a hundred thousand rectangles in a CAD tool

        Well, when you have repetitive structures, 100,000 rectangles isn't really all that difficult.

    • Can anyone supply a concise explanation of the differences and how it's all done? I'm guessing we're talking about people drawing circuits on acetate or similar and then it's scaled down photo-style to produce a mask for the actual chip?

      CPU code is in RTL, Verilog, VHDL, whatever; it's in HDL. Usually these days a synthesis tool or compiler will create a chip layout that implements that HDL description in standard cell logic. The standard cells are latches, NAND gates, buffers, SRAM, etc. A software tool will place and route standard cells to implement the HDL in silicon, and then iterate on timing to make sure it's fast enough. Humans don't directly do the placement of standard cells, or route wires between them. In terms of photolitho

    • by slew (2918) on Tuesday September 25, 2012 @09:21PM (#41458685)

      Nobody "draws" chips by "hand" anymore. It's all being done by a computer (there are so many design rules these days that humans can't do this anymore in a realistic time frame). Reticles (the photomasks) are all fractured by computer these days because rectangles aren't really rectangles anymore at these small feature sizes (we are now past the diffraction limit, so masks must be "phase-shift" masks, not binary masks as in the old days).

      I don't have any specific knowledge about the A6, but what is euphemistically called hand-drawn these days is often still very automated relative to the bad-old-days when people were drawing rectangles on layers to make transistors. That was the real-hand-drawn days, but even way back then you didn't actually draw them by hand, you used a computer program to enter the coordinates for the rectangles.

      Quick background: nowadays when typical chips go to physical design, they usually go through a system called place-and-route where pre-optimized "cells" (which have 2-4 inputs and 1-3 outputs and implement stuff like and-or-invert, or register flop) are placed down by the computer (typically using advanced heuristic algorithms) and the various inputs and outputs are connected together with many layers of wires which logically match the schematic or netlist (which is the intention of the logical design). Of course this is when physics starts to impose on the "logical" design, so often things need special fixups to make things work. Unfortunately, the fixups and the worst case wirelengths between cells conspire to limit the performance and power of the design, but just like compiled software, it's usually good enough for most purposes. Highly leveraged regularly structured components of normal designs might have libraries, specialized compilers or even have hand intervention (e.g., RAMs, FIFOs, or register files), but not the bulk of the logic.
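
      To make the placement half of that concrete, here is a toy Python sketch. It is not how any commercial tool works: the cell names, the netlist, and the greedy pairwise-swap heuristic are all assumptions chosen for brevity. It scores a placement by half-perimeter wire length (a common rough estimate of wiring cost) and keeps any swap of two cells that shortens the total.

```python
from itertools import combinations


def hpwl(placement, nets):
    """Total half-perimeter wire length: for each net, the width plus height
    of the bounding box around the cells it connects."""
    total = 0
    for net in nets:
        xs = [placement[cell][0] for cell in net]
        ys = [placement[cell][1] for cell in net]
        total += (max(xs) - min(xs)) + (max(ys) - min(ys))
    return total


def improve_by_swaps(placement, nets):
    """Greedy heuristic: try swapping the sites of every pair of cells and keep
    a swap only if it strictly lowers the wire length. Repeat until a full pass
    finds no improvement. Returns (new_placement, wire_length)."""
    placement = dict(placement)  # do not modify the caller's dict
    best = hpwl(placement, nets)
    improved = True
    while improved:
        improved = False
        for a, b in combinations(sorted(placement), 2):
            placement[a], placement[b] = placement[b], placement[a]
            cost = hpwl(placement, nets)
            if cost < best:
                best = cost
                improved = True
            else:
                # Undo the swap; it did not help.
                placement[a], placement[b] = placement[b], placement[a]
    return placement, best


# Hypothetical four-cell design sitting in one row of sites (x, y).
START = {"and1": (0, 0), "or1": (3, 0), "inv1": (1, 0), "flop1": (2, 0)}
# Hypothetical netlist: each tuple is one wire and the cells it connects.
NETS = [("and1", "or1"), ("or1", "inv1"), ("inv1", "flop1")]

if __name__ == "__main__":
    final, cost = improve_by_swaps(START, NETS)
    print("before:", hpwl(START, NETS), "after:", cost)
    print(final)
```

      A greedy search like this gets stuck in local minima on anything bigger than a toy, which is part of why real placers use much fancier heuristics and why a human who understands the structure of the design can still sometimes beat them.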

      As far as I can tell from looking at the pictures, the most likely possibility is just that, instead of letting the computer place the design completely out of small cells, some larger blocks (say like ALUs for the ARM SIMD path) were created by a designer and layout engineer who probably used a lower-level tool to put down the same small cells relative to other small cells where they think is a good place to put them and tweak the relative positioning to try to minimize the maximum wire lengths between critical parts of the block. The most common flow for doing this is mostly automated, but tweakable with human intervention (this is what passes for "by-hand" these days). In addition to being designed to optimize critical paths, these larger blocks are generally designed so that they "fit" well with other parts of the design (e.g., port order, wire pitch match, etc.) to minimize wire congestion (so they can be connected with mostly straight wires, instead of those that bend). Basically, looking at the patterns of whitespace in the presumed CPU, you can see the structure of these larger blocks instead of the big rectangles (called partitions) with rows of cells that you get when you let a computer do place-and-route with small cells.

      Just like optimizing a program, there are many levels of pain you can go through and what I described above is probably the limit these days. Say if you wanted less pain, another more automated way to get most of the same benefits is to just develop a flow that hints where to put parts of the design inside the normal rectangular placement region, and let a placement engine use those hints. The designer can just tweak the hints to get better results. Of course with this method, the routing may still have "kinks" because routing is not wire-pitch-matched, but you can often get 80-90% of the way there. The advantage of this lesser technique is that you don't need to spend a bunch of time developing big blocks and if there is a small mistake (of course nobody ever makes mistakes), it's much, much easier to fix the mistake w/o perturbing the whole design.

      FWIW, it is highly unlikely that th

    • They used to literally make a mask by hand, but then the features on the chip got smaller (on the scale of nanometers), and the chips got bigger (up to hundreds of square millimeters, holding billions of transistors). To draw the whole thing, you'd need a piece of acetate 360 meters on each side, at least. These days (and for the last couple decades), it's all CAD. The design then gets sent electronically to the fab, where they make the mask using an electron beam - a little bit like how a CRT works, I beli

    • by arkhan_jg (618674)

      I'm not an expert by any stretch, but I did study VLSI chip design at uni, though obviously what we studied was a long way behind the current curve, and that's no doubt only increased since I left; but I can provide an overview of how it used to be done.

      You start with a hardware description language. I used verilog and VHDL. Basically, you describe in code what modules and functions you want the chip to do. It's literally very similar to coding, but you're describing and linking together hardware modules, n

  • Layout by HAL (Score:3, Informative)

    by Anonymous Coward on Tuesday September 25, 2012 @08:00PM (#41457919)

    " The question I have is how it's less expensive (in the long run) to lay a chip out by hand once instead of improving your VLSI layout software forever. NP classification notwithstanding."

    I've done PCB layouts, microwave chip and wire circuits, as well as RFIC/MMIC layouts. Anyone who asks the question above has never done a real layout. Many autorouter and layout tools allow complex rules to match delays, keep minimum widths, etc. You can spend as much time on each layout trying to populate these rules for critical sections of a design, but it is like trying to train a 5 year old to do brain surgery. Digital design is rather much different than the analog circuits I work on, but you only have to do a few layouts of any flavor by hand in your life to be able to see just how scary it is to hand a layout to HAL.

    Clearly autorouters and autogenerated layouts, and I don't mean to sound like too much of a luddite... I've witnessed plenty of awful hand layouts to go around as well.

    • by taniwha (70410)

      Well, a CPU with a 1GHz clock has 1 ns to process data between flops. Yes, it's a bit like laying out microwave stuff, but in the very small. What happens is that it all starts with some layout person/people creating a standard cell library; they'll use SPICE to simulate and characterise their results, then pass this to the synthesis/layout tool, which makes a good first guess; they'll add in some fudge factor, then a timing tool looks at the 3d layout and extracts real timing, including parasitics to everythi

  • by Wierdy1024 (902573) on Tuesday September 25, 2012 @08:36PM (#41458255)

    When someone buys a design from ARM, they buy one of two things:

    1. A Hard macro block. This is like an mspaint version of a CPU. It looks just like the photos here. The CPU has been laid out partially by hand by ARM engineers. The buyer must use it exactly as supplied - changing it would be nigh-on impossible. In the software world, it's the equivalent of giving an exe file.

    2. Source Code. This can be compiled by the buyer. Most buyers make minor changes, like adjusting the memory controller or caches, or adding custom FPU-like things. They then compile themselves. Most use a standard compiler rather than hand-laying out the stuff, and performance is therefore lower.

    The article's assertion that hand layout hasn't been done for years outside Intel is, as far as I know, codswallop. Elements of hand layout, from gate design to designing memory cells and cache blocks, have been present in ARM hard blocks since the very first ARM processors. Go look in the lobby at ARM HQ in Cambridge UK and you can see the meticulous hand layout of their first CPU, and it's so simple you can see every wire!

    Apple has probably collaborated with ARM to get a hand layout done with Apple's chosen modifications. I can't see anything new or innovative here.

    Evidence: http://www.arm.com/images/A9-osprey-hres.jpg [arm.com] (this is a layout for an ARM Cortex A9)

  • Huh? (Score:5, Interesting)

    by Panaflex (13191) <convivialdingo@NospAM.yahoo.com> on Tuesday September 25, 2012 @11:01PM (#41459551)

    Not surprising at all, as PA SEMI was founded by Daniel W. Dobberpuhl.

    Daniel Dobberpuhl had his hand in StrongARM and DEC Alpha design - both hand-drawn cores which to this day command some respect in chip design circles I'm told.

    Anyway,

  • by dbc (135354) on Wednesday September 26, 2012 @01:44AM (#41460435)

    The question I have is how it's less expensive (in the long run) to lay a chip out by hand once instead of improving your VLSI layout software forever. NP classification notwithstanding.

    It's simple math. At what volume will the chip be produced? A modern fab costs $X Billion, and you know pretty much exactly how many wafers you can run during the 3 years it is state-of-the-art. After that, add $Y Billion for a refit, or just continue to run old processes. Anyway, say a new fab at refit time would cost $Z Billion. Refitting the old fab instead costs $Y Billion. So you save $Z-$Y by doing a refit. So the original fab cost you $X-($Z-$Y). Divide by number of wafers the fab can run during its life, that is the cost per wafer. Now compute die area for hand layout versus auto layout, and adjust for improved yield for smaller die. Divide by die per wafer. That is how much less each die costs you. Now since the die is smaller, it probably runs faster, so adjust your yield-to-frequency-spec upwards, or adjust your average selling price upwards if the speed difference is "large" (enough MHz to have marketing value). That is the value of hand layout. It isn't rocket surgery to work out a dollars-and-cents number.
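
    The arithmetic above can be sketched in a few lines of Python. Every number below is made up for illustration, and two simplifications are my own assumptions rather than anything stated above: dies per wafer is just wafer area divided by die area (ignoring edge loss), and yield uses a simple Poisson model, exp(-defect density * die area). The frequency and selling-price adjustments are left out.

```python
import math


def effective_fab_cost(x_build, y_refit, z_new_fab):
    """Cost attributed to the original fab: $X - ($Z - $Y), as described above."""
    return x_build - (z_new_fab - y_refit)


def cost_per_good_die(fab_cost, wafers, wafer_area_mm2, die_area_mm2, defects_per_mm2):
    """Fab cost spread over every good die produced in the fab's lifetime.

    Assumptions (not from the discussion): no edge loss when counting dies,
    and a Poisson yield model.
    """
    cost_per_wafer = fab_cost / wafers
    dies_per_wafer = math.floor(wafer_area_mm2 / die_area_mm2)
    yield_fraction = math.exp(-defects_per_mm2 * die_area_mm2)
    return cost_per_wafer / (dies_per_wafer * yield_fraction)


# Hypothetical inputs, chosen only to make the example run.
FAB_COST = effective_fab_cost(x_build=5e9, y_refit=2e9, z_new_fab=6e9)
WAFERS = 1_000_000                    # wafers run over the fab's life
WAFER_AREA = math.pi * 150.0 ** 2     # 300 mm wafer, in mm^2
DEFECTS = 0.002                       # defects per mm^2
AUTO_DIE_MM2 = 100.0                  # die area with automatic layout
HAND_DIE_MM2 = 90.0                   # die area with hand layout

if __name__ == "__main__":
    auto = cost_per_good_die(FAB_COST, WAFERS, WAFER_AREA, AUTO_DIE_MM2, DEFECTS)
    hand = cost_per_good_die(FAB_COST, WAFERS, WAFER_AREA, HAND_DIE_MM2, DEFECTS)
    print(f"auto layout: ${auto:.2f} per good die")
    print(f"hand layout: ${hand:.2f} per good die")
    print(f"saving per die: ${auto - hand:.2f}")
```

    Multiply the per-die saving by the number of parts you expect to ship and compare it with what the layout team costs; that comparison is the whole argument.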

    Anyway, even at Intel for at least the past 20 years only highly repetitive structures like datapath logic have been hand laid out. Control logic is too tedious to lay out by hand, doesn't yield much area benefit, and is where the bulk of the bug fixes end up so it's the most volatile part of the layout from stepping to stepping.

    So, can hand layout have a positive return on investment? Yes, if you run enough wafers of one part to make the math work out. These days the math will only work out for higher volume parts.

    (Yes, I'm ex-Intel).

