AMD Hardware

AMD Cancels 28nm APUs, Starts From Scratch At TSMC

MrSeb writes "According to multiple independent sources, AMD has canned its 28nm Brazos-based Krishna and Wichita designs that were meant to replace Ontario and Zacate in the second half of 2012. The company will likely announce a new set of 28nm APUs at its Financial Analyst Day in February — and the new chips will be manufactured by TSMC, rather than its long-time partner GlobalFoundries. The implications and financial repercussions could be enormous. Moving 28nm APUs from GloFo to TSMC means scrapping the existing designs and laying out new parts using gate-last rather than gate-first manufacturing. AMD may try to mitigate the damage by doing a straightforward 28nm die shrink of existing Ontario/Zacate products, but that's unlikely to fend off increasing competition from Intel and ARM in the mobile space."
This discussion has been archived. No new comments can be posted.

  • by Anonymous Coward on Tuesday November 22, 2011 @04:27PM (#38140504)

    You should do more than make slashdot comments and watch porn.

  • Competition ? (Score:4, Informative)

    by unity100 ( 970058 ) on Tuesday November 22, 2011 @04:31PM (#38140544) Homepage Journal
    AMD has no competition in the APU arena. It is dominating it.

    http://techreport.com/articles.x/21730/8 [techreport.com]

    It's actually possible to game at acceptable detail and fps on entry- to mid-level laptops without paying a fortune now.
  • Global Foundries (Score:5, Informative)

    by Anonymous Coward on Tuesday November 22, 2011 @04:42PM (#38140684)

    The description is somewhat misleading in that Global Foundries is not a "long-time partner"; it consists of what were AMD's own internal wafer fabs until it was spun out as a separate company in 2009.

  • by 0123456 ( 636235 ) on Tuesday November 22, 2011 @04:46PM (#38140734)

    I salute you, mythical IT-worker who manages to get an overclocked computer work-approved.

    Who said it was approved? At a previous job, a friend inherited a computer from someone who'd left and could never understand why it crashed every few days and hit bugs that no one else seemed to see, until he looked in the BIOS and discovered the previous user had overclocked it.
     

  • by WilliamBaughman ( 1312511 ) on Tuesday November 22, 2011 @04:55PM (#38140812)
    Calling Global Foundries AMD's "long-time partner" really dates "MrSeb"; he must have started covering tech news within the last three years. Global Foundries isn't just a "partner" to AMD: it's part-owned by AMD, and it was spun out of AMD's manufacturing operations and merged with Chartered Semiconductor.
  • by Bengie ( 1121981 ) on Tuesday November 22, 2011 @04:58PM (#38140844)

    With multi-core CPUs, just because you can't reach 100% usage doesn't mean you're not CPU-limited.
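    Easy to demonstrate. The sketch below (nothing clever; the loop bound is arbitrary and the volatile is only there so the compiler can't delete the loop) is completely compute-bound, yet on a quad-core it shows up as roughly 25% total CPU usage, because only one core is doing anything:

    #include <stdio.h>

    int main(void)
    {
        /* Single-threaded busy loop: pegs exactly one core, so overall
           CPU usage on an N-core box hovers around 100/N percent even
           though the program is entirely CPU-bound. */
        volatile unsigned long long sink = 0;
        for (unsigned long long i = 0; i < 5000000000ULL; i++)
            sink += i;
        printf("%llu\n", sink);
        return 0;
    }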

  • by Joce640k ( 829181 ) on Tuesday November 22, 2011 @05:32PM (#38141248) Homepage

    Vista? Ack.

    At least have the decency to install Windows 7.

  • by jd ( 1658 ) <imipak@ y a hoo.com> on Tuesday November 22, 2011 @05:43PM (#38141390) Homepage Journal

    Software isn't the bottleneck. Caches are *tiny* compared to the size of even single functions in modern programs, which means they get flooded repeatedly, which in turn means that you're pulling from main memory a lot more than you'd like.

    Multi-core CPUs aren't (as a rule) fully independent - they share caches and share I/O lines, which in turn means that the effective capacity is slashed as a function of the number of active cores. Cheaper ones even share(d) the FPU, which was stupid.

    The bottleneck problem is typically solved by increasing the size of the on-chip caches OR by adding an external cache between main memory and the CPU. After that, it depends on whether the bottleneck is caused by bus contention or by slow RAM. Bus contention would require memory to be banked with each bank on an independent local bus. Slow RAM would require either faster RAM or smarter (PIM) RAM. (Smart RAM is RAM that is capable of performing very common operations internally without requiring the CPU. It's unpopular with manufacturers because they like cheap interchangeable parts and smart RAM is neither cheap nor interchangeable.)
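    On the "pulling from main memory" cost, here's a rough way to see it on any box. The buffer sizes below are guesses (128 KB versus 256 MB), hardware prefetching will hide part of the penalty on this strided walk, and random access would make it far worse, but the cache-resident run should still finish several times faster:

    #include <stdio.h>
    #include <stdlib.h>
    #include <time.h>

    #define SMALL_N (32 * 1024)         /* 32K ints = 128 KB: cache-resident    */
    #define BIG_N   (64 * 1024 * 1024)  /* 64M ints = 256 MB: way past any LLC  */

    static double walk(int *buf, size_t len, size_t touches)
    {
        struct timespec t0, t1;
        volatile long sum = 0;          /* volatile so the loop isn't removed   */
        clock_gettime(CLOCK_MONOTONIC, &t0);
        for (size_t i = 0; i < touches; i++)
            sum += buf[(i * 16) % len]; /* 16 ints = 64 B: a new line per touch */
        clock_gettime(CLOCK_MONOTONIC, &t1);
        return (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) / 1e9;
    }

    int main(void)
    {
        int *small = malloc(SMALL_N * sizeof *small);
        int *big   = malloc(BIG_N   * sizeof *big);
        if (!small || !big) return 1;
        for (size_t i = 0; i < BIG_N;   i++) big[i]   = (int)i;  /* fault pages in */
        for (size_t i = 0; i < SMALL_N; i++) small[i] = (int)i;
        printf("cache-resident: %.3f s\n", walk(small, SMALL_N, 200000000));
        printf("main memory:    %.3f s\n", walk(big,   BIG_N,   200000000));
        return 0;
    }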

    Really, the entire notion of a CPU - or indeed a GPU - is getting tiresome. I liked the Transputer way of doing things (System-on-a-Chip architecture) and I still like that way of doing things. The Transputer had some excellent ideas - it's a shame it took Inmos so long to design an FPU (and a crappy one at that) and given that the T400 had a 20MHz bus at a time most CPUs were running at 4MHz, it's a damn shame they failed to keep that lead through to the T9000.

    What I'd like to see is a SoC where instead of discrete cores (uck!) you have banks of independent registers, pools of compute elements and hyperthreading such that the software can dynamically configure how to divide up the resources. There's nothing to stop you moving all the GPU logic you like into such a system. It's merely more pools of compute elements. Microcode is already in use and microcode is nothing more than software binding of compute elements to form instructions. (Hell, microcode was already common on some architectures back in the 80s and was available for microprocessors within a decade of their being invented.) There's nothing that says microcode HAS to be closed firmware from the manufacturer - let the OS do the linking. It's the OS' job to partition resources and it can do so on-the-fly as needs dictate - something a manufacturer firmware blob can't do. Put the first 4 gigs onto the SoC and have one MMU per core plus one spare, so that each core can independently access memory (provided they don't try to access the same page). The spare is for direct access to memory from the main bus without going through any CPU (required for RDMA, which most peripherals should be capable of these days).

    Such a design, where the OS converts the true primitives into the primitives (ie: instruction set) useful for the tasks being performed, would allow you to add in any number of other true primitives. Since any microcode-driven CPU is essentially a software processor anyway, you can afford to put extra compute elements out there. Any element not needed would not be routed to. Real-estate isn't nearly as expensive as is claimed, as evidenced by the number of artistic designs chip manufacturers etch in. Those designs are dead space that can magically be afforded, but there's nothing to stop you from replacing them with the necessary inter-primitive buffering to build ever-more complex instructions from primitives without loss of performance. I'm willing to bet HPC would look a whole lot more impressive if BLAS and LAPACK functions were specifically in hardware rather than being hacked via a GPU.

    Of course, SoC means larger chips. So? Intel was talking about wafer-scale processors several years back (remember their 80-core boast?) and production has only improved since then. Yields are now good enough that this is practical, and since the idea is to software-wire the internals, it becomes trivial to bypass defects.

  • by Anonymous Coward on Tuesday November 22, 2011 @05:52PM (#38141504)

    Nobody pays much attention to single-core performance anymore, and I have no idea why. There are tons of programs that people use on a regular basis that are single-core limited.

    There's a very simple reason: physical limitations. The current processor technology is more or less maxed out for single-thread performance. There are probably some gains available by completely changing the instruction set or by giving up on multi-thread performance entirely, but nothing that Intel can put into a chip they can sell. They can't raise clock speeds anymore due to the speed of light (except a little bit when doing a die shrink). The obsession with multi-core isn't because Intel and AMD think everyone wants to run more threads; software is moving towards more threads because Intel and AMD simply can't improve single-thread performance, but they can, at least for a little while longer, keep adding cores.

  • Re:AMD = Stagnated. (Score:5, Informative)

    by GreatBunzinni ( 642500 ) on Tuesday November 22, 2011 @06:12PM (#38141782)

    Intel i5 661: http://www.newegg.com/Product/Product.aspx?Item=N82E16819115217&Tpk=i5%20661 [newegg.com]
    According to these benchmarks [cpubenchmark.net], we have:

    • AMD Phenom II X4 965: score 4,291, price $129.99*
    • Intel Core i5 661 @ 3.33GHz: score 3,286, price $175.66*
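    Taking those numbers at face value, that works out to 4,291 / 129.99 ≈ 33 benchmark points per dollar for the Phenom versus 3,286 / 175.66 ≈ 19 points per dollar for the i5, roughly 75% more performance per dollar for the AMD part.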

    And this doesn't account for the money spent on a motherboard, which adds a hefty premium to any Intel offering.

    So, looks like you botched your careful number check.

  • by hkultala ( 69204 ) on Tuesday November 22, 2011 @07:04PM (#38142406)

    Software isn't the bottleneck. Caches are *tiny* compared to the size of even single functions in modern programs, which means they get flooded repeatedly, which in turn means that you're pulling from main memory a lot more than you'd like.

    Wrong.

    The code size of an average function is much smaller than the instruction cache of any modern processor.
    And then there are the L2 and L3 caches behind it.

    Instruction fetch needing to go to main memory is quite rare.
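    To put rough numbers on it: a typical 32 KB L1 instruction cache holds on the order of 8,000 x86 instructions (at roughly 4 bytes each), and there are hundreds of KB to several MB of L2/L3 behind it, so a single function would have to be absurdly huge to blow through all of that on instruction fetch alone.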

    As for data, that depends totally on what the program does.

    Multi-core CPUs aren't (as a rule) fully independent - they share caches and share I/O lines, which in turn means that the effective capacity is slashed as a function of the number of active cores. Cheaper ones even share(d) the FPU, which was stupid.

    None of the CPUs that share an FPU between multiple HW threads are cheap.

    Sun's Niagara I had a slow shared FPU, but the chip was not cheap.

    AMD Bulldozer, which usually has sucky performance, sucks less on code which uses the shared FPU.

    FPU operations just have long latencies and there are always lots of data dependencies, so in practice you cannot utilize the FPU well from one thread; you need to feed it instructions from multiple threads.

    Intel uses HyperThreading for this; AMD Bulldozer uses its CMT modules with the shared FPU.
    GPUs are barrel processors for the same reason.
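    You can see the dependency problem from a single thread, too. A rough sketch (plain C, nothing chip-specific, timings will vary): summing an array through one accumulator serializes every add behind the FPU latency, while splitting it into independent accumulators, which is essentially what a second hardware thread gives you, keeps the unit busy.

    #include <stdio.h>
    #include <time.h>

    #define N    4096        /* 32 KB of doubles: fits in L1, so memory
                                bandwidth isn't the limiter here          */
    #define REPS 100000

    static double a[N];

    /* One dependent chain: every add waits for the previous result,
       so you get roughly one add per FPU latency (several cycles). */
    static double sum_chain(void)
    {
        double s = 0.0;
        for (int r = 0; r < REPS; r++)
            for (int i = 0; i < N; i++)
                s += a[i];
        return s;
    }

    /* Four independent chains: the adds overlap in the pipeline.  Feeding
       the FPU from another hardware thread (HyperThreading, Bulldozer's
       shared-FPU module) is the same idea, just done across threads. */
    static double sum_split(void)
    {
        double s0 = 0, s1 = 0, s2 = 0, s3 = 0;
        for (int r = 0; r < REPS; r++)
            for (int i = 0; i < N; i += 4) {
                s0 += a[i];
                s1 += a[i + 1];
                s2 += a[i + 2];
                s3 += a[i + 3];
            }
        return s0 + s1 + s2 + s3;
    }

    int main(void)
    {
        for (int i = 0; i < N; i++)
            a[i] = 1.0;

        clock_t t0 = clock();
        double r1 = sum_chain();
        clock_t t1 = clock();
        double r2 = sum_split();
        clock_t t2 = clock();

        printf("chain: %.2f s (sum %.0f)\n", (double)(t1 - t0) / CLOCKS_PER_SEC, r1);
        printf("split: %.2f s (sum %.0f)\n", (double)(t2 - t1) / CLOCKS_PER_SEC, r2);
        return 0;
    }

    Compile with something like gcc -O2 (no -ffast-math, or the compiler will do the splitting for you) and the split version usually runs a few times faster.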

    The bottleneck problem is typically solved by increasing the size of the on-chip caches OR by adding an external cache between main memory and the CPU.

    Much more often the bottleneck is between the levels of the chip's own caches.
    The big outer-level caches are slow, and processors quite often spend short stretches waiting for data to come from them. And if you increase the size of the last-level caches, you make them even slower.

    One of the reasons for Bulldozer's sucky performance is that it has small L1 caches (so it has to fetch data from the L2 cache often) but a big and slow L2 cache, so that relatively long L2 latency gets hit quite often.
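    If you want to see that L1-versus-L2 gap on your own machine, a pointer chase makes it obvious. Rough sketch only: the working-set sizes (16 KB and 256 KB) are guesses, so adjust them for whatever chip you're poking at, and older toolchains may need -lrt for clock_gettime:

    #include <stdio.h>
    #include <stdlib.h>
    #include <time.h>

    static volatile size_t g_sink;   /* keeps the chase from being optimized out */

    static double chase_ns(size_t n, long hops)
    {
        size_t *next = malloc(n * sizeof *next);
        struct timespec t0, t1;
        size_t p = 0;
        if (!next) return -1;
        /* Sattolo's algorithm: one big random cycle, so every hop is a
           dependent load the prefetcher can't guess ahead of. */
        for (size_t i = 0; i < n; i++) next[i] = i;
        for (size_t i = n - 1; i > 0; i--) {
            size_t j = rand() % i;
            size_t tmp = next[i]; next[i] = next[j]; next[j] = tmp;
        }
        clock_gettime(CLOCK_MONOTONIC, &t0);
        for (long h = 0; h < hops; h++)
            p = next[p];                 /* serialized load-to-use latency */
        clock_gettime(CLOCK_MONOTONIC, &t1);
        g_sink = p;
        free(next);
        return ((t1.tv_sec - t0.tv_sec) * 1e9 + (t1.tv_nsec - t0.tv_nsec)) / (double)hops;
    }

    int main(void)
    {
        /* 2048 * 8 B = 16 KB (about L1-sized); 32768 * 8 B = 256 KB (L2 territory). */
        printf("L1-sized working set: %.1f ns per load\n", chase_ns(2048, 100000000L));
        printf("L2-sized working set: %.1f ns per load\n", chase_ns(32768, 100000000L));
        return 0;
    }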

    External caches have not been used by Intel or AMD for about 10 years. They're either slow or expensive, and usually both. Now that even internal caches can easily be made larger than 10 megabytes, an external cache has to be very expensive to compete with internal caches, and even then it only makes sense on some server workloads.

    After that, it depends on whether the bottleneck is caused by bus contention or by slow RAM. Bus contention would require memory to be banked with each bank on an independent local bus. Slow RAM would require either faster RAM or smarter (PIM) RAM. (Smart RAM is RAM that is capable of performing very common operations internally without requiring the CPU. It's unpopular with manufacturers because they like cheap interchangeable parts and smart RAM is neither cheap nor interchangeable.)

    Smart RAM is a dream, and a research topic in universities. It's uncommon because it does not (yet) exist.

    And most problems/algorithms are not solvable by "simple" smart RAM that can only operate on data located near each other. And if you try to make it even smarter, you end up making it costlier and slower, until it becomes just a chip with a multi-core processor and memory on the same die.

    There are some computational tasks where smart RAM would improve performance by a great margin, but for the other >90% of problems it has quite little use.

  • Re:AMD = Stagnated. (Score:4, Informative)

    by mrchaotica ( 681592 ) * on Tuesday November 22, 2011 @07:24PM (#38142632)

    I got an AMD Phenom II X4 840 for $59.99 a few days ago (at Microcenter); I'm sure it's more than half as fast as a 965, so it's an even better value. I got a new motherboard (AMD 760G chipset) with it too; it was also $59.99. Not bad, I think -- would I have been able to find an Intel solution for that price/performance?
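    (Rough value math, borrowing the PassMark-style numbers upthread and reading "more than half as fast as a 965" as a score somewhere above ~2,150: that's at least ~36 points per dollar for the $59.99 chip, against ~33 for the 965 and ~19 for the i5 661.)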
