Hardware

Ask Slashdot: How Do You Estimate the Cost of an Algorithm Turned Into an ASIC?

"Another coder and I are exploring the possibility of having a video-processing algorithm written in C turned into an ASIC ("Application Specific Integrated Circuit") hardware chip that could go inside various consumer electronics devices," writes Slashdot reader dryriver. The problem? There seems to be very little good information on how much a 20Kb, or 50Kb or indeed a 150Kb algorithm written in the C language would cost to turn in an ASIC or "Custom Chip".

We've been told that "the chip-design engineering fees alone would likely start at around $500,000." We've been told "the cost per ASIC will fluctuate wildly depending on whether you are having 50,000 ASICs manufactured or 5 million ASICs manufactured." Is there some rough way to calculate from the source code size of an algorithm -- let's say 100 kilobytes of C code, or 1000 lines of code -- a rough per-unit estimate of how much the ASIC hardware equivalent might cost to make?

Why do we need this? Because we want to pitch our video processing tech to a company that makes consumer products, and they will likely ask us, "So... how many dollars of extra cost will this new video processing chip of yours add to our existing products?"

There were some interesting suggestions on the original submission, including the possibility of C to HDL converters or a system on a chip (SoC). But are there any other good alternatives? Leave your own thoughts here in the comments.

  • FPGA (Score:5, Insightful)

    by ArmoredDragon ( 3450605 ) on Saturday August 24, 2019 @06:42PM (#59121710)

    Why not prototype it with an FPGA first?

    • Re:FPGA (Score:5, Insightful)

      by thesupraman ( 179040 ) on Saturday August 24, 2019 @06:47PM (#59121722)

      Because they don't seem to understand the first thing about what they should be doing?

      Can I suggest 'How about employing someone who knows what they are talking about' as a first step?
      'Engineering through slashdot' is even more stupid these days than it would have been 15 years ago, when slashdot was more interesting to engineers.

      Short answer: No, 'lines of C' have no meaningful link to 'cost of ASIC'.

      However ArmoredDragon is 100% right, learn to build an FPGA based implementation, or pay someone to do that. An FPGA design DOES have some relation to an ASIC implementation.

      • Even before the FPGA, code it on a GPU. You can buy about 100,000 Nvidia cards for the cost of an ASIC design.

        • by evanh ( 627108 )

          GPU first might be a good idea ... not because of cost or speed vs FPGA/ASIC ... but simply because it forces you to parallelize the algorithm to make use of the parallel hardware - a critical step in making the hardware work for you, and one which will get you in the right mindset for the FPGA design work.

          FPGA designs can be directly absorbed into ASIC design flows. Therefore marketing numbers can be estimated from the FPGA.

          • No. They just need to use an FPGA to demonstrate the effectiveness of the algorithm. The consumer electronics maker will already have relationships and resources to design and run the manufacturing through their vendors (which will give them better rates than these guys can get).

            They are in the algorithm business not the manufacturing business.

            • by evanh ( 627108 )

              Thing is, parallel design takes some getting used to. Freebie GPU tools are easy to try out before venturing into choosing and configuring an FPGA to design on.

            • Re: FPGA (Score:5, Interesting)

              by LostMyBeaver ( 1226054 ) on Sunday August 25, 2019 @12:11AM (#59122172)
              I'm not convinced this is even the case.

              I've worked in the video codec world for some time, and there are a few things we know for certain... the first and most important being that we ourselves are certainly not infallible. An ASIC is almost never a good idea. More importantly, if there's a meaningful algorithm for image processing, placing it in an ASIC rather than preparing it as an IP core for someone else's ASIC and licensing it is idiocy.

              Consider the Analog Devices ADV212, which is a JPEG-2000 codec. So... it is very clearly an image processing chip. It is designed to encode individual frames of video in real-time. It is not powerful enough for 1080p at 60fps, so you need to use two of these chips and encode the Y channel with one and the Cb and Cr channels with the other. The chips are then supposed to be able to knit the color channels together into a single "properly formatted" JPEG-2000 file. Let me spoil the story... they don't.

              The circuitry required to use this is very complicated. For example, if you were to consider a C function, there's an input (parameters), a processing unit (code) and an output (return). It's REALLY easy.

              In electronics, we have the same stuff... but the input is a fairly large function which accepts a word of data at a time, has to be sequentially clocked, has to be performant, etc... there also need to be configuration registers to describe the content. There needs to be a memory access structure that should hopefully be well suited to the algorithm. If the full image needs to be cached, there needs to be internal memory and possibly external memory. For internal memory, it can be structured as internal SRAM which is EXTREMELY fast, but also quite expensive. If the image size is huge, then there will need to be a memory controller and memory will need to be addressed in bursts. This is true for external memory as well.
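              To make that concrete, here is a very rough C-flavoured sketch of what just the front end of such a core amounts to. Every name and field below is made up for illustration; a real core expresses this as clocked ports and registers in HDL, not function calls.

              /* Hypothetical per-clock streaming model of a video core's front end.  */
              #include <stdint.h>
              #include <stdbool.h>

              typedef struct {
                  uint32_t width, height;    /* configuration registers               */
                  uint32_t x, y;             /* current position in the frame         */
                  uint16_t linebuf[4096];    /* on-chip SRAM: one line of context     */
              } core_state_t;                /* (assumes width <= 4096)               */

              /* Called once per clock with one input word; may emit one output word. */
              bool core_clock(core_state_t *s, uint16_t pixel_in, uint16_t *pixel_out)
              {
                  uint16_t above = s->linebuf[s->x];   /* neighbour from previous line */
                  s->linebuf[s->x] = pixel_in;         /* keep context for next line   */

                  /* the algorithm itself has to be re-expressed as per-word updates   */
                  *pixel_out = (uint16_t)((pixel_in + above) / 2);

                  if (++s->x == s->width) { s->x = 0; s->y++; }
                  return s->y > 0;           /* nothing valid until context exists     */
              }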

              Something extremely important which I'm sure they don't understand is that things like loops and dynamic memory allocation don't really work in ASIC. Most people have no idea that the Ethernet MTU is still important because all the devices on a network using ASICs for forwarding need to preallocate the buffer memory on their ports to support the MTU, because efficiently managing memory is nearly impossible in ASIC.

              A lot of people are commenting that they should try FPGA or GPU or whatever first... this isn't worth the effort.

              They have written an algorithm which works in terms of sequential processing of data. They use loops and conditions.

              I'm not sure why they were thinking ASIC, but they made a comment about "For use in cameras" which leads me to believe they want to be an image processing core on a camera... nifty. Cameras do use ASICs. But the real estate on a circuit board within a camera almost certainly would not allow another chip to be placed there. But for the moment, let's imagine that it could.

              The ASIC is chosen over a C function here (the camera can certainly run C code, and it certainly has an image processing unit to accelerate it) because of performance. They want to be able to process the algorithm live. And they want to do it for what I'll assume are 40+ megapixel cameras... live... as in real-time.

              To support streaming from a 40MP CCD into an ASIC in real time in HDR, let's assume 12 bits per pixel per channel... so 40 million * 12 * 3. That's 1.44 gigabits per image in and 1.44 gigabits per image out, or 43.2 gigabits per second each way at 30 frames per second.

              There would have to also be a 1.44 gigabit frame buffer including addressing logic within the ASIC. If this is a traditional FIR/IIR style filter, 1.44Gb is enough to do the job... but if this is "an algorithm" rather than a filter, I'd imagine that there needs to be an output buffer as well.
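              For reference, a throwaway check of the arithmetic above, using the same assumptions (40 MP, 3 channels, 12 bits, 30 fps):

              #include <stdio.h>

              int main(void)
              {
                  const double pixels = 40e6, channels = 3, bits = 12, fps = 30;

                  double bits_per_frame = pixels * channels * bits;     /* 1.44e9 bits */
                  double gbps_each_way  = bits_per_frame * fps / 1e9;   /* 43.2 Gb/s   */
                  double buffer_mbytes  = bits_per_frame / 8 / 1e6;     /* 180 MB      */

                  printf("%.2f Gb per frame, %.1f Gb/s each way, %.0f MB frame buffer\n",
                         bits_per_frame / 1e9, gbps_each_way, buffer_mbytes);
                  return 0;
              }

              1.44 Gb is 180 MB per buffered frame, which is why the internal-vs-external memory question above ends up dominating the design.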

              Depending on the processing latency of the algorithm, it might be necessary to have a third buffer in place for staging the next image while processing the current one. So, let's assume we'll nee
              • Brilliant response and fascinating read.
              • Re: FPGA (Score:5, Insightful)

                by id ( 11164 ) on Sunday August 25, 2019 @02:54AM (#59122354) Homepage

                probably cheaper to hire this guy.

              • My answer to their question is that there is no possible relation between C code and the costs involved in an FPGA. They could however use some Intel FPGAs, develop in C and then mask their FPGA as an ASIC, but the average cost for such a design will be about $300 per fabricated chip with a minimum order of 1000 or so. Of course, the chip would be worthless without first sorting out how to use RAM and how to handle I/O.

                Which is why I'm saying they should prototype it on an FPGA. That will show them quite well what they're up against. Building a gate array to add two 8-bit integers is a lot different than "int a = b + c" (see the adder sketch below).

                https://www.youtube.com/watch?... [youtube.com]

                Oh but wait, they're doing image processing? Ok, now make a gate array that does a dot product of two vectors. Bet your C code doesn't even do a dot product, does it? Some library that you don't even understand does it for you, and in a way far different than an ASIC has to
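                For anyone who hasn't done it, here is roughly what "add two 8-bit integers" turns into once you have to think in gates: a ripple-carry adder spelled out as bitwise logic. Purely illustrative; a synthesizer emits this for you, but every AND/XOR/OR below is real area and real propagation delay.

                #include <stdint.h>

                /* One full adder: two input bits plus carry-in -> sum bit and carry-out. */
                static unsigned full_adder(unsigned a, unsigned b, unsigned cin, unsigned *cout)
                {
                    unsigned s = a ^ b ^ cin;
                    *cout = (a & b) | (cin & (a ^ b));
                    return s;
                }

                /* 8-bit ripple-carry add: eight full adders chained through the carry. */
                uint8_t add8(uint8_t a, uint8_t b)
                {
                    unsigned carry = 0, sum = 0;
                    for (int i = 0; i < 8; i++) {
                        unsigned bit = full_adder((a >> i) & 1, (b >> i) & 1, carry, &carry);
                        sum |= bit << i;
                    }
                    return (uint8_t)sum;
                }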

              • by evanh ( 627108 )

                ASIC is the natural target. At this stage it's irrelevant whether they have in mind a separate chip or not. That's a case-by-case marketing question.

                FPGA development is a requirement for any solution. The FPGA design can directly be a library part for any hardware. Without that they have very little more than a research example.

                GPU is just a taster for parallel processing experience. They're clearly trying to learn how to get to a hardware solution.

              • by Arkham ( 10779 )

                Wow smart people on Slashdot still exist. Thanks for this enlightening answer. I came here to suggest an FPGA or PLA but this is a much better answer than I could have mustered.

              • A lot of people are commenting that they should try FPGA or GPU or whatever first... this isn't worth the effort.

                I don't understand this statement at all. Going to ASIC without testing on an FPGA first would be insanely stupid.

                They have written an algorithm which works in terms of sequential processing of data. They use loops and conditions.

                The whole point of going from CPU to FPGA/ASIC is that you can avoid the loops, and go to clocked pipeline stages running in parallel.
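                A toy illustration of that loop-to-pipeline shift, written in C only so the two shapes can be compared side by side. The three stages are made up; in HDL the "pipeline registers" become actual flip-flops and all three stages really do run in the same clock cycle.

                #include <stdint.h>
                #include <stddef.h>

                static uint8_t stage1(uint8_t p) { return p ^ 0x80; }          /* made-up steps */
                static uint8_t stage2(uint8_t p) { return (uint8_t)(p >> 1); }
                static uint8_t stage3(uint8_t p) { return (uint8_t)(p + 7); }

                /* CPU style: each pixel goes through all stages before the next starts. */
                void filter_loop(const uint8_t *in, uint8_t *out, size_t n)
                {
                    for (size_t i = 0; i < n; i++)
                        out[i] = stage3(stage2(stage1(in[i])));
                }

                /* Hardware style: r1/r2 model pipeline registers between stages, so on
                   every "clock" three different pixels are being worked on at once.    */
                void filter_pipelined(const uint8_t *in, uint8_t *out, size_t n)
                {
                    uint8_t r1 = 0, r2 = 0;
                    for (size_t clk = 0; clk < n + 2; clk++) {
                        if (clk >= 2) out[clk - 2] = stage3(r2);   /* stage 3 drains     */
                        r2 = stage2(r1);                           /* stage 2            */
                        if (clk < n)  r1 = stage1(in[clk]);        /* stage 1 fills      */
                    }
                }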

        • by Cederic ( 9623 )

          Buy? Rent them from Amazon or Google to test and finetune the algorithm. You don't need to buy them at all.

          Although to be fair if all you're doing is testing and finetuning then you can just buy one decent games machine and work on that. It'll be slower than a proper work oriented GPU but you can take it home and game on it afterwards.

          (Yeah, I've been training AI on my games PC this week)

      • Is there some rough way to calculate from the source code size of an algorithm -- lets say 100 Kilobytes of C code, or 1000 lines of code -- a rough per-unit estimate of how much the ASIC hardware equivalent might cost to make?

        Short answer: No, 'lines of C' have no meaningful link to 'cost of ASIC'.

        However....

        In fact there are countless ways to calculate pretty much anything. In this case, there are so many groups of variables with such vast possible fluctuations that the final total product-run cost might have a range into the millions of dollars. But one should see pretty quickly that, presuming say a total cost of $500K for a viable product, 10,000 units, and 150k of code, it really tells you nothing of any actionable value to know that it cost you 3.33333 ten thousandths of a cent per byte. Maybe the memory chip is different

        • most likely your business will fail (as is true of all new business).

          This is useless knowledge. Instead, you should be looking to understand why businesses succeed, because that knowledge is actionable.

      • by gl4ss ( 559668 )

        I don't see how the code size is going to relate to the cost at all, either.

        And nobody is going to give any kind of accurate estimate of how much work it is without seeing the algorithm. Or how much of it is redundant. Or even whether it's going to be faster and cheaper than using an ARM chip to run the algorithm anyway.

        Still, it would be logical to develop an FPGA-based prototype first and then ask them to turn that into an ASIC.

      • Re:FPGA (Score:5, Informative)

        by tlhIngan ( 30335 ) <slashdot&worf,net> on Sunday August 25, 2019 @05:12AM (#59122540)

        Yes, start with an FPGA, because really, that's how EVERYONE does it. Once you're done, you take the same Verilog or VHDL code and then add in your foundry libraries to generate the transistors and simulations.

        In fact, the general order of things is you write a software simulation of what you want done. Then you create a hardware simulation of it (via software, so it runs slowly). Then you do an FPGA simulation of it (FPGAs will be much faster). Then it's ASIC time.

        Thing is if the FPGA runs "well enough", you can often stop there - either using the FPGA in your final design directly, or using one of the many "FPGA ASICs" out there where they are basically the same FPGA but hardcoded with your bitstream, which makes them slightly cheaper.

        ASICs are expensive - you roughly have to budget around 1 million dollars for just the masks. Each mask is around $100K, and you need many of them - 3-5 for the transistors (you'll need well diffusions, N and P diffusions, and poly masks and maybe even contact masks), plus at least 2 for each metal layer you plan on, and you have a minimum of 2 metals (metal layout, through-contact layout).
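        Plugging in the rough per-mask figure quoted above (all numbers approximate, as stated):

        #include <stdio.h>

        int main(void)
        {
            const double cost_per_mask  = 100e3;  /* ~$100K each, roughly              */
            const int    base_masks     = 5;      /* wells, N/P diffusion, poly, contact */
            const int    metal_layers   = 2;      /* bare minimum                      */
            const int    masks_per_metal = 2;     /* metal + through-contact           */

            int masks = base_masks + metal_layers * masks_per_metal;
            printf("%d masks ~= $%.1fM before the first respin\n",
                   masks, masks * cost_per_mask / 1e6);   /* 9 masks ~= $0.9M */
            return 0;
        }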

        And you seriously need someone who knows what they're doing, because when you respin the layout due to bugs (steppings), you want to have a sea of spare and unused transistors so you can rewrite the chip without redoing all the transistor masks. That's what the stepping refers to - going from A0 to A1 means it's a metal revision (the transistor layouts are identical - if you needed spare transistors or gates, they were already placed and just made inactive, ready for fixing any bugs you might have). But going from say, A2 to B0 means the entire mask set was replaced.

        A good design will ensure that a bug fix will incur as few mask redesigns as possible - nothing good about having to re-do an entire 10 metal mask set to fix one bug, when proper planning would've isolated it to around 2 metal layers.

        It's even worse if you have a performance requirement, because now you need to make sure your design is tolerant to variations to achieve the speed you need, otherwise you need to do grading. At least the foundry can help you here by making chips with intentional variations for your characterization so you can see how fast and how slow your chip will be, but it's only available after the masks are made.

        And yes, you employ tons of simulation tools, simulation accelerators etc before you tape out. FPGAs help a lot too, since you can manually place and route to help speed things up, which you want to do because ASICs are so expensive. Sure, a fully done implementation using an FPGA will be expensive and slower, but you can fix it with the aforementioned "FPGA ASIC", which is either program-once, or you can get mask-programmed versions where you only pay for one mask to be made - the one containing your code.

      • I actually wrote the C model for a graphics ASIC once without knowing what I was doing. We got it working but wow that made some actual hardware engineers unhappy. Having learned how this works the hard way, I'd like to advise you that performance modelling of hardware is an entire career path and that you are asking a far more complicated question than you know.
      • by ebvwfbw ( 864834 )

        'Engineering through slashdot' is even more stupid these days...

        Probably a step above engineering by Google, however.
        I've seen that done. It didn't turn out well.

      • Short answer: No, 'lines of C' have no meaningful link to 'cost of ASIC'.

        Exactly. If you have a C program then you have cheap commodity hardware that can run it. You build an ASIC when you find an efficient algorithm that DOESN'T run on commodity hardware, hence can't be expressed in C. And you don't generally build it for the entire C program, you build it for the parts of the C program for which you found a better algorithm.

        Like the O(1) packet routing algorithm that can be implemented with a TCAM when the best algorithm in C is O(log n).
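        For the curious, this is what a TCAM does, written out sequentially in C: every entry stores a value plus a care-mask, and a lookup checks all of them. In silicon each entry has its own comparator, so the whole scan below collapses into a single-cycle parallel match. The struct and table here are illustrative, not any particular router's.

        #include <stdint.h>
        #include <stddef.h>

        typedef struct {
            uint32_t value;   /* route prefix, e.g. 10.1.0.0       */
            uint32_t mask;    /* which bits matter, e.g. a /16 mask */
            int      port;    /* result if this entry matches       */
        } tcam_entry;

        /* Entries are stored most-specific first; first hit wins. */
        int tcam_lookup(const tcam_entry *t, size_t n, uint32_t addr)
        {
            for (size_t i = 0; i < n; i++)          /* in hardware: all at once */
                if ((addr & t[i].mask) == t[i].value)
                    return t[i].port;
            return -1;                              /* no route */
        }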

    • Re:FPGA (Score:4, Insightful)

      by locater16 ( 2326718 ) on Saturday August 24, 2019 @07:13PM (#59121774)
      This is the correct answer; besides, the criteria OP gives are useless. How fast do you want your program to run, how parallel is it, how much memory do you need, of what type?

      The memory the program that runs takes up, just by the code itself, is an utterly useless metric to go on. People experiment with FPGAs first and foremost to figure out what they need out of an ASIC.
    • Exactly... Obviously the person asking the C->ASIC question doesn't really know anything about any of the processes involved. They need to work with people that do. On that note, I wouldn't advise anyone to work with them though as there is little point in creating a product for someone like that unless you are a partner. Even then, it is unadvisable as they haven't the faintest clue what they are doing, and won't put in any substantial work. They will benefit with no work involved and if it fails, will
      • It also doesn't say much for the value of their video coding algorithm. Have they really managed to come up with something better than, say, H.265, for which you can buy chips off-the-shelf?
    • Yes. Once they implement their algorithm in synthesizeable Verilog or VHDL, they will be in a much better position to estimate cost.

    • by gweihir ( 88907 )

      Actually, not doing that first would be pretty foolish. In particular, it may turn out that the speed-ups are not that good.

      Also, this is decidedly a "if you have to ask this, then you lack half a decade or so of experience to do it competently" question. For example, due to an entirely different data-flow model than software, LoC counts (which are mostly useless for anything anyways) are even more useless than usual. I would say though that $0.5M is on the low side for an ASIC.

      That said, usually putti

    • Why not prototype it with an FPGA first?

      Because they should be looking at production on an FPGA (yes, after prototyping on an FPGA).

      The quoted NRE of $0.5M would only cover mask costs for a low volume/old technology device. There are very few designs which fit into this niche between using an FPGA and a high volume ASIC (which will cost a lot more in NRE charges).

      Given the much lower risk with an FPGA, doing an ASIC is a crazy choice.

      • If we're talking an ASIC with any modern process, then just producing the masks will run a lot more than $0.5M.

        If you include the tools and the cost of producing the masks in the NRE, then you should be looking at an NRE of at least $5M, likely much more, depending on complexity. It's been about 5 years for me but the total NRE for the first generation ASICs I've been involved with in the past were over $30M. The projects I am referring to had between 10M and 40M gates depending on the market segment the

    • Probably the best solution. FPGA designs and GPU and CPU algorithms can be quite different, but at least the FPGA will be HDL. Drop me a line if you need help...
  • by sirsnork ( 530512 ) on Saturday August 24, 2019 @06:44PM (#59121714)

    I can't answer your actual question, but I get the feeling you'd be better off trying to leverage existing hardware. Video encoding/decoding processes are broadly similar and you could probably leverage existing hardware acceleration built into video chips already by using the appropriate APIs.

    It seems exceptionally unlikely any OEM would add an extra ASIC to any sort of media player just to get hardware support for your particular codec without it being a revolutionary change, and even then I have my doubts

    • One reason I have seen companies want ASICs is to keep their proprietary stuff protected. In fact, this is the norm in most IoT products where code is written to the chip, and the read line's e-Fuse is blown. Having an ASIC further helps this.

      It also helps differentiate them from commodity stuff, especially if all their stuff is hardware accelerated.

      Problem is that you will be paying a ton of cash for every step of the ASIC's process. You better make damn sure you have everything well nailed before tape-

      • The other problem is that it doesn't work. They're almost certainly going to have to have their hardware fabbed in China, so their design will get stolen anyway.

  • Also depends on how good you are at negotiating. When you are manufacturing hardware, there are a lot of pitfalls. A lot of them. Your top priority should be to find someone who can navigate the difficult path successfully. It will save you millions of dollars and save you from almost certain failure.
  • You Probably Can't (Score:4, Interesting)

    by tsqr ( 808554 ) on Saturday August 24, 2019 @07:02PM (#59121748)

    We've been told that "the chip-design engineering fees alone would likely start at around $500,000." We've been told "the cost per ASIC will fluctuate wildly depending on whether you are having 50,000 ASICS manufactured or 5 million ASICs manufactured." Is there some rough way to calculate from the source code size of an algorithm -- lets say 100 Kilobytes of C code, or 1000 lines of code -- a rough per-unit estimate of how much the ASIC hardware equivalent might cost to make?

    In a word, no. Both of the things you've been told are true. ASIC cost is essentially driven by the non-recurring engineering fees, assuming the chip isn't huge (meaning fewer chips per wafer) and yields aren't low (again, meaning fewer chips per wafer). There is no magical formula for converting lines of source code to dollars per ASIC.

    You could try this. [sigenics.com] I'm guessing, though, that you won't be able to answer the questions in the form.

  • I could write 5 lines of C that multiply a couple of million-by-million matrices and do a trig function on each sum... yup, 3 terawords of storage space plus the vector math pipelines with Chebyshev polynomial gizmos in them...

    guess how much a circuit that does that would cost to fab? more than most here would earn in their lifetime....

    • by Cederic ( 9623 )

      How long would those 5 lines of C take? I'd like to see them.

      I'm not doubting you, and I'm quite comfortable with them omitting the multiple libraries you'd sensibly utilise, I'm just curious.

      • by gtall ( 79522 )

        Could probably do it in 5 lines of APL...hell, probably one line would be sufficient.

      • for (r = 0; r < BIGASSSIZE; r++) {
            for (c = 0; c < BIGASSSIZE; c++) {
                matrixres[r][c] = 0;
                for (s = 0; s < BIGASSSIZE; s++)
                    matrixres[r][c] += sin( matrixa[r][s] * matrixb[s][c] );
            }
        }

        Here's my invoice:

        For: Cederic

        • by Cederic ( 9623 )

          Interesting. No idea why you need to do that but go for it.

          I'll pay you if I ever actually use this, thanks.

  • by gupg ( 58086 ) on Saturday August 24, 2019 @07:02PM (#59121756) Homepage
    C to HDL compilers are inefficient. Worth trying on something called an FPGA (it's flexible hardware logic that can be molded to pretty much any algorithm). The way to think about costs is:
    - Team to convert C to HDL manually: ~12 people for something reasonably complex
    - Optimize & map to a hardware architecture: ~6 people
    - Design tools: ~$1M
    - Chip tapeout costs: $1-5M depending on whether you want to use the latest or old tech
    - If there is any software work that goes on top of your chip (because you have a CPU), then add that team too
    Per-chip costs:
    - Each wafer costs ~$5000
    - Let's say you get ~50 net good die, then it's $100 per chip
    I suggest talking to a chip design consulting company & getting a quote
    • by gupg ( 58086 )
      btw, if your chip is small, you will get 250 net good die, so the cost will be $20 per chip. If your chip is complex, you need a large verification team + costs
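      As a quick sanity check on those per-chip numbers, using only the figures quoted above:

      #include <stdio.h>

      int main(void)
      {
          const double wafer_cost = 5000.0;
          const int good_die[] = { 50, 250 };   /* big chip vs small chip */

          for (int i = 0; i < 2; i++)
              printf("%3d good die/wafer -> $%.0f per chip (silicon only)\n",
                     good_die[i], wafer_cost / good_die[i]);
          return 0;   /* packaging, test and margin come on top of this */
      }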
    • Design tools: ~$1M

      Yep, the perversion of the modern age...

      • It isn't JavaScript running in a browser. Why anybody would think to ask Slashdot about it is mystifying.

        • I was thinking more of Lola(-2) or OKAD. These days the software practice in the design field sounds like something /. really wouldn't want to have anything to do with. It sounds really ugly.
    • Well, I disagree with some of your numbers. On 55 nm (old tech), we were seeing mask costs on the order of $0.6 million, and wafer costs on the order of $1000, and for small die (20 mm^2) roughly 3000-3500 die will yield per wafer (a 12" / 300mm wafer is about 70,000 mm^2).

      If they actually want to realize a die, some of these costs can be reduced with a shuttle run - the fab collects a number of small designs and combines them on a single mask set. You may only get 25-100 die out - but the upfront costs a

    • Unless they are on a shared wafer they will get many more than 50 die off a wafer. But even so, it would require expected sales of millions of units in order to get the unit cost low enough for inclusion in a product. And all that cost would be front-end, basically it would require production ready units to even open dialog with end product manufacturers. If I were building something I would not want to include a chip from an unknown vendor, especially if they had no working samples.

      In college I fabric

  • First prototype your design in an FPGA, then move to an ASIC. I'd recommend the Xilinx Zynq platform. They have a great Dev board (ZCU102). After you succeed in implementing your algorithm in the FPGA, you'll be ready to get an accurate ASIC quote.
  • I worked at a company that implemented Video codecs on their own custom hardware. We did some MPEG-4 -> MPEG-2 transcoding of multiple streams on the fly. No lines of C code translate directly into lines of HDL or complexity of an ASIC. You need someone to look at your codec design and architect some hardware based on that. We prototyped with FPGAs, (which is a bit like a flexible/emulation platform) where you can inject software/hardware structures for debugging. Later on, once your product is prototyp
  • Comment removed (Score:5, Informative)

    by account_deleted ( 4530225 ) on Saturday August 24, 2019 @07:09PM (#59121770)
    Comment removed based on user account deletion
    • FPGAs are a good place to turn the C into something that resembles hardware and get it working, and get a rough size estimate. They're planning to hit the mass consumer market though, and FPGAs are far too expensive and power hungry for the final solution.

      Starting with the GPU approach isn't obviously a winner unless the consumer device they want to target has that GPU. Designing a GPU in an ASIC is massively complex and expensive and while I've not tried to procure someone else's IP (say ARM's Mali) I'm going to

      • I think the grandparent comment meant implement their C code on a GPGPU (in OpenCL etc) before even bothering with a hardware route. If they can implement their algorithm on a GPU they can just license the code/patent to third parties. An ASIC is unnecessary if the job can just be done on the GPU in a filter graph for video.

  • by Photo_Nut ( 676334 ) on Saturday August 24, 2019 @07:11PM (#59121772)

    Q: How do you estimate how much time it will take to cook a dinosaur egg?
    A: Get a dinosaur egg and cook it.

    Sometimes you just need to go through the motions. Do *you* think it's worth doing? Then do it. Convince sponsors to support your efforts.

    Tell me you're going to be selling X product in Y months without having X product ready to sample and I'll get the idea to integrate something similar onto my FPGA and try it out there. Go out and do it, and tell me it's ready for me **and** my competition, and I'll buy it first.

    What stops your algorithm from working *just as fast* on the GPU that comes with the latest ARM SoC? Eventually they'll put an FPGA on them, too. Do they already have an ASIC for video coding like they have an ASIC for ML? Can you adapt your algorithm to that?

    • Do they already have an ASIC for video coding like they have an ASIC for ML?

      Yes, and often it's hyperspecialized to one or more of a small number of MPEG or H.26x codecs commonly used in broadcasting or home video. These commonly include MPEG-2 (aka H.262, used in DVD and ATSC), MPEG-4 Part 2 (an H.263 extension best known from DivX and Xvid), AVC (aka H.264), and HEVC (aka H.265).

      • by guruevi ( 827432 )

        Modern video encoders/decoders have a lot more than just those; they are typically just a specialized processor with all the generic calls that make up those codecs and then some. This means they are adaptable simply by using some firmware to also do other less common media formats like VP8/9, various audio formats and, depending on required bandwidth, even multiple streams or simultaneous encode/decode.

    • Q: How do you estimate how much time it will take to cook a dinosaur egg? A: Get a dinosaur egg and cook it.

      And you're likely going to need to do it three or four times before your estimate is right.

  • 1. Tell the world all about the math for a video-processing algorithm.
    2. Smart people after work, who have retired, who are learning will do the design/engineering.
    3. The tested and ready results will be sent to "Custom Chip" factory.
    4. Robots and computers make the hardware. Humans and robots test to ensure perfect quality.
    5. The production run will be for the number of people who want the product.
    6. Everyone gets to learn and the world gets free code/math with the open hardware?

    Get a univer
  • I'm going to guess if you're asking /. for professional advice, then it's too much. Look into FPGA instead.

  • Wrong question (Score:5, Insightful)

    by Kohath ( 38547 ) on Saturday August 24, 2019 @07:32PM (#59121802)

    I think you are asking the wrong question. The questions you should be asking are:

    - what is the end market?
    - how many of the final devices will sell?
    - at what price?
    - then how much more will the customer be willing to pay because your chip is included? How many more units will be sold because your chip is inside?
    - what are typical margins for companies in similar businesses?

    Given the sales increase, price increase and margin, that will start to give you an idea of the opportunity.

    If it's not a really nice opportunity, the cost of turning the algorithm into an ASIC isn't your real problem. If it's a really nice opportunity, then your hardware partner probably knows more about how much extra an ASIC will cost than whatever you will come up with.

    If the customer will pay a lot more but the number of units is limited, then consider an FPGA implementation like the other commenters suggested.

    An ASIC can cost a few cents or many thousands of dollars per chip. The C to HDL tools are probably the way to go, but you will still need to rewrite almost all your code. Those tools have gotten a lot better.

  • What is your reason for wanting to put it on an ASIC? If it works on a DSP or some other processor, that is almost always going to be your cheapest route. You typically only go to an FPGA or ASIC if you can't meet design requirements, e.g. power, throughput, latency. Not to mention, the processor-based solution, if it can work, will almost assuredly be easier to modify or fix in the future.
  • You've already been told your answer of starting at around $500k and from there it depends on how many chips you order. Why are you asking the question on here?

    If you are serious about your product have at least one of your team move over to China and get involved in the maker/small consumer electronic devices scene over there to make the contacts with the manufacturers. Otherwise you aren't going to get any kind of deals.

  • In recent years the C-to-HDL efficiencies have improved significantly. Most vendors quote 15% inefficiency, meaning the resulting circuit may consume around 15% more transistors, run slower, etc. But by some measurements they actually rival the ability of what a team of small humans could possibly do in a short amount of time (2-3 months). If you're prototyping or implementing the final thing on an FPGA Xilinx and Intel both have tools that compile from C and C++ into HDL (Verilog or VHDL) funct
    • > they actually rival the ability of what a team of small humans could possibly do

      How would they compare to full-grown humans?
      My small human (she's 5 years old) does some smart things, but it's a weird comparison.

      Or did you mean Japanese engineers?

  • So I am a digital engineer who writes a lot of VHDL from scratch to target both FPGAs and ASICs. Due to the cost, we always prototype on FPGAs first because mistakes are just too expensive. What you are asking for does not really exist. In C, you are defining a serial operation. Duplicating that exact process in an FPGA or ASIC makes no sense because there is nothing to gain. Your algorithm needs to be broken up into multiple parallel paths. FPGAs and ASICs are just digital circuits that you get to create a
    • There are several commercial C to HDL tools out there as well as academic.
      Catapult HLS for example. LegUp is another, I think Xilinx has their own HLS as well.
      I never used any of them except for small experiments, and unless your code already closely matches the input they expect you will have to do a lot of rewriting anyway, but it does exist.
      Not sure how good the results are compared to hand written HDL.

      But yeah, asking for an estimate based on lines of code is nonsensical.
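      To give a feel for what "closely matches the input they expect" means in practice, here is the style of C those HLS tools tend to want: compile-time bounds, static storage, no malloc or recursion, and one sample per loop iteration. The pragma shown is Xilinx-HLS-style and the 3-tap filter is a made-up example; Catapult and LegUp spell their directives differently.

      #include <stdint.h>

      #define WIDTH 1920

      /* Simple horizontal smoothing filter over one scan line. */
      void hfilter(const uint16_t in[WIDTH], uint16_t out[WIDTH])
      {
          uint16_t window[3] = { 0, 0, 0 };

          for (int x = 0; x < WIDTH; x++) {
      #pragma HLS PIPELINE II=1          /* ask the tool for one result per clock */
              window[2] = window[1];
              window[1] = window[0];
              window[0] = in[x];
              out[x] = (uint16_t)((window[0] + 2 * window[1] + window[2]) >> 2);
          }
      }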
      • You can drop an ARM core into a chip, run the code on the ARM core, and not even worry about the translation. A lot depends on what the interfaces are, whether he needs to keep up with line rate, what the bandwidth is and how complex the code is.

        There's just nowhere near enough information, and the way the question is posed suggests a major impedance mismatch between what he's expecting and how things really work.

    • Seems to be the proper place to ask this question. I was never a real engineer, no matter what my visa said.

      My rough rules of thumb:

      Human hardware is about 10 times faster than human software.

      Basic computer hardware is about 100 times faster than typical software. (This covers the EEPROM or FPGA approach.)

      Fully optimized custom hardware is at least 1,000 times faster than software. (This is the ASIC side.)

      Do you [Josh White] think that's a plausible description?

      If so, then the "cost" has to be considered i

  • What they're great at is doing the same thing over and over, like if your code has very little branching and is operating on the same tiny amount of data. I know there were some attempts at superior colors, but it's really hard to become an expert...

    • Speaking of not being an expert.....

      I have no idea what this is all about. But to be honest, if 90% of the articles on /. were like this, I'd still be here. This is news for nerds. This is the sort of window into a subject I know nothing about that /. used to (occasionally) be awesome at giving.

      More of this, less of the politics and social hot-button topics of the day. Nobody is trolling this. It's just a bunch of really smart people providing their expertise. What /. is supposed to be. Thanks all!

  • by known_coward_69 ( 4151743 ) on Saturday August 24, 2019 @08:59PM (#59121908)

    You'll never make money selling extra chips unless you can do something really crazy like 16k decoding at the hardware level for 1080p or so. Your best bet is to work out the silicon and license the patents and processes to chip makers as part of their CPUs or SoCs.

    • This. No one is randomly adding chips to their phones because someone approached them.
      They add them because they need some functionality.
  • by svirre ( 39068 ) on Saturday August 24, 2019 @09:03PM (#59121918)

    The size of your algorithm in code does not tell me anything about how expensive it is to implement in hardware. To estimate costs, these are the steps you need to take:
    First off you need a design team capable of handling the implementation. You need a digital design team that can come up with an architecture and a design for your problem. If you are unable to do any of this work yourself, expect to spend a few hundred K on a prestudy to get to a rough project plan with estimates for cost.

    Once underway, in addition to design, you need a DV team to verify the design pre-silicon (never let designers verify their own code; DV is a completely different job). You need a back-end team to translate the design into a chip layout complete with power and clock support circuitry, you need test and production engineers to make the device manufacturable, and you need a team to build a verification platform and PCB for the device so you can test it. A real-world device will also need some IO, which you will likely buy as IP. You also need cell library and memory IPs, and likely clock and power as well. I am assuming you will be doing your own software and driver work.

    Once the design is done for your chosen process you will get hit by NRE charges to make the reticles. This varies with process. Current-day mid-range processes (~45nm) will likely cost you 1-2 million US$ for a reticle set. It is good practice to assume you will need at least two sets, as mistakes WILL happen. You also need to pay some NRE for packaging and test boards. This is not too bad and you can get far with a few 100K. If your project calls for very few devices and cost is not important, then you can look for MPW shuttles (Multi Project Wafer). These are common in academia and have multiple designs sharing a reticle. This has the impact that your schedule just got very firm, and the yield will be very, very low, so the cost per chip will be high. You will also have to deal with a smaller max die size, as the reticle needs to be divided into even-sized slots.

    In general there are very few applications where an ASIC makes sense.

  • Kind of a variation on 'if you have to ask how much it costs, you can't afford it'. The cases where custom chips like this decrease the total cost of a design are few and far between, and usually you end up looking at ASICs because more conventional options have already been exhausted and you have a particular limit known ahead of time that the ASIC needs to address.

    But yeah, it is almost always cheaper to throw more conventional hardware at something.
  • by anarcobra ( 1551067 ) on Saturday August 24, 2019 @09:31PM (#59121962)
    > We've been told that "the chip-design engineering fees alone would likely start at around $500,000."
    Seems about right

    >We've been told "the cost per ASIC will fluctuate wildly depending on whether you are having 50,000 ASICS manufactured or 5 million ASICs manufactured."
    Well yeah, economies of scale and all that.
    >Is there some rough way to calculate from the source code size of an algorithm -- lets say 100 Kilobytes of C code, or 1000 lines of code -- a rough per-unit estimate of how much the ASIC hardware equivalent might cost to make?
    No.
    It depends on your specific algorithm.
    If it's 100kB of lookup tables, that's a lot simpler than 100kB of complex for loops and floating point calculations (see the sketch below for why tables are the easy case).
    Long story short? Pay someone to take a look at the code and give you an estimate.
    Also depends on how accurate you want it to be. You already have an estimate of half a million.
    Seems like you want it to be more precise. No $/line conversion rate is going to give you any kind of precision better than a straight up guess.
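    To illustrate the lookup-table point above: a table maps straight onto a ROM, one read per result, while the arithmetic version drags a floating-point pow() onto the die. The gamma curve and sizes are made up for the example.

    #include <stdint.h>
    #include <math.h>

    static uint8_t gamma_lut[256];

    /* Done once, offline; the table itself is what would end up on the chip. */
    void build_lut(void)
    {
        for (int i = 0; i < 256; i++)
            gamma_lut[i] = (uint8_t)(255.0 * pow(i / 255.0, 1.0 / 2.2) + 0.5);
    }

    /* Hardware-friendly: one ROM lookup per pixel. */
    uint8_t gamma_lookup(uint8_t p) { return gamma_lut[p]; }

    /* Hardware-hostile: pow() per pixel means floating point on the die. */
    uint8_t gamma_compute(uint8_t p)
    {
        return (uint8_t)(255.0 * pow(p / 255.0, 1.0 / 2.2) + 0.5);
    }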
  • Think about all the silicon that is already out there,
    1. Are you sure this ASIC doesn't already exist, e.g. Sony, Hauppauge, Matrox?
    2. Can you FPGA your solution and still meet timing and power requirements?
    3. Does any other CPU/GPU exist that could be converted to your purpose?

    Doing silicon is a big, expensive step, especially if you don't have any experience doing it. With design screw ups, production issues, and yield issues, I could see this costing 2-3x your estimate. It is noble you want to do silicon,

  • You might have had a chance if you said you were targeting some esoteric high-end professional equipment manufacturer. However, you said you are targeting a consumer equipment manufacturer, and to me that says you have a snowball's chance in hell with this plan. Even if you already had a physical chip in hand rolling off a production line, they are not going to add your chip to their board. It makes zero sense to have additional chips for some video processing algorithm. Their product already supports tens of similar algorithm
  • Once it's in SystemC you can do a lot of analysis on it. If you can't afford to implement your algorithm more than once maybe you don't deserve to make a business out of it.

    • SystemC can do engineering cost-estimates? Color me impressed.

      • You can do really rough per-IP area estimates with it. And from there work out a rough estimate of total area very early in the design cycle. You can also do a lot of very early power estimates as well, and start thinking about your thermal budget.

  • I think you're approaching this from the wrong angle. Especially for video encoding/decoding there are already bunches of chips and designs out there; even a small increase in performance won't matter.

    What's more, manufacturers don't want anything specific; even the current hardware accelerators can simply be adapted for newer versions of the codec or competing codecs (e.g. an H265 chip will have the same guts required for a VP9 chip, so adding that functionality is simply firmware as long as the bitr

  • $5-10 million in dev costs to get to validated ASIC production, and you had better bring good annual volume or don't even bother. What counts as good depends on what you want your per-chip cost to be. Something cheap had better have 5-10 million annual volume or more. Something expensive could be 50-500k annual volume. And you need good upfront clarity on this so you can match with the right partner.
    • by fubarrr ( 884157 )

      Allwinner made the A10 for under $500k and in 7-8 months.

      The trick? They paid a third-party chip design studio that does nothing but cheap cookie-cutter SoCs from commodity macros.

      I genuinely see no reason to develop any kind of complex IC from scratch these days if you go for low volume tapeout

  • Why not aim for a hybrid solution?

    The original poster does not specify what the video processing algorithm does, so it could in principle be anything from compression and motion detection to some sort of artificial intelligence, so my comment lends itself more to the first two options than the last.

    All modern recording devices and players have a CPU, with all the benefits that brings, so why not employ that and offload only the hard-to-do operations to an ASIC (and use an FPGA as a prototype)?

    Image proce

  • > We've been told that "the chip-design engineering fees alone would likely start at around $500,000."
    > We've been told "the cost per ASIC will fluctuate wildly depending on whether you are having
    > 50,000 ASICS manufactured or 5 million ASICs manufactured." Is there some rough way
    > to calculate from the source code size of an algorithm -- lets say 100 Kilobytes of C code,
    > or 1000 lines of code -- a rough per-unit estimate of how much the ASIC hardware equivalent
    > might cost to make?

    Well,

  • If your reason to go ASIC is a performance boost, it may be better to first explore all the low-hanging fruit using a normal CPU and C code. Maybe there is a better algorithm (lower big-O), or one with better memory characteristics (more cache hits and better cache performance). Maybe even a parallel version. You can talk to a computer science person (a professor, say, in algorithms/theoretical computer science) at your local college/university. The point is you may be able to extract a lot more from the traditional way of
  • There are several reasons that may compel you to design an ASIC: the algorithm can't run in real time on the proposed hardware, or you need reduced hardware requirements for various resources (RAM, clock speed, etc.). Just don't do it because it sounds like a good idea.

    I've created dozens of ASICs over my career and there are things you need to consider before doing it.
    * Choosing the technology. You want to pick a process that meets your needs (with an engineering margin) at the lowest cost.
    * Will the algorithm c

  • by alvieboy ( 61292 ) on Sunday August 25, 2019 @07:02AM (#59122680) Homepage

    The quick answer to your question is: don't.

    To start, an ASIC is a standalone integrated circuit, and you seem to be aiming at markets where PCB space is at a premium. The real estate on these boards is rather expensive, and no one will add an extra IC just because.

    Second, forget about "C" to "HDL" - you seem to misunderstand how hardware works. So you have an algorithm, and you have implemented it in "C", but the implementation approach for hardware is probably something very, very different, if you want to do it efficiently. Unless you want a co-processor implementing the algorithm, you may need to rethink the implementation from scratch.

    Also, why do you need a hardware implementation at all? Is it for performance reasons? For power efficiency?

    In case you overcome the second point, and you find a proper way to implement things in hardware, I'd suggest to grab some CPU+FPGA fabric device (a Xilinx Zynq or an Intel Cyclone 10) and implement a mixed software-hardware demonstration. Once you get that, try selling the IP core (HDL) and the software glue to SoC manufacturers (although the SoC real estate is also expensive, but you might be luckier).

    Alvie

  • by Tony Isaac ( 1301187 ) on Sunday August 25, 2019 @07:30AM (#59122726) Homepage

    It seems that your problem is that you don't know the first thing about producing an ASIC. It's not possible to estimate a project which you know nothing about. You have to hire an expert.

    How many times have you seen an executive, or a sales person, estimate the time it would take to deliver a software project? How does that go? Not well, I assure you. People with no expertise in software development should not be estimating the cost of a software project. Same goes for ASIC.

    Once you have an expert, estimating this is the same as anything else:
    - Break the project down into small pieces
    - Estimate each piece
    - Add up the numbers

  • FWIW, this algorithm was used, with some good success on a series of ASICs back in the 1980s. I have no reason to doubt that the underlying ideas are still sound.

    Sometimes, the cost of the ASIC is dominated by the size of the package, other times it is the cost of the gates. I suspect that unless you have extreme I/O needs, this will turn out to be an irrelevance given modern densities; nevertheless:

    • First, count the gates.
      1. Estimate the number of bits of register state and sequencer state you need. IIRC,
  • Armored Dragon gave good advice. If you want a custom chip, try an FPGA first. However, for an "algorithm" how about coding it in something like CUDA that runs on a GPU. If your CUDA code works and is only a few 100 lines long, you might be 1/4th the way to a viable HW design.

    Converting from C code (or anything like it) to an ASIC is a losing prospect. You are in the wrong design space. Chips, ASICs, even FPGAs are all about parallelism. They don't look anything like C code. Even Verilog code repre

  • There actually is quite a bit of info. Like any other design you begin by converting your algorithm to Verilog or VHDL. You can then prototype it on an FPGA. But the key is going from a more "serial model" in C to a more parallel model in Verilog/VHDL. At least if you want efficiency. This also can give you a good idea of the actual chip area needed too.
  • "How do YOU estimate the cost of an algorithm turned into an ASIC?"

    The question is how would I estimate the cost of an algorithm turned into an ASIC? Since I cannot do this myself, the course I would follow is:

    1. Estimate how much it would cost to put together an RFQ, including pre-authorization to pay the responders for their time, and including pre-authorization for the eventual cost of preparing the estimate under the received Quotations.
    2. After obtaining the money to develop the RFQ, develop the RFQ.
    3

  • Why do we need this? Because we want to pitch our video processing tech to a company that makes consumer products, and they will likely ask us, "So... how many dollars of extra cost will this new video processing chip of yours add to our existing products?"

    Unless you know that's what they'll ask, don't waste your time solving the problem. To paraphrase parts of Extreme Programming (XP) manifesto:
    Goal: To produce a product that's easy to change
    Solve the customer's current need. Resist the urge to guess
