In Generative AI Market, Amazon Chases Microsoft and Google with Custom AWS Chips (cnbc.com)
An anonymous reader shared this report from CNBC:
In an unmarked office building in Austin, Texas, two small rooms contain a handful of Amazon employees designing two types of microchips for training and accelerating generative AI. These custom chips, Inferentia and Trainium, offer AWS customers an alternative to training their large language models on Nvidia GPUs, which have been getting difficult and expensive to procure. "The entire world would like more chips for doing generative AI, whether that's GPUs or whether that's Amazon's own chips that we're designing," Amazon Web Services CEO Adam Selipsky told CNBC in an interview in June. "I think that we're in a better position than anybody else on Earth to supply the capacity that our customers collectively are going to want...."
In the long run, said Chirag Dekate, VP analyst at Gartner, Amazon's custom silicon could give it an edge in generative AI...
With millions of customers, Amazon's AWS cloud service "still accounted for 70% of Amazon's overall $7.7 billion operating profit in the second quarter," CNBC notes. But does that give them a competitive advantage?
A technology VP for the service tells them, "It's a question of velocity. How quickly can these companies move to develop these generative AI applications is driven by starting first on the data they have in AWS and using compute and machine learning tools that we provide." In June, AWS announced a $100 million generative AI innovation "center."
"We have so many customers who are saying, 'I want to do generative AI,' but they don't necessarily know what that means for them in the context of their own businesses. And so we're going to bring in solutions architects and engineers and strategists and data scientists to work with them one on one," AWS CEO Selipsky said... For now, Amazon is only accelerating its push into generative AI, telling CNBC that "over 100,000" customers are using machine learning on AWS today. Although that's a small percentage of AWS's millions of customers, analysts say that could change.
"What we are not seeing is enterprises saying, 'Oh, wait a minute, Microsoft is so ahead in generative AI, let's just go out and let's switch our infrastructure strategies, migrate everything to Microsoft.' Dekate said. "If you're already an Amazon customer, chances are you're likely going to explore Amazon ecosystems quite extensively."
Re:Who is going to fab these suckers? (Score:4, Insightful)
but until the chip is fabbed, all of that means zilch.
If your design is just straight digital RTL, with no fancy analog or RF stuff, it will usually work on first silicon.
This chip is presumably a tensor processor, which means millions of multipliers all doing the same thing. They either all work or none of them work. Making a multiplier isn't that hard.
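For intuition, here's a toy numpy sketch of that kind of regular structure: a grid of identical multiply-accumulate (MAC) units all doing the same operation in lockstep, accumulating one matrix product. The function name and sizes are purely illustrative, not anything from Amazon's actual design.

import numpy as np

# Toy model of a MAC array: every "unit" (i, j) does the same c += a * b
# each step, accumulating one element of C = A @ B.
def mac_array_matmul(A, B):
    M, K = A.shape
    K2, N = B.shape
    assert K == K2
    C = np.zeros((M, N))
    for k in range(K):                    # one broadcast step per "cycle"
        C += np.outer(A[:, k], B[k, :])   # all MACs fire in lockstep
    return C

A = np.random.rand(4, 8)
B = np.random.rand(8, 3)
assert np.allclose(mac_array_matmul(A, B), A @ B)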
TSMC isn't going to be cranking anything new in the 3nm range
3nm is booked, but 5nm fabs have available capacity. At 5nm, Samsung is an alternative fabber.
Re: (Score:2)
Also, these are new designs, so why do they have to be on a newer node for the first batches? There are a LOT of fabs out there without enough load right now if you're willing to use an older node; that is, there are a LOT of other companies besides Samsung, TSMC, & Intel.
Re: (Score:2)
these are new designs so why do they have to be on a newer node on the first batches?
They have to be on the newest node for the greatest performance. If they are taped out for 5nm then they have to be taped out again for 3nm; you don't just select a different fab and hit "print".
Yes, I know they don't use tape any more, I'm just using that word to separate that part of the design process
not really correct. (Score:1)
"What we are not seeing is enterprises saying, 'Oh, wait a minute, Microsoft is so ahead in generative AI, let's just go out and let's switch our infrastructure strategies, migrate everything to Microsoft.' Dekate said. "If you're already an Amazon customer, chances are you're likely going to explore Amazon ecosystems quite extensively."
No, but what we are seeing is that customers who are AWS-only are adopting multi-cloud, and those just making the choice have one more reason not to choose AWS.
Re: (Score:2)
"What we are not seeing is enterprises saying, 'Oh, wait a minute, Microsoft is so ahead in generative AI, let's just go out and let's switch our infrastructure strategies, migrate everything to Microsoft.'
If only Microsoft hadn't spent their entire history being assholes... we might actually want Bing, etc.
This really is not about more computing power (Score:3)
No amount of computing power will give you better models because no amount of computing power will give you better training data. This seems to be a moronic reflex: It is some computing hype! Quick, get custom hardware!
Re: (Score:2)
As you add more parameters to an LLM, there are non-linear jumps in performance even with no increase in training data.
Adding more training data obviously helps, but it is not true that it is the only way to improve performance.
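For a rough sense of how loss can keep improving with parameter count alone, here's a sketch of a Chinchilla-style parametric loss, L(N, D) = E + A/N^alpha + B/D^beta: predicted loss keeps falling as parameters N grow even with the token count D held fixed. The constants roughly match the published Chinchilla fit and are illustrative only.

# Chinchilla-style loss fit; constants are approximate and for illustration,
# not a claim about any particular model.
E, A, B, alpha, beta = 1.69, 406.4, 410.7, 0.34, 0.28

def predicted_loss(n_params, n_tokens):
    return E + A / n_params**alpha + B / n_tokens**beta

D = 300e9  # fixed training-token budget
for N in (1e9, 10e9, 70e9, 500e9):
    print(f"{N:.0e} params, {D:.0e} tokens -> predicted loss {predicted_loss(N, D):.3f}")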
Re: (Score:3)
Yep. Also, it's not so much about quantity as quality. While quantity isn't irrelevant by any stretch, quality matters more than quantity. Also, in addition to more parameters, improved architectures and training methodologies yield better models. And it's an incremental process. You develop a model, you test it, you learn, and then you develop and train another one based on what you learned. So there's a continual market for more training.
My issue with this Amazon news is... I mean, it's great to have mo
Re: (Score:2)
Nope. You need better training data. More training data just causes more problems unless it is also better. After a certain amount, you just get overfitting from more training data and that makes the model unusable. The same can happen with more parameters, but it really depends.
Re: (Score:2)
No amount of computing power will give you better models because no amount of computing power will give you better training data. This seems to be a moronic reflex: It is some computing hype! Quick, get custom hardware!
Computing power is everything when you're training, refining the model, retraining ... seeing what works and what doesn't.
Re: (Score:2)
Complete bullshit. Computing power is a secondary concern.
Re: (Score:3)
This is a gold rush and Amazon are in the business of selling shovels. At the moment they're reselling NVidia brand shovels, but they want to have all the profit for themselves.
Re: (Score:2)
Yep. I think people don't really understand why NVidia is quite so popular. AMD certainly don't.
Deep learning works on any NVidia card. It doesn't matter how shit or low end: as long as your card's architecture is new enough to run the oldest supported driver version required by PyTorch, you can do deep learning. Doesn't matter if it's a GBP150 GTX 1650 or a $30,000 datacentre GPU, it will still run PyTorch (or TensorFlow if you're still on that).
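As a minimal illustration of that "it just runs" point, the same check works on any CUDA-capable card; every call here is standard PyTorch, nothing card-specific.

import torch

# Works the same on a GTX 1650 or a datacentre GPU, as long as the driver
# and compute capability are new enough for the installed PyTorch build.
if torch.cuda.is_available():
    name = torch.cuda.get_device_name(0)
    major, minor = torch.cuda.get_device_capability(0)
    print(f"Found {name} (compute capability {major}.{minor})")
    x = torch.randn(1024, 1024, device="cuda")
    y = x @ x                              # identical code path on any card
    print("matmul ok:", tuple(y.shape))
else:
    print("No CUDA device visible; PyTorch falls back to CPU")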
End result, everyone uses NVidia. Students learning will
Re: (Score:2)
No amount of computing power will give you better models because no amount of computing power will give you better training data.
Data quality is far from the limiting factor. There is still a ton of untapped potential for improving training outcomes by throwing more transistors and better algorithms at the problem.
Re: (Score:2)
And statements like that are the reason for the current hype: Too many people with no clue how these things actually work.
Re: (Score:2)
Data quality is far from the limiting factor. There is still a ton of untapped potential for improving training outcomes by throwing more transistors and better algorithms at the problem.
And statements like that are the reason for the current hype: Too many people with no clue how these things actually work.
On the subject of no clue... I really want to learn more about this impressive "language interface" that took half a century to achieve and how it contrasts with the dumb "reasoning engine" behind it.
I have references to support everything I'm saying.
Data not currently limiting factor
https://arxiv.org/pdf/2211.043... [arxiv.org]
Models can self-improve
https://openreview.net/pdf?id=... [openreview.net]
Model scaling
https://arxiv.org/pdf/2203.155... [arxiv.org]
Lots of places available to spend one's compute time
https://arxiv.org/pdf/2209.155... [arxiv.org]
Re: (Score:2)
You do seem to not understand the quality (or rather lack thereof) of your references. Anybody can upload anything they like there.
Chasing whom? (Score:3)
How much of the global AI processor market is supplied by Microsoft and Google? Not counting the processors in Microsoft's and Google's own data centers, their market share is certainly pretty close to zero. And inside Microsoft and Google, there are likely a lot of Nvidia processors.
Everyone is chasing Nvidia. It's not necessarily that Nvidia has the best hardware, although Nvidia appears to compete well among the best hardware. The problem is providing a close to turnkey solution in a field where models are changing monthly. A GPU happens to be a parallel processor that handles AI fairly well. Some ASIC and ASIC-like processors may be somewhat more optimal on some models and workloads, but it would be challenging to design a non-GPU-like processor that is optimal for all models and workloads, now and in the future.
Business folks talk a lot about time to market. Nvidia has first mover advantage in terms of flooding the market (business, academia, and tinkerers) with their systems. Just like IBM's reputation from many decades ago (i.e., won't get fired for buying IBM), Nvidia has already garnered a mindset in the field, so that competitors have the onus of showing that they are not only comparable but better than Nvidia.
Re: (Score:2)
Certainly not counting the processors in Microsoft and Google data centers
I think this is precisely what the article is about. Amazon isn't planning to sell these processors; they are planning to use them in their own data centers.
Low hanging fruit (Score:2)
What they really need are large analog matrix processors. Especially for batch inference: load the model into banks of affordable DRAM, concurrently and independently transfer the next chunk into the SRAM of the matrix processor, run a batch of inferences (thousands at a time?), and repeat.
Might not be able to compete with Nvidia in terms of realtime inference speed but you can clobber them on cost, power consumption, throughput and scalability.
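A toy sketch of that DRAM-to-SRAM pipelining idea, just to make the overlap concrete: the "DRAM" is a list of numpy arrays, the "SRAM transfer" is a copy on a background thread, and the compute for one chunk overlaps the fetch of the next. All names and sizes are illustrative.

from concurrent.futures import ThreadPoolExecutor
import numpy as np

rng = np.random.default_rng(0)
layers_in_dram = [rng.standard_normal((256, 256)) for _ in range(8)]
batch = rng.standard_normal((4096, 256))        # thousands of inferences at once

def prefetch(layer):
    return np.array(layer, copy=True)           # stand-in for the DRAM -> SRAM move

with ThreadPoolExecutor(max_workers=1) as pool:
    next_chunk = pool.submit(prefetch, layers_in_dram[0])
    acts = batch
    for i in range(len(layers_in_dram)):
        weights_in_sram = next_chunk.result()   # wait for this chunk's transfer
        if i + 1 < len(layers_in_dram):
            next_chunk = pool.submit(prefetch, layers_in_dram[i + 1])  # start next move
        acts = np.maximum(acts @ weights_in_sram, 0.0)  # compute overlaps the copy

print("final activations:", acts.shape)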
Perhaps small predictive models could be applied to work like branc