Hunting Malware With GPUs and FPGAs (hackaday.com) 44
szczys writes: Rick Wesson has been working on a solution to identify the same piece of malware that has been altered through polymorphism (a common method of escaping detection). While the bits are scrambled from one example to the next, he has found that using a space filling curve makes it easy to cluster together polymorphically similar malware samples. Forming the fingerprint using these curves is computationally expensive. This is an Internet-scale problem which means he currently needs to inspect 300,000 new samples a day. Switching to a GPU to do the calculation proved four orders of magnitude efficiency over CPUs to reach about 200,000 samples a day. Rick has begun testing FPGA processing, aiming at a goal of processing 10 million samples in four hours using a machine drawing 4000 Watts.
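For readers unfamiliar with space-filling curves, here is a minimal sketch of one well-known example, the Hilbert curve, used to lay a byte stream out on a 2D grid so that bytes close together in the file stay close together on the grid. The summary does not say which curve or which fingerprinting step Wesson actually uses, so the function names `hilbert_d2xy` and `bytes_to_grid` and the choice of curve are assumptions for illustration only.

```python
def hilbert_d2xy(order, d):
    """Map index d along a Hilbert curve to (x, y) on a 2**order by 2**order grid."""
    x = y = 0
    t = d
    s = 1
    n = 1 << order
    while s < n:
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        if ry == 0:
            # Rotate/flip the quadrant so the sub-curves join end to end.
            if rx == 1:
                x = s - 1 - x
                y = s - 1 - y
            x, y = y, x
        x += s * rx
        y += s * ry
        t //= 4
        s *= 2
    return x, y


def bytes_to_grid(data, order):
    """Place byte values along the curve (hypothetical helper, not Wesson's method)."""
    n = 1 << order
    grid = [[0] * n for _ in range(n)]
    for d, b in enumerate(data[: n * n]):
        x, y = hilbert_d2xy(order, d)
        grid[y][x] = b
    return grid


if __name__ == "__main__":
    for row in bytes_to_grid(bytes(range(16)), 2):
        print(row)
```

Consecutive indices always land on neighbouring cells, which is the locality property that makes such curves attractive for comparing samples visually or by distance.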
Re:Acronyms... (Score:4, Informative)
Graphics Processing Unit.
It's more or less a CPU with more cores and less functionality per core. There are typically a few instructions you would otherwise expect from a DSP, like saturated addition.
Re: Acronyms... (Score:2)
That's an extreme oversimplification.
If you assume you have 1000 threads of execution, you could execute each one independently on a 1000-core machine. This is not true on a GPU. Those threads on a GPU will be grouped together. Each thread in a group will be executing the same instructions, so you can't have each thread executing its own independent code.
Conventional CPUs can handle any permutation of branch. In the GPU if you have an "if-else" condition and some threads in a group do the "if" and others do t
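A toy model of the grouping described above, written in plain Python rather than real GPU code. It assumes the commonly described behaviour that a group which disagrees on a branch runs both sides one after the other, with the non-participating threads masked off. The function `run_group` and its pass counter are hypothetical names for illustration, not any vendor's API.

```python
def run_group(values, cond, then_fn, else_fn):
    """Simulate one lockstep thread group hitting an if/else.

    Returns (results, passes), where passes counts how many branch
    bodies the whole group had to step through.
    """
    mask = [cond(v) for v in values]
    out = list(values)
    passes = 0
    if any(mask):
        # "if" side: only threads with mask=True keep their result.
        passes += 1
        out = [then_fn(v) if m else o for v, m, o in zip(values, mask, out)]
    if not all(mask):
        # "else" side: the remaining threads run while the others idle.
        passes += 1
        out = [else_fn(v) if not m else o for v, m, o in zip(values, mask, out)]
    return out, passes


if __name__ == "__main__":
    is_even = lambda v: v % 2 == 0
    print(run_group([1, 2, 3, 4], is_even, lambda v: v * 10, lambda v: -v))
    print(run_group([2, 4, 6, 8], is_even, lambda v: v * 10, lambda v: -v))
```

In this model a group that agrees on the branch pays for one pass, and a group that splits pays for two.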
Re: (Score:1)
Eternal September.
Re: (Score:2)
WTF is a GPU?
An indication that you stumbled onto the wrong site, read the wrong article, and then proceeded to comment just to figure out where on the internet you ended up.
Here let me direct you back to mainstream media [cnn.com]
4KW (Score:3)
Wow, how are you powering this thing, a dryer plug?
Multiple PSUs?
That's a heck of a lot of power for a single machine.
Re: (Score:2)
Then again by power cost he may have meant 4KWH which would be what you would expect from a machine using 1KW for 4 hours.
That's probably what happened, but it's not nearly as interesting.
Re: (Score:2)
At 240VAC, that is less than 17 amps, so a dedicated 20 amp circuit with 12 gauge wire would do it (NEMA 6-20). No bigger than a standard computer plug. Still, that is a shitload of power.
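The current figure above is just power divided by voltage. A quick check, assuming a purely resistive load (power factor 1) and a hypothetical helper name `current_amps`:

```python
def current_amps(watts, volts):
    """Current drawn by a load, assuming power factor 1 (an idealisation)."""
    return watts / volts


if __name__ == "__main__":
    print(f"4000 W at 240 V: {current_amps(4000, 240):.2f} A")
    print(f"4000 W at 120 V: {current_amps(4000, 120):.2f} A")
```

At 240 V the result lands just under 17 A, matching the comment. At 120 V the same load would need roughly twice the current.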
Re: (Score:2)
And then, of course, there's the flipside of power consumption: heat production. He's not only got to provide the current, he's going to have to provide cooling for this little bonfire he's constructed.
Re: (Score:2)
Throw enough high-draw things like fancy graphics cards into a cluster or a rack, and it doesn't seem all that difficult.
A quick google says power hungry [realhardtechx.com] GPUs are the norm.
With great power comes great power bills.
Re: (Score:2)
Sorry, I wasn't clear in what I was citing:
At 150W, that's 26 cards. At your 250W, that's 16 cards.
4000W isn't that hard to reach ... put it into a case with all of the other things, and you're talking, what, 3-4 machines?
So, yeah, they're not 1000W cards, but if you're talking about combining a couple plus the rest of the overhead it's not that hard to get there.
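The card counts quoted above follow from integer division of the power budget by the per-card draw. A small sketch, ignoring CPUs, drives and PSU losses (the helper name `cards_within_budget` is made up for illustration):

```python
def cards_within_budget(budget_watts, watts_per_card):
    """Whole number of cards that fit in the budget, ignoring all other loads."""
    return budget_watts // watts_per_card


if __name__ == "__main__":
    for per_card in (150, 250):
        print(per_card, "W per card:", cards_within_budget(4000, per_card), "cards")
```

Any overhead for the rest of each machine only lowers these counts, which is why a handful of multi-GPU boxes is enough to reach 4000W.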
Re: (Score:2)
Whoa there! The article you quoted is misleading. First off, it lists recommended power supplies. This is NOT the same as the power drawn by the GPU. This is the manufacturer's recommendation of what you need to ensure stable performance of EVERYTHING in the PC with that card installed. The higher-end the card, the greater the recommended minimum, partly to compensate for increased GPU needs, but also because the kind of people who run these cards are likely to have a crap load of other stuff that needs feeding as well.
Actually the high power recommendations are to cope with the clueless noobs who buy white box PSUs, which can barely supply 50% of their rated current for an extended period without catching fire. Oh, and maybe also to allow some small headroom for later system expansion.
Re: (Score:1)
It takes two wind turbines and about 7200ft^2 but I can run a *very* large house, complete with a server room and network closet.
It is not cost-effective, so you still have a point. Technically, I push power into the grid and get credits. I can use, save, sell, or trade those credits. Once I've gone a whole year with my current configuration, I'll be donating them to a local elementary school. They're evil bastards who somehow con me into shit. They send me cards, Valentine's Day cards, and make me awful co
Re: (Score:2)
On the bright side (and by that I don't mean the IR it emits) it may make coffee as a side product!
A Business Opportunity (Score:1)
All the malware authors could make some easy money selling him some processing time from all the botnets they run.
For research, this seems invaluable (Score:2)
As an effective deterrent, I cannot imagine this will be viable long-term. It seems to me that it is much easier for the attackers to generate more permutations than it is for the defenders to identify them. Will clients be able to keep up with matching against that many definitions? Maybe you only scan on particular servers, and because of the CPU-intensive nature, you sell it as a service. Well, guess w
Re: (Score:2)
Well, it's really interesting ... from the limited stuff in the article, it's essentially calculus (I think).
Sure, you can do a lot of permutations, but you can only do so many of them which are fundamentally different. Because they still share some underlying similarity with the original.
As I understood it, imagine a wavy line through space. Variations of the same thing will follow that wavy line +- some space around that line for the permutations. Close up the permutations look really different, but as
Re: (Score:2)
Sure, but taken far enough this solution would mean the attackers would need to write a whole new thing.
Once you have this, you check something new, identify it as a match of the thing, and add it.
The attackers can always be more nimble, but if a solution can adapt and say "oh, that's just a variation of this, I'll block it", then you can at least ratchet up the arms race.
Re: (Score:2)
I'm thinking one really big wrench that can be thrown into algorithmic detection is if a randomly selected salt is used in each permutation of the malware. That could force this type of analysis to require dramatically larger resources with little architectural investment on the part of the malware creators.
Re: (Score:2)
Information on what Wesson is calculating seems hard to come by, but this may be it:
https://www.google.ch/patents/... [google.ch]
Something doesn't add up again (Score:3)
Switching to a GPU to do the calculation proved four orders of magnitude efficiency over CPUs to reach about 200,000 samples a day.
4 orders of magnitude?! Was he processing 20 samples a day before? What kind of CPU was he using? 8088?
Re: (Score:1)
4 orders of magnitude in efficiency, not throughput. Currently he's processing 200,000 samples per day, using 4 kW. He might have processed 20 samples a day using the same 4 kW, or 2000 samples a day using 400 kW. The latter may sound like a lot, but renting such peak processing power is the whole point of the cloud.
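To make the efficiency-versus-throughput reading concrete, here is a rough calculation of samples per kilowatt-hour for the scenarios named above, plus the FPGA goal from the summary (10 million samples in four hours at 4000 W). The CPU figures are the commenter's hypotheticals, not measured data, and `samples_per_kwh` is a name invented for this sketch.

```python
import math


def samples_per_kwh(samples, power_kw, hours=24.0):
    """Samples processed per kWh of energy consumed over the given period."""
    return samples / (power_kw * hours)


gpu = samples_per_kwh(200_000, 4.0)            # 200,000 per day at 4 kW
cpu_same_power = samples_per_kwh(20, 4.0)      # hypothetical: 20 per day at 4 kW
cpu_more_power = samples_per_kwh(2_000, 400.0) # hypothetical: 2000 per day at 400 kW
fpga_goal = samples_per_kwh(10_000_000, 4.0, hours=4.0)

if __name__ == "__main__":
    print("GPU samples/kWh:", round(gpu, 1))
    print("orders of magnitude vs CPU (same power):", math.log10(gpu / cpu_same_power))
    print("orders of magnitude vs CPU (more power):", math.log10(gpu / cpu_more_power))
    print("FPGA goal samples/kWh:", fpga_goal)
```

Both hypothetical CPU setups come out at the same samples per kWh, which is why either one is consistent with a four-orders-of-magnitude efficiency claim.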
4 orders of magnitude? (Score:2)
Really, 4 orders of magnitude? 10000 times faster with GPUs than CPUs? I call bullshit. You might get a factor of 100 if you pick a SoA GPU and a shitty CPU. But comparing things of similar generation, you will not get a factor of 100 on modern hardware. So either they are not in base 10, or there is BS going on.
Re: (Score:2)
No it isn't. Rule of thumb: 1 high-end GPU = 12x 4 cores in single precision, 8x 4 cores in double.
People getting more than two orders of magnitude differences are comparing (highly-)unoptimized code.
Optimizing code on CPU or GPU is hard.
Re: (Score:2)
So a GPU like the Titan X gets you about 2Tflop/s and 350GB/s of memory bandwidth. A modern Core i7 with 8 cores gets you about 100Gflop/s and about 50GB/s of memory bandwidth. If you are looking at integer ops you get similar ratios.
So assuming you can saturate both architectures, you should see a difference of roughly a factor of 10. If your application saturates one architecture and not the other one, I could buy another factor of 10 with a bit of arguing. But to have another factor of 100, you need to do so
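The "roughly a factor of 10" estimate can be checked from the numbers quoted in this comment. The sketch below simply divides the GPU peak figures by the CPU ones. The figures are the commenter's approximations, and the dictionary names are invented for this example.

```python
# Peak figures as quoted in the comment (approximate, not measured here).
GPU = {"gflops": 2000.0, "bandwidth_gbs": 350.0}
CPU = {"gflops": 100.0, "bandwidth_gbs": 50.0}


def peak_ratios(gpu, cpu):
    """Return (compute ratio, memory bandwidth ratio) of GPU over CPU."""
    return gpu["gflops"] / cpu["gflops"], gpu["bandwidth_gbs"] / cpu["bandwidth_gbs"]


if __name__ == "__main__":
    compute, bandwidth = peak_ratios(GPU, CPU)
    print(f"compute-bound ceiling:   {compute:.0f}x")
    print(f"bandwidth-bound ceiling: {bandwidth:.0f}x")
```

A kernel limited by arithmetic could gain up to the first ratio and one limited by memory traffic up to the second, so well-tuned code on both sides lands between the two.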
Re: (Score:2)
Phi is a strange architecture and people aren't used to it, but 9 orders of magnitude is a bit ridiculous. Really, 1 billion times faster? Do you have the actual report?
4000 watts (Score:1)
I hope he does something useful with the heat. And now we're giving the electric company some incentive to make viruses. If all this detective work generates so much revenue, well, as Kennedy said, "why not?"
Re: (Score:1)
The only reason Windows has more malware is it is far more used (even on servers it's a 50/50 split out there to this very day vs. Linux). It's a greater return on investment in botnet creation alone on those grounds since a botnet's more effective with greater numbers of enslaved nodes to call on for say, a DDoS attack. You're going to get that using Windows which commands a good 94% of the pc market on pc desktops against Linux or MacOS X. That's the only real reason. Not that Windows is less securable (a
Cloudsource the computer power required? (Score:1)