cloude-pottier writes "An enterprising individual went on eBay and found boards with more than half a dozen Virtex II Pro FPGAs, nursed them back to life and built a SHA-1 cracker with two of the boards. This is an excellent example of recycling, as these were originally a part of a Thompson Grass Valley HDTV broadcast system. As a part of the project, the creator wrote tools designed to graph the relationships between components. He also used JTAG to make reverse engineering the organization of the FPGAs on the board more apparent. More details can be seen on the actual project page."
If this story is hard to understand (was for me), then a comment following TFA might be useful, if you don't/didn't read that far:
5. FPGA - field programmable gate arrays are sort of like reconfigurable circuitry - they can be programmed to perform complex computations in one giant "step", rather than as a sequence of instructions (how a general purpose cpu like the pentium operates).
This makes them fairly pointless for general computing, but when you need to crunch a bunch of numbers in the same way over and over, they can REALLY outperform a general cpu. Usually these are used to manipulate audio / video data streams in real time (the original purpose for the FPGAs used in this project) - but recently people have started using them to brute-force try to crack an encryption scheme. Where a general purpose cpu might take upwards of 40 clock cycles to check one possible answer, each of the FPGAs in this system can check at least one answer PER clock cycle.
This guy pulled a bunch of FPGA systems out of some (defective?) HDTV video processing systems - reverse engineered exactly how everything was wired together, reprogrammed the FPGAs to do SHA-1 hash cracking rather than HDTV video processing, and added some usb control circuitry so the system could take commands from / return results to a pc.
One could use this same board setup to do any sort of massively parallel data processing, but right now the system isn't wired up to really feed large amounts of data into / out of the system in real time. He can get away with that as hash cracking results are fairly small and infrequent, so the limited means he has for getting "answers" out of the system isn't too much of a problem.
There are some FPGA's that can control their output impedance on pins, but an FPGA is really for digital electronics - you're using 4-way look-up tables to emulate arbitrary 4-input logic-gates for the most (99.99%) part. I've seen genetic-algorithms produce capacitance-based designs where unconnected circuits affect each other due to analogue effects, but not humans. We tend to stick to the straight and narrow...
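To make the "look-up table emulates an arbitrary 4-input gate" idea concrete, here is a small software model. It is only a sketch: the function names and the example gate are made up for illustration, and a real LUT is configured by the vendor tools, not by code like this.

```python
def make_lut4(func):
    """Build the 16-bit truth table for a 4-input boolean function."""
    table = 0
    for index in range(16):
        # Bit i of the index is the value of input i (a, b, c, d).
        a, b, c, d = [(index >> i) & 1 for i in range(4)]
        if func(a, b, c, d):
            table |= 1 << index
    return table


def eval_lut4(table, a, b, c, d):
    """Evaluate the 'gate' by plain lookup: the inputs select one stored bit."""
    index = a | (b << 1) | (c << 2) | (d << 3)
    return (table >> index) & 1


def example_gate(a, b, c, d):
    """Hypothetical gate used for the demo: (a AND b) XOR (c OR d)."""
    return (a & b) ^ (c | d)


lut = make_lut4(example_gate)
print(f"LUT contents: {lut:016b}")
print(eval_lut4(lut, 1, 1, 0, 0))
```

The same 16 stored bits can represent any function of four inputs, which is why the fabric is "definable" logic: changing the gate means changing the table, not the wiring of the lookup itself.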
An FPGA really is conceptually very simple, and they're not hard to "program" either... Contrived example:
Verilog design to add/subtract 2 numbers (you'd never do this, but...)
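module addsub (a, b, addnsub, result);
  // Module header reconstructed; 16-bit signed ports assumed, to match the C 'short' version
  input  signed [15:0] a, b;
  input                addnsub;   // 1 = add, 0 = subtract
  output reg signed [15:0] result;

  // Combinational block: re-evaluated whenever any input changes
  always @(a or b or addnsub)
  begin
    if (addnsub)
      result = a + b;
    else
      result = a - b;
  end
endmodule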
Compare that to a K&R "C" routine to do the same thing...
void addsub(a, b, addnsub, result)
short a;
short b;
unsigned char addnsub;
short *result;
{
    if (addnsub)
        *result = a + b;
    else
        *result = a - b;
}
In both cases, of course, you'd just use the 'if...else...' part, but I wanted to show more language structure...
The key thing to remember is that in C, all things happen serially, unless you arrange otherwise with threading libraries. In Verilog, any block beginning with 'always @' happens in parallel with every other 'always @' block. Once you've mentally-mapped the concept of vast numbers of threads to the way hardware works, any competent multi-threaded programmer can become a competent hardware engineer.
Of course, there's "guru stuff" in there as well (as much as you want, trust me:-), you don't get world-beating overnight, but it's relatively easy to get the 80% solution, and that might be just fine. Eking out the last 20% is where it gets hard, as you have to understand the internal structure of the LUTs, and how they interact with the carry-chain, what the LUT->LUT delay can be useful for etc. None of this is at all relevant unless you're missing your timing on a critical circuit (eg: you need 133MHz so your DDR2 SDRAM can work, but the synthesis tools (equivalent to a compiler) only deliver 125 MHz for your design).
The 'always@' part is the hint of just where the power lies. *Everything* can happen in parallel, so you can build pipelines (like CPU's are pipelined today) into your logic, thus reducing the time taken per step (while taking multiple steps), thus increasing your clock rate. The benefit of course is that although the *first* result comes out in the same time, every clock thereafter, you'll also get a result.
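A rough software model of that latency-versus-throughput point, for anyone who thinks better in code. This is a sketch with made-up stage functions; the only claim it demonstrates is that N items through S pipeline stages take S + N - 1 clocks, so after the first result one more arrives every clock.

```python
def simulate_pipeline(inputs, stage_funcs):
    """Clock-by-clock model: on every tick all stages work at once,
    each on the value its predecessor held before the tick."""
    regs = [None] * len(stage_funcs)   # one register after each stage
    feed = list(inputs)
    outputs, cycles = [], 0
    while len(outputs) < len(inputs):
        cycles += 1
        nxt = [None] * len(regs)
        for i, func in enumerate(stage_funcs):
            if i == 0:
                src = feed.pop(0) if feed else None   # new item enters
            else:
                src = regs[i - 1]                     # previous stage's old value
            nxt[i] = func(src) if src is not None else None
        regs = nxt
        if regs[-1] is not None:
            outputs.append(regs[-1])                  # a result leaves the pipe
    return outputs, cycles


# Three hypothetical stages; any per-stage computation would do.
stages = [lambda x: x + 1, lambda x: x * 2, lambda x: x - 3]
results, clocks = simulate_pipeline(range(5), stages)
print(results, clocks)
```

Five items through three stages finish in 7 clocks here, against 15 if each item had to pass through all three steps before the next one could start.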
I wrote a JPEG decoder a couple of years or so ago, running at ~130MHz. That doesn't sound much, but that comes to ~65 MPixels/second because of the pipelining. Looking at the SSE-optimised intel libraries, a CMYK422->CMYK baseline decode (which is what the FPGA was doing) takes 371 clocks/pixel. The intel chip I was comparing to was a 3.6GHz P4, meaning it could do ~9.7 Mpixels/second. For motion-jpeg that's the difference between decoding SD frames (for the P4) and decoding HD frames (for the FPGA)...
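The numbers in that paragraph can be checked in a few lines. Assumptions that are mine and not the poster's: the FPGA figure is read as 2 clocks per pixel (130 MHz giving ~65 Mpixels/s), and the frame sizes are 720x480 for SD and 1920x1080 for HD.

```python
def pixels_per_second(clock_hz, clocks_per_pixel):
    """Sustained pixel rate for a decoder needing a fixed number of clocks per pixel."""
    return clock_hz / clocks_per_pixel


def frames_per_second(pixel_rate, width, height):
    """Frame rate that a given pixel rate can sustain at a given frame size."""
    return pixel_rate / (width * height)


fpga_rate = pixels_per_second(130e6, 2)    # 2 clocks/pixel is inferred, not stated
p4_rate = pixels_per_second(3.6e9, 371)    # 371 clocks/pixel as quoted above

SD = (720, 480)      # assumed SD frame size
HD = (1920, 1080)    # assumed HD frame size

for name, rate in (("FPGA", fpga_rate), ("P4", p4_rate)):
    print(f"{name}: {rate / 1e6:.1f} Mpixels/s, "
          f"SD {frames_per_second(rate, *SD):.1f} fps, "
          f"HD {frames_per_second(rate, *HD):.1f} fps")
```

Under those assumptions the P4 manages real-time SD but only a handful of HD frames per second, while the FPGA clears 30 fps at HD, which is the SD-versus-HD gap described above.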
So, FPGAs tend to run slowly (relative to today's CPUs) but can exploit parallelism in ways CPUs just can't, but of course for serial processing, you can't beat a tradition
hmm... you seem to know a lot about FPGAs, so I'll ask you something I've been wondering for a while... Coming from a traditional software end of things, I'm used to seeing "accelerating co-processors" available to do useful tasks much faster than the main CPU. I'm thinking not only the FPU (when it was a separate chip), but things like a modern GPU and such. Many of these have been slowly integrated back into the CPU as time has gone on, the FPU being the best example, so now it's something you can just cal
There are many libraries you can put on your FPGA. Some are open source, some cost A LOT. It's similar to a dll or a jar: you have an interface you bind to and you program your stuff around it. You can get modules to process FFTs, encryption, ethernet, VGA, sound, video, pretty much anything you can imagine. You can even use a CPU library to have a general cpu like your x86 and execute assembler instructions. You can even turn an FPGA into an old defunct cpu to repair old electronic hardware. Amazing stu
That all sounds wonderful... (and does make me want to try some FPGA programming - it sounds really cool)...but that sounds like it's still implemented in the main programmable logic gates of the FPGA (that is, in "software"), like how a .so/.dll is great on a normal CPU, but is still just part of your program running. I'm more thinking of a specific hardware piece, like an FPU co-processor. Something not re-programmable, but theoretically much faster for that specific task. It wouldn't make sense for a lot
Well, common FPGA's are basically look-up tables surrounded by a sea of interconnect logic. The designer specifies the function of each LUT, and the connections between them using a language such as Verilog or VHDL. They're not "generic logic", they're definable logic. Example: On a CPU, you have the 'add x,y' instruction - that's a chunk of logic on-chip. On an FPGA, that chunk of logic doesn't exist until you write a design that needs it.
You can buy (though I think they're very expensive) "IP cores", which are pre-packaged modules ready to plug-in-and-go. There are some free ones available as well. You may have to do more work to get the free ones to work [grin].
There are also built-in hard cores on modern FPGA's. You never used to be able to synthesize the statement "a = b * c;" in a verilog design, for example. Now that FPGA's have hardware multiplier blocks in them, it synthesises to a bunch of wires connecting up the LUTs to the built-in hardware. For the more-complex examples you suggest, it's best to implement them in logic, because an FFT (of a particular radix, input format (complex or real), and output requirements) is a very specific piece of hardware, and not generally useful to most customers.
You get multipliers, blocks of fast dual-port RAM, even entire processors (PPC) embedded into the FPGA fabric these days. Of course, you pay more for things like embedded CPUs... Funnily enough, a CPU is one of the easier things to write for an FPGA IMHO. You'll never get the speed of the FPGA fabric to match the hard-CPU core though...
To do what you're talking about though, you'd need a way to interface the FPGA to the PC - there's a freely available PCI core, so you'd then just need a card which had a PCI interface (there's one from Enterpoint [enterpoint.co.uk] for ~$150)... Then you just need to link the PCI core to your own cores (FFT, whatever) and write software to offload any FFT's to your co-processor. Xilinx offer the "Embedded Development Kit" to make this easier (you have to pay for this, the other tools are free to download). I don't know if anyone has made the freely-available PCI core into an EDK module though...
It sounds like my idea is happening, but at a much lower level than I was thinking (like the multiply example you gave). I guess I'm still thinking of things at the wrong level (software, high level functions), when it's much more basic things that need to be accelerated.
The dual-port RAM interface makes a lot of sense - it'd be a lot nicer than trying to do it yourself with general purpose pins, I'd think.
"There are also built-in hard cores on modern FPGA's. You never used to be able to synthesize the statement 'a = b * c;' in a verilog design, for example. Now that FPGA's have hardware multiplier blocks in them, it synthesises to a bunch of wires connecting up the LUTs to the built-in hardware."
Quartus 3 at least can synthesize a = b * c on a chip without a hardware multiplier; it takes a lot of logic cells though.
No, I'm aware of what you say, I just can't afford *any* of the commercial IP cores. I enquired about the cost of a JPEG core once, and was basically laughed at. I'm coming from a different perspective, that's all. FPGA's are a hobby for me, nothing more. I can afford to spend a few hundred dollars on a kit board, but I'd never drop a few grand on a core... I'd either do it myself or make do without. I'll use webpack exclusively for development (since they dropped the in-between option, Foundation is far too
"He no longer has to worry about trying to be the baddest motherfucker in the world. The position is taken. The crowning touch, the one thing that really puts true world-class badmotherfuckerdom totally out of reach, of course, is the hydrogen bomb."
For some reason this passage comes to mind. I can now just learn to blow glass better; computers are never going to be my bag.
4-input LUTs are on their way out, the migration towards LUT6 and beyond has begun in the current high-end FPGA families (Virtex5, Stratix II/III) and will most likely enter the volume-oriented ones (Spartan, Cyclone, etc.) soon. Single-LUT 4:1 muxes alone can enable drastic improvements in many designs. As for FPGAs being cheap, all things are relative. There are ASICs out there for nearly any common application imaginable and these are often well under $50 and are usually designed by people who have extens
As I mentioned above to a different poster, I think we're playing in different ballparks. The cheapest (and pretty-much useless for anything other than playing around on) V5 dev-kit I know of is ~$1000. That's an order of magnitude more than the cheapest S3A or S3E kits. 4-luts may be on their way out at the very high end, but they're definitely still around in the sort of things us mere mortals can buy/use.
All things are relative. IMO, the only thing that exists in FPGAs is a cheapest/smallest/slowest device that will fit a design with the required safety margins and futureproofing headroom. Most of the places I worked at where we used FPGAs were sold to Virtex: XC2V6000, XC2VP70, XC4FX100, XC4LX200 - since they were mostly doing ASIC prototyping, they preferred to spend $2000 extra up-front than have to re-do their $10000 prototype boards because they underestimated the final design's gate count or had to inc
The argument was not about using FPGAs in consumer-level devices or not because we, as you said, actually do. It is just that in consumer electronics, CPLDs and FPGAs are usually there to replace glue-logic and complement the microcontroller/embedded processor's capabilities... like working around bugs. One example of FPGA in consumer equipment is early Radeon CrossFire: the 'master' board used a Xilinx FPGA to receive pixel data from the slave board and combine it with data from the master to produce the final im
NSA@home is a fast FPGA-based SHA-1 and MD5 bruteforce cracker. It is capable of searching the full 8-character keyspace (from a 64-character set) in about a day in the current configuration for 800 hashes concurrently.
Anybody have an idea how fast that is compared to a modern CPU?
IIRC, the last time I did anything like this it took my 2200+ AMD about 24 hours to do a 6-character keyspace (from 64-character set) - with MD5.
"NSA@home is a fast FPGA-based SHA-1 and MD5 bruteforce cracker. It is capable of searching the full 8-character keyspace (from a 64-character set) in about a day in the current configuration for 800 hashes concurrently." So your 2200+ AMD is beaten to little pieces by this monster.
Source, well, you had to click a single link to their homepage [unaligned.org]. That'll learn you to post early. Or not, since I supplied you the answer anyway.
Arg! Whoops, sorry about that. Read the post, but thought you were quoting the summary. I've wondered about Cell performance myself for a while, but I haven't got the time to go out of the way to do some measurements. For SSE3/4: I would call it highly unlikely that we would see anything like the performance they are posting: 2 ^ 12 performance difference for MD5 alone is quite a lot. Maybe SSE5 might speed up SHA-2 as well. Anyway, you might want to add the T2 (Niagara 2 processor) from Sun to that list, it has 8
That's nice, his own SHA-1 cracker. But, even with advanced cryptographic attacks, SHA-1 is still in the order of 2^63. Not something you would like to try with just a few FPGA's. What is meant here is a cracker to find out which plain text, with limited entropy, is used to create a certain hash value. A SHA-1-based password cracker would therefore be a better name, I suppose.
It seems from here [unaligned.org] that it searches a 64 ^ 8 = (2 ^ 6) ^ 8 = 2 ^ 48 keyspace in 24 hours. No small feat, it should therefore do about 3,257,812,230 hashes in a second. It does 800 concurrently, which makes for 4 million a second per SHA-1 unit. Ouch, that's really fast.
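For anyone who wants to re-run that arithmetic, a quick sketch follows. The division by 800 follows the reading in the comment above; the project blurb's "800 hashes concurrently" could equally mean that each candidate is compared against 800 target hashes, in which case the per-unit figure would not apply.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def keyspace(charset_size, length):
    """Number of candidate strings of exactly `length` characters."""
    return charset_size ** length


def required_rate(candidates, seconds):
    """Whole candidates per second needed to cover the space in the given time."""
    return candidates // seconds


full = keyspace(64, 8)
rate = required_rate(full, SECONDS_PER_DAY)

print(f"keyspace: {full:,} (2^{full.bit_length() - 1})")
print(f"rate: {rate:,} candidates/s")
print(f"divided by 800: {rate // 800:,} per second")
print(f"8 chars vs 6 chars: {keyspace(64, 8) // keyspace(64, 6)}x larger")
```

The last line is the 2 ^ 12 gap mentioned elsewhere in the thread: covering 8 characters in the same 24 hours that a CPU needs for 6 characters means a 4096-fold higher search rate.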
Note that this could be done with any hash or symmetric algorithm, as long as it can be implemented on FPGA. So the moral of the story: use very long password (or even better, pass phrases), or make sure that they won't be able to acquire the hash.
By comparison my Athlon 64 3200+ does about 883,000 16-byte hashes a second
$ openssl speed sha1
Doing sha1 for 3s on 16 size blocks: 2586683 sha1's in 2.93s
Doing sha1 for 3s on 64 size blocks: 2063294 sha1's in 2.90s
Doing sha1 for 3s on 256 size blocks: 1199179 sha1's in 2.75s
Doing sha1 for 3s on 1024 size blocks: 479901 sha1's in 2.84s
Doing sha1 for 3s on 8192 size blocks: 71496 sha1's in 2.87s
OpenSSL 0.9.8c 05 Sep 2006
built on: Tue Mar 6 08:16:57 UTC 2007
options:bn(64,64) md2(int) rc4(ptr,char) des(idx,cis
Perhaps this gets brought up each time, but what are we supposed to use for password encryption anyway? MD5 seems to be inadequate. SHA-1 is also waning. I switched to Blowfish on all my FreeBSD servers partially because of MD5 problems, but also because it's not a common format to come across for anyone figuring they'd just have MD5 hashes to try - I understand however that blowfish was not intended for this purpose. But it seems like MD5 and SHA are getting weaker by the day with computational power on t
Don't forget that PKCS#5 v2.0 uses an iteration count and a salt. This means that the algorithm is not applied just once, but 1000 times (or more, 1000 is the minimum). This would mean a slowdown of 1000 on these kind of crackers *if* they implement the iteration count. A salt would make it hard to use a default configuration like this one found on internet as well. As said, the hash algorithm itself does not matter too much. The problem with all these schemes is that the amount of entropy in the passwords i
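A minimal sketch of the salt-plus-iteration-count idea, using PBKDF2 from Python's standard library. The helper names, the 16-byte salt and the choice of HMAC-SHA1 are illustrative assumptions; 1000 is just the minimum iteration count mentioned above, not a recommendation.

```python
import hashlib
import hmac
import os


def hash_password(password, salt=None, iterations=1000):
    """Derive a stored hash from a password with a random salt and an iteration count."""
    if salt is None:
        salt = os.urandom(16)   # fresh salt per password; length chosen for the demo
    digest = hashlib.pbkdf2_hmac("sha1", password.encode("utf-8"), salt, iterations)
    return salt, iterations, digest


def verify_password(password, salt, iterations, expected):
    """Recompute with the stored salt and count, then compare in constant time."""
    candidate = hashlib.pbkdf2_hmac("sha1", password.encode("utf-8"), salt, iterations)
    return hmac.compare_digest(candidate, expected)


salt, iterations, stored = hash_password("correct horse battery staple")
print(verify_password("correct horse battery staple", salt, iterations, stored))
print(verify_password("guess", salt, iterations, stored))
```

Every guess now costs the attacker `iterations` rounds of the underlying hash instead of one, and the per-password salt means one pass over the keyspace cannot be shared across many stored hashes.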
Just like the contest to create the AES encryption standard, there is an ongoing one (or happening soon) for cryptographic hashing algorithms. You can probably expect a good one with good vendor support within a couple years or so. Note that if you are using hashes in a typical cryptographic environment (signing a message by encrypting the hash of the message with your private key so that others can verify via your public key), I don't believe this would be a problem. Also, this is only effective in any
No, no-one has reported cracking it. Bear in mind that Ken is capable of hiding [cmu.edu] stuff below the source code of the OS, Ken could have set it up so that when a program outputs this particular string, Unix takes some predetermined action such as calling in the black helicopters.
On a more serious level, but for the same reason, there is no reason to think that this entry in the password file corresponds to a valid Unix password, since if that system was based on his code, the login will bypass normal authenti
I know about the "Reflections on Trusting Trust" paper. I'm not sure he ever implemented it; he just implied he knew how to. It pays to be paranoid!
In the paper UNIX Password Security - Ten Years Later [springerlink.com] the authors wrote "Over the past 10 years, improvements in hardware and software have increased the crypts/second/dollar ratio by five orders of magnitude." That was about 20 years ago, so if no-one has reported cracking ken's password in the meantime, I think the original UNIX password algorithm has sto
Password encryption is data storage. Like your hard disks, use something that's big enough for today and remember to upgrade it when the encryption method doesn't have sufficient space to protect you any more.
There's three ways to attack a hash: attack collision resistance, attack pre-image resistance, and attacking the plaintext. Collision resistance means it should be difficult to find two texts that have the same hash value. The upper bound for these attacks is 2^(n/2), where n is the length of the hash. For SHA-1, that upper bound is 2^80. Because of some more sophisticated attacks, 2^63 is now the current best for a collision attack.
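The 2^(n/2) bound is easy to see on a toy scale. The sketch below truncates SHA-1 to a few bits (an artificial setup chosen so the search finishes instantly) and looks for two different messages with the same truncated hash; the message format and function names are made up for the demo.

```python
import hashlib


def truncated_sha1(data, bits):
    """Keep only the top `bits` bits of the 160-bit SHA-1 digest."""
    digest = int.from_bytes(hashlib.sha1(data).digest(), "big")
    return digest >> (160 - bits)


def find_collision(bits):
    """Hash successive counter strings until two of them collide."""
    seen = {}
    counter = 0
    while True:
        msg = str(counter).encode("ascii")
        h = truncated_sha1(msg, bits)
        if h in seen:
            return seen[h], msg, counter + 1   # the two messages and tries used
        seen[h] = msg
        counter += 1


first, second, tries = find_collision(16)
print(first, second, tries)
```

With a 16-bit hash a collision typically turns up after a few hundred tries, on the order of 2^8, rather than the 2^16 that a search for one specific value would need. Scale the same effect up to 160 bits and you get the 2^80 figure.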
Pre-image resistance means given a hash it should be impossible to find
Can someone explain how this sort of machine could be used, practically speaking, to break e.g. an email encrypted with PGP, or /etc/shadow?
It probably can't, really. But it is useful for two things:
1.) It can be used to crack passwords/passkeys encrypted in this manner. It makes it possible for the owner of the device to obtain the actual password/passkey if he can get a hold of the hashed version. This may be useful in cracking networks or security barriers.
2.) It can display to the world, just how cheap it is to build a device which can reverse hashed data. If he can get his hands on this device for small money, imagine how many FPGAs t
I used to draw patterns like that while suffering through triple Maths - draw a circle, mark off every 10 degrees (or 5 if it looked like being really boring today), then join every point to every other point. Mindless, yet strangely satisfying.
And what kind of nerd site doesn't let you use a degree symbol?
For the record the company is Thomson and that is a piece of equipment known as the Princeton Engine used by the IC developers to quickly verify their software/algorithms. It was lying around in our computer room (known as the Princeton Engine room) for years. Its replacement is from Cadence and is called Palladium and has the power of several hundred of those old fpga boards.
He's not looking for collisions - he's looking for preimages of a given hash. Since he can't search a large enough space to find a preimage of an arbitrary hash, the most useful application of this sort of thing is password cracking - given the hash of someone's password, search the space of plausible passwords until you find one that matches the hash (taking salt into account as appropriate). Fun but not too advanced. Shame - what I was really hoping to read was that he'd implemented the latest collision
The boards contain 15 Virtex-II Pro (XC2VP20) FPGAs in 3 identical sets of 5 (here called "channels"). Each channel also owns a Spartan-II (XC2S50) FPGA that was originally used as a control chip, and a DSP (ADSP21160M) which probably calculated transform parameters. There is also a shared XC2S50 chip, which is not used in this application, just like the DSPs. The clock distribution tree unfortunately contains 2 domains, which means the 39MHz channel clock had to be distributed from chip to chip, using inte
Seems to me that it searches all possible 64-bit words that could be given to SHA-1. It cleverly reorders the search so that the Hamming distance of each block is at most 2 bits from the previous block, which allows Virtex block RAM resources to be used as part of the hash hardware. FPGA engineers often only use block RAMs for caches, FIFOs and scratchpads, so it is interesting to see them being used as part of the pipeline in this way despite the two-port limitation.
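To illustrate what a small-Hamming-distance search order looks like, here is the textbook reflected Gray code, in which consecutive words differ in exactly one bit. This is only an example of the general idea; I am not claiming it is the ordering the project actually uses, which is described above as allowing up to 2 bits of difference.

```python
def gray(i):
    """i-th word of the reflected binary Gray code."""
    return i ^ (i >> 1)


def hamming(a, b):
    """Number of bit positions in which a and b differ."""
    return bin(a ^ b).count("1")


WIDTH = 4   # small width so the whole sequence can be printed
codes = [gray(i) for i in range(1 << WIDTH)]

for prev, cur in zip(codes, codes[1:]):
    print(f"{prev:0{WIDTH}b} -> {cur:0{WIDTH}b}  distance {hamming(prev, cur)}")
```

Every value is still visited exactly once, but hardware stepping through the sequence only has to account for a tiny change from one candidate to the next, which is the property that makes table-driven tricks like the block RAM one possible.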
So it doesn't search all possible inputs to SHA-1, but maybe you could use it in this situation:
SHA-1(salt, password) = hash
Given hash and salt, you can find password by a brute force search on this hardware (assuming password is less than 9 characters in length). This could be useful for obtaining user passwords from /etc/shadow when something like md5crypt is in use, although md5crypt might well be designed to defeat/slow down this type of attack, for example by using multiple rounds of hashing (as done by the older DES-based crypt program).
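In software the same search looks like the sketch below, on a deliberately tiny keyspace. Assumptions for the demo: "SHA-1(salt, password)" is taken to mean the hash of salt followed by password, and the alphabet, salt and target are made up. The FPGA version does the same thing, just with the inner loop unrolled into hardware.

```python
import hashlib
from itertools import product


def salted_sha1(salt, password):
    """SHA-1 over salt || password (the concatenation order is an assumption)."""
    return hashlib.sha1(salt + password).hexdigest()


def brute_force(target_hex, salt, alphabet, max_len):
    """Try every string up to max_len over the alphabet until one matches."""
    for length in range(1, max_len + 1):
        for combo in product(alphabet, repeat=length):
            candidate = "".join(combo).encode("ascii")
            if salted_sha1(salt, candidate) == target_hex:
                return candidate.decode("ascii")
    return None   # password is not in the searched space


salt = b"xy"
target = salted_sha1(salt, b"cab")
print(brute_force(target, salt, "abc", 3))
```

Three letters and three characters is 39 candidates; 64 letters and eight characters is the 2 ^ 48 search discussed above, which is where the hardware earns its keep.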
If you can read /etc/shadow you're root... which means you aren't gaining anything by it.
There are still arbitrary file disclosure vulnerabilities which *only* allow you to view files, not gain access to the server itself. If you pull the password hashes, you can then bruteforce the passwords and gain full root access to the system. Plus it would give you access to any *other* machines on the network which the admin used the same root password. Just rooting a single box wouldn't give you access to any other machines (assuming that didn't share the same initial vuln).
That's assuming the user logs in on a regular basis. On a server that isn't a given. If you pull the password hashes, you can bruteforce all of the users' passwords generally in under a week on a 1GHz processor. With this tool it sounds like you could do it significantly faster. I'd probably use both approaches if I was trying to compromise a network.
You might be in the shadow group, and there might be a server application that is in said group in order to read /etc/shadow, so if you can exploit that service to gain access to the contents of the shadow file, you can then try to root the machine after cracking root's (or someone with sudo I guess) password.
For the uninitiated... (Score:5, Informative)
Posted at 4:39AM on Sep 1st 2007 by smilr
How fast is that? (Score:3, Informative)
Anybody have an idea how fast that is compared to a modern CPU?
IIRC, the last time I did anything like this it took my 2200+ AMD about 24 hours to do a 6-character keyspace (from 64-character set) - with MD5.
Re: (Score:2)
So your 2200+ AMD is beaten to little pieces by this monster.
Source, well, you had to click a single link to their homepage [unaligned.org]. That'll learn you to post early. Or not, since I supplied you the answer anyway.
Re: (Score:2)
I was wondering how it compared with the latest and greatest like x64 with SSE3/4 or a Cell processor...
(that'll learn you to actually read the post you are replying to. Or not)
Re: (Score:3, Informative)
I've wondered about Cell performance myself for a while, but I haven't got the time to go out of the way to do some measurements. For SSE3/4: I would call it highly unlikely that we would see anything like the performance they are posting: 2 ^ 12 performance difference for MD5 alone is quite a lot. Maybe SSE5 might speed up SHA-2 as well. Anyway, you might want to add the T2 (Niagara 2 processor) from Sun to that list, it has 8
Re: (Score:2)
You should compare against VIA hardware. Their CPUs are crap for general usage, but the crypto acceleration is really good:
http://www.logix.cz/michal/devel/padlock/bench.xp [logix.cz]
Page doesn't seem to include MD5/SHA1 though, but you can compare that to AES on your box.
Re: (Score:2)
From one of the comments (I assume c/s is Cracks per second?):
SHA-cracker? (Score:5, Informative)
It seems from here [unaligned.org] that it searches a 64 ^ 8 = (2 ^ 6) ^ 8 = 2 ^ 48 keyspace in 24 hours. No small feat, it should therefore do about 3,257,812,230 hashes in a second. It does 800 concurrently, which makes for 4 million a second per SHA-1 unit. Ouch, that's really fast.
Note that this could be done with any hash or symmetric algorithm, as long as it can be implemented on an FPGA. So the moral of the story: use very long passwords (or even better, pass phrases), or make sure that they won't be able to acquire the hash.
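If you want to check the arithmetic yourself, here is a small sketch. It uses only numbers quoted in this thread: a 64-character set, 8 characters in 24 hours on 800 SHA-1 units for the FPGA rig, and the 6-character MD5 run in 24 hours mentioned for the 2200+ AMD. The function names are made up for illustration, and the comparison is loose since the two runs used different hashes.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def keyspace(charset_size: int, length: int) -> int:
    """Number of candidate passwords of exactly `length` characters."""
    return charset_size ** length


def rate(keys: int, seconds: int) -> float:
    """Average candidates per second needed to cover `keys` in `seconds`."""
    return keys / seconds


fpga_keys = keyspace(64, 8)            # 64^8 = 2^48
fpga_rate = rate(fpga_keys, SECONDS_PER_DAY)
per_unit = fpga_rate / 800             # 800 concurrent SHA-1 units

cpu_keys = keyspace(64, 6)             # 64^6 = 2^36
cpu_rate = rate(cpu_keys, SECONDS_PER_DAY)

speedup = fpga_rate / cpu_rate         # 2^48 / 2^36 = 2^12

print(f"FPGA rig: {fpga_rate:,.0f} hashes/s, {per_unit:,.0f} per unit")
print(f"CPU run:  {cpu_rate:,.0f} hashes/s")
print(f"Ratio:    {speedup:,.0f}x")
```

Each extra character from a 64-character set multiplies the work by 64, which is why length is the cheapest defence against this kind of search.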
Re: (Score:2)
It's a bit like if you built your own cruise missile. Telling the whole world about it might not be the smartest thing to do.
Re: (Score:3, Funny)
EFF [cryptome.org] seems to think it is the smartest thing to do.
Re: (Score:2, Informative)
Re: (Score:3, Insightful)
But it seems like MD5 and SHA are getting weaker by the day with computational power on t
Re: (Score:3, Informative)
As said, the hash algorithm itself does not matter too much. The problem with all these schemes is that the amount of entropy in the passwords i
Re: (Score:2)
Re: (Score:2)
Re: (Score:3, Interesting)
Re: (Score:2)
On a more serious level, but for the same reason, there is no reason to think that this entry in the password file corresponds to a valid Unix password, since if that system was based on his code, the login will bypass normal authenti
Re: (Score:2)
Re: (Score:2)
Re: (Score:2)
Collision resistance means it should be difficult to find two texts that have the same hash value. The upper bound for these attacks is 2^(n/2), where n is the length of the hash. For SHA-1, that upper bound is 2^80. Because of some more sophisticated attacks, 2^63 is now the current best for a collision attack.
Pre-image resistance means given a hash it should be impossible to find
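To get a feel for the 2^(n/2) collision bound, you can run a toy birthday search against SHA-1 truncated to a few bits. This is only a sketch: the 16-bit truncation, the counter-based messages and the function names are assumptions chosen so it finishes instantly, and it says nothing about the smarter 2^63 attacks on full SHA-1.

```python
import hashlib


def truncated_sha1(data: bytes, bits: int) -> int:
    """Return the top `bits` bits of the 160-bit SHA-1 digest as an integer."""
    digest = hashlib.sha1(data).digest()
    return int.from_bytes(digest, "big") >> (160 - bits)


def find_collision(bits: int):
    """Hash 0, 1, 2, ... until two messages share a truncated hash.

    Returns (first_message, second_message, hashes_computed).
    """
    seen = {}
    counter = 0
    while True:
        msg = str(counter).encode()
        h = truncated_sha1(msg, bits)
        if h in seen:
            return seen[h], msg, counter + 1
        seen[h] = msg
        counter += 1


a, b, tries = find_collision(16)
print(f"{a!r} and {b!r} collide on 16 bits after {tries} hashes")
```

With n = 16 the search typically ends after a few hundred hashes, in the neighbourhood of 2^8, rather than anywhere near the 2^16 a search for one specific hash value would need on average.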
Maybe I'm Not As Big A Nerd As I Thought... (Score:2, Interesting)
Re:Maybe I'm Not As Big A Nerd As I Thought... (Score:5, Funny)
what?
Re: (Score:2)
Can someone explain how this sort of machine could be used, practically speaking, to break e.g. an email encrypted with PGP, or /etc/shadow?
It probably can't, really. But it is useful for two things:
1.) It can be used to crack passwords/passkeys encrypted in this manner. It makes it possible for the owner of the device to obtain the actual password/passkey if he can get a hold of the hashed version. This may be useful in cracking networks or security barriers.
2.) It can display to the world just how cheap it is to build a device which can reverse hashed data. If he can get his hands on this device for small money, imagine how many FPGAs t
I recognise that pattern! (Score:2)
I used to draw patterns like that while suffering through triple Maths - draw a circle, mark off every 10 degrees (or 5 if it looked like being really boring today), then join every point to every other point. Mindless, yet strangely satisfying.
And what kind of nerd site doesn't let you use a degree symbol?
Re: (Score:2)
Re: (Score:2)
Princeton Engine (Score:2, Interesting)
This is essentially a password cracker (Score:2)
Shame - what I was really hoping to read was that he'd implemented the latest collision
15 Virtex-II Pro FPGAs (Score:2)
Re:Benchmarks? (Score:5, Informative)
So it doesn't search all possible inputs to SHA-1, but maybe you could use it in this situation: Given hash and salt, you can find password by a brute force search on this hardware (assuming password is less than 9 characters in length). This could be useful for obtaining user passwords from
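In software, the "given hash and salt, find the password" search looks roughly like the sketch below. How the salt is combined with the password varies between systems; prepending the salt bytes is an assumption here, as are the function name and the tiny character set used in the demo. The FPGA rig does the same loop, just with hundreds of SHA-1 units in parallel.

```python
import hashlib
from itertools import product
from typing import Optional


def crack(target_hex: str, salt: bytes, charset: str, max_len: int) -> Optional[str]:
    """Try every password up to `max_len` characters from `charset`.

    Assumes the stored hash is SHA-1(salt + password); real systems differ.
    """
    for length in range(1, max_len + 1):
        for combo in product(charset, repeat=length):
            candidate = "".join(combo)
            digest = hashlib.sha1(salt + candidate.encode()).hexdigest()
            if digest == target_hex:
                return candidate
    return None  # Not found within the searched keyspace.


# Demo with a deliberately tiny keyspace so it finishes instantly.
salt = b"s1"
stored = hashlib.sha1(salt + b"cab").hexdigest()
print(crack(stored, salt, "abc", 3))
```

Note that the salt doesn't slow down a search against one hash at all; it only stops the attacker from reusing precomputed tables or cracking many users at once.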
Re: (Score:2)
In the old days when passwords were in
Re:Benchmarks? (Score:4, Interesting)
Re: (Score:2)
Re: (Score:2)
Re: (Score:2)
Re: (Score:3, Informative)
http://www.bash.org/?701504 [bash.org]