
Historians Recreate Source Code of First 4004 Application (159 comments)

Posted by Zonk on Thursday November 15 2007, @06:34PM
from the really-hard-to-dig-through-bits-and-bytes dept.
intel
software
hardware
mcpublic writes "The team of 'digital archaeologists' who developed the technology behind the Intel Museum's 4004 microprocessor exhibit have done it again. 36 years after Intel introduced their first microprocessor on November 15, 1971, these computer historians have turned the spotlight on the first application software ever written for a general-purpose microprocessor: the Busicom 141-PF calculator. At the team's web site you can download and play with an authentic calculator simulator that sports a cool animated flowchart. Want to find out how Busicom's Masatoshi Shima compressed an entire four-function, printing calculator into only 1,024 bytes of ROM? Check out the newly recreated assembly language "source code," extensively analyzed, documented, and commented by the team's newest member: Hungary's Lajos Kintli. 'He is an amazing reverse-engineer,' recounts team leader Tim McNerney, 'We understood the disassembled calculator code well enough to simulate it, but Lajos really turned it into "source code" of the highest standards.'"

This discussion has been archived. No new comments can be posted.
The Fine Print: The following comments are owned by whoever posted them. We are not responsible for them in any way.
  • Those were fun (Score:5, Interesting)

    by certsoft (442059) on Thursday November 15 2007, @06:46PM (#21372181) Homepage
    Somewhere around 1975 or 1976 I wrote software for a 4004 (using a teletype connected to a modem connected to a mainframe someplace that had the assembler) to run a X-Y table. You would place a wafer with thick-film resistors on it and it would test each one to make sure it was within tolerance and if it wasn't it would mark it with magnetic ink. I think we were probably still using the infamous 1702 EPROMs but there might have been something newer at that time.
    • Re: (Score:3, Interesting)

      somewhere around 1982 a buddy of mine and myself disassembled and commented microsoft's basic for the trs-80 color computer. Then we improved it with tons of new statements via the hook in ram. Documenting a bloody calculator is child's play compared to that and we weren't especially proud of it, just curious.

      • somewhere around 1982 a buddy of mine and myself disassembled and commented microsoft's basic for the trs-80 color computer.
        And you lived to tell the tale.
      • Thousands of people now and in the future would be interested in studying this code. Please dig up and post this work. Perhaps to one of the 'vintage computer' websites.

        People are still writing assembler code for tiny microprocessors. However, now it is being done for very inexpensive microcontrollers like the Atmel AVR and the Microchip PIC. These ICs have all their major components integrated (like program ROM, limited RAM, UARTs, and ADCs) and sell for about $1-$2. This business is moving
        • Re: (Score:3, Interesting)

          the tools we had were the Leventhal 6809 book, we wrote the disassembler (and the assembler) ourselves,
          to make it a little easier to relate to I said color computer but in fact it was a very little known
          clone called the Dragon 32 (which, incidentally as we found out had 64K that you could use if you
          pulled a few tricks).

          I wished I had known about OS/9 at the time (but this was long before the age of easy access to
          information and in Europe).

          But hey, why am I feeding the trolls... anonymous ones at that :)

          I gu
    • In 1970 the PDP series from DEC, e.g. PDP-8, had an interpreted (and used interactively) language called FOCAL, arrays (even sparse ones), real numbers, usual math and other functions, for loops, if statements blah blah blah... all the usual stuff - the entire interpreter *and* runtime was programmed in a total of 2K instructions (and they were primitive instructions). That was normal for the time.
      • Re: (Score:3, Interesting)

        And I once wrote a full-featured symbolic assembler in 1579 bytes. Besides symbolic labels, it supported address expressions with +=/* and logical AND/OR, hex and text strings, and a lot more. To the best of my knowledge it is the smallest symbolic assembler ever written. I published and sold it as The Assembler for the VIC-20.
      • A little while later, he sold us the "first" 8008 in the area. Dick.

        Why the abuse?

        Did he overcharge you?

  • by Dusty (10872) on Thursday November 15 2007, @06:53PM (#21372247) Homepage
    You can still run it on the latest Intel x86 chips. ;)
  • "Historians Recreate Source Code of First 404 Error Message"

    (truth be told, quick scanning the headlines, that's what my brain registered)
  • by gatekeep (122108) on Thursday November 15 2007, @06:55PM (#21372271)
    "...an authentic calculator simulator..."

    What the hell is an authentic simulator?
  • by Eberlin (570874) on Thursday November 15 2007, @07:07PM (#21372365) Homepage
    Quick, someone send this over to the folks who wrote Excel!
  • by geekoid (135745) <dadinportland.yahoo@com> on Thursday November 15 2007, @07:11PM (#21372405) Homepage Journal
    58008
  • Commander Keen (Score:5, Interesting)

    by QuantumG (50515) <qg@biodome.org> on Thursday November 15 2007, @07:16PM (#21372451) Homepage Journal
    I once reverse engineered the classic id Software game Commander Keen. John Carmack did some cool stuff in that code: each sprite had two function pointers in it. One was called when the sprite came into contact with another sprite; the other was called every frame to animate the sprite (he called it the "think" function). When you killed a monster the sprite was replaced with a "body", which was just like a sprite but had a few fewer fields (so it took up less memory). One of the neatest things he did was use this exact same framework of sprites and bodies to animate the "static" parts of the game. For example, the color-coded doors that you have to get the key cards to open were sprites with a contact function that checked if the player had the right key card, at which time they would "die" and be replaced by a body that had a think function that would make them slide out of the way.

    For anyone who would like to take a look, I've put the re-engineered source code [insomnia.org] up.
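As a rough illustration of the sprite framework described above, here is a minimal Python sketch. The original is DOS-era code; every name here (`Sprite`, `Body`, `Player`, `door_contact`, `slide_open`) is hypothetical and not taken from the re-engineered source. It models a door as a sprite whose contact function checks for a key card and, on success, hands back a lighter body whose think function slides it away.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class Player:
    keycards: set = field(default_factory=set)


@dataclass
class Body:
    """What a sprite leaves behind: fewer fields, only a think function."""
    y: int
    think: Optional[Callable[["Body"], None]] = None


@dataclass
class Sprite:
    y: int
    think: Callable[["Sprite"], None]                       # called every frame
    contact: Callable[["Sprite", Player], Optional[Body]]   # called on touch
    color: str = ""


def idle(sprite):
    """A closed door does nothing each frame."""


def slide_open(body):
    """Think function of the 'dead' door: move one unit per frame."""
    body.y -= 1


def door_contact(door, player):
    """Return a replacement body if the player holds the matching key card."""
    if door.color in player.keycards:
        return Body(y=door.y, think=slide_open)
    return None


door = Sprite(y=10, think=idle, contact=door_contact, color="red")
```

Calling `door.contact(door, Player({"red"}))` yields the body that replaces the door; the game loop would then call that body's `think` once per frame.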
    • Re:Commander Keen (Score:5, Interesting)

      by Cheesey (70139) on Thursday November 15 2007, @07:52PM (#21372789)
      Carmack's code is always interesting. Most famously, there's the infamous square root approximation from Quake [codemaestro.com]. But I'm still impressed by the original Doom render loop, with its self-modifying code.

      The loop is drawing columns (vertical slivers of wall). It needs to interpolate between two things: the input wall texture, and the output part of the screen. Carmack uses something like Bresenham's line drawing algorithm to do this, but because the 386 has such a limited register set, he stores the fractional increment in an immediate attached to the "addl" instruction:

      doubleloop:
          movl ecx,ebp // begin calculating third pixel
      patch1:
          addl ebp,12345678h // advance frac pointer
          movb [edi],al // write first pixel
          shrl ecx,25 // finish calculation for third pixel
          movl edx,ebp // begin calculating fourth pixel
      patch2:
          addl ebp,12345678h // advance frac pointer
          movb [edi+SCREENWIDTH],bl // write second pixel
          shrl edx,25 // finish calculation for fourth pixel
          movb al,[esi+ecx] // get third pixel
          addl edi,SCREENWIDTH*2 // advance to third pixel destination
          movb bl,[esi+edx] // get fourth pixel
          decl [loopcount] // done with loop?
          movb al,[eax] // color translate third pixel
          movb bl,[ebx] // color translate fourth pixel
          jnz doubleloop
      and elsewhere... :)

          movl ebx,[_dc_iscale]
          shll ebx,9
          movl eax,OFFSET patch1+2 // convince tasm to modify code...
          movl [eax],ebx
      A similarly impressive trick is used to draw floors, where 3D interpolation is required because each texture needs to be crossed diagonally, not vertically. I never understood how Doom drew floors until I looked at the code, and I still think it's deep magic. And that's without even mentioning the BSP code!
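The column loop above walks the texture in fixed point. Below is a minimal Python sketch of that stepping, assuming the 7.25 layout implied by the `shrl ...,25` and `shll ebx,9` instructions (the top 7 bits of a 32-bit accumulator pick one of 128 texels). The function names, and the 16.16 format assumed for `_dc_iscale`, are assumptions for illustration; this is not Doom's actual code and it leaves out the color translation and the self-modifying patch.

```python
MASK32 = 0xFFFFFFFF  # emulate a 32-bit register that wraps on overflow


def to_7_25(iscale_16_16):
    """Shift a 16.16 step left by 9 so it has 25 fractional bits."""
    return (iscale_16_16 << 9) & MASK32


def draw_column(texture_column, frac, step, count):
    """Return `count` texels sampled from a 128-entry texture column.

    `frac` and `step` are 7.25 fixed-point values: frac >> 25 is the
    integer texel index, and adding `step` advances by a fractional
    amount per screen pixel, wrapping around the texture for free.
    """
    pixels = []
    for _ in range(count):
        pixels.append(texture_column[frac >> 25])
        frac = (frac + step) & MASK32
    return pixels


column = list(range(128))                         # texel i simply holds i
stretched = draw_column(column, 0, 1 << 24, 6)    # half a texel per pixel
```

Because the integer part occupies exactly the top bits of the accumulator, the wrap at the end of the texture costs nothing: overflow of the 32-bit add does it.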
      • "Carmack's code is always interesting. Most famously, there's the infamous square root approximation from Quake."

        That is indeed impressive code, but John claims he didn't write it. In fact, nobody at id has claimed authorship of it. It was speculated that perhaps Michael Abrash wrote it, but he denies authorship as well. My speculation is that it was a cool snippet of code floating around the public domain, and somebody at id had the good judgment to realize that it was significantly faster than the stan
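For reference, the routine under discussion is widely known as the fast inverse square root: it approximates 1/sqrt(x) rather than sqrt(x). Here is a minimal Python sketch of the technique as it is commonly described, using the magic constant 0x5f3759df and a single Newton-Raphson step. The function name is hypothetical, and this is an illustration of the idea, not the Quake source.

```python
import struct


def fast_inv_sqrt(x):
    """Approximate 1/sqrt(x) for a positive float x."""
    # Reinterpret the 32-bit float as an unsigned integer.
    i = struct.unpack("<I", struct.pack("<f", x))[0]
    # Halving the bit pattern roughly halves the exponent; subtracting
    # from the magic constant negates it and corrects the bias.
    i = 0x5F3759DF - (i >> 1)
    y = struct.unpack("<f", struct.pack("<I", i))[0]
    # One Newton-Raphson iteration on f(y) = 1/y**2 - x refines the guess.
    return y * (1.5 - 0.5 * x * y * y)
```

With one refinement step the result is typically within a fraction of a percent of the true value, which was good enough for lighting calculations.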
  • by compumike (454538) on Thursday November 15 2007, @07:26PM (#21372543) Homepage
    Take a look at this set of videos from MIT's 6.004 Computation Structures [mit.edu] class. They basically walk through the design of a simple 32-bit CPU from transistors, to gates, to functional blocks, to a full processor.

    Anyway, reading about how hard it was to recreate the source code from the 4004 makes me wonder how easily we could find source code for some apps from even a decade ago. Lots of companies have gone bankrupt / discontinued products / been sold / etc, and we all know that lots of people aren't good about backing up their code. It's neat to go to the Linux Kernel Archives and look at the Historic Linux sources [kernel.org].

    --
    Educational microcontroller kits for the digital generation. [nerdkits.com]
    • Simple -- buy a wire wrap tool, a breadboard kit and the TTL Handbook.

      Can't find them? Should be on the shelf there somewhere. There must be a lot of old kit you can use to desolder TTL circuit components. You may need to build a Heathkit http://en.wikipedia.org/wiki/Heathkit/ [wikipedia.org] dual-trace CRO first though.

      Geez I'm getting old.

  • Amazing! (Score:5, Insightful)

    'He is an amazing reverse-engineer,' recounts team leader Tim McNerney, 'We understood the disassembled calculator code well enough to simulate it, but Lajos really turned it into "source code" of the highest standards.'

    No disrespect to Lajos, but have we really fallen so far in programming standards that it's considered "amazing" to disassemble a 1024 byte program? Back in my day (and stay the hell off my lawn!) we used to disassemble programs all the time. I reverse engineered the operating system for a computer I developed for because we wanted to hook into places that weren't accessible.

    Disassembly is apparently a lost art in these decadent days of some programmers never using anything but scripting languages (e.g., Java, Python, Perl) and having no clue what goes on under the hood.

    • No disrespect to Lajos, but have we really fallen so far in programming standards that it's considered "amazing" to disassemble a 1024 byte program?

      Good question. Let's look at the excerpt from TFA included in TFS:

      'He is an amazing reverse-engineer,' recounts team leader Tim McNerney, 'We understood the disassembled calculator code well enough to simulate it, but Lajos really turned it into "source code" of the highest standards.'

      Sure looks to me that what Lajos is being credited with isn't the disassembl

      • 'He is an amazing reverse-engineer,' recounts team leader Tim McNerney, 'We understood the disassembled calculator code well enough to simulate it, but Lajos really turned it into "source code" of the highest standards.' [...] Sure looks to me that what Lajos is being credited with isn't the disassembly, at all.

        I disagree, McNerney seems to be saying that they understood the machine language well enough to simulate the calculator, but Lajos disassembled and commented the source code so that they underst

        • Re:Amazing! (Score:4, Insightful)

          by be-fan (61476) on Thursday November 15 2007, @08:44PM (#21373275)
          "Programs must be written for people to read, and only incidentally for machines to execute." - Abelson & Sussman

          From a theoretical point of view, assembly knowledge isn't particularly useful because it doesn't lend itself to rigorous analysis (the "science" part of "computer science"). From a practical point of view, since very few programs are written in assembly language anymore, knowledge of it has limited utility. Further, from a practical point of view, I'd much rather deal with a programmer who can explain his work in terms of data structures and algorithms than one that is stuck thinking in terms of registers and memory locations.

          There is certainly a place for assembly knowledge*. It's just a niche, and not a particularly important one anymore. Meanwhile, there are lots and lots of diverse applications for the theory they teach you in those classes instead of assembly. In my own work, I've had to bust out the graph theory way more often than I've had to bust out my knowledge of asm tricks for fast line-rendering...

          *) Interestingly enough, one of those places is inside the language runtimes of high-level languages. There are usually lots of neat tricks inside those things (eg: using the NaN space of double-precision floats to store unions of floats and 51-bit integers without extra variant tags!)
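The NaN trick mentioned in the footnote can be sketched in a few lines of Python. This is a toy model working on raw 64-bit words, under stated assumptions: integers are unsigned and limited to 51 bits, the sign bit plus the quiet-NaN bits serve as the integer tag, and genuine NaN doubles are collapsed to one canonical pattern so they never collide with a tagged integer. All names are hypothetical and the layout is a choice made for this sketch, not that of any particular runtime.

```python
import math
import struct

INT_TAG = 0xFFF8000000000000        # sign bit + all-ones exponent + quiet bit
PAYLOAD_MASK = (1 << 51) - 1        # the 51 mantissa bits left for the payload
CANONICAL_NAN = 0x7FF8000000000000  # the one NaN pattern doubles may use


def box_double(x):
    """Store a double as its own 64-bit pattern (NaNs are canonicalised)."""
    if math.isnan(x):
        return CANONICAL_NAN
    return struct.unpack("<Q", struct.pack("<d", x))[0]


def box_int(n):
    """Hide an unsigned 51-bit integer inside a negative quiet NaN."""
    if not 0 <= n <= PAYLOAD_MASK:
        raise ValueError("integer does not fit in 51 bits")
    return INT_TAG | n


def is_int(word):
    """True if the word carries the integer tag rather than a double."""
    return (word & INT_TAG) == INT_TAG


def unbox(word):
    """Recover the Python value, with no separate variant tag needed."""
    if is_int(word):
        return word & PAYLOAD_MASK
    return struct.unpack("<d", struct.pack("<Q", word))[0]
```

Every value fits in one 8-byte word, and ordinary doubles need no decoding at all, which is why the footnote calls it storing a union "without extra variant tags".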
      • Re:Amazing! (Score:4, Interesting)

        by dmonahan (957638) on Thursday November 15 2007, @08:31PM (#21373175)
        Sometime in the early 70s, a Honeywell division, one of our steady clients, called with a strange request. They had built a small number of special machines for the Navy. Now the Navy wanted more. Honeywell had the circuit drawings and the bootable tape (which they got from the Navy). They had no documentation (not even the instruction set). They asked us to rebuild the code. We did. Dick.
    • No disrespect to Lajos, but have we really fallen so far in programming standards that it's considered "amazing" to disassemble a 1024 byte program?

      I dunno. I'm certain I could look at any given one kilobyte program and tell you "that opcode is adding the results of those two", but it takes a certain kind of cleverness to figure out why it's using opcodes for constants and how they manage to pack a shift-right-branch-if-odd into two bytes plus an index register.

      See also "The Story Of Mel" [pbm.com]. Now imagine being tasked with turning that into readable, understandable code. That's the real accomplishment.

      • Now imagine being tasked with turning that into readable, understandable code.

        Don't have to imagine. Well, that's not quite true; I've never done anything on the scale of drum timing. But "disassembly" isn't just "opcode 53 adds X to the accumulator"; that part's (nearly) always been easy and automated. The hard part is going through, figuring out what every single memory location is for, making up your own consistent label for it, understanding the self-modifying part of the code, etc. Disassembly is a
        • Been there, done all of that. And no, what he did isn't extremely difficult or novel, but I still give him credit for pulling off nice work.

    • The company I work for hired 4 programmers (from out of country) to re-work existing code and clear out known bugs. As a result, the login no longer worked. 2 weeks later, the testers could get in, but none of the drop-down boxes worked, and more. Problem is they are wizards. They click and drop code without understanding what the code does. The US-trained programmers can't get the time of day from the head of IT.
    • This is not your typical disassembly job.

      If you read the fine source code linked to in the article, you would see that not only is the machine code disassembled, but the virtual machine that it implements is fully described. That's not a trivial exercise.

  • 'leet speak first turned the world upside down as a joke about "BOOBLESS". I wonder if the 4004 could run a softporn text adventure game like that.
  • by lseltzer (311306) on Thursday November 15 2007, @08:13PM (#21372967)
    I found a buffer overflow. Exploit code to follow...
  • File version 6.0.6000.16386
    Size: 172 KB (176,128 bytes)

    kinda puts things in perspective, doesn't it?

    ah, "progress"
  • I have the original 4004 reference guide (blue cover), scored during an early Wescon convention in 1970. I looked at this and said -- "Oooh, a whole hexadecimal digit on a single chip. That's going to change things."

    People used to consider square wave logic charts a programming tool back then, too.

  • How has no one mentioned this yet? - Don't blame me too much, I just copied and pasted from: http://downlode.org/Etext/power.html [downlode.org]

    The Feeling Of Power
    by Isaac Asimov

    Jehan Shuman was used to dealing with the men in authority on long-embattled earth. He was only a civilian but he originated programming patterns that resulted in self-directing war computers of the highest sort. Generals, consequently listened to him. Heads of congressional committees too.

    There was one of each in the special lounge of New Pentagon. General Weider was space-burned and had a small mouth puckered almost into a cipher. He smoked Denebian tobacco with the air of one whose patriotism was so notorious, he could be allowed such liberties.

    Shuman, tall, distinguished, and Programmer-first-class, faced them fearlessly.

    He said, "This, gentlemen, is Myron Aub."

    "The one with the unusual gift that you discovered quite by accident," said Congressman Brant placidly. "Ah." He inspected the little man with the egg-bald head with amiable curiosity.

    The little man, in return, twisted the fingers of his hands anxiously. He had never been near such great men before. He was only an aging low-grade technician who had long ago failed all tests designed to smoke out the gifted ones among mankind and had settled into the rut of unskilled labor. There was just this hobby of his that the great Programmer had found out about and was now making such a frightening fuss over.

    General Weider said, "I find this atmosphere of mystery childish."

    "You won't in a moment," said Shuman. "This is not something we can leak to the firstcomer. Aub!" There was something imperative about his manner of biting off that one-syllable name, but then he was a great Programmer speaking to a mere technician. "Aub! How much is nine times seven?"

    Aub hesitated a moment. His pale eyes glimmered with a feeble anxiety.

    "Sixty-three," he said.

    Congressman Brant lifted his eyebrows. "Is that right?"

    "Check it for yourself, Congressman."

    The congressman took out his pocket computer, nudged the milled edges twice, looked at its face as it lay there in the palm of his hand, and put it back. He said, "Is this the gift you brought us here to demonstrate. An illusionist?"

    "More than that, sir. Aub has memorized a few operations and with them he computes on paper."

    "A paper computer?" said the general. He looked pained.

    "No, sir," said Shuman patiently. "Not a paper computer. Simply a piece of paper. General, would you be so kind as to suggest a number?"

    "Seventeen," said the general.

    "And you, Congressman?"

    "Twenty-three."

    "Good! Aub, multiply those numbers, and please show the gentlemen your manner of doing it."

    "Yes, Programmer," said Aub, ducking his head. He fished a small pad out of one shirt pocket and an artist's hairline stylus out of the other. His forehead corrugated as he made painstaking marks on the paper.

    General Weider interrupted him sharply. "Let's see that."

    Aub passed him the paper, and Weider said, "Well, it looks like the figure seventeen."

    Congressman Brant nodded and said, "So it does, but I suppose anyone can copy figures off a computer. I think I could make a passable seventeen myself, even without practice."

    "If you will let Aub continue, gentlemen," said Shuman without heat.

    Aub continued, his hand trembling a little. Finally he said in a low voice, "The answer is three hundred and ninety-one."

    Congressman Brant took out his computer a second time and flicked it. "By Godfrey, so it is. How did he guess?"

    "No guess, Congressman," said Shuman. "He computed that result. He did it on this sheet of paper."

    "Humbug," said the general impatiently. "A computer is one thing and marks on a paper are another."

    "Explain, Aub," said Shuman.

    "Yes, Programmer. Well, gentlemen, I write down seventeen, and just undernea
  • 1024 Bytes? Bah! (Score:3, Interesting)

    by LS (57954) on Friday November 16 2007, @04:01AM (#21376041) Homepage
    How about 256 bytes for a 3D rotating parallax tunnel fly-through [256b.com] !!!

    LS
    • Re: (Score:3, Interesting)

      The original lacked a gui.

      And scientific functions.

      And the ability to convert hex.

      And store/recall.

      The original had 4 functions. This one has at least 40. Would you rather the MS guys spend time seeing if they can force their 114k application down into 10k, or perhaps writing an operating system that doesn't suck?
      • by DragonWriter (970822) on Thursday November 15 2007, @07:01PM (#21372317)

        Would you rather the MS guys spend time seeing if they can force their 114k application down into 10k, or perhaps writing an operating system that doesn't suck?


        It'd be an improvement if MS did either.
      • I'm pretty sure it had a GUI. If I were to guess, I'd say it was buttons...possibly with numbers on them.
      • The KDE guys have gotten theirs to about 5k with somewhat more functionality than Windows'.
          • Re: (Score:3, Informative)

            They put all the actual code in a shared library:

            # ldd /usr/bin/kcalc
            libkdeinit_kcalc.so => /usr/lib/libkdeinit_kcalc.so (0x00002b1351db8000)
            ...

            # ls -lh /usr/lib/libkdeinit_kcalc.so
            -rw-r--r-- 1 root root 436K 2007-07-03 19:15 /usr/lib/libkdeinit_kcalc.so
        • I kind of wish their Excel could do at least 200!
          Or at least that it would do "=(10.1)-10-0.1" properly
          • Re:Only 1024? (Score:5, Interesting)

            by TDRighteo (712858) on Thursday November 15 2007, @10:57PM (#21374353)
            Floating-point math doesn't fix itself. Let's not be hard on Microsoft when:

            Python 2.5.1 (r251:54863, Oct 30 2007, 13:54:11)
            [GCC 4.1.2 20070925 (Red Hat 4.1.2-33)] on linux2
            Type "help", "copyright", "credits" or "license" for more information.
            >>> 10.1-10-0.1
            -3.6082248300317588e-16
            and...

            $ perl
            printf("%s\n", 10.1-10-0.1);
            -3.60822483003176e-16
            and...

            $ php
            <?php
            echo (10.1-10-0.1);
            ?>
            -3.6082248300318E-16
            Note that the answers vary across languages too...
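A short Python sketch of why this happens and one way around it, using only the standard `decimal` module. The variable names are hypothetical. It shows that 10.1 is stored as a nearby binary fraction, that decimal arithmetic gives exactly zero, and that formatting the same double with fewer digits appears to reproduce the Perl and PHP strings above.

```python
from decimal import Decimal

# Binary floating point: 10.1 and 0.1 have no exact base-2 representation.
binary_result = 10.1 - 10 - 0.1

# Decimal(10.1) exposes the exact value of the double nearest to 10.1.
exact_stored = Decimal(10.1)

# Decimal arithmetic on the literal strings keeps the digits as typed.
decimal_result = Decimal("10.1") - Decimal("10") - Decimal("0.1")

# The same double printed with 15 and 14 significant digits.
as_15_digits = "%.14e" % binary_result
as_14_digits = "%.13E" % binary_result
```

If the last two strings match the Perl and PHP output, the three languages computed the same value and differ only in how many digits they print by default.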
      • That sounds like a dare to me.
          • Re: (Score:3, Funny)

            by Anonymous Coward
            Did it, but the ATI drivers still sucked.
            • whah! ah! ah! ah! ah! ah!

              Your comment violated the "postercomment" compression filter. Try less whitespace and/or less repetition. Comment aborted.
    • That is the correct answer - all modern calculators are descended from a competitor's model which incorrectly calculated 9+9 to be 18.
    • You could read the documentation. You want 9+ 9+ =. At the end you did 9+9+9. You could look at what's on the tape. IHBTHIND
    • by bpharri2 (173681) on Thursday November 15 2007, @09:02PM (#21373479) Homepage
      Of course if you had bothered to read the article, you'd know that it doesn't work like today's calculators but like the old adding machines:

      "The electronic calculators that accountants used 35 years ago worked differently than the familiar four-function calculator we use today. These were designed to behave much like mechanical adding machines of the 1960's. After every number you want to add to the total, you need to press +, so = doesn't work like you'd expect. Here are some examples:

      To add three numbers: 61 + 79 + 83 + = (if you forget the last +, the 83 won't get added)
      To subtract two numbers: 2007 + 1971 - =
      To multiply two numbers: 125 x 5 = (this is more like we're used to)
      To divide two numbers: 625 / 5 = "
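The key sequences quoted above can be mimicked with a toy model in Python. This is only a sketch of the adding-machine convention as the article describes it, with a hypothetical `run_keys` function; it is not derived from the Busicom 141-PF code and ignores decimals, printing, and memory.

```python
def run_keys(keys):
    """Return the value shown when '=' is pressed (toy model)."""
    total = 0        # running sum, as on a mechanical adding machine
    entry = 0        # the most recently keyed number
    pending = None   # (operator, first operand) for a pending x or /
    for key in keys:
        if key == "+":
            total += entry          # '+' commits the entry to the total
        elif key == "-":
            total -= entry          # '-' commits the entry negatively
        elif key in ("x", "/"):
            pending = (key, entry)  # remember the first operand
        elif key == "=":
            if pending is not None:
                op, first = pending
                return first * entry if op == "x" else first / entry
            return total            # an uncommitted entry is not included
        else:
            entry = int(key)
    return None


three_numbers = run_keys("61 + 79 + 83 + =".split())
forgot_last_plus = run_keys("61 + 79 + 83 =".split())
```

The point of the model is that `+` and `-` act on the number just entered, so `=` only reports what has already been committed to the total.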