The Woman Who Mastered IBM's 5,400-character Chinese Typewriter (fastcompany.com)
Fast Company's technology editor harrymcc writes:
In the 1940s, IBM tried to market a typewriter capable of handling all 5,400 Chinese characters. The catch was that using it required memorizing a 4-digit code for each character. But a young woman named Lois Lew tackled the challenge and demoed the typewriter for the company in presentations from Manhattan to Shanghai.
More than 70 years later, Lew, now in her 90s, told her remarkable story to Thomas S. Mullaney for Fast Company.
Re: (Score:1)
It's called the Dumbass Tax. Enjoy!
Re: (Score:2)
Dunning-Krugerrands.
Great story about human memory (and aging) (Score:2)
But a really stupid FP. Why are you propagating the AC BS Subject?
I really enjoyed reading the story and should have spent more time with it. Means extra to me based on my own studies of Japanese? But you only have to memorize some 2,000 characters to handle the officially sanctioned kanji of post-war Japanese. Thanks to modern input technology, I've mostly given up trying to remember the stroke order for each character, though I can still reconstruct it to reproduce the proper character with the strokes in
Re: (Score:2)
Re: (Score:2)
Not sure of the exact schedule, but that's approximately correct. For the first six years there are a certain number of kanji for each grade, but after that it's more open for how and when you learn the last thousand. There are graded readers that show only the kanji appropriate for each year, with the rest of the words (to be studied later) appearing in hiragana.
I'm still waiting for a superior teaching program to teach kanji on a more intelligent (and faster) basis. I even have a funny (in the bad sense)
finally a good read (Score:5, Informative)
Re: (Score:2)
I basically concur (per my longer comment), but I think you're being too unkind to the introverts that probably dominate "most on slashdot" [sic]. It's just that the noisy and extroverted trolls desperately want attention in the discussions, so they post lots of junk.
Me? By their MEPRs I want to ignore them. It's a simple matter of not wasting time.
Re: (Score:2)
Thanks, based on that I'll read it. I was going to skip because there is a glaring error in the part quoted in the summary.
There are way more than 5k characters in Chinese. At least 10x that many. It's more accurate to say that there are about 5k that are commonly used and needed for various documents, books etc.
Re: (Score:2)
There are way more than 5k characters in Chinese. At least 10x that many.
That's a little misleading since most of those are graphical variants of the same character (like seal script). They look different, but they're the same.
Re: (Score:1)
Re: (Score:2)
Thanks, based on that I'll read it. I was going to skip because there is a glaring error in the part quoted in the summary.
There are way more than 5k characters in Chinese. At least 10x that many. It's more accurate to say that there are about 5k that are commonly used and needed for various documents, books etc.
The article does address that, by way of citing a contemporary newspaper article: "The machine...has 5,400 characters (the most commonly used of the 80,000 in the Chinese language)"
Re: (Score:2)
Sometimes these kinds of articles are interesting, but I can't think of anything to say in the comment section other than "wow, that was interesting!" So I say nothing.
Re: (Score:2)
Great, one for EMACS (Score:2)
Re:Great, one for EMACS (Score:5, Insightful)
For the sake of fairness, they should now find someone who learned to use VI's immediate predecessor.
You mean ex? Isn't that basically what you get when you run ed, or just run vi on a teletype or equivalent (like with no TERM[TYPE] set?)
Re: (Score:2)
I'm more in awe of the women who trained for months to use the voder [wikipedia.org], another dead-end technology with a brief life.
Not practical for masses (Score:2)
She's absolutely gorgeous! I'm in lust. *slap slap back to reality*...
Unless the market was a few highly trained specialists, such a typewriter doesn't seem practical. A system that looks up characters via a phonetic alphabet, such as pinyin, is more practical because it's hierarchical like an English dictionary: match first letter, then second, then third, etc.
However, different dialects pronounce each Chinese word (character) differently. But roughly half of Chinese office workers knew either Cantonese or
Re:Not practical for masses (Score:4, Informative)
A system that looks up characters via a phonetic alphabet, such as pinyin, is more practical
Absolutely. But the typewriter in TFA was built in 1947. The technology needed to support pinyin entry didn't exist.
Lois Lew trained for years to master the device. Today, a Chinese teenager can learn to enter pinyin on her smartphone thumbpad in less than five minutes.
But roughly half of Chinese office workers knew either Cantonese or Mandarin
Actually, more than 90% know Mandarin, either as a 1st or 2nd language. And Cantonese is not in 2nd place. There are more speakers of Min dialects (Fujianese, Hokkien, etc.) and Wu (Shanghainese, Zhejiang, etc.) than the Yue dialects of Guangdong (Cantonese).
Re: (Score:1)
I can kind of envision a mechanical device with say 3 or 4 phonetic dials for the word: 1st letter, 2nd letter, 3rd letter, that would narrow down the results. It would be kind of like the key+lock concept to filter platters or cards with notches. It would probably be more expensive to construct than the machine in the article, but if it saves training time it could be worth it.
Re: (Score:2)
Not only did the technology needed to support Pinyin not exist, Pinyin itself did not even exist.
Re: (Score:1)
You don't need technology for a pinyin-like language. I'm not following you.
Re: (Score:2)
the typewriter in TFA was built in 1947. The technology needed to support pinyin entry didn't exist.
I wonder if that's true, since it was taking multi-character input already. Instead of typing a four digit code, you could type three characters, a tone number, and then a disambiguation number.
Code complexity. (Score:2)
since it was taking multi-character input already. Instead of typing a four digit code, you could type three characters, a tone number, and then a disambiguation number.
Again, keep in mind that this device predates IBM's Selectric by almost two decades. And it was only produced a couple of decades after the Enigma.
The 4 digit code is probably a very simple and direct encoding of the position of the character on the drum.
The 5 x 10 x 10 x 10 chords used to select a character are probably not due to any systematic organisation of the characters, but tightly coupled to the way the character is mechanically selected from the drum.
I.e.: you can't assign an arbitrary "code" to a characte
Re: (Score:2)
I.e.: you can't assign an arbitrary "code" to a character.
You can only assign an arbitrary position on the drum, and the specific code flows from this position.
I'm not sure what you're saying here. Why can't you put characters on arbitrary positions on the drum?
Code tree (Score:3)
I'm not sure what you're saying here. Why can't you put characters on arbitrary positions on the drum?
Sorry I wasn't clear.
What I mean is that this whole thing looks like:
- The drum contains a grid of 60x100 symbols rolled around.
- More precisely, probably a bunch of 2D grids, each 10x10 symbols, and these grids are in turn arranged in a 6x10 higher-level grid.
- The code which is typed on the pad defines which position of this 2-level grid will be selected and printed on the paper.
- i.e.: 1 pair of numbers defines the horizontal displacement of the drum (to select a specific "ring" around the drum, i.e.: a specific c
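A minimal Python sketch of the layout guessed at above, assuming a 60x100 drum where the first two digits of the code pick a ring along the drum and the last two pick a slot around that ring. The constants, function names and the digit split are hypothetical illustrations of the "code follows position" idea, not IBM's documented mechanism.

```python
# Assumed geometry, taken from the "60x100" guess in the comment above.
RINGS = 60            # hypothetical: rings along the drum's axis
SLOTS_PER_RING = 100  # hypothetical: character slots around each ring


def code_to_position(code: str) -> tuple[int, int]:
    """Map a 4-digit code to (ring, slot) by splitting it into two pairs."""
    if len(code) != 4 or not (code.isascii() and code.isdigit()):
        raise ValueError("code must be exactly four decimal digits")
    ring = int(code[:2])   # first pair: sideways displacement of the drum
    slot = int(code[2:])   # second pair: rotation of the drum
    if ring >= RINGS:
        raise ValueError(f"ring {ring} is outside the assumed {RINGS} rings")
    return ring, slot


def position_to_code(ring: int, slot: int) -> str:
    """Inverse mapping: the code is fully determined by the position."""
    if not (0 <= ring < RINGS and 0 <= slot < SLOTS_PER_RING):
        raise ValueError("position is outside the assumed drum")
    return f"{ring:02d}{slot:02d}"


print(code_to_position("0742"))
print(position_to_code(7, 42))
```

Under this assumption, changing a character's code means physically moving it on the drum, which is the coupling the comment describes.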
Re: (Score:2)
Oh yeah, I see your point on the mechanical nature of the machine requiring a more rigid encoding.
However, I am certain you could create an encoding of any Chinese character at a fixed-depth (might need to be 5 deep instead of 4, though), using a pronunciation based alphabet.
Re: (Score:2)
Instead of typing a four digit code, you could type three characters, a tone number, and then a disambiguation number.
Pinyin is not limited to three letters. For instance, "shuang" is six letters.
The other problem is that the disambiguation can't be done with one digit. For instance, "qing" can be more than 40 different hanzi.
Modern pinyin entry systems deal with this probabilistically from context. If you enter "qing" you will have 40 choices. But if you ignore them and just keep typing, then "qinggeiwoyigepingguo" will give you one choice and it will be the correct one.
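A toy Python sketch of that idea, assuming the input is already split into syllables. The candidate lists are tiny subsets and the pair scores are made-up numbers for illustration; real input methods use large language models and are far more elaborate.

```python
from itertools import product

# Hypothetical candidate table: each pinyin syllable maps to several hanzi.
CANDIDATES = {
    "qing": ["青", "清", "请", "情"],
    "gei": ["给"],
    "wo": ["我", "握"],
}

# Made-up scores for how often one character follows another.
PAIR_SCORE = {
    ("请", "给"): 5,
    ("清", "给"): 1,
    ("给", "我"): 9,
}


def best_sequence(syllables: list[str]) -> str:
    """Pick the character sequence whose adjacent pairs score highest."""
    def score(seq: tuple[str, ...]) -> int:
        return sum(PAIR_SCORE.get(pair, 0) for pair in zip(seq, seq[1:]))

    options = [CANDIDATES[s] for s in syllables]
    # Exhaustive search is fine for a toy; real systems use dynamic programming.
    return "".join(max(product(*options), key=score))


print(CANDIDATES["qing"])                   # one syllable alone: several choices
print(best_sequence(["qing", "gei", "wo"]))  # with context: one best guess
```

On its own "qing" stays ambiguous, but once the following syllables are known the pair scores single out one sequence.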
Re: (Score:1)
It doesn't have to be. It just narrows the candidates down to a much smaller set. Certain "problem words" will indeed take extra work and/or extra digits; but it will be only a small percent.
Re: (Score:2)
40 is better than 5000
Well, of course (Score:5, Funny)
Lois Lew tackled the challenge and demoed the typewriter for the company in presentations from Manhattan to Shanghai.
She had to; that's how wide the keyboard was ... :-)
Re: (Score:2)
I suppose it deserves a funny mod, but there were actually some Chinese and Japanese keyboards with separate keys for each character. You didn't have to memorize combinations, but you had to remember the locations to know where to look. I can't find a reference, but the discussion of Japanese typewriters on Wikipedia makes me believe one model may have had 2,400 keys. I remember seeing a picture with different banks of keys, but it was a long time ago and maybe it was just some kind of prototype. I think th
Re: Well, of course (Score:3)
Japan wouldn't have really had much of a market for a Kanji typewriter, because any Japanese word can, with total 100% legitimacy, be written using only Katakana & Hiragana, which work exactly the same way as the Roman alphabet, because they're alphabets themselves.
Oversimplifying a bit, hiragana is used to write "real" Japanese words, katakana is used to write foreign loanwords, but both are more or less phonetically equivalent.
In a pinch, or for literary/artistic effect or shock value, you CAN technic
Re: (Score:3)
Thank you for the deep and thoughtful comment, but only partial concurrence. From your reply I actually think that you probably read Japanese better than I do, but I also think you are missing some parts of the puzzle and you even got a couple of details wrong.
For example, in spite of my limited Japanese reading, I suspect that I've done more Japanese input than you have. There were a couple of places where I think you should have mentioned specific topics such as DOS-V and ATOK and the abomination known as
Re: (Score:2)
All 5,400 Chinese characters? (Score:2, Informative)
There are WAY more than 5,400 Chinese characters. If you know the most common 5,400 Chinese characters, you can live and do business in China without too many problems. In daily life in China, sometimes a name or a phrase will use a character that isn't in the top 5,400 Chinese characters. If you want a PhD in Chinese literature, you need to know over 10,000 or 12,000 characters.
This BBC source has "a comprehensive modern dictionary will rarely list over 20,000 in use."
http://www.bbc.co.uk/languages... [bbc.co.uk]
Re:All 5,400 Chinese characters? (Score:5, Informative)
There is a dictionary with over 106,000 Hanzi in it; other sources cite over 85,000, but really it's a matter of choosing which historical or technical sources to include.
http://blog.tutorming.com/mand... [tutorming.com]
Re: (Score:2)
Some characters were only ever used once (plus direct quotes). It gets difficult to decide when a character has been used enough to count. See https://en.wikipedia.org/wiki/... [wikipedia.org]
Re: (Score:2, Interesting)
Right, but your source mentions that the meaning of the "used once" characters is lost; they're not what's in a comprehensive dictionary. The authoritative dictionaries have over 80,000 characters.
Topic is of interest to me because of Japanese kanji, which is a subset of hanzi. The government has a list of over 1800 needed for daily Japanese... but still those 80K+ are potentially available for use and are in many historical, literary and technical cases. Daily written Japanese of course takes those ~2K characters and
Re: All 5,400 Chinese characters? (Score:2)
Ironic isn't it? Ahead of its time. (Score:5, Interesting)
For centuries the western alphabet-based systems of writing had a big advantage over the open-ended glyph system of Chinese writing. When data processing became a thing starting, I suppose, with the telegraph and Morse code, that advantage became even wider.
Fast forward a century or so and the "invention" of GUIs and now we are using symbology again in the form of an open-ended dictionary of icons with dozens of dialects and regionalisms. Just like where the Chinese were those centuries ago. Full circle.
I bet most of us could not say to the nearest hundred the number of GUI icons we see, use, click on, and recognize. Graphical tools like Adobe Illustrator/Photoshop have hundreds all by themselves. Add in all the crap that Microsoft and Apple dump on you.
And it isn't limited to that. Phones, kitchen appliances, street signs, cars, medicines, and even shipping cartons and food packages all have their own iconography.
So the 5,400-character typewriter might seem pretty quaint, but it also was where we are now.
Re: (Score:1)
Having a convention is fine. I'm willing to tolerate a convention for an image that represents "settings" instead of just saying "settings". At least if we use something slightly intuitive like a wrench, representing adjustment, reconfiguration. A speaker probably relates to audio. A musical note.
Then you have a fucking phone app or website or hip program with random dots and lines and arrows and ribbons instead of functionality. That's if they aren't relying on random unintuitive gestures to open things wi
Re: Ironic isn't it? Ahead of its time. (Score:2)
Do you advocate emoji becoming a language? Is there an emoji dictionary? Emoji might be easier to communicate across languages and even cultures. I could see emoji is easier to translate between languages, especially for machines. I dunno though I think everyone will just learn and use English online because it is the path of least resistance. Old people are not going to learn emoji BS. Not to mention sites like slashdot canâ(TM)t (see!) even handle utf-8.
Re: (Score:2)
" Phones, kitchen appliances, street signs, cars, medicines, and even shipping cartons and food packages all have their own iconography. " due to poor design which is the vast majority of design.
Excessive variety of icons defeats their intended purpose just like excessive variety of written characters. It is not an advantage.
Icons and other characters are "code" and code is best kept simple and CLEAN, not bloated by vanity be it "change is progress" art-tards thrashing through shitty UIs (all of them) or bl
Re: (Score:3)
Re: (Score:2)
The answer turns out to be e (the constant, 2.71828...). The most efficient encoding scheme is if you have
Re: (Score:1)
Re: (Score:2)
Was gonna say, ternary is the most efficient encoding scheme.
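Presumably this refers to the radix-economy argument: writing a number N in base b takes about log_b(N) digits with b possible states each, so the cost grows like b / ln(b). A small Python sketch of that reading follows; the function name is hypothetical and the cost model is an assumption, since the parent comment is cut off.

```python
import math


def radix_cost_factor(base: float) -> float:
    """Relative cost b / ln(b) of representing numbers in the given base."""
    if base <= 1:
        raise ValueError("base must be greater than 1")
    return base / math.log(base)


# Compare a few bases; lower is cheaper under this cost model.
for base in (2, 3, 4, 10, math.e):
    print(f"base {base:.3f}: cost factor {radix_cost_factor(base):.4f}")

# Among whole-number bases, find the cheapest one.
best_integer_base = min(range(2, 11), key=radix_cost_factor)
print("cheapest integer base:", best_integer_base)
```

Under this model e gives the minimum overall and 3 is the cheapest integer base, with 2 and 4 tied just behind, which matches both comments.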
No, icons suck! (Score:2)
Mobile icons drive me insane because their meaning is often not clear. I don't want to guess, it may sign me up to a Goatse Cruz or whatnot.
On desktops you have the icon rollover option for pop-up English descriptions. There is no consistent equivalent on mobile.
Racism. (Score:1)
Re: (Score:1)
Another story that's only on slashdot because it mentions IBM. If it was a French typewriter company there wouldn't be any story on S.
Tough time submitting stories on /.? Know the feeling.
It's a woman! (Score:2)
Wow, it's a woman! A woman did it! Not a man! Wait, maybe it's a trans woman?
Re: (Score:1)
Wow, it's a woman! A woman did it! Not a man! Wait, maybe it's a trans woman?
Not back then. Genuine woman.
The Typewriter and I (Score:1)