Handhelds Cellphones Portables (Apple) Data Storage Hardware

Squeezing a Wikipedia Snapshot Onto an 8GB iPhone

Posted by timothy
from the but-you'll-miss-the-latest-edit-wars dept.
blackbearnh writes with this excerpt from O'Reilly Radar: "Think about Wikipedia, what some consider the most complete general survey of human knowledge we have at the moment. Now imagine squeezing it down to fit comfortably on an 8GB iPhone. Sound daunting? Well, that's just what Patrick Collison's Encyclopedia iPhone application does. App Store purchasers of Collison's open source application can browse and search the full text of Wikipedia when stuck in a plane, or trapped in the middle of nowhere (or, as defined by AT&T coverage...)"

  • by benwiggy (1262536) on Friday July 03, 2009 @10:45AM (#28571633)
    "Wikipedia, what some consider the most complete general summary of human knowledge we have at the moment."

    There. Fixed that for you.

    • by L4t3r4lu5 (1216702) on Friday July 03, 2009 @10:49AM (#28571699)

      "Wikipedia, what some consider the most complete general summary of human knowledge[citation needed] we have at the moment."

      There. Fixed that for you.

      There. Fixed that for you.

      • by mikael_j (106439) on Friday July 03, 2009 @11:25AM (#28572097)

        In all seriousness, I'm starting to get extremely annoyed by what is IMHO flagrant abuse of the [citation needed] tag on Wikipedia. I don't know how many times I've seen it used in situations where it just wasn't needed. And I don't mean in "But anyone who spends all day working on FOO knows that BAR!" situations but more along the lines of "The earth orbits the sun[citation needed]." or even better "Sir NameOfArticle was in his day frequently regarded as a national hero in $COUNTRY.[citation needed]. <Six paragraphs that detail, with plenty of sources, exactly how famous Sir NameOfArticle was.>".

        I've actually begun wondering if maybe there are certain individuals who are deliberately trolling Wikipedia by adding [citation needed] in places where it just doesn't belong and then sit around giggling as they read the discussion pages of various articles they've messed with.

        /Mikael

        • Re: (Score:2, Insightful)

          by larry bagina (561269)
          It also seems completely random and arbitrary. If they need citations, then they need citations on every sentence/idea/paragraph that isn't general knowledge. Maybe a bot goes through and randomly adds them.
          • If they need citations, then they need citations on every sentence/idea/paragraph that isn't general knowledge.

            I think that's a good viewpoint. Now, I am all for Wikipedia and have on many occasions found it to be of paramount value, but there is no way anyone in their right mind should trust an "encyclopedia anyone can edit". The way I see it, Wikipedia is a collection of sources, and WP articles function mainly as summaries of those sources. No-one should accept anything that's written in WP without checking where the information came from, and that's why everything that doesn't fall under common knowledge should

            • Re: (Score:3, Insightful)

              by dimeglio (456244)

              Look, this is how it works: I'm asked to find out what COBIT is. Naturally, I google it and find a hit in Wikipedia. From there, I get a fairly comprehensive idea of what it might be - with external links that I don't bother to click. I then explain to the team what COBIT is and tie it in with our business objectives. The team then might want to investigate further or get certification if it is a requirement for the job.

              Now, I'm not sure what you mean by trust. I do trust that the information I gathered on

            • by leenks (906881)

              Why is this any different than any other encyclopedia, or any other source for that matter?

            • by mopower70 (250015)

              I think that's a good viewpoint. Now, I am all for open source and have on many occasions found it to be of paramount value, but there is no way anyone in their right mind should trust "source code anyone can edit".

              There. Does that read any differently?

          • Re: (Score:3, Insightful)

            by atraintocry (1183485)

            If there is a pattern, it's that the person who put [citation needed] didn't necessarily agree with the preceding statement, but either (a) didn't want it to turn into a conflict or (b) didn't feel like doing the research and (perhaps rightly) decided that the person making the claim should do it.

            I think that in the aggregate they turn the tone of WP into something that's very passive-aggressive. But individually they are harmless, just pointing out the obvious ("here is a statement that is unverified").

            Whe

        • by billcopc (196330)

          Douchebags ? On my Wikipedia ?

          SURELY YOU JEST!

        • [citation needed] is really the only good defense against weasel words. All one editor that doesn't agree with another can do (that doesn't involve an edit war) is to ask for greater and greater clarification.

          Which after a certain point becomes pretty weaselly in and of itself.

          (I am not a frequent WP editor and there's a lot about WP that I can't f-n stand. But I think at this point everything noteworthy that could be said about the pros and cons of 'crowdsourcing' the editing of an encyclopedia has been sa

      • by Anonymous Coward

        "Wikipedia, what some[who?] consider the most complete general summary of human knowledge[citation needed] we have at the moment."

        There. Fixed that for you.

        There. Fixed that for you.

        There. Fixed that for you.

      • Re: (Score:2, Funny)

        by Anonymous Coward

        "[[Wikipedia]], what some[who?] consider[weasel words] the most complete general summary of [[human knowledge]][which?][citation needed] we[who?] have at the moment[weasel words]."

        There. Fixed that for you.

        There. Fixed that for you.

        There. Fixed that for you.

  • Now I can get lost in the mass of links wherever I go! Life's Good

    Well, I could if I had an iPhone, sounds like an impressive achievement though, but how much space do you have left over after it?

    • Re:Nice! (Score:4, Informative)

      by Starayo (989319) on Friday July 03, 2009 @11:31AM (#28572149) Homepage
      The filesize of the app is about 2GB. Pretty amazing!

      I'd be grabbing it right now if I didn't only have ~350MB of free space left on my iPhone...

      Would be a great app for iPod Touch users.
      • by JordanL (886154)
        All that porn is taking up a lot of space...
        • by Starayo (989319)
          It is, actually! I have a feeling the software I used to convert it didn't compress it properly.
      • Re: (Score:2, Troll)

        by Eunuchswear (210685)

        I'd be grabbing it right now if I didn't only have ~350MB of free space left on my iPhone...

        So stick a bigger SD card in it already.

        • Re: (Score:3, Informative)

          by SuperKendall (25149)

          So stick a bigger SD card in it already.

          He can't, can you just loan him your mobile SD capable device that can run the app?

          Oh that's right...

          • He can't, can you just loan him your mobile SD capable device that can run the app?

            Oh that's right...

            What's "right" exactly?

            That this open source app could be ported to many mobile SD capable devices? Like many smartphones for example?

        • Sometimes an encyclopedia app you'll use once a year is less important than the pocket space you save every day by giving up expandability.

          That said, it would be pretty nice to have an iPod or iPhone with an SD slot. But just regarding the music player side...the players that mix storage either don't use a database or make you wish you hadn't turned it on. And while I know and understand the virtues of having your music files nicely organized in a system you devised yourself, as my music drive climbs toward

          • Please explain the connection between lack of expandability and a nice database.

            Correlation is not causation.

            • If you can make changes to the storage between syncs (or if you're not syncing to a host machine at all) then the database has to be built and indexed on the player, which takes longer. A *lot* longer, which is why I don't use it with Rockbox.

              My friend had a Cowon that did something similar and he had nothing good to say about it. He also eventually went the iPod + Rockbox route.

              My larger point was that although I generally make sure that gear is open/hackable before I buy it, there are certain things that

  • by Blue Stone (582566) on Friday July 03, 2009 @10:50AM (#28571703) Homepage Journal

    This is easily doable.

    Once you trim the earth reference down to "Mostly harmless".

    • Re: (Score:2, Funny)

      by Dishevel (1105119) *

      This is easily doable.

      Once you trim the earth reference down to "Mostly harmless".

      What reason did you have for adding "Mostly"?

    • Re: (Score:3, Funny)

      by LifesABeach (234436)
      And the words, "Don't Panic" should be easily viewable.
  • by ColdWetDog (752185) on Friday July 03, 2009 @10:50AM (#28571709) Homepage
    1. Goes to foreign country - one that he has never visited before
    2. Doesn't have wireless access.
    3. Instead of wandering about the country he spends most of his time programming ("Then basically, I spent a significant fraction of my time there in Japan, again, in 2007 writing those applications") an application so he can look up stuff about the country he isn't spending much time actually visiting.

    I bow before you sir. Awesome.
    • Yeah, when I went to Nicaragua a few months back (to learn Spanish) I had to stop myself from digging into my WordPress blog to play around with the formatting and make things display nicely.

      I did end up spending a blog post talking about the difference between image scaling with sampling and image scaling by averaging pixels. It was curious how the difference would probably not be noticeable at home, but on the 8 year old laptop I was using, the averaging method took a good 10 seconds per image, while the
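      For anyone wondering what the two approaches look like, here is a minimal sketch in pure Python. It is not the poster's code: the function names, the grayscale list-of-lists image, and the integer shrink factor are all assumptions made for illustration, and the image dimensions are assumed to be exact multiples of the factor. Sampling keeps one pixel per block; averaging reads every pixel in the block, which is where the extra cost on slow hardware comes from.

```python
def downscale_sample(img, factor):
    """Downscale by sampling: keep the top-left pixel of each factor x factor block."""
    return [row[::factor] for row in img[::factor]]


def downscale_average(img, factor):
    """Downscale by averaging: each output pixel is the integer mean of a block."""
    height, width = len(img), len(img[0])
    out = []
    for y in range(0, height - factor + 1, factor):
        out_row = []
        for x in range(0, width - factor + 1, factor):
            # Visit every pixel in the block; this is factor*factor reads per output pixel.
            block = [img[y + dy][x + dx]
                     for dy in range(factor) for dx in range(factor)]
            out_row.append(sum(block) // len(block))
        out.append(out_row)
    return out


# Hypothetical 4x4 grayscale image (values 0-255).
img = [
    [0, 100, 50, 50],
    [100, 200, 50, 50],
    [10, 10, 0, 255],
    [10, 10, 255, 0],
]

sampled = downscale_sample(img, 2)    # [[0, 50], [10, 0]]
averaged = downscale_average(img, 2)  # [[100, 50], [10, 127]]
```

      Note how the checkerboard block in the bottom-right corner comes out black when sampled but mid-gray when averaged.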

    • by pzs (857406) on Friday July 03, 2009 @11:31AM (#28572151)

      You're right that this guy has flown the geek flag pretty high here; however, at least it's to some useful purpose. There are all kinds of facts about a country that are quite hard to discover just wandering about in it, and Wikipedia would be the ideal candidate to answer them.

      Last time I went on holiday (to Australia) I came back with a dozen questions I wanted answering, just because I didn't have internet access while I was out there; Wikipedia access would answer many of these questions. Examples:

      • I heard that Beds Are Burning [wikipedia.org] was about the Australian aborigines - I never knew this before and wanted to look up more details on it.
      • As a result of that, I wanted to know far more information about how well aborigines were integrated in Australia at the moment. Answer: badly [wikipedia.org], but again hard to find out just by wandering around in Australia and difficult to raise with a random Aussie.
      • Australia is experiencing a lot of drought at the moment, but while we were in Sydney, it rained quite a few times. I wanted to know more about the drought [wikipedia.org] and what parts of the country it was affecting.
      • ...

      I could answer these questions by going into an internet cafe, but this isn't always possible. A portable Wikipedia sounds like a great idea.

      • I think you could probably get answers to your questions by visiting public libraries, and talking to people. Maybe the "talking to people" bit might not get you definitive answers (though probably as good as a lot of Wikipedia content) but you might have found out a whole lot more. Also the public libraries probably had a lot of this info if you were looking for solid facts.

        I appreciate a portable + off the net wikipedia would be a cool tool as well but nothing beats chatting to the locals.

      • Re: (Score:3, Insightful)

        by dbcad7 (771464)
        When I go on vacation to a country like that, I will buy a travel guide. I also spend a great deal of time researching prior to traveling. Having Wikipedia available 24/7 would be nice if I went off the grid that I had planned, but not life changing. Wikipedia is not that great as a travel guide. It "might" cover some things to see, and rarely things to do, but is more geared towards questions like those you asked in your post. Those types of questions are easily answered later without taking anything away f
  • Oblig (Score:5, Funny)

    by TinBromide (921574) on Friday July 03, 2009 @10:51AM (#28571721)
    xkcd comic reference [xkcd.com]

    Yeah, pretty much you're turning your iPhone into a hitchhiker's guide to Earth, or at least America and Europe if you can manage to squeeze Wikitravel onto it.
  • Don't panic.!! (Score:1, Redundant)

    by leuk_he (194174)

    Seen that, done that. Got the t-shirt [xkcd.com] in 1978 [wikipedia.org]

  • Are you crazy? Imagine the frustration when you find this horrible typo and you can't fix it!
    • Re: (Score:1, Offtopic)

      by mmkkbb (816035)

      Gah, or the nearly ubiquitous misuse of the word "irony."

      • I know, isn't it ironic?

      • Or people saying "random" when they mean "arbitrary."

        It's annoying (although expected) that the words people mix up are the ones that require subtlety in their use. It's as if the more expressive a word or phrase is, the more likely it is to be horribly misused. That's the real story of Babel...some of us want to keep building language into something more expressive, but the higher that tower gets, the easier it is for the DGAS crowd to knock it over.

        Especially journalists. They think that they'll sound sma

  • Nothing new (Score:5, Informative)

    by Hrshgn (595514) <rince2001@gmOOOx.ch minus threevowels> on Friday July 03, 2009 @10:58AM (#28571805)
    This is nothing new. Wikipedia has been available for several years now in MDict format: http://www.octopus-studio.com/product.en.htm [octopus-studio.com]
    • Re: (Score:3, Insightful)

      by Sentry21 (8183)

      Given the trouble Patrick had squeezing down a full DB dump of Wikipedia to fit into 2GB (for the app store), I find it impossible to believe that the 162 MB files I've found so far for Wikipedia in MDict format are anywhere near the full text (which Patrick's app is).

      • by leenks (906881)

        The MDict version was created in June 2006, and I believe uses a dump from 2003 so that the files don't get too big for the MDict database format.

    • Re: (Score:2, Informative)

      by tomthepom (314977)

      You're not looking hard enough [legaltorrents.com]. Wikipedia has also been available in Tomeraider [tomeraider.com] format for a while now.

    • I put Wikipedia on my Laptop a few years ago, for when I'm travelling around with no net access. Since then I periodically update it.

      Last I checked, it was only a 2.2GB download gzip'd.

    • by MPAB (1074440)

      I've been using Wikipedia on my Palm TX for two or three years now. There's a guy who makes snapshots of the Spanish Wikipedia every 6 months or so and uploads them in TomeRaider format. The last version takes about 2/3 of my 2GB card with reduced images.

      Countless times I've looked into it for info about places, people, etc. Almost any town in Spain appears there, with tourist references. Also, it's solved many a doubt on history, math, physics, biology, etc.

      The only drawback is the index, which has problem

    • by samkass (174571)

      I could be wrong, but I think that until recently Wikipedia's license would have forbidden copying it into an iPhone app for sale at the App Store. I think the recent license change is what allowed this more than any technological advance.

  • Better (Score:2, Informative)

    by Anonymous Coward

    And for those preferring accuracy and editorial responsibility :

    http://www.ipodnn.com/articles/08/02/27/britannica.on.iphone/

  • ... so clearly this app will never make it through Apple's review process.

    • I would be of the opinion that they could never make a complete review of this package, as Wikipedia is constantly changing, and already huge.
      • by maxume (22995)

        "Snapshot" means that it isn't constantly changing.

        Still, the only way they do a complete review is if they are slavishly, inanely devoted to process rather than results (because the group of people offended by Wikipedia is so small as to not be worth worrying about as customers; perhaps this isn't obvious, but I would take it as obvious).

        Sometimes I have the idea that the approval process consists of an Apple employee typing swear words into any text field of an app.

        We're in the middle of creating an iPhone app for a client and they provide an API for the app. I've actually advised them to filter out a bunch of bad words, so the approval process won't be impeded for some dumb reason.

      • by hedwards (940851)
        Don't they just hire middle schoolers to type in dirty words and see what happens? Or if they're being really cost conscious, they just block anything too large to completely review.
  • It's cool but not $10 cool. I use 2 free apps that let me access wikipedia. Nothing really new or radical about this app unless wikipedia is really much larger and the author managed to cull 2gb from it.
    • I use 2 free apps that let me access wikipedia. Nothing really new or radical about this app

      Except it works when you're away from a hotspot, even if you paid only $220 (not $100 + $80/mo * 24 months) for your device.

  • It would be nice if he shared/donated some of the profits from this to Wikipedia, seeing as he's getting the database for free. There didn't seem to be a mention of it in the article or his personal site.

    • Re:Profits (Score:4, Informative)

      by twoshortplanks (124523) on Friday July 03, 2009 @12:21PM (#28572607) Homepage
      He is; It's detailed on the info for the app in iTunes. Since you need iTunes to read that, I'll simply post a screenshot: http://img.skitch.com/20090703-e7kkm8i7f4wdq9ir92td898wr3.jpg [skitch.com] (skitch may eventually delete that image after a while...)
      • Re: (Score:3, Interesting)

        by Me! Me! 42 (1153289)
        I wonder exactly what "portion of the proceeds" go through to the Wikimedia Foundation?
        I hate when companies don't just come out and say it explicitly. It makes me think they might just be paying a penny on the dollar so they can play the "philanthropy" card. I like that Target Corp clearly states that "5% of our profits" go to charity (admittedly, much of this may be in the form of product donations, but still.)
        http://en.wikipedia.org/wiki/Target_Corporation#Philanthropy [wikipedia.org]
        • by ensignyu (417022)

          I can imagine that for some companies, they don't want to mention the percent to avoid revealing their actual sales figures -- since the total amount of money donated will probably show up in the charity's public disclosures. Also, it lets them change it according to how generous they're feeling at the moment and their financial situation.

          It would be nice if they at least announced the total amount raised, which gives you a reasonable sense of whether it's a significant amount of money for a company of thei

  • Isn't wireless access for that purpose???
  • XML Compression (Score:4, Interesting)

    by firefarter (307327) <chris@cec[ ].de ['ube' in gap]> on Friday July 03, 2009 @11:34AM (#28572185) Homepage

    So, I'm reading here that they convert the XML into proprietary metadata and compress that.

    Why not use EXI (Efficient XML Interchange) http://www.w3.org/XML/EXI/ [w3.org] which has been tested as more efficient than gzip and requires less memory to parse? Especially since the XML processing can remain the same, since the nodeset is the same.

  • http://www.instructables.com/id/SBK1NAUFF78M26B/ [instructables.com]

    I found these instructions in May 2008 and created a reasonably current snapshot of wikipedia that is still rather compact on a Psion 5MX. Not quite the same "curb appeal" as an iPhone, but a lot more functional.

    Best,

  • Bah, that's nothing. I made an offline Wikipedia midlet! Unfortunately J2ME is unpleasant to say the least, and my phone only supports 2 GB SD cards so it only has some of the articles and without text.

  • by jbarr (2233) on Friday July 03, 2009 @11:57AM (#28572397) Homepage

    I've been using this app for quite a while on my 1st gen iPod Touch, and it works and works well. It's amazing just how many articles it has. Other than some cosmetic and minor feature issues, the only real limitation is that Apple limits data file size to 2GB, so there is an obvious limit as to how much can go into the file. But it is amazingly complete. No images, no fancy tables--just text articles at your fingertips.

    If you Jailbreak your iPhone/iPod Touch, then an excellent alternative is the Wiki2Touch app. Unfortunately, it seems that it's been pretty much abandoned in development, so it may be hit-or-miss if it works on OS v3.x. This implementation was REALLY slick. It provided a 4GB data file (that was much more complete) and a small Web server. You enabled the Web server, fired up Safari, and pointed it to a local URL. The app presented quick and very readable articles. And if you went to the trouble to download and process, you could also add about 4GB of image files to make things more complete (on a larger-capacity device, of course.)

    Here's a review that I posted for both apps just over a year ago on my iPod Touch Tips site:
    http://jimstips.com/ipod-touch-tips/ipod-touch-review-wikpedia-on-your-ipod-touch.html [jimstips.com]

    In both cases, the main complaint is updating. In order to update the data file, you have to re-download the data, and depending on the app, you are typically at the mercy of the developer to provide an update. Otherwise, you had to download, index, and install the HUGE files yourself.

    If you absolutely HAVE to have updated, offline data, check out the Wikipanion app. It's a nice compromise.

    • >> Other than some cosmetic and minor feature issues, the only real limitation is that Apple limits data file size to 2GB, so there is an obvious limit as to how much can go into the file.

      From the interview, Apple can't actually handle anything even close to the 2GB stated limit. The app itself is a stub, and downloads the data from elsewhere.

  • Sounds a bit small.

  • App Store purchasers of Collison's open source application can browse and search the full text of Wikipedia when stuck in a plane

    This page [wikipedia.org] is not recommended when you're stuck in a plane...

    • by jo_ham (604554)

      I would also argue that a list of accidents involving a wood plane should be avoided. Avoiding them might be a very close shave though.

  • by Anonymous Coward on Friday July 03, 2009 @01:06PM (#28573075)

    I bought this application 6 months ago and there are 3 major problems with it:
    1) The search function is broken because you need to type the exact word (prefix)
    2) This is plain text: no pictures and no tables so most articles with "list" are useless
    3) No update mechanism so the dump used will be outdated soon.
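
    To illustrate point 1, here is a rough sketch of how a prefix-only title search typically behaves. This is not taken from the app's source; the function name, the sample titles, and the choice of a sorted list with binary search are all assumptions for illustration. It shows why typing the start of a title works while typing a fragment from the middle returns nothing.

```python
import bisect


def prefix_search(sorted_titles, prefix, limit=10):
    """Return up to `limit` titles that start with `prefix`.

    `sorted_titles` must be sorted and already lowercased.
    """
    prefix = prefix.lower()
    # Binary search for the first title that is >= the prefix.
    start = bisect.bisect_left(sorted_titles, prefix)
    matches = []
    for title in sorted_titles[start:start + limit]:
        if not title.startswith(prefix):
            break  # Sorted order means no later title can match either.
        matches.append(title)
    return matches


# Hypothetical title index.
titles = sorted(t.lower() for t in ["Japan", "Java", "Jazz", "Earth", "Tokyo"])

hits = prefix_search(titles, "ja")      # ['japan', 'java', 'jazz']
misses = prefix_search(titles, "apan")  # [] because no title starts with "apan"
```

    A lookup like this is cheap on a phone, but anything smarter (substring or full-text search) needs a much bigger index.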

  • Web Version (Score:3, Funny)

    by SnarfQuest (469614) on Friday July 03, 2009 @02:23PM (#28573779)

    Is there a version of this that will run in a web browser? Anyone have a link?

    • I know you're being funny, but my first idea for how to implement this would be to

      • Store the data massively compressed
      • Run a custom http and/or cgi script which on-demand decompresses and serves straight-up html
      • Look at that in a browser

      I use a browser for viewing /usr/share/doc/**/*html. Not all uses of a browser have to leave 127.0.0.1.
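
      The three steps above could be sketched roughly like this, using only the Python standard library. Everything here is an assumption for illustration: the in-memory `ARTICLES` dict stands in for a real compressed dump, zlib stands in for whatever compression one would actually pick, and the names `render_article`, `WikiHandler` and `serve` are made up.

```python
import zlib
from html import escape
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import unquote

# Hypothetical store: article title -> zlib-compressed article text.
ARTICLES = {
    "Earth": zlib.compress("Mostly harmless.".encode("utf-8"), 9),
}


def render_article(title):
    """Decompress one article on demand and wrap it in minimal HTML.

    Returns None when the title is not in the store.
    """
    blob = ARTICLES.get(title)
    if blob is None:
        return None
    text = zlib.decompress(blob).decode("utf-8")
    return "<html><body><h1>{}</h1><p>{}</p></body></html>".format(
        escape(title), escape(text))


class WikiHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Treat the URL path as the article title, e.g. /Earth
        title = unquote(self.path.lstrip("/"))
        page = render_article(title)
        if page is None:
            self.send_error(404, "No such article")
            return
        body = page.encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port=8000):
    """Serve articles on the loopback interface only; blocks until interrupted."""
    HTTPServer(("127.0.0.1", port), WikiHandler).serve_forever()
```

      Calling `serve()` and pointing a browser at http://127.0.0.1:8000/Earth would show the decompressed page. Only the requested article is ever decompressed, so memory use stays small.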

  • by daemonenwind (178848) on Friday July 03, 2009 @06:32PM (#28575755)

    try this link from your mobile phone:
    http://wapedia.mobi/en/ [wapedia.mobi]

    That way you get the whole thing, up-to-date, and with no trouble or major memory usage.

  • With North Korea's nuclear weapons threats looming over the whole world every day, one must wonder what would happen if, one day, nuclear missiles are fired, plunging the world into a post-apocalyptic darkness.

    The progress of humanity could be lost with the destruction of the Internet, libraries, etc. Luckily, you can now carry the history of the world and beyond - on your iPhone! Combine that with a power generator, and you'll still hold the history of the world!
