Building a Better Webserver

Posted by michael
from the slashdot-will-beat-a-path-to-your-door dept.
msolnik writes: "The guys over at Aces' Hardware have put up a new article going over the basics, and not-so-basics, of building a new server. This is very informative; I think everyone should devote 5 minutes and a can of Dr Pepper to this article."

  • by Delrin (98403) on Tuesday November 27, 2001 @05:46PM (#2621429) Journal
    But don't.
    Actually a very interesting article. To be honest, in my 1 year of building webserver applications, I haven't gone through a process like this once. Usually we make a rough guess about how the application has performed (or, more usually, underperformed) on existing servers, and just scale by a percentage. As you can imagine, this is hardly realistic. Thanks for the read!
  • by Archie Binnie (174447) on Tuesday November 27, 2001 @05:46PM (#2621431) Homepage
    Let's see exactly how long their lovely new webserver stands up to a slashdotting... :)

    (Maybe they just sent this so they could test it? Plan.)
    • *bork* Looks like they should have gotten a bigger machine after all, got halfway through it here...
      • doh, I wish I learned how to copy and paste properly

        Regarding load on their server [aceshardware.com]
        Here you can see how the total number of threads varies with the workload throughout the day. The maximum number of concurrent threads shown here is 117. The average is around 90 to 100 until later on in the day when the thread count drops down into the 80s and then finally around 75 by midnight. Resident memory size for the web application (the entire Java process) remained at 260 MB for the entire day. In fact, it has never really grown far beyond this size. The size remains relatively static because the caches are a fixed size and the applications do not grow over time (i.e. no memory leaks). The database acquires a fixed amount of memory upon initialization, and it is configured to use 512 MB. Currently, the server reports a total of roughly 1 GB of free, unallocated memory. So, we have quite a bit of room to grow with our new system.

        I really wanna see today's numbers...

    • It is 15 minutes after the article post and the site is dead. I got to the part about calculating how much RAM was required per visitor and multiplying by the expected number of visitors.

      Maybe they need to adjust their constants. :-)

      It is those d*mn modem users that drive up the RAM use. They stay connected longer on their GET and tie up resources longer.
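A rough sketch of why slow clients drive up RAM use, assuming Little's law (average connections in flight = arrival rate x time each request stays open). The function names and the figures used in the test (2 requests/second, 2 MB per connection) are made up for illustration and are not taken from the article.

```python
def concurrent_connections(requests_per_second, seconds_per_request):
    """Little's law: average requests in flight = arrival rate x time in system."""
    return requests_per_second * seconds_per_request


def ram_needed_mb(requests_per_second, seconds_per_request, mb_per_connection):
    """RAM tied up by in-flight requests, given a per-connection footprint."""
    in_flight = concurrent_connections(requests_per_second, seconds_per_request)
    return in_flight * mb_per_connection


if __name__ == "__main__":
    # Hypothetical load: 2 requests/second, 2 MB of server memory per connection.
    # A modem user holding a GET open for 40 s versus a fast client done in 4 s.
    print("slow clients:", ram_needed_mb(2, 40, 2), "MB")
    print("fast clients:", ram_needed_mb(2, 4, 2), "MB")
```

Same request rate, ten times the hold time, ten times the memory: that is the whole effect in one multiplication.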
      • I'm thinking that their "better webserver" isn't so hot, considering the "connection refused" messages I'm getting.
      • by john@iastate.edu (113202) on Tuesday November 27, 2001 @06:20PM (#2621662) Homepage
        Well, lots of big iron gets crushed by the slashdot effect too. This thing is running on a piddly little Sun, after all. And it was very responsive early.

        One thing that does seem to work against the onslaught is a throttling webserver [acme.com]. If you haven't got the bandwidth etc to serve a sudden onslaught of requests, probably the best thing to do is to just start 503'ing -- at least people get a quick message 'come back later' instead of just dead air.

        • Speaking as the maintainer of a site that is periodically slashdotted...

          Yes, a throttling server is a great idea. If you recognize that there will always be a load too high for you to handle (10 requests per minute for my site, yes minute, it is a physical device), then you must either decide to deal with the load or let the load crush your machine.

          Consider a typical web server. When it gets overloaded it slows down, each request takes longer to handle, there are more concurrent threads, overall efficiency drops, each request takes longer to handle.... welcome to the death spiral. (On my site-which-must-not-be-named-lest-it-be-slashdotted, everyone waiting in queue gets a periodic update; at a certain point the load of generating the updates swamps the machine. I have to limit the number of people in queue.)

          The key decision is to determine how many concurrent threads you can handle without sacrificing efficiency and then reject enough traffic to stay under that limit.

          This is where optimism comes in and bites you in the ass. You remember that every shunned connection is going to cost you money/fame/clicks whatever so you set the limit too high and melt down anyway.
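A minimal sketch of the "reject enough traffic to stay under the limit" idea, in Python. The class name ThrottledServer, the handle() method and the response text are hypothetical; a real server would wire the same check into its accept loop and send a proper HTTP 503 response.

```python
import threading


class ThrottledServer:
    """Sheds work beyond a fixed concurrency limit instead of queueing it."""

    def __init__(self, max_concurrent):
        # Assumption: the limit is the highest concurrency that does not hurt
        # efficiency, found by measurement rather than optimism.
        self._slots = threading.BoundedSemaphore(max_concurrent)

    def handle(self, work):
        # Non-blocking acquire: if no slot is free, refuse right away so the
        # client gets a quick "come back later" instead of dead air.
        if not self._slots.acquire(blocking=False):
            return 503, "Server busy, come back later"
        try:
            return 200, work()
        finally:
            self._slots.release()


if __name__ == "__main__":
    server = ThrottledServer(max_concurrent=1)
    # The outer request holds the only slot, so the nested one is refused.
    print(server.handle(lambda: server.handle(lambda: "inner")))
```

The point of the non-blocking acquire is that a refused request costs almost nothing, so the accepted ones keep running at full efficiency.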
        • If you can't live without Apache, there's always mod_bandwidth [cohprog.com].

          Not quite as elegant a solution, but it's nice for preventing your web server from taking all of your bandwidth (if, say, you run it off your cable modem, and wish to continue gaming...).
    • Unfortunately it doesn't seem to have stood up long enough to read the article. I suppose I'd better put my can back in the fridge...

      Sigh, and I was hoping I could use it to justify a quad Xeon server with 4GB of RAM as the next web app's server on our 8 user LAN....

    • We showed them...

      Looks like they might need to revisit their approach to building a better webserver.

      It is hard to say if we have maxed their bandwidth or maybe given the server a real life lesson in load.

      I suspect the article might get a rewrite ;)

      Unfortunately I wasn't able to get past the first page, but methinks the next article would introduce additional servers and some load balancing.

      [Slashdot Seal Of Death]
    • it's already slashdotted...

      sweet and bitter irony, isn't it?
    • Re:New Webserver? (Score:2, Informative)

      by benspionage (265723)
      An excellent reply [aceshardware.com] to the "they've been slashdotted" comment was given in the forum for this article. I should note that the site is responding fine now.


      Most people are unlikely/too lazy to follow the comment link above so I've repeated the first part of the response below:


      Yes, I read quite a few snide comments on slashdot about this server not being able to handle the load and ridiculing the article because of it. Frankly these people don't have a clue. It would be pointless in the extreme to operate a server 24/7 to handle the kind of loads the "slashdot effect" generates unless those kind of loads are the norm... A well tuned properly designed website/server should be equipped to handle 2 to 3 times its _expected_ peak traffic rate (which seems about what this server can do as it's tuned now). It is a waste of money and hardware trying to do any more than that imho as 99.9% of the time you would have a lot of $$$ sitting totally idle in the form of hardware and bandwidth. Being a server admin myself, I think the guys here did an EXCELLENT job explaining what is involved with hosting a fairly high-traffic website efficiently. And I also think the server/programming for this site is well designed and does its job admirably (better than 99% of the websites on the internet at least). They did an excellent job of explaining the pros and cons of different approaches to dynamic sites. Knocking them for getting nailed by slashdot isn't exactly productive, I would like to see ANY site which uses database generated content on a single thousand dollar server handle that kind of load (my guess is > 1000 requests per second at its peak from what I have heard from others who have been slashdotted)... Caching can only do so much :)

      [Rest of comment follows, see link above for full version]

    • They took a "powerful" desktop and made it the new server. When you look at the Sun site [sun.com] you can see this is a desktop that is not very upgradable; they have already taken it almost to the top.

      Memory: max 2 GB, 2 GB used (a 4x increase over the old memory). That may sound like a lot, but remember "we will never need more than 640 KB", and already 50% is used and "not growing."
      Processor: 500 MHz, now 25% used. But no extra processors are possible. (I know 1 Sun MHz != 1 Athlon MHz, but 25% load is far from near idle.)

      They can work around these limitations by adding an extra server and moving some functions onto it, but they started out by saying that in their case an extra server would be an extra point of failure.

      In other words, if they keep developing their site we will see an article like this again in one or two years. I guess that one will be about load balancing on cheap (Sun or x86) hardware.

      I am a little bit surprised they didn't use x86 hardware, since that is what they review all the time. They looked further than you would expect.
  • by Typingsux (65623) on Tuesday November 27, 2001 @05:48PM (#2621439)
    Get IIS and no code red patch.
    Instant traffic to your site, no advertising!

  • going on 5 minutes after the initial posting, and still no slashdotting...

    Seems to me that these guys might be onto something here...or maybe they just really know what they're talking about...
  • Well it did load quickly.

    It was a good read and I wish we could do something vaguely similar with our web servers here. Not that we get the server load to demand such improvements at the moment, but I figure it's best to spend the money early on, get a good setup going that can handle high volumes, that way you're not caught with your pants down when things take off for you. It's unfortunate bean counters never think this way.

    Of course I don't think I'll be taking this approach at home - even if it would be fun to have a Sun Blade sitting in the living room purring along answering the 1 or 2 web hits we get a day.
    • by DavidJA (323792) on Tuesday November 27, 2001 @06:05PM (#2621545)

      but I figure it's best to spend the money early on, get a good setup going that can handle high volumes

      Throwing money at the problem is exactly the WRONG approach. You need to start by spending time PLANNING and RESEARCHING the best way to do things.

      For example, say you are setting up a dynamic site like /., which is serving 100 pages/second. It obviously needs to be dynamic, so you need a database to store all the comments in.

      There are two ways to do this. One is to serve content straight out of the database, but this means that for every page you serve up, there needs to be a database query. (The database queries are the expensive part in terms of the time it takes to serve a page.) The other way would be to serve the articles as static pages which are generated every minute or so by a process on the database and pushed down to the web server, which serves these up as static pages.

      The advantage of this is that instead of 100 database queries per second, you end up with maybe 10 queries per minute to populate the static pages. Sure, your site is no longer 100% dynamic, but it is a whole lot faster, and you have saved thousands of dollars to boot!

      This is just one small off-the-top-of-my-head example of where PLANNING should come way before spending any money.
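A back-of-the-envelope sketch of the comparison above, taking the 100 pages/second figure from the comment and assuming one query per dynamic page. The function names, the count of 10 distinct pages and the 60-second regeneration interval are illustrative assumptions.

```python
def dynamic_queries_per_minute(pages_per_second, queries_per_page=1):
    """Every page view hits the database."""
    return pages_per_second * 60 * queries_per_page


def static_queries_per_minute(distinct_pages, regen_interval_seconds, queries_per_page=1):
    """Only the periodic regeneration job hits the database."""
    regenerations_per_minute = 60 / regen_interval_seconds
    return distinct_pages * regenerations_per_minute * queries_per_page


if __name__ == "__main__":
    # 100 page views per second, each needing one query.
    print("dynamic:", dynamic_queries_per_minute(100), "queries/minute")
    # Hypothetical: 10 distinct pages, each rebuilt once a minute.
    print("static: ", static_queries_per_minute(10, 60), "queries/minute")
```

Note that the static cost depends on how many distinct pages exist and how often they are rebuilt, not on how many people are reading them.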

    • by NerveGas (168686) on Tuesday November 27, 2001 @06:28PM (#2621703)
      <but I figure it's best to spend the money early on, get a good setup going that can handle high volumes>

      Actually, that's a terribly wasteful way to go. If you work on an easily-scalable infrastructure, then you can pretty much purchase capacity as it's needed, which not only frees up capital for a longer time, you end up spending a lot less, as the price of computers is always dropping, and the performance is always going up.

      steve
  • Offtopic somewhat, but some people are halfway there to upgrading the XBox to a decent Linux server.

    Note this article [icrontic.com] for information on connecting USB keyboards and mice, what shorting the debug pins does on the keyboard, and replacing that measly ATA33 hard drive cable with an ATA100 (surprise, surprise: it actually increased performance :) ).

    • Uh, that's meant to say shorting the debug pins on the *motherboard*. Although, if you find a keyboard with debug pins, let me know. I'd be curious to see what happens...

      *Congratulations screen: you can now type in Swedish Chef! Bork, Bork, Bork!

    • The keyboard and mouse didn't talk to the CPU at all, just powered on.

      Anyway, it's not halfway there, more like .02% there. No idea yet on how to run random bits of code on it, Microsoft obviously will have put all sorts of hurdles there. And then you have to reverse engineer stuff for all the libraries and OS(since they're statically linked) and figure out how to talk to hardware for all the little differences from normal PCs they've added. Long way to go.

    • 1. Start off with deciding what you want on it.
    • 2. Delete, destroy, burn, remove, obliterate anything and everything that isn't on that list.
    • 3. Since the information is likely on the Internet already, save yourself the time and effort, and burn the list, too.
    • 4. If you insist on going ahead, the order of precedence is: Speed of response, usability, readability, quality, accuracy, honesty, believability. People will believe anything that's delivered to them quickly. Just ask the Afghans.
    • 5. Trust nobody and nothing. Distribute widely. Keep your laser handy. Alpha Complex will be completed shortly. Please wait.
  • If you have a slashdot-proof server, that's a better web server.

  • Good article, but... (Score:5, Interesting)

    by Computer! (412422) on Tuesday November 27, 2001 @05:56PM (#2621493) Homepage Journal
    Microsoft [microsoft.com] has written several white papers [microsoft.com] of this sort already. Of course, they're Microsoft, so that means I can kiss my +2 bonus goodbye. Seriously, though.
  • Why would they go with the desktop version when they want a rackmount server? You can get the Netra X1 for 50$ less and it comes with the exact same hardware but in rackmount case. Check it here [sun.com].
  • by nll8802 (536577)
    I usually get a decent speed processor PIII 800, a really fast scsi drive or raid (Depending on the site), 512MB of ram ( Or more depending on the site ), and a copy of Slackware.
    • Meself, I never hesitate to use multi-processor machines. While the increase in capacity is always nice, it's the increase in responsiveness under load that really makes them shine.

      steve
  • who couldn't figure out at first why Ace Hardware [acehardware.com] put up information about a new webserver?
  • Update (Score:2, Redundant)

    by Karma 50 (538274)
    It would be interesting to see an update from them tomorrow with the same graphs as on the Servers in Practice [aceshardware.com] page with today's data.

    Their site is slowing down under the /. load.
  • by ParisTG (106686)
    Should we really be taking advice on building a web server from someone whose server crashes under /.'s load?
  • "I think everyone should devote 5 minutes and a can of Dr Pepper to this article."
    5 minutes my ass. Now that me and 250,000 other people are all trying to access their server within the same 15 minute timeframe, it's going to be a good 5 minute wait per page. Lol!
  • by GCP (122438) on Tuesday November 27, 2001 @06:06PM (#2621556)
    ...the one with a lot of mirrors.
  • It really feels like they only made a token gesture towards using an x86 box. To be honest my next box will probably be a Sun Blade too (but hey, I'm gonna use it for a desktop ;) Mind you, this was a really good article, but I think they should have said that they were just more comfortable with SPARC and that was that. There was another good article on a similar subject not long ago, on Anandtech's new server [anandtech.com]. For that article they benchmarked different configs (mobo, proc, etc) then did a price/performance comparison, as far as I recall. And they chose AMD ;)
    • If you compare the straight performance of an X86 CPU to a different architecture, it can come out very good, average, or very poor, depending on what you're doing with it. However, when you compare the cost vs. performance, they really do start to shine. I, myself, pitted a $13,000 quad-Xeon against a $25,000 dual-Alpha for database work, and the Xeon handily bested the Alpha. Had the work been pure number crunching, though, the results probably would have been backwards.

      A lot of the extra money that you spend on "big iron" hardware is spent getting tremendous amounts of I/O to the various CPUs. For something like a database server, where your app pretty much has to run on a single machine, that's great. For something as simple as web-serving, which is extremely easy to cluster, you're wasting your money. Ten $2,000 Intel-based machines will deal out far more than one $20,000 Sun/IBM/Alpha.

      In fact, when one company was doing an embedded solution based on the Strong-ARM chips, just for fun, they used ten of them to dish out over a million web pages per *minute* - and that was with StrongARMs.

      steve
  • Just a can (Score:3, Funny)

    by VFVTHUNTER (66253) on Tuesday November 27, 2001 @06:08PM (#2621572) Homepage
    of Dr. Pepper? I expect to go through a whole case waiting for the slashdot effect to wear off...
  • I was enjoying the article until it /.'ed, and I couldn't get anymore pages to load.

    Therein, a stress test to the folks at Ace's Hardware.
  • The moral of this story:
    If your website is dynamically generated from a database, and your name isn't Amazon.com, don't let Slashdot link to you.
    A single $999 box isn't going to stand up to Slashdot, unless every page is static.
  • I'd call this buying a web server rather than building one...
  • /.ed Well, so much for THAT idea . . .
  • by green pizza (159161) on Tuesday November 27, 2001 @06:14PM (#2621623) Homepage
    Is it just me, or do most folks confuse server capacity with bandwidth? If a popular website only has a 9 Mbps pipe to the Internet, it doesn't matter how many Crays they have running their webserver farm, they're only going to be able to churn out 9 Mbps (minus overhead). Granted that the converse is possible... gobs of bandwidth, but a slow server... but I would imagine that bandwidth is the limiting factor of at least 99% of websites.
    • ...it doesn't matter how many Crays they have running their webserver farm...


      You've obviously never worked with Java.

    • The server being the bottleneck does sometimes happen, particularly with high-volume websites. If hits are coming in faster than the server can process them, they queue up. CPU usage skyrockets, free memory shrinks, the server starts to thrash, and it often spirals down into a state where it refuses new connections until it's processed the ones it has. This is why you often see HTTP 1.1 / Server Too Busy errors - the server has been swamped. Not the link.

      Tuning a web server is also a bit of an art - most default settings don't take full advantage of the hardware, they throw out Too Busy messages before the CPU/memory is fully utilised. Parameters such as queues and worker threads need to be increased to accept more connections. Of course, this can lead to overtuning, where the parameters are set too ambitiously, the server bites off more than it can chew, and chokes.

      Modern web servers on modern hardware can serve a frightening number of flat html pages per second - the real problems stem from poorly optimised dynamic code, usually to do with databases. Sure it's cute to have the site navigation automatically generate from a database query, but it's insanely inefficient. It'll work great under normal light loads, but when you get linked from Slashdot, you're dead.
      • Yup. And you'd be horrified to learn how many enterprise websites run with MS Access as the back end. As I used to tell clients, "Would you use Excel to do corporate accounting? No? Then why Access for corporate databases? They come in the same package....." Here's an example, real basic. You've got a table with a list of states. You have a dropdown on your webpage which lists those states. You'd be frightened (yet again) to know how many 'application programmers' will hit the database every time they want that dropdown to build. Do something more efficient. Depends on your app server of choice, but one example is to read the thing from a global variable. If the variable is a) empty or b) too old then you load the variable from the database. And that's just a real basic example.
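A minimal sketch of the "global variable that is reloaded when empty or too old" approach, in Python. CachedLookup, its parameters and the load_states() loader are hypothetical names; the loader stands in for the real database query, and the class is not thread-safe as written.

```python
import time


class CachedLookup:
    """Holds a rarely-changing value and reloads it only when empty or stale."""

    def __init__(self, loader, max_age_seconds, clock=time.monotonic):
        self._loader = loader            # function that actually hits the database
        self._max_age = max_age_seconds
        self._clock = clock              # injectable so the behaviour can be tested
        self._value = None
        self._loaded_at = None           # None means "empty"

    def get(self):
        now = self._clock()
        if self._loaded_at is None or now - self._loaded_at > self._max_age:
            self._value = self._loader()
            self._loaded_at = now
        return self._value


if __name__ == "__main__":
    def load_states():
        # Stand-in for something like "SELECT name FROM states ORDER BY name".
        print("hitting the database")
        return ["Iowa", "New York", "Texas"]

    states = CachedLookup(load_states, max_age_seconds=3600)
    states.get()  # first call loads from the "database"
    states.get()  # second call is served from memory
```

Every dropdown build after the first is then a memory read until the hour is up.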
      • I agree. I have come across this many times, including a rather extreme example where I reduced a stock import time for a book retailer from 12 hrs to under 30 seconds by (a) re-doing the tables and throwing away useless joins and (b) rewriting the Perl in C using better algorithms, e.g. pre-building and doing one insert instead of doing a two-pass insert then update. Another example: when displaying results for about 20 books, instead of doing for (i=1...20) select and running 20 queries, do select where in (list), which gets the same results in one query. Designing web applications requires more than just the ability to code; you really need to know the architecture of the whole system and how the parts interact.

        Phillip.
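The "20 queries versus one select where in (list)" point can be sketched with Python's built-in sqlite3 module. The books table, its columns and the helper names are invented for the example; the original poster's schema and database are not known.

```python
import sqlite3


def make_demo_db():
    """In-memory table standing in for the retailer's book catalogue."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE books (id INTEGER PRIMARY KEY, title TEXT)")
    conn.executemany(
        "INSERT INTO books VALUES (?, ?)",
        [(i, f"Book {i}") for i in range(1, 51)],
    )
    return conn


def fetch_books_one_by_one(conn, ids):
    """The slow pattern: one SELECT, and one round trip, per id."""
    rows = []
    for book_id in ids:
        row = conn.execute(
            "SELECT id, title FROM books WHERE id = ?", (book_id,)
        ).fetchone()
        if row is not None:
            rows.append(row)
    return rows


def fetch_books_batched(conn, ids):
    """The fast pattern: a single SELECT ... WHERE id IN (...)."""
    if not ids:
        return []
    placeholders = ",".join("?" for _ in ids)
    query = f"SELECT id, title FROM books WHERE id IN ({placeholders}) ORDER BY id"
    return conn.execute(query, list(ids)).fetchall()


if __name__ == "__main__":
    conn = make_demo_db()
    wanted = list(range(1, 21))
    print(fetch_books_batched(conn, wanted) == fetch_books_one_by_one(conn, wanted))
```

Against an in-memory database the difference is small; over a network connection to a real database server each saved round trip counts.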
  • by Lumpish Scholar (17107) on Tuesday November 27, 2001 @06:16PM (#2621637) Homepage Journal
    Consider a user with a typical analog modem that has an average maximum downstream throughput of, say, 5 KB/s. If this user is trying to download the general message board index page, about 200 KB in size (rather small by today's standards), it will require a solid 40 seconds to complete this single download.... To maximize the efficiency of the network itself, we can compress the output stream and thus, compress the site. HTML is often very repetitive, so it's not impossible to reach a very high compression ratio. The 200 KB request mentioned above required 40 seconds of sustained transfer on a 5 KB/s link. If that 200 KB request can be compressed to 15 KB, it will require only 3 seconds of transfer time.

    Except that 56 Kbps modems get 5 KBps throughput by compressing the data! If the client and server compress, the modems won't be able to; the net effect is lots of extra work on the server side, and probably no increased throughput for the modem user.

    The server might or might not see a decrease in latency, and in the number of sockets needed simultaneously; it depends on how much it can "stuff" the intermediate "pipes". The server will see an overall decrease in bandwidth needed to serve all the pages.

    Ironically, broadband customers (who presumably don't have any compression between their clients and Internet servers) will see pages load faster. (And the poor cable modem providers from the previous story will be happy.)
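For reference, the article's arithmetic (200 KB at 5 KB/s = 40 seconds, 15 KB at 5 KB/s = 3 seconds) and a quick way to check how well a given page compresses, sketched in Python. The sample page is synthetic and far more repetitive than real HTML, so its ratio is an upper bound and not a prediction; the function names are made up.

```python
import gzip


def transfer_seconds(size_kb, link_kb_per_s):
    """Time to move size_kb over a link sustaining link_kb_per_s."""
    return size_kb / link_kb_per_s


def gzip_ratio(payload):
    """Compressed size divided by original size (smaller is better)."""
    return len(gzip.compress(payload)) / len(payload)


# Synthetic, highly repetitive stand-in for a message board index page.
row = b"<tr><td class='msg'>Re: Building a Better Webserver</td></tr>\n"
page = b"<table>\n" + row * 3000 + b"</table>\n"

if __name__ == "__main__":
    print("uncompressed:", transfer_seconds(200, 5), "s")
    print("compressed:  ", transfer_seconds(15, 5), "s")
    print("gzip ratio on the synthetic page:", round(gzip_ratio(page), 4))
```

Whether the modem's own compression would have achieved a similar saving is exactly what the replies to this comment argue about; the sketch only measures the server-side half.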
    • I am afraid you are wrong, the modems get 5 KB/s of raw data, not counting compression. I can download zipped files at over 5 KB/s with a dialup modem...

      mod_gzip is your friend.
      • No he's right. You are right that the 5KB is raw data: that's what he is saying. But the difference between whether it's compressed by the server or compressed by the modem is abstract: there are cases where one is better than the other. But, for the most part, compressing it on the server is slightly better than letting it compress downstream, plus it can increase your bandwidth (which is what the article was talking about) and the speed of transit before it gets to the modem-to-modem link, so it is worth doing.
    • by victim (30647) on Tuesday November 27, 2001 @06:51PM (#2621819)
      One other factor to consider is that the gzip transfer encoding compresses much better than the algorithm in the modem. Part of this is the algorithm with its larger dictionary size, the other part is the `pure' data stream being fed to it. It is just the html, not the html interspersed with ppp, ip, and tcp headers.
      • It's not just the size of the dictionary, but with a data stream the dictionary has to be dynamically built and adapted. With a static file the whole of the data can be analysed at once for the optimum dictionary, which can then be appended to the compressed data.

        Phillip.
      • You also have to consider that any amount of compression will help.

        Undernet coder-com was working on an idea to add to ircu.d that would (on multi proc machines) use one processor for the irc functions and the other, which was usually relegated to everyday mundane functions (running ssh server, typical processes) to compress data going from server to server to reduce the lag time in some of the long jumps like *.eu to *.us. This was in the wake of the 3Gbps DDoS attacks on the system, causing several servers to delink. (we miss you irc2.att.net)
        So compression server side has lots of uses, not just for modem lusers. When the vast majority of what you're transferring is conversational text, compression works wonders

        ~z
    • Occasionally you'll find a web page that's got several hundred KB of actual text, but it's usually not that way - most of the bits are decorative GIFs or JPGs which your modem won't compress. So you've got to pay attention to it upfront - use image formats that are already compressed (compare GIFs, JPGs, newer formats like DjVu, different resolutions), and pay attention to how much you want to clutter up the pages with them. Are they fundamental content? Nice but could be lighter weight? Unnecessary clutter when you could use a nice solid-color background instead? How often do you reuse them? Can you cache them effectively, either in the user's browser or ISP, or does the browser think each one of those customized bullets is a different dynamically generated file that it needs to download?
  • the title lead me to believe that it might be an
    article discussing how to design better webserver
    software -- something which would have been
    very interesting since it has been ages since I
    saw a fresh take on that.


    instead: another article on piecing together hardware. *sigh*

  • Why Sun? (Score:3, Insightful)

    by hughk (248126) on Tuesday November 27, 2001 @06:19PM (#2621658) Journal
    Having a quick dig through the article at the far from lightning speed that a /.ed site runs at, I am still not absolutely clear why they stuck with SPARC architecture.


    SPARCs come from Sun, everybody makes a PC - so guess which is cheaper? We see some reasons why they went for the Blade (a nice machine, but rather more expensive than a couple of PCs).


    Please get this right, I'm no x86 fan, but I love the competition going through the line from the processors, chip-sets, mother-boards, etc. This has got to ensure that unless you really want the 2GHz Pentium 4, you have plenty of choice.


    As for reliability, I don't know the Blade, but the SPARC 20s used to give some headaches over their internal construction. It always seemed a little complicated with the daughter boards and they seemed to lose contact after machines were moved around.

    • Re:Why Sun? (Score:5, Insightful)

      by nuetrino (525207) on Tuesday November 27, 2001 @06:51PM (#2621813)
      Because you have absolutely no clue as to why they chose the Blade, one must assume you did not do much digging before wasting bandwidth with your post. The article went into great depth explaining why they bought the Blade. The high points were power consumption, lack of a serious and fully supported OS, lack of a server class PC in their price range, the $1000 price tag, and the fact they already had the software written to run on Solaris. They admitted that they could get an x86 for a little less, but it wasn't enough to make good business sense.

      I am amazed at how people buy into the myth of cheap PCs. Yes, if you are technically oriented and are not running critical applications, a cheap PC will be ok. On the other hand, I have been involved with several enterprises in which my employer insisted on going with cheap PCs at the expense of short- and long-term productivity. One certainly cannot get a server class PC for $500, and there are few if any available for $1000. I would not say that a Blade would make a good office machine, but it seems to be a good choice for a server.

      • Re:Why Sun? (Score:2, Informative)

        by Anonymous Coward
        I am amazed at how people buy into the myth of cheap PCs...

        You mean people like Google [google.com] who run their highly-regarded search engine/translator [google.com]/image indexer [google.com]/Usenet archive [google.com] on a server farm of 8,000 inexpensive [internetweek.com] PCs [google.com] with off-the-shelf Maxtor 80GB IDE HDs?

      • Well, one of the big arguments against placing a "cheap PC" into a server role is that the hardware isn't built to the same standards as higher-priced server-grade hardware. Now, it would seem to me that the Blade, by virtue of being the low-end desktop model in Sun's lineup, would be the victim of the same cost-cutting, and itself not be adequate for webserver duty.

        Of course, if you ask me, the article was just an x86 hardware-review site's attempt to justify using non-x86 hardware on their new server.
    • http://www.aceshardware.com/read.jsp?id=45000243 [aceshardware.com]
      On the x86 side of things, we found that much of the inexpensive x86 hardware is targetted towards the home market and, accordingly, was not really suitable to the task at hand. Intel-based servers offered by OEMs did not offer much of a price break over our SPARC options, if any at all. As for the DIY market, there was a premium associated with the more high-end motherboards and other components we desired. Commodity hardware prices aside, a platform change would incur its own costs in the form of investments in software that would have to be replaced.
      I think the software development costs (they'd already done a lot of work for a platform they knew, apparently with some specific third party tools not available for Solaris/x86) were their biggest consideration. (They also mention "sparse hardware support" for Solaris/x86.)

      Oddly, their OS choices for x86 seemed to be Windows 9x and Solaris; no mention of NT/2000/XP, let alone Linux or *BSD.
    • Re:Why Sun? (Score:2, Interesting)

      by Skuld-Chan (302449)
      I use two Suns at home - one is an SS10 (my firewall, running 2.4.14) and the other is an SS20 (my file/web/webcam server) - it does samba, nfs and ftp for files - and it has a videopix digitizer for webcam stuff.

      Why did I choose SPARC? Well, it's a tad quieter than an x86 box, smaller, and (and this is the point) it uses a lot less power. The SS10 ships with a 65 watt PS (at least mine did). Considering you can get these things for 25~65 dollars, they are a bargain (I paid $25 for the SS10 and $65 for the SS20). Anyhoo, I kept my SS10 running for 30 minutes on a 300 VA UPS when the power went out last week - I doubt it's drawing more than 25 watts peak. The software is still free since it runs Debian Linux well (and you can get Solaris for it for free too).

      Plus - I have the added advantage that for some reason Sun equipment is like a geek's dream - they look kinda cool sitting on the table next to the cable connection. Everyone who has ever come by has to comment on them somehow - either "what's that" - or "wow - you have one of those?". Don't get me wrong - they're slow (the SS10 has a cacheless microSPARC in it), but the SS10 seems to keep up with the 4 megabit cable connection okay.
    • Remember that acehardware was first made in 1997. Gnu and Linux were viewed as too experimental and not proven for use in the server space. Also x86 server hardware was only a few years old and not real mature. Only Novell was used with it at the time. Many in IT viewed x86 as unreliable.


      So Ace's probably had to make a decision: A.) use this new Windows NT with IIS 2.0 and make it work with questionable-quality x86 hardware, B.) use a standard UNIX variant on standard UNIX hardware and purchase the software tools to make the site, or C.) have a third-party company make the decision for them (which would obviously choose UNIX).



      Now in 2001 the x86 market has changed. It's now the opposite of the past. In the old days it was hard to find any UNIX for x86 besides Skunkware. Anyway, they have invested in Solaris, and I assume they do not own much of the software if a third party wrote it or it was Sun-specific. I assume it would be relatively easy to port it to Linux, but why go through the effort? This is why they went with Sun. Also, even in 1997 the old Sun workstation was loaded with features you still could not get with a PC at the same price. I do agree that a dual-processor x86 system might make better sense than a single SPARC system. SMP machines appear to work so fluidly under heavy loads. But I guess they had their reasons. I would have spent more bucks for a dual SPARC system if I were them, but they are a lot more technical than I am.



      Also, Pentium 4s are unreliable and have high-latency RDRAM. The latency wouldn't affect regular work performance running PC apps or games, but on a server it will cripple it. Remember that it is not just the speed of the RAM that matters but also its responsiveness when a lot of users hit it with requests. The Pentium 4s also get really hot, which is another reason why they chose a Sun. If a fan dies and the board and CPU burn, then shit hits the fan. A few days of downtime could cost Ace's most of its customers. That's really bad, and reliability is important.

    • Re:Why Sun? (Score:2, Informative)

      by Zog (12506)
      One of the great things about sparcs is their performance under load - they're a *lot* better at running under high loads than your typical pc.

      About pc's having more competition, it's not a hard argument that the competition isn't really what it seems to be - most of the competition is in price and how fast Quake will play. If Intel's processor is a little bit slower than AMD's, the fact that it still goes into most OEM computers will keep Intel alive. If Sun does not stand up to the competition with their processors, motherboards, and other components, people will leave them for something better, and Sun will be down the hole. They *have* to be better to survive - there's not much forcing people to stick with them.

      They're also a lot more solid in their roots (Sun servers have been around forever, so they've had a lot of time to work on tweaking things and getting processors to work well for their applications), and Sun's support generally ranges from fairly good to downright amazing, from what I've heard (not that I've needed it).

      But in the end, it's a lot different from PC hardware, and it can sometimes take a bit of getting used to.
  • by green pizza (159161) on Tuesday November 27, 2001 @06:25PM (#2621688) Homepage
    The SPARCstation 20 was one heck of a great machine back in the day, especially for its size (a low profile pizzabox). The design was a lot like its older brother (the SPARCstation 10 from 1992)... that is, two MBUS slots (for up to 4 CPUs) and 4 SBUS slots (Sun Expansion cards, 25 MHz x 64 bit = ~ 200 MB/sec each, but 400 MB/sec bus total).

    I remember using a Sun evaluation model at Rice many years ago... the machine had two 150 MHz HyperSPARC processors (though 4 were available for more $$), a wide SCSI + fast ethernet card, two gfx cards for two monitors, and some sort of strange high speed serial card (for some oddball scanner, I think). Not to mention 512 MB of RAM, in 1994! The machine was a pretty decent powerhouse and sooo small! I sort of wish the concept had caught on, given how large modern workstations are in comparison. Heck, back then an SBUS card was about 1/3 the size of a modern 7" PCI card.

    Then there's the other end of the spectrum... one department bought a Silicon Graphics Indigo2 Extreme in 1993. The gfx cardset was three full size GIO-64 cards (64 bit @ 100 MHz = about 800 MB/sec), one of which had 8 dedicated ASICs for doing geometry alone. 384 MB of RAM on that beast. Pretty wild stuff for the desktop.
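
    The two bus figures quoted above are just clock rate times bus width. A minimal sketch of that arithmetic, assuming one full-width transfer per clock cycle (a theoretical peak, not a measured rate); the function name is made up for illustration:

```python
def bus_bandwidth_mb_per_s(clock_mhz: float, width_bits: int) -> float:
    """Theoretical peak in MB/s, assuming one full-width transfer per clock."""
    bytes_per_transfer = width_bits / 8
    # MHz (millions of cycles per second) * bytes per cycle = decimal MB/s
    return clock_mhz * bytes_per_transfer


# Figures as quoted in the comment: 25 MHz x 64 bit and 100 MHz x 64 bit.
sbus_slot = bus_bandwidth_mb_per_s(25, 64)
gio64 = bus_bandwidth_mb_per_s(100, 64)

print(f"SBUS slot: {sbus_slot:.0f} MB/sec")
print(f"GIO-64:    {gio64:.0f} MB/sec")
```

    Real sustained throughput on either bus would be lower once arbitration and protocol overhead are counted.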

    Ahh, technology. I love you!
  • by Fesh (112953)
    Ace's Hardware? Why do I see a nasty trademark violation in some poor webmaster's future?

    *sigh* Probably because we've seen enough of it in the past...
  • Confusing the issues (Score:4, Informative)

    by Alex Belits (437) on Tuesday November 27, 2001 @06:36PM (#2621741) Homepage

    In a part about databases and persistent connections they confuse the issues more than a bit. The real problem is not too many processes, which automatically makes threads look better, but the symmetry among processes -- any request should be possible to serve by every process, so all processes end up with database connections. This is a problem particular to Apache and Apachelike servers, not a fundamental issue with processes and threads.

    In my server (fhttpd [fhttpd.org]) I have used a completely different idea -- processes are still processes, however they can be specialized, and requests that don't run database-dependent scripts are directed to processes that don't have database connections, so reasonable performance is achieved if the webmaster defines different applications for different purposes. While I didn't post any updates to the server's source in the last two years (I was rather busy at the job that I am leaving now), even the published version 0.4.3, despite lacking the clustering and process management mechanism that I am working on now, performed well in situations where "lightweight" and "heavyweight" tasks were separated.
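
    The specialization idea can be sketched roughly as below. This is not fhttpd's code or configuration format; the pool names, the `DB_PREFIXES` rule and the `route` helper are all hypothetical, and the "pools" are plain objects standing in for groups of worker processes:

```python
from dataclasses import dataclass, field


@dataclass
class WorkerPool:
    """Stand-in for a group of worker processes of one kind."""
    name: str
    holds_db_connection: bool
    handled: list = field(default_factory=list)

    def serve(self, path: str) -> str:
        self.handled.append(path)
        return f"{self.name} served {path}"


# Hypothetical split: only one pool pays for database connections.
STATIC_POOL = WorkerPool("static", holds_db_connection=False)
DB_POOL = WorkerPool("db-scripts", holds_db_connection=True)

# Hypothetical rule saying which URLs run database-dependent scripts.
DB_PREFIXES = ("/forum/", "/search")


def route(path: str) -> WorkerPool:
    """Send a request to the pool specialized for it."""
    return DB_POOL if path.startswith(DB_PREFIXES) else STATIC_POOL


for p in ["/index.html", "/forum/read?id=1", "/images/logo.gif"]:
    print(route(p).serve(p))
```

    With symmetric workers every process would need its own database connection; here only the workers behind `DB_POOL` do.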

    • I use FastCGI and Apache. With FCGI the app itself actually persists and waits for connections. The webserver connects to the app and gets the stuff back.

      The webserver doesn't increase much in size, so having it serve static pages isn't a waste.

      Since the app itself persists, it can do what it wants with DB connections. The app can be threaded or nonthreaded.

      In these sorts of circumstances, looking into some sort of buffering would help a lot too - something that sucks the page from your servers at 100Mbps and trickles it down to some poor 9.6 dial-up user. That way the number of persistent DB/app connections doesn't really go up at your server, even when the number of persistent users connecting to your site does.
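
      To put rough numbers on the buffering point, here is a back-of-the-envelope sketch. The 30,000 byte page size is an assumption picked for illustration; the two link speeds are the ones mentioned above, and protocol overhead and latency are ignored:

```python
def transfer_seconds(size_bytes: int, link_bits_per_s: float) -> float:
    """Time to push size_bytes over a link, ignoring overhead and latency."""
    return size_bytes * 8 / link_bits_per_s


PAGE_BYTES = 30_000  # assumed page size, not a figure from the article

to_buffer = transfer_seconds(PAGE_BYTES, 100e6)  # app -> buffer at 100Mbps
to_modem = transfer_seconds(PAGE_BYTES, 9600)    # buffer -> 9.6k dial-up user

print(f"app busy for   {to_buffer * 1000:.1f} ms")
print(f"user waits for {to_modem:.0f} s")
```

      Without the buffer, the app (and any DB connection it holds) stays tied up for the slow transfer instead of the fast one.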
      • by Alex Belits (437)
        FastCGI is better than just a bunch of symmetric processes, however it has some serious flaws -- among them a poor security model for processes that run on other hosts (fhttpd reverses the logins: backends connect to the server, and those connections authenticate on the server), and a need to proxy the response through a server for processes that run locally (fhttpd passes a client's fd to the backend process).

        Other than that, FastCGI is a good idea.
  • by asn (4418)
    http://www.aceshardware.com/read.jsp?id=45000241

    500 Servlet Exception

    java.lang.NullPointerException
    at BenchView.SpecData.BuildCache.(BuildCache.java:96)
    at BenchView.SpecData.BuildCache.getCacheOb(BuildCache.java:82)
    at BenchView.SpecData.BuildCache.getLastModified(BuildCache.java:45)
    at BenchView.SpecData.BuildCache.getLastModifiedAgo(BuildCache.java:50)
    at _read__jsp._jspService(/site/sidebar_head.jsp:60)
    at com.caucho.jsp.JavaPage.service(JavaPage.java:87)
    at com.caucho.jsp.JavaPage.subservice(JavaPage.java:81)
    at com.caucho.jsp.Page.service(Page.java:474)
    at com.caucho.server.http.FilterChainPage.doFilter(FilterChainPage.java:166)
    at ToolKit.GZIPFilter.doFilter(GZIPFilter.java:22)
    at com.caucho.server.http.FilterChainFilter.doFilter(FilterChainFilter.java:87)
    at com.caucho.server.http.Invocation.service(Invocation.java:277)
    at com.caucho.server.http.CacheInvocation.service(CacheInvocation.java:129)
    at com.caucho.server.http.HttpRequest.handleRequest(HttpRequest.java:216)
    at com.caucho.server.http.HttpRequest.handleConnection(HttpRequest.java:158)
    at com.caucho.server.TcpConnection.run(TcpConnection.java:140)
    at java.lang.Thread.run(Thread.java:484)

    Resin 2.0.2 (built Mon Aug 27 16:52:49 PDT 2001)
  • smoke test (Score:2, Funny)

    by anasophist (539487)
    brag up new server
    kind soul links us from slashdot
    looks like we eat crow
  • by green pizza (159161) on Tuesday November 27, 2001 @07:37PM (#2622088) Homepage
    There are more factors than just CPU and bandwidth... like what's between the two. A new coworker recently told me of his major learning experiences in the mid 1990s running several popular news websites during the beginning of the web boom. One of the more popular sites he ran originally had a T1 routed through a Cisco 4000 router. Things worked great until he had an additional, load balancing T1 added for extra thruput and redundancy. Things didn't feel much faster; in fact, they were almost slower. After much investigation he learned that the router didn't have enough RAM or CPU to handle the packet shuffling that intelligent multihoming routing requires. A similar thing happened at a friend's company when they tried to run a T3 through their existing router. While the old Cisco had enough CPU and RAM in theory, its switching hardware and thruput couldn't handle the full number of packets the T3 was providing thru the shiny new HSSI high speed serial card.

    Now, I realize modern hardware (Cisco 3660 and 7x00 series, and pretty much any Juniper) can route several T3s (at 45mbps each direction) worth of data, but older routers and minimally configed routers do exist.

    There are MANY bottlenecks in hosting a website. Server daemon, CPU, router, routing and filtering methods, latency and hops between server and internet backbones, overall bandwidth thruput, and much more.

    It's not as simple as "lame server, overloaded CPU, should have installed distro version x+1".
  • Seems to be running a tad slow ATM; getting timeouts on www.aceshardware.com...

    Mebbe they really needed a v880 or summat before they started getting posted on /. :)

  • it's the BANDWIDTH (Score:5, Informative)

    by green pizza (159161) on Tuesday November 27, 2001 @08:00PM (#2622183) Homepage
    If you haven't noticed by now, Ace's Hardware has a neat little indicator on each page that shows the processing time and queue time it spent getting to you (very bottom left-hand corner of each page). Most are about 74ms - 112ms for me. This, plus the result of some pings and traceroutes, leads me to believe they're heavily BANDWIDTH bound right now, not CPU bound. I do hope Ace's puts up a summary of the Slashdot effect as well as some other data for us to pore over. Some MRTG router graphs of the bandwidth usage would be *really* nice, too.
    • As I go through the article, I see 90843ms and 117882ms, so I fear you are mistaken. Darn.

      Almost every server I've ever seen using JSP is dog slow. They have what look like very nice reasons for using it, but it sure doesn't look like they quite work out in practice.

      Anyone know why?

      D
  • The article mentions that it rebuilt all its web applications in Java as opposed to PHP, and it seems the main reason they did this was to go from multiprocess to multithreaded. However, PHP can be compiled as a shared object (for apache, or as an ISAPI extension for windows), and does do multithreading, database connection pooling, and all those other goodies. It seems they had been using PHP as a CGI...hrmmm. Not a good idea.
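
    For readers unfamiliar with the term, connection pooling in a multithreaded server looks roughly like the sketch below. It is neither PHP's mechanism nor the article's Java code; `ConnectionPool`, `fake_connect` and `handle_request` are hypothetical names, and the "connections" are dummy objects rather than real database handles:

```python
import queue
import threading


class ConnectionPool:
    """Fixed set of connections shared by many request threads."""

    def __init__(self, size, connect):
        self._idle = queue.Queue()
        for _ in range(size):
            self._idle.put(connect())

    def acquire(self):
        return self._idle.get()  # blocks until a connection is free

    def release(self, conn):
        self._idle.put(conn)


created = []


def fake_connect():
    """Stand-in for opening a real database connection."""
    conn = object()
    created.append(conn)
    return conn


pool = ConnectionPool(size=3, connect=fake_connect)
results = [None] * 20


def handle_request(i):
    conn = pool.acquire()
    try:
        results[i] = id(conn)  # a real handler would run a query here
    finally:
        pool.release(conn)


threads = [threading.Thread(target=handle_request, args=(i,)) for i in range(20)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(f"{len(threads)} requests shared {len(created)} connections")
```

    Twenty request threads get by with three connections because each one holds a connection only for the duration of its query.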

    Speed shouldn't be the reason you switch to Java. If anything, I've found that PHP has been faster for simple web applications and page serving (and loads faster to develop applications with), while Java stands out as being more robust and stable.

  • by shadoelord (163710) on Tuesday November 27, 2001 @11:53PM (#2623112) Homepage
    quote:
    This means an Apache web server using keepalives will need to have more child processes running than connections. Depending upon the configuration and the amount of traffic, this can result in a process pool that is significantly larger than the total number of concurrent connections. In fact, many large sites even go so far as to disable keepalives on Apache simply because all the blocked processes consume too much memory.

    ::end quote::

    Let's see, has anyone here heard of COW (copy on write)? Linux uses this idea to save time on fork'd child processes: they get the ::same:: address space, with the shared pages marked read-only, and only get new memory when they try to write to a page (i.e. they can run and run and run, but once they try to modify their memory they get a new page with the contents copied). It saves time and memory.

    The only setback is when a process forks a child, its current time slice is cut in half with half given to the child, so the main proc will run aground if too many requests come in and the server has more processes to worry about. :(
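
    A small sketch of what fork looks like from the program's side, assuming a Unix-like system (`os.fork` is not available on Windows). It only shows the visible effect, namely that a child's write never reaches the parent; the page sharing itself happens inside the kernel and can't be observed from this code. `fork_demo` is a made-up name:

```python
import os


def fork_demo():
    """Fork, let the child modify its copy, and report both views."""
    data = bytearray(b"parent")
    read_fd, write_fd = os.pipe()

    pid = os.fork()
    if pid == 0:
        # Child: starts with the parent's memory image; this write
        # changes only the child's view of `data`.
        os.close(read_fd)
        data[:] = b"child!"
        os.write(write_fd, bytes(data))
        os.close(write_fd)
        os._exit(0)

    # Parent: read what the child saw, then check our own copy.
    os.close(write_fd)
    child_view = os.read(read_fd, 64)
    os.close(read_fd)
    os.waitpid(pid, 0)
    return bytes(data), child_view


parent_view, child_view = fork_demo()
print("parent still sees:", parent_view)
print("child saw:        ", child_view)
```

    Whether the untouched pages really stay shared depends on the kernel and on how much of its memory each process ends up writing to.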

    -ShadoeLord
  • Multithread Apache (Score:3, Informative)

    by Zeinfeld (263942) on Wednesday November 28, 2001 @12:47AM (#2623226) Homepage
    The article preens itself over the use of multithreaded code over the multiprocess model of Apache. This is potentially a big win, since the multiprocess model involves a lot of process context switching and process-to-process communication, both of which are expensive compared to thread switching.

    When I discussed this issue with Thau (or to be precise, he did most of the talking) he gave the reason for using processes over threads as the awful state of the then pthreads packages. If Apache was to be portable it could not use threads. He even spent some time writing a threads package of his own.

    I am tempted to suggest that rather than abandon Apache for some Java server (yeah, let's compile all our code to an obsolete byte code and then try to JIT compile it for another architecture), it should not be a major task to replace the Apache hunt group of processes with a thread loop.

    The other reason Thau gave for using processes was that the scheduler on UNIX sux and using lots of threads was a good way to get more resources, err quite.

    Now that we have Linux I don't see why the design of applications like Apache should be compromised to support obsolete and crippled legacy O/S. If someone wants to run on a BSD Vaxen then they can write their own Web server. One of the liabilities of open source is that once a platform is supported it can end up with the application supporting the platform long after the O/S vendor has ceased to. In the 1980s I had an unpleasant experience with a bunch of physicists attempting to use an old MVS machine, despite the fact that the vendor had obviously ceased giving meaningful support for at least a decade. In particular they insisted that all function calls in the Fortran programs be limited to 6 characters since they were still waiting for the new linker (when it came it turned out that for functions over 8 characters long it took the first four characters and the last four characters to build the linker label... lame, lame, lame).
