Hardware

Building a Better Webserver 286

Posted by michael
from the slashdot-will-beat-a-path-to-your-door dept.
msolnik writes: "The guys over at Aces' Hardware have put up a new article going over the basics, and not-so-basics, of building a new server. This is very informative; I think everyone should devote 5 minutes and a can of Dr Pepper to this article."
This discussion has been archived. No new comments can be posted.

  • Good article, but... (Score:5, Interesting)

    by Computer! (412422) on Tuesday November 27, 2001 @05:56PM (#2621493) Homepage Journal
    Microsoft [microsoft.com] has written several white papers [microsoft.com] of this sort already. Of course, they're Microsoft, so that means I can kiss my +2 bonus goodbye. Seriously, though.
  • by NerveGas (168686) on Tuesday November 27, 2001 @06:05PM (#2621550)
    <I think the part about Java/Resin is the most crucial. Anybody can throw hardware at a problem, but their programming methodology makes tremendous sense (ie: dump this Apache/CGI garbage in favor of real multithreading)>

    Funny. What was our next closest competitor spent several million dollars on Sun hardware and everything done in Java. We spent less than $40,000 on some dual-proc Intel machines, doing everything with Postgres, Perl, and Apache. The result? Our servers have many times the capacity that theirs do, and they're almost completely out of business.

    steve
  • True (Score:3, Interesting)

    by john@iastate.edu (113202) on Tuesday November 27, 2001 @06:08PM (#2621573) Homepage
    Apache is cool and all, but I wonder if it is still the right tool for a lot of sites -- it has every feature under the sun, but it seems to me that more and more sites are getting more and more specialized and thus needing less and less of these features.

    Once upon a time, we had 1 web server that did everything, so it needed to be able to do everything. Now every time we do something new we toss out a new webserver (or 2 or 10 of 'em). And they all basically need to do one thing (webmail, portal, whatnot) and do it well and that's it.

    So we've got a whole bunch of Apache servers with a bucket load of Apache processes that basically spend all day doing little more than exec'ing the same CGI over and over (and copying the data around a couple of extra times).

    I'm pretty much now convinced that my next step is going to be to franken-meld my CGI with something like mini-httpd [acme.com] so it is a single, persistent app.

    I'm certainly not redoing the whole thing in Java though! :)
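
A rough sketch of the "single, persistent app" idea from the comment above, assuming Python's standard library rather than mini-httpd. The names `APP_STATE`, `render_page`, `start_server`, and `fetch` are hypothetical and exist only for illustration. The point is that state is set up once and reused across requests instead of being rebuilt on every CGI exec.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

# Hypothetical application state that a CGI would rebuild on every exec.
# Here it lives for the whole life of the process.
APP_STATE = {"hits": 0}
STATE_LOCK = threading.Lock()


def render_page(path: str) -> bytes:
    """Stand-in for the work the CGI used to do on each request."""
    with STATE_LOCK:
        APP_STATE["hits"] += 1
        hits = APP_STATE["hits"]
    return f"path={path} hits={hits}\n".encode()


class AppHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        body = render_page(self.path)
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, fmt, *args):
        pass  # keep the example quiet


def start_server(port: int = 0) -> ThreadingHTTPServer:
    """Serve on a background thread; port 0 asks the OS for a free port."""
    server = ThreadingHTTPServer(("127.0.0.1", port), AppHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def fetch(server: ThreadingHTTPServer, path: str) -> str:
    """Issue one GET against the running server and return the body."""
    conn = http.client.HTTPConnection("127.0.0.1", server.server_address[1], timeout=5)
    try:
        conn.request("GET", path)
        return conn.getresponse().read().decode()
    finally:
        conn.close()
```

Calling `fetch` twice against the same server shows the hit counter carrying over between requests, which a fresh CGI process per request could not do without external storage.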

  • by green pizza (159161) on Tuesday November 27, 2001 @06:25PM (#2621688) Homepage
    The SPARCstation 20 was one heck of a great machine back in the day, especially for its size (a low profile pizzabox). The design was a lot like its older brother (the SPARCstation 10 from 1992)... that is, two MBUS slots (for up to 4 CPUs) and 4 SBUS slots (Sun Expansion cards, 25 MHz x 64 bit = ~ 200 MB/sec each, but 400 MB/sec bus total).

    I remember using a Sun evaluation model at Rice many years ago... the machine had two 150 MHz HyperSPARC processors (though 4 were available for more $$), a wide SCSI + fast ethernet card, two gfx cards for two monitors, and some sort of strange high speed serial card (for some oddball scanner, I think). Not to mention 512 MB of RAM, in 1994! The machine was a pretty decent powerhouse and sooo small! I sort of wish the concept had caught on, given how large modern workstations are in comparison. Heck, back then an SBUS card was about 1/3 the size of a modern 7" PCI card.

    Then there's the other end of the spectrum... one department bought a Silicon Graphics Indigo2 Extreme in 1993. The gfx cardset was three full size GIO-64 cards (64 bit @ 100 MHz = about 800 MB/sec), one of which had 8 dedicated ASICs for doing geometry alone. 384 MB of RAM on that beast. Pretty wild stuff for the desktop.

    Ahh, technology. I love you!
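
The bandwidth figures quoted in the comment above follow from bus width times clock rate. A minimal sketch of that arithmetic, using only the widths and clocks the comment states (the function name is made up for illustration, and the results are theoretical peaks, not measured throughput):

```python
def bus_bandwidth_mb_per_s(width_bits: int, clock_mhz: float) -> float:
    """Peak transfer rate in MB/s: (bus width in bytes) x (clock in MHz)."""
    return (width_bits / 8) * clock_mhz


# Figures as quoted in the comment, not independently verified.
sbus_slot = bus_bandwidth_mb_per_s(64, 25)    # one SBUS slot
gio64_card = bus_bandwidth_mb_per_s(64, 100)  # one GIO-64 card
```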
  • by asn (4418) on Tuesday November 27, 2001 @06:45PM (#2621788) Homepage
    http://www.aceshardware.com/read.jsp?id=45000241

    500 Servlet Exception

    java.lang.NullPointerException
    at BenchView.SpecData.BuildCache.(BuildCache.java:96)
    at BenchView.SpecData.BuildCache.getCacheOb(BuildCache.java:82)
    at BenchView.SpecData.BuildCache.getLastModified(BuildCache.java:45)
    at BenchView.SpecData.BuildCache.getLastModifiedAgo(BuildCache.java:50)
    at _read__jsp._jspService(/site/sidebar_head.jsp:60)
    at com.caucho.jsp.JavaPage.service(JavaPage.java:87)
    at com.caucho.jsp.JavaPage.subservice(JavaPage.java:81)
    at com.caucho.jsp.Page.service(Page.java:474)
    at com.caucho.server.http.FilterChainPage.doFilter(FilterChainPage.java:166)
    at ToolKit.GZIPFilter.doFilter(GZIPFilter.java:22)
    at com.caucho.server.http.FilterChainFilter.doFilter(FilterChainFilter.java:87)
    at com.caucho.server.http.Invocation.service(Invocation.java:277)
    at com.caucho.server.http.CacheInvocation.service(CacheInvocation.java:129)
    at com.caucho.server.http.HttpRequest.handleRequest(HttpRequest.java:216)
    at com.caucho.server.http.HttpRequest.handleConnection(HttpRequest.java:158)
    at com.caucho.server.TcpConnection.run(TcpConnection.java:140)
    at java.lang.Thread.run(Thread.java:484)

    Resin 2.0.2 (built Mon Aug 27 16:52:49 PDT 2001)
  • Re:Why Sun? (Score:2, Interesting)

    by Skuld-Chan (302449) on Tuesday November 27, 2001 @07:24PM (#2622031)
    I use two Suns at home - one is an SS10 (my firewall running 2.4.14) and the other is an SS20 (my file/web/webcam server) - it does samba, nfs and ftp for files - and it has a videopix digitizer for webcam stuff.

    Why did I choose SPARC? Well, it's a tad quieter than an x86 box, smaller, and (and this is the point) it uses a lot less power. The SS10 ships with a 65 watt PS (at least mine did). Considering you can get these things for about 25~65 dollars they are a bargain (I paid $25 for the SS10 and $65 for the SS20). Anyhoo, I kept my SS10 running for 30 minutes on a 300 VA UPS when the power went out last week - I doubt it's drawing more than 25 watts peak. The software is still free since it runs Debian Linux well (and you can get Solaris for it for free too).

    Plus - I have the added advantage that for some reason Sun equipment is like a geek's dream - they look kinda cool sitting on the table next to the cable connection. Everyone who has ever come by has to comment on them somehow - either "what's that" - or "wow - you have one of those?". Don't get me wrong - they're slow (the SS10 has a cacheless microSPARC in it), but the SS10 seems to keep up with the 4 megabit cable connection okay.
  • by green pizza (159161) on Tuesday November 27, 2001 @07:37PM (#2622088) Homepage
    There are more factors than just CPU and bandwidth... like what's between the two. A new coworker recently told me of his major learning experiences in the mid 1990s running several popular news websites during the beginning of the web boom. One of the more popular sites he ran originally had a T1 routed through a Cisco 4000 router. Things worked great until he had an additional, load balancing T1 added for more thruput and redundancy. Things didn't feel much faster; in fact, they were almost slower. After much investigation he learned that the router didn't have enough RAM or CPU to handle the packet shuffling that intelligent multihoming routing requires. A similar thing happened at a friend's company when they tried to run a T3 through their existing router. While the old Cisco had enough CPU and RAM in theory, its switching hardware and thruput couldn't handle the full number of packets the T3 was providing thru the shiny new HSSI high speed serial card.

    Now, I realize modern hardware (Cisco 3660 and 7x00 series, and pretty much any Juniper) can route several T3s (at 45mbps each direction) worth of data, but older routers and minimally configured routers do exist.

    There are MANY bottlenecks in hosting a website. Server daemon, CPU, router, routing and filtering methods, latency and hops between server and internet backbones, overall bandwidth thruput, and much more.

    It's not as simple as "lame server, overloaded CPU, should have installed distro version x+1".
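
To put a rough number on the "full number of packets" point above: the packet rate a router must switch depends on packet size as well as link speed. A back-of-the-envelope sketch, assuming the 45 Mbps T3 figure from the comment and two illustrative packet sizes (64 and 1500 bytes are assumptions, and framing overhead is ignored):

```python
T3_BPS = 45_000_000  # the 45mbps figure from the comment, one direction


def packets_per_second(link_bps: int, packet_bytes: int) -> float:
    """Packets per second needed to fill a link with packets of one size."""
    return link_bps / (packet_bytes * 8)


large = packets_per_second(T3_BPS, 1500)  # full-size Ethernet-style payloads
small = packets_per_second(T3_BPS, 64)    # small packets, the harder case
```

Under these assumptions, small packets demand well over twenty times the per-second switching work of large ones on the same link, which is one way a router can fall over long before the raw bandwidth number suggests it should.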
  • by Anonymous Coward on Tuesday November 27, 2001 @08:56PM (#2622442)
    But they didn't really go through a validation process. They appear to make numerous assumptions, but they don't have any test data to prove them.

    Our company has worked similarly, but what we've begun to do recently is test our assumptions. We throw a test load against the server and then watch each component to see how it performs in reality.

    I would have liked to see a real comparison between the two machines serving live web content. They don't do that. :(
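
One way to "throw a test load against the server", sketched with Python's standard library only. Everything here is an illustrative assumption: the `OkHandler` target is a trivial stand-in for the real server, and `run_load` only records client-side latency, whereas the comment also calls for watching each server component while the load runs.

```python
import http.client
import threading
import time
from concurrent.futures import ThreadPoolExecutor
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class OkHandler(BaseHTTPRequestHandler):
    """Trivial stand-in target; a real test would hit the server under study."""

    def do_GET(self):
        body = b"ok\n"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, fmt, *args):
        pass  # keep the example quiet


def start_target() -> ThreadingHTTPServer:
    """Start the stand-in server on a free local port."""
    server = ThreadingHTTPServer(("127.0.0.1", 0), OkHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def timed_get(host: str, port: int, path: str = "/"):
    """Issue one GET and return (status, seconds elapsed)."""
    start = time.perf_counter()
    conn = http.client.HTTPConnection(host, port, timeout=5)
    try:
        conn.request("GET", path)
        resp = conn.getresponse()
        resp.read()
        return resp.status, time.perf_counter() - start
    finally:
        conn.close()


def run_load(host: str, port: int, total_requests: int = 50, concurrency: int = 5):
    """Fire total_requests GETs from a thread pool and summarize the results."""
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        results = list(pool.map(lambda _: timed_get(host, port), range(total_requests)))
    latencies = sorted(elapsed for _, elapsed in results)
    return {
        "requests": len(results),
        "errors": sum(1 for status, _ in results if status != 200),
        "median_s": latencies[len(latencies) // 2],
        "max_s": latencies[-1],
    }
```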
  • by srvivn21 (410280) on Tuesday November 27, 2001 @08:59PM (#2622449)
    If you can't live without Apache, there's always mod_bandwidth [cohprog.com].

    Not quite as elegant a solution, but it's nice for preventing your web server from taking all of your bandwidth (if, say, you run it off your cable modem, and wish to continue gaming...).
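
For a feel of how bandwidth capping of this kind can work, here is a generic token-bucket sketch. This is not mod_bandwidth's implementation or configuration (neither is shown in the thread); it is a swapped-in, commonly used technique, and the class and parameter names are hypothetical.

```python
import time


class TokenBucket:
    """Allow bursts up to burst_bytes, then throttle to rate_bytes_per_s."""

    def __init__(self, rate_bytes_per_s: float, burst_bytes: float, clock=time.monotonic):
        self.rate = rate_bytes_per_s
        self.capacity = burst_bytes
        self.tokens = float(burst_bytes)
        self.clock = clock
        self.last = clock()

    def _refill(self) -> None:
        # Add tokens for the time elapsed since the last call, up to capacity.
        now = self.clock()
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now

    def delay_for(self, nbytes: int) -> float:
        """Debit nbytes and return how many seconds the caller should wait
        before sending them. The balance may go negative (a debt that later
        refills pay off)."""
        self._refill()
        self.tokens -= nbytes
        if self.tokens >= 0:
            return 0.0
        return -self.tokens / self.rate
```

A sender would call `time.sleep(bucket.delay_for(len(chunk)))` before writing each chunk, which leaves the rest of the link free for other traffic.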
  • by closedpegasus (212610) on Tuesday November 27, 2001 @10:44PM (#2622878)
    The article mentions that it rebuilt all its web applications in Java as opposed to PHP, and it seems the main reason they did this was to go from multiprocess to multithreaded. However, PHP can be compiled as a shared object (for apache, or as an ISAPI extension for windows), and does do multithreading, database connection pooling, and all those other goodies. It seems they had been using PHP as a CGI...hrmmm. Not a good idea.

    Speed shouldn't be the reason you switch to Java. If anything, I've found that PHP has been faster for simple web applications and page serving (and loads faster to develop applications with), while Java stands out as being more robust and stable.
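
The "database connection pooling" mentioned above is only possible when the interpreter stays resident between requests, which is the case the comment contrasts with CGI. A minimal sketch of the idea, assuming Python with SQLite as a stand-in database (this is not PHP's mechanism, and `ConnectionPool` is a made-up name): connections are opened once and handed out repeatedly instead of being opened per request.

```python
import queue
import sqlite3
from contextlib import contextmanager


class ConnectionPool:
    """Fixed-size pool: open connections once, then lend them out."""

    def __init__(self, size: int, database: str = ":memory:"):
        self._idle = queue.Queue(maxsize=size)
        self.opened = 0
        for _ in range(size):
            # check_same_thread=False lets a worker thread use a connection
            # that was created by the thread that built the pool.
            self._idle.put(sqlite3.connect(database, check_same_thread=False))
            self.opened += 1

    @contextmanager
    def connection(self, timeout=None):
        """Borrow a connection; blocks if all are in use."""
        conn = self._idle.get(timeout=timeout)
        try:
            yield conn
        finally:
            self._idle.put(conn)  # always return it to the pool

    def close(self) -> None:
        while not self._idle.empty():
            self._idle.get_nowait().close()
```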

  • by shadoelord (163710) on Tuesday November 27, 2001 @11:53PM (#2623112) Homepage
    quote:
    This means an Apache web server using keepalives will need to have more child processes running than connections. Depending upon the configuration and the amount of traffic, this can result in a process pool that is significantly larger than the total number of concurrent connections. In fact, many large sites even go so far as to disable keepalives on Apache simply because all the blocked processes consume too much memory.

    ::end quote::
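
The quoted concern can be estimated with Little's law: processes tied up equals connection arrival rate times how long each connection holds a process. A sketch under a loud assumption, namely that every keepalive connection idles for the full timeout (a worst case); the function name and the example numbers are illustrative, not from the article.

```python
def processes_needed(conns_per_s: int, service_ms: int, keepalive_idle_ms: int = 0) -> int:
    """Estimate busy processes for a process-per-connection server.

    Little's law: concurrency = arrival rate x holding time. Integer
    milliseconds keep the arithmetic exact; the result is rounded up.
    """
    hold_ms = service_ms + keepalive_idle_ms
    total = conns_per_s * hold_ms
    return (total + 999) // 1000  # ceiling division


# Illustrative numbers only: 50 new connections/s, 100 ms of real work each.
without_keepalive = processes_needed(50, 100)
with_keepalive = processes_needed(50, 100, keepalive_idle_ms=15_000)
```

With these assumed numbers, the pool grows from a handful of processes to several hundred, almost all of them blocked waiting on idle connections.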

    Let's see, has anyone here heard of COW (copy on write)? Linux uses this idea to save time on fork'd child processes: they get the ::same:: address space, and only get new memory when they try to write to a page, which is marked read-only after the fork (ie they can run and run and run, but once they try to modify their memory they get a new page with the contents copied). It saves time and memory.

    The only setback is when a process forks a child, its current time slice is cut in half with half given to the child, so the main proc will run aground if too many requests come in and the server has more processes to worry about. :(

    -ShadoeLord
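
A small Unix-only sketch of the behaviour that copy-on-write has to preserve: after `fork`, a write in the child must not be visible to the parent. The function name is hypothetical. Note the limits of the demonstration: it shows the visible semantics only, not page-level sharing, and CPython's reference counting itself writes to object headers, so a Python process will copy more pages than a C program doing the same thing.

```python
import os


def fork_and_modify():
    """Fork a child that overwrites a buffer; return (parent_view, child_view)."""
    data = bytearray(b"parent")
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: this write lands on the child's private copy of the page.
        os.close(read_fd)
        data[:] = b"child!"
        os.write(write_fd, bytes(data))
        os._exit(0)  # leave immediately without running parent cleanup code
    # Parent: read what the child saw, then reap it.
    os.close(write_fd)
    child_view = os.read(read_fd, 64)
    os.close(read_fd)
    os.waitpid(pid, 0)
    return bytes(data), child_view
```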
