Building a Better Webserver 286
msolnik writes: "The guys over at Ace's Hardware have put up a new article going over the basics, and not-so-basics, of building a new server. This is very informative; I think everyone should devote 5 minutes and a can of Dr Pepper to this article."
Why not get the server version - Netra X1? (Score:2, Informative)
Re:OSDN: Please read this (Score:4, Informative)
good but... they discounted x86 too fast (Score:2, Informative)
Re:Best webserver to generate traffic (Score:2, Informative)
Re:OSDN: Please read this (Score:2, Informative)
Also, don't confuse the CGI protocol with short-lived CGI binaries. Slashdot uses mod_perl, which is NOT a short-lived process, but Apache is still a forking server in the 1.3.x branch.
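The forking model mentioned above can be sketched in a few lines. This is a minimal illustration of Apache 1.3's prefork idea, not Apache's actual code: the parent forks N long-lived workers up front, and each worker accepts connections in a loop instead of dying after one request, which is what lets mod_perl keep its interpreter warm between requests.

```python
import os
import socket

def prefork(server_sock, n_workers=4, handle=lambda conn: conn.close()):
    """Fork n_workers children; each loops on accept() until killed."""
    pids = []
    for _ in range(n_workers):
        pid = os.fork()
        if pid == 0:                      # child: serve requests forever
            while True:
                conn, _addr = server_sock.accept()
                handle(conn)
        pids.append(pid)                  # parent: remember child pids
    return pids
```

The key contrast with classic CGI is that the per-request work happens inside an already-running process; nothing is exec'd per hit.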
Compression for dialup connections??? (Score:4, Informative)
Except that 56 Kbps modems get 5 KBps throughput by compressing the data! If the client and server compress, the modems won't be able to; the net effect is lots of extra work on the server side, and probably no increased throughput for the modem user.
The server might or might not see a decrease in latency, and in the number of sockets needed simultaneously; it depends on how much it can "stuff" the intermediate "pipes". The server will see an overall decrease in bandwidth needed to serve all the pages.
Ironically, broadband customers (who presumably don't have any compression between their clients and Internet servers) will see pages load faster. (And the poor cable modem providers from the previous story will be happy.)
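The "modems can't compress it again" point is easy to check with the standard gzip module: a second compression pass over already-gzipped data gains essentially nothing (the payload here is a made-up repetitive buffer, just for illustration).

```python
import gzip

# Repetitive HTML-like payload; real pages compress similarly well.
text = b"GET /index.html HTTP/1.0\r\n" * 200

once = gzip.compress(text)    # server-side gzip (what mod_gzip would do)
twice = gzip.compress(once)   # modem-style compression applied on top

print(f"first pass:  {len(once) / len(text):.2f} of original size")
print(f"second pass: {len(twice) / len(once):.2f} of the gzipped size")
```

The first pass shrinks the text dramatically; the second pass hovers around 1.0 (or worse, since it adds header overhead), which is all the modem's compressor would have left to work with.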
Re:New Webserver? - not good (Score:5, Informative)
One thing that does seem to work against the onslaught is a throttling webserver [acme.com]. If you haven't got the bandwidth to serve a sudden onslaught of requests, probably the best thing to do is to just start 503'ing; at least people get a quick 'come back later' message instead of just dead air.
Re:OSDN: Please read this (Score:3, Informative)
In other words, with the right hardware architecture, threads could be very useful for sites such as Ace's Hardware (though they happened to go with a uniprocessor) and Slashdot.
Java threads are also easier to program than C and C++ threads, though not easy. (Manual memory management is hard; thread programming is hard; manual memory management in a threaded program is very hard. I'm not speaking hypothetically on the last point; I've really envied Java programmers the last few weeks.)-:
Re:Compression for dialup connections??? (Score:3, Informative)
mod_gzip is your friend.
Confusing the issues (Score:4, Informative)
In the part about databases and persistent connections they confuse the issues more than a bit. The real problem is not too many processes, which automatically makes threads look better, but the symmetry among processes: every process must be able to serve any request, so all processes end up holding database connections. This is a problem particular to Apache and Apache-like servers, not a fundamental issue with processes versus threads.
In my server (fhttpd [fhttpd.org]) I used a completely different idea: processes are still processes, but they can be specialized, and requests that don't run database-dependent scripts are directed to processes that don't hold database connections, so reasonable performance is achieved if the webmaster defines different applications for different purposes. While I haven't posted any updates to the server's source in the last two years (I was rather busy at the job I am now leaving), even the published version 0.4.3, despite lacking the clustering and process-management mechanism I am working on now, performed well in situations where "lightweight" and "heavyweight" tasks were separated.
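The specialization idea above can be sketched as a dispatch step (the classification rules and pool names here are illustrative, not fhttpd's actual configuration): classify each request up front and hand it to a worker pool that only holds the resources it needs.

```python
def classify(path):
    """Pick a worker pool for a request path (illustrative rules)."""
    if path.endswith((".pl", ".cgi")) or path.startswith("/scripts/"):
        return "db"        # heavyweight pool: workers keep a DB connection
    return "static"        # lightweight pool: no DB connection held at all

POOLS = {"db": [], "static": []}            # placeholder worker queues

def dispatch(path, request):
    POOLS[classify(path)].append(request)   # hand off to the right pool
```

The payoff is that a flood of static hits never touches the limited supply of database-connected workers, which is exactly the asymmetry Apache's identical children can't express.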
Re:New Webserver? - absolutely (Score:3, Informative)
Yes, a throttling server is a great idea. If you recognize that there will always be a load too high for you to handle (10 requests per minute for my site, yes, per minute; it is a physical device), then you must either decide to shed the excess load or let it crush your machine.
Consider a typical web server. When it gets overloaded it slows down: each request takes longer to handle, so there are more concurrent threads, so overall efficiency drops, so each request takes longer to handle... welcome to the death spiral (on my site-which-must-not-be-named-lest-it-be-slashdotted).
The key decision is to determine how many concurrent threads you can handle without sacrificing efficiency and then reject enough traffic to stay under that limit.
This is where optimism comes in and bites you in the ass. You remember that every shunned connection is going to cost you money/fame/clicks/whatever, so you set the limit too high and melt down anyway.
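The death spiral above can be shown with a toy model (every constant here is made up for illustration): past some concurrency "knee", per-request service time grows with contention, so adding threads actually reduces total throughput, and the limit you want is the knee itself.

```python
def service_time(n, base=0.05, knee=32):
    """Seconds per request at n concurrent threads (illustrative model)."""
    overload = max(0, n - knee)
    return base * (1 + 0.25 * overload)   # contention penalty past the knee

def throughput(n):
    return n / service_time(n)            # requests served per second

# Sweep concurrency levels and find the one that maximizes throughput.
best = max(range(1, 201), key=throughput)
print("best concurrency:", best)
```

In this model the optimum lands exactly at the knee; any "optimistic" limit above it serves fewer requests per second while keeping more users waiting.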
Re:Compression for dialup connections??? (Score:5, Informative)
it's the BANDWIDTH (Score:5, Informative)
Re:Why Sun? (Score:2, Informative)
As for PCs having more competition: it's not hard to argue that the competition isn't really what it seems to be; most of it is over price and how fast Quake will play. If Intel's processor is a little slower than AMD's, the fact that it still goes into most OEM computers will keep Intel alive. If Sun does not stand up to the competition with their processors, motherboards, and other components, people will leave them for something better, and Sun will be down the hole. They *have* to be better to survive; there's not much forcing people to stick with them.
They're also a lot more solid in their roots (Sun servers have been around forever, so they've had a lot of time to work on tweaking things and getting processors to work well for their applications), and Sun's support generally ranges from fairly good to downright amazing, from what I've heard (not that I've needed it).
But in the end, it's a lot different from PC hardware, and it can sometimes take a bit of getting used to.
Re:Compression for dialup connections??? (Score:2, Informative)
Re:Why Sun? (Score:2, Informative)
You mean people like Google [google.com] who run their highly-regarded search engine/translator [google.com]/image indexer [google.com]/Usenet archive [google.com] on a server farm of 8,000 inexpensive [internetweek.com] PCs [google.com] with off-the-shelf Maxtor 80GB IDE HDs?
Multithread Apache (Score:3, Informative)
When I discussed this issue with Thau (or, to be precise, he did most of the talking), the reason he gave for using processes over threads was the awful state of the pthreads packages at the time. If Apache was to be portable it could not use threads. He even spent some time writing a threads package of his own.
I am tempted to suggest that rather than abandon Apache for some Java server (yeah, let's compile all our code to an obsolete bytecode and then try to JIT-compile it for another architecture), it should not be a major task to replace the Apache hunt group of processes with a thread loop.
The other reason Thau gave for using processes was that the scheduler on UNIX sux and using lots of threads was a good way to get more resources, err quite.
Now that we have Linux I don't see why the design of applications like Apache should be compromised to support obsolete and crippled legacy OSes. If someone wants to run on BSD Vaxen they can write their own web server. One of the liabilities of open source is that once a platform is supported, the application can end up supporting the platform long after the OS vendor has ceased to. In the 1980s I had an unpleasant experience with a bunch of physicists attempting to use an old MVS machine, despite the fact that the vendor had obviously ceased giving meaningful support at least a decade earlier. In particular, they insisted that all function names in the Fortran programs be limited to 6 characters since they were still waiting for the new linker (when it came, it turned out that for function names over 8 characters it took the first four characters and the last four characters to build the linker label... lame, lame, lame).
Re:New Webserver? (Score:2, Informative)
Most people are unlikely/too lazy to follow the comment link above so I've repeated the first part of the response below:
Re:Confusing the issues (Score:3, Informative)
Other than that, FastCGI is a good idea.