Hardware

Building Big Sites on a Budget 78

Joe Mamma writes: "There is a good article on Anandtech.com about how they upgraded their backend. They are running a bunch of AMD chips in their servers and make good use of the Linux Virtual Server Project software for their load balancers. Anyway, it's a good read for those who are looking to expand their web backend on a budget."
This discussion has been archived. No new comments can be posted.

  • by Anonymous Coward
    Submit a story to /. and wait for the flood of hits!

    Here's [founderscamp.com] an example.

  • by Anonymous Coward
    Anandtech.com...now there's an unbiased opinion.

    And in reply to one of those first posts, I'm sure
    a dual-1Ghz p3 will smoke a 1.3 Athlon. easily.
  • by Anonymous Coward
    When sun implements mosix they call it severalmoresix, the marketing dept. decided using ebonics like "mo" in a product name associated with sun would be inappropriate, so they have renamed it severalmoresix.

    Also they have renamed bugzilla to issuezilla again at recommendation of the all knowing marketing dept.
  • by Anonymous Coward
    Anandtech once tried to get me fired!!

    I had noticed someone in the "for sale/trade" forum was trying to sell something that was obviously not what the seller claimed it to be, when I told the seller this I was flamed and banned! Apparently I had broken a little known rule called "thread crapping", meaning anything said that could have an adverse effect on the sale of an item.

    after my banning i e-mailed the mods and told them i was sorry, i didn't know the rule (it's not posted anywhere unless you search the forum) and i won't do it again. They said too bad, and blocked the IP I had posted from, the IP at work. I then wrote back and told them they were being very childish, that I was sorry and banning me would do no good since I could still post at home.

    so what do they do? THEY CALLED MY WORK AND COMPLAINED! I then received threatening e-mails from IT telling me to stop whatever I had been doing, but they quickly changed their minds when I forwarded them the e-mails.

    Here's the e-mails. The original thread has been removed, but I have printed copies if anyone wants to read 60 pages of flames.


    Trolls? please. You guys are way out of line. Read what happened. I can't believe you are trying to defend your actions. The president of Netzero would be appalled if he saw what you were doing and asking for his support.

    quotes from others on the internet:
    "I frequent anand's forums, but I keep out of the conversation. I find the mods are a tad jumpy, especially in the hot deals section"

    "Anandtech's forum mods can just be plain bitchy. I remember when a bunch of people who frequented the hot deals section got pissed off and left because the mods forbid anyone from posting coupon codes and were rather liberal in their bannings."

    I guess I'm still the one wrong here though, did you guys even read what happened?

    I already have an account and have been using it, and what exactly do you mean by "ill intent"? By using your forum further I'm committing ill intent?

    (removed)



    >From: moderator@anandtech.com (AnandTech Moderator)
    >To: (removed)@hotmail.com
    >Subject: Re: AnandTech
    >Date: Wed, 20 Dec 2000 12:39:31 -0500
    >
    >Mr. (removed),
    >
    >Thank you for your threats to return under other names. Your attitude is exactly why we will not consider allowing you to return to AnandTech.
    >
    >We will keep your email on file as evidence of your ill intent. When we catch you, we will file a formal abuse complaint against you with your ISP. NetZero is no exception. We have dealt personally with the president of NetZero. He has previously pledged his support in dealing with trolls like you.
    >
    >AnandTech Moderator
    >
    >
    >(removed) wrote:
    > >
    > > alright i'll just sign up for a new account, but you blocked my IP at work (and we're behind a proxy so every terminal has the same IP) so I can't post from work anymore :( i've already got some other logins so that's solved.
    > >
    > > I really wish you guys would just give my account back, I don't know what you intend to prove by making me not post from work. I said I wouldn't post "crap" anymore and I was not aware of the no crapping rule. You're being very childish and it's a poor reflection of anandtech in general.
    > >
    > > and if you do happen to block my IP from home their's always netzero.
    > >
    > > (removed)
    >


    I attempted to sue through small claims court for defamation of character, slander and harassment, but was informed I can not sue in small claims for offenses such as that. Decided it wasn't worth pursuing unless I was actually fired from my position, which I wasn't, i'm still here, 5 months later.

    if you do decide to visit anandtech's forum just be very careful not to say anything. You'll also notice many others on the forum feel the same way i do about the mods there, they're not called "Banandtech" on Fatwallet.com's forums for nothing! [fatwallet.com]

  • by Anonymous Coward
    at www.camarades.com we use a server called 'yawwws'
    (for yet another www server), it runs as a
    caching front end to a linux box with php/apache
    on it. It is lightning fast, we get all the
    benefits of scripting and we use very little
    hardware (two boxes run the majority of the
    site and they aren't even breathing very
    hard). www7.camarades.com and ww9.camarades.com
    are doing some 130 million pageviews monthly
    between them... the servers were benched at
    slightly more than 3000 hits / second, using
    ab on the localhost because the NIC was too
    much of a slowdown ! (each,
    not both)
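
To put the figures above side by side, here is a rough back-of-the-envelope sketch. It assumes a 30-day month, treats a pageview as a single hit, and compares an average rate against a localhost benchmark, so it says nothing about peak traffic. The function names are made up for illustration.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def average_rate_per_second(monthly_pageviews, days_in_month=30):
    """Average pageviews per second, assuming traffic is spread evenly."""
    return monthly_pageviews / (days_in_month * SECONDS_PER_DAY)


def headroom(benchmark_rate, monthly_pageviews, days_in_month=30):
    """How many times the benchmarked rate exceeds the average load."""
    return benchmark_rate / average_rate_per_second(monthly_pageviews, days_in_month)


if __name__ == "__main__":
    # Figures quoted in the comment: 130 million pageviews a month across
    # two boxes, each benched at roughly 3000 hits/second with ab.
    avg = average_rate_per_second(130_000_000)
    print(f"average load: {avg:.1f} pageviews/second")
    print(f"one box's benchmark is about {headroom(3000, 130_000_000):.0f}x that average")
```

Real traffic is bursty, so the gap between benchmark and average overstates the usable margin.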
  • by Anonymous Coward
    Check this [microsoft.com] out. Microsoft just published some new information on their migration of Hotmail from *BSD to Windows 2000.

    Interesting read. I'm sure some of you have read about this before, but the new article talks about budget constraints, among other topics. The project was tightly budgeted to begin with, but they still managed to bring it in way under budget. Some of the strategies Microsoft used would apply to a small, independent website as well.

  • by Anonymous Coward
    I think my backend is expanded enough. After all,
    all I do is sit here and read /. all day and night

    :-)
  • And remember that if you buy the $200 case, you'll have to spend another $100-150 to replace the powersupply, because the one that comes with it is crap and will likely fail in weeks/months.

    That is just so much bullshit. Make sure the power supply has a ball bearing fan ($12 instead of $6) and you're golden. There is no magic $300 price adder for a good power supply. Hell I'd go the distance and get a cheap case with no power supply, since I prefer to use the 48VDC provided at my colo instead of 120VAC and a shit-heavy UPS with batteries that need replacing every few years.

    BTW: I've had cheap-ass sleeve-bearing fans which have lasted over 3 years, and I've also had expensive ball-bearing fans last a week. There is no magic formula that says that $300 more for a case will get you anything but a lighter wallet. Yes, ball bearing fans are much better, but not $300 better.

  • But what about getting a power supply with good filter caps, etc? Your ps might not fail, but it might cause your computer components to, especially if your UPSs have non-sine wave output or aren't online.

    The regulation has to be pretty damn tight or the system will be flaky to begin with and would likely never make it into its colocation rack.

    Filter caps don't do too much with switchmode supplies: the ripple is into the hundreds of kHz and smaller caps can take care of the job for the most part. (Smaller meaning

    While I understand where you're going, I don't believe that crap that bad is out there. I've yet to see a shitty power supply that doesn't make itself known within the first few hours of plugging it in. If it works then, chances are it'll work for at least a few years before something goes. And for $300 I'll replace a power supply every few years instead of having one costing $300 more and lasting a couple more after that.

  • I must be dreaming, then.

    But as you're discovering, on the back end -- the database machine, where failure is much less tolerable -- it is going to be much, much harder to spec out a machine. Your example of hardware RAID is the most surprising area where Linux simply falls down. Even though cards like Compaq's SmartArray RAID controller have working Linux drivers, there are no Linux utilities.

    I am sitting 15' away from a Linux machine running with a DPT Century UW2 RAID controller [adaptec.com] and I've got no problem running the Unix utilities. I can shut drives down, tune both the RAID and cache performance, silence alarms, rebuild RAID volumes, you name it. (It looks like DPT was bought by Adaptec so that may not be the exact card, mine is a hardware cache/RAID controller)

    Don't spew FUD. Hell even my ancient P90 had a DPT controller in it with working utilities and that was 5 years ago!

  • Sorry, I missed some of your trolling.

    For far too many of the RAID controllers out there, reconfiguring or rebuilding the arrays either involves shutting down the machine and rebooting into a stand-alone utility, or "echo"ing pretty much undocumented commands into your /proc/scsi filesystem. If a drive fails, you will have downtime. The Linux RAID solutions might protect your data integrity, but they will not protect your uptime. I couldn't even imagine trying to hot-plug some new drives into the array, and then resizing the live filesystem when I needed more room. (Someone will try to refute this, and will use the word "ReiserFS" in their post, but all I have to say is that there are a lot of possible things I couldn't imagine actually doing).

    Hmmm. Reiserfs will support this but I'd want to layer LVM on it before Reiserfs. As far as I know ext2 won't handle online resizing. With LVM and Reiser though it will. I've been playing with it on my home system. In a month or two I'll upgrade the office fileserver to support it. We've already got the hotswap drives in and the hardware RAID/cache, so this will just complete the upgrade path.

    Things like hot pluggable CPUs, or network cards, or drive controllers, are simply non-existent on Linux. Stuff like monitoring utilities that will page you in the middle of the night when one of your redundant power supplies fails, or when one of your CPUs burns out, or when one of the drives in your storage arrays dies, simply don't exist. There's a lot of stuff that you need if you want to run 24x7 that just doesn't work in Linux yet.

    Interesting. I get paged if my server goes down or the internal case temp gets too high. I don't see it as much of a stretch to page if a CPU temp gets too high. Watching for casefans and whatnot is just a matter of temperature monitoring. And I do this on an external server, how about that... The main fileserver could explode and I'd still get paged that something was wrong.

    A big part of that is because of posts like yours, of course -- you don't want to spend more than $2,000.00. It's pretty damned hard to build a well supported server for $2,000.00, and the people who are willing to spend the money for real support often don't care as much about Linux. You're left with a lot of after-thought solutions -- stuff that got built for something else, and just happens to work on Linux by an almost happy accident.

    I'll agree here: Five 9.1G UW2 SCA drives aren't (weren't) cheap. 64M of ECC cache memory weren't cheap, and neither was the hardware RAID/cache controller it goes in to. The 20G DAT wasn't cheap, either. I don't have redundant power supplies but the server isn't that critical that I can't bring it down for 15 minutes to swap out a power supply. I've never had a power supply die on me in 5 years of network administration but then again we have a 300lb online UPS which keeps most of the nasties away from the computers. What was absolutely mission critical was that the data was safe in case of drive failure, and that it could be expanded as needed.

    In short, go away troll. I've got a perfectly maintainable Linux server which has worked superbly over the last 6 years. Had a drive fail, had a CPU cook, never lost data. I don't remember arcane commands which are echoed to /proc, I run dptutil. All I seem to do on the server is keep an eye on the logs.

    You are right about one thing though: You need to consider what you're doing this for before you put a price on it. I was able to get away with about $3000 to $3500 after all the bills came in. It hasn't got hot-swappable CPUs or power supplies, but that wasn't mission critical for me. The datastore is solid and that's what mattered.
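
The external watchdog described in this comment (get paged when the server goes quiet or the case runs hot) could be sketched roughly as follows. This is only an illustration of the decision logic: the `Reading` type, the thresholds, and the function name are all hypothetical, and actually collecting readings and sending the page are left out.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Reading:
    host: str
    seconds_since_heartbeat: float
    case_temp_c: Optional[float]  # None when the sensor could not be read


def alerts_for(reading: Reading,
               max_silence_s: float = 120.0,
               max_temp_c: float = 45.0) -> List[str]:
    """Return the alert messages a watchdog on another machine would send.

    Both thresholds are made-up defaults, not recommendations.
    """
    alerts = []
    if reading.seconds_since_heartbeat > max_silence_s:
        alerts.append(
            f"{reading.host}: no heartbeat for {reading.seconds_since_heartbeat:.0f}s"
        )
    if reading.case_temp_c is not None and reading.case_temp_c > max_temp_c:
        alerts.append(f"{reading.host}: case temperature {reading.case_temp_c:.1f}C")
    return alerts


if __name__ == "__main__":
    for message in alerts_for(Reading("fileserver", 300.0, 52.5)):
        print("PAGE:", message)
```

Running the check from a second machine is the point the comment makes: the monitored box can fail completely and the missing heartbeat still triggers a page.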

  • K5, for example, has been able to take several direct Slashdottings on 1 VA Fullon box.

    Now I like K5 as much as the next guy but I do NOT consider not being able to serve up pages as "taking a slashdotting."

    I think it was last week or the week before when the last slashdotting took place over rusty's culture post. I couldn't get so much as a squeak out of K5 until hours later. Sure the server might have stayed up but it sure as hell wasn't serving up too many pages...

  • Buying new hardware is much cheaper than training or recruiting new staff.

    And if your site is very data driven, ColdFusion is a better option than PHP or raw JSP (use an application server like Enhydra, Dynamo or Websphere if you really want to use Java).
  • by Niac ( 2101 ) on Wednesday April 25, 2001 @07:36PM (#265016) Homepage
    For those that don't want to wade through many "click here for the next page of ads" I bring you this link. [anandtech.com]

    Enjoy! :-)


    "We have the right to believe at our own risk any hypothesis that is live enough to tempt our will."

  • Have you looked at 3Ware [3ware.com] IDE RAID controllers? Excellent Linux/BSD support. You'll find most dual mobos have Intel eepros on them (support is okay under Linux), though you'll find many a thread on lkml where problems are attributed to eepros. 3Com's 3C905C has support for hardware checksumming and zero-copy networking (incorporated in 2.4.4-pre6).
  • For network card, you should also consider Intel EtherExpress 100. From what I heard, it's been supported for the longest time and its driver is the most mature.
    Where did you get a 2U case?
    ___
  • ColdFusion sucks on UNIX platforms...

    shit, they even emulate the registry!!!

    It was built in UWin or WindU or whatever port system that takes win apps to Unix, not natively coded.

    Given what I've seen of ColdFusion sites, it doesn't surprise me that they need so many web boxes.. CF is unstable and buggy as hell..

    ColdFusion needs to die a slow painful death. Hell, I'd recommend Servlet/JSP before CF... I'd almost recommend ISAPI!

    Your Working Boy,
    - Otis (GAIM: OtisWild)
  • One of the things I wonder about though is the "dual processor" factor, which has many people going gaga. Dual 700mhz's may sound nice, but to only serve up web content I wonder how that is better than just 1 700mhz chip or a 1ghz Athlon for that matter (anyone care to comment?)

    Call me a heretic, but I'm wondering if for static page service a P4 might be a good idea? Webservers like memory bandwidth, particularly if you're serving pages out of RAM. P4/RDRAM has buckets of RAM bandwidth, and in theory it should prove an excellent caching/static platform.

    Static webservice is mem/net bound, while dynamic webservice is dependent on your app's requirements..

    Your Working Boy,
    - Otis (GAIM: OtisWild)
  • onboard video for x86 server boards is actually a Good Thing. It means I don't have to bother looking for the cheapest possible AGP card at a computer show because it would kill me to spend $$$ for features that a server box will never use.

    onboard NICs aren't so bad either, though they tend to be intel which is sad. You'll need both onboard NICs and video if you want to go 1U..

    Your Working Boy,
    - Otis (GAIM: OtisWild)
  • by Lazy Jones ( 8403 ) on Thursday April 26, 2001 @04:01AM (#265023) Homepage Journal
    Just check this page: Advertising [anandtech.com]. They wouldn't lie on that page (they claim >40 million page views monthly), it would give them a bad reputation with advertisers if they found out.
  • by Lazy Jones ( 8403 ) on Thursday April 26, 2001 @02:10AM (#265024) Homepage Journal
    ... that AnandTech gets somewhere between 500.000 and 2 million hits / day (page impressions).

    We serve between 200.000 and 250.000 dynamic page views / day from 1 single-cpu front-end box and 1 dual-cpu mod_perl box and have room for approx. 3 times higher traffic. In the end it's all down to programming, page size and cacheability ...

    For a good example of a very scalable configuration look at Google - their software must be extremely well designed.

  • So, how does a high-traffic NT-site such as anandtech compare to a high-traffic Linux-site such as slashdot? Both are serving dynamic content from database backends, both have the ability to add user comments to stories.

    Does anyone have any numbers on the amount of traffic and the amount of servers both sites have?

    It is my impression anandtech is running more and faster servers than slashdot (slashdot has around ten webservers, IIRC). I'm not sure about the amount of traffic they generate.

    Anyone?

  • on-board venture capitalists!

    now THERE's a hot seller!

    :)

    Peter
  • Actually, it's more stable on Windows than on Solaris. Linux wasn't even mentioned as a web serving platform.

    Caution: contents may be quarrelsome and meticulous!

  • Caseoutlet.com had some reasonably priced 2U servers at around $200 I think with a power supply. Haven't heard any first hand accounts, but seemed like a reasonable price for the product. You can pick up nicer ones for $500, but then the cost of the case becomes pretty significant in the system cost.
  • You could get dual 700's (total of 1.4Ghz) or a single 1.3Ghz Athlon that will be much cheaper and outperform the dual 700's as well. There is an overhead in multi-processor systems, let's say you take at least a 15% loss against a simple linear scale. And CPU is not the entire equation. Loading up on duals you might find that your disk subsystem is the hangup rather than your chips... Not convinced at all that big beefy boxes are the way to go, unless you really know what you're doing. Cheap boxes are easy...
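
The arithmetic in this comment can be written out as a small sketch. It takes the comment's own assumption of a flat 15% loss against linear scaling and simply adds up clock speeds, which ignores differences between CPU architectures, memory, and disk. The function name is made up for illustration.

```python
def effective_mhz(clock_mhz, cpus, smp_overhead=0.15):
    """Aggregate clock after a flat multi-processor penalty.

    Single-CPU boxes pay no penalty. The 15% default is the comment's
    guess, not a measured value.
    """
    raw = float(clock_mhz * cpus)
    if cpus == 1:
        return raw
    return raw * (1.0 - smp_overhead)


if __name__ == "__main__":
    dual_700 = effective_mhz(700, 2)
    single_1300 = effective_mhz(1300, 1)
    print(f"dual 700:    about {dual_700:.0f} MHz effective")
    print(f"single 1300: about {single_1300:.0f} MHz effective")
```

Under that assumption the dual 700 box lands below the single 1.3GHz chip, which is the comment's point; a workload that scales better or worse than 85% of linear would shift the comparison.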
  • Are you kidding? The markup at most vendors is large. Would love to grab VA's stuff, but it's considerably cheaper to build yourself. I mean, these folks want to charge 3-4,000 for 2G of RAM, which would cost me $400 to buy myself.
  • Listen, I RUN a server farm with Quad Xeons, and I can TELL you that the naive view that performance scales linearly with processors is wrong. Have you EVER tried benchmarking a single processor machine against a quad? Using the stable 2.2 kernel series? Somehow I doubt it. You will quickly find that 4 procs != 4 times web content served.

    For frontend throwaway webservers cheap single proc can be great.

  • Excellent. Yes, found the 3Ware's and they look like the right solution. Cost effective, solidly behind linux. They look like *it* from the ATA RAID perspective

    Love the fact that the 905C's get their features used in Linux, just what I was interested in hearing. It is curious that even the linux vendors seem to like using the eepro.

    zero-copy networking yum, just wish we could wait on these servers till Apache 2 shakes out with PHP support.

  • by augustz ( 18082 ) on Wednesday April 25, 2001 @08:20PM (#265033)
    Making a decision tonight on 6 servers, and so far, couldn't track down a good linux buying guide. I'd love to hear what the slashdot hordes have to say, and perhaps we can come up with a good system.

    For 5 frontend webservers looking at 1Ghz AMD Athlons, 512MB RAM, with an ASUS KT133A motherboard. Looks like the 3COM 905C for a NIC and an IBM Deskstar rounds out that package nicely. $900/each with the 2U case.

    The backend database server is trickier. The FastTrak 100 RAID controller, a nice IDE RAID solution, is not supported under Linux. What are the good (and cheap) alternatives for RAID 1 or 10? Will Dual 1GHz pentiums really beat out a 1.3 GHz athlon? How about a nice NIC that works under Linux? $2000 would be fine...

    I suspect I'm not alone in wishing there were some good solid sites that had some recommended systems for Linux. As is, I end up wading through a lot of Windows tech sites to find that something is not supported under Linux. The hardware compatibility lists don't discriminate between worthwhile products and overpriced junk. And it'd be great to know this product works great and these manufacturers actively track kernel development etc...

    Pointer, comments and experiences welcome.

  • I have one, running 2.2.18. It works fine... at 10Mb/half duplex :-(
  • I'm always super interested when I see an article like this. Is there a site that is dedicated to having a free and open discussion about this sort of thing like "Hey, I'm using MySQL on a Dual PII-500 and I'm having a problem with XXXX, do you think I should upgrade to dual 800s, or just add a second database box?"

    I think this would be beneficial. Thanks ahead of time.
  • It is my understanding, based solely on hearsay, that ColdFusion on any platform does not scale at all.
  • Let me preface this by saying that I know fuck-all about using Linux in a high-availability environment. Most of my experience comes from running Solaris.

    I can't comment about RAID controllers. I've used Network Appliances. You've heard of them right? Bigass $100k file-servers that allow you to rebuild, resize, and just generally do anything you would like to on the fly. Nothing short of a double-disk failure will cause data-loss. I lost a motherboard, the box was down for a whole day (8 hours, an hour or so to troubleshoot, 4 hour response time on the hardware, and another hour or so to get it running again with a new motherboard) effectively causing around $50k of lost productivity. The fact is, no matter how elegant a system you have, if you put all your eggs into one basket, your basket will fail.

    I don't know of any server vendor that offers hot-pluggable CPUs. I know Compaq offers some hot-pluggable PCI stuff which hasn't been supported under Windows until just recently. Apparently it's still buggy. Novell has had support for years. According to your logic, we should all be using Novell?

    Yeah, Linux doesn't have monitoring utilities. I suppose MRTG doesn't count. It's not really a Linux tool as much as it is just a Unix tool. It's open-source and it *will* run on NT, though it's difficult, at best. Most of the ISPs I know use it as a monitoring tool. As a matter of fact, most of the best management tools that I'm aware of are based on either UCD SNMP, which is open-source and ships with most distros of Linux, or perl in some fashion or other. Apparently perl works great for juggling OIDs around. Oh, I'm sorry, you were referring to the Industry-standard monitoring tool, the bloated, expensive, difficult-to-use and really-only-runs-well-on-unix, HP Openview.

    "People who are willing to spend the money for real support often don't care as much about Linux." I'm sorry, your conclusion is wrong. People that have been marketed to effectively by the MS juggernaut often don't care about Linux. People that are wise and are concerned about completing the task at hand will weigh the benefits of using an open-source solution vs. using a closed-source solution. Often this excuse for using proprietary software is that "we'll have somebody to sue if it breaks" which is bullshit. Indemnification clauses in contracts prevent that. Sometimes it's just easier to be able to modify the source-code and fix the problem yourself. Sometimes it's easier to hire someone to do it. Sometimes you'll need things like massive scalability and transactional processing capability and you'll end up using something like Oracle on a 64-processor unix box.

    I'm afraid I can't address that last paragraph without being a troll. I'll just list a few of the benefits of using a W2K w/ MS/SQL:

    PC/Anywhere Administration (ok, windows terminal services, it's a gui nonetheless)
    MS Support
    IIS
    next->next->next->finish (mouse elbow)
    VBScript
    ASP
    Random Reboots
    ODBC

    I'm sure there's a host of other things I could come up with, but fortunately, I don't have to use Windows that much (as a server).
  • There is a good article on Anandtech.com about how they upgraded their backend

    And now they want to test their upgrade with a good old front page Slashdot link.

  • $400??? Please tell me where I can get 512 meg registered ECC SDRAM for $100 a stick.
  • "You can pick up nicer ones for $500, but then the cost of the case becomes pretty significant in the system cost."

    ...only if you're building crap servers and don't care about reliability. Get a good system. $500 is not that much if you're using it to enclose good components. And remember that if you buy the $200 case, you'll have to spend another $100-150 to replace the powersupply, because the one that comes with it is crap and will likely fail in weeks/months.

  • But what about getting a power supply with good filter caps, etc? Your ps might not fail, but it might cause your computer components to, especially if your UPSs have non-sine wave output or aren't online.
  • Our website handles about 90000 hits a day. It is a dual P2/450 with 512MB of RAM running Cold Fusion/Mysql. I agree that without hit statistics, this is more of a gee whiz article than anything else
  • For network card, you should also consider Intel EtherExpress 100. From what I heard, it's been supported for the longest time and its driver is the most mature.

    Blah. I had troubles with this card a few years ago, mostly with multicast AppleTalk (netatalk [sourceforge.net]) traffic. The driver may be mature now, but a year or two ago it wasn't, as I found a lot of list postings from others having problems with it. Swapped a 3Com in instead and had no problems. In fact, I have never, ever had a problem with a 3Com card under any OS, so that's what I've been sticking with over the years.

    --
  • Heh, yeah.. Rusty actually said "do you remember what happened last time?" to me. Although I was able to (barely) use the site when it happened.

    We're working on adding some more hardware over the next two months.
    --
  • by Inoshiro ( 71693 ) on Wednesday April 25, 2001 @11:49PM (#265045) Homepage
    This is kinda useless. Yes, they tell us that they are running 15 servers total all on 1Ghz PCs, but they do not tell you what kinda hits they take on it.

    K5 [kuro5hin.org], for example, has been able to take several direct Slashdottings on 1 VA Fullon box. 1 box which does MySQL, Apache w/ Mod_perl, and plain image serving Apache. (DNS is handled by other boxen). We handle about 65,000 to 70,000 hits a day (on average, mod_perl only.. no images traffic) with that one dual processor box. Vs the two dedicated dual proc DB servers, 11 web servers, two load balancers, etc of Anandtech. And we're at 8 months uptime [uptimes.net] with our single server. Sounds a bit better than requiring a load balancer which has to remove downed NT servers from the pool..

    I could theorize on how well their Cold Fusion/NT solution stacks up against my Slackware/Apache/mod_perl/MySQL solution IF they were so kind as to give info on hits. Without that, this is just another point-and-drool at some RAQmount stuff which performs a job somewhere, somehow.
    --
  • OK, what about Slashdot? What does this page run on? As I recall, much less than what AnandTech runs on.

    Now don't flame me but...

    I have stopped reading anandtech because most of their "reviews" and "interviews" are just time wasters. Yes, I visit their site once in a while now, but I no longer post to their boards nor do I visit their site each day.
  • From what the article mentions, this isn't a matter of improved stability over linux, rather that Coldfusion for windows was more stable than the less mature Linux port.
  • Definitely worth a good read -- And a good lesson to those of you who would do something as crazy as trying to skimp on the motherboard when building your athlon system!!

    It's unfortunate that the article doesn't go into further technical details, but it's an interesting and useful read all the same. In particular the round robin DNS good and bad sides and the linux virtual server project.
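
For readers who want the technical detail the article skips, here is a minimal sketch of the difference being alluded to. Plain round-robin DNS keeps handing out every address in rotation, including those of dead machines, while a load balancer can skip servers it knows are down. This is only an illustration of that idea, not how the Linux Virtual Server software is implemented; the class and host names are hypothetical.

```python
class RoundRobinPool:
    """Hand out servers in rotation, skipping any marked as down."""

    def __init__(self, servers):
        self._servers = list(servers)
        self._down = set()
        self._next = 0

    def mark_down(self, server):
        self._down.add(server)

    def mark_up(self, server):
        self._down.discard(server)

    def pick(self):
        # Try each server at most once per call, starting where we left off.
        count = len(self._servers)
        for _ in range(count):
            server = self._servers[self._next]
            self._next = (self._next + 1) % count
            if server not in self._down:
                return server
        raise RuntimeError("no healthy servers in the pool")


if __name__ == "__main__":
    pool = RoundRobinPool(["web1", "web2", "web3"])
    pool.mark_down("web2")  # e.g. a failed health check
    print([pool.pick() for _ in range(4)])
```

Something still has to run the health checks and call `mark_down`; that part is left out here.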

  • I never said that Linux was more unstable, I said that Cold Fusion + Win2K was more stable than when they used it with Linux (which, I stand corrected, was Solaris before. My bad)
  • My personal preference would be to drop OpenBSD on the web servers. The security by default makes tweaking it to your secure settings pretty simple. Yeah, the performance isn't up there with FreeBSD or Linux, but to me it is adequate. If I'm throwing beige x86s at the problem anyway, I can just throw one more if I want to compensate for the speed.

    For the database, what are you doing? Are you deploying administrator entered data (in which case transactions are meaningless, go throw MySQL up there), or are you really getting and processing data.

    One compromise (for added complexity) is to run two database servers. Put a FreeBSD or Linux Server with MySQL (I prefer PostgreSQL for development, but MySQL DOES Scream, Postgres is starting to annoy me on one of my platforms) hidden from the Public network. Load it with RAM and RAID 0 drive system. Just run the backup after you make changes, and you're good to go.

    Then, for the live, interactive data, buy a real database server. Design it, pick your database (DB2 or Oracle are both fine), and then pick your platform (IBM, Sun, and HP all make nice machines) and put it there. Hire someone who knows what they are doing to put this machine together and configure it, then DO NOT touch it. :)

    This way you don't need to shell out big bucks for an Oracle Server for the content that MySQL is fine with. As you pay for the #Processors*MHz*RISC with the big boys for the database license, if MOST of your content is going to be "static dynamic" data (where you update once an hour/day/week, if that), there is no need for Oracle to touch it. Let Oracle process your transactions, and stick your "content" on MySQL.

    Consider something, most database backed websites aren't really database systems. Most use a database as a convenient place to store info and retrieve it. With a "database system" I'm using relations to develop complex systems that can be cross-referenced with my keys, monitoring transactions, real-time changes, etc. If I am writing articles to dump on a server, yeah the DB is convenient, but Oracle isn't necessary.

    As all of us here can admin a PHP/MySQL or PHP/Postgres solution, keeping the stuff that we work with there makes sense, then pay someone else for the real iron.

  • I think the statement that "mysql does a great job" is like saying "a chainsaw does a great job". A great job at what? Does a chainsaw do a great job of cutting holes in metal? Does a chainsaw do a great job at cutting tomatoes in the kitchen? Does mysql do a great job storing transaction-sensitive data (such as accounting data)? Does mysql do a great job on committing data across multiple database servers (multiphase commit)?
    The REAL statement is "choose the right tool for the job". In some cases it's mysql, in some postgres or interbase, in some it's sybase or oracle or the like. Same with OS and web server software. Sometimes it's apache on linux, sometimes it's Zeus on AIX, sometimes it may even be IIS on NT (I don't know when that would be, but I guess the situation could theoretically exist). Whatever is the right tool for the job is what should be used.

    maru
  • After learning a few hard lessons, AnandTech moved on to the platform that ColdFusion was the most mature on, "NT". AnandTech also switched from Oracle 8i to a SQL7/2000 environment, which greatly improved administration time. There was no performance drop from Oracle 8 to SQL 7. In fact, in some cases, performance improved.

    Moving from Solaris to NT just to overcome some problems with ColdFusion??? Why didn't they just dump ColdFusion and move to a better technology like PHP or JSP/Servlets? I know changing the scripting language needs a lot more effort than changing the O/S, but in the long run it pays back...

  • by enneff ( 135842 ) on Wednesday April 25, 2001 @10:38PM (#265053) Homepage
    Rather than upgrading all their hardware ($$$), they could've just switched to apache/php and upped the efficiency of their existing hardware. It is hardly surprising that they were having trouble with solaris Cold Fusion implementations. The engine is flaky enough on its native platform!

    IMO, php, perl, or perhaps python (I have no experience with it, though) would all make better alternatives to Cold Fusion. (aka Server Side Scripting For Dummies, not that it doesn't have its place)

    I suppose they could also install a BSD or Linux to cut down on useless cruft (like GUI, etc) running on their web servers, (flame-retardant suit on) but then again I suppose many would argue that Linux contains a huge amount of cruft in the first place ;)

  • Aw, jeez people give me a break!

    Shrimpi writes:

    "Randomly we'd get calls from the datacenter saying that one of the AnandTech Web Servers would be locked, and connecting a console to the machine would reveal nothing more than a black screen, no video, nothing. Resetting the machine would almost never work, often times resetting the BIOS would be necessary in order to get the machine to POST again.

    Quick searches through Deja News and our own AnandTech Forums revealed that quite a few users have had similar experiences with MSI boards, particularly the MSI K7T Pro2A. And most recently, we actually duplicated the problem in the lab with the MSI K7T266 Pro. It was clear that this wasn't an isolated incident to our servers, rather a much larger problem."

    So he runs production servers on $75 mobos and $100 cpus and NOW he is surprised they don't work well and others are having the same problem?

  • by bellings ( 137948 ) on Wednesday April 25, 2001 @10:29PM (#265055)
    It sounds like you're building a webserver farm where the failure of a webserver is a non-issue -- you're willing to deal with the possible loss of even a few dozen transactions if one of those servers abruptly fails at a critical time. If so, your farm of a few generic white boxes running linux or BSD is ideal.

    But as you're discovering, on the back end -- the database machine, where failure is much less tolerable -- it is going to be much, much harder to spec out a machine. Your example of hardware RAID is the most surprising area where Linux simply falls down. Even though cards like Compaq's SmartArray RAID controller have working Linux drivers, there are no Linux utilities.

    For far too many of the RAID controllers out there, reconfiguring or rebuilding the arrays either involves shutting down the machine and rebooting into a stand-alone utility, or "echo"ing pretty much undocumented commands into your /proc/scsi filesystem. If a drive fails, you will have downtime. The Linux RAID solutions might protect your data integrity, but they will not protect your uptime. I couldn't even imagine trying to hot-plug some new drives into the array, and then resizing the live filesystem when I needed more room. (Someone will try to refute this, and will use the word "ReiserFS" in their post, but all I have to say is that there are a lot of possible things I couldn't imagine actually doing).

    Things like hot-pluggable CPUs, or network cards, or drive controllers, are simply non-existent on Linux. Stuff like monitoring utilities that will page you in the middle of the night when one of your redundant power supplies fails, or when one of your CPUs burns out, or when one of the drives in your storage arrays dies, simply don't exist. There's a lot of stuff that you need if you want to run 24x7 that just doesn't work in Linux yet.

    A big part of that is because of posts like yours, of course -- you don't want to spend more than $2,000.00. It's pretty damned hard to build a well supported server for $2,000.00, and the people who are willing to spend the money for real support often don't care as much about Linux. You're left with a lot of after-thought solutions -- stuff that got built for something else, and just happens to work on Linux by an almost happy accident.

    Carefully consider what it will cost you to lose one transaction, or all your current transactions, or all the transactions since your last backup. Carefully consider the cost of downtime. Consider the cost of your time, when you're trying to figure out how to reconfigure your RAID controller at 2 in the morning on a Sunday, and the tech support guy has never even heard of Linux. It's certainly possible that Linux is going to be fine for you -- it's fine for a lot of websites. If you need more, and you want to spend under $10K, consider running W2K and MS SQL. As much as it pains me to say it, I'm afraid I'm serious. If you really do need more than that, be prepared to spend some cash...

  • Exactly. Well said. Use the vendors. It will cost you an extra 5% (I don't really know for sure what the markup is but it can't be that much). What happens if you build the box yourself and find there's some weird chipset incompatibility? I think that 5% is pretty good knowing you'll get a tested box and have someone to call.
  • by Queuetue ( 156269 ) <queuetue@gm a i l . com> on Thursday April 26, 2001 @04:28AM (#265057) Homepage
    Almost every car has a water pump. Any schmuck can design a $400 water pump that does its job. A real engineer can design one that does its job for 20 bucks.

    As a network engineer, coder, or architect, your job is to make your client's project a success and deliver the most value possible. Your job is not to cover your ass, or to get the project done with as little of your sweat as possible.

    That's the unstated contract that drives tech jobs to get the salaries they do, and the "Let's just throw Solaris at it," or "Microsoft is easier - they say so!" attitude is undermining the tech industry.

    For web hosting, networking and network servers, Linux or a BSD on X86/AMD gives more bang for the buck than any other offering right now. Even though they have less froofy tools and don't have as much in the clicky GUI department. Yes, you have to be careful with hardware choices - You don't with NT? Yes, you need to know what you're doing - why do you have this job if you don't know what you're doing?

    MySQL does a great job, no matter how much it may suck "theoretically", or how it may fail the ACID test. Compare the number of MySQL horror stories to the number of MySQL positive anecdotes you hear. It works, it's fast, it does the job.

    Building solid, cheap boxen and deploying them in a stable configuration at low cost is an art. Call them 'frankenstein boxes' if you want, but they get the job done, and they let me come in with quotes at half of what my competition does, and the stability and value get me return business.

    Buying Solaris and having them do all the heavy lifting for you generally means you haven't done the value calculation for your clients. Solaris is good, and Sun hardware is good - but neither is *that* good.

    Buying Microsoft - I hate to bash MS for the sake of bashing MS. Win2k *is* their best offering yet. But it still bluescreens, DLL Hell still exists, it still needs to be reformatted every few months, it costs too much, it eats RAM, and not being able to remotely maintain it or to see the source so I can figure out what's causing strange behavior, coupled with MS' predatory business practices just makes it a non-option for me. YMMV.

    The days of the Mercedes dot-bombs are over, and tossing cash at problems isn't the way things are being done today. Look around you and start gearing up for the Toyota world of working fast, smart and as cheaply as possible to get the job done.
  • As for Coldfusion taking your server down... Bullshit.

    Ooh..them's fightin' words. Obviously you've never attempted to perform any kind of complex regular expression matching in ColdFusion. It crashes every time. I have absolutely nothing to gain by wasting my time typing this out on a web forum, and therefore have absolutely no motivation to embellish the facts...and the fact is, specific ColdFusion code has proven to crash the ColdFusion server process without fail.

    It also has no way of handling complex recursion...if you throw it into a recursion loop, even one that would be nothing short of *trivial* in any other language, it dies. Dies as above with regexp and has to be restarted.


    Coldfusion has its strengths and weaknesses... But bad stability is not one of them.

    Have you ever written a web-based app in Perl or PHP on a Linux/UNIX box running Apache web server? Somehow I seriously doubt it...you just simply wouldn't've said such a thing if you had. The bar is raised far, far above the one you're used to friend. Give it a spin sometime. You'll love it.


    (Not great stability, but what the hell, I'll take that with the ability to put together a site or application twice as fast as the next guy.)

    Twice as fast as the next guy coding in what language? COBOL? It's nothing short of fact that ColdFusion requires more typing to do less work than, I use the example once again, PHP or Perl. You never see any Obfuscated ColdFusion Contests, now do you? No, that's because you're still working inside this tag analogy...building tag structures in addition to your logic structures is an unnecessary pain in the ass. If you want this kind of ease of use, you might as well use Zope or ArsDigita so you'll be delightfully abstracted from the big mean world of "real" programming as a "real" programmer might tend to feel when working in ColdFusion.


    "Sweet creeping zombie Jesus!"
  • The phrase "runs rings around" seems appropriate.

    Amen, brother.

    <CFQUERY NAME="whyOhWhy" DATASOURCE="twentyTwentyHindsight">
    SELECT reason FROM brain WHERE coldfusion='good idea' ORDER BY expense,stability,speedOfDevel DESC
    </CFQUERY>


    "Sweet creeping zombie Jesus!"
  • From what the article mentions, this isn't a matter of improved stability over linux, rather that Coldfusion for windows was more stable than the less mature Linux port.

    <rant>
    I love comparative terms like "more stable"...they do a great job of avoiding the messy details, like how ColdFusion (my experience is on NT) will take your server DOWN if you try to perform just about any kind of simple regular expression matching. To say nothing of the fact that you're not going to get it to return the matches...no, you get an offset of the starting point of the first match. How bloody useful!

    Functions you say? What are those???
    </rant>
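
    For contrast with the offset-only result described in the rant above, here is a small sketch of a regex API that hands back the matched text and the offsets together. It uses Python's standard `re` module purely as an illustration; the sample string and pattern are made up.

```python
import re

# Hypothetical sample text and pattern, chosen only for illustration.
text = "Dual 700MHz Durons, one 1GHz Athlon"
pattern = re.compile(r"(\d+)(MHz|GHz)")

# Each match object carries the matched text, the captured groups and the offsets,
# so the caller never has to slice the string by hand.
matches = [(m.group(0), m.start()) for m in pattern.finditer(text)]
speeds = [(int(m.group(1)), m.group(2)) for m in pattern.finditer(text)]

print(matches)
print(speeds)
```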

    Can you tell I currently have to "code" in CF for my day job? Thankfully our next project there will be done in PHP or Perl. The (new) lead programmer finally got the suits to realize that ColdFusion, and originally hiring a guy who didn't know the first thing about programming to put it together, is why their site sucks.

    I'll see you in hell, ColdFusion...


    "Sweet creeping zombie Jesus!"
  • One of the things I wonder about though is the "dual processor" factors, which has many people going gah-gah over. Dual 700mhz's may sound nice, but to only serve up web content I wonder how is that better than just 1 700mhz chip or a 1ghz Athlon for that matter (anyone care to comment?)

    At my company, we made the switch from using sparc servers to x86 hardware for our front end a couple of years ago. The sense I got from the operations guys was that the second processor helped to take the edge off some of the load due to I/O. Since we're going with semi-cheap hardware we used IDE drives which does add some load on the CPU as opposed to using SCSI drives.

    With all things you need to evaluate your trade offs. For a front end web site, I don't think you should have more than 2 processors in a server. Now I haven't been pricing or shopping for servers lately, but I think it would be hard to have a dual processor server in a 1U case. So if you're tight on rack space, buying single processor 1U cases may be the way to go. Or maybe you're not tight on rack space and the budget will allow for the second processor in the server, then you go with dual processor 2U cases. Hopefully this setup might give you a few more minutes against the /. effect. :)

  • For a web DB there's no need for any RMDBS[sic] features so - with SQL - MySQL's speed would be better than features unused in Postgres.
    I beg your pardon, are you just completely fucked in the head? Where did you form the opinion that everything DB programmers have been working with for 30 years is made irrelevant the moment you connect the DB to port 80?

    What the fuck is the matter with you?? Please, leave your name & email so I can make sure never to work with you.

  • I wasn't trolling, just you expressed perhaps the most flippant non-AC opinion I heard on /. today, so I was staying on the same level.

    Moving on....

    most website DBs don't need RMDBS features like rollback or anything at all fancy. All they require is blob, a few strings, and a key.
    ...and indexes of many shapes and colors, smart query optimizers, & full-text indexing. Row-level locks for speed, MVCC for even more speed. And that's for a /. -type content site. Going to program anything complex? Referential integrity to the max; stored procedures; at least *trying* to implement SQL-92. Is your data in any way, shape, or form valuable? Transactions, (& write-ahead logging for speed) are not a fancy feature, they are *life* for anything critical.

    ...pant..wheeze...

    Bottom line is I spend way too much time dealing with people who treat the DB like a flat-file, then code all the DB-logic into the application, making every conceivable error in the process. (and of course it's slower in the end.) After 8 years web developers seem to agree that we should build on high-level substrates, like a real DB, rather than reinventing the wheel and watching our companies go bankrupt in the interim. Please don't advocate going in the other direction.

    These guys didn't need a powerful database - so, considering the needs - why not use something less featureful but faster (like MySQL).
    And do you know the kind of hardware they throw at it? Have you noticed how often /. serves up non-comment pages b/c mySQL is deadlocked? Did you notice when sourceforge dropped MySQL in favor of Postgres b/c MySQL just can't scale? Yeah, obviously /. doesn't need to pay Oracle a million USD$, but a little knowledge of mature databases would go a long way.

    Good jive @ the top, btw. Later.

  • For a 'webDB', which you seem to be insinuating has similar semantics to /., content is, indeed, not in the same league as a bank's DB. No question. And if your needs are almost entirely read-only (something like 1000-1 read-vs-modify), then mySQL will do your job. But if you take lots of user input, or logging, or smart personalization, then mySQL will kneel and die under the load b/c of poor granularity locks. If you need to write data into multiple columns, and the aforementioned load is being brought to bear on your site, then your data will be corrupted b/c you have no transaction support.
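
    To make the multi-write point concrete, here is a minimal sketch of what transaction support buys you. It uses Python's built-in `sqlite3` as a stand-in engine (an assumption for illustration, not MySQL or Postgres), and the `accounts` table and `transfer` helper are hypothetical. Two related updates either both land or neither does.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER CHECK (balance >= 0))"
)
conn.execute("INSERT INTO accounts VALUES ('alice', 100), ('bob', 0)")
conn.commit()


def transfer(conn, src, dst, amount):
    """Move amount from src to dst; either both updates apply or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?", (amount, dst)
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?", (amount, src)
            )
        return True
    except sqlite3.IntegrityError:
        return False


def balances(conn):
    """Return the current balances as a dict keyed by account name."""
    return dict(conn.execute("SELECT name, balance FROM accounts"))


ok = transfer(conn, "alice", "bob", 30)       # fits within alice's balance
failed = transfer(conn, "alice", "bob", 500)  # second UPDATE violates the CHECK
print(ok, failed, balances(conn))
```

    In the failing call the credit to bob has already been written when the debit is rejected. Without the rollback, bob would keep money that never left alice, which is exactly the kind of half-applied write described above.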
    I would use this less-featureful software and throw money at hardware.
    A fantastic approach, and one I always take. But you need to make smart cost decisions here: "does this product have, for its low price, acceptable speed & reliability?"; eg, "under heavy load, will mySQL force me to spend more on hardware than I would have saved on software licenses?". In my experience, Postgres vs. mySQL gave exactly the expected performance: mySQL started strong but buckled under several concurrent users, where Postgres gave a more linear growth pattern. Less code does not equal more performance; smart architecture designs for heavy demands. And don't we *want* lots of users?

    BTW, if you want to really talk specifics, I'll explain how MVCC gives an easy speed crown to Postgres over mySQL. But be prepared to be bored.

    I have never received a comment-less slashdot page, though I'm quite sure it happens. I don't think one example proves MySQL's shoddiness as you insinuate (this is what you are saying?).
    Just giving you an anecdote; but it is an anecdote of one of your pieces of evidence. If you want to see /. kneel, read a lot of /. around 2-4 pm EST. If you were here in 1999, you would remember /. growing by leaps & bounds and often suffering under bad DB performance until they got /. onto a larger machine (and then machines - I think they're still on quad-xeon VA Linux boxen). You might like this:
    Here's a quote from an article by the Slashdot crew posted on April 28th. It's primarily describing their new hardware setup, but they talk briefly about increasing database reliability:

    "As for fault tolerance, we're working on two fronts:

    First we're funding development efforts with the MySQL team to add database replication and rollback capabilities to MySQL and that's coming along on schedule (and yes, these improvements will be rolled into the normal MySQL release as well)...(I know what you're saying 'Why not use Oracle...', well just because... you know... the zealousy thing...)"

    Note the fact that the Slashdot crew think that transaction semantics are so important that they're FUNDING DEVELOPMENT. The MySQL crew might not get it, but one of their most visible users do.

    That's quoted from openacs.org/philosophy/why-not-mysql.html. The argument is still running; you could provide your 2cents.
  • Right now my company is doing great on 1 Dell Poweredge server. With dual 1GHz's and 1GB of ram we use it to serve up 6,000,000+ impressions a month. The server also has Raid-5 with 4 9.1GB 10k SCSI drives and those seem to help a lot, they are all hot swappable too, our hardware raid has 256MB of onboard memory too. The server also does all the database operations and the biggest load we really see on the server is .91 at peak time. We move somewhere along the lines of 400GB of data monthly but this was definitely a pretty interesting read. If banners weren't doing so badly right now I would love to have a similar setup.
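
    For a sense of scale, the monthly figures in the post above can be turned into average rates with a little arithmetic. This is only a back-of-the-envelope sketch: it assumes a 30-day month and decimal gigabytes, and it says nothing about peak load, which will be well above the average.

```python
SECONDS_PER_MONTH = 30 * 24 * 60 * 60  # assumes a 30-day month

impressions_per_month = 6_000_000      # lower bound quoted in the post
bytes_per_month = 400 * 10**9          # 400 GB, assuming decimal gigabytes

avg_requests_per_sec = impressions_per_month / SECONDS_PER_MONTH
avg_mbit_per_sec = bytes_per_month * 8 / SECONDS_PER_MONTH / 10**6
avg_kb_per_impression = bytes_per_month / impressions_per_month / 1000

print(f"{avg_requests_per_sec:.2f} impressions/s on average")
print(f"{avg_mbit_per_sec:.2f} Mbit/s on average")
print(f"{avg_kb_per_impression:.1f} KB transferred per impression")
```

    That works out to roughly 2.3 impressions per second and a little over 1.2 Mbit/s averaged across the month, which fits with a single well-specced box coping comfortably.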
  • We crashed several 75GXP deskstar drives on ASUS A7V motherboards running on the promise controller - this configuration seems to be similar to what you are suggesting. IBM suggests not to run these drives with the promise controllers on these ASUS boards since they seem to get heaps of the drives back.
  • BRL [sourceforge.net] is easier and more powerful than CF, especially for database-driven apps. You can also make use of Java objects easily. I'm biased, but I'm right.
  • All I say: Do not underestimate the power of a 10mbit hub. It's the best way of handling traffic load, and it's REALLY cheap nowadays.
  • A word of warning on the NICs: use kernel 2.2.19 or a 2.4 kernel if you want to use 905Cs. 3com updated some hardware, which led to timing issues in older drivers. I know, I have one of the newer 905Cs and it didn't work until now (works great now though).

    HTH,

    Mart
  • by jsse ( 254124 ) on Wednesday April 25, 2001 @10:09PM (#265070) Homepage Journal

    Under a different brand name, Microsoft proudly presents Clustering on a Budget [zdnet.com] - and the meaning of the term "on a Budget" in Microsoft's dictionary ($28,000 for two clustering PCs).

  • Can you provide more info or links to related info? Are the drives getting corrupted because of problems with the Promise chip? Are the interfaces dying due to out of spec signalling? What? I just put a new system together with dual 75GXPs connected to my A7V133 Promise controller in a RAID-1 config. Any more info you might have would be very much appreciated!

    --

  • In looking at the pics on the page, it seems the first boards they used (MSI K7T) were not only MSI boards, but indeed had onboard audio as well....what's next? onboard NIC's and VC's?
  • I get about 2mill + a month, and never bothered putting up banners since they're so full of shit it's not even funny. Maybe you should try target marketing, contact local businesses around your area and tell them the scoop... X Amount of visitors with X amount of hits for X price a month.
  • by deran9ed ( 300694 ) on Wednesday April 25, 2001 @08:05PM (#265074) Homepage
    Nice little chunk of money saved by using Linux virtual servers over Arrowpoint, however I would like to know how a high content site would hold up with a lot of those perl scripts running to cache, one of the possible problems you won't find with Arrowpoint, Alteon Ad Directors, Netapps, not to say they're better, but the article did mention "Big Budget", aside from that some information on traffic handling would have been nice to show, e.g. amount of data passed into the network would give insight as to why they may have chosen to go via certain routes (not routers, or routing protocols.. choices) versus others.

    I remember some of the guys where I'm at did some overhauling, and when we were doing the firewalls, instead of ordering 4-5 Nokia's or looking into other fw's, we ended up getting one Nokia 650, and since we were running FreeBSD we threw on ipf on all the boxes and created rules to eliminate the load of ACL's, and the FW load which was actually cheaper than buying x amount of new firewalls, and since we jumpstarted most of the machines, we had a slew of tightened security scripts for Sun, and BSD's to have an auto locked down network no matter how much shit was upgraded.

    One of the things I wonder about though is the "dual processor" factors, which has many people going gah-gah over. Dual 700mhz's may sound nice, but to only serve up web content I wonder how is that better than just 1 700mhz chip or a 1ghz Athlon for that matter (anyone care to comment?)

    As for switching from Oracle to SQL7, sounds like a good move, however again there's no mention of how much data goes into their database, so while it may suit them, what about mega sites like Yahoo, I wonder how they would stand up to SQL over BSDs, Linux versus a nice Sun E10k running Oracle?

    Well they certainly have a pretty cool network, I wish they would have included actual network information as well such as router info, traffic stats, etc., now they would have blown my mind had they said, they're running strictly Zebra [zebra.org] on a Nix box versus a Cisco or Juniper ;) but then again this was a semi "Big Budget" article, not a Poor Man's Network which in my case would be my Cisco 1xxx series running Zebra and GBGP (what you know about that.. Ghettotized BGP werd), 400mhz i386 running OpenBSD for the website, my spanking U1 for db stuff, ghettotized rj45's I found, with stolen bandwidth running out "Moving Day to Day Networks" run from my garage, and a C64 for DNS (fear)

    Blackbox Themes [antioffline.com]

  • Actually they didn't switch to NT from linux. They switched from Sun Solaris to NT. Furthermore, they had to switch over to NT because the Cold Fusion port to Solaris was unstable, not because of the operating system.

    As far as I can recall from reading their previous server article a year ago the load balancing nodes are the first time Anandtech has used Linux.
  • by janpod66 ( 323734 ) on Wednesday April 25, 2001 @08:06PM (#265076)
    Load balancing via IP routing tricks is kind of nice. Mosix [mosix.org] goes one step further and allows live processes to migrate across a cluster. Experimental add-ons also will do socket migration and have distributed file system support. I think that's the kind of approach to clustering you are going to see in the long run.
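
    As a toy illustration of the decision a load balancer makes for each new connection, here is a least-connections picker. LVS offers several scheduling policies, least-connection among them, but this sketch is not LVS code and does no packet forwarding; the class, its methods and the server names are all hypothetical.

```python
class LeastConnections:
    """Pick the backend with the fewest active connections (ties go to the first listed)."""

    def __init__(self, servers):
        # Track how many connections each backend is currently handling.
        self.active = {name: 0 for name in servers}

    def pick(self):
        """Choose a backend for a new connection and count it as active."""
        name = min(self.active, key=self.active.get)
        self.active[name] += 1
        return name

    def release(self, name):
        """Record that a connection to the given backend has finished."""
        self.active[name] -= 1


lb = LeastConnections(["web1", "web2", "web3"])
first_four = [lb.pick() for _ in range(4)]  # fills each box once, then wraps to web1
lb.release("web2")                          # web2 finishes its request
next_pick = lb.pick()                       # web2 is now the idlest backend
print(first_four, next_pick)
```

    Process migration as Mosix does it goes further than this: the balancer above only decides where new connections land, while migration moves work that is already running.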
  • I very very very very much doubt that atech gets that much. CF & NT are pretty slow compared to highly tuned *nix boxes, and 5 of those servers are for forums too. (Makes you wonder about their forum system... )
  • Linux, and BSD (my favourite) make excellent use of minimal resources.
    I ran an ISP off of simple single-processor 200MHZ machines. All ran FreeBSD, and there was never a complaint.
    I just wish I could've convinced my bosses to get an AMD system last year. What a buy for the buck.

  • I support the increasing shift of IT people into less dominant market areas in hardware and software.

    AMD instead of intel, Linux instead of Windows.

    This big sites application is a sort of branding model.

    The technical applications of each hardware or software solution vary. what is concludable at this point is that monopoly based business models are disfavored the more people use less dominant technologies.

    If folks can switch from Colgate to Aqua Fresh for stupid reasons, people can save money and take a little more time with their technology to break out of the monopoly mold.

    It's basically a matter of building brand loyalty.
    This spork says, Fuck the FBI [wired.com]
  • I've been working with Cold Fusion to create a large scale web site at work, and it simply isn't up to the task. I've found that it is poorly documented (CFEXECUTE), and I've run across functions that do not work as described (CFHTTP).

    Trivial, database-backed sites can be created very quickly, but creating anything of substance is like kicking a dead whale down the beach. Furthermore, the syntax is inconsistent, and there's not a strict separation between code and HTML output. I can see how the syntax might please an art director who only knows HTML, but it looks like a mess to an actual programmer.

    I've crashed the Cold Fusion server (on NT 4, not UNIX) with simple regular expression matching and recursive custom tag calls. I've never had to contend with shell scripts, perl, PHP, or C 'going down.' In order to complete the project, I've written core functions in Perl. Given the time and opportunity, I would rewrite the entire site with ASP or PHP.

    It's a dark day when Microsoft can outdo a well funded, independent company. It's slightly less of a surprise when the open source community bests them. The phrase "runs rings around" seems appropriate.

    <CFCRAP>Cold Fusion #blows#</CFCRAP>

