Database Clusters for the Masses
grugruto writes "Database clustering is no longer the privilege of a few high-end commercial databases; open-source solutions are striking back! ObjectWeb, an Apache-like group, has announced the availability of Clustered JDBC (or C-JDBC). C-JDBC is open-source software that implements a new concept called RAIDb (Redundant Array of Inexpensive Databases). It is simple: take a bunch of MySQL or PostgreSQL boxes, choose your RAIDb level (partitioning, replication, ...) and you obtain a scalable and fault-tolerant database cluster."
WOOHOO! (Score:2, Funny)
Re:WOOHOO! (Score:2)
Non-Java Implementations? (Score:5, Interesting)
So, the question is - is anyone working on anything like this for Perl, C, or generic implementations?
Re:Non-Java Implementations? (Score:4, Insightful)
Exactly -- given that the RAIDb itself sits elsewhere, I can't imagine it would be that hard to take the source itself and make a Perl DBD::Module out of it.
If only I had the spare time...
Re:Non-Java Implementations? (Score:3, Funny)
You don't have a very good imagination.
Re:Non-Java Implementations? (Score:4, Insightful)
Seriously though, this may reduce the costs for some users but I don't think it will get a wide take-up. Most people will not want to leave the deniability you can have with large corps like Oracle. Oracle is a 'safe' solution for the purchaser with their ass on the line, which is most corporate users these days.
And the more entrepreneurial users will not usually have the hardware to use this properly anyway.
Anyone who is financing this lot will want proven standards.
Just my flawed £0.02
Sigh - Looks like I have my work cut out for me... (Score:5, Funny)
Re:Sigh - Looks like I have my work cut out for me (Score:2)
Re:Sigh - Looks like I have my work cut out for me (Score:2)
Re:Non-Java Implementations? (Score:2)
Re:Non-Java Implementations? (Score:2)
Re:Non-Java Implementations? (Score:3, Insightful)
Am I the only one a bit saddened by the fact that Sun botched it with Java so badly that we now exclude Java from 'generic implementations'?
Build once, run anywhere, riiiiight.
Re:Non-Java Implementations? (Score:5, Informative)
When I said "generic implementation" I meant "an implementation which doesn't require your programs be written in a particular language." Which is probably a bit of a pipe dream, you'd still need some sort of glue code (ODBC, JDBC, DBD, etc). But, as was alluded to above, I was trying to beat the Beowulf comment when I asked my question.
Re:Non-Java Implementations? (Score:4, Interesting)
Please don't take my previous post as a flame, I completely agree with your point. What I was whining about was the fact that Java doesn't play nice with system libs: it is 'easy' to import other libs, but exporting Java classes to other languages is ... :)
Let's say that few people feel like embedding a JVM in their C app.
Re:Non-Java Implementations? (Score:2)
Granted, I would have liked to see a more generic implementation but whats it going to be generi
Re:Non-Java Implementations? (Score:2)
Re:Non-Java Implementations? (Score:3, Interesting)
You should check out some of the Java technologies post 1999, they're entrusted with a lot of sensitive computing nowadays.
Re:Non-Java Implementations? (Score:3)
Re:Non-Java Implementations? (Score:2)
Re:Non-Java Implementations? (Score:2)
Re:load balancing apache? (Score:2)
hmmm (Score:4, Interesting)
Re:hmmm (Score:2)
Re:hmmm (Score:2)
You mean 'being run by a privacy-hating megalomaniac like Larry Ellison'?
Open source RDBMS's are good solutions for many, perhaps even most, problems. But there are still some situations where I'd want to stick with Oracle's strength and maturity and not take chances.
Re:hmmm (Score:2)
PostgreSQL isn't mature? It's a direct descendant of Ingres, the original relational database. Ingres was written in 1977 at Berkeley. Bob Miner, Ed Oates, and Bruce Scott saw the commercial potential of RDBMS and founded a company later in 1977 called Software Development Laboratories. Larry Ellison joined up
Re:hmmm (Score:3, Insightful)
One is that since most open source databases lack some feature, they will never replace any Oracle servers. Most of the people who believe this also believe that Oracle servers are always used in highly parallel transactional systems that have to be up 24/7 and never go down. While plenty of sites that need that use Oracle, the converse is not always true. Many places put Oracle online because it's what their develo
If only replication was so trivial (Score:4, Insightful)
Running many databases is easy. Organizing and serializing replication is hard, even if one has distributed transactions handy, which is not the case here. But let's read their code...
Performance? (Score:5, Interesting)
I wonder how much slower my query will be when the data is spread across several machines. I'd imagine that a few complex queries that aren't correctly optimized would bring this system to its knees rather quickly.
Re:Performance? (Score:5, Informative)
There are better ways to improve the performance of a database, horizontal partitioning, federated servers, etc.
This would be very cool if there was a generic implementation; we build many Microsoft SQL clusters and just the hardware requirements for an MSCS cluster easily exceed $50k, let alone the licensing... As an MCDBA I'd consider an open source solution if I could use it as a back-end to an ASP/VB.NET application, just to save the licensing $$ for consulting! ; )
Clusters aren't performance? Just not true! (Score:3, Informative)
Re:Clusters aren't performance? Just not true! (Score:2, Insightful)
Re:Performance? (Score:2, Informative)
The idea is that with full replication you have to broadcast the write to all databases (to be consistent) and you can only balance the reads. By controlling the replication of each database table, you can have scalable performance. Look also at the nested RAIDb levels [objectweb.org], it's pretty cool to build large configurations.
Some tests have been done with TPC-W [tpc.org]
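As a rough illustration of the point above (writes are broadcast to every replica, only reads can be balanced), here is a minimal Python sketch. It is not C-JDBC code: the class name is made up for illustration, and in-memory SQLite databases stand in for the MySQL or PostgreSQL backends.

```python
import itertools
import sqlite3


class ReplicatingController:
    """Toy controller (hypothetical name): broadcasts writes, round-robins reads."""

    def __init__(self, backend_count):
        # Each in-memory SQLite database stands in for one MySQL/PostgreSQL node.
        self.backends = [sqlite3.connect(":memory:") for _ in range(backend_count)]
        self._next_reader = itertools.cycle(range(backend_count))
        self.reads_per_backend = [0] * backend_count

    def write(self, sql, params=()):
        # A write must reach every replica to keep them consistent.
        for conn in self.backends:
            conn.execute(sql, params)
            conn.commit()

    def read(self, sql, params=()):
        # A read only needs one replica, so reads can be spread across nodes.
        index = next(self._next_reader)
        self.reads_per_backend[index] += 1
        return self.backends[index].execute(sql, params).fetchall()


controller = ReplicatingController(backend_count=3)
controller.write("CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT)")
controller.write("INSERT INTO items (name) VALUES (?)", ("widget",))
rows = [controller.read("SELECT name FROM items") for _ in range(6)]
```

Every write costs one statement per backend, while the six reads end up spread evenly over the three nodes, which is why read-heavy workloads are the ones that benefit from adding replicas.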
Re:Performance? (Score:2)
Total read query throughput will scale with the number of machines in the cluster, given (from the website):
"The database is distributed and replicated among several nodes and C-JDBC load balance the queries between these nodes."
For writes, the data must go to every machine replicating the
This is a threat to the big vendors (Score:5, Insightful)
But Oracle shops are dealing with expensive boxes they would love to replace, not to mention expensive Oracle licenses. Often the only reason they use Oracle (other than Oracle salesmen licking their buttholes) is because only Oracle has the horsepower to meet their requirements. Give them a cheaper alternative with the same capabilities and they will bail out faster than you can say 'Geronimo'.
Expect Larry Ellison to start talking about the dangers of using Open Source software now...
Re:This is a threat to the big vendors (Score:3)
We were using MySQL and it was working fine but somewhere along the line some Oracle salesman convinced someone that Oracle was better and we switched. I have seen some minor good things, but not an assload of $ worth.
Re:This is a threat to the big vendors (Score:4, Insightful)
What does proprietary software have that Open Source doesn't? Insurance.
The best way to knock over oracle is to start up a company that supports open source for a fee (which is cheaper than running oracle for a year).
Re:This is a threat to the big vendors (Score:2, Informative)
They do consulting work for their products.
Re:This is a threat to the big vendors (Score:2)
You should read a commercial software EULA some time, because you might be in for a surprise. I'd bet you have about as much recourse for broken code with Oracle as you do with MySQL.
The people who claim that an angel from Oracle descends to magically fix their problems probably aren't aware that their boss just signed away another $10,000 in support fees. Running a fully supported Oracle server is expensive!
Re:This is a threat to the big vendors (Score:2)
Interestingly, I was at a meeting the other month where it transpired that there is a recent version of Oracle out there that plain refuses to run on a machine with 2 IP addresses.
Who let *that* showstopper of a bug out of the door? How much does Oracle cost again?
oracle did at least admit to that one and promise to fix it,
Re:This is a threat to the big vendors (Score:2)
And I can back up a MySQL database and offsite/onsite copy the tapes as necessary, just like SQL Server or Oracle. Generally I can start a server rebuild/restore in less than the time it takes to give some level one tech support asshat my phone number for
Re:This is a threat to the big vendors (Score:2)
Most closed source databases are expensive. I like to think that most people can pick the database that works best for their company. That could be MySQL, PostgreSQL, FileMaker, Access, DB2, Oracle, whatever...
In my opinion th
Re:This is a threat to the big vendors (Score:3, Informative)
Re:This is a threat to the big vendors (Score:3, Insightful)
Which is exactly what MySQL AB does for MySQL. Their support is not particularly cheap (though I bet that it is a lot less than Oracle's), but I recommend it highly. The original designers are still leading the development/support team (is that true for many of the alternatives?) and make a living *only* because of their superior product, not because some salesma
Re:This is a threat to the big vendors (Score:4, Informative)
Josh, know what you're talking about before you post. MySQL [mysql.com] (the company which does the vast majority of development of MySQL) offers a variety of levels of support and consulting, regardless of the number of systems that you admin. For $48,000/year, you get:
Does Oracle match that for the price?
Re:This is a threat to the big vendors (Score:2)
Max
Re:This is a threat to the big vendors (Score:2)
I call bullshit. Unless you are a Fortune 500 company AND are paying more than a hundred thousand dollars per year in support costs, there is no way Oracle is going to send someone over there to fix anything.
Re:This is a threat to the big vendors (Score:2)
If you have a problem with your database you phone Oracle, and they talk you through it. If it turns out to be an OS problem, then they tell you to go talk to your OS vendor, except when it's Linux. If it's Linux they will deal with it directly.
I was very impressed. They are moving their whole company onto Linux and are more than 50% there now.
Re:This is a threat to the big vendors (Score:2)
With Oracle you won't get any kind of insurance. Read their "EULA" for details. The only thing you have with commercial software is "someone to blame". (OK, of course you can buy any kind of support from most commercial vendors, but you must pay a lot of money and the only thing you get is that their support tries harder (e.g. faster).) Try to set up an agreement with Oracle or someone else where they pay you any money you lose from their faults.
Say: 1 hour of downtime o
Re:This is a threat to the big vendors (Score:2)
Bye egghat.
Re:This is a threat to the big vendors (Score:5, Insightful)
Prior to Oracle taking off in a big way people used to say:
Then Larry E. shamelessly put together a cool SQL database which copied every major innovation IBM had made and added in a few more for good measure. He also cut the price by a third, and IBM's database customers deserted in droves. After all, if this Oracle thing turned out to be shit, they could always get IBM to come clean up the mess. It turned out, though, that Oracle wasn't and isn't shit.
That does not mean that Oracle is immortal and will always be top of the pile. Postgres now replicates almost all of the major features and is proven in the reliability stakes, and tools like this are only going to make it more likely that corporate data departments will dip their toes into the Free software waters. After all, if it turns out to be shit, they could always get Oracle to come clean up the mess.
Re:This is a threat to the big vendors (Score:2)
Re:This is a threat to the big vendors (Score:2)
I would love nothing better (well, besides two hot brunettes) than to be able to tell Oracle to shove it. They throw you over the barrel and rape you, and as long as management wants Oracle you have to take it.
Re:This is a threat to the big vendors (Score:4, Informative)
If you want to cluster Oracle, use Oracle RAC (Real Application Clusters). It's based on Parallel Server so is mature enough to put forward for consideration... and even then it might be eschewed from above. Cheap databases are not going to ring the bells of the people with the say-so simply because Oracle (and DB2 etc) are proven over the years, and the cost of losing your data because you went for the cheap option is going to lose your company a lot of money, and you your job!
Technically better, cheaper and all those good things does not mean better for a business. Databases are predominantly used for *business*, and as such a *business* reason is used when choosing one over another, not technical reasons.
Re:This is a threat to the big vendors (Score:2)
Re:This is a threat to the big vendors (Score:2)
Yes, but if a large portion of the people that previously bought Oracle find they can get by with PostgreSQL then you will find that their decision does affect those of you that stick with Oracle. The people that don't need all of Oracle's neat features are effectively subsidizing the product for those of you that do. If these users are siphoned off by another product then either your prices will rise, development will slow, or Oracle will die.
That's one of the reasons that techies tend to be advocates
Re:This is a threat to the big vendors (Score:2)
As my boss says, "you don't need an aircraft carrier to go bass fishing, but it certainly helps if you need to land fighter jets out at sea."
Most people don't need to "land fighter jets at sea" and for those folks PostgreSQL gives you most of Oracle's neat features at a very low price. A redundant set of PostgreSQL servers can even get you Oracle-like availability at a much lower price.
Re:This is a threat to the big vendors (Score:2)
Re:This is a threat to the big vendors (Score:4, Insightful)
That's exactly the point. Who needs all the features of Oracle? Maybe the IRS or Mastercard, but the vast majority of Oracle users are getting just one feature: the Oracle reputation that their marketing has built.
And with all those features comes the big problem of managing them: no matter how small the application is, once you choose Oracle you need a team of experienced DBAs to correctly and reliably configure the system.
Re:This is a threat to the big vendors (Score:2)
Re:This is a threat to the big vendors (Score:2)
Tactical outsourcing is the easy, inexpensive, reliable and reproducible solution to this problem.
For example, you could hire Pythian [pythian.com]. We outsource Oracle DBA and make running and managing Oracle into the long-term completely turn-key. We run some of the largest and most challenging Oracle environments in the world, including distributed architectures and shops where the cost of downtime is 5-figures per hour. We run
Re:This is a threat to the big vendors (Score:2)
I wonder if Oracle is still motivated by per-CPU licensing. Even if the hardware and OS are $2,500 per node, will Oracle still be another ten grand per node? And it would probably be wise, for responsiveness/throughput, to have two CPUs in each node. This can get expensive very quickly.
Add the fact that you still need a paid Oracle DBA to keep everything running smoothly, and I'd bet the real savings of using a Linux cluster is relatively small. Add annual suppor
Quick thru the docs... (Score:5, Informative)
1. A Controller. It looks as tho a single controller is used by the clients to communicate to the various RAID'd dbs. I'm sure there can be multiple controllers since there would be little point to make some db's redundant, yet the access to them not. Still looking into this.
2. And also, it looks as tho the default port is 1099 - RMI. If you have, for a web app, your EJBs and web app local to that container, that might not be a problem. However, I happen to have my EJB server on its own box and this might very well cause probs. I think it said you could specify your own ports, but I haven't seen any examples in the docs yet of this being the case. Also, still looking.
A few other things exist as well which are in the docs as known limitations:
* XAConnections
* Blobs
* batch updates
* callable statements
These could be serious issues for some. My last project used CLOBs/BLOBs, batch updates and callable statements, so this would rule that out. Of course, all the db stuff was strictly tied to Oracle, so I think that would rule this out regardless.
All in all tho, this looks like a good start. As my current project progresses, clustered dbs will become more and more of an issue. I've looked into some other projects out there for Postgres, but nothing yet really satisfactory. I think this is a good step in the right direction - for Java developers. It'll be interesting to watch.
Re:Quick thru the docs... (Score:2, Informative)
1. Yes, you can have multiple controllers that synchronize using group communication. In the driver, you give a list of comma-separated host names running controllers. The driver has built-in failover and load balancing among multiple controllers (check the doc here [objectweb.org]).
2. Yes, all ports are customizable when you start the controller (check the doc here [objectweb.org]).
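To make point 1 a bit more concrete, here is a minimal Python sketch of client-side failover over a comma-separated controller list. It only shows the general idea and is not the C-JDBC driver: the function names, the host names and the `connect` callback are all hypothetical.

```python
def parse_controller_hosts(host_list):
    """Split a comma-separated controller list into clean host names."""
    return [host.strip() for host in host_list.split(",") if host.strip()]


def connect_with_failover(host_list, connect):
    """Try each controller in order and return (host, connection) for the first success.

    `connect` is a hypothetical callback that opens a connection to one
    controller and raises ConnectionError if that controller is unreachable.
    """
    last_error = None
    for host in parse_controller_hosts(host_list):
        try:
            return host, connect(host)
        except ConnectionError as error:
            last_error = error  # remember the failure and try the next controller
    raise ConnectionError("no controller reachable") from last_error


def fake_connect(host):
    # Stand-in for a real network connection: pretend the first controller is down.
    if host == "ctrl1.example.org":
        raise ConnectionError(f"{host} is down")
    return f"connection to {host}"


chosen_host, connection = connect_with_failover(
    "ctrl1.example.org, ctrl2.example.org", fake_connect
)
```

A real driver would presumably also balance load across the controllers that are up rather than always taking the first one; this sketch only covers the failover half.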
This is just an alpha version, so as you mentioned, there are still many features missing but it is a good starting point and contributions a
Where are the benchmarks that they speak of ? (Score:5, Insightful)
Supposedly, "This new version has been successfully tested with Tomcat, JOnAS, MySQL and PostgreSQL. Excellent results have been obtained with the TPC-W and RUBiS benchmarks."
Don't get me wrong, I like the idea, and I have been wanting something like this for years, but I sure would like to _see_ the test results, even if they are preliminary.
How about a meta-database adapter? (Score:5, Interesting)
Here's an example of an application: I have a database-driven Web application [slashdot.org] that allows my onsite clients to register network services for openings in the firewall. Another software component probes the registered hosts for daemon version information and records it in the database, so that we can send out alerts when security holes are discovered in particular versions. I use PostgreSQL on Debian and Solaris. Independently of my work, our networking office has a Microsoft SQL Server database of IP addresses, MAC addresses, and physical switch ports and jack numbers.
What I'd like to do is mount both my database and the networking office's database into some sort of "meta-database" -- analogous to mounting filesystems from two different hosts via NFS -- and run SQL queries that span both data sets. I wouldn't expect to be able to write to this conjoined database -- locking would be a nightmare -- but being able to SELECT across the two sets would be incredibly valuable.
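For a feel of what "mounting" two databases and running one SELECT across both looks like, here is a small self-contained Python sketch using SQLite's ATTACH DATABASE. It is only an analogy: it works between SQLite files, not between PostgreSQL and SQL Server, and the table and column names are invented for illustration.

```python
import os
import sqlite3
import tempfile

workdir = tempfile.mkdtemp()
services_path = os.path.join(workdir, "services.db")
network_path = os.path.join(workdir, "network.db")

# Hypothetical schema for the firewall-registration database.
services = sqlite3.connect(services_path)
services.execute("CREATE TABLE service (ip TEXT, daemon TEXT, version TEXT)")
services.execute("INSERT INTO service VALUES ('10.0.0.5', 'sshd', '3.4')")
services.commit()
services.close()

# Hypothetical schema for the networking office's database.
network = sqlite3.connect(network_path)
network.execute("CREATE TABLE port_map (ip TEXT, mac TEXT, jack TEXT)")
network.execute("INSERT INTO port_map VALUES ('10.0.0.5', '00:11:22:33:44:55', 'B-214')")
network.commit()
network.close()

# "Mount" the second database next to the first and SELECT across both.
conn = sqlite3.connect(services_path)
conn.execute("ATTACH DATABASE ? AS net", (network_path,))
joined = conn.execute(
    "SELECT s.daemon, s.version, n.jack "
    "FROM service AS s JOIN net.port_map AS n ON n.ip = s.ip"
).fetchall()
conn.close()
```

The join finds the jack number for the host running a given daemon version, which is the kind of read-only, cross-database query described above.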
Re:How about a meta-database adapter? (Score:3, Informative)
Create a database link (for example to an AS400) and you can query the remote tables just like local tables.
select * from somelib.sometable@as400
Oracle will pass as much SQL as it can to the remote DB engine in order to keep things speedy.
Re:How about a meta-database adapter? (Score:2, Informative)
Check out this page [postgresql.org] for the PostgreSQL ODBC driver.
You should also look at the linked servers documentation in SQL Server Books Online (under sp_addlinkedserver) as well as the interface in enterprise manager (security -> linked servers)
As I was searching a bit, I see that pe
Re:How about a meta-database adapter? (Score:2, Informative)
Actually, if you look at RAIDb-0 [objectweb.org], it is very close to this, maybe even identical. They show having different tables on different database servers. They also indicate that C-JDBC can be used without modifications to the application. This would imply that if you get a JDBC driver for MSSQL, a JDBC driver for PostgreSQL, and write your code using JDBC, you should be able to do the type of selects you are talking about.
More info on transactions (Score:4, Interesting)
Re:More info on transactions (Score:3, Informative)
If one backend fails in the cluster, it is automatically disabled and the controller always ensures that data that are sent back to the application are consistent.
By the way, you can tune how you want distributed queries to complete (return as soon as the first node has committed, wait for a majority or, more safely, wait for all nodes to commit). There are many options that help tun
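The three completion modes mentioned above (first node, majority, all nodes) boil down to how many commit acknowledgements the controller waits for. A minimal Python sketch, with policy names that are assumptions for illustration rather than actual C-JDBC configuration values:

```python
def acks_required(policy, backend_count):
    """Return how many backend commits to wait for before answering the client.

    The policy names "first", "majority" and "all" are hypothetical labels
    for the three behaviours described in the comment above.
    """
    if backend_count < 1:
        raise ValueError("need at least one backend")
    if policy == "first":
        return 1  # fastest answer, weakest guarantee
    if policy == "majority":
        return backend_count // 2 + 1  # more than half of the nodes
    if policy == "all":
        return backend_count  # slowest answer, every node has committed
    raise ValueError(f"unknown policy: {policy}")


five_node_majority = acks_required("majority", 5)
four_node_majority = acks_required("majority", 4)
```

The trade-off is latency against safety: waiting for fewer nodes answers sooner, but leaves a window in which some replicas have not yet applied the write.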
Why? (Score:2, Insightful)
Re:Why? (Score:2)
Re:Why? (Score:2, Interesting)
Just put a replica on a second node and you will have fault tolerance (even just for maintenance) and you will be able to handle peak loads. 2 nodes is already a cluster, don't need to have hundreds of nodes.
Another usage could be to keep a single Oracle instance and put a bunch of open-source databases to offload your main Oracle database. You could have all the write queries (orders, ...) handled by your [safe] main Oracle database and have all ot
supposed to be at RDMS level (Score:5, Insightful)
I mean, this is neat and all, but I really don't want to have to use this interface just so that I can cluster my database. You're much better off placing clustering functions within the database itself. Then you can access the data by any method (ODBC, native libraries, hell even with the provided command line interface).
Take a look at how MS SQL Server performs clustering sometime. Everything (and I mean EVERYTHING) is performed via triggers and tsql. All the clustering setup does is set up a bunch of known working trigger scripts to propagate the data. You can even edit them to your liking afterwards if you wish. Now I'm not saying that MS's solution for clustering is the cat's ass. Personally, I think it is kind of hackish, but then again I believe that clustering should be something you simply turn on, and shouldn't be able to fuss with. Realistically, I can't think of any good reason to change the cookie cutter tsql scripts that perform the clustering, so I only see the ability to modify them as a potential way to fsck it up (that being an obviously bad thing).
Clustering really isn't that hard to implement. I'm pretty surprised that MySQL and Postgres don't have better support for it. Especially Postgres, since transaction support is really the one big key that makes clustering possible. Maybe no one has really had an itch to make it happen yet. Hopefully it will happen soon, since I'd love clustering to be another argument for why OSS databases can play with the big kids just as easily.
Re:supposed to be at RDMS level (Score:3, Insightful)
You are wrong saying that implementing clustering isn't hard.
If we are talking about REAL DBMSes (no, MySQL is not a real DBMS), enabling any form of clustering which maintains the ACID properties we expect from a DBMS is a major step. It means becoming a distributed application, and it is one of the most complex things to implement.
Just for example, suppose you have two machines in a master-to
Re:supposed to be at RDMS level (Score:2)
This is why you have transaction logs that are timestamped. When the systems resync, they merge their transaction logs, roll back to the last synced state, and then re-execute every transaction until they are current. The end result is that the newer row updates will overwrite the older row updates. This may or may not be th
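A minimal Python sketch of the resync scheme described above: merge two timestamped logs, start from the last synced state, and replay everything in timestamp order so that newer row updates overwrite older ones. The log entry layout `(timestamp, row_key, new_value)` is an assumption made for this example, and the sketch assumes the two machines' clocks are comparable.

```python
import heapq


def merge_and_replay(log_a, log_b, base_state):
    """Merge two timestamp-ordered logs and replay them over the last synced state.

    Each log entry is a (timestamp, row_key, new_value) tuple, and each log
    must already be sorted by timestamp.
    """
    state = dict(base_state)  # "roll back" by starting from the last synced state
    merged = heapq.merge(log_a, log_b, key=lambda entry: entry[0])
    for _timestamp, row_key, new_value in merged:
        state[row_key] = new_value  # later entries overwrite earlier ones
    return state


last_synced = {"row1": "old"}
node_a_log = [(1, "row1", "A1"), (4, "row2", "A2")]
node_b_log = [(2, "row1", "B1"), (3, "row2", "B2")]
resynced = merge_and_replay(node_a_log, node_b_log, last_synced)
```

Here node B's update to row1 wins because it is newer, and node A's update to row2 wins for the same reason; the losing updates are silently discarded.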
Re:supposed to be at RDMS level (Score:2)
Think of RAID as it's hard-drive counter-part.. Data-integrity could be most efficiently handled at the hard-drive layer.. Having multiple redundant controllers and disks, etc. It would be a generic disk as far as the SCSI/IDE card was concerned.. But it turns out to not be the cheape
"Shared-Nothing Architecture" (Score:2, Insightful)
Know what? There are a ton of deep issues beyond just making the different partitions transparent to the application level. Think about joins across partitions for a sec...
Re:"Shared-Nothing Architecture" (Score:2)
Don't see how it limits you here. If you have n fully redundant RDB's, a single controller, and m clients, the dispatcher load balances you such that if all m clients are performing non-destructive reads, they all read from different machines.. (preventing resource starvation). Each machine either put everything on one disk or segments the data acr
Slightly Offtopic.... (Score:3, Insightful)
My view is that it may be difficult to migrate OSes or even hardware, but it's almost darn impossible to migrate existing databases.
A database is the most fundamental and most cared-about aspect of a major business. There is a lot of time and effort and MONEY spent to incorporate it into the company.
Lots and lots of critical business applications are written using the proprietary extensions of these vendors. Is it very easy to migrate this code?
May be interesting for a future pilot project, but if you expect businesses to change their database vendors, that's not going to happen very soon.
Re:Slightly Offtopic.... (Score:2)
If you have a lot of PL/SQL stored procs, and you are moving to MySQL (no stored procedures yet, no PL/SQL) then it's tough.
If you are moving to Postgres, then it gets easier.
It really depends on how you coded your application. Even if you use a bit of non-standard SQL, there are usually equivalents.
How does clustering improve performance? (Score:2, Interesting)
How do you join one table to another when they are on two separate boxes?
Well, I know how to actually use SQL to join two tables from two separate databases. But what is actually happening inside the RDBMS at the low level? Does one side just bring over the entire other table? How does it use indexes?
Seems to me this is, at best, a reference implementation that may actually degrade performance.
Re:How does clustering improve performance? (Score:2)
Several people have asked this question. Have you looked at the white-paper? It's possible to do RAID-1 which is m fully redundant DBs with all tables being fully accessible from a given DB. In RAID-1, therefore, there is "zero" problem with joins / updates / transactions, because it's literally just pretending to be accessing a single machine.. (I quoted zero because you might have synchronization issues if one machine somehow res
their site is not slashdotted... (Score:2)
DB Clusters of the world, unite! (Score:2, Funny)
Finally, my grandmother can have that database cluster she has been bugging me about.
Also new! (Score:5, Funny)
RAID -- Redundant Array of Inexpensive Developers
RAID 0
Multiple developers work on the same project but none of them has any idea what the other is doing at the same time. One developer failing (caffeine dehydration, severe electrostatic shock, sex, etc) will cause the entire project to screw up and become a mess.
RAID 1
Extreme Programming.
RAID 2
Inefficient way to keep track of what developers are doing. For every 10 developers, 4 are needed to keep track of them and recover any error by the aforementioned 10 while they don't work together at all. Level of efficiency comparable to a modern government.
RAID 3
Equal to RAID 2, except all responsibility for checking the code is now granted to one person. The rest have been cut from the budget. A bit more effective but, considering people still don't cooperate, not too good.
RAID 4
Equal to RAID 3, except people are finally working together now. Kinda efficient and fast, except it all still relies on that one person who checks the data.
RAID 5
Everyone knows what everyone else is doing, they all work perfectly together and they can easily miss one person because of that.
Fine-grained caching question (Score:2, Interesting)
This is not that novel of an idea (Score:2)
Esse
Re:This is not that novel of an idea (Score:2)
good idea--just not new (Score:2)
It's good that these are becoming available in open source form, but the concept is not new at all. IBM and Oracle both have had commercial versions for a while (I suppose the "inexpensive" part is new).
Thorough rundown (Score:5, Informative)
After actually reading the documentation, here's my informed take on this:
1) In its current incarnation, it's only useful for very very simple database access. No transactions, no blobs, etc. Basically if you're just storing some simple weblication tables and doing single-statements against them for selects/updates (no big cross-table transactions), you can use it.
2) It's JDBC only. Perhaps someone could port the concept to ODBC though.
3) There's a new middle tier between the JDBC driver and the database itself, which is the bulk of their code. This tier actually re-implements some database constructs like recovery logging, query caching, etc. Of course this is necessary, as trying to do replication from the client-code side alone would be impossible (what do you do when one of 3 DB mirrors goes offline for an hour? have every JDBC client cache the requests and replay them later, hoping those clients are even still around later?)
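To illustrate why the recovery log belongs in the middle tier rather than in each client, here is a minimal Python sketch of the "one of 3 mirrors goes offline" case from point 3. The class names are invented and dicts stand in for real databases; this shows the idea only, not C-JDBC's implementation.

```python
class Mirror:
    """Stand-in for one database backend; a dict plays the role of a table."""

    def __init__(self):
        self.rows = {}
        self.online = True


class MiddleTier:
    """Keeps one ordered log of writes and tracks how far each mirror has applied it."""

    def __init__(self, mirrors):
        self.mirrors = mirrors
        self.recovery_log = []             # every write, in arrival order
        self.applied = [0] * len(mirrors)  # log position reached by each mirror

    def write(self, key, value):
        self.recovery_log.append((key, value))
        for index, mirror in enumerate(self.mirrors):
            if mirror.online:
                self._catch_up(index)

    def _catch_up(self, index):
        # Replay every logged write this mirror has not applied yet.
        mirror = self.mirrors[index]
        for key, value in self.recovery_log[self.applied[index]:]:
            mirror.rows[key] = value
        self.applied[index] = len(self.recovery_log)

    def bring_online(self, index):
        self.mirrors[index].online = True
        self._catch_up(index)


mirrors = [Mirror(), Mirror(), Mirror()]
tier = MiddleTier(mirrors)
tier.write("user:1", "alice")
mirrors[2].online = False          # the third mirror drops out
tier.write("user:2", "bob")
tier.write("user:1", "alice2")
stale_rows = dict(mirrors[2].rows)  # what the offline mirror still holds
tier.bring_online(2)                # replay the missed writes from the log
```

Because the log lives in one place, the clients that issued the writes do not need to stay around for the offline mirror to catch up.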
For some applications and some companies, in its current state this is a godsend - but it's not a general solution yet. Making it ODBC (or even better, having the front of it emulate a native PostgreSQL or MySQL listener) would broaden its applicability.
Supporting transactions would be a big win too, although I'm not sure how feasible this is - I think at that point they may as well just write their own new database engine which is parallel from the start, seeing as they'll be re-implementing in their cluster tier almost everything the database server does except for actual physical storage.
Still, it's nice to see that someone did this and made it work - and for a lot of simple databases behind java apps it's all you really need.
PostgreSQL has all the transaction support in place already, so of all the free DBs out there it would seem they have the best shot at doing their own native parallelism, if they would just get it done someday.
Tried this before... its a tough sell (Score:2)
1st... multiple points of failure. By increasing the number of databases you're increasing the potential points of failure. What features are there to automatically backup data? If the data is spread randomly across the dbs a
Re:Tried this before... its a tough sell (Score:2)
Re:Tried this before... its a tough sell (Score:2)
If I bought 3 dual CPU linux boxes there is a 3x better chance of having a Power Supply die, a network connection die etc a
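The "3x" intuition above can be checked with a little arithmetic. The Python sketch below assumes each box fails independently with the same probability over some period; the 5% figure is made up purely for illustration.

```python
def p_any_failure(p_single, box_count):
    """Probability that at least one of the boxes fails (independent failures assumed)."""
    return 1 - (1 - p_single) ** box_count


def p_all_fail(p_single, box_count):
    """Probability that every box fails, i.e. a fully replicated cluster loses all copies."""
    return p_single ** box_count


p_box = 0.05  # hypothetical failure probability of a single box
any_of_three = p_any_failure(p_box, 3)  # about 0.143, a bit under 3 x 0.05
all_of_three = p_all_fail(p_box, 3)     # 0.000125
```

So with three boxes some hardware failure is indeed almost three times as likely, but whether that hurts depends on the layout: if the data is partitioned without replication, any single failure takes part of the data offline, whereas with full replication service is only lost when all three boxes are down at once.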
Not there yet... (Score:2, Insightful)
The shared disk array (RAID, etc.) is just a part of implementing HA.
My recommendation is for the developers to take a look at how it is implemented in the enterprise DBMSs (Sybase, Oracle, MS SQL Server, DB2) first.
jason
This is a very very old idea.... (Score:2)
First, they should move more and more features of the DB to the controller layer. The goal should be that you can call plain SQL statements and complex joins directly. Later, you could even have stored procedures execute there and use the cluster as if it were one db.
Then, they should try and work it so that you make low level calls to the DB layer; this would save time in having the separate DBs compile the SQL statements.
Next, make some kernel mods ala Tux to make the DB ca
I wouldn't get too excited (Score:3, Informative)
Furthermore, to scale up, systems generally take advantage of striping. At the IO level that means striping across multiple disks (modern convention is to stripe across all!). In a parallel database one usually stripes a single table across multiple nodes for parallel query processing. While it is possible with C-JDBC to put table X on node A, table Y on node B, I don't see any provision for striping the data. It will be very difficult to use your hardware efficiently in this scenario.
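For readers unfamiliar with striping a single table across nodes, here is a minimal Python sketch of hash-based row placement. It illustrates the general technique the comment refers to, not anything C-JDBC provides; the function names and the sample table are made up.

```python
import zlib


def node_for_key(key, node_count):
    """Pick a node for a row by hashing its key (CRC32 is stable across runs)."""
    return zlib.crc32(str(key).encode("utf-8")) % node_count


def stripe_rows(rows, node_count, key_index=0):
    """Distribute the rows of one table across nodes according to their key column."""
    nodes = [[] for _ in range(node_count)]
    for row in rows:
        nodes[node_for_key(row[key_index], node_count)].append(row)
    return nodes


# Hypothetical table: (order_id, customer)
orders = [(order_id, f"customer-{order_id % 7}") for order_id in range(1, 101)]
striped = stripe_rows(orders, node_count=4)
```

With a table laid out this way, a scan or aggregate can run on all four nodes in parallel, each over its own slice of the rows, which is the efficiency gain the comment says is missing when whole tables are pinned to single nodes.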
If you are going to go through the trouble of implementing a complete query processor (that can handle jobs larger than ram), a full update/query scheduler (lock manager), and a journalling mechanism that can (somehow) even maintain atomic transactions (even in the face of multiple failures) then why not just build your own database. This system might be useful in certain rare cases but I wouldn't use it except possibly for replication.
JJ
Re:I'm 100% Confident (Score:2, Informative)
When you look at Oracle pricing policy, you can have Oracle RAC for the price of just Oracle (+ a free RAIDb), which is already a 50% discount!
Re:I'm 100% Confident (Score:2)
Re:Hehehe... (Score:2, Funny)
*laughter*