Michael Chute on OS X, Research
“In the end, ease of use and data security was the rule,” Michael Chute told MacNewsWorld. “I spoke with people using Linux clusters, and while they were effective, they entailed more significant IT resources. No need for a bunch of scientists mucking around when they do not have the time to learn system administration.”
Clustering typically refers to a bunch of machines working in parallel on a single problem. Most biological computations can’t be easily made parallel.
What they seem to be talking about is a failover system for finding the least busy machine to ship a single job to – and the article makes it sound like the BioTeam has sorted that out for its products (though I haven’t looked into them).
Given that, the better support for clustering on Linux shouldn’t make any difference. Besides, the article went into some detail on why Linux wasn’t the right solution for them.
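For what it’s worth, the “least busy machine” idea can be sketched in a few lines. This is only an illustration under my own assumptions, not how BioTeam’s products actually work: the function name `dispatch`, the node names and the per-job time estimates are all made up. Each job goes whole to whichever machine currently has the least queued work; nothing is run in parallel inside a job.

```python
import heapq

def dispatch(jobs, machines):
    """Assign each whole job to the machine with the least queued work.

    jobs     -- list of (job_name, estimated_seconds); the estimates are hypothetical
    machines -- list of machine names (hypothetical)
    """
    # Min-heap of (queued_seconds, machine_name): the least busy machine is on top.
    heap = [(0.0, name) for name in machines]
    heapq.heapify(heap)

    placement = {}
    for job_name, est_seconds in jobs:
        load, name = heapq.heappop(heap)        # least busy machine right now
        placement[job_name] = name
        heapq.heappush(heap, (load + est_seconds, name))
    return placement

if __name__ == "__main__":
    jobs = [("blast_a", 30), ("blast_b", 10), ("blast_c", 5)]
    print(dispatch(jobs, ["node1", "node2"]))
```

Note that each individual job finishes no sooner than it would on a single machine; only the throughput of the whole queue improves.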
JT
True, Linux clustering requires you to do system administration IF YOU DO IT ON YOUR OWN AND ARE NOT PAYING FOR SUPPORT!
But if you go through a company like HP, IBM, or the thousands of other companies available that provide excellent support for running Linux, at prices cheaper or more expensive than the OS X counterparts depending on what you need, then the underlying OS becomes a technical and unimportant issue. The important issues are how good the technical support is and how well the system performs (where Linux performs amazingly well, but so do the OS X clusters).
One of the advantages of Linux is that, due to its open source nature, some amazing things can be done. For example, PlanetLab ( http://www.planet-lab.org/ ) lets different research teams run programs in their own ‘virtual Linux’ (without the slowdowns of UML, VMware, or any of that, but with all of the security), allowing labs around the world to test things like massive P2P networks, and so forth.
Most biological computations can’t be easily made parallel.
Really? As a geneticist doing lots of bioinformatics I am quite sure that a lot of problems in biology are prime examples of parallel computation.
Examples (Google for more):
http://mpiblast.lanl.gov/README-1.1.0.html
http://pr.fujitsu.com/en/news/2002/02/25.html
http://www.hgmp.mrc.ac.uk/embnet.news/vol7_1/body_ebi.html
Thanks for the links! Reading them over, though, the only one that’s clearly addressing running a single job in parallel is the MPI-BLAST one (I’d be interested to hear how the database is internally split).
I realized that splitting a single BLAST query of multiple databases into multiple queries of single databases allows for a lot of parallelism. Not knowing about MPI-BLAST, I hadn’t realized that it could be done on a single-job level.
Anyway, maybe I’m being a bit too strict about what’s a single parallel calculation….
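To make the “one query, many databases” split concrete, here is a rough sketch. It does not call BLAST at all: `search_one_db` is a toy stand-in that just counts shared 3-letter words, and the database names and sequences are invented. The point is only the shape of the work, which is the same query sent to each database independently, with the hits merged at the end.

```python
from concurrent.futures import ThreadPoolExecutor

def kmers(seq, k=3):
    """Set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}

def search_one_db(query, db_name, sequences):
    """Toy stand-in for one search run: score = number of shared 3-mers."""
    q = kmers(query)
    return [(len(q & kmers(s)), db_name, seq_id) for seq_id, s in sequences.items()]

def search_all(query, databases, workers=4):
    """Run the same query against every database independently, then merge."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(search_one_db, query, name, seqs)
                   for name, seqs in databases.items()]
        per_db = [f.result() for f in futures]
    # Merge step: drop zero-score hits and sort best-first.
    hits = [hit for db_hits in per_db for hit in db_hits if hit[0] > 0]
    return sorted(hits, reverse=True)

if __name__ == "__main__":
    dbs = {"db1": {"s1": "ACGTACGT", "s2": "TTTTTTTT"},
           "db2": {"s3": "ACGTTTTT"}}
    print(search_all("ACGTAC", dbs))
```

Threads are used here only to keep the sketch short; on a real cluster each per-database search would be a separate job on a separate node.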
JT
One thing that you have to realize is that clustering might not make sense once you factor in development time too. A lot of labs do not have a full-time person writing programs, so the scientists themselves have to. My brother noticed while doing his PhD thesis that it was usually faster for him to develop a (non-threaded) program and run it on his own desktop than to develop a program to run in parallel on the lab’s cluster. Even if the end result can be obtained 10 times faster on a cluster, you have to account for the time spent developing a parallel solution to the problem. Obviously, there are some problems that are very conducive to parallel solutions, but many, and I would actually say most, are not.
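The trade-off can be put into numbers. This is a back-of-the-envelope sketch with made-up figures (10 hours to write the serial program, 100 hours for the parallel one, 5 hours per serial run); only the 10x speedup comes from the comment above. It assumes every run takes the same time and that the speedup applies to the whole run.

```python
def total_hours(dev_hours, run_hours, runs, speedup=1.0):
    """Development time plus the time for `runs` runs at the given speedup."""
    return dev_hours + runs * run_hours / speedup

def break_even_runs(dev_serial, dev_parallel, run_hours, speedup):
    """Number of runs after which the parallel version starts to pay off."""
    if speedup <= 1:
        raise ValueError("the parallel version never pays off without a speedup > 1")
    extra_dev = dev_parallel - dev_serial          # extra hours spent writing parallel code
    saved_per_run = run_hours * (1 - 1 / speedup)  # hours saved on each run
    return extra_dev / saved_per_run

if __name__ == "__main__":
    # Hypothetical numbers, for illustration only.
    print(break_even_runs(dev_serial=10, dev_parallel=100, run_hours=5, speedup=10))
```

With these invented numbers the parallel version only comes out ahead after about 20 runs, so a program that gets run a handful of times for one thesis chapter is better off staying serial.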
Running on Unix was a must, their budget permits them to buy just about whatever they want and in the end, ease of use and data security was the rule. So they buy 4 G4 X servers and set them up themselves and then realize they really needed eight more G5 X servers? Sounds like somebody didn’t do the research to size the cluster to their needs. They could probably have used an IBM system running a number of VMs with either AIX or Linux (a virtual cluster), saved some floor space and gotten that important security too. Remember, money wasn’t an issue.
“No need for a bunch of scientists mucking around when they do not have the time to learn system administration.”
No, but they’re apparently willing to waste some time setting up X servers instead of paying someone to do it for them.
They didn’t get a tape backup system right away but they did get RAID. Unfortunately, RAID isn’t a substitute for a tape backup system – not a smart move.
Also, we’re talking about scientists here. Since when are scientists afraid of learning a little system administration – it’s not rocket science.
They mentioned ease of use yet most of their apps are command-line. I wonder if they wrote any of those in FORTRAN?
The article ends up sounding like someone who wanted to play with some new Mac toys.
I don’t get it.
Why is it that whenever someone uses something OTHER than Linux, all these people come out of the woodwork going “But they could have used Linux, why didn’t they?” Or “They should have used Linux, because I like Linux and think it’s the only operating system that should exist. Because I like it.”
Honestly, Linux is *just an operating system*. It’s not really anything special. Sure, it has hype, but hype doesn’t get work done. It has industry support, but so do Solaris, AIX, IRIX, and others. It’s free and open source, but so are the *BSD operating systems. In fact they are even more free.
Really, if people want to use some other type of system for whatever reason, why do you feel the need to point out that they could have used Linux?
I agree, it gets old after a while. They decided to use something other than Linux because they wanted to manage the servers themselves. They looked at Linux and looked at Mac OS X and they wanted Mac OS X; it is as simple as that. They could have used a number of different OSs.
Obviously, you didn’t read what I wrote and you didn’t read the article. Yes, they could have used Linux, Tru64, Solaris, HP-UX, AIX, or BSD or some other Unix/Unix-like OS. However, I don’t think I would choose a Mac cluster over an 8-, 16- or 32-way processor system when money isn’t an obstacle. Now the question of what OS to run becomes important, as it has to be SMP capable and the company that you bought it from is going to have to support it long after the one-year warranty expires. It would probably be an OS with a proven track record of reliability, scalability and robustness.
>Linux is *just an operating system*. It’s not really anything
>special.
And that is where you are wrong. It is the only OS in the world that is mature enough to compete with Unix on big iron and clusters while being GPL, freely downloadable and completely FREE! Linux is very special, yet most people do not realize what impact it has had on the IT industry. It is Linux that made Microsoft desperately seek new opportunities and better software. It’s Linux that runs on over 50% of the world’s web servers and made hosting prices go down; it’s Linux that made it possible for people and organizations that do not have enough money to buy Mac and/or Win to use a computer and be an active part of our society. So saying Linux is nothing special is wrong. It’s special.
… they chose OS X server over linux.
Didn’t say that Linux was better. Said that cluster support was better for Linux. That’s a market observation, not technical.
You can buy ready-made Linux Beowulf clusters from dozens of different outfits, all set up and ready to go. No sysadmin-programmer needed. You can also get beaucoup de cluster numerical / MPI codes for *nix.
If these folks don’t need a true cluster, OK.
My Mac OSX box is not much better than my Linux boxes when it comes to terminal command lines. Getting anything useful done requires prompts and geekish config files on either system, esp. in networking.
Here’s another example:
http://cmgm.stanford.edu/~cparnot/xgrid-stanford/index.html
It’s someone at Stanford taking advantage of Xgrid to run a highly parallelizable protein-related simulation. Apparently it was easy for him to write the application. You even get a neat screensaver showing the power of the cluster.
“Linux is for computer geeks. Not all scientists are computer geeks.”
I agree with you, Safety O. 🙂
The point of OS X is that, generally speaking, you don’t NEED an effin full-time IT person to get the system up and running and doing well.
Any person with an idea about how a GUI works can set one up to serve. A few things in 10.3 server require trips to the command line (but only if you’re really tearing into the system), and if you’re doing something extremely complex, you might need to have a few hours of IT staff time.
With the HP, IBM, etc solutions, unless you are very familiar with *nix, you will need to have their IT person, or your own IT person there to set it up and configure it. If you need to change the system, you will have to call this person in to work on it.
Not so with OS X. OS X works for you, you do not work for OS X.
—
Incidentally, where I work (academic library) I know several people with science PhDs who are complete computer technophobes (I know because I’m always straightening out their computers) and all of them consider Windows a big step up from the DOS machines they had back in the early 1990s.
A Geology, Chemistry, or Biology PhD does not equal a burning desire to learn *nix, nor does it mean a person’s going to have any talent whatsoever with computers. (And I’ve got a cum laude Ivy Leaguer in that group, so it’s not like he’s lazy or stupid.)
And frankly, my colleagues over in the Biology department just want the computer to run BLAST. Time spent making BLAST work, or waiting for the campus IT staffer to show up … is time spent not working in their view.
They’re not interested in the “liberation” of their software or the hardware, or the fact that the OS is superflexible and can do everything including blend your daiquiris or wax the floor if you’re willing to spend several weeks tweaking the code.
They want to plug it in, turn it on, drop in the install CD, make with the Double Click, and get to doing REAL work.
And in that regard, Linux just doesn’t cut it.
err? the reason for x86 cpus burning cycles when idling was that win9x didn’t tell the cpu to go into a proper wait state but rather sent it a lot of junk when it didn’t have anything better to do. linux and winnt (2k/xp) have always told the cpu to go into a proper wait state.
also, isn’t it the latest g5 that comes with water cooling from the shop? sounds like it’s burning more power than an x86 at the moment…
as for the “just works” record, it’s starting to grow old. yes, linux isn’t the perfect one, but like people have already pointed out: the article states that they already had apple and a cluster specialist company (bioteam?) at hand.
so how is that different from getting a set of boxes from, say, ibm, plugging them together and following a small step-by-step guide for hooking them up (power goes here, network cable goes here and into this box over here…)? i know you can do almost anything in the gui in mac os, but don’t tell me the gui is so intuitive that you can do a cluster config in it without any prior know-how about how a cluster works and is supposed to be configured?
please. you really think that the latest G5 is so hot it NEEDS water cooling? Apple has always been focused on getting the quietest possible computing experience for its pro line because those machines are used in audio production. water cooling is super quiet and you are just using it as a means to bash apple. water cooling only becomes a necessity when you reach 120 watts or thereabouts, and the G5 is nowhere close to even 100 watts.
An OS is an OS, some are better for certain things than others but no, there is nothing special about one over another. There is nothing special about Linux technically. It’s just software that performs a function.
Now, you may argue that it is special because of non-technical reasons (GPL, whatever), but that wasn’t the point of the post you replied to.
i have never used liquid cooling and i probably never will unless it can be guaranteed that a leak will not fry my system. so why they are using it i don’t know, but most of the time when someone comes around with liquid cooling it’s for temperature control, not noise.
and personally, if i wanted silence when working with sound (i have a friend that dabbles in that area) i would recommend tossing the computer into a different room or similar and running a small form factor fanless terminal at the work table, like say one of those via eden systems. with a 12 volt external supply you’re looking at as little noise as you can get. while liquid is less noisy, you still have a pump assembly, and that means an electric motor.
as for bashing apple. well, any type of ignorance is used by apple zealots to bash any other os, so maybe i’m just firing back with the same below-the-belt punches?
you know they are using a non conductive liquid right? you also know that it is self contained, unlike those kits you buy with tubes all over the place right?
and you also know that by saying you are never going to use water cooling that you are limiting yourself in the future right?
when have I ever bashed another OS? I am critical of Windows XP because I don’t see it being anything other than Windows 2000 (a good OS) with a pretty GUI, and I am critical of the zealots over there who refuse to move forward in many key areas. I am, however, waiting in great anticipation for Longhorn because MS will finally have moved beyond the old garbage and into a new era, and Gates is most certainly trying to do it with as little cruft and as few bugs as possible.
ignorance has nothing to play in my criticisms.
Please. They are using “LIQUID” cooling and I seriously doubt that the liquid is water. If anyone really knows the specific liquid that is being used I wish that you would educate all of us and stop this silly talk about water.
i know you can do almost anything in the gui in mac os, but don’t tell me the gui is so intuitive that you can do a cluster config in it without any prior know-how about how a cluster works and is supposed to be configured?
indeed you can, http://www.apple.com/acg/xgrid/
never having set up a cluster before, i set one up in about 15 minutes. of course, it was pointless, seeing as i had no cluster work to do, but the tachometer of GHz is pretty cool nonetheless.
> They want to plug it in, turn it on,
> drop in the install CD, make with the Double Click,
> and get to doing REAL work.
>
> And in that regard, Linux just doesn’t cut it.
You’re going to need an IT person to port the application
software from *nix to OSX, as a general rule, if we are
still talking about clusters, with their very specialized
software applications and codes.
ah, rendezvous, no wonder it was simple. that and a special signal package for telling the different hardware who is boss and who is not…
i have a feeling that the same could be implemented with linux at the base very fast if one wanted to. but somehow i’m not sure i want to trust an autodetecting, multicasting/broadcasting protocol with my cluster. at least not in any cluster setup other than one where you can split the math into single independent chunks (rendering is basically tossing 1 frame to every cpu and then putting the parts back together in the end). for stuff like weather prediction you need high amounts of cross-cpu talk and i don’t think this fits the bill for that kind of stuff…
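a toy sketch of that difference, in case it helps. none of this is xgrid or any real weather code: `render_frame` and `smooth_step` are invented stand-ins, and the 1-D “grid” is just a list of numbers split into at most as many chunks as it has cells. independent frames need nothing from each other, while a grid update needs the edge cells of the neighbouring chunks on every single step.

```python
def render_frame(i):
    """Stand-in for rendering one frame: depends on nothing but its own input."""
    return i * i

def smooth_step(cells, n_chunks):
    """One averaging step over a 1-D grid split into n_chunks pieces.

    Returns (new_cells, exchanged), where `exchanged` counts the edge values
    a chunk had to fetch from a neighbouring chunk during this step.
    """
    n = len(cells)
    bounds = [(c * n // n_chunks, (c + 1) * n // n_chunks) for c in range(n_chunks)]
    new_cells = []
    exchanged = 0
    for start, stop in bounds:
        # Edge values: taken from the neighbouring chunk when there is one,
        # otherwise the boundary cell is simply repeated.
        left = cells[start - 1] if start > 0 else cells[start]
        right = cells[stop] if stop < n else cells[stop - 1]
        exchanged += int(start > 0) + int(stop < n)
        padded = [left] + cells[start:stop] + [right]
        new_cells.extend((padded[i - 1] + padded[i] + padded[i + 1]) / 3
                         for i in range(1, len(padded) - 1))
    return new_cells, exchanged

if __name__ == "__main__":
    print([render_frame(i) for i in range(4)])       # no cross-chunk traffic at all
    print(smooth_step([0.0, 0.0, 9.0, 0.0, 0.0, 0.0], 3))
```

the answer is the same however the grid is split, but the number of values exchanged grows with the number of chunks and is paid again on every time step, which is where a slow or chatty network starts to hurt.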
No, the GPL is a license for defining the terms of distribution. It has nothing to do with the technical aspects of the linux kernel (i.e. process scheduling, memory management, security, etc.). So again, there is nothing TECHNOLOGICALLY special about linux.
Sometimes I feel like I’m talking to children.
I think the “liquid” is mostly pure water with percentages of lube and non-conductive mix. As you guys well know, pure water is not a conductor.
“Sometimes I feel like I’m talking to children.”
Well, what do you know?
More often than not, you are.
🙂
no, it cannot be water of any type. not because of the non-conductive nature of pure water, but because it is so hard to keep water pure and non-conductive, especially inside an active system. when the water hits the radiators it can pick up some loose atoms of the metal and all of a sudden it is conductive. if it leaks, that water will absorb garbage from the air and then it will be conductive and in contact with your computer parts.
I do like Apple’s new way of using resources on demand: their new Xcode system, when compiling, goes and finds spare resources on your network and uses them. Distributed builds, they call it. It’s kinda nice.
My friend and I set up a small Beowulf cluster on 10 old AST Pentium 133s running SETI; it was pretty easy tbh, we just used standard 100 Mbps Ethernet.