Linked by Thom Holwerda on Wed 7th Mar 2007 17:53 UTC, submitted by SReilly
Google is developing a program to help academics around the world exchange huge amounts of data. The firm's open source team is working on ways to physically transfer huge data sets up to 120 terabytes in size. "We have started collecting these data sets and shipping them out to other scientists who want them," said Google's Chris DiBona. Google sends scientists a hard drive system and then copies it before passing it on to other researchers.
Beware
by fretinator (4.24) on Wed 7th Mar 2007 18:19 UTC
fretinator
Member since:
2005-07-06
Fans: 6

Beware the USPS compression format for this data transfer!

Wagon still wins
by MightyPenguin (1.92) on Wed 7th Mar 2007 18:39 UTC
MightyPenguin
Member since:
2005-11-18
Fans: 0

As the old saying goes, nothing can beat the bandwidth of a truck full of hard drives ;) Though maybe someday we'll get something faster.

RE: Wagon still wins
by jessta (3.76) on Thu 8th Mar 2007 10:09 UTC in reply to "Wagon still wins"
jessta Member since:
2005-08-17
Fans: 3

Bandwidth indeed. But the latency sucks.

ZFS?
by TaterSalad (2.68) on Wed 7th Mar 2007 19:27 UTC
TaterSalad
Member since:
2005-07-06
Fans: 3

Would something like Sun's ZFS aid in this data transfer, or is it strictly about getting a protocol to transfer the files from one site to another? In which case you'd need some hardware to route the packets, like carrier grade or whatever it's called.

RE: ZFS?
by zbrimhall (2.32) on Wed 7th Mar 2007 20:45 UTC in reply to "ZFS?"
zbrimhall Member since:
2006-08-21
Fans: 0

Unrelated problems.

ZFS, being a filesystem, is unrelated to data transfer. I'm sure it could be useful to the project in other ways, though: ext3 filesystems, for example, have a size limit in the neighborhood of 8-16 TB. You could probably use some kind of logical volume manager to concatenate a bunch of filesystems together, but why do that if ZFS can manage such datasets without breaking a metaphorical sweat?

Of course, the article says nothing about the actual technology Google is using for these "hard drive systems," and I can't recall off the top of my head what the state of ZFS on Linux is (is it working now through FUSE?).

As for the problem at hand--transfers of enormous datasets--it's really just a Google implementation of the old proverb "never underestimate the bandwidth of a station wagon full of backup tapes speeding down the highway."

RE[2]: ZFS?
by Mathman (1.72) on Fri 9th Mar 2007 04:33 UTC in reply to "RE: ZFS?"
Mathman Member since:
2005-07-08
Fans: 0

Correction. You'd use LVM to join a bunch of physical volumes together into a logical volume. You'd still have to put a filesystem on the logical volume.

RE: ZFS?
by Ford Prefect (4.2) on Wed 7th Mar 2007 20:50 UTC in reply to "ZFS?"
Ford Prefect Member since:
2006-01-16
Fans: 6

You can't route 120 TB of data from one university to another. You just have to ship it on hardware.
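
For a rough sense of scale, here is a back-of-the-envelope sketch of how long 120 TB would take over a sustained network link. The link speeds are illustrative assumptions, not figures from the article, and `transfer_days` is just a name made up for this example.

```python
# Back-of-the-envelope: days needed to push a data set over a sustained link.
# The link speeds used below are illustrative assumptions, not from the article.

def transfer_days(size_tb: float, link_mbit_per_s: float) -> float:
    """Days to move size_tb (decimal terabytes) at a sustained rate in Mbit/s."""
    bits = size_tb * 1e12 * 8                   # 1 TB = 10**12 bytes
    seconds = bits / (link_mbit_per_s * 1e6)    # Mbit/s -> bit/s
    return seconds / 86400                      # seconds per day


if __name__ == "__main__":
    for mbit in (100, 1000, 10000):
        print(f"{mbit:>6} Mbit/s: {transfer_days(120, mbit):7.1f} days")
```

Even assuming a 1 Gbit/s link could be kept saturated the whole time, the transfer works out to roughly eleven days, and protocol overhead or shared links would only make it longer.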

To all MS haters
by CrazyDude0 (-0.48) on Wed 7th Mar 2007 21:41 UTC
CrazyDude0
Member since:
2005-07-10
Fans: 3

Mr DiBona, open source program manager at Google, said the team was inspired by work done by Microsoft researcher Jim Gray, who delivered copies of the Terraserver mapping data to people around the world.

that which is old is new again
by sn0n (1.76) on Thu 8th Mar 2007 02:11 UTC
sn0n
Member since:
2005-08-09
Fans: 0

sneakernet for the win.

It's cheaper too.
by chaosvoyager (1.48) on Thu 8th Mar 2007 16:35 UTC in reply to "that which is old is new again"
chaosvoyager Member since:
2005-07-06
Fans: 0

"sneakernet for the win."

Both in terms of bandwidth and cost.

http://www.codinghorror.com/blog/archives/000783.html

I think the economic value of a network is based on its latency, not its bandwidth. It's just much more difficult to measure the former.
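
To put numbers on the bandwidth-versus-latency point, here is a small sketch of the effective throughput of a shipped drive array. The three-day transit time is an assumed figure, and `sneakernet_gbit_per_s` is a hypothetical helper, not anything from the linked post.

```python
# Effective throughput of physically shipping a data set.
# The transit time is an assumption chosen purely for illustration.

def sneakernet_gbit_per_s(size_tb: float, transit_days: float) -> float:
    """Average Gbit/s achieved by delivering size_tb terabytes in transit_days."""
    bits = size_tb * 1e12 * 8                   # 1 TB = 10**12 bytes
    seconds = transit_days * 86400              # days -> seconds
    return bits / seconds / 1e9                 # bit/s -> Gbit/s


if __name__ == "__main__":
    rate = sneakernet_gbit_per_s(120, 3)
    print(f"120 TB in 3 days is about {rate:.2f} Gbit/s")
```

The average rate looks impressive, but the first byte only arrives once the whole shipment does, so the latency is measured in days rather than milliseconds.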

A new saying...
by fretinator (4.24) on Thu 8th Mar 2007 14:43 UTC
fretinator
Member since:
2005-07-06
Fans: 6

The Checksum is in the mail!