Microsoft and Apple Computer are searching for the same thing with their next operating systems: a better way to find stuff on an increasingly cluttered hard drive. The article fails to mention the similar Beagle & Storage projects for GNOME (albeit not as integrated into the filesystem) and the original effort that Apple’s Spotlight is inspired by (it is developed by the same engineers): Be’s BFS & Tracker.
OS X 10.4 and Longhorn are gonna ROCK for end users because the new systems will allow for data-centric computing rather than application-centric computing.
BeOS was the inventor of this idea and as Eugenia has said, Apple seems to have made it what it could have been if Be was still around.
Who knows exactly what WinFS and Stacks are gonna be, but if they work like Spotlight does with its Smart Folders, then they will rock as well.
BTW, DJ Jedi jeff:
http://www.census.gov/hhes/income/4person.html
read up and learn.
Apple has great people working for them from BeOS, NeXT, and open source. So I’m sure they will nail it to the wall. I hope Apple holds to its roots in icon-based computing (UI); it has changed everything. Kinda cool too if Apple put voice commands into it; that would be great for people who can’t see, or just for your regular joe.
Is KDE doing anything like this?
Kinda cool too if Apple put voice commands into it; that would be great for people who can’t see, or just for your regular joe.
People have been saying that for 20 years. They haven’t figured out how to stop people from yelling over the wall “newfs /dev/ad0s1a”. I suppose some sub-vocal thing would work though (a la http://amesnews.arc.nasa.gov/releases/2004/04_18AR.html)
I’m hoping that OSNews will have an article comparing all the solutions to the search issue.
Someone check the dates, then check the history on Copland – Apple was demo-ing smart folders in the dev previews of the _original_ OS 8 – Copland.
“Smart folders” doesn’t have much to do with what you see today at Tiger. What you see today is simply the evolution of BFS. It does exactly what BFS was doing, with the added feature of searching inside files.
The greatest feature of BFS is its “live queries”, no other fs could do that back then. And today Tiger does it. And it is not a surprise really. The people who work at Apple today on this very thing are the same people who created it for BeOS in 1998/9: Pavel and Dominic.
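For anyone unsure what a “live query” means in practice, here is a minimal Python sketch of the idea. It is not how BFS or Spotlight is actually implemented; `TinyFS`, `LiveQuery` and the attribute names are hypothetical, made up for illustration. The point is only that the filesystem pushes every change to the stored queries, so nobody has to re-run a search.

```python
class LiveQuery:
    """A stored predicate whose result set is kept current as files change."""

    def __init__(self, predicate):
        self.predicate = predicate
        self.results = set()

    def notify(self, path, attrs):
        # attrs is None when the file was deleted.
        if attrs is not None and self.predicate(attrs):
            self.results.add(path)
        else:
            self.results.discard(path)


class TinyFS:
    """Hypothetical in-memory 'filesystem' that pushes changes to queries."""

    def __init__(self):
        self.files = {}
        self.queries = []

    def register(self, query):
        # Seed the query with files that already exist, then keep it live.
        for path, attrs in self.files.items():
            query.notify(path, attrs)
        self.queries.append(query)

    def write(self, path, **attrs):
        self.files[path] = attrs
        for query in self.queries:
            query.notify(path, attrs)

    def delete(self, path):
        self.files.pop(path, None)
        for query in self.queries:
            query.notify(path, None)


fs = TinyFS()
fs.write("/music/a.mp3", kind="mp3", artist="Fred")

fred_songs = LiveQuery(lambda attrs: attrs.get("artist") == "Fred")
fs.register(fred_songs)

fs.write("/music/b.mp3", kind="mp3", artist="Fred")  # shows up without re-running the query
fs.write("/music/a.mp3", kind="mp3", artist="John")  # drops out once the attribute changes
```

After those writes, `fred_songs.results` holds only `/music/b.mp3`, and nobody ever "ran" the query a second time.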
I don’t know of any projects that are working on it, but conceptually, it shouldn’t be very hard. KIO is already pervasively supported in KDE. It’s powerful enough to support a file-centric interface into stuff like Debian’s APT database, so I don’t see why it couldn’t be extended to offer search functionality.
The point is to have *filesystem support* for this though, so indexing happens automatically. I don’t know about Reiser4, but the rest don’t seem to support “live queries” or immediate indexing the way BeOS & Tiger do it (I don’t know about WinFS).
That’s what’s missing from the Linux implementations: integration.
hmm, I think I talked too soon ( http://mirror1.macosxrumors.com/images/040630/dashboard.jpg ). Apple’s Spotlight also seems to do manual indexing that takes hours. Bleh…
Smart Folders certainly predate BeOS. Besides not offering the slick UI of Piles, Smart Folders are Piles. Piles were patented in 92.
As for the indexing, of course the first time would take several hours. The question is: once it is indexed once, does the index need to be rebuilt frequently or does it truly remain up to date on a live basis… and when updating live, does indexing create any slowdowns…
I don’t see any reason to go “Bleh” at a service which will need to build an index of thousands of files initially if it can keep up from there on out.
94, not 92… but still predates Dominic’s work.
Yeah, but the thing is: is smart folders same or even similar enough to BFS or Spotlight?
Also, Dominic worked at SGI in 1995, on XFS, before the BFS. XFS has some of the features of BFS.
“is smart folders same or even similar enough to BFS or Spotlight?”
That doesn’t make much sense to me. Can you explain?
As you said, Smart Folders are the equivalent of Live Queries which are also equivalent to Piles. Most attempts to have metadata- or db- driven filesystems have had stored queries.
Not to say that Dominic’s BFS wasn’t the best, earliest, most workable version … whatever. I’m just saying that I don’t necessarily see Smart Folders as based on Live Queries since Apple has history with the concept predating Dominic’s work.
“Also, Dominic worked at SGI in 1995, on XFS, before the BFS. XFS has some of the features of BFS.”
And? 95 is still after 94… and presumably Apple employees were kicking around the concept before they patented it… Still doesn’t make Dominic the progenitor of Smart Folders.
Any comment on the indexing? Don’t you think it’s reasonable that it may take some time the first time around?
> Smart Folders are the equivalent of Live Queries
I never said that.
>which are also equivalent to Piles
Actually no. Piles and Live Queries *are not* the same. You are confusing some things here.
>Still doesn’t make Dominic the progenitor of Smart Folders.
But what Apple showed with Spotlight is not “smart folders”. And looking at the new Finder search UI, it is just the Tracker “find” panel. This is obviously the work of Dominic and Pavel first and foremost.
>Don’t you think it’s reasonable that it may take some time the first time around?
No. On BeOS it was automatic. When you installed the OS or added a new file to the system it would index immediately and automatically. It was a feature of the file system.
“I never said that.”
Whatever… I have no idea what this means then:
“”Smart folders” doesn’t have much to do with what you see today at Tiger. What you see today is simply the evolution of BFS. It does exactly what BFS was doing, with the added feature of searching inside files.
The greatest feature of BFS is its “live queries”, no other fs could do that back then. And today Tiger does it.”
Smart folders definitely have quite a bit to do with what we are seeing in Tiger. Maybe you and your husband should have gotten a hold of the DP and installed it or gone to a session or talked to someone who did….
“Actually no. Piles and Live Queries *are not* the same. You are confusing some things here.”
Piles are a stored query. Apple also has a specialized UI to view them or to perform contextual functions on them, that they haven’t implemented yet. But the basic premise remains a non-folder storage area based on a query that auto-updates based on the query criteria. I don’t see the difference.
“But what Apple showed with Spotlight is not “smart folders”.”
I’m talking about what I am seeing with Smart Folders. Smart Folders are definitely related to the FS improvements and Spotlight search functionality. I don’t see your rationale for trying to remove them from the discussion.
“And looking at the new Finder search UI, it is just the Tracker “find” panel. This is obviously the work of Dominic and Pavel first and foremost.”
I haven’t disagreed with the fact that these guys work for Apple on these features.
“No. On BeOS it was automatic. When you installed the OS or added a new file to the system it would index immediately and automatically. It was a feature of the file system.”
Whatever. I could care less if there is an initial indexing period if it can occur in the background and only happens once. From there, I haven’t noticed any problems… rather automatic and immediate.
I don’t know if you are simply talking about a fresh install or an installation on a machine which already contains many files. For me, the indexing took about 2 hours for a rather overloaded and older iMac. No biggie.
No. On BeOS it was automatic. When you installed the OS or added a new file to the system it would index immediately and automatically. It was a feature of the file system.
But I thought it was not a feature of the FS until now, so a few hours to index the lot to start with is not bad.
Eugenia, during install it has to index it, but it does so without user input. How much more automatic can you get?
“Piles are a stored query. Apple also has a specialized UI to view them or to perform contextual functions on them, that they haven’t implemented yet. But the basic premise remains a non-folder storage area based on a query that auto-updates based on the query criteria. I don’t see the difference.”
Piles are folder based. The user decides what goes in them and what doesn’t. The closest thing to a query is sorting them, which is no different than ‘sort by’ in a folder.
I was talking about a clean installation, not about upgrading. If you are upgrading, then YES, it has to index the thousands of files you already had with the old filesystem. But if it is a clean installation –and any subsequent new files added to the system– they should be indexed automatically.
there is a communication problem here.
to me, automatic means that the system does it without the user telling it to.
so an index time after a fresh install is fine, and as long as it indexes any new files when I create them without me telling it to, that is automatic.
to you, it seems to me that on a clean install, you want the indexing to take place when the files are being written to the file system, not afterward.
either method is automatic. one just builds the index while the files are being written, the other makes the index after the files are written. it takes the same time to index no matter whether it is inline or sequential.
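To make the two approaches concrete, here is a rough Python sketch. The function names, the file list and the attribute names are all made up for illustration, and this is not how either OS really does it. It only shows that indexing inline and indexing afterward end up with the same index; it says nothing about how long each takes or when the machine is ready to use.

```python
def index_inline(files):
    """Update the index in the same step as each file is written."""
    disk, index = {}, {}
    for path, attrs in files:
        disk[path] = attrs                    # write the file...
        for key, value in attrs.items():      # ...and index it right away
            index.setdefault((key, value), set()).add(path)
    return index


def index_afterwards(files):
    """Write everything first, then walk the disk once to build the index."""
    disk = {}
    for path, attrs in files:
        disk[path] = attrs
    index = {}
    for path, attrs in disk.items():
        for key, value in attrs.items():
            index.setdefault((key, value), set()).add(path)
    return index


# Hypothetical files written during an install.
files = [
    ("/docs/a.txt", {"type": "text", "author": "fred"}),
    ("/docs/b.txt", {"type": "text", "author": "john"}),
    ("/music/c.mp3", {"type": "mp3", "author": "fred"}),
]

same = index_inline(files) == index_afterwards(files)
```

Here `same` comes out true: the difference between the two is when the work happens, not what the index contains.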
You are using the word “automatic” on a user level, while I am using it on a system level. The BeOS filesystem wouldn’t index in the background after it got itself a fresh install; everything would be indexed and ready to use from the moment the OS was booted for the first time. If OS X requires the system to [manually-]automatically index its fs after the OS gets installed, for me, that’s not very automatic, because it shows me that the filesystem does not have that capability. I am interested in seeing the capability in the fs itself.
10.4 indexes the files AFTER they are written to the disk and automatically indexes any new files as they are created without the user needing to tell it to. spotlight is a live query system that is system wide and can be used in any application. “Smart Folders” is what I called them at first because they seem to have the same features as smart playlists and stuff like that. basically they will be a stored Query that will use the “folder” metaphor, if Apple calls these Piles, or smart folders, I don’t care. all I care about is being able to automatically sort my data the way I want automatically in a way that is meaningful to my work flow.
like I said, you are talking about inline indexing during install. apple is doing it afterward. and I really do not care as long as it indexes my new files without me knowing.
“Indexing new files in the background” and taking CPU time and memory away from me *for hours* while I want to fully use the newly installed OS (clean install), is NOT the same as “fully ready to use”.
You might be happy with it, I am not. If BFS could do that in 1998, I expect OSX’s fs to learn to do that too.
well, there is no reason that apple could not implement this in the install routine. if enough people write in about this needing to be done, then I am sure it will be trivial to do.
Okay here, I just indexed my entire hard drive (40 gigs) in Panther, and it didn’t take hours. It took under an hour. If they do this during a clean install and auto-update as files are created and altered, which they already said it will (a single file is pretty much instant) then there is no problem. If it adds an hour to the upgrade time (based on what happened when 10.3 was released I’d suggest a clean install anyway) then that’s not really a big deal…if you are rushed for time then it’s probably not a great time to install a new OS anyway. Most people put aside at least half a day for a new OS install and to play with the new features.
Hans Reiser has apparently given a great deal of thought to some of the issues surrounding search-based systems.
http://namesys.com/ (See the ‘Reiser4’ and ‘Future Vision’ links)
The two papers are very detailed, in-depth, and insightful. They’re not quite bed-time reading, so before diving in, make sure you have a dictionary (and perhaps a database algorithms text) handy
The single huge difference between the two systems is that BFS queries don’t search the contents of files.
Given an existing ( pre install ) hard drive full of files, they need to be indexed.
BFS got off easy here because it doesn’t actually index very much by default. I’m happy to give OS X a few hours to index all the contents of my drive.
The next thing to note is that BFS _isn’t_ automatic. If you have an attribute on a file type ( eg: artist name attribute for mp3 ) that isn’t already indexed, setting that attribute to indexed will _not_ index your existing files. You have to manually run ( command line ) the indexing utility.
As far as I’m concerned Spotlight _is_ a huge improvement over BFS queries ( because you get content searching ). The UI for Spotlight also looks better: when you create a smart folder you get a live query view which you can edit the parameters of, without the crap back and forward between query and results window that BeOS had ( not a big thing to change I’m sure, but really obvious ).
One of my biggest problems with XP is not being able to find stuff as I am able to in 2000, 9x, linux, etc…
If the file type “extension” is not “registered” with Win XP you can pretty much stand on your head and look at the stupid dog wagging its tail telling you that the search found nothing !
simple test:
create a text file “test.php”
using your favourite text editor type in “hello world”
Now try to do a search for a file called test.*
it will not find it !!!
Now try to do a search for all files containing the phrase “hello world”
it will not find it
you just get the stupid dog (I like real dogs, so don’t flame me for calling this one stupid)
Is it really that hard to organize your files and make a search that really searches “all” files ? or is the idea of simple is good just beyond MS ?
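For comparison, a brute-force search that really looks at “all” files, registered extension or not, is only a few lines. This is a minimal Python sketch of the test above, run in a temporary directory; the `search` function is a hypothetical helper and has nothing to do with how XP’s search works internally. It reads every file in full, so it is slow on a big disk, which is exactly the cost an index is meant to avoid.

```python
import fnmatch
import os
import tempfile


def search(root, name_pattern="*", phrase=None):
    """Walk every file under root; match on name and, optionally, on content."""
    matches = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if not fnmatch.fnmatch(name, name_pattern):
                continue
            path = os.path.join(dirpath, name)
            if phrase is not None:
                try:
                    # Read raw bytes so the file extension plays no role at all.
                    with open(path, "rb") as fh:
                        if phrase.encode("utf-8") not in fh.read():
                            continue
                except OSError:
                    continue  # unreadable file: skip it rather than abort
            matches.append(path)
    return sorted(matches)


with tempfile.TemporaryDirectory() as root:
    # The test from the post: a text file called test.php containing "hello world".
    target = os.path.join(root, "test.php")
    with open(target, "w", encoding="utf-8") as fh:
        fh.write("hello world")

    by_name = search(root, "test.*")
    by_content = search(root, phrase="hello world")
    found_both = by_name == [target] and by_content == [target]
```

Both searches find `test.php` here, because nothing in the loop cares what the extension is.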
Spotlight is based on a Metadata API (just browse the description of WWDC sessions) which is layered *over* the HFS+ filesystem. As Steve Jobs said during the keynote, special importers support the current formats (iTunes and iPhoto catalogs, mailboxes, iCalendars, vCards) and don’t use the special fork for extended attributes already available in HFS+. The technology is totally different from Be’s: Apple is using a powerful indexing engine but hasn’t introduced the concept of “FileTypes” definable by users. Spotlight is not integrated into the Finder windows: there isn’t a list view mode where you can edit the attributes, and you can’t customize which columns to display. Giampaolo and Cisler have been working on totally different problems than the ones they solved at Be.
I’m curious why you feel indexing is such a terrible idea? I mean, having this stuff integrated into the file system is essentially like having the filesystem act as a metadata index itself.
If you look at it from a database perspective, you have a bunch of data with a number of properties attached. For instance, you have a filename, a create date, and now you have various metadata fields as well. If this were an abstract database we were talking about, to make searching fast, you’d potentially need multiple indices, one for each field. I’m not sure what advantage having them all be a part of one index (the metadata-filesystem) would be. But then, I’m not a database/filesystem expert either.
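Here is roughly what I mean by one index per field, as a small Python sketch. The records and field names are invented for illustration, and a real filesystem or database would use on-disk B-trees rather than dictionaries. Each field gets its own lookup table, and a query over several fields intersects the matches from each.

```python
from collections import defaultdict

# Hypothetical file records: a name, a create date, and optional metadata.
records = [
    {"name": "a.mp3", "created": "2004-06-01", "artist": "Fred"},
    {"name": "b.mp3", "created": "2004-06-02", "artist": "John"},
    {"name": "c.txt", "created": "2004-06-01"},
]


def build_indices(records, fields):
    """Build one index per field: field -> value -> set of record positions."""
    indices = {field: defaultdict(set) for field in fields}
    for pos, record in enumerate(records):
        for field in fields:
            if field in record:
                indices[field][record[field]].add(pos)
    return indices


def lookup(indices, **criteria):
    """Return positions of records matching every field=value criterion."""
    hits = None
    for field, value in criteria.items():
        matches = indices[field].get(value, set())
        hits = matches if hits is None else hits & matches
    return sorted(hits or [])


indices = build_indices(records, ["name", "created", "artist"])
same_day = lookup(indices, created="2004-06-01")
fred_that_day = lookup(indices, created="2004-06-01", artist="Fred")
```

`same_day` gives positions 0 and 2, and adding the artist criterion narrows it to position 0, without ever scanning the records themselves.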
Ok, after reading the rest of your posts, I think what you are wanting is for the system to auto-index new data transparently. This doesn’t need to be part of the “filesystem” to have the effect you want. It only needs to be part of the system in such a way that it doesn’t require any special application calls or periodic reindexing.
From the sounds of the posts here, OS X does exactly what you want. It’s unavoidable that it has to do an initial massive index upon upgrade. And it’s not really important if it has to do it once after a clean install either, I mean how often do you clean install. (Hopefully just once?)
I don’t see the point in caring whether it’s part of the FS or just implemented as a virtual FS layer on top of the existing file system. I’d be willing to bet that in fact it is implemented as a virtual FS layer which is an interface to the original FS + some extra index code. We already know this is how WinFS will work in Longhorn.(WinFS is implemented on top of NTFS.) It actually makes very good engineering sense to build it this way in my opinion.
I can’t help feeling the real question here is: “How do we keep users upgrading their hardware now that we reached the point where their current systems are already ridiculously over-powered for most of their needs?”
The answer: use 10,000 processing cycles where 10 used to be enough…
I have no trouble at all finding my files – I keep them organised in folders, and I need nothing more from my filesystem. I just hope there will be a way of turning all this stuff off when the time comes.
The next thing to note is that BFS _isn’t_ automatic. If you have an attribute on a file type ( eg: artist name attribute for mp3 ) that isn’t already indexed, setting that attribute to indexed will _not_ index your existing files. You have to manually run ( command line ) the indexing utility.
That’s right, adding new attributes does not trigger reindexing on BFS. However, I believe the reindexing is always done when copying files (very useful if you dump a whole bunch of mp3s from a FAT partition or something onto your BFS partition) and when selecting “identify” (seems to work recursively on folders) in Tracker – not sure though, can somebody confirm? Of course, you can force reindexing from the cmd shell too… All in all, queries and indexed attributes work quite well in BeOS; the only problem is that, just like probably all BeOS users, I have non-BFS partitions on my HD, and copying files to BFS just to have the advantage of indexing and custom attributes is out of the question…
>I have no trouble at all finding my files – I keep them organised in folders,
>and I need nothing more from my filesystem. I just hope there will be a way
>of turning all this stuff off when the time comes.
But the beauty of this system is that it works not only on the ‘traditional’ filesystem, but also your mailboxes, iCal calendars, iTunes etc. etc. Your search doesn’t have to return just files — so if I wanted to find all correspondence from my mate ‘Fred’, it would include not only OOO documents, but also emails, calendar entries, songs composed by Fred etc.
Pretty cool if you ask me.
My opinion on all these new searching techniques is that they are pretty useless to the common user. Well, not useless, but redundant. I mean, a search engine for the internet, logical, but for your own hard drive? I just think people won’t use it…
Most people do know where they keep their stuff. I’ve never heard anyone complaining “I can’t find this and that photograph.” Of course that doesn’t really say a thing, but I think you can generalize that to the larger crowds. People are pretty handy themselves; and besides, operating systems and/or special utilities put pictures downloaded from cameras into separate folders anyway.
As to music, most people tend to throw all their mp3’s into one directory anyway, and then simply use a music player to do their work for them. They don’t go looking for individual music files; why should they? If they want to play a specific song, they’ll just open WMP/WinAmp, sort-by-name and find the song. No need for a huge integrated, maybe even resource-devouring search system.
It seems to me that the people at Microsoft and Apple think that users just throw all their files at random on their HDD’s, but that just isn’t true, as far as I can tell.
I think that this is an example of needs being created by corporations. I have my music arranged really nicely, and I have a player (Muine) that is actually able to make sense of the way I have stored it. I did not write this player, but someone actually wrote the player to take advantage of how people actually organise their files, mainly that they usually put albums in separate folders.
I have heard of people who now have a flat directory to keep all their music, and I cannot imagine how ugly that must be, but by encouraging such behaviour, it is my opinion that companies such as Microsoft are creating a need for such live query type systems, because soon people are not going to care for order on their hard drives. But I think as long as hard drives are as slow as they are compared to other components in a PC, users are very well served by having a good hierarchical structure on their drives, because it removes the need for software to find stuff for them. Can you imagine trying to copy over thousands of small (text) files while your PC is trying to index and copy at the same time? These things make your computer slower, because it has more to do.
My PC, when running linux, comes by default configured to run updatedb every morning, when presumably I am supposed to be sleeping (they don’t know), and it indexes just the files. I can only imagine if it had to index mp3s and their tags, movies, text files and their contents and so on. It would have to run much slower.
I think it is better to make tools that encourage good behaviour, rather than encourage dependency on something that will be inherently slower, but masks that in the name of convenience.
hmm, I think I talked too soon ( http://mirror1.macosxrumors.com/images/040630/dashboard.jpg ). Apple’s Spotlight also seems to do manual indexing that takes hours. Bleh…
But of course it does, hazoula 😉
THERE IS JUST NO WAY to search a hard disk (or anything big) *fast* without making indexes first.
But this indexing should be done only once, when you begin to use Tiger or when you add a new disk. From then on, indexing should happen incrementally, i.e. as you change or add individual files.
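A toy version of what I mean, in Python: an inverted index (word to files) where the initial pass just calls `update` once per file, and later changes only touch the words that actually differ. The `ContentIndex` class and the paths are made up for illustration, and I am not claiming this is how Spotlight’s engine works inside.

```python
import re


class ContentIndex:
    """Tiny inverted index: word -> set of paths, updated one file at a time."""

    def __init__(self):
        self.postings = {}   # word -> paths containing it
        self.words_of = {}   # path -> words currently indexed for it

    @staticmethod
    def _tokens(text):
        return set(re.findall(r"[a-z0-9]+", text.lower()))

    def update(self, path, text):
        """Index a new file or re-index a changed one, touching only the diff."""
        new = self._tokens(text)
        old = self.words_of.get(path, set())
        for word in old - new:
            self.postings[word].discard(path)
            if not self.postings[word]:
                del self.postings[word]
        for word in new - old:
            self.postings.setdefault(word, set()).add(path)
        self.words_of[path] = new

    def remove(self, path):
        self.update(path, "")
        self.words_of.pop(path, None)

    def search(self, query):
        """Return paths containing every word of the query (not an exact phrase match)."""
        words = self._tokens(query)
        if not words:
            return set()
        return set.intersection(*(self.postings.get(w, set()) for w in words))


idx = ContentIndex()
# The slow one-time pass is just this call repeated for every existing file.
idx.update("/www/test.php", "hello world")
idx.update("/notes/todo.txt", "say hello to Fred")

before = idx.search("hello")
# Editing one file only touches that file's words, not the whole index.
idx.update("/www/test.php", "goodbye world")
after = idx.search("hello")
```

`before` contains both files and `after` only the todo note; the second `update` did a handful of dictionary operations instead of a full rebuild.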
By the way, IIRC, all these indexing capabilities come courtesy of the V-twin engine, implemented by Doug Cutting (?), of Lucene fame.
An interview with this guy would be nice. Lucene is one of the best implemented and managed open source apps/libs in the world. If only OSS were this good more often.
Hi little gray, if you download TweakUI for Xp you can make Explorer use the classic search. I think it’s under one of the General or Explorer options. It’s the best search interface I’ve used.
Also I agree with those who say the indexing thing is overrated! However, I can see how it’s useful for new users. I remember getting lost when Windows 2k came out and suddenly there was this Documents & Settings thing… Although I think that a lot of effort is being put towards something that could be negated with a little experience & education!
Why did moderated comments get deleted?
I thought you didn’t delete comments, just moderate them?
🙁
A Smart Folder isn’t just defined by what a user puts in. The user can set a query similar to a search (modified by, date range, file types, attributes, etc…) and then all files which match it will be in the Smart Folder. It isn’t just a sort. This is equivalent to Live Queries and Piles.
Secondly, I’m glad someone pointed out that BFS did not index the contents of files. That is what the OS X indexing engine indexes… the full content of files. So… for you… you have done clean installs of BeOS and didn’t notice any indexing, although it did happen, and it did not index file contents… Others of us have installed Tiger on existing computers which already contained large numbers of files and there was an initial indexing period of anywhere from a half hour to two hours… but this was all contents of all files. Once this was completed, all indexing appears transparent and automatic.
>>I have no trouble at all finding my files – I keep them organised in folders,
>>and I need nothing more from my filesystem. I just hope there will be a way
>>of turning all this stuff off when the time comes.
I agree 100% !!!
>But the beauty of this system is that it works not only on the ‘traditional’ filesystem, but also your mailboxes, iCal calendars, iTunes etc. etc. Your search doesn’t have to return just files — so if I wanted to find all correspondence from my mate ‘Fred’, it would include not only OOO documents, but also emails, calendar entries, songs composed by Fred etc.
>Pretty cool if you ask me.
Not really, you will just end up with a thousand results just like Altavista, and how many of them are useful ?
If I am searching for an email I go to my email client.
If I am looking for appointment, I go to my calendar.
If I am looking for an mp3 I go to my mp3 directory.
Simpler, easier, faster, and I may be thick but to me you can’t improve this much more….
If I am looking for a song by “john”, I don’t want all the emails that I got from “john”, just use the appropriate tool for the job…all in one solutions have a habit of complicating things and not working in the end.
Some research please.
You don’t need to search the contents of files, for one.
Secondly, the results are broken out so that you get apps first, docs second, emails third, contact entries fourth, etc… (This is just a rough hypothetical) So results are broken out by type and are quite clear.
You can also further refine the searches quite a bit so that you don’t get the results you are complaining about.
Thirdly, the search features are specialized per each app: in mail, it just searches mail (but provides the robust options), in iTunes, just tunes, in iPhoto, just photos, etc… Each app can build the search features into their app, or you can search from the Finder, or you can search from the system (Spotlight field).
Fourthly, this is an improvement. If you don’t want to use it, you don’t have to. I don’t see how it complicates anything.
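To show what “broken out by type” could look like, here is a rough Python sketch. The kind names, their order and the sample hits are all hypothetical, in the same spirit as the rough ordering above; it is not Spotlight’s actual behaviour or API. Hits are grouped per kind, and restricting the search to one kind is just a filter.

```python
# Hypothetical display order for result kinds; anything else sorts after these.
KIND_ORDER = ["application", "document", "email", "contact"]


def group_results(hits, only_kinds=None):
    """Group (kind, title) hits by kind, optionally keeping only some kinds."""
    grouped = {}
    for kind, title in hits:
        if only_kinds is not None and kind not in only_kinds:
            continue
        grouped.setdefault(kind, []).append(title)

    def rank(kind):
        position = KIND_ORDER.index(kind) if kind in KIND_ORDER else len(KIND_ORDER)
        return (position, kind)

    return [(kind, grouped[kind]) for kind in sorted(grouped, key=rank)]


# Made-up hits for a search on "john".
hits = [
    ("email", "Re: lunch (from John)"),
    ("song", "Imagine - John Lennon"),
    ("contact", "John Smith"),
    ("email", "Invoice from John"),
]

everything = group_results(hits)
songs_only = group_results(hits, only_kinds={"song"})
```

`everything` lists the two emails together, then the contact, then the song, while `songs_only` is what you would get searching from inside a music app: just the song, no emails from John.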
>You don’t need to search the contents of files, for one.
Huhh….?? how do you know if I need to search the contents of files or not ??? do you know something about me that I don’t ???
> You can also further refine the searches quite a bit so that you don’t get the results you are complaining about.
Exactly my point: you can refine your search, meaning extra work. If I open the right tool for the right task in the beginning, I don’t have to do that and I get the “refined search” for free…
>Some research please.
—
yes …
—
“Huhh….?? how do you know if I need to search the contents of files or not ??? do you know something about me that I don’t ???”
You just said you don’t want to search, that you don’t want search results to show all contents… If you don’t want to, don’t. If you do, do.
That’s all I’m saying. Not saying d1ck about you. I don’t know sh!t about you and I don’t want to either.
“Exactly my point you can refine your search, meaning extra work. If I open the right tool for the right task in the beggining, I don’t have to do that and I get the “refined search” for free…”
Blah, blah, blah… What I am telling you is you have a number of options to pursue… I give you all your choices and you start whining again about organizing and finding files because you know where they are… Whatever, man.
This isn’t just whining, and it isn’t as simple as saying, “if you don’t like this feature, just don’t use it”. Building all this meta-data and searching into the file system takes its toll on performance, requirements and stability whether I use search function or not. *That* is the problem, and that is why it makes perfect sense to argue whether we really need it in the first place… and to demand that it can be left out altogether if user desires. It will be less of an issue on Linux – if I don’t like using Reiserfs4 I’ll probably be able to use any other filesystem instead – no problem. But you Windows users… be afraid, be very afraid.
Then it is just whining. I am. I notice no speed difference. Searches are quicker and more convenient. Smart folders, mailboxes, groups, etc… are very useful.
I’ve seen no performance hit but I have seen a performance boost.
I would imagine that it’ll be rather trivial to disable the indexing feature, which is the only thing I could imagine creating a performance hit, although I have not noticed one.
Ah yes – Tiger, that OS renowned for its excellent performance on low-end hardware… thank you for proving my original point: this is not about user’s need for indexing – this is about selling hardware.
Tiger is running on a 400MHz iMac at the same speed as Panther back at my house. And this is at an early Developer Preview stage. Renowned for its excellent performance on low-end hardware? Only 4000 some odd developers have it, dude, and I have seen zero reviews about its performance on low-end hardware. Are you making sh1t up? Because, quite the contrary, everyone I have spoken to at the WWDC is shocked by how well it does perform at this early stage.
You clearly do not know what you are talking about and enjoy whining.