“PC users have volumes of information saved on their computers, most of it disconnected and disparate save for a basic directory system. The answer to connecting all the information into a local semantic Web of information is closer than you might think. Thanks to the open source NEPOMUK (Networked Environment for Personalized, Ontology-based Management of Unified Knowledge) effort, the Semantic Desktop isn’t a dream; it’s an emerging reality and will be here with the upcoming release of KDE 4 for the Linux desktop.”
Open Source Semantic Desktop Is Coming
50 Comments
Well tracker is a semantic web store minus the rdf complexity
it implements the triple store which is unified across indexer and user defined metadata (unlike strigi and nepomuk which have incompatible dbs and have two daemons)
tracker has the ability to do almost anything nepomuk does + everything strigi does but using only one DB and one daemon.
The end result is that tracker is faster, lighter and a lot easier to use (very little complexity)
The second iteration of XESAM will cover metadata storage and semantics so hopefully nepomuk and tracker will be compatible that way.
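To make the "one triple store for indexer and user metadata" idea above concrete, here is a minimal sketch in Python. It is not Tracker's or NEPOMUK's actual API; the class name, the predicate names and the file URI are all made up for illustration.

```python
class TripleStore:
    """Tiny in-memory store of (subject, predicate, object) triples."""

    def __init__(self):
        self._triples = set()

    def add(self, subject, predicate, obj):
        self._triples.add((subject, predicate, obj))

    def match(self, subject=None, predicate=None, obj=None):
        """Return all triples matching the pattern; None acts as a wildcard."""
        return sorted(
            t for t in self._triples
            if (subject is None or t[0] == subject)
            and (predicate is None or t[1] == predicate)
            and (obj is None or t[2] == obj)
        )


SONG = "file:///home/me/song.mp3"  # hypothetical file URI

store = TripleStore()
# Metadata an indexer might extract automatically (hypothetical predicates).
store.add(SONG, "mimeType", "audio/mpeg")
store.add(SONG, "artist", "Some Band")
# Metadata the user adds by hand lands in the same store.
store.add(SONG, "tag", "favourite")
```

With both kinds of metadata in one place, a single call such as `store.match(subject=SONG)` returns indexer-derived and user-defined triples together, which is the point the comment makes about avoiding two incompatible databases.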
I don’t know about Gnome, but they (and why not other desktops and projects too) should sure consider using this kind of technology. This NEPOMUK thing seems really interesting, a big step forward in search technology as far as I can see. Also one of the still relatively few cases where semantic web ideas are becoming actually useful for common PC and net users.
for microsoft’s ‘extended’ NEPOMUK….
In other news, it’s a big plus to KDE to be at the forefront of the implementation efforts – even if applications aren’t taking advantage of it yet, the environment’s usually tight integration probably means we won’t be left waiting for long.
-
2007-07-14 9:28 pm ronaldst
> for microsoft’s ‘extended’ NEPOMUK….
Why would they need to extend something they have already been offering for a while? Microsoft’s stuff is already industry standard.
…but I have to say it again… KDE 4 is shaping up (in my mind) to be an OS X killer. I’m starting to get chills thinking about (wish I could reuse my mini here) running PC-BSD or maybe Slackware with KDE 4.
And it isn’t just this, but the fact that KDE (and GNOME) really are continually improving and not afraid to take a chance by including some new unproven technology or methodology.
Meanwhile OS X gets uglier, but at least it seems to be more or less pure cocoa now. And yeah, they added that cool new view, but other than that Finder has not changed much in 5 years and by change I mean there have not been a lot of useful features added.
I can’t speak to windows since I only use XP for work.
-
2007-07-16 2:08 am jdrake
So you are saying that the talk of KDE 4 is reaching new heights? An OS X killer? This is the epitome of hype.
I am cautiously awaiting KDE 4 myself. I am fairly good with GNOME right now, but not seeing them doing anything *interesting*.
-
2007-07-16 2:25 am Tuishimi
I think as Linux and free BSDs grow more reliable and their driver base grows, UIs become easier to use and configure and the tons of (what I consider to be) office-grade software grows and matures, I say yeah, it could be considered that.
Now KDE4 is looking slicker than ever (let’s face it, a big part of why people like OS X is because it LOOKS GOOD), it is being simplified with a useful out-of-the-box look and feel, full of drag and dropness (which will improve as all the frameworks/APIs are developed to make UI cross development improve with GNOME and KDE) but still has the capability to be tweaked by confident users, etc.
I don’t know, what more does OS X offer? It is only now becoming more of a gaming platform…
I am not suggesting the masses would dump their macs just because of KDE 4… I am saying that for ME I think it has a lot of appeal.
i don’t understand how much VALUE semantic desktop can bring to the desktop.
i am not really an organized person – i usually have 30 temp folders on my desktop and dump all the files into them. Once in awhile I burn them to a DVD and store them on my desk. Repeating the process – it’s a process that works for ME.
the thing is NOTHING is related at all. I’ve got some C# source files, pictures of jessica alba, some trial edition apps, rss feeds, mp3s, etc.
in regards to the poster stating KDE may be an OS X killer – i highly doubt it. Leopard is WAY ahead in terms of UI – look at Core Animation (and no i’m not an apple fanatic; i switched from a mac to vista…i miss using thinkpads and i’m excited about .NET 3.5)
my 2 cents
-
2007-07-15 1:05 pm maxjen
I’m not sure, but isn’t Quasar something like CoreAnimation? And Lars Knoll said: “CoreAnimation will give us a lot of great ideas to help build our version of this technology in the future.”
http://arstechnica.com/journals/apple.ars/2007/06/12/ars-at-wwdc-in…
-
2007-07-15 4:34 pm aseigo
> Leopard is WAY ahead in terms of UI
and far behind in other ways. and we’re catching up with the UI advances; in some areas we’re at parity, in others we’re approaching parity and we’ll just continue until we are undeniably ahead.
as for CoreAnimation, we’ve been toying with similar things in KDE. Quasar and Phase/Animator are a couple of examples; Quasar, once released and ready to use, will replace the Animator bit of Phase/Animator and we should at that point have something that quite successfully competes with CoreAnimation.
-
2007-07-15 9:51 pm djames
I haven’t used KDE since 3.0 – when I think of KDE I envision “clunky” QT components. It appears 4.0 is a major change…and change is good (when done right).
I wish all the KDE developers best of luck.
-
2007-07-16 5:45 am lemur2
> I haven’t used KDE since 3.0 – when I think of KDE I envision “clunky” QT components.
KDE gained very significant improvements in speed in each major version update from about 3.3 through to 3.5, and some minor speed tweaks in between.
KDE4 promises to be faster again.
If you have tried only KDE 3.0, then you will be in for a very pleasing surprise if you were now to try KDE 3.5 or KDE4.
Windows gets slower and slower … while KDE is getting faster and faster.
http://kde.org/announcements/announce-3.5.3.php
http://kde.org/announcements/announce-3.5.4.php
http://kde.org/announcements/announce-3.5.5.php
http://kde.org/announcements/announce-3.5.6.php
http://kde.org/announcements/announce-3.5.7.php
GNOME also has had some speed-ups in recent times, I am led to believe, although not as significant as KDE.
Edited 2007-07-16 05:54
For OSS in general it’s important to get all noses pointing in the same direction without scaring devs into forks. In my opinion people shouldn’t look too much at what the neighbours are cooking. Instead do what you do best and enjoy it. KDE seems to have a clear goal in mind that certainly has potential.
In Windows Vista, Windows Search already has all these features. With Windows Vista you can add metadata to your files (use predefined metadata or add new personalized metadata), find files using complex queries based on boolean operators, attributes, metadata, or use natural language, etc.
Semantic Desktop is not innovative, it’s a copy of Windows Search.
Edited 2007-07-15 15:47
-
2007-07-15 4:30 pm aseigo
actually, windows search is essentially equivalent to strigi. except that you’ll find that as time goes on many more applications will use strigi on the platform than will use windows search due to how these things disseminate in the open source world (and perhaps KDE even more so)
nepomuk adds a few new angles to it, being oriented towards social metrics, using common ontologies and providing graph-based rdf stores.
it is easy to glance at two systems in the same general space (“search”) and go “oh, they are the same”, but if you actually educate yourself as to what each does then you may find differences. in this case they certainly exist and windows search is not comparable to nepomuk+strigi.
enjoy your vista, though.
At first glance this does not do anything that I can’t already do using BeFS. However, I have run into enough problems to know how some things could be improved.
Example: presently I can search all my email, Usenet mail, and some download web forums by an author’s name. But since they use different versions of their name on different services, I have to make an expanded search. And if I do not save it, or I just remember one name, I have no way to couple the different author names together so that a search on one name automatically searches for the other ones as well.
Another problem I face on BeFS is that it presents two different ways of storing the date attribute; this affects which searches work and which do not, almost at random.
Are improvements like that available?
Hi Aaron
I have been following the progress of Plasma fairly closely and things are really looking great. I really appreciate all the work that you and the KDE 4 team are doing.
I have a question though. I’m just a KDE user, and I have been using it since KDE 1. From a user’s perspective it would be nice if Plasma could play videos and were multimedia oriented. For example, it would be cool if Plasma could play videos in the background without external tools like xwinwrap-mplayer, while all your icons and widgets stay visible on top, and also show videos in widgets for mini-players, YouTube etc., or real-time previews of videos on the desktop. I think all this would be cool not just for the desktop but for other things like media centers and touch screens. I don’t know if this fits Plasma’s purposes or agenda, but it is just a thought I had and wanted to share. Considering that you have all that cool multimedia infrastructure with phonon-gstreamer etc., it would be cool if Plasma took advantage of it.
I wish you all the best and keep up the good work, I can’t wait to use KDE 4 on my desktop and this will rock =)
Edited 2007-07-15 22:57
That’s one of my criticisms of the open source movement…OSS almost never leads, it always follows commercial stuff. Now with this project, OSS seems one step ahead of commercial software, which is very nice.
…towards a semantic web, if I understand the article correctly. Apparently it’s supported by the European Union as well.
What I find especially interesting is that the NEPOMUK specs, as truly open specs, are being implemented for GNOME and Windows. It’s nice to think that, if this turns out half way as good as it sounds, nobody will have to be left out.
From the very first mention of spotlight, I have been waiting for people using DB file systems to actually start innovating on the same tired ideas we have been using for years now. At first, I was sure Apple would give it to us, as they tend to lead the way in such things. Instead, we have just gotten better and better searches with spotlight. Then MS starts talking about all the cool new things they will do with it, and I thought “Man, what a slap in the face for apple, to be beaten to the punch for once by MS”. Instead, we got something that is significantly WORSE than spotlight. Then beagle gets announced, I try it out, and it seems like a simple “Me Too” type of technology that will forever be a few generations behind the other two.
Fast forward a few years, and it’s Linux that is leading the way. I honestly didn’t think it would happen, but we are seeing far more progress on this front with NEPOMUK than with anything else. I still can’t wait for the day when folders are no longer used in place of metadata, allowing for fairly flat filesystems, and file names are only one of many identifiers for what you are looking for, but at this point any step in that direction is reason for celebration.
I’m pretty much in agreement with that. During Vista development, I was really put off by Microsoft dropping meta data for everything but Office files. Losing WinFS and then any semblance of meta cut the heart out of the new Explorer. Now Apple may have delivered detailed meta searching (and decent saved searches), but there is still no up front way of tagging and editing meta on the filesystem, and this is where NEPOMUK is stepping up the game.
My only real concern is that It’s going to be near impossible to breakdown the desktop/folder metaphor that everybody is so used to, and go fully meta.
I think it would be fairly easy to emulate folders using a metadata-based system. For the end user everything could look as usual, while under the hood a folder is implemented as a query for files with some specific attributes.
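As a rough illustration of "a folder is a query", the sketch below models a virtual folder as a saved set of required attributes and lists whatever files currently match. The class names, attribute keys and file names are hypothetical, and a real desktop would query an index rather than loop over a list.

```python
from dataclasses import dataclass, field


@dataclass
class FileEntry:
    name: str
    attrs: dict = field(default_factory=dict)  # metadata key -> value


@dataclass
class VirtualFolder:
    name: str
    required: dict  # attributes a file must carry to show up in this folder

    def contents(self, files):
        """Evaluate the saved query against the current set of files."""
        return [
            f.name for f in files
            if all(f.attrs.get(key) == value for key, value in self.required.items())
        ]


files = [
    FileEntry("report.odt", {"project": "thesis", "type": "document"}),
    FileEntry("chart.png", {"project": "thesis", "type": "image"}),
    FileEntry("holiday.jpg", {"type": "image"}),
]

thesis = VirtualFolder("Thesis", {"project": "thesis"})
images = VirtualFolder("Images", {"type": "image"})
```

Note that `chart.png` appears in both `thesis.contents(files)` and `images.contents(files)`, something a strict directory tree cannot express without copies or links.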
> It’s going to be near impossible to breakdown
these sorts of changes will inevitably be multi-generational. but they need to start at some point, right? =) we shouldn’t be discouraged if we don’t get to the final destination with the first iteration. we should only be discouraged if we aren’t moving towards that destination as quickly as possible.
there are several things in kde4 that are like this, btw.
Application developers can give the notion of semantic organization a kick-start by guiding users toward contributing metadata. For example, a media player could offer to add a common tag to files in a playlist. The “Save As” dialog for any application would be an obvious place to “remind” the user to add tags (maybe this is done already).
Also, I think it might be useful to introduce generic key/value pairs to the possible metadata types. MIME types could be associated with default keys, and multiple applications could use well-known keys to communicate through metadata without requiring action from the user.
There may be some potential worries about malicious applications doing bad things with metadata, but I’m not sure there’s any way to prevent that even in the current design. Maybe applications should be allowed to create private keys that no other application can modify. This would be a convenient way for applications to save per-file state.
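The "private keys" suggestion could look roughly like the following sketch: a key/value metadata store in which an application may claim a key, after which only that application can write it. Everything here (class, method names, the `player.position` key) is invented for illustration and does not describe an existing NEPOMUK or KDE interface.

```python
class MetadataStore:
    """Per-file key/value metadata with optional application-private keys."""

    def __init__(self):
        self._values = {}  # (path, key) -> value
        self._owners = {}  # private key -> name of the owning application

    def claim_private_key(self, app, key):
        """Reserve a key for one application; a second claimant is refused."""
        owner = self._owners.setdefault(key, app)
        if owner != app:
            raise PermissionError(f"{key!r} is already owned by {owner!r}")

    def set(self, app, path, key, value):
        """Write a value; private keys may only be written by their owner."""
        owner = self._owners.get(key)
        if owner is not None and owner != app:
            raise PermissionError(f"{app!r} may not modify {key!r}")
        self._values[(path, key)] = value

    def get(self, path, key):
        """Reading stays open to everyone in this sketch."""
        return self._values.get((path, key))


def try_set(store, app, path, key, value):
    """Helper for demonstration: return True if the write was allowed."""
    try:
        store.set(app, path, key, value)
        return True
    except PermissionError:
        return False
```

A media player could claim `player.position` to save per-file playback state, while well-known shared keys such as a genre stay writable by any application. Whether reads of private keys should also be restricted is a separate policy question this sketch leaves open.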
Just some ideas. Keep up the good work!
> useful to introduce generic key/value pairs to the
> possible metadata types
from the pure metadata POV work on standardizing this is being done already (and common sense has already brought us a long ways there) and from the RDF side we have shared ontologies.
The thing is people already use folders and file names for this purpose. You have pictures/family/vacation/2001/franksallybeach22.jpg. Instead, it could be /personal/nice day.jpg tagged with family, vacation, vacation 2001, frank, sally, beach. Using folders for this purpose, you end up with extraordinarily long mazes with your information at the end, with very little flexibility in navigation. Using a meta approach is such a jump forward in every way, i really think people would jump on it, even though it is quite different.
I mean, look at flickr, facebook, gmail, digg, etc. People get tagging, they love tagging, and they use tagging. The technology is here for operating systems, and it boggles my mind that neither Apple nor MS is moving forward with this.
Hmm. I personally don’t add tags to anything at all. It’s just so much easier and faster to give files a proper filename and save them in a logical place. I’d be annoyed a bit if I had to f.ex. specify tags for a file when saving it. Oh well, I guess this is just not my thing.
Manual tagging is just one of the sources envisioned here.
Manual tagging allows you to enhance metadata and relations with information that is closer to your way of thinking, thus allowing you to take this personal information into account when searching.
But semantic information can also be derived automatically, by the software handling your data.
The article offers the example of relating a file to the email it was saved from. Such information can be added by the email program automatically on save.
Together with the also automatically derivable information about the email’s sender (e.g. by indexing the emails), you can then search for the file by just knowing who sent it.
Proper integration into applications will ensure that (less) information is lost during operations.
Manual tagging can be used to add information not available to any of the involved applications, especially subjective values like “uhh, pretty!”
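The "find the file by knowing who sent it" case is a two-step walk over relations: sender to email, then email to saved file. A minimal sketch with plain tuples follows; the predicate names (`savedFrom`, `sender`, `tag`) and the identifiers are placeholders, not NEPOMUK ontology terms.

```python
# (subject, predicate, object) relations, as an email client and indexer
# might record them automatically; the last one is a manual tag.
triples = {
    ("file:report.pdf", "savedFrom", "email:42"),
    ("email:42", "sender", "alice@example.org"),
    ("file:notes.txt", "savedFrom", "email:43"),
    ("email:43", "sender", "bob@example.org"),
    ("file:report.pdf", "tag", "uhh, pretty!"),
}


def files_sent_by(sender, triples):
    """Follow sender -> email -> saved file and return the matching files."""
    emails = {s for (s, p, o) in triples if p == "sender" and o == sender}
    return sorted(s for (s, p, o) in triples if p == "savedFrom" and o in emails)
```

Neither relation alone answers the question; it is the combination of the automatically recorded `savedFrom` link and the indexed `sender` that makes `files_sent_by("alice@example.org", triples)` possible.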
I just started thinking about something... If you f.ex. save a file from an email, and the saved file automatically acquires metadata such as the sender’s email address, couldn’t this be considered potentially a threat if there are several users on the same computer? If the file’s metadata is readable by other users and they have read-access to even some of your files, they could learn email addresses of the people you stay in contact with etc. In that case it’d help if the metadata was accessible only by the owner of the file, but what if sooner or later f.ex. system files are populated with metadata? The only way I can think of how to fix that would be to have two kinds of metadata: private and public.
metadata currently is private to the user. sharing/merging metadata will be a by permission thing only, and we’re still a couple years away from that anyways.
Aaron already answered this, but just a bit more detailed: the acquired relation data is stored in a relation database separate from the file. Thus the file itself remains “clean”.
For some kind of data it might make sense to also store it in the file or in extended file system attributes. This is currently not implemented AFAIK and as Aaron pointed out, would be subject to policies.
> For some kind of data it might make sense to also store it in the file or in extended file system attributes. This is currently not implemented AFAIK and as Aaron pointed out, would be subject to policies.
IMHO all such metadata information should be stored per-file, ie. in extended attributes cos then it could be shared by all users. And I don’t really like the idea of a single database with all the metadata cos it could get corrupted, or if it got deleted one way or another you’d lose ALL metadata. But to have public and private metadata saved per-file would most likely require modifications to existing filesystems, or a completely new one. Hopefully this will happen, I’m sure a lot of people would find such a thing useful even if I can’t imagine myself gaining much from that.
I agree. The problem this addresses is easily seen on current systems. On XP, if I save information such as keywords, subject, category for some file formats, the information is lost when I copy the file to my keydrive. PDF docs are one common example. If I dl a white paper at work and wish to transfer it home, I can count on reentering the info at a later date. With jpegs, that information can be written to IPTC and MP3s can store the info in ID3 tags.
The bonus to having an index is that it speeds local searching and that attributes can be entered on a per-user basis. It supports things like shared music off a file server. I can rate a song 5 stars and my wife rate it 3 stars. If it were per file, she would overwrite my rating.
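The rating example can be sketched as an index keyed by user and file, so that two people rating the same shared song never overwrite each other. The class name, user names and path below are hypothetical.

```python
class RatingIndex:
    """Ratings stored per (user, file) pair instead of inside the file."""

    def __init__(self):
        self._ratings = {}  # (user, path) -> stars

    def rate(self, user, path, stars):
        if not 1 <= stars <= 5:
            raise ValueError("stars must be between 1 and 5")
        self._ratings[(user, path)] = stars

    def rating(self, user, path):
        """Return this user's rating, or None if they have not rated the file."""
        return self._ratings.get((user, path))


SONG = "/srv/music/song.ogg"  # hypothetical file on a shared server

index = RatingIndex()
index.rate("me", SONG, 5)
index.rate("wife", SONG, 3)
```

Had the rating been a single field stored in the file itself, the second `rate` call would have replaced the first; here both values coexist.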
The problem isn’t the file system per se. Most any file system can have filters added for extended attributes. The problem is in file formats. Some support them, some don’t and some are implemented poorly (PDF).
> The problem isn’t the file system per se. Most any file system can have filters added for extended attributes. The problem is in file formats. Some support them, some don’t and some are implemented poorly (PDF).
Umm, it’s not the files themselves which should have support for extended attributes..That would require modifying _every_ single filetype that’s supposed to have any metadata at all. Not gonna happen. It’s the filesystem which should handle metadata, that way the filetype wouldn’t matter at all as metadata would be possible on any and every file.
> Umm, it’s not the files themselves which should have support for extended attributes..That would require modifying _every_ single filetype that’s supposed to have any metadata at all. Not gonna happen. It’s the filesystem which should handle metadata, that way the filetype wouldn’t matter at all as metadata would be possible on any and every file.
What you say is perfectly correct, and file system level metadata would be good.
But the point of the semantic desktop is to use concepts from the semantic web, such as RDF as a common metadata format, held in a local triple store with a SPARQL endpoint that can be queried. Where the metadata is held before it is extracted isn’t important, because adaptor code can be written to extract it from all the existing file formats.
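A rough sketch of the adaptor idea: one small function per format, each turning raw content into the same kind of (subject, predicate, object) triples. This uses plain Python tuples rather than real RDF or SPARQL, and the predicate names and the adaptor registry are assumptions made for the example; only the email parsing comes from Python's standard library.

```python
from email import message_from_string


def email_adaptor(uri, raw):
    """Extract sender and subject headers from an RFC 822 style message."""
    msg = message_from_string(raw)
    return [(uri, "sender", msg["From"]), (uri, "subject", msg["Subject"])]


def text_adaptor(uri, raw):
    """Plain text carries little metadata; record a word count."""
    return [(uri, "wordCount", len(raw.split()))]


# Hypothetical registry mapping MIME types to adaptor functions.
ADAPTORS = {
    "message/rfc822": email_adaptor,
    "text/plain": text_adaptor,
}


def extract(uri, mime_type, raw):
    """Dispatch to the adaptor for this format; unknown formats yield nothing."""
    adaptor = ADAPTORS.get(mime_type)
    return adaptor(uri, raw) if adaptor else []


RAW_MAIL = "From: alice@example.org\nSubject: White paper\n\nSee attachment.\n"
```

Because every adaptor emits the same triple shape, the store that receives them does not need to know where the metadata originally lived, which is the argument the comment makes.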
The semantic desktop builds on concepts from the semantic web, and this paper (which I’m still trying to understand), describes how they differ:
http://protege.stanford.edu/conference/2007/presentations/12.04_Cai…
The NEPOMUK Representational Language will allow you to describe the links between multiple RDF vocabularies that you might have in your metadata. And so you would be able to combine, say FOAF data about people, with VCARD data about business contact addresses for specific useful purposes.
Well, speaking as someone who’s lost my ratings and scores on my files in Amarok several times now due to reinstalls, etc, I worry about relying on metadata cached elsewhere. I mean, in the long run, I still know which songs in my collection I like and don’t rely too heavily on Amarok’s rating system… but supposing we were, this metadata would have to be stored somewhere.
I think what we eventually need is a filesystem wherein each file consists of a file and a hidden folder (or maybe just a data block, with a maximum size, to prevent abuse) that can contain a flexible amount and type of metadata specific to the file. I think that’s what HFS+ and Reiser4 do, but that’s not how Linux is set up to work. Yet, anyway.
Somewhere in all of this, of course, what metadata is connected with each file will have to be automated, but by then we’ll need enough tags and categories that files can be properly categorized. My own music collection, again, is a mess. Genres don’t always line up, though occasionally I go through them and try to fix the capitalization or remove a confusing type… Is it pop or alternative? Soundtrack, or Classical?
Edited 2007-07-16 16:28
The only issue I could see in terms of a filesystem based meta system would be that it would make any such feature filesystem dependent and KDE prides itself in being system independent which is why you see abstractions of technology such as HAL. Otherwise this would seem a more robust way to go. There are pros and cons to having a less dictated software stack.
Uh? Isn’t it stored at “/home/your_username/.amarok”?
And besides, if you put your “/home” on a separate partition/disk you don’t lose any configuration after reinstalls.
One time, I completely wiped my computer so I could make a larger /home partition; another time it was my KDE configuration that was causing the problem, and I ended up wiping out everything EXCEPT $HOME/.kde/apps/amarok (or wherever) but somehow it still seems to have reset and removed things.
“IMHO all such metadata information should be stored per-file.”
Bingo! What we need is for filesystems to unilaterally adopt meta-attributes. We have that with EXT2/3 using xattr.
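For reference, this is roughly what per-file tags in extended attributes look like from Python on Linux, using the standard `os.setxattr` and `os.getxattr` calls. The attribute name `user.tags` and the comma-separated encoding are assumptions for this sketch, not an established convention, and the calls fail on filesystems mounted without user xattr support, so the helpers report failure instead of raising.

```python
import os

ATTR = "user.tags"  # hypothetical attribute name in the "user." namespace


def write_tags(path, tags):
    """Try to store tags in an extended attribute; return True on success."""
    if not hasattr(os, "setxattr"):  # these calls are Linux-only in Python
        return False
    try:
        os.setxattr(path, ATTR, ",".join(tags).encode("utf-8"))
        return True
    except OSError:  # e.g. the filesystem does not support user xattrs
        return False


def read_tags(path):
    """Return the stored tags, or None if they are missing or unsupported."""
    if not hasattr(os, "getxattr"):
        return None
    try:
        raw = os.getxattr(path, ATTR)
    except OSError:
        return None
    return raw.decode("utf-8").split(",") if raw else []
```

Tags written this way stay with the file when it is renamed or moved within the same filesystem, but they can be lost when the file is copied by tools or to filesystems that do not preserve extended attributes.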
If a system wants to keep an image of those attributes in a database for fast searches, that’s one thing. But if the metadata exists only in the database it will just be a mess.
Take GNOME’s emblems. It’s great that I can put a little image on a file to help with organization, but if I move that file to another folder manually, I lose the emblem, because the emblem is not tied directly to the file. So it sucks.
If fundamentals like this aren’t addressed, I have a feeling this NEPOMUK stuff will prove to be all hype.
“Then MS starts talking about all the cool new things they will do with it, and I thought ‘Man, what a slap in the face for apple, to be beaten to the punch for once by MS’”
Actually, MS started talking about integrated search long before spotlight. This was one of the rationales behind the failed “WinFS” DB driven filesystem. Apple came out with spotlight to counter that, and they succeeded. MS couldn’t make it work, and had to drop it from Vista.
I wasn’t talking about search, I was talking about using relationships between data and metadata for new forms of organization over the traditional file/folder thing we have been using for decades.
“I wasn’t talking about search, I was talking about using relationships between data and metadata for new forms of organization over the traditional file/folder thing we have been using for decades.”
What do you think is the basis of spotlight and WinFS? Rich metadata on file and directory structures to enable these “new forms of organization”. The point of WinFS was to base the filesystem on a SQL store, to allow those relationships to be defined.
I just wrote the NEPOMUK article on Wikipedia a few hours ago.
http://en.wikipedia.org/wiki/NEPOMUK_%28framework%29
http://en.wikipedia.org/wiki/NEPOMUK-KDE
Feel free to improve.
Since I use GNOME, does anybody have links that show that GNOME is discussing the use of nepomuk?
Also, what does this mean for beagle and tracker?
The Nepomuk group is involved in the freedesktop.org discussion to create a standard for metadata databases which would allow them to be interchangeable and would allow any compliant front-end to be used to search any database. this is the general direction most projects are looking to move in, they just need to sort out the framework.
> to create a standard for metadata databases
you are referring to xesam? if so, that has very little impact on the ability for apps to tap into the sort of framework nepomuk offers.
it’s a step in the right direction as it starts to bring some commonalities into the search space, but it’s the first step of many, many more.
of course … what could end up happening with xesam, is xesam is adopted and the strigi+nepomuk combo gets chosen by more and more users and so full text searches return information stored by nepomuk enabled apps.
however, for proper integration to occur in non-kde apps they need to be instrumented to be able to add to the metadata and traverse the rdf stores… there’s a lot of work there, but i hope it starts to happen.
> does anybody have links that show that GNOME is discussing the use of nepomuk?
I don’t know how this is related to official Gnome and Beagle, but you already have Beagle++ http://beagle.kbs.uni-hannover.de/ an extension to Beagle using NEPOMUK.
Edited 2007-07-14 23:17
The demonstration is here:
http://www.youtube.com/watch?v=Ui4GDkcR7-U
Beagle++: “