The number and variety of computing devices in the environment are increasing rapidly. Real computers are no longer tethered to desktops or locked in server rooms. PDAs, highly mobile tablet and laptop devices, palmtop computers, and mobile telephony handsets now offer powerful platforms for the delivery of new applications and services. These devices are, however, only the tip of the iceberg. Hidden from sight are the many computing and network elements required to support the infrastructure that makes ubiquitous computing possible.
“While data management has become almost synonymous with RDBMS, however, there are an increasing number of applications for which lighter-weight alternatives are more appropriate.”
What, exactly, is lighter weight than RDBMS? What about RDBMS is heavy-weight? Is it merely that lack of understanding that “weighs” heavy on the mind?
Yes, there are other ways to store data. RDBMSs are the only game in town, however, when you actually want to use a lot of data again and again, at any time of day, while storing it all on inexpensive media.
Just because you may want to do less, or do it in memory, or do it on the network stack, doesn’t mean you are doing anything that an RDBMS couldn’t do faster and cheaper with fewer resources. The RDBMS will eat your lunch as soon as you want to do the same thing twice with slightly different parameters!
Well, maybe the title of the article is right, but the rest of it is inconsequential and misleading fluff. Errors committed:
1. Mentioning the history of relational databases without once mentioning E.F. Codd, who started the whole thing, and not even once mentioning that there is a solid mathematical model involved, not just a conceptual framework.
2. Assuming that SQL==relational, while it is at best only a partial implementation of the relational model and violates it in many ways. At worst, it “unsolves” quite a few of the problems the relational model was meant to solve.
3. Stonebraker’s comment that “one size no longer fits all” is very misleading. According to the core concepts of the relational model, an RDBMS can be built in many different ways while still adhering to the relational model (not the SQL model). There is no reason a relational DBMS could not, for instance, support complex or “composite” datatypes, custom operators, or “rich” datatypes such as media files. Also, there is no reason it can’t support hierarchical data at least as well as a hierarchical DBMS does. The relational model is not what limits us here, but the narrow-minded view of what an RDBMS consists of.
In fact, any claim that RDBMSs are somehow obsolete can be answered by the one quote the article gets right: “Relational databases solved this problem in two ways. First, they hid the physical organization of the database from the application and provided only a logical view of the data. Second, they used a declarative language to describe the data of interest in a particular query, rather than forcing the programmer to write a collection of function calls to fetch the data.” With this in mind, how does this look at all like a “one size fits all” solution?
>What, exactly, is lighter weight than RDBMS? What about RDBMS is heavy-weight? Is it merely that lack of understanding that “weighs” heavy on the mind?
I have been thinking this for a long time. People infer relational -> SQL -> big ass DBMS server.
The problem is that many so-called “lightweight databases” are in fact dumb associative string tables (i.e. completely useless).
SQL is terrible. It is completely unprogrammable. To do anything serious you have to generate lots of text strings – just for them to be immediately parsed! And as a bonus it introduces a whole class of security problems!
What is needed in many applications where popular SQL DBMSs are overkill and the application typically invents its own file format is a small and fast non-SQL relational database library. You could trivially build a daemon around it if so desired. We have already solved the data storage/retrieval problem. Someone just needs to implement it properly.
I say this because many times I have needed such a thing. SQLite is the nearest thing I have come across so far, but the SQL part cripples its usefulness.
@Luke McCarthy
If you use PostgreSQL and plperlu (untrusted Perl as procedural language) you get access to every Perl module on the machine. Very powerful and programmable.
For years I worked with a database technology called Multi-value. I learned this database technology through the DOS Advanced Revelation DBMS engine.
To this day, I know of companies using applications developed with this system. Yes, there are still lots of these DOS systems being used today. They are slowly being moved to RDBMS systems that are SQL compliant (or should I say follow the normalization of data model). Or they are being moved to its Windows cousin, OpenInsight.
There is nothing I have found that can match the flexibility of these systems to store data that matches the way end users think. Having historical data in the same table as the core data is simply awesome in my experience.
Here the programming language was specific to the DBMS and data manipulation was simple when used with a live data dictionary. Users could create their own fields in seconds and not have to worry about bringing down the database.
Giving the power to the end users is what I feel is lost in the large RDBMS systems.
Just my two cents based on my years of using both of these technologies.
People so often make the mistake that the term “relational” has to do with the particular storage mechanism you use. Relational implementations can use *any* physical storage method desired by the developer, as long as they provide set-oriented access to the data, constraints, views-as-relations, etc… Here are a couple examples of non-SQL systems that are actually more relational than your typical SQL system:
http://duro.sourceforge.net
http://dbappbuilder.sourceforge.net/Rel.html
Both of these projects use Berkeley DB as their physical storage mechanism, but provide a relational system of access and constraints on top of that. Imagine: using a non-relational DBMS as the basis for a relational interface. Codd would have seen nothing wrong with that. Especially since Berkeley DB has transactions and a very robust storage engine. The whole point of the relational DBMS is the logical methods it provides the user.
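To make the point about storage independence concrete, here is a minimal sketch of set-oriented relational operators layered over a plain key-value store. It is only an illustration: it is not how Duro or Rel are implemented, a Python dict stands in for a real engine such as Berkeley DB, and the names `store`, `relation`, `restrict`, `project` and `natural_join` as well as the sample data are all made up for this example.

```python
# A plain dict stands in for a key-value storage engine (assumption:
# keys are strings with a "table:" prefix, values are records).
store = {
    "emp:1": {"emp_id": 1, "name": "Ann", "dept_id": 10},
    "emp:2": {"emp_id": 2, "name": "Bob", "dept_id": 20},
    "emp:3": {"emp_id": 3, "name": "Cy", "dept_id": 10},
    "dept:10": {"dept_id": 10, "dept_name": "Research"},
    "dept:20": {"dept_id": 20, "dept_name": "Sales"},
}


def relation(store, prefix):
    """Read all records under a key prefix and present them as a relation."""
    return [row for key, row in store.items() if key.startswith(prefix)]


def restrict(rel, predicate):
    """Keep only the tuples for which the predicate holds."""
    return [row for row in rel if predicate(row)]


def project(rel, attrs):
    """Keep only the named attributes and drop duplicate tuples."""
    result = []
    for row in rel:
        tup = {a: row[a] for a in attrs}
        if tup not in result:
            result.append(tup)
    return result


def natural_join(left, right):
    """Combine tuples that agree on all attributes the two relations share."""
    result = []
    for l in left:
        for r in right:
            common = l.keys() & r.keys()
            if all(l[a] == r[a] for a in common):
                result.append({**l, **r})
    return result


# "Names of everyone in Research", expressed without any SQL text.
emps = relation(store, "emp:")
depts = relation(store, "dept:")
result = project(
    restrict(natural_join(emps, depts), lambda t: t["dept_name"] == "Research"),
    ["name"],
)
print(result)
```

The caller never sees the key layout. Swapping the dict for a different storage engine would only change `relation`, which is the separation of logical and physical levels described above.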
So far I used
* RDBMS
* OODBMS (to be honest, so far only evaluation purposes)
* LDAP
* XML
as persistence concepts.
RDBMSs, as their name states, are based on relations. OODBMSs hold object references. Since relations are basically just a special form of references, these two concepts are more or less equivalent on that level.
LDAP as well as XML are conceptually based on hierarchies. XML with xlink:href also offers references, but the primary structure of an XML document is strictly hierarchical.
Since hierarchies are just a special (tree) topology of relations, hierarchies are trivial to implement with relations. The reverse is not trivial at all. Thus the concept of relational modelling is more powerful than that of hierarchical modelling.
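As a rough illustration of a hierarchy expressed as a relation, the sketch below stores a tree as one table in which each row points at its parent, and walks a subtree with a recursive query. It assumes SQLite through Python's standard `sqlite3` module; the `node` table, the `subtree` helper and the sample entries are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# One relation is enough: every node references its parent (NULL for the root).
conn.execute(
    "CREATE TABLE node ("
    " id INTEGER PRIMARY KEY,"
    " parent_id INTEGER REFERENCES node(id),"
    " name TEXT NOT NULL)"
)
conn.executemany(
    "INSERT INTO node VALUES (?, ?, ?)",
    [
        (1, None, "root"),
        (2, 1, "people"),
        (3, 1, "groups"),
        (4, 2, "alice"),
    ],
)


def subtree(conn, root_id):
    """Return (name, depth) for the given node and all of its descendants."""
    return conn.execute(
        """
        WITH RECURSIVE sub(id, name, depth) AS (
            SELECT id, name, 0 FROM node WHERE id = ?
            UNION ALL
            SELECT n.id, n.name, sub.depth + 1
            FROM node AS n JOIN sub ON n.parent_id = sub.id
        )
        SELECT name, depth FROM sub ORDER BY depth, name
        """,
        (root_id,),
    ).fetchall()


print(subtree(conn, 1))
print(subtree(conn, 2))
```

The same table could also hold cross-references between arbitrary nodes through a second relation, which is the part a strictly hierarchical store does not give you for free.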
Besides this, relations seem to be the way humans intuitively associate items and things. I have yet to meet a colleague or customer who is not able to understand relational models. They are damned easy to grasp.
OODBMS concepts are, in my experience, significantly more difficult to understand. For developers that is no problem. But I also have to talk with customers. That is why I personally rate the relational model as superior.
Many, if not most, LDAP and XML repositories are based on an RDB. I myself built an XML repository featuring an XPath subset (good enough for our needs) as its query language, which we first integrated into a system delivered 3 or 4 years ago. I also used a relational database model as the base.
<<< —
I originally intended to make this XML repository GPLed OSS, and I have verbal permission to do so. But the documentation is not yet written down; it still resides only somewhere in my brain. I hope I will find the time to write it down soon. I have some ideas for further optimizing query performance, and a feature update is needed. Maybe I will find the time to write down essential documentation when I implement this (earliest possible starting date, I suppose: Q4/2005).
— >>>
In one project, since the customer explicitly demanded it, we used LDAP to store user profiles. A colleague of mine and I warned the customer several times against doing so. Well, last year we modified the whole user management module to use an RDB. ~8000 locations (schools) combined with the need for localized roles (generic_role@location) ended up in ~200000 “LDAP roles”. That was simply impossible to handle. With the RDB we have some two dozen generic roles and the ca. 8000 locations (schools). A relation specifies which roles (permissions) a specific user (administrator, secretary or teacher) has in a specific location (school; note that secretaries usually work for several schools, and an administrator might administer all schools in one township or even a collection of townships). Damn easy to handle.
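The arithmetic behind this is simple: two dozen roles times 8000 locations is 24 × 8000 = 192,000 combined role names, in line with the ~200000 figure, whereas the relational layout needs only 24 + 8000 rows of reference data plus one row per actual assignment. The following is a hypothetical sketch of such a layout, not the schema of the real system; it assumes SQLite via Python's `sqlite3`, and the table names, helper functions and sample users are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript(
    """
    CREATE TABLE role (name TEXT PRIMARY KEY);
    CREATE TABLE location (id INTEGER PRIMARY KEY, name TEXT NOT NULL);

    -- The relation: which user holds which generic role at which location.
    CREATE TABLE assignment (
        user_name   TEXT    NOT NULL,
        role_name   TEXT    NOT NULL REFERENCES role(name),
        location_id INTEGER NOT NULL REFERENCES location(id),
        PRIMARY KEY (user_name, role_name, location_id)
    );

    INSERT INTO role VALUES ('administrator'), ('secretary'), ('teacher');
    INSERT INTO location VALUES (1, 'School A'), (2, 'School B');
    INSERT INTO assignment VALUES
        ('sam', 'secretary', 1),
        ('sam', 'secretary', 2),
        ('tina', 'teacher', 1);
    """
)


def has_role(conn, user, role, location_id):
    """True if the user holds the generic role at the given location."""
    row = conn.execute(
        "SELECT 1 FROM assignment"
        " WHERE user_name = ? AND role_name = ? AND location_id = ?",
        (user, role, location_id),
    ).fetchone()
    return row is not None


def locations_for(conn, user, role):
    """Names of all locations where the user holds the given role."""
    rows = conn.execute(
        "SELECT l.name FROM assignment AS a"
        " JOIN location AS l ON l.id = a.location_id"
        " WHERE a.user_name = ? AND a.role_name = ?"
        " ORDER BY l.name",
        (user, role),
    ).fetchall()
    return [r[0] for r in rows]


print(has_role(conn, "sam", "secretary", 2))
print(locations_for(conn, "sam", "secretary"))
```

A secretary working at several schools is just several rows in `assignment`; no new role name has to be created per school.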
Carsten
I still have a reference summary for IBM DB2 V2R3 in my cubicle, since it basically describes SQL92 🙂
This handy booklet, a copy of the 3rd edition, was printed in March 1992. In those days, if I remember correctly (please correct me if I am wrong), the 308x and 3090 were top-notch machines. Each of these machines was used to concurrently run the business applications and the “middleware”, mainly CICS and DB2. Each machine served dozens of users concurrently. Today a mediocre PDA outperforms these machines in terms of raw compute power and main memory.
I really do not grasp why there should be a need for “lighter-weight alternatives”. What for?
Carsten
I have seldom agreed more.
Carsten
>SQL is terrible. It is completely unprogrammable.
>To do anything serious you have to generate lots of text strings – just for them to be immediately parsed!
>And as a bonus it introduces whole class of security problems!
No’ so fast.
The two things you bought were
a) human legibility, and
b) some chance of platform independence.
Coincidentally, these are XML’s touted ‘wins’.
If parsing represents too much overhead, most systems offer parameterization, which also reduces your security concern significantly.
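For readers who have not seen parameterization, here is a small sketch of the difference, assuming SQLite through Python's standard `sqlite3` module. The `users` table, the `find_role` helper and the hostile input string are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, role TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [("alice", "admin"), ("bob", "teacher")],
)

hostile = "alice' OR '1'='1"

# Building the statement by string concatenation: the input becomes part of
# the SQL text, so the OR clause is parsed and every row matches.
unsafe_sql = "SELECT role FROM users WHERE name = '" + hostile + "'"
unsafe_result = conn.execute(unsafe_sql).fetchall()


def find_role(conn, name):
    """Look up roles with a placeholder; the value is bound, not parsed as SQL."""
    rows = conn.execute(
        "SELECT role FROM users WHERE name = ?", (name,)
    ).fetchall()
    return [r[0] for r in rows]


safe_result = find_role(conn, hostile)    # nobody has that literal name
normal_result = find_role(conn, "alice")

print(len(unsafe_result), safe_result, normal_result)
```

The statement text stays constant across calls, which is also what lets a DBMS reuse the parsed statement instead of parsing a fresh string each time.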
Now, if you want more than that, you can always use SQLite or PostgreSQL, study the VM source code, and write to it directly, though you’d just be re-inventing a wheel, without much hope of improved roundness.
VAX/VMS provided excellent indexed files, with a primary index and availability of one or more secondary indexes. These files could be manipulated with Datatrieve, a command language confusingly similar to SQL; or with VAX Fortran, VAX Basic, or other languages. Datatrieve identified fields by name, and old procedures would still run if you added more fields to the record. Record locking was there, but not transactions as far as I recall.
A 21st Century rework of these ideas, based on SQL, might be useful.
For those talking about the inability to add additional fields, etc… think about it this way:
1. adding any fields or even tables is simply a matter of user permissions.
2. the whole point of a relational DBMS is that you can add related data *without* interfering with your main structure (and more importantly, without violating any constraints). That’s what foreign key relationships and views are for. In any serious DBMS it is trivial to allow users their own schema where they can add related data, fields, tables, whatever, which will tie in directly to the main data, but still allow them their own customized additional data. The fact that most DBAs don’t allow for this is more a matter of fear, tradition and “religion” than anything else.
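A minimal sketch of point 2, under some loud assumptions: it uses SQLite via Python's `sqlite3`, which has no per-user schemas, so a separate table with an `ext_` prefix stands in for "the user's own schema". The `customer` table, the extension table, the view and the sample rows are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript(
    """
    -- Main structure, owned by the DBA and left untouched below.
    CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    INSERT INTO customer VALUES (1, 'Acme'), (2, 'Globex');

    -- User-added data: a new "field" lives in its own table, tied to the
    -- main data by a foreign key.
    CREATE TABLE ext_customer_region (
        customer_id INTEGER PRIMARY KEY REFERENCES customer(id),
        region      TEXT
    );
    INSERT INTO ext_customer_region VALUES (1, 'north');

    -- A view presents main data and the user's addition as one relation.
    CREATE VIEW customer_with_region AS
    SELECT c.id, c.name, e.region
    FROM customer AS c
    LEFT JOIN ext_customer_region AS e ON e.customer_id = c.id;
    """
)

rows = conn.execute(
    "SELECT id, name, region FROM customer_with_region ORDER BY id"
).fetchall()

# The foreign key keeps user data consistent with the main table:
# a region for a customer that does not exist is rejected.
try:
    conn.execute("INSERT INTO ext_customer_region VALUES (99, 'south')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

print(rows, rejected)
```

In a DBMS with real schemas and permissions, the extension table and view would live in the user's own schema, with only read access granted on `customer`.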
an OS that is so insecure that on a numeric scale it would be negative 30 or more; then to say after SP2 it is 15 times safer is not much reassurance.
but we are talking M$ products