PostgreSQL 8.1 has been released. You can download it from here, and the source code from here. The release notes are here. New features include:
– Improve concurrent access to the shared buffer cache.
– Allow index scans to use an intermediate in-memory bitmap.
– Add two-phase commit.
– Create a new role system that replaces users and groups.
– Automatically use indexes for MIN() and MAX().
– Move /contrib/pg_autovacuum into the main server.
– Add shared row level locks using SELECT…FOR SHARE.
– Add dependencies on shared objects, specifically roles.
– Improve performance for partitioned tables.
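A quick sketch of two of the new features, using a hypothetical table (the table and index names are illustrative, not taken from the release notes):

```sql
-- Hypothetical table for illustration:
CREATE TABLE orders (id serial PRIMARY KEY, placed_at timestamp);
CREATE INDEX orders_placed_at_idx ON orders (placed_at);

-- In 8.1, MIN() and MAX() can walk the b-tree index
-- instead of scanning the whole table:
SELECT MIN(placed_at), MAX(placed_at) FROM orders;

-- Shared row-level locks: several transactions may hold FOR SHARE
-- on the same row at once; a writer must wait until they finish.
BEGIN;
SELECT * FROM orders WHERE id = 1 FOR SHARE;
-- work here while the row is protected from concurrent UPDATE/DELETE
COMMIT;
```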
As per usual, the best open source database system in the world is even better! Keep up the great work!
The best open source database? Arguably?
Let’s hear what MySQL has to say:
In http://dev.mysql.com/downloads/mysql/5.0.html
> MySQL Community Edition has not been certified and is
> not considered ready for enterprise production use.
Well, they have said all they had to …
I used to use MySQL… until I took an SQL class.
Then I found out what SQL could really do.
I did a small comparison of Postgresql 8.1 vs MySQL 5.
I created a table:
CREATE TABLE filedata (
    path character varying(1024) NOT NULL,
    md5 character varying(80) NOT NULL,
    size integer
);
On Postgresql I also added a primary key constraint. MySQL didn’t allow me to do that, as it told me that the key was too big:
ALTER TABLE ONLY filedata
ADD CONSTRAINT filedata_pkey PRIMARY KEY (path);
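The likely cause (an assumption, since the exact error message isn't quoted) is MySQL's index key length limit, roughly 1000 bytes for MyISAM and 767 bytes per column for InnoDB at the time, which a VARCHAR(1024) key exceeds. A common workaround is a prefix index, though indexing only the first N characters cannot enforce uniqueness of the full path the way a real primary key would:

```sql
-- MySQL workaround sketch: index only the first 255 characters of path.
-- This helps lookups but is NOT equivalent to PRIMARY KEY (path):
ALTER TABLE filedata ADD INDEX filedata_path_idx (path(255));
```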
I entered 586137 records, then I performed the following query:
select b.path, b.size, a.md5
from (select md5, count(path)
      from filedata
      group by md5
      having count(path) > 1) a,
     filedata b
where a.md5 = b.md5
order by a.md5, size;
Postgresql returned with an answer in about 20 minutes. MySQL is still running, and has been running for over six hours by now. Running two queries simultaneously on Postgresql didn’t change the response time significantly.
I used an old IBM Thinkpad 600 laptop, with a 500MHz Pentium III, for Postgresql. For MySQL I used a 1 GHz Pentium IV and 1 GB RAM. Both machines had the same services and programs running, apart from the difference in the choice of database.
Both databases used one disk for all their needs, i.e. no separate disk for logging; the disk also contained the OS. No special configuration was made to either of them. The OS used was Fedora Core 4.
This really doesn’t look too good for MySQL. I’m sure that there is some way to tune MySQL for better performance, but most will compare them on how they perform out of the box.
For one thing, you could replace the default InnoDB tables of MySQL with the old engine used in older versions of MySQL, but that would also mean saying goodbye to many features that most people take for granted in an RDBMS.
On the positive side, the count(*) function seems to be much faster in MySQL. Counting all the tuples in the table returned almost instantly in MySQL, while Postgresql needed about 2 seconds.
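A plausible explanation (an assumption, not something measured here): MySQL's MyISAM engine keeps an exact row count in its table metadata, so an unqualified count(*) is a constant-time lookup, while Postgresql's MVCC design forces a scan to determine which tuple versions are visible:

```sql
-- Near-instant on MyISAM (answered from table metadata);
-- a sequential scan under Postgresql's MVCC:
SELECT count(*) FROM filedata;

-- With a WHERE clause, MyISAM loses the shortcut and both
-- engines must actually examine rows:
SELECT count(*) FROM filedata WHERE size > 1048576;
```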
What if you hadn’t added a primary key to Postgres? How would the two compare? (Maybe reduce the number of rows, as it will probably take hours.)
Nevertheless, the benchmark is pretty useless as it’s not a “real-world” use case.
> Nevertheless, the benchmark is pretty useless as it’s not a “real-world” use case.
What makes you think that? The code was actually written for the real-world purpose of serving as the backend for an application identifying duplicate files in a filesystem.
It is not more complicated or strange than what you see in many business applications.
🙂 Nice!
PostgreSQL becomes more and more user-friendly with each release. More and more advanced features, but also a focus on new users’ needs. Integrated autovacuum helps a lot for new users and admins.
We have been running a Postgres DB for over 4 years now; it keeps getting better with each version. We have replaced all MS SQL Server/Windows installations with Postgresql/Linux and are very pleased with the change.
I recently converted from MySQL to PostgreSQL. I’m still trying to get the hang of PostgreSQL (administration and usage).
I switched because I wanted to use more advanced features like foreign keys and triggers. I have yet to build anything because, like I said, I’m still getting the hang of PostgreSQL. So far, so good though.
There are a few things I miss about MySQL, but those things are only cosmetic and really not of great value (i.e. being able to add a column wherever I want it within a table – that’s totally cosmetic).
One of the things I’d like to do is dump our current Win2K/MSSQL setup and replace it with a BSD/PostgreSQL setup for the ecommerce site. Currently, we have managed servers hosted at a colo facility. I’d like to get BSD/PostgreSQL hosting, but I haven’t found any good companies yet in the U.S. that provide that type of service. Any help on that would be appreciated (note: I would prefer it ran on BSD).
It’s sad when MySQL users consider foreign keys an advanced feature.
> being able add a column where ever I want it within a table – that’s totally cosmetic.
Mmm… in the worst case, you can make a pg_dump of the whole database, change whatever you want (it’s a text file), and then restore.
“There are a few things I miss about MySQL, but those things are only cosmetic and really not of great value (i.e. being able to add a column wherever I want it within a table – that’s totally cosmetic).”
I think you can use ALTER TABLE for that:
http://www.gnegg.ch/archives/alter_table_in_postgresql_80.html
>> “There are a few things I miss about MySQL, but those
>> things are only cosmetic and really not of great
>> value (i.e. being able to add a column wherever I want
>> it within a table – that’s totally cosmetic).”
> I think you can use ALTER TABLE for that:
He says he wants to add the column wherever he wants within a table, that is to say, he wants to have the new column (for example) between the “id” column and the “name” column.
That is why I said that (in the worst of cases) he could dump the database, edit the text file as he wanted, and restore.
Hi, an alternative to dumping the database and editing text files:
In the database:
– copy the original table
– drop the original table
– re-create it with the desired structure
– copy the temp table to the new
– drop the temp table.
All within the database, all in SQL, all operations that a DB should be able to do quickly.
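The steps above might look like this in SQL, using the filedata table from the test above as an example (note that any indexes, constraints and other dependencies on the original table have to be recreated by hand afterwards):

```sql
BEGIN;
-- copy the original table to a temporary name
CREATE TABLE filedata_tmp AS SELECT * FROM filedata;
-- drop the original table
DROP TABLE filedata;
-- re-create it with the desired column order
CREATE TABLE filedata (
    md5 character varying(80) NOT NULL,
    path character varying(1024) NOT NULL,
    size integer
);
-- copy the temp table into the new one
INSERT INTO filedata (md5, path, size)
    SELECT md5, path, size FROM filedata_tmp;
-- drop the temp table
DROP TABLE filedata_tmp;
COMMIT;
```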
HTH
“He says he wants to add the column wherever he wants within a table, that is to say, he wants to have the new column (for example) between the “id” column and the “name” column.”
That’s exactly what I meant. Like I said, that’s just cosmetic. It’s database “eye-candy”.
In reality, a person should select their fields in the order they want and not assume that the database will return the fields the way the table was created (i.e. use “select field1, field2 from table” rather than “select * from table” and expect to get field1 field2).
You can use this to reorder fields (the transaction is used for a reason):
BEGIN;
SELECT f1, f2, f3, … INTO TABLE tmp_tbl FROM orig_tbl;
DROP TABLE orig_tbl;
ALTER TABLE tmp_tbl RENAME TO orig_tbl;
COMMIT;
EMS PostgreSQL Manager can do this even better – there is function ‘Reorder fields’ which will recreate all indexes and other dependencies as well.
I’ve performed some informal load testing against the 8.1beta. A few differences were significant:
* Index selection is much improved: given similar indexes, it assigns a far more accurate “cost”.
* COPY FROM is much faster.
* Bitmap index scans are much faster in many plans.
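As an illustration of the bitmap machinery (assuming indexes on md5 and size exist; the plan shown is a typical shape, not output captured from this test):

```sql
EXPLAIN SELECT * FROM filedata WHERE md5 = 'abc' OR size > 1000000;
-- In 8.1 the planner can combine both indexes through an in-memory bitmap:
--   Bitmap Heap Scan on filedata
--     ->  BitmapOr
--           ->  Bitmap Index Scan on filedata_md5_idx
--           ->  Bitmap Index Scan on filedata_size_idx
```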
Doesn’t take the Blastwave people long to have it packaged up and ready:
http://www.blastwave.org/testing/
Special thanks to Mark Round for incredible response time. Expect this to hit the mirrors soon.
http://www.4java.ca
FreeBSD + Postgresql are in all the plans. I am pleased with the service. Go try it.
Is the speed of the Windows version improved?
If you want the comparison to be fair and about performance, you shouldn’t add a primary key to Postgresql and not to MySQL. I’m not saying that it makes a difference, but for the sake of comparison, the playing field should be level.
> If you want the comparison to be fair and about performance, you shouldn’t add a primary key to Postgresql and not to MySQL. I’m not saying that it makes a difference, but for the sake of comparison, the playing field should be level.
The normal state of a relation in a relational database is to have one or more attributes constituting a primary key of the relation.
So removing the key from postgresql, in some strange hope that postgresql would execute slower without it, seems a bit misguided.
After all, I did try to add the key to MySQL but it wouldn’t let me. So I would say the playing field was level enough. I tried to do the same thing to both databases.
To make it even worse on the MySQL side, not having a primary key will require the software that calls MySQL to do the integrity checking, slowing the application down even further, compared to a Postgresql-based application, than this simple test would lead us to believe.
To be completely fair, I should probably also have turned on strict mode in MySQL, to give MySQL the same level of integrity checks that Postgresql does by default. Turning that on normally slows it down even further, but I expect the difference in this case would have been small, as I am not using any foreign keys in the query.
What use is a table without a primary key? Maybe he’d better create an index on “path” in MySQL for comparison’s sake?
Yes, an index is a good idea. I’m in the process of creating one right now. (I aborted the original query as it hadn’t produced any result in about 10 hours).
Meanwhile, I tried to test Postgresql under the same strange conditions to get some sort of fairness (i.e. I dropped the primary key and created no index for ‘path’). Under these conditions Postgresql was still faster than MySQL, and the execution time was not much different from what I got using the primary key.
If this thread is still alive once the index is created on MySQL, I’ll run the test again. After all, this is not that much data: the table data is only 132 MB. Many databases are far bigger than that. It is also not uncommon to have business logic that is much more complicated than what I have in this query, so I think it really should be possible to get MySQL to return an answer in this century. Perhaps the index will do the trick.
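For the record, a sketch of the indexes that could be tried (an assumption on my part: the join and GROUP BY are both on md5, so that column, not path, is where an index should pay off; the 255-character prefix is likewise an assumption, since MySQL cannot index the full VARCHAR(1024)):

```sql
-- Works on both databases; indexes the column the query joins on:
CREATE INDEX filedata_md5_idx ON filedata (md5);

-- MySQL only: a prefix index is required for the long path column:
CREATE INDEX filedata_path_idx ON filedata (path(255));
```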
> So removing the key from postgresql, in some strange hope that postgresql would execute slower without it, seems a bit misguided.
That was not what I was hoping for. I simply stated that you should strive to use an identical setup, if you really want to make a performance comparison. I don’t mind postgresql being faster, but your comparison wasn’t fair. Also…
> This really doesn’t look too good for MySQL. I’m sure that there is some way to tune MySQL for better performance, but most will compare them on how they perform out of the box.
With run-of-the-mill queries that is a valid statement, but when things get a bit more complicated (such as your query), you should never expect to get away with doing nothing to optimize the performance.
<sidebar>
I tried your query on 586137 rows I created on my MSDE, using two concatenated GUIDs to simulate an md5, a counter field converted to the size field, and the md5 and the size concatenated to simulate a path. Then I used modulus to distribute 5000 duplicates evenly in the set. On my 3.2 GHz Pentium the query took 3 seconds to run.
</sidebar>
Edited 2005-11-09 00:49