“Within the past two years, Oracle, IBM and Microsoft have all released freely available versions of their flagship database servers, a move that would have been unheard of just a few years ago. While their respective representatives would argue the move was made in order to better accommodate the needs of all users, it’s fairly clear that continued pressure from open source alternatives such as MySQL and PostgreSQL have caused these database juggernauts to rethink their strategies within this increasingly competitive market.”
Last year we moved to PostgreSQL, and we’re happy with it.
And here I was about to write yet again about the advantages of open source software over free, closed source software, when I read the article instead. Good article debunking several common misconceptions attributed to PostgresSQL, and some to MySQL and other open source projects.
I use postgre and I like it. It does not take that much memory and has many(if not all) of the features of its big brothers.
I personally appreciate: the regular expressions, that it has a .NET provider and a 64 bit version.
read the article
Postgresql is centainly a good choice for middle size database system. I just moved our system from Postgresql to DB2. Then I knew its sql explainer is much better than DB2’s when there’s some outer join and distinct in a sql statement. (I have already installed the newest fixpaks for DB28.2). It’s very nice for programmer since it supports a lot of language, and its authentication part is also very flexable. (support IP address restrict). Only thing that Postgresql need to do is to enhance its scalability: specially on partitioning and threading query.
– Postgres has MUCH STRONGER Sql support, which translates into much better Programmer productivity.
– Postgres sql can be ported to Oracle or DB2 with not much change, so there’s an emergency escape path if needed.
MySql has better ISAM speed, at the expense of Developer Time. These days that’s not a smart trade off.
MySQL although less featured always seemed faster to me in production. Maybe PGSQL is just harder to tune. The only thing I can say is that I have had machines with *lots* of mysql requests and never had problems once the basic key buffer sizes were increased.
With Postgres on the other hand I have a lots of performance issues. But then again I haven’t used Postgresql since the 7.x and I am anything but a database expert.
MySQL has viral GPL license. If you develop commercial application using MySQL, you must publish your source code. Or, you can purchase commercial license for MySQL.
Wrong. If you develop an application using MySQL, you can distribute it under the terms of any license you chose, as long as you don’t link to the database.
Of course you will have to distribute the MySQL source code. And of course if you make any modifications to MySQl, you will have to publish that modifications too. But as long as you don’t link to MySQL, you can stay as closed sourced as you like.
In case folks are confused, the article is about debunking common reasons given for NOT using PostgreSQL. It is firmly pro-PostgreSQL. The title is rather misleading and makes you wonder about the editor of the site the article was posted on.
I really need an answer for this from someone who has strong experience with PostgreSQL. We are a company with about 200 concurrent users with very heavy database usage. Our current database is about 250 Gigabytes big and is on SQL Server 2000 running on quad cpu Dell attached to an EMC unit through fiber. We are currently planning for dramatic growth. We estimate that within two years, we will have around 1000 very heavy database users and a 500 Gigabyte database. Furtheremore, within five years, we will have around 2000 very heavy database users and probably a database approaching 1 Terabyte. We need to know how PostgreSQL has scaled for people and whether it can be used to plan our future infrastructure growth.
a database approaching 1 Terabyte
Wow, can I ask you what kind of data you have in this DB? What kind of company?
It is an ERP database system. With only 200 users, the database already grows about 60 gigabytes per year. We plan to increase the number of employees, increase the number of transacations created per employee, and increase data-mining, analysis, and logging. This is our enterprise core database and it will need high availability and scalability. It will be an enterprise system. We will deploy a redundant fiber SAN, with either EMC or NetApp Storage Arrays. We will most likely deploy a RHEL (AMD64 -> Itanium or Power) solution or Solaris (AMD64 -> Sparc) solution since both provide a path to grow. Although these solutions are pricey, they are actually dwarfed by the cost of Oracle, DB2, or Sybase, especially as we start scaling more and more.
I suggest first a pretty good server such as you quad-Xeon or the like, with more RAM than the size of your DB. Then, install a few SCSI disks in RAID-5 or sthg, S.O. could be either RHEL or FreeBSD 6. And PostgreSQL has tablespace so you can distribute your data to several different RAID arrays. Then, maybe a small cluster or load balancing will be necessary.
With a DB size of up to a 1TB, how on earth could he put that much RAM in a server?
It sounds like one of those new Sun coolthreads machines might be right up your alley. Plus, you can get them to send you one for a free 60 day trial.
The amount of data is no problem for PostgreSQL, however:
– PostgreSQL needs to analyse its data for statistics regularily. This can take a lot of time if you have a large database.
– Query optimization becomes very important for large data. The PostgreSQL query optimizer often gets it right, but sometimes gets it wrong. In such cases you must manually tweak queries so the optimizer gets it right. This occur s especially when you use a lot of views or subqueries.
– Aggregates in PostgreSQL can be slow, i.e. don’t try count(*) on a result with 5 million records.
I suggest you to subscribe to the postgresql-performance mailing list, you’ll find a lot of knowledgeable people to help you.
http://www.postgresql.org/community/
Cheers,
Last year I was asking a big company specialising in ERP systems why they are not supporting Postgres. They said to have met problems. I wanted an example and he quoted that order-by clause had severe limits (real world databases use wide materializations in the range of 100+ columns. Have a look into a datawarehouse book to know why). Some Oracle fan raised suspicions about scalebility due to missing streams and asynchronous transaction support. (Only Sql Server is worse for being quick in escalating cursors to a table lock). In the real world things are not quite as fair as for running postgres on a webserver with a few dozen tables per application. The “5 myths dispelled” in the article have not been gotten wrong by anyone ever since Postgres-7. The author of that article should be blamed for requoting these things as it might make for the impression that some “people in the industry” are pondering with those. They are not.
The article didn’t use the acronym LAMP, but it talked around it. It is usually defined as
Linux/Apache/MySQL/(Perl,PHP,Python).
Of course that leaves PostgreSQL out in the cold, so I was amused to find that they define it on their web site as
Linux/Apache/Middleware(Perl,PHP,Python,Ruby)/PostgreSQL,
which not only brings PostgreSQL back into the fold, but Ruby as well.
Hey, “Middleware”, could mean java/C/tuxedo/SOAP too.
I would suggest you ask this to the very friendly mailinglist. There the real postgresql db developers spend a lot of time and this question will trigger a lot of good answers for sure.
It’s GPL. The free software community prefers using GPL’d software. That’s all.
so I was amused to find that they define it on their web site as Linux/Apache/Middleware(Perl,PHP,Python,Ruby)/PostgreSQL,
which not only brings PostgreSQL back into the fold, but Ruby as well.
Hehehe. Although I’m still a LAMP user, I can see how the acronym is in need of an update. Unfortunately for Ruby and PostgreSQL, “lamp” is pretty catchy. I guess if you wanted, you could remain general:
High-level language
OS
Web server
DB
and then say you use a “Free HOWDy” setup.
You heard it here first pardner!
also means you use a jdbc or odbc driver which is gpled… 😉