This is part of The Pile, a partial archive of some open source mailing lists and newsgroups.
To: Martín Marqués <martin@bugs.unl.edu.ar>
From: Tom Lane <tgl@sss.pgh.pa.us>
Subject: Re: [HACKERS] beta3
Date: Tue, 20 Nov 2001 10:16:57 -0500

Martín Marqués <martin@bugs.unl.edu.ar> writes:
> P.S.: bzip2 is slow, but you can get a real small package with it, even
> though PostgreSQL isn't that big, if we compare it with KDE or Mozilla.

As an experiment, I zipped my current PG source tree with both. (This
isn't an exact test of the distribution size, because I didn't bother to
get rid of the CVS control files, but it's pretty close.)

Original tar file:   37089280 bytes
gzip -9:              8183182 bytes
bzip2:                6762638 bytes

or slightly less than a 20% savings for bzip over gzip. That's useful,
but not exactly compelling.

A comparison of unzip runtime also seems relevant:

$ time gunzip pgsql.tar.gz

real    0m5.48s
user    0m4.46s
sys     0m0.62s

$ time bunzip2 pgsql.tar.bz2

real    0m27.77s
user    0m26.50s
sys     0m0.92s

If I'd downloaded this thing over a decent DSL or cable modem line,
bzip2 would actually be a net loss in total download + uncompress time.

<editorial>
The reason bzip is still an also-ran is that it's not enough better
than gzip to have persuaded people to switch over. My bet is that
bzip will always be an also-ran, and that gzip will remain the de
facto standard until something comes along that's really significantly
better, like a factor of 2 better. I've watched this sort of game
play out before, and I know you don't take over the world with a 20%
improvement over the existing standard. At least not without other
compelling reasons, like speed (oops) or patent freedom (no win there
either).
</editorial>

===

To: Tom Lane <tgl@sss.pgh.pa.us>
From: mlw <markw@mohawksoft.com>
Subject: Re: [HACKERS] beta3
Date: Fri, 23 Nov 2001 11:10:33 -0500

Tom Lane wrote:
> <editorial>
> The reason bzip is still an also-ran is that it's not enough better
> than gzip to have persuaded people to switch over. My bet is that
> bzip will always be an also-ran, and that gzip will remain the de
> facto standard until something comes along that's really significantly
> better, like a factor of 2 better. I've watched this sort of game
> play out before, and I know you don't take over the world with a 20%
> improvement over the existing standard. At least not without other
> compelling reasons, like speed (oops) or patent freedom (no win there
> either).
> </editorial>

While I agree in principle with your view on bzip2, I think there is a
strong reason why you should use it: 20%.

That 20% is quite valuable. Just by switching to bzip2, the hosting
companies can deliver 20% more downloads with the same equipment and
bandwidth cost. The people with slow connections can get it 20% faster.

Will bzip2 become the standard? Probably not in general use, but for
downloadable tarballs it is rapidly becoming the standard. Those who pay
for bandwidth (server or client) welcome any improvement possible.

I would turn the argument around and time how long it takes to do:

ncftpget postgresql-xxxx.tar.gz
tar xpzvf postgresql-xxxx.tar.gz
cd postgresql-xxxx
./configure --option
make
make install
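
Both sides of this exchange rest on arithmetic that can be checked from the sizes and timings Tom Lane posted. The sketch below is only an illustration, not something either poster wrote: the function names and the two example link speeds (a 56 kbit/s modem and a 1.5 Mbit/s DSL line) are assumptions, and the unpack times are the ones measured on his machine in 2001. It works out the actual size saving (about 17.4%), how many more bzip2 downloads fit in the same bandwidth (about 21%), and the link speed at which the extra unpack time cancels the download time saved (roughly 510 kbit/s).

```python
# Figures quoted from Tom Lane's message in this thread.
GZIP_BYTES = 8_183_182
BZIP2_BYTES = 6_762_638
GUNZIP_SECONDS = 5.48      # "real" time for gunzip
BUNZIP2_SECONDS = 27.77    # "real" time for bunzip2


def size_savings(larger: int, smaller: int) -> float:
    """Fraction of bytes saved by shipping the smaller archive."""
    return (larger - smaller) / larger


def break_even_bytes_per_second() -> float:
    """Link speed at which download time saved equals extra unpack time."""
    bytes_saved = GZIP_BYTES - BZIP2_BYTES
    extra_seconds = BUNZIP2_SECONDS - GUNZIP_SECONDS
    return bytes_saved / extra_seconds


def total_seconds(size_bytes: int, unpack_seconds: float, rate: float) -> float:
    """Download plus unpack time at a given rate in bytes per second."""
    return size_bytes / rate + unpack_seconds


if __name__ == "__main__":
    print(f"size saving:    {size_savings(GZIP_BYTES, BZIP2_BYTES):.1%}")
    print(f"extra downloads per byte served: {GZIP_BYTES / BZIP2_BYTES - 1:.1%}")
    print(f"break-even link: {break_even_bytes_per_second() * 8 / 1000:.0f} kbit/s")

    # Hypothetical link speeds, chosen only for illustration.
    for label, rate in [("56 kbit/s modem", 7_000), ("1.5 Mbit/s DSL", 187_500)]:
        gz = total_seconds(GZIP_BYTES, GUNZIP_SECONDS, rate)
        bz = total_seconds(BZIP2_BYTES, BUNZIP2_SECONDS, rate)
        print(f"{label}: gzip {gz:.0f}s, bzip2 {bz:.0f}s")
```

On these numbers both posters have a point: below the break-even speed bzip2 finishes sooner, above it gzip does, and the server-side bandwidth saving holds either way. The configure and make steps mlw lists are not timed anywhere in the thread, so they are left out of the sketch.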