This is part of The Pile, a partial archive of some open source mailing lists and newsgroups.
Subject: Re[2]: [OT] PostgreSQL gets commercial support/funding
From: Brian Ashe <brian@dee-web.com>
Date: Thu, 11 May 2000 02:02:38 -0400

Hi Matt,

Here's my synopsis...

MySQL - Not free for commercial use
        Not a full SQL implementation
        Speedy (though I don't know how big this gap is with PgSQL 7.0; it has been closing with each new PgSQL release)
        Best used for simple read/write to tables (data storage)

PgSQL - FREE!
        Very close to full SQL compatibility (especially with 7.0)
        Very nice additional PgSQL special features
        Some stuff poorly documented
        Best used for more complex uses (data manipulation), where SQL compatibility is important, or because it's FREE!

Me personally, I like PostgreSQL. I have never run into this performance issue with it, though I have seen bench tests prove MySQL faster. I think it is just a matter of choice. Most people ignore the fact that if MySQL is used commercially, the authors expect to be paid. PostgreSQL is free under the Berkeley license.

Best advice: Look at each set of docs to compare features for what YOU will need. Run each (they don't conflict) and run some sample queries to see the performance difference. Then judge for yourself. You will probably be happy either way you go. Millions of people use each, so neither can be that bad.

So would anyone like to start an emacs vs vi war next? ;)

===
Subject: Re: [OT] PostgreSQL gets commercial support/funding
From: Joe Brenner <doom@kzsu.stanford.edu>
Date: Mon, 15 May 2000 13:13:06 -0700

> > What is the group's opinions regarding MySQL vs PostgreSQL?
> >
> > MySQL - Not free for commercial use
> >         Not a full SQL implementation
> >         Speedy (though I don't know how big this gap is with PgSQL 7.0, but it has been closing with each new PgSQL release.)
> >         Best used for simple read/write to tables (data storage)
> >
> > PgSQL - FREE!
> >         Very close to full SQL compatibility (especially with 7.0)
> >         Very nice additional PgSQL special features
> >         Some stuff poorly documented
> >         Best used for more complex uses (data manipulation) or where SQL
> >         compatibility is important or because it's FREE!

This seems like a reasonably fair comparison. I'm a postgresql user in part for political reasons (the MySQL license is really squirrelly, and this kind of nonsense just shouldn't be encouraged). I think a lot of younger geeks are somewhat confused about this because Slashdot uses MySQL, and they assume that (a) it must be technically slick and (b) it must be politically correct (i.e. open/free).

My take on this is that MySQL's usefulness is limited to something like Slashdot: a high hit-rate site that has to push a lot of data around, and yet paradoxically no one really cares about the data all that much (if you drop a few comments out of a thread, no one is going to sue you). (This is presuming that MySQL really *is* fast... I haven't benched it myself, and everyone I've talked to seems to be relying on data posted on the MySQL site.)

I like the fact that postgresql is more of a "real" database than MySQL... even if I don't have any plans to use something like transactions at the moment, I like the fact that I can incorporate them later without switching databases (and that if I ever decide to upgrade to the realest database -- which appears to still be Oracle -- the feature set won't be a lot different from what I'm used to).

> So would anyone like to start an emacs vs vi war next? ;)

Sure, I'm always up for a good emacs/vi war. I think it's funny that for years vi users went around saying they didn't need any fancy emacs features like auto-saves and automatically generated backup files, but now that they've got them in vim, all of a sudden they're wonderful to have. Think about it: why wait for another vim upgrade when you could have the emacs features today?
===
Subject: RE: [OT] PostgreSQL gets commercial support/funding
From: Alan Mead <adm@ipat.com>
Date: Thu, 11 May 2000 14:53:48 -0500

At 12:54 PM 5/9/00, Bjornson, Matt wrote:
> What is the group's opinions regarding MySQL vs PostgreSQL?

I think this has been discussed here, and you may find more detail in the archives or on Deja News. PostgreSQL is more ANSI compliant and fully featured than MySQL, at the price of being slower and less widely used. "Less featured" can be deceptive, though; IIRC MySQL has more varieties of data types. MySQL is perhaps a better solution for very quickly showing static data, or for storing information which is not precious enough to require COMMIT/ROLLBACK. The MySQL documents on-line describe and contrast MySQL pretty well, although since the docs are large, you may have to scroll through a lot to find them.

Ideologically, although the MySQL license is liberal and I think you can get some source (I think they GPL older versions), PostgreSQL is more "open" -- some would say the only open solution of the two.

===
Subject: [HACKERS] The New Slashdot Setup (includes MySql server)
From: Alessio Bragadini <alessio@albourne.com>
Date: Thu, 18 May 2000 19:33:43 +0300

Info on the new slashdot.org setup:

<http://slashdot.org/article.pl?sid=00/05/18/1427203&mode=nocomment>

interesting because of the plans (meaning $$$) they have to improve MySql, and because they are the flagship MySql site/application.

In the comment page, replying to the usual "Why not PostgreSql?" thread, someone pointed out an extract from the MySql docs that seems to me blatantly false (http://slashdot.org/comments.pl?sid=00/05/18/1427203&cid=131).
===
Subject: Re: [HACKERS] The New Slashdot Setup (includes MySql server)
From: Bruce Momjian <pgman@candle.pha.pa.us>
Date: Thu, 18 May 2000 13:12:38 -0400 (EDT)

> Info on the new slashdot.org setup
>
> <http://slashdot.org/article.pl?sid=00/05/18/1427203&mode=nocomment>
>
> interesting because of the plans (meaning $$$) they have to improve
> MySql, and because they are the flagship MySql site/application.
>
> In the comment page, replying to the usual "Why not PostgreSql?" thread
> someone pointed out an extract from the MySql docs that seems to me
> blatantly false
> (http://slashdot.org/comments.pl?sid=00/05/18/1427203&cid=131).

Just finished reading the thread. I am surprised how many people slammed them on their MySQL over PostgreSQL decision. People are slamming MySQL all over the place. :-)

Seems like inertia was the reason to stay with MySQL. What that means to me is that for their application space, PostgreSQL already has superior technology, and people realize it. This means we are on our way up, and MySQL is, well, ....

===
Subject: Re: [HACKERS] The New Slashdot Setup (includes MySql server)
From: Benjamin Adida <ben@mit.edu>
Date: Thu, 18 May 2000 13:42:13 -0400

on 5/18/00 1:12 PM, Bruce Momjian at pgman@candle.pha.pa.us wrote:

> Seems like inertia was the reason to stay with MySQL. What that means
> to me is that for their application space, PostgreSQL already has
> superior technology, and people realize it. This means we are on our
> way up, and MySQL is, well, ....

There is this growing desire among some OpenACS people to replicate the Slashdot functionality in an OpenACS module (probably a weekend's worth of work). I wish I had a bit more free time to do it. It's time to show what can be done with a real RDBMS (and a real web application environment, but that's a different story).

-Ben

===
Subject: Re: [HACKERS] The New Slashdot Setup (includes MySql server)
From: The Hermit Hacker <scrappy@hub.org>
Date: Thu, 18 May 2000 15:08:56 -0300 (ADT)

On Thu, 18 May 2000, Bruce Momjian wrote:

> > Info on the new slashdot.org setup
> >
> > <http://slashdot.org/article.pl?sid=00/05/18/1427203&mode=nocomment>
> >
> > interesting because of the plans (meaning $$$) they have to improve
> > MySql, and because they are the flagship MySql site/application.
> >
> > In the comment page, replying to the usual "Why not PostgreSql?" thread
> > someone pointed out an extract from the MySql docs that seems to me
> > blatantly false
> > (http://slashdot.org/comments.pl?sid=00/05/18/1427203&cid=131).
>
> Just finished reading the thread. I am surprised how many people
> slammed them on their MySQL over PostgreSQL decision. People are
> slamming MySQL all over the place. :-)
>
> Seems like inertia was the reason to stay with MySQL. What that means
> to me is that for their application space, PostgreSQL already has
> superior technology, and people realize it. This means we are on our
> way up, and MySQL is, well, ....

In SlashDot's defence here ... I doubt there is much they do that would require half of what we offer ... it's *very* little INSERT/UPDATE/DELETE and *a lot* of SELECT ...

===
Subject: Re: [HACKERS] The New Slashdot Setup (includes MySql server)
From: Alfred Perlstein <bright@wintelcom.net>
Date: Thu, 18 May 2000 12:16:19 -0700

The Hermit Hacker <scrappy@hub.org> [000518 11:51] wrote:
> On Thu, 18 May 2000, Bruce Momjian wrote:
> > > Info on the new slashdot.org setup
> > >
> > > <http://slashdot.org/article.pl?sid=00/05/18/1427203&mode=nocomment>
> > >
> > > interesting because of the plans (meaning $$$) they have to improve
> > > MySql, and because they are the flagship MySql site/application.
> > >
> > > In the comment page, replying to the usual "Why not PostgreSql?" thread
> > > someone pointed out an extract from the MySql docs that seems to me
> > > blatantly false
> > > (http://slashdot.org/comments.pl?sid=00/05/18/1427203&cid=131).
> >
> > Just finished reading the thread. I am surprised how many people
> > slammed them on their MySQL over PostgreSQL decision. People are
> > slamming MySQL all over the place. :-)
> >
> > Seems like inertia was the reason to stay with MySQL. What that means
> > to me is that for their application space, PostgreSQL already has
> > superior technology, and people realize it. This means we are on our
> > way up, and MySQL is, well, ....
>
> In SlashDot's defence here ... I doubt there is much they do that would
> require half of what we offer ... it's *very* little INSERT/UPDATE/DELETE
> and *a lot* of SELECT ...

If those guys are still doing multiple selects for each page view, after at least 2 years of being around and choking on the load, they seriously need to get a clue. mod_perl... belch!

===
Subject: Re: [HACKERS] The New Slashdot Setup (includes MySql server)
From: The Hermit Hacker <scrappy@hub.org>
Date: Thu, 18 May 2000 15:54:50 -0300 (ADT)

okay, that is a good point ... I know what a difference the two-selects-vs-one issue can make to performance ...

===
Subject: Re: [HACKERS] Performance (was: The New Slashdot Setup (includes MySql
From: Hannu Krosing <hannu@tm.ee>
Date: Fri, 19 May 2000 11:00:29 +0300

The Hermit Hacker wrote:
>
> thanks for the pointer ... I just posted my response ... specifically
> pointing out how "accurate" the MySQL docs tend to be *rofl*

And now there is a response to your response, stating the following:

> > The MySQL people have said exactly the same sort of things about
> > the PostgreSQL people. So please stop the name-calling and
> > the quotes around "test", it's not going to get you anywhere.
> > That being said, the standard MySQL benchmark _still_ is 30 times
> > faster for MySQL 3.23 than on PostgreSQL 7.0 (with fsync turned off,
> > _and_ nonstandard speed-up PostgreSQL features like VACUUM enabled,

btw, how does one "enable" vacuum?

> > I might add). The main reason seems to be some sort of failure to
> > use the index in the SELECT and UPDATE test loops on the part of
> > PostgreSQL.
> >
> > The benchmark, for the curious, works like this:
> >
> > First it creates a table with an index:
> >
> > create table bench1 (id int NOT NULL, id2 int NOT NULL, id3 int NOT NULL, dummy1 char(30));
> > create unique index bench1_index_ on bench1 using btree (id,id2);
> > create index bench1_index_1 on bench1 using btree (id3);
> >
> > Then it fills the table with 300,000 entries with unique id values.
> >
> > Then, it issues a query like this:
> >
> > update bench1 set dummy1='updated' where id=1747
> >
> > which causes the backend to do one thousand read() calls. For each query.

could it be that just updating 1 unique index causes 1k read()s?

> > No wonder it's slow. An EXPLAIN query states that it's using the
> > index, though. I have no clue what happens here. I've sent this
> > to the pgsql-general mailing list and have just reposted it to -hackers.

I somehow missed it (on -hackers at least) so I repost it here.

> > Oh yes, the benchmark also revealed that CREATE TABLE in PostgreSQL 7.0
> > leaks about 2k of memory.

===
Subject: Re: [HACKERS] Performance (was: The New Slashdot Setup (includes MySql server))
From: "Matthias Urlichs" <smurf@noris.net>
Date: Fri, 19 May 2000 11:14:24 +0200

Hi,

Hannu Krosing:
> And now there is a response to your response stating the following

That response is from me, actually. (I subscribed to -hackers two hours ago, so I'm sorry if I missed anything.)

> > That being said, the standard MySQL benchmark _still_ is 30 times
> > faster for MySQL 3.23 than on PostgreSQL 7.0 (with fsync turned off,
> > _and_ nonstandard speed-up PostgreSQL features like VACUUM enabled,
>
> btw, how does one "enable" vacuum?

run-all-tests ... --fast. The code has stuff like

    $server->vacuum(1,\$dbh) if $opt_fast and defined $server->{vacuum};

sprinkled at strategic places.

===
Subject: Re: [HACKERS] Performance (was: The New Slashdot Setup (includes MySql
From: Chris <chris@bitmead.com>
Date: Sat, 20 May 2000 05:54:20 +1000

> > > That being said, the standard MySQL benchmark _still_ is 30 times
> > > faster for MySQL 3.23 than on PostgreSQL 7.0 (with fsync turned off,
> > > _and_ nonstandard speed-up PostgreSQL features like VACUUM enabled,

VACUUM is not a speed-up feature, it's a slow-down feature. It reclaims space, and that takes time. It does update system statistics, which can help performance if done after a data load or perhaps once a day. But "sprinkling the code" with vacuum sounds like a big performance killer. Hope you are not counting vacuum as part of your 1000 read() calls.

===
Subject: Re: [HACKERS] Performance (was: The New Slashdot Setup (includes MySql server))
From: "Matthias Urlichs" <smurf@noris.net>
Date: Fri, 19 May 2000 12:18:03 +0200

Hi,

Chris:
> VACUUM is not a speed-up feature, it's a slow-down feature. It reclaims
> space and that takes time. It does update system statistics which can
> help performance if done after a data load or perhaps once a day.

OK, thanks for the clarification.

> But "sprinkling the code" with vacuum sounds like a big performance
> killer. Hope you are not counting vacuum as part of your 1000 read()
> calls.

Nonono, the 1000 read() calls are triggered by a simple INSERT or UPDATE call. They actually scan the pg_index table of the benchmark database. Why they do that is another question entirely: (a) these tables should have indices, and (b) whatever postgres wants to know should have been cached someplace. Oh yes, and (c) what's in pg_index that needs to be 4 MBytes big?

===
Subject: Re: [HACKERS] Performance (was: The New Slashdot Setup (includes MySql
From: Chris <chris@bitmead.com>
Date: Sat, 20 May 2000 06:34:06 +1000

Matthias Urlichs wrote:

> Nonono, the 1000 read() calls are triggered by a simple INSERT or UPDATE
> call. They actually scan the pg_index table of the benchmark database.

Does this only happen on the first call to INSERT/UPDATE after connecting to the database, or does it happen with all subsequent calls too?

===
Subject: RE: [HACKERS] Performance (was: The New Slashdot Setup (includes MySql server))
From: "Hiroshi Inoue" <Inoue@tpf.co.jp>
Date: Fri, 19 May 2000 19:51:38 +0900

> -----Original Message-----
> From: pgsql-hackers-owner@hub.org [mailto:pgsql-hackers-owner@hub.org]On
> Behalf Of Matthias Urlichs
>
> Why they do that is another question entirely. (a) these tables should
> have indices, and (b) whatever postgres wants to know should have been
> cached someplace. Oh yes, (c) what's in pg_index that needs to be 4
> MBytes big?

What does 'vacuum pg_index' show?

===
Subject: Re: [HACKERS] Performance (was: The New Slashdot Setup (includes MySql server))
From: "Matthias Urlichs" <smurf@noris.net>
Date: Fri, 19 May 2000 12:39:40 +0200

Hi,

Chris:
> Matthias Urlichs wrote:
>
> > Nonono, the 1000 read() calls are triggered by a simple INSERT or UPDATE
> > call. They actually scan the pg_index table of the benchmark database.
>
> Does this only happen on the first call to INSERT/UPDATE after
> connecting to the database, or does it happen with all subsequent calls
> too?

All of them. Whatever the server is looking up here, it's _not_ cached.
===
Subject: Re: [HACKERS] Performance (was: The New Slashdot Setup (includes MySql
From: Hiroshi Inoue <Inoue@tpf.co.jp>
Date: Fri, 19 May 2000 20:12:00 +0900

Matthias Urlichs wrote:
> Hi,
>
> Chris:
> > Matthias Urlichs wrote:
> >
> > > Nonono, the 1000 read() calls are triggered by a simple INSERT or UPDATE
> > > call. They actually scan the pg_index table of the benchmark database.
> >
> > Does this only happen on the first call to INSERT/UPDATE after
> > connecting to the database, or does it happen with all subsequent calls
> > too?
>
> All of them. Whatever the server is looking up here, it's _not_ cached.

Maybe the shared buffer isn't large enough to keep all the (4.1M) pg_index pages, so it would read pages from disk every time. Unfortunately pg_index has no index to scan the index entries of a relation now. However, why is pg_index so large?

===
Subject: Re: [HACKERS] Performance (was: The New Slashdot Setup (includes MySql server))
From: "Matthias Urlichs" <smurf@noris.net>
Date: Fri, 19 May 2000 13:40:08 +0200

Hi,

Hiroshi Inoue:
> What does 'vacuum pg_index' show?

test=> vacuum pg_index;
NOTICE:  Skipping "pg_index" --- only table owner can VACUUM it
VACUUM

OK, so I suppose I should do it as the postgres user...

test=> vacuum pg_index;
VACUUM

The debug output says:

DEBUG:  --Relation pg_index--
DEBUG:  Pages 448: Changed 0, reaped 448, Empty 0, New 0; Tup 34: Vac 21443, Keep/VTL 0/0, Crash 0, UnUsed 0, MinLen 164, MaxLen 164; Re-using: Free/Avail. Space 3574948/3567176; EndEmpty/Avail. Pages 0/447. CPU 0.46s/0.00u sec.
DEBUG:  Index pg_index_indexrelid_index: Pages 86; Tuples 34: Deleted 21443. CPU 0.05s/0.36u sec.
DEBUG:  Rel pg_index: Pages: 448 --> 1; Tuple(s) moved: 2. CPU 0.03s/0.03u sec.
DEBUG:  Index pg_index_indexrelid_index: Pages 86; Tuples 34: Deleted 2. CPU 0.01s/0.00u sec.

... which helped. A lot.

Thanks, everybody. The first quick benchmark run I did afterwards states that PostgreSQL is now only half as fast as MySQL, instead of the factor of 30 seen previously, on the MySQL benchmark test. ;-)

===
Subject: Re: [HACKERS] Performance (was: The New Slashdot Setup (includes MySql server))
From: "Matthias Urlichs" <smurf@noris.net>
Date: Fri, 19 May 2000 14:04:24 +0200

Hi,

Hiroshi Inoue:
> Maybe the shared buffer isn't so large as to keep all the (4.1M) pg_index pages.

That seems to be the case.

> So it would read pages from disk every time. Unfortunately pg_index
> has no index to scan the index entries of a relation now.

Well, it's reasonable that you can't keep an index on the table which states what the indices are. ;-)

... on the other hand, Apple's HFS file system stores all the information about the on-disk locations of its files as, you guessed it, a B-Tree which is saved on disk as an (invisible) file. Thus, the thing stores the information on where its sectors are located inside itself. To escape this catch-22 situation, the location of the first three extents (which is usually all it takes anyway) is stored elsewhere.

Possibly, something like this would work with postgres too.

> However why is pg_index so large?

Creating ten thousand tables will do that to you.

Is there an option I can set to increase the appropriate cache, so that the backend can keep the data in memory?

===
Subject: Re: [HACKERS] Performance (was: The New Slashdot Setup (includes
From: The Hermit Hacker <scrappy@hub.org>
Date: Fri, 19 May 2000 09:42:30 -0300 (ADT)

On Fri, 19 May 2000, Matthias Urlichs wrote:

> Hi,
>
> Hiroshi Inoue:
> > What does 'vacuum pg_index' show?
>
> test=> vacuum pg_index;
> NOTICE:  Skipping "pg_index" --- only table owner can VACUUM it
> VACUUM
>
> OK, so I suppose I should do it as the postgres user...
> test=> vacuum pg_index;
> VACUUM
>
> The debug output says:
>
> DEBUG:  --Relation pg_index--
> DEBUG:  Pages 448: Changed 0, reaped 448, Empty 0, New 0; Tup 34: Vac 21443, Keep/VTL 0/0, Crash 0, UnUsed 0, MinLen 164, MaxLen 164; Re-using: Free/Avail. Space 3574948/3567176; EndEmpty/Avail. Pages 0/447. CPU 0.46s/0.00u sec.
> DEBUG:  Index pg_index_indexrelid_index: Pages 86; Tuples 34: Deleted 21443. CPU 0.05s/0.36u sec.
> DEBUG:  Rel pg_index: Pages: 448 --> 1; Tuple(s) moved: 2. CPU 0.03s/0.03u sec.
> DEBUG:  Index pg_index_indexrelid_index: Pages 86; Tuples 34: Deleted 2. CPU 0.01s/0.00u sec.
>
> ... which helped. A lot.
>
> Thanks, everybody. The first quick benchmark run I did afterwards states
> that PostgreSQL is now only half as fast as MySQL, instead of the factor
> of 30 seen previously, on the MySQL benchmark test. ;-)

Wow, shock of shocks ... MySQL has more inaccuracies in their docs? *grin*

===
Subject: Re: [HACKERS] Performance (was: The New Slashdot Setup (includes MySql server))
From: "Mitch Vincent" <mitch@huntsvilleal.com>
Date: Fri, 19 May 2000 09:47:39 -0400

> ... which helped. A lot.
>
> Thanks, everybody. The first quick benchmark run I did afterwards states
> that PostgreSQL is now only half as fast as MySQL, instead of the factor
> of 30 seen previously, on the MySQL benchmark test. ;-)

while (horse == DEAD) { beat(); }

... Anyway... I can see this being true (MySQL being twice as fast as PostgreSQL); however, I don't think that MySQL being faster than PostgreSQL was ever up for debate. When you take an RDBMS and strip out a huge number of features, of course you're going to get a faster end product. It's just not nearly as safe, feature-rich or easy to work with (from a programmer's standpoint). I looked at MySQL to use for my applications, for all of ten seconds.... To code in and around, MySQL just isn't a usable RDBMS for me, and I can hardly see how it's useful for anyone doing the kind of programming I do.

What it is very good for is something like RADIUS/POP3 authentication. I use it at my ISP to keep all my user authentication in one place... However, the only thing I cared about there was speed, and there are all of two things I ever do to that database: I SELECT (once every auth request) and occasionally I INSERT and possibly UPDATE. That, coupled with the fact that there are only two to three things in the database per user (username, password and domain for POP3 auth) -- it's just not a very complicated thing to do... I use a SQL backend because it's very easy to maintain and I can easily write software to manipulate the data held in the tables -- that's all.

With the other applications I and my company write, it's a totally different story. I just don't see how a person can write any kind of larger application and not need all the features MySQL lacks... I like MySQL for certain things -- however, I've never considered "MySQL vs PostgreSQL" -- they're just two totally different databases for totally different uses, IMHO.

-Mitch

===
Subject: Re: [HACKERS] Performance (was: The New Slashdot Setup (includes MySql server))
From: "Matthias Urlichs" <smurf@noris.net>
Date: Fri, 19 May 2000 16:00:14 +0200

Hi,

The Hermit Hacker:
> > Thanks, everybody. The first quick benchmark run I did afterwards states
> > that PostgreSQL is now only half as fast as MySQL, instead of the factor
> > of 30 seen previously, on the MySQL benchmark test. ;-)
>
> Wow, shock of shocks ... MySQL has more inaccuracies in their docs? *grin*

No, that factor of 30 was my result after running the benchmark for the first time. Presumably, unless I skip the large_number_of_tables test, it'll be just as slow the second time around.

The MySQL people probably didn't dig deeper into PostgreSQL's innards. They don't seem to think it's their job to find out exactly why their benchmark runs so slowly on some other databases, and I don't particularly fault them for that attitude.
The PostgreSQL community has an attitude too, after all. One of these might be to answer "you must have had fsync turned on" whenever somebody reports a way-too-slow benchmark. In this case, that's definitely not true.

Another attitude of the PostgreSQL developers might be to answer "run VACUUM" whenever somebody reports performance problems. That answer is not helpful at all WRT this benchmark, because the user who caused the problem ("test", in my case) isn't permitted to run VACUUM on the pg_index table.

The alternate solution would be for the backend to notice "Gee, I just scanned a whole heap of what turned out to be empty space in this here pg_index file; maybe it would be a good idea to call vacuum() on it." Or, if that doesn't work, to increase the buffer for holding its content.

Anyway, I fully expect to have a more reasonable benchmark result by tomorrow, and the MySQL guys will get a documentation update. Which they _will_ put in the next update's documentation file. Trust me. ;-)

===
Subject: Re: [HACKERS] Performance (was: The New Slashdot Setup (includes MySql
From: Chris <chris@bitmead.com>
Date: Sat, 20 May 2000 10:19:00 +1000

Matthias Urlichs wrote:

> Another attitude of the PostgreSQL developers might be to answer "run
> VACUUM" whenever somebody reports performance problems. That answer is
> not helpful at all WRT this benchmark, because the user who caused the
> problem ("test", in my case) isn't permitted to run VACUUM on the
> pg_index table.

Speaking of which, why can't any user who can change meta-data also vacuum meta-data? It's not a threat to security, is it?

===
Subject: Re: [HACKERS] Performance (was: The New Slashdot Setup
From: "Michael A. Olson" <mao@sleepycat.com>
Date: Fri, 19 May 2000 07:18:31 -0700

At 02:04 PM 5/19/00 +0200, you wrote:

> Well, it's reasonable that you can't keep an index on the table which
> states what the indices are. ;-)
>
> ... on the other hand, Apple's HFS file system stores all the information
> about the on-disk locations of its files as, you guessed it, a B-Tree
> which is saved on disk as an (invisible) file. Thus, the thing stores
> the information on where its sectors are located inside itself.
> To escape this catch-22 situation, the location of the first three
> extents (which is usually all it takes anyway) is stored elsewhere.
>
> Possibly, something like this would work with postgres too.

This is one of several things we did at Illustra to make the backend run faster. I did the design and implementation, but it was a few years ago, so the details are hazy. Here's what I remember.

We had to solve three problems:

First, you had to be able to run initdb and bootstrap the system without the index on pg_index in place. As I recall, we had to carefully order the creation of the first several tables to make that work, but it wasn't rocket science.

Second, when the index on pg_index gets created, you need to update it with tuples that describe it. This is really just the same as hard-coding the pg_attribute attribute entries into pg_attribute -- ugly, but not that bad.

Third, we had to abstract a lot of the hard-coded table scans in the bowels of the system to call a routine that checked for the existence of an index on the system table, and used it. In order for the index on pg_index to get used, its reldesc had to be nailed in the cache. Getting it there at startup was more hard-coded ugliness, but you only had to do it one time.

The advantage is that you can then index a bunch more of the system catalog tables, and on a bunch more attributes. That produced some surprising speedups.

This was simple enough that I'm certain the same technique would work in the current engine.
===
Subject: Re: [HACKERS] Performance (was: The New Slashdot Setup (includes MySql server))
From: Tom Lane <tgl@sss.pgh.pa.us>
Date: Fri, 19 May 2000 10:24:02 -0400

"Matthias Urlichs" <smurf@noris.net> writes:
>>>> Nonono, the 1000 read() calls are triggered by a simple INSERT or UPDATE
>>>> call. They actually scan the pg_index table of the benchmark database.

Ohh ... pg_index is the culprit! OK, I know exactly where that's coming from: the planner is looking around to see what indexes might be interesting for planning the query. Several comments here:

1. Probably we ought to try to bypass most of the planning process for a simple INSERT ... VALUES. (I thought I had fixed that, but apparently it's not getting short-circuited soon enough, if the search for indexes is still happening.)

2. The search is not using either an index or a cache, IIRC. Needs to be fixed, but there may be no suitable index present in 7.0.

3. I have been toying with the notion of having relcache entries store information about the indexes associated with the table, so that the planner wouldn't have to search through pg_index at all. The trouble with doing that is getting the info updated when an index is added or dropped; haven't quite figured out how to do that...

===
Subject: Re: [HACKERS] Performance (was: The New Slashdot Setup (includes MySql
From: Bruce Momjian <pgman@candle.pha.pa.us>
Date: Fri, 19 May 2000 11:25:49 -0400 (EDT)

> The advantage is that you can then index a bunch more of the system
> catalog tables, and on a bunch more attributes. That produced some
> surprising speedups.

We have indexes on all system tables that need it. The pg_index index was done quite easily and is new for 7.0. A check for recursion, and a fallback to sequential scan for pg_index table rows in the pg_index table, allows it to happen.
===
Subject: Re: [HACKERS] Performance (was: The New Slashdot Setup (includes MySql server))
From: Tom Lane <tgl@sss.pgh.pa.us>
Date: Fri, 19 May 2000 12:39:17 -0400

Bruce Momjian <pgman@candle.pha.pa.us> writes:
>> The advantage is that you can then index a bunch more of the system
>> catalog tables, and on a bunch more attributes. That produced some
>> surprising speedups.
>
> We have indexes on all system tables that need it.

There isn't any fundamental reason why the planner can't be using an index to scan pg_index; we just need to code it that way. Right now it's coded as a sequential scan.

Unfortunately there is no index on pg_index's indrelid column in 7.0, so this is not fixable without an initdb. TODO item for 7.1, I guess.

More generally, someone should examine the other places where heap_getnext() loops occur, and see if any of them look like performance bottlenecks...

===
Subject: Re: [HACKERS] Performance (was: The New Slashdot Setup (includes MySql
From: Bruce Momjian <pgman@candle.pha.pa.us>
Date: Fri, 19 May 2000 12:54:10 -0400 (EDT)

> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> >> The advantage is that you can then index a bunch more of the system
> >> catalog tables, and on a bunch more attributes. That produced some
> >> surprising speedups.
> >
> > We have indexes on all system tables that need it.
>
> There isn't any fundamental reason why the planner can't be using an
> index to scan pg_index; we just need to code it that way. Right now
> it's coded as a sequential scan.
>
> Unfortunately there is no index on pg_index's indrelid column in 7.0,
> so this is not fixable without an initdb. TODO item for 7.1, I guess.

The reason there is no index is that it is not a unique column, and at the time I was adding system indexes for 7.0, I was looking for indexes that could be used for system cache lookups. The index you are describing returns multiple tuples, so it would be an actual index call in the code. I will add this to the TODO.

> More generally, someone should examine the other places where
> heap_getnext() loops occur, and see if any of them look like performance
> bottlenecks...

Good idea. Added to TODO.

===
Subject: Re: [HACKERS] Performance (was: The New Slashdot Setup
From: "Michael A. Olson" <mao@sleepycat.com>
Date: Fri, 19 May 2000 09:54:11 -0700

At 12:39 PM 5/19/00 -0400, Tom Lane wrote:
> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> > We have indexes on all system tables that need it.
>
> There isn't any fundamental reason why the planner can't be using an
> index to scan pg_index; we just need to code it that way. Right now
> it's coded as a sequential scan.

Eliminating the hard-coded seqscans of catalogs in the bowels of the system was the hardest part of the project. As I said, it was good to do. It made parsing and planning queries much, much faster.

===
Subject: Re: [HACKERS] Performance (was: The New Slashdot Setup (includes MySql
From: Bruce Momjian <pgman@candle.pha.pa.us>
Date: Fri, 19 May 2000 13:14:30 -0400 (EDT)

> At 12:39 PM 5/19/00 -0400, Tom Lane wrote:
> > Bruce Momjian <pgman@candle.pha.pa.us> writes:
> > > We have indexes on all system tables that need it.
> >
> > There isn't any fundamental reason why the planner can't be using an
> > index to scan pg_index; we just need to code it that way. Right now
> > it's coded as a sequential scan.
>
> Eliminating the hard-coded seqscans of catalogs in the bowels of the
> system was the hardest part of the project. As I said, it was good
> to do. It made parsing and planning queries much, much faster.

All the sequential catalog scans that return one row are gone. What has not been done is adding indexes for scans returning more than one row.
=== Subject: Re: [HACKERS] Performance (was: The New Slashdot Setup (includes MySql server)) From: Tom Lane <tgl@sss.pgh.pa.us> Date: Fri, 19 May 2000 13:36:29 -0400 Bruce Momjian <pgman@candle.pha.pa.us> writes: > All the sequential catalog scans that return one row are gone. What has > not been done is adding indexes for scans returning more than one row. I've occasionally wondered whether we can't find a way to use the catcaches for searches that can return multiple rows. It'd be easy enough to add an API for catcache that could return multiple rows given a nonunique search key. The problem is how to keep the catcache up to date with underlying reality for this kind of query. Deletions of rows will be handled by the existing catcache invalidation mechanism, but how can we know when some other backend has added a row that will match a search condition? Haven't seen an answer short of scanning the table every time, which makes the catcache no win at all. === Subject: Re: [HACKERS] Performance (was: The New Slashdot Setup (includes MySql From: Bruce Momjian <pgman@candle.pha.pa.us> Date: Fri, 19 May 2000 13:56:58 -0400 (EDT) > Bruce Momjian <pgman@candle.pha.pa.us> writes: > > All the sequential catalog scans that return one row are gone. What has > > not been done is adding indexes for scans returning more than one row. > > I've occasionally wondered whether we can't find a way to use the > catcaches for searches that can return multiple rows. It'd be easy > enough to add an API for catcache that could return multiple rows given > a nonunique search key. The problem is how to keep the catcache up to > date with underlying reality for this kind of query. Deletions of rows > will be handled by the existing catcache invalidation mechanism, but > how can we know when some other backend has added a row that will match > a search condition? Haven't seen an answer short of scanning the table > every time, which makes the catcache no win at all. Good point. 
You can invalidate stuff, but how to find new stuff that doesn't have a specific key? === Subject: Re: [HACKERS] Performance (was: The New Slashdot Setup (includes MySql server)) From: Tom Lane <tgl@sss.pgh.pa.us> Date: Fri, 19 May 2000 15:35:04 -0400 "Michael A. Olson" <mao@sleepycat.com> writes: > Third, we had to abstract a lot of the hard-coded table scans in > the bowels of the system to call a routine that checked for the > existence of an index on the system table, and used it. The way that we've been approaching this is by switching from hard-coded sequential scans (heap_getnext() calls) to hard-coded indexscans (index_getnext() calls) at places where performance dictates it. An advantage of doing it that way is that you don't have the bootstrapping/circularity problems that Mike describes; the code doesn't need to consult pg_index to know whether there is an index to use, it just has the necessary info hard-coded in. For the same reason it's very quick. Nonetheless it's also a pretty ugly answer. I'd rather the code wasn't so tightly tied to a particular set of indexes for system tables. I was thinking about doing something like what Mike describes: replace uses of heap_beginscan() with calls to a routine that would examine the passed ScanKey(s) to see if there is a relevant index, and then start either a heap or index scan as appropriate. The circularity issue could be resolved by having that routine have hard-coded knowledge of some of the system-table indexes (or even all of them, which is still better than having that knowledge scattered throughout the code). But the performance cost of identifying the right index based on ScanKeys gives me pause. It's hard to justify that per-search overhead when the hard-coded approach works well enough. Thoughts anyone?
regards, tom lane === Subject: Re: [HACKERS] Performance (was: The New Slashdot Setup (includes MySql From: Bruce Momjian <pgman@candle.pha.pa.us> Date: Fri, 19 May 2000 15:52:44 -0400 (EDT) "Michael A. Olson" <mao@sleepycat.com> writes: > > Third, we had to abstract a lot of the hard-coded table scans in > > the bowels of the system to call a routine that checked for the > > existence of an index on the system table, and used it. > > The way that we've been approaching this is by switching from hard-coded > sequential scans (heap_getnext() calls) to hard-coded indexscans > (index_getnext() calls) at places where performance dictates it. > > An advantage of doing it that way is that you don't have the > bootstrapping/circularity problems that Mike describes; the code doesn't > need to consult pg_index to know whether there is an index to use, it > just has the necessary info hard-coded in. For the same reason it's > very quick. I like hard-coded. There aren't many of them, last time I looked. Maybe 5-10 that need index scan. The rest are already done using the catalog cache. === Subject: RE: [HACKERS] Performance (was: The New Slashdot Setup (includes MySql server)) From: "Hiroshi Inoue" <Inoue@tpf.co.jp> Date: Sat, 20 May 2000 07:01:15 +0900 -----Original Message----- > From: pgsql-hackers-owner@hub.org [mailto:pgsql-hackers-owner@hub.org]On > Behalf Of Tom Lane > > Bruce Momjian <pgman@candle.pha.pa.us> writes: > >> The advantage is that you can then index a bunch more of the system > >> catalog tables, and on a bunch more attributes. That produced some > >> surprising speedups. > > > We have indexes on all system tables that need it. > > There isn't any fundamental reason why the planner can't be using an > index to scan pg_index; we just need to code it that way. Right now > it's coded as a sequential scan. > > Unfortunately there is no index on pg_index's indrelid column in 7.0, > so this is not fixable without an initdb. TODO item for 7.1, I guess.
> I've noticed the fact since before but haven't complained. As far as I see,pg_index won't so big. In fact Matthias's case has only 1 page after running vacuum for pg_index. In such cases sequential scan is faster than index scan as you know. I don't agree with you to increase system indexes easily. Though I implemented REINDEX command to recover system indexes it doesn't mean index corruption is welcome. I know another case. pg_attrdef has no index on (adrelid,attnum) though it has an index on (adrelid). > More generally, someone should examine the other places where > heap_getnext() loops occur, and see if any of them look like performance > bottlenecks... Please don't lose sequential scan stuff even when changes to index scan is needed because -P option of standalone postgres needs sequential scan for system tables. === Subject: Re: [HACKERS] Performance (was: The New Slashdot Setup (includes MySql server)) From: Tom Lane <tgl@sss.pgh.pa.us> Date: Fri, 19 May 2000 18:41:32 -0400 "Hiroshi Inoue" <Inoue@tpf.co.jp> writes: >> Unfortunately there is no index on pg_index's indrelid column in 7.0, >> so this is not fixable without an initdb. TODO item for 7.1, I guess. > I've noticed the fact since before but haven't complained. > As far as I see,pg_index won't so big. In fact Matthias's case has > only 1 page after running vacuum for pg_index. In such cases > sequential scan is faster than index scan as you know. True, but the differential isn't very big either when dealing with a small table. I think I'd rather use an index and be assured that performance doesn't degrade drastically when the database contains many indexes. I've also been thinking about ways to implement the relcache-based caching of index information that I mentioned before. That doesn't address the scanning problem in general but it should improve performance for this part of the planner quite a bit.
The trick is to ensure that other backends update their cached info whenever an index is added or deleted. I thought of one way to do that: force an update of the owning relation's pg_class tuple during CREATE or DROP INDEX, even when we don't have any actual change to make in its contents --- that'd force a relcache invalidate cycle at other backends. (Maybe we don't even need to change the pg_class tuple, but just send out a shared-cache-invalidate message as if we had.) > I know another case. pg_attrdef has no index on (adrelid,attnum) > though it has an index on (adrelid). Doesn't look to me like we need an index on (adrelid,attnum), at least not in any paths that are common enough to justify maintaining another index. The (adrelid) index supports loading attrdef data into the relcache, which is the only path I'm particularly concerned about performance of... === Subject: Re: [HACKERS] Performance (was: The New Slashdot Setup (includes MySql server)) From: Tom Lane <tgl@sss.pgh.pa.us> Date: Fri, 19 May 2000 18:23:26 -0400 Hiroshi Inoue" <Inoue@tpf.co.jp> writes: >> More generally, someone should examine the other places where >> heap_getnext() loops occur, and see if any of them look like performance >> bottlenecks... > Please don't lose sequential scan stuff even when changes to > index scan is needed because -P option of standalone postgres > needs sequential scan for system tables. Good point. I'd still like not to clutter the code with deciding which kind of scan to invoke, though. Maybe we could put the begin_xxx routine in charge of ignoring a request for an indexscan when -P is used. (AFAIR there's no real difference for the calling code, it sets up scankeys and so forth just the same either way, no? We should just need a switching layer in front of heap_beginscan/ index_beginscan and heap_getnext/index_getnext...) 
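The switching layer Tom describes can be sketched in miniature (Python standing in for the C scan routines; every name below is invented for illustration and is not the real heap_beginscan()/index_beginscan() API). The point of the sketch is that callers set up the search key the same way no matter which access path the layer picks, and a standalone-mode flag can force the sequential path, which is Hiroshi's -P concern:

```python
# A toy system table and a hard-coded "index" on its first column.
pg_class_rows = [(1259, "pg_class"), (2610, "pg_index"), (2604, "pg_attrdef")]
oid_index = {oid: pos for pos, (oid, _) in enumerate(pg_class_rows)}

standalone_mode = False  # models postgres -P: no system-index scans allowed

def systable_scan(key_oid):
    """Dispatch between an index scan and a sequential scan internally,
    so callers never know which path ran."""
    if not standalone_mode and key_oid in oid_index:
        # "index scan": jump straight to the matching tuple
        return [pg_class_rows[oid_index[key_oid]]]
    # "sequential scan": examine every tuple, like a heap_getnext() loop
    return [row for row in pg_class_rows if row[0] == key_oid]

# Same answer either way; only the access path differs.
fast = systable_scan(2610)
standalone_mode = True
slow = systable_scan(2610)
print(fast == slow)  # True
```

The design cost Tom worries about shows up even here: the dispatcher has to decide, per search, whether a usable index exists.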
=== Subject: Re: [HACKERS] Performance (was: The New Slashdot Setup (includes MySql server)) From: Tom Lane <tgl@sss.pgh.pa.us> Date: Fri, 19 May 2000 18:44:27 -0400 I wrote: > We should just need a switching layer in front of heap_beginscan/ > index_beginscan and heap_getnext/index_getnext...) After refreshing my memory of how these are used, it seems that we'd have to change the API of either the heap or index scan routines in order to unify them like that. Might be worth doing to maintain code cleanliness, though. The places Hiroshi has fixed to support both index and seq scan look really ugly to my eyes ... === Subject: Re: [HACKERS] Performance (was: The New Slashdot Setup (includes MySql From: Bruce Momjian <pgman@candle.pha.pa.us> Date: Fri, 19 May 2000 19:21:47 -0400 (EDT) > I've noticed the fact since before but haven't complained. > As far as I see,pg_index won't so big. In fact Matthias's case has > only 1 page after running vacuum for pg_index. In such cases > sequential scan is faster than index scan as you know. > I don't agree with you to increase system indexes easily. > Though I implemented REINDEX command to recover system > indexes it doesn't mean index corruption is welcome. > > I know another case. pg_attrdef has no index on (adrelid,attnum) > though it has an index on (adrelid). > > > More generally, someone should examine the other places where > > heap_getnext() loops occur, and see if any of them look like performance > > bottlenecks... > > Please don't lose sequential scan stuff even when changes to > index scan is needed because -P option of standalone postgres > needs sequential scan for system tables. Certainly whatever we do will be discussed. I realize initdb is an issue. However, I am not sure sequential scan is faster than index scan for finding only a few rows in the table. 
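Bruce's doubt is easy to put numbers on with a back-of-the-envelope model (the constants below are made up for illustration; they are not PostgreSQL's real cost parameters). For a one-page catalog like Hiroshi's pg_index example, reading the single page is cheaper than descending an index; the index only starts to win as the table grows, which is Tom's point about many indexes:

```python
# Crude page-fetch cost model for finding a handful of matching rows.
def seqscan_cost(table_pages):
    return table_pages            # must read every page of the table

def indexscan_cost(index_depth=2):
    return index_depth + 1        # descend the index, then fetch one heap page

for pages in (1, 10, 100):
    seq, idx = seqscan_cost(pages), indexscan_cost()
    winner = "seqscan" if seq <= idx else "indexscan"
    print(f"{pages:3d} page(s): seqscan={seq:3d}  indexscan={idx}  -> {winner} wins")
```

So both positions in the thread are right: sequential scan wins on a tiny catalog, but the margin is small, while the index protects against the many-indexes case.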
=== Subject: Re: [HACKERS] Performance (was: The New Slashdot Setup (includes MySql From: Bruce Momjian <pgman@candle.pha.pa.us> Date: Fri, 19 May 2000 19:24:13 -0400 (EDT) > "Hiroshi Inoue" <Inoue@tpf.co.jp> writes: > >> Unfortunately there is no index on pg_index's indrelid column in 7.0, > >> so this is not fixable without an initdb. TODO item for 7.1, I guess. > > > I've noticed the fact since before but haven't complained. > > As far as I see,pg_index won't so big. In fact Matthias's case has > > only 1 page after running vacuum for pg_index. In such cases > > sequential scan is faster than index scan as you know. > > True, but the differential isn't very big either when dealing with > a small table. I think I'd rather use an index and be assured that > performance doesn't degrade drastically when the database contains > many indexes. Agreed. > > I've also been thinking about ways to implement the relcache-based > caching of index information that I mentioned before. That doesn't > address the scanning problem in general but it should improve > performance for this part of the planner quite a bit. The trick is to > ensure that other backends update their cached info whenever an index > is added or deleted. I thought of one way to do that: force an update > of the owning relation's pg_class tuple during CREATE or DROP INDEX, > even when we don't have any actual change to make in its contents --- > that'd force a relcache invalidate cycle at other backends. (Maybe > we don't even need to change the pg_class tuple, but just send out a > shared-cache-invalidate message as if we had.) Oh, good idea. Just invalidate the relation so we reload. BTW, Hiroshi is the one who gave me the recursion-prevention fix for the system index additions for 7.0.
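The invalidate-and-reload scheme that Bruce agrees to above can be modeled in a few lines. This is a toy sketch only; the real mechanism is PostgreSQL's shared-cache-invalidation machinery, and all names here are invented:

```python
# Shared state: the authoritative catalog plus a broadcast queue of
# "relation X changed, rebuild your cache entry" messages.
shared_catalog = {"bench1": ["bench1_index_"]}
inval_queue = []

class Backend:
    def __init__(self):
        self.relcache = {}   # relation name -> cached index list
        self.seen = 0        # how much of inval_queue we have processed

    def indexes(self, rel):
        # Drain pending invalidation messages before trusting the cache.
        for changed_rel in inval_queue[self.seen:]:
            self.relcache.pop(changed_rel, None)
        self.seen = len(inval_queue)
        if rel not in self.relcache:           # miss: reload from catalog
            self.relcache[rel] = list(shared_catalog[rel])
        return self.relcache[rel]

def create_index(rel, index_name):
    shared_catalog[rel].append(index_name)
    inval_queue.append(rel)                    # tell every other backend

b1 = Backend()
print(b1.indexes("bench1"))                    # warm cache: one index
create_index("bench1", "bench1_index_1")       # "another backend" adds one
print(b1.indexes("bench1"))                    # b1 reloads and sees it
```

This also shows why Tom's trick works for index lists where the multi-row catcache idea did not: the invalidation message names the owning relation, a single key, even though the cached value is a set of rows.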
=== Subject: Re: [HACKERS] Performance (was: The New Slashdot Setup (includes MySql server)) From: Tom Lane <tgl@sss.pgh.pa.us> Date: Fri, 19 May 2000 20:32:18 -0400 I wrote: >>>>> Nonono, the 1000 read() calls are triggered by a simple INSERT or UPDATE >>>>> call. They actually scan the pg_index table of the benchmark database. > > Ohh ... pg_index is the culprit! OK, I know exactly where that's coming > from: the planner is looking around to see what indexes might be > interesting for planning the query. Several comments here: > > 1. Probably we ought to try to bypass most of the planning process for > a simple INSERT ... VALUES. (I thought I had fixed that, but apparently > it's not getting short-circuited soon enough, if the search for indexes > is still happening.) It never pays to assume you know what's happening without having looked :-(. It turns out the planner is not the only culprit: the executor's ExecOpenIndices() routine *also* does a sequential scan of pg_index. I did shortcircuit the planner's search in the INSERT ... VALUES case, but of course the executor still must find out whether the table has indexes. In UPDATE, DELETE, or INSERT ... SELECT, pg_index is scanned *twice*, once in the planner and once in the executor. (In fact it's worse than that: the planner scans pg_index separately for each table named in the query. At least the executor only does it once since it only has to worry about one target relation.) Definitely need to cache the indexing information... regards, tom lane === Subject: Re: [HACKERS] Performance (was: The New Slashdot Setup (includes MySql From: Bruce Momjian <pgman@candle.pha.pa.us> Date: Fri, 19 May 2000 21:46:43 -0400 (EDT) > I wrote: > > We should just need a switching layer in front of heap_beginscan/ > > index_beginscan and heap_getnext/index_getnext...) > > After refreshing my memory of how these are used, it seems that > we'd have to change the API of either the heap or index scan routines > in order to unify them like that. Might be worth doing to maintain > code cleanliness, though. The places Hiroshi has fixed to support > both index and seq scan look really ugly to my eyes ... Agreed, and I think there are a few places that have them.
=== Subject: Re: [HACKERS] Performance (was: The New Slashdot Setup (includes MySql From: Thomas Lockhart <lockhart@alumni.caltech.edu> Date: Sat, 20 May 2000 01:53:28 +0000 > The MySQL people probably didn't dig deeper into PostgreSQL's innards. > They don't seem to think it's their job to find out exactly why their > benchmark runs so slow on some other databases, and I don't particularly > fault them for that attitude. Hmm. And then whose job is it to take someone else's work and make it accurate? If the shoe were on the other foot: if I generated a benchmark suite and features list, and it contained major and numerous inaccuracies, who would you expect to be responsible (or at least feel responsible) for correcting/updating/improving it? 'Twould be me imho. We've tried, and failed (to date) to contribute information to the "crashme" travesty. My recollection was a ~30% error rate on information for Postgres, and I didn't look into the stats for other databases. Check the archives for details. > The PostgreSQL community has an attitude too, after all. Yup ;) > One of these might be to answer "you must have had fsync turned on" > whenever somebody reports a way-too-slow benchmark. In this case, > that's definitely not true. I'm sorry that has been your experience. imho, that initial response might be considered "helpful advice", not "attitude". And I'll submit that most postings I've seen (I'm mostly on the -hackers list) are followed up to the bitter end if the poster can state the problem succinctly and can follow up with specific information. But I'm a developer, so don't have the right outlook. > Anyway, I fully expect to have a more reasonable benchmark result by > tomorrow, and the MySQL guys will get a documentation update. Which they > _will_ put in the next update's documentation file. Trust me. ;-) Fantastic! We've been down this road before, and have had little luck in getting more than a token update of inaccuracies. Any little bit helps.
And while you're at it, can you update their docs and web site to make clear that transactions and atomicity are not anywhere near the feature list of MySQL yet? TIA - Thomas === Subject: RE: [HACKERS] Performance (was: The New Slashdot Setup (includes MySql server)) From: "Hiroshi Inoue" <Inoue@tpf.co.jp> Date: Sat, 20 May 2000 14:25:23 +0900 -----Original Message----- > From: Tom Lane [mailto:tgl@sss.pgh.pa.us] > > I wrote: > > We should just need a switching layer in front of heap_beginscan/ > > index_beginscan and heap_getnext/index_getnext...) > > After refreshing my memory of how these are used, it seems that > we'd have to change the API of either the heap or index scan routines > in order to unify them like that. Might be worth doing to maintain > code cleanliness, though. The places Hiroshi has fixed to support > both index and seq scan look really ugly to my eyes ... > Yes,it's ugly unfortunately. So I had hesitated to commit it for pretty long. There's a trial of unification in my trial implementation of ALTER TABLE DROP COLUMN in command.c. === Subject: RE: [HACKERS] Performance (was: The New Slashdot Setup (includes MySql server)) From: "Hiroshi Inoue" <Inoue@tpf.co.jp> Date: Sat, 20 May 2000 14:25:26 +0900 -----Original Message----- > From: Tom Lane [mailto:tgl@sss.pgh.pa.us] > > "Hiroshi Inoue" <Inoue@tpf.co.jp> writes: > >> Unfortunately there is no index on pg_index's indrelid column in 7.0, > >> so this is not fixable without an initdb. TODO item for 7.1, I guess. > > I've also been thinking about ways to implement the relcache-based > caching of index information that I mentioned before. That doesn't > address the scanning problem in general but it should improve > performance for this part of the planner quite a bit. Sounds reasonable to me. > The trick is to > ensure that other backends update their cached info whenever an index > is added or deleted.
I thought of one way to do that: force an update > of the owning relation's pg_class tuple during CREATE or DROP INDEX, > even when we don't have any actual change to make in its contents --- > that'd force a relcache invalidate cycle at other backends. (Maybe > we don't even need to change the pg_class tuple, but just send out a > shared-cache-invalidate message as if we had.) > Seems CREATE INDEX already sends shared_cache_invalidate message. DROP INDEX doesn't ?? I'm not sure. > > I know another case. pg_attrdef has no index on (adrelid,attnum) > > though it has an index on (adrelid). > > Doesn't look to me like we need an index on (adrelid,attnum), at > least not in any paths that are common enough to justify maintaining > another index. The (adrelid) index supports loading attrdef data > into the relcache, which is the only path I'm particularly concerned > about performance of... > It seems to me that an index on (adrelid,adnum) should exist instead of the current index. It identifies pg_attrdef. I say *Oops* about it in my trial implementation of ALTER TABLE DROP COLUMN. === Subject: Re: [HACKERS] Performance (was: The New Slashdot Setup (includes MySql server)) From: Tom Lane <tgl@sss.pgh.pa.us> Date: Sat, 20 May 2000 01:30:08 -0400 "Hiroshi Inoue" <Inoue@tpf.co.jp> writes: >>>> I know another case. pg_attrdef has no index on (adrelid,attnum) >>>> though it has an index on (adrelid). >> >> Doesn't look to me like we need an index on (adrelid,attnum), at >> least not in any paths that are common enough to justify maintaining >> another index. The (adrelid) index supports loading attrdef data >> into the relcache, which is the only path I'm particularly concerned >> about performance of... > It seems to me that an index on (adrelid,adnum) should > exist instead of the current index. It identifies pg_attrdef. > I say *Oops* about it in my trial implementation of ALTER > TABLE DROP COLUMN. Right, I saw that.
But it seems to be the only place where such an index would be useful. The relcache-loading routines, which seem to be the only performance-critical access to pg_attrdef, prefer an index on adrelid only. Is it worth maintaining a 2-column index (which is bulkier and slower than a 1-column one) just to speed up ALTER TABLE DROP COLUMN? === Subject: Re: [HACKERS] Performance (was: The New Slashdot Setup (includes MySql server)) From: "Matthias Urlichs" <smurf@noris.net> Date: Sat, 20 May 2000 10:09:00 +0200 Hi, Tom Lane: > It never pays to assume you know what's happening without having looked > :-(. It turns out the planner is not the only culprit: the executor's > ExecOpenIndices() routine *also* does a sequential scan of pg_index. That meshes with my observation that updates seem to do twice as many read() calls on pg_index as inserts. For this test, anyway. === Subject: Re: [HACKERS] Performance (was: The New Slashdot Setup (includes MySql From: Mike Mascari <mascarm@mascari.com> Date: Sat, 20 May 2000 07:02:42 -0400 Matthias Urlichs wrote: > Attached is the current crashme output. "crash_me_safe" is off only > because of the fact that some tests go beyond available memory. > There's no sense in testing how far you can push a > "SELECT a from b where c = 'xxx(several megabytes worth of Xes)'" > query when the size of a TEXT field is limited to 32k. > > Limits with '+' in front of the number say that this is the max value > tested, without implying whether higher values are OK or not. > > If you have any remarks, especially about the '=no' results (i.e. you > think PostgreSQL can do that, therefore the crashme test must be wrong > somehow), tell me. Otherwise I'll forward the results to the MySQL > people next week. How about: 1. alter_rename_table = no The syntax in PostgreSQL is ALTER TABLE x RENAME TO y; 2. atomic_updates = no Huh? Besides being paranoid about fsync()'ing transactions how is a transaction based MVCC not atomic with respect to updates? 3.
automatic_rowid = no The description simply says Automatic rowid. Does this apply to query result sets or to the underlying relation? If the latter, PostgreSQL has, of course, an OID for every tuple in the database. 4. binary_items = no Read up on large objects... 5. connections = 32 This, should, of course be +32, since PostgreSQL can easily handle hundreds of simultaneous connections. 6. create_table_select = no Again. PostgreSQL supports CREATE TABLE AS SELECT (i.e. Oracle), and SELECT INTO syntax. 7. except = no PostgreSQL has had both INTERSECT and EXCEPT since 6.5.0 (albeit they're slow). I'm starting to get very tired of this. I don't see why PostgreSQL users are obligated to get MySQL tests correct. And I'm only 15% through the list... Bottom line...either the test writers are ignorant or deceptive. Either way I won't trust my data with them... === Subject: RE: [HACKERS] Performance (was: The New Slashdot Setup (includes MySql server)) From: "Hiroshi Inoue" <Inoue@tpf.co.jp> Date: Sun, 21 May 2000 01:28:31 +0900 -----Original Message----- > From: Tom Lane [mailto:tgl@sss.pgh.pa.us] > > "Hiroshi Inoue" <Inoue@tpf.co.jp> writes: > >>>> I know another case. pg_attrdef has no index on (adrelid,attnum) > >>>> though it has an index on (adrelid). > >> > >> Doesn't look to me like we need an index on (adrelid,attnum), at > >> least not in any paths that are common enough to justify maintaining > >> another index. The (adrelid) index supports loading attrdef data > >> into the relcache, which is the only path I'm particularly concerned > >> about performance of... > > > It seems to me that an index on (adrelid,adnum) should > > exist instead of the current index. It identifies pg_attrdef. > > I say *Oops* about it in my trial implementation of ALTER > > TABLE DROP COLUMN. > > Right, I saw that. But it seems to be the only place where such an > index would be useful.
The relcache-loading routines, which seem to > be the only performance-critical access to pg_attrdef, prefer an index > on adrelid only. Is it worth maintaining a 2-column index (which is > bulkier and slower than a 1-column one) just to speed up ALTER TABLE > DROP COLUMN? > I don't mind so much about the performance in this case. The difference would be little. Isn't it a fundamental principle to define primary(unique identification) constraint for each table ? I had never thought that the only one index of pg_attrdef isn't an unique identification index until I came across the unexpected result of my DROP COLUMN test case. === Subject: Re: [HACKERS] Performance (was: The New Slashdot Setup (includes MySql server)) From: "Matthias Urlichs" <smurf@noris.net> Date: Sat, 20 May 2000 20:17:10 +0200 Hi, Mike Mascari: > > 1. alter_rename_table = no > > The syntax in PostgreSQL is ALTER TABLE x RENAME TO y; > They say "alter table crash_q rename crash_q1". What does the official standard say (assuming any exists) -- is the "to" optional or not? > 2. atomic_updates = no > > Huh? Besides being paranoid about fsync()'ing transactions how is > a transaction based MVCC not atomic with respect to updates? > That's a misnomer. They actually mean this: create table crash_q (a integer not null); create unique index crf on crash_q(a); insert into crash_q values (2); insert into crash_q values (3); insert into crash_q values (1); update crash_q set a=a+1; > 3. automatic_rowid = no > > The description simply says Automatic rowid. Does this apply to > query result sets or to the underlying relation? If the latter, > PostgreSQL has, of course, an OID for every tuple in the > database. > I'll have them fix that. MySQL calls them "_rowid" and apparently tests only for these. > 4. binary_items = no > > Read up on large objects... > ... with an ... erm ... let's call it "nonstandard" ... interface. > 5.
connections = 32 > > This, should, of course be +32, since PostgreSQL can easily > handle hundreds of simultaneous connections. > The testing code (Perl) looks like this, and it bombs after the 32nd connection. for ($i=1; $i < $max_connections ; $i++) { if (!($dbh=DBI->connect($server->{'data_source'},$opt_user,$opt_password, { PrintError => 0}))) { print "Last connect error: $DBI::errstr\n" if ($opt_debug); last; } $dbh->{LongReadLen}= $longreadlen; # Set retrieval buffer print "." if ($opt_debug); push(@connect,$dbh); } print "$i\n"; I do not know where that limit comes from. It might be the DBI interface to PostgreSQL, or a runtime limit. Anyway, $max_connections has the value to 1000. > 6. create_table_select = no > > Again. PostgreSQL supports CREATE TABLE AS SELECT (i.e. Oracle), > and SELECT INTO syntax. Test code: create table crash_q SELECT * from crash_me; Again, is the "AS" optional or not? > 7. except = no > > PostgreSQL has had both INTERSECT and EXCEPT since 6.5.0 (albeit > they're slow). > Looking at the test, we see it doing this: create table crash_me (a integer not null,b char(10) not null); insert into crash_me (a,b) values (1,'a'); create table crash_me2 (a integer not null,b char(10) not null, c integer); insert into crash_me2 (a,b,c) values (1,'b',1); select * from crash_me except select * from crash_me2; For what it's worth, there is at least one database which doesn't have this restriction (i.e., that the number of columns must be identical) (namely SOLID). So this test needs to be split into two. I'll do that. > I'm starting to get very tired of this. I don't see why > PostgreSQL users are obligated to get MySQL tests correct. And > I'm only 15% through the list... > _Somebody_ has to get these things right. I'm not suggesting that it's any obligation of yours specifically, but somebody's gotta do it, and (IMHO) it can only be done by somebody who already knows _something_ about the database to be tested.
> Bottom line...either the test writers are ignorant or deceptive. Or the tests are just badly written. Or they're too old and suffer from severe bit rot. For what it's worth, I do NOT think the people who wrote these tests are either ignorant or deceptive. Most, if not all, of these tests are OK when checked against at least one SQLish database. === Subject: Re: [HACKERS] More Performance From: "Matthias Urlichs" <smurf@noris.net> Date: Sat, 20 May 2000 20:54:20 +0200 Hi, I've found another one of these performance problems in the benchmark, related to another ignored index. The whole thing works perfectly after a VACUUM ANALYZE on the table. IMHO this is somewhat non-optimal. In the absence of information to the contrary, PostgreSQL should default to using an index if it might be appropriate, not ignore it. I am thus throwing away yet another benchmark run -- the query now runs 300 times faster. *Sigh* test=# vacuum bench1; VACUUM test=# \d bench1 Table "bench1" Attribute | Type | Modifier -----------+----------+---------- id | integer | not null id2 | integer | not null id3 | integer | not null dummy1 | char(30) | Indices: bench1_index_, bench1_index_1 test=# \d bench1_index_ Index "bench1_index_" Attribute | Type -----------+--------- id | integer id2 | integer unique btree test=# test=# test=# \d bench1_index_1 Index "bench1_index_1" Attribute | Type -----------+--------- id3 | integer btree test=# explain update bench1 set dummy1='updated' where id=150; NOTICE: QUERY PLAN: Seq Scan on bench1 (cost=0.00..6843.00 rows=3000 width=18) EXPLAIN test=# vacuum bench1; VACUUM test=# explain update bench1 set dummy1='updated' where id=150; NOTICE: QUERY PLAN: Seq Scan on bench1 (cost=0.00..6843.00 rows=3000 width=18) EXPLAIN test=# select count(*) from bench1; count -------- 300000 (1 row) test=# select count(*) from bench1 where id = 150; count ------- 1 (1 row) test=# explain select count(*) from bench1 where id = 150; NOTICE: QUERY PLAN: Aggregate
(cost=6850.50..6850.50 rows=1 width=4)
  -> Seq Scan on bench1 (cost=0.00..6843.00 rows=3000 width=4)

EXPLAIN

***************************************************************

Related to this:

test=# explain select id from bench1 order by id;
NOTICE: QUERY PLAN:

Sort (cost=38259.21..38259.21 rows=300000 width=4)
  -> Seq Scan on bench1 (cost=0.00..6093.00 rows=300000 width=4)

EXPLAIN

The basic idea to speed this one up (a lot...) would be to walk the index. This is _after_ ANALYZE, of course.

===

Subject: Re: [HACKERS] Performance (was: The New Slashdot Setup (includes MySql
From: Bruce Momjian <pgman@candle.pha.pa.us>
Date: Sat, 20 May 2000 14:56:36 -0400 (EDT)

I know I am going to regret believing that I will actually make any difference, but I am going to shoot myself anyway. I am writing this more for the new PostgreSQL members who were not around last time than in any belief it will make a difference on the MySQL end.

> Hi,
>
> Mike Mascari:
> > > 1. alter_rename_table = no
> > >
> > > The syntax in PostgreSQL is ALTER TABLE x RENAME TO y;
> >
> They say "alter table crash_q rename crash_q1".
>
> What does the official standard say (assuming any exists) -- is the "to" optional or not?

I don't see any RENAME in the SQL92 spec. Now, how hard is it to do a 'man alter_table' and see what it says at the top of the screen?

> > > 2. atomic_updates = no
> > >
> > > Huh? Besides being paranoid about fsync()'ing transactions how is a transaction based MVCC not atomic with respect to updates?
> >
> That's a misnomer. They actually mean this:
>
> create table crash_q (a integer not null);
> create unique index crf on crash_q(a);
>
> insert into crash_q values (2);
> insert into crash_q values (3);
> insert into crash_q values (1);
> update crash_q set a=a+1;

Poorly named, huh? How do you think it got such a name? This item was on the crashme tests before TRANSACTION was. Can you explain how a very exotic issue got on there year(s) before transactions?
Transactions got on there only because I complained.

> > > 3. automatic_rowid = no
> > >
> > > The description simply says Automatic rowid. Does this apply to query result sets or to the underlying relation? If the latter, PostgreSQL has, of course, an OID for every tuple in the database.
> >
> I'll have them fix that. MySQL calls them "_rowid" and apparently tests only for these.

Well, I don't see _rowid in the SQL spec either, so we are both non-standard here, though I believe our OID is SQL3.

> > > 4. binary_items = no
> > >
> > > Read up on large objects...
> >
> ... with an ... erm ... let's call it "nonstandard" ... interface.

Yes.

> > > 5. connections = 32
> > >
> > > This should, of course, be +32, since PostgreSQL can easily handle hundreds of simultaneous connections.
> >
> The testing code (Perl) looks like this, and it bombs after the 32nd connection.
>
> for ($i=1; $i < $max_connections ; $i++)
> {
>   if (!($dbh=DBI->connect($server->{'data_source'},$opt_user,$opt_password,
>         { PrintError => 0})))
>   {
>     print "Last connect error: $DBI::errstr\n" if ($opt_debug);
>     last;
>   }
>   $dbh->{LongReadLen}= $longreadlen; # Set retrieval buffer
>   print "." if ($opt_debug);
>   push(@connect,$dbh);
> }
> print "$i\n";
>
> I do not know where that limit comes from. It might be the DBI interface to PostgreSQL, or a runtime limit.
>
> Anyway, $max_connections is set to 1000.

You have to recompile the backend to increase it. Not on the client end. See FAQ.

> > > 6. create_table_select = no
> > >
> > > Again. PostgreSQL supports CREATE TABLE AS SELECT (i.e. Oracle), and SELECT INTO syntax.
>
> Test code:
> create table crash_q SELECT * from crash_me;
>
> Again, is the "AS" optional or not?

man create_table. That is all it takes. There is no standard for this. It is from Oracle. Is their AS optional? Does it really matter?

> > > 7. except = no
> > >
> > > PostgreSQL has had both INTERSECT and EXCEPT since 6.5.0 (albeit they're slow).
> >
> Looking at the test, we see it doing this:
>
> create table crash_me (a integer not null,b char(10) not null);
> insert into crash_me (a,b) values (1,'a');
> create table crash_me2 (a integer not null,b char(10) not null, c integer);
> insert into crash_me2 (a,b,c) values (1,'b',1);
> select * from crash_me except select * from crash_me2;
>
> For what it's worth, there is at least one database (namely SOLID) which doesn't have this restriction (i.e., that the number of columns must be identical).
>
> So this test needs to be split into two. I'll do that.

So you test EXCEPT by having a different number of columns. I can see it now, "Hey we don't have EXCEPT. PostgreSQL does it, but they can't handle a different number of columns. Let's do only that test so we look equal."

> > I'm starting to get very tired of this. I don't see why PostgreSQL
> > users are obligated to get MySQL tests correct. And I'm only 15%
> > through the list...
>
> _Somebody_ has to get these things right. I'm not suggesting that it's any obligation of yours specifically, but somebody's gotta do it, and (IMHO) it can only be done by somebody who already knows _something_ about the database to be tested.
>
> > Bottom line...either the test writers are ignorant or deceptive.
>
> Or the tests are just badly written. Or they're too old and suffer from severe bit rot.
>
> For what it's worth, I do NOT think the people who wrote these tests are either ignorant or deceptive. Most, if not all, of these tests are OK when checked against at least one SQLish database.

In looking at each of these items, it is impossible for me to believe that the tests were not written by either very ignorant people ("I can't run 'man'") or very deceptive people ("Let's make ourselves look good."). If you view this from outside the MySQL crowd, can you see how we would feel this way? This is just a small example of the volumes of reasons we have for believing this.
If you are going to publish things about other databases on your web site, you had better do a reasonable job of seeing that it is accurate and fair. If it can't be done, take it off. Don't leave it up and have it be wrong, and ignore people in the past who tell you it is wrong.

It never has been fair, and I suspect never will be, because this is hashed around every year with little change or acknowledgement.

So, yea, we have an attitude. We are usually nice folks, so if the majority of us have a bad attitude, there must be some basis for that feeling, and I can tell you, the majority of us do have a bad attitude on the topic. The MySQL folks don't have a bad attitude about us, and you know why they don't: we never did anything like this crashme to them. But actually, we are tired of being pushed around by an ignorant/deceptive crashme test, and we are starting to push back. But, you can be sure we will never stoop to the level of the crashme test.

===

Subject: Re: [HACKERS] More Performance
From: Bruce Momjian <pgman@candle.pha.pa.us>
Date: Sat, 20 May 2000 15:17:56 -0400 (EDT)

> Hi,
>
> I've found another one of these performance problems in the benchmark, related to another ignored index.
>
> The whole thing works perfectly after a VACUUM ANALYZE on the table.
>
> IMHO this is somewhat non-optimal. In the absence of information to the contrary, PostgreSQL should default to using an index if it might be appropriate, not ignore it.

This is an interesting idea. So you are saying that if a column has no vacuum analyze statistics, assume it is unique? Or are you talking about a table that has never been vacuumed? Then we assume it is a large table. Interesting. It would help some queries, but hurt others. We have gone around and around on what the default stats should be. Tom Lane can comment on this better than I can.
>
> Related to this:
>
> test=# explain select id from bench1 order by id;
> NOTICE: QUERY PLAN:
>
> Sort (cost=38259.21..38259.21 rows=300000 width=4)
>   -> Seq Scan on bench1 (cost=0.00..6093.00 rows=300000 width=4)
>
> EXPLAIN
>
> The basic idea to speed this one up (a lot...) would be to walk the index.
>
> This is _after_ ANALYZE, of course.

But you are grabbing the whole table. Our indexes are separate files. The heap is unordered, meaning a sequential scan and order by is usually faster than an index walk unless there is a restrictive WHERE clause.

Thanks for the tip about needing an index on pg_index. That will be in 7.1. I remember previous crashme rounds did bring up some good info for us, like the fact older releases couldn't handle trailing comments from perl.

===

Subject: Re: [HACKERS] Performance (was: The New Slashdot Setup (includes MySql server))
From: "Matthias Urlichs" <smurf@noris.net>
Date: Sat, 20 May 2000 22:26:40 +0200

Hi,

[ Sorry if this reply is much too long. I know that...]

Bruce Momjian:
> I know I am going to regret believing that I will actually make any difference, but I am going to shoot myself anyway.

I sincerely hope/believe you're wrong.

> > What does the official standard say (assuming any exists) -- is the "to" optional or not?
>
> I don't see any RENAME in the SQL92 spec. Now, how hard is it to do a 'man alter_table' and see what it says at the top of the screen?

It's not a question of your manpage vs. their manpage. I can read your manpage just fine. It's a question of whether there is something that can be regarded as a standard on it or not. "Official" is a poor wording in this case -- sorry.

If yes, then the test will be changed to do it the standard way. If no, then I might have to test for both syntaxes, which is a PITA.

While I'm at it, I note that the last sentence of that manpage says

    The clauses to rename columns and tables are Postgres extensions from SQL92.
Correct me when I'm wrong, but is that really _your_ extension, or did some other database vendor (who?) come up with it?

> > Anyway, $max_connections is set to 1000.
>
> You have to recompile the backend to increase it. Not on the client end. See FAQ.

I was compiling and running the backend with default options(*). That means that the tests will show the default limits. It does this for all the other databases in the crash-me test result suite. (Oracle: 40, mysql: 100, solid: 248, empress: 10, interbase: 10)

Anyway, the max value for PostgreSQL, without recompiling the backend, is 1024 according to the FAQ; but there's no way an automated test can find out _that_. I'll add a comment ("installation default") to that test column.

(*) except for fsync, of course, in the interest of fairness.

> man create_table. That is all it takes. There is no standard for this. It is from Oracle. Is their AS optional? Does it really matter?

No. What matters is that your opinion is that they are responsible for making the test 100% accurate. Their reply to that is that many database vendors actually provided fixes for this test instead of bitching about how inaccurate it is, thus they feel the obligation is on your side.

Now I am on neither side. I am, IMHO, thus in a position to ask you about your opinion of these inaccuracies. I am going to change the crashme test to be a whole lot more accurate WRT PostgreSQL, I will feed these changes back to the MySQL people, and they'll incorporate these changes into their next release. (Their head honcho (Monty) has said so on their mailing list. I _am_ going to take him up on it, and I can be quite obnoxious if somebody reneges on a promise. *EVIL*GRIN* )

If your opinion is that you have a right to be annoyed about all of this because you went through the whole thing last year, and the year before that, and ..., ... well, I can understand your point of view.
But I honestly think that the problem is not one of either malice or stupidity. "Different sets of priorities" and "different project structure" are equally-valid assumptions. At least for me. Until I'm proven wrong (IF I am). > So you test EXCEPT by having a different number of columns. I can see > it now, "Hey we don't have EXCEPT. PostgreSQL does it, but they can't > handle a different number of columns. Let's do only that test so we > look equal." > They might as well have written that test while checking their crash-me script against SOLID and noting a few features MySQL doesn't have yet. Or they may have gotten it from them in the first place. I might add that their test lists 52 features of PostgreSQL which MySQL doesn't have (13 functions). It also lists 122 features of MySQL which PostgreSQL doesn't have; 78 of those are extra functions (40 of these, just for M$-ODBC compatibility). So it seems that overall, that crash-me test result is reasonably balanced (39 vs. 44 non-function differences -- let's face it, adding another function for compatibility with SQL variant FOO is one of the easier exercises here, whatever the current value of FOO is). The result is going to be even more balanced when I'm through with it, but I cannot do that on my own, as I do not have enough experience with either PostgreSQL or the various SQL standards. Thus, I'm asking. Is that really a problem? > If you view this from outside the MySQL crowd, can you see how we would > feel this way? This is just a small example of the volumes of reasons > we have in believing this. > I would like not to view this from any outside, inside, or whatever viewpoint. My goal is to get at least some part of the petty arguments out of the way because, in MY book at least, the _real_ "battle", such as there is, isn't PostgreSQL against MySQL! 
It's more-or-less-open-source databases on one side and closed-source products, some of which are the equivalent of MS Word in the database world (you know who I'm talking about ;-) on the other side.

> It never has been fair, and I suspect never will be, because this is hashed around every year with little change or acknowledgement.

It is about as fair as a certain comparison chart on your site has been. It's gone now, thus as far as I'm concerned it's water under the bridge. Besides, I'm not interested.

Some of the members of this list seem to be pretty much burned out on the whole issue -- I can live with that; but I'm trying to do something about the problem. Don't shoot the messenger. ;-)

===

Subject: MySQL's "crashme" (was Re: [HACKERS] Performance)
From: Tom Lane <tgl@sss.pgh.pa.us>
Date: Sat, 20 May 2000 16:26:03 -0400

"Matthias Urlichs" <smurf@noris.net> writes:
>> Hmm. And then whose job is it to take someone else's work and make it accurate? If the shoe were on the other foot: if I generated a benchmark suite and features list, and it contained major and numerous inaccuracies, who would you expect to be responsible (or at least feel responsible) for correcting/updating/improving it? 'Twould be me imho.
>
> Umm, there's still a difference between saying (a) "it's broken, fix it", (b) "here's my analysis as to what exactly is broken, can you fix it", and (c) "here's a patch that fixes it".

Good luck. Close analysis of the crashme test leaves an extremely bad taste in the mouth: there are just too many cases where it's clearly designed as a pro-MySQL advertising tool and not an honest attempt to describe reality. Shall we consider details?

> Attached is the current crashme output. "crash_me_safe" is off only because of the fact that some tests go beyond available memory.
> There's no sense in testing how far you can push a "SELECT a from b where c = 'xxx(several megabytes worth of Xes)'" query when the size of a TEXT field is limited to 32k.

I would not like to see us labeled "crashme unsafe" merely because someone is too impatient to let the test run to conclusion. But there's a more interesting problem here: using stock crashme and Postgres 7.0, on my system it's crashme that crashes and not Postgres! The crashme Perl script is a huge memory hog and runs into the kernel's process-size limit long before the connected backend does. To get it to run to completion, I have to reduce the thing's limit on the longest query it will try:

*** crash-me~   Sat May 20 12:28:11 2000
--- crash-me    Sat May 20 13:21:11 2000
***************
*** 104,110 ****
  #
  $max_connections="+1000";     # Number of simultaneous connections
! $max_buffer_size="+16000000"; # size of communication buffer.
  $max_string_size="+8000000";  # Enough for this test
  $max_name_length="+512";      # Actually 256, but ...
  $max_keys="+64";              # Probably too big.
--- 104,110 ----
  #
  $max_connections="+1000";     # Number of simultaneous connections
! $max_buffer_size="+1000000";  # size of communication buffer.
  $max_string_size="+8000000";  # Enough for this test
  $max_name_length="+512";      # Actually 256, but ...
  $max_keys="+64";              # Probably too big.

A few months ago I was able to use max_buffer_size = +2000000, but crashme 1.43 seems to be an even worse memory hog than its predecessors. At this setting, the Perl process tops out at about 114Mb while the connected backend grows to 66Mb. (I run with a process limit of 128Mb.) To be fair, this could be Perl's fault more than crashme's. I'm using Perl 5.005_03 ... anyone know if more recent versions use less memory?
Now, on to some specific complaints:

> alter_drop_col=no # Alter table drop column

While our ALTER TABLE support is certainly pretty weak, it should be noted that this test will continue to fail even when we have ALTER TABLE DROP COLUMN, because crashme is testing for a non-SQL-compliant syntax.

> alter_rename_table=no # Alter table rename table

We have ALTER TABLE RENAME ... but not under the syntax crashme is testing. Since SQL92 doesn't specify a syntax for RENAME, there's no absolute authority for this --- but a quick check of the comparative crashme results at http://www.mysql.com/crash-me-choose.htmy shows that *none* of the major commercial DBMSs "pass" this test. Rather curious that crashme uses a MySQL-only syntax for this test, no?

> atomic_updates=no # atomic updates

What's actually being tested here is whether the DBMS will let you do "update crash_q set a=a+1" in a table with a unique index on "a" and consecutive pre-existing values. In other words, is the uniqueness constraint checked on a per-tuple-update basis, or deferred to end of transaction? It's fair to blame Postgres for not supporting a deferred uniqueness check, but this test is extremely misleadingly labeled. A person who hadn't examined the guts of crashme would probably think it tests whether concurrent transactions see each others' results atomically.

> automatic_rowid=no # Automatic rowid

Test is actually looking for a system column named "_rowid". Our OIDs serve the same purpose, and I believe there are equivalent features in many other DBMSes. Again, MySQL is the only "passer" of this test, which says more about their level of standardization than other people's.

> binary_items=no # binary items (0x41)

We have binary literals (per the test name) and hex literals (what it actually appears to be testing). Unfortunately for us, ours are SQL92-compliant syntax, and what crashme is looking for isn't.
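[Editor's note: Tom's reading of the atomic_updates item — per-row versus end-of-statement uniqueness checking — can be reproduced outside PostgreSQL. A minimal sketch, using SQLite through Python's bundled sqlite3 module as a stand-in engine (not part of the thread), which also checks the constraint per updated row:]

```python
import sqlite3

# Recreate the crashme "atomic_updates" probe.
con = sqlite3.connect(":memory:")
con.execute("create table crash_q (a integer not null)")
con.execute("create unique index crf on crash_q (a)")
con.executemany("insert into crash_q values (?)", [(2,), (3,), (1,)])

# Bumping every value by one is conflict-free *if* uniqueness is checked
# at end of statement. A per-row check trips over the transient
# duplicate: updating 2 -> 3 collides with the not-yet-updated 3.
try:
    con.execute("update crash_q set a = a + 1")
    per_row_check = False
except sqlite3.IntegrityError:
    per_row_check = True

print(per_row_check)   # True: constraint enforced per tuple update
```

Nothing about "atomicity" in the concurrent-transaction sense is exercised at all, which is Tom's point about the label.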
> comment_#=no # # as comment
> comment_--=yes # -- as comment
> comment_/**/=yes # /* */ as comment
> comment_//=no # // as comment

It'd be helpful to the reader if they indicated which two of these conventions are SQL-compliant ... of course, that might expose the fact that MySQL isn't ...

> connections=32 # Simultaneous connections

Should probably be noted that this is just the default limit (chosen to avoid creating problems on small systems) and can easily be raised at postmaster start time.

> crash_me_safe=no # crash me safe

I get "yes", and I'd *really* appreciate it if you not submit this misleading statement.

> create_table_select=no # create table from select

This is looking for "create table crash_q SELECT * from crash_me", which again appears to be a MySQL-only syntax. We have the same feature but we want "AS" in front of the "SELECT". Dunno how other DBMSs do it.

> date_zero=no # Supports 0000-00-00 dates

Note this is not checking to see if the date format yyyy-mm-dd is accepted, it's checking to see if the specific value '0000-00-00' is accepted. Haven't these people heard of NULL? Another test that only MySQL "passes".

> except=no # except

This test is checking:

create table crash_me (a integer not null,b char(10) not null);
create table crash_me2 (a integer not null,b char(10) not null, c integer);
select * from crash_me except select * from crash_me2;

Postgres rejects it with

ERROR: Each UNION | EXCEPT | INTERSECT query must have the same number of columns.

Unsurprisingly, hardly anyone else accepts it either.

> except_all=no # except all

While we do not have "except all", when we do this test will still fail for the same reason as above.

> func_extra_not=no # Function NOT in SELECT

What they are looking for here is "SELECT NOT 0", which Postgres rejects as a type violation. SQL-compliant "NOT FALSE" would work.
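[Editor's note: the create_table_select point above — whether the "AS" is optional — is easy to check against another engine. A quick sketch with SQLite (Python's bundled sqlite3, used here only as an illustrative stand-in for the engines in the thread): the AS form works, and MySQL's AS-less spelling is a syntax error, just as in PostgreSQL:]

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("create table crash_me (a integer, b text)")
con.execute("insert into crash_me values (1, 'a')")

# MySQL-style, without AS -- a syntax error here too:
try:
    con.execute("create table crash_q select * from crash_me")
    as_optional = True
except sqlite3.OperationalError:
    as_optional = False

# The spelling with AS works and copies the data:
con.execute("create table crash_q as select * from crash_me")
rows = con.execute("select * from crash_q").fetchall()
print(as_optional, rows)   # False [(1, 'a')]
```

Which is exactly why testing only the AS-less variant, and labeling the result "create table from select", is misleading.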
BTW, while I haven't got the patience to go through the function list in detail, quite a few functions that we actually have are shown as "not there" because of type resolution issues. For example they test exp() with "select exp(1)" which fails because of ambiguity about whether exp(float8) or exp(numeric) is wanted. This will get cleaned up soon, but it's not really a big problem in practice...

> having_with_alias=no # Having on alias

Again, how curious that MySQL is the only DBMS shown as passing this test. Couldn't be because it violates SQL92, could it?

> insert_select=no # insert INTO ... SELECT ...

We would pass this test if the crashme script weren't buggy: it fails to clean up after a prior test that creates a crash_q table with different column names. The prior test is testing "drop table if exists", which means the only way to be shown as having this SQL-standard feature is to implement the not-standard "if exists".

> intersect=no # intersect
> intersect_all=no # intersect all

See above comments for EXCEPT.

> logical_value=1 # Value of logical operation (1=1)

A rather odd result, considering that what Postgres actually returns for "SELECT (1=1)" is 't'. But showing the correct answer isn't one of crashme's highest priorities...

> minus_neg=no # Calculate 1--1

Another case where "passing" the test means accepting MySQL's version of reality instead of SQL92's. All the SQL-compliant DBMSs think -- is a comment introducer, so "select a--1 from crash_me" produces an error ... but not in MySQL ...

> quote_ident_with_"=no # " as identifier quote (ANSI SQL)
> quote_ident_with_[=no # [] as identifier quote
> quote_ident_with_`=no # ` as identifier quote

Here at least they admit which variant is ANSI ;-). Postgres doesn't pass because we think 'select "A" from crash_me' should look for a column named upper-case-A, but the column is actually named lower-case-a.
We are not conforming to the letter of the SQL standard here --- SQL says an unquoted name should be mapped to all upper case, not all lower case as we do it, which is how the column got to be named that way. We're closer than MySQL though...

> select_string_size=+16208 # constant string size in SELECT

I got 1048567 here, roughly corresponding to where I set max_buffer_size. Not sure why you get a smaller answer.

> select_table_update=no # Update with sub select

We certainly have update with sub select. What they're looking for is the non-SQL-compliant syntax

update crash_q set crash_q.b=
  (select b from crash_me where crash_q.a = crash_me.a);

It works in Postgres if you remove the illegal table specification:

update crash_q set b=
  (select b from crash_me where crash_q.a = crash_me.a);

> type_sql_bit=yes # Type bit
> type_sql_bit(1_arg)=yes # Type bit(1 arg)
> type_sql_bit_varying(1_arg)=yes # Type bit varying(1 arg)

It should probably be noted that we only have syntax-level support for BIT types in 7.0; they don't actually work. The test is not deep enough to notice that, however.

General comments: It appears that they've cleaned up their act a little bit. The last time I examined crashme in any detail, there was an even longer list of tests that checked for standard features but were careful to use a nonstandard variant so they could claim that other people failed to have the feature at all.

More generally, it's difficult to take seriously a test method and presentation method that puts more weight on how many variant spellings of "log()" you accept than on whether you have subselects. (I count five entries versus two.) One could also complain about the very large number of tests that are checking features that are non-SQL if not downright SQL-contradictory, but are listed simply as bullet points with no pro or con. A naive reader would think that green stars are always good; they are not, but how are you to tell without a copy of the SQL spec in hand?
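[Editor's note: the two UPDATE spellings contrasted in the select_table_update item above behave the same way in SQLite (Python's bundled sqlite3, once more only a stand-in engine, not something the thread tested): the qualified column name in SET is rejected, while the plain column name with a correlated sub-select works:]

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    create table crash_me (a integer, b text);
    insert into crash_me values (1, 'new');
    create table crash_q (a integer, b text);
    insert into crash_q values (1, 'old');
""")

# crashme's spelling, with the illegal "crash_q." qualifier in SET:
try:
    con.execute("update crash_q set crash_q.b ="
                " (select b from crash_me where crash_q.a = crash_me.a)")
    qualified_ok = True
except sqlite3.OperationalError:
    qualified_ok = False

# The legal spelling, correlated sub-select and all:
con.execute("update crash_q set b ="
            " (select b from crash_me where crash_q.a = crash_me.a)")
rows = con.execute("select b from crash_q").fetchall()
print(qualified_ok, rows)   # False [('new',)]
```

So a DBMS that implements the feature perfectly well still scores "no" on this test, which is Tom's complaint.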
Finally, the test coverage seems to have been designed with an eye towards giving MySQL as many green stars as possible, not towards exercising the most important features of SQL. It would be interesting to see considerably more coverage of subselects, for example, and I expect that'd turn up shortcomings in a number of products including Postgres. But it won't happen as long as crashme is a tool of, by, and for MySQL partisans (at least not till MySQL has subselects, whereupon the test coverage will no doubt change).

Just FYI, I attach a diff between what you presented and what I get from running the current crashme. I don't understand exactly what's causing the small differences in the values of some of the size limits. Perhaps it is a side effect of using a different max_buffer_size, but it seems really weird.

regards, tom lane

[Attachment: Pg.cfg.diff]

37c37
< crash_me_safe=no # crash me safe
---
> crash_me_safe=yes # crash me safe
309c309
< max_char_size=8104 # max char() size
---
> max_char_size=8088 # max char() size
315c315
< max_index_length=2704 # index length
---
> max_index_length=2700 # index length
317c317
< max_index_part_length=2704 # max index part length
---
> max_index_part_length=2700 # max index part length
319,321c319,321
< max_index_varchar_part_length=2704 # index varchar part length
< max_row_length=7949 # max table row length (without blobs)
< max_row_length_with_null=7949 # table row length with nulls (without blobs)
---
> max_index_varchar_part_length=2700 # index varchar part length
> max_row_length=7937 # max table row length (without blobs)
> max_row_length_with_null=7937 # table row length with nulls (without blobs)
326c326
< max_text_size=8104 # max text or blob size
---
> max_text_size=8092 # max text or blob size
328c328
< max_varchar_size=8104 # max varchar() size
---
> max_varchar_size=8088 # max varchar() size
344c344
< operating_system=Linux 2.3.99s-noris-pre9-2 i686 # crash-me tested on
---
> operating_system=HP-UX B.10.20 9000/780 # crash-me tested on
355c355
< query_size=16777216 # query size
---
> query_size=1048576 # query size
369c369
< select_string_size=+16208 # constant string size in SELECT
---
> select_string_size=1048567 # constant string size in SELECT
490c490
< where_string_size=+16208 # constant string size in where
---
> where_string_size=1048542 # constant string size in where

===

Subject: Re: [HACKERS] More Performance
From: "Matthias Urlichs" <smurf@noris.net>
Date: Sat, 20 May 2000 22:30:21 +0200

Hi,

Bruce Momjian:
> > > test=# explain select id from bench1 order by id;
> > Sort (cost=38259.21..38259.21 rows=300000 width=4)
> >   -> Seq Scan on bench1 (cost=0.00..6093.00 rows=300000 width=4)
>
> The heap is unordered, meaning a sequential scan and order by is usually faster than an index walk unless there is a restrictive WHERE clause.

What heap? The index is a b-tree in this case. Thus you should be able to walk it and get the sorted result without ever touching the data file.

Whether that makes sense with the current structure of the PostgreSQL backend is a different question, of course. Certain other databases (no, not just MySQL ;-) are capable of doing that optimization, however.

===

Subject: Re: [HACKERS] More Performance
From: "Matthias Urlichs" <smurf@noris.net>
Date: Sat, 20 May 2000 22:40:43 +0200

Hi,

Bruce Momjian:
> > IMHO this is somewhat non-optimal. In the absence of information to the contrary, PostgreSQL should default to using an index if it might be appropriate, not ignore it.
>
> This is an interesting idea. So you are saying that if a column has no vacuum analyze statistics, assume it is unique?

Nope. But why should vacuum analyze be the one and only part of PostgreSQL where statistics are ever updated?
When you have no statistics and a "column_name=CONSTANT" query for an indexed column yields exactly one result (actually, "significantly fewer results than there are 8-kbyte records in the table" would do), you might want to record the fact that using the index might, in hindsight, have been a good idea after all. Then, when the next query like that comes in, you use the index.

Maybe I'm too naive ;-) but I fail to see how this approach could be either hard to implement or detrimental to performance.

===

Subject: Re: [HACKERS] More Performance
From: Bruce Momjian <pgman@candle.pha.pa.us>
Date: Sat, 20 May 2000 16:43:57 -0400 (EDT)

> Hi,
>
> Bruce Momjian:
> > > > test=# explain select id from bench1 order by id;
> > > Sort (cost=38259.21..38259.21 rows=300000 width=4)
> > >   -> Seq Scan on bench1 (cost=0.00..6093.00 rows=300000 width=4)
> >
> > The heap is unordered, meaning a sequential scan and order by is usually faster than an index walk unless there is a restrictive WHERE clause.
>
> What heap? The index is a b-tree in this case. Thus you should be able to walk it and get the sorted result without ever touching the data file.
>
> Whether that makes sense with the current structure of the PostgreSQL backend is a different question, of course. Certain other databases (no, not just MySQL ;-) are capable of doing that optimization, however.

We can't read data from the index. It would be nice if we could, but we can't. I think we believe that there are very few cases where this would be a win. Usually you need non-indexed data too.

===

Subject: Re: [HACKERS] Performance (was: The New Slashdot Setup (includes MySql server))
From: Tom Lane <tgl@sss.pgh.pa.us>
Date: Sat, 20 May 2000 17:44:45 -0400

"Matthias Urlichs" <smurf@noris.net> writes:
>> So you test EXCEPT by having a different number of columns. I can see it now, "Hey we don't have EXCEPT. PostgreSQL does it, but they can't handle a different number of columns.
Let's do only that test so we look equal."
>
> They might as well have written that test while checking their crash-me script against SOLID and noting a few features MySQL doesn't have yet. Or they may have gotten it from them in the first place.

Our gripe is not that they're testing an extension we haven't got. It's that the test result is misleadingly labeled. It doesn't say "EXCEPT with incompatible select lists", it says "EXCEPT", full stop. That's deceptive. And no, we do not think it's an honest mistake. It's part of a consistent pattern of misstatements that's been going on for a long time. Sure, any one might be an honest mistake, but when you see the same sort of thing over and over again, your credulity drops to a low level. crashme is designed to make MySQL look good and everyone else (not just Postgres) look bad.

I'm glad to hear your optimism about cleaning this up. Perhaps you can actually accomplish something, but most of us decided long ago that crashme is not meant as a fair comparison. We have other things to do than ride herd on crashme and try to keep them to the straight and narrow, when they clearly have no desire to make it an unbiased test and will not do so without constant prodding.

regards, tom lane

===

Subject: Re: [HACKERS] More Performance
From: Tom Lane <tgl@sss.pgh.pa.us>
Date: Sat, 20 May 2000 21:13:49 -0400

Bruce Momjian <pgman@candle.pha.pa.us> writes:
>> What heap? The index is a b-tree in this case. Thus you should be able to walk it and get the sorted result without ever touching the data file.

> We can't read data from the index. It would be nice if we could, but we can't.

The reason we can't is that we don't store tuple validity data in indexes. The index entry has the key value and a pointer to the tuple in the main heap file, but we have to visit the tuple to find out whether it's committed or dead.
If we did otherwise, then committing or killing tuples would be lots slower than it is, because we'd have to find and mark all the index entries pointing at the tuple, not just the tuple itself. It's a tradeoff... but we think it's a good one. > I think we believe that there are very few cases where this > would be a win. Usually you need non-indexed data too. Right, non-toy examples usually read additional data columns anyway. regards, tom lane === Subject: Re: [HACKERS] More Performance From: Bruce Momjian <pgman@candle.pha.pa.us> Date: Sat, 20 May 2000 21:59:04 -0400 (EDT) > > We can't read data from the index. It would be nice if we could, but we > > can't. I think we believe that there are very few cases where this > > would be a win. Usually you need non-indexed data too. > > I have used other databases where this _is_ possible in the past, and > the win is big when the programmer codes for it. Sure, most cases don't > just use indexed data, but if the programmer knows that the database > supports index-only scans then sometimes an extreme performance > requirement can be met. > Yes, totally true. It is an extreme optimization. In Ingres, you could actually SELECT on the index and use that when needed. === Subject: Re: [HACKERS] Performance (was: The New Slashdot Setup (includes MySql From: Bruce Momjian <pgman@candle.pha.pa.us> Date: Sat, 20 May 2000 21:57:50 -0400 (EDT) > Our gripe is not that they're testing an extension we haven't got. > It's that the test result is misleadingly labeled. It doesn't say > "EXCEPT with incompatible select lists", it says "EXCEPT", full stop. > That's deceptive. And no, we do not think it's an honest mistake. > It's part of a consistent pattern of misstatements that's been going on > for a long time. Sure, any one might be an honest mistake, but when you > see the same sort of thing over and over again, your credulity drops to > a low level.
crashme is designed to make MySQL look good and everyone > else (not just Postgres) look bad. > > I'm glad to hear your optimism about cleaning this up. Perhaps you > can actually accomplish something, but most of us decided long ago > that crashme is not meant as a fair comparison. We have other things > to do than ride herd on crashme and try to keep them to the straight > and narrow, when they clearly have no desire to make it an unbiased > test and will not do so without constant prodding. The basic issue is that you can tell us that this big crashme mess happened by mistake, and that there was no deceptive intent. However, really, we are not stupid enough to believe it. Why don't you find out who wrote this thing, and ask them what they were thinking when they wrote it? I bet you will find out our perception is correct. === Subject: [HACKERS] MySQL crashme test and PostgreSQL From: Bruce Momjian <pgman@candle.pha.pa.us> Date: Sat, 20 May 2000 23:13:46 -0400 (EDT) [CC: to general list.] > > > What does the official standard say (assuming any exists) -- is the "to" > > > optional or not? > > > > I don't see any RENAME in the SQL92 spec. Now, how hard is it to do a > > 'man alter_table' and see what it says at the top of the screen? > > > It's not a question of your manpage vs. their manpage. I can read your > manpage just fine. It's a question of whether there is something that > can be regarded as a standard on it or not. "Official" is a poor wording > in this case -- sorry. > > If yes, then the test will be changed to do it the standard way. > If no, then I might have to test for both syntaxes, which is a PITA. > You know, you are asking what syntax is SQL standard. It is actually not our job to report it to you. If you are responsible for the test, you should know what the standard says, and test against that.
If you are not responsible for the test, then it shows that the person who is responsible for the test doesn't care enough to test for SQL standard syntax, only for MySQL syntax. You know, there is a saying, "Do it right, or don't do it at all." That is pretty much the PostgreSQL style. And if you are going to criticize someone, you better be sure you are right. We didn't write the crashme test, we don't host it on our web site, we didn't ask to be in it. Someone has to be responsible for the test, and knowing standard SQL syntax, and that must be whoever put it on the MySQL site. We really don't want to hear that it dropped from the sky and landed on the MySQL site, and no one there is responsible for it. If we put something on our site, we are responsible for it. If we don't like something or can't take ownership of it, we remove it. Now, I am not picking on you. You may have the best of intentions. But basically someone has decided to put it on the MySQL site, and has not considered it worth their while to learn the SQL standard. They would rather make other people tell them about the SQL standard, and maybe, just maybe, we will fix the test someday. Well, I will tell you, we have better things to do than fix the MySQL crashme test. > What matters is that your opinion is that they are responsible for making > the test 100% accurate. Their reply to that is that many database > vendors actually provided fixes for this test instead of bitching > about how inaccurate it is, thus they feel the obligation is on your > side. BINGO! You know, if other database vendors are stupid enough to do MySQL's work for them and read the SQL standard for them, well... You can't just point fingers and say no one at MySQL is responsible. The MySQL bias is written all through that test. > Now I am of neither side. 
I am, IMHO, thus in a position to ask you > about your opinion of these inaccuracies, I am going to change > the crashme test to be a whole lot more accurate WRT PostgreSQL, > I will feed these changes back to the MySQL people, and they'll > incorporate these changes into their next release. (Their head honcho > (Monty) has said so on their mailing list. I _am_ going to take him up > on it, and I can be quite obnoxious if somebody reneges on a promise. > *EVIL*GRIN* ) You know, how do we know he is not just saying that hoping no one will actually take him up on it? You know, Monty was on this list last year, and he asked why we had a bad attitude about MySQL, and we told him about the crashme test, and you know, nothing happened. So I don't think it is very important to Monty to be fair, or more accurately, he would rather keep a test that makes MySQL look good than spend time making the test fair. He made his choice. I can tell you our reaction would be totally different. > I might add that their test lists 52 features of PostgreSQL which > MySQL doesn't have (13 functions). It also lists 122 features of MySQL > which PostgreSQL doesn't have; 78 of those are extra functions (40 of > these, just for M$-ODBC compatibility). > > So it seems that overall, that crash-me test result is reasonably > balanced (39 vs. 44 non-function differences -- let's face it, adding > another function for compatibility with SQL variant FOO is one of the > easier exercises here, whatever the current value of FOO is). You have to make the test deceptive to get MySQL to be on par with PostgreSQL. Period. Doesn't MySQL admit they have fewer features than PostgreSQL? How did MySQL get an equal score on features? Answer me that one. We have given enough of our time to this, and have pointed out many problems.
Why don't you go and get those fixed, to show that the MySQL group is working in good faith on this, and then, go and get a copy of the standard, or a book about standard SQL, and start actually doing something about the test. And if it is not worth your time, and it is not worth anyone else's time at MySQL, then you folks have to admit you want to criticize PostgreSQL without spending time to be fair about it. I am going to suggest that no one else in the PostgreSQL group send any more problem reports about the crashme tests until some changes appear on the MySQL end. Tom Lane has already done a great job of illustrating the issues involved. Pointing to actual SQL items is not the real problem. The MySQL attitude about crashme is the problem. Also, I have heard about the hit squads attacking MySQL. I never condone inaccuracy or attacks, but I can understand why it is happening. For years, I believe the deceptiveness of the MySQL crashme test has hampered acceptance of PostgreSQL. And our response was to just reply with our opinion when asked about it. We didn't create a web page to attack MySQL and make them look bad. We believed that in the end, truth always wins. So we kept going, and you know, in the end, truth does win. We have a $25 million company forming around PostgreSQL, with maybe more to come. We are on our way up, even though the MySQL crashme test delayed us. And there is a saying "If you are not nice to people on your way up, they will not be nice to you on the way down." I bet the hit squads are frustrated people who have seen unfair things said about PostgreSQL for years, with nothing they could do about it. Now they can do something, and they are talking. But instead of one web page with deceptive results, you have 100 people all over the net slamming MySQL. There is a certain poetic justice in that. The saying goes, "Oh what a tangled web we weave, When first we practice to deceive".
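For reference, the EXCEPT construct disputed earlier in this thread is ordinary SQL set difference when both select lists are compatible. A minimal sketch of the standard form, run through Python's bundled sqlite3 module purely for illustration (the table names t1/t2 are made up; any SQL92-style engine accepts the same query):

```python
# Standard SQL EXCEPT: set difference over two SELECTs whose select
# lists are compatible. sqlite3 is used here only as a convenient,
# always-available SQL engine.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table t1 (a integer);
    create table t2 (a integer);
    insert into t1 values (1), (2), (3);
    insert into t2 values (2);
""")

# Rows in t1 that are not in t2.
rows = conn.execute("select a from t1 except select a from t2").fetchall()
print(sorted(rows))  # [(1,), (3,)]
```

The crashme variant with a *different* number of columns in each select list is not this statement; it is a syntax error under SQL92, which is exactly the labeling complaint above.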
=== Subject: Re: [HACKERS] Performance (was: The New Slashdot Setup (includes MySql server)) From: "Matthias Urlichs" <smurf@noris.net> Date: Sun, 21 May 2000 05:06:10 +0200 Hi, Bruce Momjian: > > > > > 2. atomic_updates = no > > That's a misnomer. They actually mean this: > > > > create table crash_q (a integer not null); > > create unique index crf on crash_q(a); > > > > insert into crash_q values (2); > > insert into crash_q values (3); > > insert into crash_q values (1); > > update crash_q set a=a+1; > > Poorly named, huh? How do you think it got such a name? This item was > on the crashme tests before TRANSACTION was on there? It probably got that name because nobody thought about people associating atomicity with transactions. Anyway, the issue isn't all that exotic. ms-sql, mimer, db2, solid and sybase are listed as supporting this kind of update. If you can think of an understandable five-word-or-so description for it, I'll happily rename the test. I've been thinking about it for the last ten minutes or so, but couldn't come up with one. :-/ A different question is whether the database bungles the update when the first few rows can be updated and THEN you run into a conflict. PostgreSQL handles this case correctly, MySQL doesn't => I'll add a test for it. === Subject: [HACKERS] Re: MySQL crashme test and PostgreSQL From: Mike Mascari <mascarm@mascari.com> Date: Sat, 20 May 2000 23:37:17 -0400 Bruce Momjian wrote: > > [CC: to general list.] > > > I might add that their test lists 52 features of PostgreSQL which > > MySQL doesn't have (13 functions). It also lists 122 features of > > MySQL which PostgreSQL doesn't have; 78 of those are extra > > functions (40 of these, just for M$-ODBC compatibility). > > > > > So it seems that overall, that crash-me test result is reasonably > > balanced (39 vs.
44 non-function differences -- let's face it, > > adding another function for compatibility with SQL variant FOO is > > one of the easier exercises here, whatever the current value of > > FOO is). > > You have to make the test deceptive to get MySQL to be on par with > PostgreSQL. Period. Doesn't MySQL admit they have fewer features > than PostgreSQL? How did MySQL get an equal score on features? > Answer me that one.

That's easy:

MySQL has type mediumint
PostgreSQL has transactions

MySQL allows " and ' as string markers
PostgreSQL has views

MySQL has case insensitive compare
PostgreSQL has referential integrity

MySQL has support for 0000-00-00 dates
PostgreSQL has subqueries

MySQL has 'drop table if exists'
PostgreSQL has multiversion concurrency control

etc.

See? Equal. I hope my sarcasm is not too overstated. === Subject: Re: [HACKERS] Re: Heaps of read() syscalls by the postmaster From: "Matthias Urlichs" <smurf@noris.net> Date: Sun, 21 May 2000 06:14:15 +0200 Hi, Tom Lane: > entry construction (worst possible time). If you care to try the > repaired code, see our CVS server, or grab a nightly snapshot dated > later than this message. > Thank you, I'll do that. === Subject: [HACKERS] Re: MySQL crashme test and PostgreSQL From: "Matthias Urlichs" <smurf@noris.net> Date: Sun, 21 May 2000 06:26:46 +0200 Hi, Bruce Momjian: > Also, I have heard about the hit squads attacking MySQL. I never > condone inaccuracy or attacks, but I can understand why it is happening. > You _are_ doing your side of the story a disservice, you know that? > For years, I believe the deceptiveness of the MySQL crashme test has > hampered acceptance of PostgreSQL. And our response was to just reply > with our opinion when asked about it. Yeah, I can see that. Let me tell you up front that your opinion is not at all helpful to either the cause of PostgreSQL or to the problems between you and the MySQL people, especially when stated like this. This is the Internet.
The right thing to do if somebody spreads bad information (a biased, inaccurate, wrong, deceptive, what-have-you) crash-me test would be to write your own test which either prefers PostgreSQL, or is reasonably neutral. I'll shut up now, until the first of my patches is in the crash-me suite. Perhaps that will have _some_ impact here. === Subject: [HACKERS] Re: MySQL crashme test and PostgreSQL From: "Matthias Urlichs" <smurf@noris.net> Date: Sun, 21 May 2000 06:34:13 +0200 Hi, Mike Mascari: > MySQL has type mediumint > PostgreSQL has transactions > > MySQL allows " and ' as string markers > PostgreSQL has views Look, we all know that transaction support is more important than type mediumint or the function ODBC LENGTH or cosine or whatever. But what should I, or anybody else, do about it, in your opinion? Take the "unimportant" tests out? Invent a couple of inconsequential things PostgreSQL can do to balance the book? Repeat the "transactions=no" entry in the crash-me results file ten times? > See? Equal. I hope my sarcasm is not too overstated. Sarcasm hasn't helped in the past, and it's unlikely to help in the future. === Subject: Re: [HACKERS] More Performance From: Tom Lane <tgl@sss.pgh.pa.us> Date: Sun, 21 May 2000 00:45:58 -0400 "Matthias Urlichs" <smurf@noris.net> writes: > I've found another one of these performance problems in the benchmark, > related to another ignored index. > The whole thing works perfectly after a VACUUM ANALYZE on the > table. > IMHO this is somewhat non-optimal. In the absence of information > to the contrary, PostgreSQL should default to using an index if > it might be appropriate, not ignore it. Just FYI: Postgres absolutely does not "ignore" an index in the absence of VACUUM ANALYZE stats. However, the default assumptions about selectivity stats create cost estimates that are not too far different for index and sequential scans. On a never-vacuumed table you will get an indexscan for "WHERE col = foo".
If the table has been vacuumed but never vacuum analyzed, it turns out that you get varying results depending on the size of the table and the average tuple size (since the planner now has non-default info about the table size, but still nothing about the actual selectivity of the WHERE condition). The cost estimation code is under active development, and if you check the pgsql list archives you will find lively discussions about its deficiencies ;-). But I'm having a hard time mustering much concern about whether it behaves optimally in the vacuum-but-no-vacuum-analyze case. === Subject: Re: MySQL's "crashme" (was Re: [HACKERS] Performance) From: "Matthias Urlichs" <smurf@noris.net> Date: Sun, 21 May 2000 08:25:32 +0200 Hi, Tom Lane: > I would not like to see us labeled "crashme unsafe" merely because > someone is too impatient to let the test run to conclusion. There's not much anybody can do about a backend which dies because of a "hard" out-of-memory error which the OS only notices when all it can do is segfault the client. Anyway, I'll go through your list of problems and create an appropriate patch for the beast. Not to forget: THANK YOU for taking the time to go through this. > > alter_drop_col=no # Alter table drop column > > While our ALTER TABLE support is certainly pretty weak, it should be > noted that this test will continue to fail even when we have ALTER TABLE > DROP COLUMN, because crashme is testing for a non-SQL-compliant syntax. > You mean because the COLUMN keyword is missing? Added. > > alter_rename_table=no # Alter table rename table > > We have ALTER TABLE RENAME ... but not under the syntax crashme is > testing. TO keyword added. > > atomic_updates=no # atomic updates > Test clarified and two new tests added. The result now is: atomic_updates=no # update constraint check are deferred atomic_updates_ok=yes # failed atomic update MySQL has "no" and "no". 
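The atomic_updates item above comes down to whether unique-constraint checks are deferred to the end of the statement or applied row by row. A small sketch of the crash-me scenario using Python's bundled sqlite3 module, chosen only because it is always available (the assumption here, which matches SQLite's documented default behavior, is that it checks the constraint immediately per row, like MySQL, rather than at statement end as PostgreSQL does):

```python
# The "atomic updates" crash-me scenario: with a unique index on a and
# rows {2, 3, 1}, "update crash_q set a = a + 1" would leave the
# perfectly unique set {3, 4, 2} -- but a row-by-row constraint check
# hits a transient 3/3 collision partway through and aborts.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table crash_q (a integer not null)")
conn.execute("create unique index crf on crash_q(a)")
conn.executemany("insert into crash_q values (?)", [(2,), (3,), (1,)])

try:
    conn.execute("update crash_q set a = a + 1")
    print("deferred check: update succeeded")
except sqlite3.IntegrityError as e:
    # Immediate check: 2+1 collides with the not-yet-updated 3.
    # The failed statement is rolled back, so the table is unchanged.
    print("immediate check: update failed:", e)
```

An engine with deferred (statement-end) checking runs the same UPDATE to completion, which is what the renamed test above is trying to distinguish.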
> > automatic_rowid=no # Automatic rowid > > Test is actually looking for a system column named "_rowid". Our OIDs > serve the same purpose. I'll add a test for OIDs. > > binary_items=no # binary items (0x41) > > We have binary literals (per the test name) and hex literals (what > it actually appears to be testing). Unfortunately for us, ours are > SQL92-compliant syntax, and what crashme is looking for isn't. > I'll tell them to fix that. > > comment_#=no # # as comment > > It'd be helpful to the reader if they indicated which two of these > conventions are SQL-compliant ... of course, that might expose the > fact that MySQL isn't ... > Are there any problems with using '#' as a comment character, given that MySQL doesn't support user-defined operators? > > connections=32 # Simultaneous connections > > Should probably be noted that this is just the default limit Noted. > > crash_me_safe=no # crash me safe > > I get "yes", and I'd *really* appreciate it if you not submit this > misleading statement. > I won't submit test results (not until the thing is cleaned up to everybody's satisfaction), but I'll change the documentation of the thing to state that >>> Some of the tests you are about to execute require a lot of memory. >>> Your tests _will_ adversely affect system performance. Either this >>> crash-me test program, or the actual database back-end, _will_ die with >>> an out-of-memory error. So might any other program on your system if it >>> requests more memory at the wrong time. Good enough? > > date_zero=no # Supports 0000-00-00 dates > Another test that only MySQL "passes". ... and SOLID. > > except=no # except > Unsurprisingly, hardly anyone else accepts it either. SOLID again. I'll mark the features that are necessary for SQL compliance (as well as those that actually are detrimental to it). _After_ the actual test results are cleaned up. > What they are looking for here is "SELECT NOT 0", which Postgres rejects > as a type violation.
SQL-compliant "NOT FALSE" would work. > ... not with MySQL, which doesn't have FALSE in the first place. :-( I added another test for TRUE/FALSE, and fixed the NOT-0 thing. > > having_with_alias=no # Having on alias > > Again, how curious that MySQL is the only DBMS shown as passing this > test. Couldn't be because it violates SQL92, could it? > No, but it's an extremely nice feature to have, IMHO. I will not do anything about tests for extensions that won't hurt one way or another. Classifying them, as noted above, should be sufficient. > > insert_select=no # insert INTO ... SELECT ... > > We would pass this test if the crashme script weren't buggy: it fails > to clean up after a prior test that creates a crash_q table with > different column names. Fixed. > > logical_value=1 # Value of logical operation (1=1) > > A rather odd result, considering that what Postgres actually returns for > "SELECT (1=1)" is 't'. But showing the correct answer isn't one of > crashme's highest priorities... > > > minus_neg=no # Calculate 1--1 > > Another case where "passing" the test means accepting MySQL's version of > reality instead of SQL92's. All the SQL-compliant DBMSs think -- is a > comment introducer So does MySQL -- when the '--' is followed by a space. They do explain that in their documentation. I have to agree with them -- you can turn a perfectly legal SQL statement into dangerous nonsense with this kind of comment. >>> $dbh->do("update foo set bar = baz-$adjust where something-or-other"). That kind of mistake can be rather expensive. > > select_string_size=+16208 # constant string size in SELECT > > I got 1048567 here, roughly corresponding to where I set max_buffer_size. > Not sure why you get a smaller answer. > Note the '+'. I have changed the test to 2*max_row_size since anything bigger will return an empty answer anyway. > > select_table_update=no # Update with sub select > > We certainly have update with sub select. 
What they're looking for is > the non-SQL-compliant syntax > update crash_q set crash_q.b= > (select b from crash_me where crash_q.a = crash_me.a); Gah. Thanks; fixed. > One could also complain about the very large number of tests that are > checking features that are non-SQL if not downright SQL-contradictory, > but are listed simply as bullet points with no pro or con. A naive > reader would think that green stars are always good; they are not, > but how are you to tell without a copy of the SQL spec in hand? > I'll adapt the comments, but that is quite time consuming (and the changes are extensive) and thus will have to wait until after the first round is in one of their next alpha-test releases. === Subject: Re: MySQL's "crashme" (was Re: [HACKERS] Performance) From: Kaare Rasmussen <kar@webline.dk> Date: Sun, 21 May 2000 11:34:00 +0200 minus_neg=no # Calculate 1--1 Minus_neg expressed as select 1- -1; works. === Subject: Re: [HACKERS] Re: MySQL crashme test and PostgreSQL From: Benjamin Adida <ben@mit.edu> Date: Sun, 21 May 2000 10:31:45 -0400 on 5/21/00 12:34 AM, Matthias Urlichs at smurf@noris.net wrote: > But what should I, or anybody else, do about it, in your opinion? Take > the "unimportant" tests out? Invent a couple of inconsequential stuff > PostgreSQL can do to balance the book? Repeat the "transactions=no" > entry in the crash-me results file ten times? Take the unimportant tests out. Absolutely. Explain why the important tests are important. The MySQL team is responsible for teaching a generation of hackers that "transactions aren't important, they're just for lazy coders." The solution here looks extremely simple. The only risk, of course, is that it makes MySQL look bad, which I understand could be an unwanted outcome on your end. 
-Ben === Subject: Re: MySQL's "crashme" (was Re: [HACKERS] Performance) From: Tom Lane <tgl@sss.pgh.pa.us> Date: Sun, 21 May 2000 13:10:27 -0400 "Matthias Urlichs" <smurf@noris.net> writes: > Tom Lane: >> I would not like to see us labeled "crashme unsafe" merely because >> someone is too impatient to let the test run to conclusion. > There's not much anybody can do about a backend which dies because of a > "hard" out-of-memory error which the OS only notices when all it can do > is segfault the client. I'm just saying that it's unfair to downrate us when the problem is demonstrably in crashme itself and not in Postgres. >>>> alter_drop_col=no # Alter table drop column >> >> While our ALTER TABLE support is certainly pretty weak, it should be >> noted that this test will continue to fail even when we have ALTER TABLE >> DROP COLUMN, because crashme is testing for a non-SQL-compliant syntax. >> > You mean because the COLUMN keyword is missing? Added. No, the COLUMN keyword is optional according to the SQL92 specification:

<alter table statement> ::= ALTER TABLE <table name> <alter table action>

<alter table action> ::=
      <add column definition>
    | <alter column definition>
    | <drop column definition>
    | <add table constraint definition>
    | <drop table constraint definition>

<drop column definition> ::= DROP [ COLUMN ] <column name> <drop behavior>

<drop behavior> ::= CASCADE | RESTRICT

What is *not* optional is a <drop behavior> keyword. Although we don't yet implement DROP COLUMN, our parser already has this statement in it --- and it follows the SQL92 grammar. >>>> comment_#=no # # as comment >> >> It'd be helpful to the reader if they indicated which two of these >> conventions are SQL-compliant ... of course, that might expose the >> fact that MySQL isn't ... >> > Are there any problems with using '#' as a comment character, given that > MySQL doesn't support user-defined operators? Only in that your scripts don't port to spec-compliant DBMSes ...
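The comment-syntax portability point (and the related minus_neg item from earlier in the thread) can be seen with any SQL-style parser. A sketch using Python's bundled sqlite3 module, used here purely for illustration:

```python
# SQL comment syntaxes from the crash-me discussion, tried against
# sqlite3 (a parser that follows the SQL-style comment conventions).
import sqlite3

conn = sqlite3.connect(":memory:")

# Standard "--" and "/* */" comments are accepted:
print(conn.execute("select 1 -- trailing comment").fetchone())  # (1,)
print(conn.execute("select /* inline */ 2").fetchone())         # (2,)

# A spec-style parser treats "--" as a comment introducer even without
# surrounding whitespace, so "1--1" is just "1" followed by a comment:
print(conn.execute("select 1--1").fetchone())   # (1,)
print(conn.execute("select 1- -1").fetchone())  # (2,)

# The MySQL-style "#" comment is not standard and is rejected here:
try:
    conn.execute("select 3 # not a comment in SQL92")
except sqlite3.OperationalError:
    print("'#' comment rejected")
```

This is exactly why "minus_neg" only "fails" on spec-compliant parsers: `1--1` is one, not two, and you must write `1- -1` to get set difference of signs.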
>>>> Some of the tests you are about to execute require a lot of memory. >>>> Your tests _will_ adversely affect system performance. Either this >>>> crash-me test program, or the actual database back-end, _will_ die with >>>> an out-of-memory error. So might any other program on your system if it >>>> requests more memory at the wrong time. > Good enough? No, pretty misleading I'd say. Since the crashme script does have a limit on max_buffer_size, it *will* run to completion if run on a machine with a sufficiently large per-process memory limit (and enough swap of course). I may just be old-fashioned in running with a not-so-large memory limit. It'd probably be more helpful if you document the behavior seen when crashme runs out of memory (for me, the script just stops cold with no notice) and what to do about it (reduce max_buffer_size until it runs to completion). >>>> date_zero=no # Supports 0000-00-00 dates >> Another test that only MySQL "passes". > ... and SOLID. Still doesn't mean it's a good idea ;-) >>>> except=no # except >> Unsurprisingly, hardly anyone else accepts it either. > SOLID again. It'd be appropriate to divide this into two tests, or at least relabel it. >>>> minus_neg=no # Calculate 1--1 >> >> Another case where "passing" the test means accepting MySQL's version of >> reality instead of SQL92's. All the SQL-compliant DBMSs think -- is a >> comment introducer > So does MySQL -- when the '--' is followed by a space. Considering how much we got ragged on for not being perfectly compliant with SQL-spec handling of comments (up till 7.0 our parser didn't recognize "--" as a comment if it was embedded in a multicharacter operator --- but we knew that was a bug), I don't have a lot of sympathy for MySQL unilaterally redefining the spec here. 
And I have even less for them devising a test that can only be "passed" by non-spec-compliant parsers, and then deliberately mislabeling it to give the impression that the spec-compliant systems are seriously broken. How about labeling the results "Fails to recognize -- comment introducer unless surrounded by whitespace"? Anyway, I am pleased to see you trying to clean up the mess. Good luck! === Subject: Re: [HACKERS] Re: MySQL crashme test and PostgreSQL From: Bruce Momjian <pgman@candle.pha.pa.us> Date: Sun, 21 May 2000 13:49:45 -0400 (EDT) > Hi, > > Bruce Momjian: > > Also, I have heard about the hit squads attacking MySQL. I never > > condone inaccuracy or attacks, but I can understand why it is happening. > > > You _are_ doing your side of the story a disservice, you know that? Hey, I am not saying I like it happening. All I am saying is that I can understand why it is happening. Certainly MSSQL and Oracle are the real products we need to compete against. > > > For years, I believe the deceptiveness of the MySQL crashme test has > > hampered acceptance of PostgreSQL. And our response was to just reply > > with our opinion when asked about it. > > Yeah, I can see that. > > Let me tell you up front that your opinion is not at all helpful to > either the cause of PostgreSQL or to the problems between you and the > MySQL people, especially when stated like this. > > > This is the Internet. The right thing to do if somebody spreads bad > information (a biased, inaccurate, wrong, deceptive, what-have-you) > crash-me test would be to write your own test which either prefers > PostgreSQL, or is reasonably neutral. We have better things to do than compete against deceptive tests. We just work to make our product better and better. Making another crashme test is not going to make PostgreSQL a better piece of software.
=== Subject: [HACKERS] Re: [GENERAL] Re: MySQL crashme test and PostgreSQL From: Bruce Momjian <pgman@candle.pha.pa.us> Date: Sun, 21 May 2000 13:44:48 -0400 (EDT) That's easy: > > MySQL has type mediumint > PostgreSQL has transactions > > MySQL allows " and ' as string markers > PostgreSQL has views > > MySQL has case insensitive compare > PostgreSQL has referential integrity > > MySQL has support for 0000-00-00 dates > PostgreSQL has subqueries > > MySQL has 'drop table if exists' > PostgreSQL has multiversion concurrency control > > etc. > > See? Equal. I hope my sarcasm is not too overstated. It took me a minute to figure this out. Wow, that was funny. I am still laughing. === Subject: Re: [HACKERS] Performance (was: The New Slashdot Setup (includes MySql server)) From: Tom Lane <tgl@sss.pgh.pa.us> Date: Sun, 21 May 2000 14:45:22 -0400 "Hiroshi Inoue" <Inoue@tpf.co.jp> writes: > Isn't it a fundamental principle to define a primary (unique > identification) constraint for each table? > I had never thought that the only index of pg_attrdef > isn't a unique identification index until I came across the > unexpected result of my DROP COLUMN test case. Good point --- I was only thinking about the performance aspect, but if we're going to have unique indexes to prevent errors in other system tables then pg_attrdef deserves one too. Actually, I have a more radical proposal: why not get rid of pg_attrdef entirely, and add its two useful columns (adsrc, adbin) to pg_attribute? If we allow them to be NULL for attributes with no default, then there's no space overhead where they're not being used, and we eliminate any need for the second table. regards, tom lane === Subject: Re: [HACKERS] Performance (was: The New Slashdot Setup (includes MySql server)) From: "Matthias Urlichs" <smurf@noris.net> Date: Sat, 20 May 2000 12:14:38 +0200 Hi, Thomas Lockhart: > > Hmm.
And then whose job is it to take someone else's work and make it accurate? If the shoe were on the other foot: if I generated a benchmark suite and features list, and it contained major and numerous inaccuracies, who would you expect to be responsible (or at least feel responsible) for correcting/updating/improving it? 'Twould be me imho. > Umm, there's still a difference between saying (a) "it's broken, fix it", (b) "here's my analysis as to what exactly is broken, can you fix it", and (c) "here's a patch that fixes it". I get the distinct impression that most of the communication between the PostgreSQL and MySQL people has been looking more like (a) in the past... if I can help both projects by doing some "translation" towards (b) and (c), if at all possible, then so much the better. > We've tried, and failed (to date) to contribute information to the > "crashme" travesty. My recollection was a ~30% error rate on > information for Postgres, and I didn't look into the stats for other > databases. Check the archives for details. > Attached is the current crashme output. "crash_me_safe" is off only because of the fact that some tests go beyond available memory. There's no sense in testing how far you can push a "SELECT a from b where c = 'xxx(several megabytes worth of Xes)'" query when the size of a TEXT field is limited to 32k. Limits with '+' in front of the number say that this is the max value tested, without implying whether higher values are OK or not. If you have any remarks, especially about the '=no' results (i.e. you think PostgreSQL can do that, therefore the crashme test must be wrong somehow), tell me. Otherwise I'll forward the results to the MySQL people next week. The crash-me test script, BTW, is included in MySQL's sql-bench subdirectory.
--5mCyUwZo2JvN/JJP Content-Type: text/plain; charset=us-ascii Content-Disposition: attachment; filename=pgtest #This file is automaticly generated by crash-me 1.43 NEG=yes # update of column= -column Need_cast_for_null=no # Need to cast NULL for arithmetic alter_add_col=yes # Alter table add column alter_add_constraint=no # Alter table add constraint alter_add_foreign_key=yes # Alter table add foreign key alter_add_multi_col=no # Alter table add many columns alter_add_primary_key=no # Alter table add primary key alter_add_unique=no # Alter table add unique alter_alter_col=yes # Alter table alter column default alter_change_col=no # Alter table change column alter_drop_col=no # Alter table drop column alter_drop_constraint=no # Alter table drop constraint alter_drop_foreign_key=no # Alter table drop foreign key alter_drop_primary_key=no # Alter table drop primary key alter_drop_unique=no # Alter table drop unique alter_modify_col=no # Alter table modify column alter_rename_table=no # Alter table rename table atomic_updates=no # atomic updates automatic_rowid=no # Automatic rowid binary_items=no # binary items (0x41) case_insensitive_strings=no # case insensitive compare char_is_space_filled=yes # char are space filled column_alias=yes # Column alias columns_in_group_by=+64 # number of columns in group by columns_in_order_by=+64 # number of columns in order by comment_#=no # # as comment comment_--=yes # -- as comment comment_/**/=yes # /* */ as comment comment_//=no # // as comment compute=no # Compute connections=32 # Simultaneous connections constraint_check=yes # Column constraints constraint_check_table=yes # Table constraints constraint_null=yes # NULL constraint (SyBase style) crash_me_safe=no # crash me safe crash_me_version=1.43 # crash me version create_default=yes # default value for column create_default_func=no # default value function for column create_if_not_exists=no # create table if not exists create_index=yes # create index create_schema=no # 
Create SCHEMA create_table_select=no # create table from select cross_join=yes # cross join (same as from a,b) date_last=yes # Supports 9999-12-31 dates date_one=yes # Supports 0001-01-01 dates date_with_YY=yes # Supports YY-MM-DD 2000 compilant dates date_zero=no # Supports 0000-00-00 dates domains=no # Domains (ANSI SQL) double_quotes=yes # Double '' as ' in strings drop_if_exists=no # drop table if exists drop_index=yes # drop index end_colon=yes # allows end ';' except=no # except except_all=no # except all float_int_expr=yes # mixing of integer and float in expression foreign_key=yes # foreign keys foreign_key_syntax=yes # foreign key syntax full_outer_join=no # full outer join func_extra_!=no # Function NOT as '!' in SELECT func_extra_%=yes # Function MOD as % func_extra_&=no # Function & (bitwise and) func_extra_&&=no # Function AND as '&&' func_extra_<>=yes # Function <> in SELECT func_extra_==yes # Function = func_extra_add_months=no # Function ADD_MONTHS func_extra_and_or=yes # Function AND and OR in SELECT func_extra_ascii_char=no # Function ASCII_CHAR func_extra_ascii_code=no # Function ASCII_CODE func_extra_atn2=no # Function ATN2 func_extra_auto_num2string=no # Function automatic num->string convert func_extra_auto_string2num=yes # Function automatic string->num convert func_extra_between=yes # Function BETWEEN in SELECT func_extra_binary_shifts=no # Function << and >> (bitwise shifts) func_extra_bit_count=no # Function BIT_COUNT func_extra_ceil=yes # Function CEIL func_extra_charindex=no # Function CHARINDEX func_extra_chr=no # Function CHR func_extra_concat_as_+=no # Function concatenation with + func_extra_concat_list=no # Function CONCAT(list) func_extra_convert=no # Function CONVERT func_extra_cosh=no # Function COSH func_extra_date_format=no # Function DATE_FORMAT func_extra_dateadd=no # Function DATEADD func_extra_datediff=no # Function DATEDIFF func_extra_datename=no # Function DATENAME func_extra_datepart=no # Function DATEPART 
func_extra_elt=no # Function ELT func_extra_encrypt=no # Function ENCRYPT func_extra_field=no # Function FIELD func_extra_format=no # Function FORMAT func_extra_from_days=no # Function FROM_DAYS func_extra_from_unixtime=no # Function FROM_UNIXTIME func_extra_getdate=no # Function GETDATE func_extra_greatest=no # Function GREATEST func_extra_if=no # Function IF func_extra_in_num=yes # Function IN on numbers in SELECT func_extra_in_str=no # Function IN on strings in SELECT func_extra_initcap=yes # Function INITCAP func_extra_instr=no # Function LOCATE as INSTR func_extra_instr_oracle=no # Function INSTR (Oracle syntax) func_extra_instrb=no # Function INSTRB func_extra_interval=no # Function INTERVAL func_extra_last_day=no # Function LAST_DAY func_extra_last_insert_id=no # Function LAST_INSERT_ID func_extra_least=no # Function LEAST func_extra_lengthb=no # Function LENGTHB func_extra_like=yes # Function LIKE in SELECT func_extra_like_escape=no # Function LIKE ESCAPE in SELECT func_extra_ln=no # Function LN func_extra_log(m_n)=yes # Function LOG(m,n) func_extra_logn=no # Function LOGN func_extra_lpad=yes # Function LPAD func_extra_mdy=no # Function MDY func_extra_mid=no # Function SUBSTRING as MID func_extra_months_between=no # Function MONTHS_BETWEEN func_extra_not=no # Function NOT in SELECT func_extra_not_between=yes # Function NOT BETWEEN in SELECT func_extra_not_like=yes # Function NOT LIKE in SELECT func_extra_odbc_convert=no # Function ODBC CONVERT func_extra_password=no # Function PASSWORD func_extra_paste=no # Function PASTE func_extra_patindex=no # Function PATINDEX func_extra_period_add=no # Function PERIOD_ADD func_extra_period_diff=no # Function PERIOD_DIFF func_extra_pow=no # Function POW func_extra_range=no # Function RANGE func_extra_regexp=no # Function REGEXP in SELECT func_extra_replicate=no # Function REPLICATE func_extra_reverse=no # Function REVERSE func_extra_root=no # Function ROOT func_extra_round1=yes # Function ROUND(1 arg) 
func_extra_rpad=yes # Function RPAD func_extra_sec_to_time=no # Function SEC_TO_TIME func_extra_sinh=no # Function SINH func_extra_str=no # Function STR func_extra_strcmp=no # Function STRCMP func_extra_stuff=no # Function STUFF func_extra_substrb=no # Function SUBSTRB func_extra_substring_index=no # Function SUBSTRING_INDEX func_extra_sysdate=no # Function SYSDATE func_extra_tail=no # Function TAIL func_extra_tanh=no # Function TANH func_extra_time_to_sec=no # Function TIME_TO_SEC func_extra_to_days=no # Function TO_DAYS func_extra_translate=yes # Function TRANSLATE func_extra_trim_many_char=yes # Function TRIM; Many char extension func_extra_trim_substring=error # Function TRIM; Substring extension func_extra_trunc=yes # Function TRUNC func_extra_uid=no # Function UID func_extra_unix_timestamp=no # Function UNIX_TIMESTAMP func_extra_userenv=no # Function USERENV func_extra_version=yes # Function VERSION func_extra_weekday=no # Function WEEKDAY func_extra_|=no # Function | (bitwise or) func_extra_||=no # Function OR as '||' func_odbc_abs=yes # Function ABS func_odbc_acos=yes # Function ACOS func_odbc_ascii=yes # Function ASCII func_odbc_asin=yes # Function ASIN func_odbc_atan=yes # Function ATAN func_odbc_atan2=yes # Function ATAN2 func_odbc_ceiling=no # Function CEILING func_odbc_char=no # Function CHAR func_odbc_concat=no # Function CONCAT(2 arg) func_odbc_cos=yes # Function COS func_odbc_cot=yes # Function COT func_odbc_curdate=no # Function CURDATE func_odbc_curtime=no # Function CURTIME func_odbc_database=no # Function DATABASE func_odbc_dayname=no # Function DAYNAME func_odbc_dayofmonth=no # Function DAYOFMONTH func_odbc_dayofweek=no # Function DAYOFWEEK func_odbc_dayofyear=no # Function DAYOFYEAR func_odbc_degrees=yes # Function DEGREES func_odbc_difference=no # Function DIFFERENCE() func_odbc_exp=no # Function EXP func_odbc_floor=yes # Function FLOOR func_odbc_fn_left=no # Function ODBC syntax LEFT & RIGHT func_odbc_hour=no # Function HOUR 
func_odbc_hour_time=no # Function ANSI HOUR func_odbc_ifnull=no # Function IFNULL func_odbc_insert=no # Function INSERT func_odbc_lcase=no # Function LCASE func_odbc_left=no # Function LEFT func_odbc_length=no # Function REAL LENGTH func_odbc_length_without_space=no # Function ODBC LENGTH func_odbc_locate_2=no # Function LOCATE(2 arg) func_odbc_locate_3=no # Function LOCATE(3 arg) func_odbc_log=no # Function LOG func_odbc_log10=no # Function LOG10 func_odbc_ltrim=yes # Function LTRIM func_odbc_minute=no # Function MINUTE func_odbc_mod=yes # Function MOD func_odbc_month=no # Function MONTH func_odbc_monthname=no # Function MONTHNAME func_odbc_now=yes # Function NOW func_odbc_pi=yes # Function PI func_odbc_power=no # Function POWER func_odbc_quarter=no # Function QUARTER func_odbc_radians=yes # Function RADIANS func_odbc_rand=no # Function RAND func_odbc_repeat=yes # Function REPEAT func_odbc_replace=no # Function REPLACE func_odbc_right=no # Function RIGHT func_odbc_round=yes # Function ROUND(2 arg) func_odbc_rtrim=yes # Function RTRIM func_odbc_second=no # Function SECOND func_odbc_sign=yes # Function SIGN func_odbc_sin=yes # Function SIN func_odbc_soundex=no # Function SOUNDEX func_odbc_space=no # Function SPACE func_odbc_sqrt=no # Function SQRT func_odbc_substring=no # Function ODBC SUBSTRING func_odbc_tan=yes # Function TAN func_odbc_timestampadd=no # Function TIMESTAMPADD func_odbc_timestampdiff=no # Function TIMESTAMPDIFF func_odbc_truncate=no # Function TRUNCATE func_odbc_ucase=no # Function UCASE func_odbc_user()=no # Function USER() func_odbc_week=no # Function WEEK func_odbc_year=no # Function YEAR func_sql_+=yes # Function +, -, * and / func_sql_bit_length=no # Function BIT_LENGTH func_sql_cast=yes # Function CAST func_sql_char_length=yes # Function CHAR_LENGTH func_sql_char_length(constant)=yes # Function CHAR_LENGTH(constant) func_sql_character_length=yes # Function CHARACTER_LENGTH func_sql_coalesce=yes # Function COALESCE func_sql_concat_as_||=yes # 
Function concatenation with || func_sql_current_date=yes # Function CURRENT_DATE func_sql_current_time=yes # Function CURRENT_TIME func_sql_current_timestamp=yes # Function CURRENT_TIMESTAMP func_sql_current_user=yes # Function CURRENT_USER func_sql_extract_sql=yes # Function EXTRACT func_sql_localtime=no # Function LOCALTIME func_sql_localtimestamp=no # Function LOCALTIMESTAMP func_sql_lower=yes # Function LOWER func_sql_nullif=no # Function NULLIF func_sql_octet_length=no # Function OCTET_LENGTH func_sql_position=yes # Function POSITION func_sql_searched_case=yes # Function searched CASE func_sql_session_user=yes # Function SESSION_USER func_sql_simple_case=yes # Function simple CASE func_sql_substring=yes # Function ANSI SQL SUBSTRING func_sql_system_user=no # Function SYSTEM_USER func_sql_trim=yes # Function TRIM func_sql_upper=yes # Function UPPER func_sql_user=yes # Function USER func_where_between=yes # Function BETWEEN func_where_eq_all=yes # Function = ALL func_where_eq_any=yes # Function = ANY func_where_eq_some=yes # Function = SOME func_where_exists=yes # Function EXISTS func_where_in_num=yes # Function IN on numbers func_where_like=yes # Function LIKE func_where_like_escape=no # Function LIKE ESCAPE func_where_match=no # Function MATCH func_where_match_unique=no # Function MATCH UNIQUE func_where_matches=no # Function MATCHES func_where_not_between=yes # Function NOT BETWEEN func_where_not_exists=yes # Function NOT EXISTS func_where_not_like=yes # Function NOT LIKE func_where_not_unique=no # Function NOT UNIQUE func_where_unique=no # Function UNIQUE functions=yes # Functions group_by=yes # Group by group_by_alias=yes # Group by alias group_by_null=yes # group on column with null values group_by_position=yes # Group by position group_distinct_functions=yes # Group functions with distinct group_func_extra_bit_and=no # Group function BIT_AND group_func_extra_bit_or=no # Group function BIT_OR group_func_extra_count_distinct_list=no # Group function 
COUNT(DISTINCT expr,expr,...) group_func_extra_std=no # Group function STD group_func_extra_stddev=no # Group function STDDEV group_func_extra_variance=no # Group function VARIANCE group_func_sql_avg=yes # Group function AVG group_func_sql_count_*=yes # Group function COUNT (*) group_func_sql_count_column=yes # Group function COUNT column name group_func_sql_count_distinct=yes # Group function COUNT(DISTINCT expr) group_func_sql_max=yes # Group function MAX on numbers group_func_sql_max_str=yes # Group function MAX on strings group_func_sql_min=yes # Group function MIN on numbers group_func_sql_min_str=yes # Group function MIN on strings group_func_sql_sum=yes # Group function SUM group_functions=yes # Group functions having=yes # Having having_with_alias=no # Having on alias having_with_group=yes # Having with group function ignore_end_space=yes # ignore end space in compare index_in_create=no # index in create table index_namespace=no # different namespace for index index_parts=no # index on column part (extension) insert_empty_string=yes # insert empty string insert_select=no # insert INTO ... SELECT ... 
insert_with_set=no # INSERT with set syntax intersect=no # intersect intersect_all=no # intersect all join_tables=+64 # tables in join left_outer_join=no # left outer join left_outer_join_using=no # left outer join using like_with_column=yes # column LIKE column like_with_number=yes # LIKE on numbers lock_tables=yes # lock table logical_value=1 # Value of logical operation (1=1) max_big_expressions=10 # big expressions max_char_size=8104 # max char() size max_column_name=+512 # column name length max_columns=1600 # Columns in table max_conditions=19994 # OR and AND in WHERE max_expressions=9999 # simple expressions max_index=+64 # max index max_index_length=2704 # index length max_index_name=+512 # index name length max_index_part_length=2704 # max index part length max_index_parts=16 # index parts max_index_varchar_part_length=2704 # index varchar part length max_row_length=7949 # max table row length (without blobs) max_row_length_with_null=7949 # table row length with nulls (without blobs) max_select_alias_name=+512 # select alias name length max_stack_expression=+2000 # stacked expressions max_table_alias_name=+512 # table alias name length max_table_name=+512 # table name length max_text_size=8104 # max text or blob size max_unique_index=+64 # unique indexes max_varchar_size=8104 # max varchar() size minus=no # minus minus_neg=no # Calculate 1--1 multi_drop=yes # many tables to drop table multi_strings=yes # Multiple line strings multi_table_delete=no # DELETE FROM table1,table2... 
multi_table_update=no # Update with many tables multi_value_insert=no # Value lists in INSERT natural_join=yes # natural join natural_left_outer_join=no # natural left outer join no_primary_key=yes # Tables without primary key null_concat_expr=yes # Is 'a' || NULL = NULL null_in_index=yes # null in index null_in_unique=yes # null in unique null_num_expr=yes # Is 1+NULL = NULL odbc_left_outer_join=no # left outer join odbc style operating_system=Linux 2.3.99s-noris-pre9-2 i686 # crash-me tested on order_by=yes # Order by order_by_alias=yes # Order by alias order_by_function=yes # Order by function order_by_position=yes # Order by position order_by_remember_desc=no # Order by DESC is remembered primary_key_in_create=yes # primary key in create table psm_functions=no # PSM functions (ANSI SQL) psm_modules=no # PSM modules (ANSI SQL) psm_procedures=no # PSM procedures (ANSI SQL) psm_trigger=no # Triggers (ANSI SQL) query_size=16777216 # query size quote_ident_with_"=no # " as identifier quote (ANSI SQL) quote_ident_with_[=no # [] as identifier quote quote_ident_with_`=no # ` as identifier quote quote_with_"=no # Allows ' and " as string markers recursive_subqueries=+64 # recursive subqueries remember_end_space=no # Remembers end space in char() remember_end_space_varchar=yes # Remembers end space in varchar() repeat_string_size=+8000000 # return string size from function right_outer_join=no # right outer join rowid=oid # Type for row id select_constants=yes # Select constants select_limit=with LIMIT # LIMIT number of rows select_limit2=yes # SELECT with LIMIT #,# select_string_size=+16208 # constant string size in SELECT select_table_update=no # Update with sub select select_without_from=yes # SELECT without FROM server_version=PostgreSQL version ??? 
# server version simple_joins=yes # ANSI SQL simple joins storage_of_float=round # Storage of float values subqueries=yes # subqueries table_alias=yes # Table alias table_name_case=yes # case independent table names table_wildcard=yes # Select table_name.* tempoary_table=yes # temporary tables transactions=yes # transactions type_extra_abstime=yes # Type abstime type_extra_bfile=no # Type bfile type_extra_blob=no # Type blob type_extra_bool=yes # Type bool type_extra_box=yes # Type box type_extra_byte=no # Type byte type_extra_char(1_arg)_binary=no # Type char(1 arg) binary type_extra_circle=yes # Type circle type_extra_clob=no # Type clob type_extra_datetime=yes # Type datetime type_extra_double=no # Type double type_extra_enum(1_arg)=no # Type enum(1 arg) type_extra_float(2_arg)=no # Type float(2 arg) type_extra_float4=yes # Type float4 type_extra_float8=yes # Type float8 type_extra_image=no # Type image type_extra_int(1_arg)_zerofill=no # Type int(1 arg) zerofill type_extra_int1=no # Type int1 type_extra_int2=yes # Type int2 type_extra_int3=no # Type int3 type_extra_int4=yes # Type int4 type_extra_int8=yes # Type int8 type_extra_int_auto_increment=no # Type int not null auto_increment type_extra_int_identity=no # Type int not null identity type_extra_int_unsigned=no # Type int unsigned type_extra_interval=yes # Type interval type_extra_line=yes # Type line type_extra_long=no # Type long type_extra_long_raw=no # Type long raw type_extra_long_varbinary=no # Type long varbinary type_extra_long_varchar(1_arg)=no # Type long varchar(1 arg) type_extra_lseg=yes # Type lseg type_extra_mediumint=no # Type mediumint type_extra_mediumtext=no # Type mediumtext type_extra_middleint=no # Type middleint type_extra_mlslabel=no # Type mlslabel type_extra_money=yes # Type money type_extra_nclob=no # Type nclob type_extra_number=no # Type number type_extra_number(1_arg)=no # Type number(1 arg) type_extra_number(2_arg)=no # Type number(2 arg) type_extra_nvarchar2(1_arg)=no # Type 
nvarchar2(1 arg) type_extra_path=yes # Type path type_extra_point=yes # Type point type_extra_polygon=yes # Type polygon type_extra_raw(1_arg)=no # Type raw(1 arg) type_extra_reltime=yes # Type reltime type_extra_rowid=no # Type rowid type_extra_serial=yes # Type serial type_extra_set(1_arg)=no # Type set(1 arg) type_extra_smalldatetime=no # Type smalldatetime type_extra_smallfloat=no # Type smallfloat type_extra_smallmoney=no # Type smallmoney type_extra_text=yes # Type text type_extra_text(1_arg)=no # Type text(1 arg) type_extra_timespan=yes # Type timespan type_extra_uint=no # Type uint type_extra_varchar2(1_arg)=no # Type varchar2(1 arg) type_extra_year=no # Type year type_odbc_bigint=no # Type bigint type_odbc_binary(1_arg)=no # Type binary(1 arg) type_odbc_datetime=yes # Type datetime type_odbc_tinyint=no # Type tinyint type_odbc_varbinary(1_arg)=no # Type varbinary(1 arg) type_sql_bit=yes # Type bit type_sql_bit(1_arg)=yes # Type bit(1 arg) type_sql_bit_varying(1_arg)=yes # Type bit varying(1 arg) type_sql_char(1_arg)=yes # Type char(1 arg) type_sql_char_varying(1_arg)=yes # Type char varying(1 arg) type_sql_character(1_arg)=yes # Type character(1 arg) type_sql_character_varying(1_arg)=yes # Type character varying(1 arg) type_sql_date=yes # Type date type_sql_dec(2_arg)=yes # Type dec(2 arg) type_sql_decimal(2_arg)=yes # Type decimal(2 arg) type_sql_double_precision=yes # Type double precision type_sql_float=yes # Type float type_sql_float(1_arg)=yes # Type float(1 arg) type_sql_int=yes # Type int type_sql_integer=yes # Type integer type_sql_interval_day=yes # Type interval day type_sql_interval_day_to_hour=yes # Type interval day to hour type_sql_interval_day_to_minute=yes # Type interval day to minute type_sql_interval_day_to_second=yes # Type interval day to second type_sql_interval_hour=yes # Type interval hour type_sql_interval_hour_to_minute=yes # Type interval hour to minute type_sql_interval_hour_to_second=yes # Type interval hour to second 
type_sql_interval_minute=yes # Type interval minute type_sql_interval_minute_to_second=yes # Type interval minute to second type_sql_interval_month=yes # Type interval month type_sql_interval_second=yes # Type interval second type_sql_interval_year=yes # Type interval year type_sql_interval_year_to_month=yes # Type interval year to month type_sql_national_char_varying(1_arg)=yes # Type national char varying(1 arg) type_sql_national_character(1_arg)=yes # Type national character(1 arg) type_sql_national_character_varying(1_arg)=yes # Type national character varying(1 arg) type_sql_nchar(1_arg)=yes # Type nchar(1 arg) type_sql_nchar_varying(1_arg)=yes # Type nchar varying(1 arg) type_sql_numeric(2_arg)=yes # Type numeric(2 arg) type_sql_real=yes # Type real type_sql_smallint=yes # Type smallint type_sql_time=yes # Type time type_sql_timestamp=yes # Type timestamp type_sql_timestamp_with_time_zone=yes # Type timestamp with time zone type_sql_varchar(1_arg)=yes # Type varchar(1 arg) union=yes # union union_all=yes # union all unique_in_create=yes # unique in create table unique_null_in_create=yes # unique null in create views=yes # views where_string_size=+16208 # constant string size in where --5mCyUwZo2JvN/JJP-- === Subject: Re: MySQL's "crashme" (was Re: [HACKERS] Performance) From: "Matthias Urlichs" <smurf@noris.net> Date: Sun, 21 May 2000 22:37:21 +0200 Hi, Tom Lane: > I'm just saying that it's unfair to downrate us when the problem is > demonstrably in crashme itself and not in Postgres. > Right. > <drop behavior> ::= CASCADE | RESTRICT > > What is *not* optional is a <drop behavior> keyword. Although we don't > yet implement DROP COLUMN, our parser already has this statement in it > --- and it follows the SQL92 grammar. > Ah, sorry, I apparently misparsed the BNF. (It was kind of late at night...) > No, pretty misleading I'd say. 
Since the crashme script does have a > limit on max_buffer_size, it *will* run to completion if run on a > machine with a sufficiently large per-process memory limit (and enough > swap of course). Hmm, I could add an explicit option to limit memory usage instead. (Right now it's hardcoded in the test script.) > >>>> date_zero=no # Supports 0000-00-00 dates > >> Another test that only MySQL "passes". > > ... and SOLID. > > Still doesn't mean it's a good idea ;-) > No argument from me... > >>>> except=no # except > It'd be appropriate to divide this into two tests, or at least relabel > it. > Already done. > Considering how much we got ragged on for not being perfectly compliant > with SQL-spec handling of comments (up till 7.0 our parser didn't > recognize "--" as a comment if it was embedded in a multicharacter > operator --- but we knew that was a bug), I don't have a lot of sympathy > for MySQL unilaterally redefining the spec here. They do note this noncompliance with the SQL spec in their documentation, along with a few others. I'll clean this one up (adding a note about the noncompliance) a bit more, after they incorporate my patch into the next version. > Anyway, I am pleased to see you trying to clean up the mess. > Good luck! > Thanks. === Subject: Re: AW: [HACKERS] Performance (was: The New Slashdot Setup (includes MySqlserver)) From: Andreas Zeugswetter <andreas.zeugswetter@telecom.at> Date: Fri, 26 May 2000 08:30:41 +0200 On Fri, 26 May 2000, Peter Eisentraut wrote: > Zeugswetter Andreas SB writes: > > > Which is imho not what the test is for. I think they mean ctid, > > which again I think we should have a rowid alias for (as in Informix, > > Oracle). > > Let's step back and ask: How is the behaviour of rowid (or whatever) > defined in various existing DBMS. Then we can see if we have anything that > matches. This has been discussed. 
The outcome is that you are only safe using rowid if nobody else changes the row between your reading it and accessing it by rowid. This is essentially the same in all DBs; the risk of the rowid changing is merely lower in other DBs since they do in-place updates, but the risk is there nevertheless. Andreas === Subject: Re: [HACKERS] The New Slashdot Setup (includes MySql server) From: Jeff MacDonald <jeff@pgsql.com> Date: Mon, 29 May 2000 11:32:33 -0300 (ADT) Agreed. A while back I actually contacted Rob Malda and offered to convert Slashdot to Postgres. He asked why I would want to do this; I cited Postgres's features, yada yada. His reply: that's dandy, but we don't need those features. Sad to say, but MySQL has a niche and Slashdot fills it. === Subject: [HACKERS] MySQL crashme From: Bruce Momjian <pgman@candle.pha.pa.us> Date: Mon, 22 May 2000 21:56:01 -0400 (EDT) Matthias, let me add that I wish you luck in updating the MySQL crashme test. You certainly seem to be on top of the issues, and I hope they can be resolved. I know a lot of people on this side are hoping you can make it happen. === Subject: Re: [HACKERS] A test to add to the crashme test From: "Matthias Urlichs" <smurf@noris.net> Date: Tue, 23 May 2000 08:40:54 +0200 Hi, Ross J. Reedstrom: > People keep claiming that applications that are essentially "read-only" > don't need transactions. I'll agree in the limit, that truly read-only > databases don't, but I think a lot of people might be surprised at how > little writing you need before you get into trouble. > [ Mozilla bug-tracking example ] > How many writers does it take for this to happen? One. I'd call that > an "essentially read-only" system. A note: this is not a made-up, theoretical example. We're talking real world here. > Right. But that's not about transactions; that's about concurrent read and write access to a table. 
People using MySQL in real-world situations usually solve this with one read/write database for "normal" work, and another one for the long-running multi-record "let's list every bug in the system" queries. The update from one to the other is set to low-priority so that it won't lock out any queries (with a timeout). Mind you: I'm not saying this is ideal. A system with concurrent read/write access would be better. But it has the benefit of giving you a replicated database which you can fall back to, if the primary system is down for whatever reason. Besides, the MySQL people are currently busy integrating Berkeley DB into their code. Voila, instant read/write concurrency, and instant transactions. Well, almost. ;-) === Subject: Re: [HACKERS] A test to add to the crashme test From: Hannu Krosing <hannu@tm.ee> Date: Tue, 23 May 2000 09:51:02 +0300 Matthias Urlichs wrote: > > Besides, the MySQL people are currently busy integrating Berkeley DB > into their code. Then MySQL may become a RDBMS after all ;) > Voila, instant read/write concurrency, and instant transactions. But won't it slow them down ? === Subject: Re: [HACKERS] A test to add to the crashme test From: "Matthias Urlichs" <smurf@noris.net> Date: Tue, 23 May 2000 09:58:29 +0200 Hi, Hannu Krosing: > > Voila, instant read/write concurrency, and instant transactions. > But won't it slow them down ? > Of course it will. That's why they make the Berkeley tables optional. Their idea is that you use the Berkeley stuff for the tables which really require transactions, HEAP tables for in-memory cache/temp/whatever, and the standard MyISAM tables otherwise. Real-world example: Your customers' account balance really should be transaction safe, and all that. But not their address, or their clicktrail through your online shop system. === Subject: [HACKERS] MySQL now supports transactions ... 
From: The Hermit Hacker <scrappy@hub.org> Date: Tue, 23 May 2000 19:55:23 -0300 (ADT) http://www.mysql.com/download_3.23.html === Subject: Re: [HACKERS] MySQL now supports transactions ... From: Bruce Momjian <pgman@candle.pha.pa.us> Date: Tue, 23 May 2000 20:26:41 -0400 (EDT) Apparently by way of some Berkeley DB code.... > > http://web.mysql.com/php/manual.php3?section=BDB > > > > The Hermit Hacker wrote: > > > http://www.mysql.com/download_3.23.html > > > > Marc G. Fournier ICQ#7615664 IRC Nick: Scrappy > > Systems Administrator @ hub.org > > primary: scrappy@hub.org secondary: scrappy@{freebsd|postgresql}.org > > Yes, I see that too. It makes much more sense for them because they have ordered heaps anyway, with secondary indexes, rather than our unordered heap and indexes. === Subject: Re: [HACKERS] MySQL now supports transactions ... From: "Michael A. Olson" <mao@sleepycat.com> Date: Tue, 23 May 2000 17:25:37 -0700 At 08:02 PM 5/23/00 -0400, you wrote: > Apparently by way of some Berkeley DB code.... > > http://web.mysql.com/php/manual.php3?section=BDB Yeah, that's correct. As there was no existing transaction layer in place, it was pretty straightforward to add Berkeley DB. They had to abstract the ISAM layer that they've used to date, but there were no serious technical issues. You can choose which kind of tables you want (myisam or bdb) at table creation time. BDB tables have the standard ACID properties that Berkeley DB provides generally, via the standard mechanisms (two-phase locking, write-ahead logging, and so forth). The 3.23.16 release is decidedly alpha, but is in good enough shape to distribute. My bet is that we'll hammer out a few dumb bugs in the next weeks, and they'll cut something more stable soon. You need to download the 3.1.5 distribution of Berkeley DB from MySQL.com. We're not distributing that version from Sleepycat. We're in the middle of the release cycle for our 3.1 release, and expect to cut a stable one in the next week or so. 
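The transaction-safe behavior Michael describes for BDB tables (and Matthias's account-balance example a few messages up) comes down to atomicity: either the whole change commits, or none of it does. A conceptual sketch of that guarantee, using Python's sqlite3 purely as a stand-in for a transaction-safe table type; the table, names, and overdraft rule are hypothetical:

```python
# Conceptual sketch of transactional atomicity (the "A" in ACID).
# sqlite3 stands in for a transaction-safe table type; the accounts
# table and the overdraft rule are invented for illustration.
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE accounts (name TEXT, balance INTEGER)")
con.execute("INSERT INTO accounts VALUES ('alice', 100)")
con.commit()

try:
    # Attempt a withdrawal that would overdraw the account.
    con.execute("UPDATE accounts SET balance = balance - 150 "
                "WHERE name = 'alice'")
    (balance,) = con.execute(
        "SELECT balance FROM accounts WHERE name = 'alice'").fetchone()
    if balance < 0:
        raise ValueError("overdraft")
    con.commit()
except ValueError:
    con.rollback()  # the whole update vanishes, as if never attempted

(balance,) = con.execute(
    "SELECT balance FROM accounts WHERE name = 'alice'").fetchone()
```

After the rollback the balance is untouched; without a transaction-safe storage layer underneath, a crash between the debit and the business-rule check would leave the half-done change on disk.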
MySQL relies on a couple of features we added to 3.1 for them, so they can't run with the 3.0 release that's up on our site now. It's been pretty quiet since my message on Sunday, about the difficulties in integrating Berkeley DB with the PostgreSQL backend. Vadim (and others), what is your opinion? My impression is that the project is too much trouble, but I'd be glad to hear from you folks on the topic. === Subject: Re: [HACKERS] MySQL now supports transactions ... From: The Hermit Hacker <scrappy@hub.org> Date: Tue, 23 May 2000 21:39:28 -0300 (ADT) On Tue, 23 May 2000, Bruce Momjian wrote: > > Apparently by way of some Berkeley DB code.... > > > > http://web.mysql.com/php/manual.php3?section=BDB > > > > > > > > The Hermit Hacker wrote: > > > > > http://www.mysql.com/download_3.23.html > > > > > > Marc G. Fournier ICQ#7615664 IRC Nick: Scrappy > > > Systems Administrator @ hub.org > > > primary: scrappy@hub.org secondary: scrappy@{freebsd|postgresql}.org > > > > > > Yes, I see that too. It makes much more sense for them because they > have ordered heaps anyway, with secondary indexes, rather than our > unordered heap and indexes. Just figured a heads up was in order for those that have been using ACID/transactions in their arguments :) === Subject: RE: [HACKERS] Berkeley DB... From: "Mikheev, Vadim" <vmikheev@SECTORBASE.COM> Date: Fri, 26 May 2000 11:32:20 -0700 > We might have part of the story in the recently noticed fact that > each insert/update query begins by doing a seqscan of pg_index. > > I have done profiles of INSERT in the past and not found any really > spectacular bottlenecks (but I was looking at a test table with no > indexes, so I failed to see the pg_index problem :-(). Last time > I did it, I had these top profile entries for inserting 100,000 rows > of 30 columns apiece: Well, I've dropped the index but INSERTs still take 70 sec and COPY just 1 sec -:((( === Subject: Re: [HACKERS] Berkeley DB... 
From: Mike Mascari <mascarm@mascari.com> Date: Fri, 26 May 2000 14:48:22 -0400 Tom Lane wrote: > > Hannu Krosing <hannu@tm.ee> writes: > >> As Vadim points out in his comparison > >> of COPY vs. INSERT, something is *wrong* with the time it takes > >> for PostgreSQL to parse, plan, rewrite, and optimize. > > We might have part of the story in the recently noticed fact that > each insert/update query begins by doing a seqscan of pg_index. > > I have done profiles of INSERT in the past and not found any really > spectacular bottlenecks (but I was looking at a test table with no > indexes, so I failed to see the pg_index problem :-(). Last time > I did it, I had these top profile entries for inserting 100,000 rows > of 30 columns apiece:
>
>   %   cumulative   self              self     total
>  time   seconds   seconds    calls  ms/call  ms/call  name
>  30.08    290.79   290.79                             _mcount
>   6.48    353.46    62.67 30702766     0.00     0.00  AllocSetAlloc
>   5.27    404.36    50.90   205660     0.25     0.25  write
>   3.06    433.97    29.61 30702765     0.00     0.00  MemoryContextAlloc
>   2.74    460.45    26.48   100001     0.26     0.74  yyparse
>   2.63    485.86    25.41 24300077     0.00     0.00  newNode
>   2.22    507.33    21.47  3900054     0.01     0.01  yylex
>   1.63    523.04    15.71 30500751     0.00     0.00  PortalHeapMemoryAlloc
>   1.31    535.68    12.64  5419526     0.00     0.00  hash_search
>   1.18    547.11    11.43  9900000     0.00     0.00  expression_tree_walker
>   1.01    556.90     9.79  3526752     0.00     0.00  SpinRelease
>
> While the time spent in memory allocation is annoying, that's only about > ten mallocs per parsed data expression, so it's unlikely that we will be > able to improve on it very much. (We could maybe avoid having *three* > levels of subroutine call to do an alloc, though ;-).) Unless you are > smarter than the flex and bison guys you are not going to be able to > improve on the lex/parse times either. The planner isn't even showing > up for a simple INSERT. Not much left, unless you can figure out how > to write and commit a tuple with less than two disk writes. 
> > But, as I said, this was a case with no indexes to update. > > I intend to do something about caching pg_index info ASAP in the 7.1 > cycle, and then we can see how much of a difference that makes... > > regards, tom lane It will be interesting to see the speed differences between the 100,000 inserts above and those which have been PREPARE'd using Karel Zak's PREPARE patch. Perhaps a generic query cache could be used to skip the parsing/planning/optimizing stage when multiple exact queries are submitted to the database? I suppose the cached plans could then be discarded whenever a DDL statement or a VACUUM ANALYZE is executed? The old Berkeley Postgres docs spoke about cached query plans *and* results (as well as 64-bit oids, amongst other things). I'm looking forward to when the 7.1 branch occurs... :-) === Subject: [HACKERS] OID question From: Jacques Huard <jacques@intuineer.com> Date: Thu, 01 Jun 2000 22:29:16 GMT I have used MySQL to do several database-related things in the past using PHP. In the past I have used a field "row_id" as a unique number (within that specific table) as a reference to a specific row. This "row_id" field was automatically placed in a table when it was created in MySQL, and the next unique number was placed in the field automatically during every new insert. This made things easy for writing applications in PHP. When I switched to Postgres I noticed that there was an OID. I believe that this "object identifier" is similar to the "row_id" in MySQL, but I am unable to access it for a given row. PHP has a function which can get the last OID for the last "Insert" issued; however, this won't help me accomplish the same things I was able to accomplish using "row_id" in MySQL. I have read the documentation and have not found a really good description of OID, but have found commands that can add a unique sequence column which could accomplish what I need. 
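The manual sequence approach being described looks roughly like this (the table and sequence names are illustrative, not from the original message):

```sql
-- create a sequence and a table that will use it
CREATE SEQUENCE mytab_row_id_seq;
CREATE TABLE mytab (row_id int4, data text);

-- every insert has to ask the sequence for the next value by hand:
INSERT INTO mytab VALUES (nextval('mytab_row_id_seq'), 'first row');
```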
However, I need (or want) the unique sequence column to maintain itself, without calling a function to fill in that field during an insert. I am sure there is an easy way to accomplish this and I am overlooking the solution. Could someone suggest what they have done? Thanks for any response. === Subject: [HACKERS] Berkeley DB... From: "Mikheev, Vadim" <vmikheev@SECTORBASE.COM> Date: Sat, 20 May 2000 18:43:37 -0700 Well, I've read SDB code/doc for a few hours... 1. Using RECNO db for heap. For relational DB over-writing smgr means ability to re-use space after DELETE/UPDATE operations (without vacuum -:)). RECNO (btree by nature, with record number as key) will not give us this ability. To insert record into RECNO db one has either to provide "put" method with record number (where to store new record) or specify DB_APPEND in flags, to add new record to the end of db (without space re-using). So, the problem (one of two base problems of over-writing smgr for us) "where to store new tuple" (ie - where in data file there is free space for new tuple) is not resolved. => we can't use SDB smgr: there are no required features - space re-using and MVCC support. 2. SDB' btree-s support only one key, but we have multi-key btree-s... 3. How can we implement gist, rtree AND (multi-key) BTREE access methods using btree and hash access methods provided by SDB?! 1,2,3 => we have to preserve our access methods (and ability to add new!). Now, about WAL. What is WAL? WAL *mostly* is set of functions to write/read log (90% implemented) + *access method specific* redo/undo functions... to be implemented anyway, because of conclusion above. Comments? === Subject: Re: [HACKERS] Berkeley DB... From: "Michael A. Olson" <mao@sleepycat.com> Date: Sun, 21 May 2000 11:36:59 -0700 At 06:43 PM 5/20/00 -0700, Vadim Mikheev wrote: > 1. Using RECNO db for heap. > For relational DB over-writing smgr means ability to re-use space after > DELETE/UPDATE operations (without vacuum -:)). 
RECNO (btree by nature, > with record number as key) will not give us this ability. To insert record > into RECNO db one has either to provide "put" method with record number > (where to store new record) or specify DB_APPEND in flags, to add new record > to the end of db (without space re-using). So, the problem (one of two base > problems of over-writing smgr for us) "where to store new tuple" (ie - where > in data file there is free space for new tuple) is not resolved. > => we can't use SDB smgr: there are no required features - space re-using > and MVCC support. All of the Berkeley DB access methods reuse space. We return free space to a pool and allocate from the pool in the ordinary course of operation. We have no notion of vacuum. Empty pages get appended to a free list, and will be reused on next page allocation. Empty space on pages (from deleted tuples) where the rest of the page isn't empty will get reused the next time the page is visited. So you do get space reuse. We don't return blocks to the file system automatically (requires reorg, which is hard). "Appending" means appending in key space; that may or may not be physically at the end of the file. We do, however, do reverse splits of underfull nodes, so we're aggressive at getting empty pages back on the free list. In short, I think the space reuse story of Berkeley DB is better than the current space reuse story in PostgreSQL, even for heaps. This is because the current heap AM doesn't opportunistically coalesce pages to make free pages available for reuse by new inserts. We don't have multi-version concurrency control. It's a feature we'd like to see added, but it certainly represents a substantial new engineering effort. As I've said before, we'd be glad to support you in that project if you decide to undertake it. > 2. SDB' btree-s support only one key, but we have multi-key btree-s... This is a misunderstanding. Berkeley DB allows you to use arbitrary data structures as keys. 
You define your own comparison function, which understands your key structure and is capable of doing comparisons between keys. It's precisely equivalent to the support you've got in PostgreSQL now, since your comparator has to understand key schema (including the presence or absence of nulls). You'd define your own comparator and your own key type. You'd hand (key, value) pairs to Berkeley DB, and we'd call your comparator to compare keys during tree descent. The key you hand us is an arbitrarily complex data structure, but we don't care. You get another benefit from Berkeley DB -- we eliminate the 8K limit on tuple size. For large records, we break them into page-sized chunks for you, and we reassemble them on demand. Neither PostgreSQL nor the user needs to worry about this, it's a service that just works. A single record or a single key may be up to 4GB in size. > 3. How can we implement gist, rtree AND (multi-key) BTREE access methods > using btree and hash access methods provided by SDB?! You'd build gist and rtree on top of the current buffer manager, much as rtree is currently implemented on top of the lower-level page manager in PostgreSQL. Multi-key btree support is there already, as is multi-key extended linear hashing. In exchange for having to build a new rtree AM, you'd get high-performance persistent queues for free. I'd argue that queues are more generally useful than rtrees. I understand that you have users who need rtrees. I wrote that access method in Postgres, and used it extensively for geospatial indexing during the Sequoia 2000 project. I'm a big fan. Nevertheless, there are more database customers looking for fast queues than are looking for spatial indices. > 1,2,3 => we have to preserve our access methods (and ability to add new!). Again, you can add new access methods in Berkeley DB in the same way that you do for PostgreSQL now. > Now, about WAL. What is WAL? 
WAL *mostly* is set of functions to > write/read log (90% implemented) + *access method specific* redo/undo > functions... to be implemented anyway, because of conclusion above. You wouldn't need to rewrite the current access-method undo and redo functions in Berkeley DB; they're there, and they work. You'd need to do that work for the new access methods you want to define, but as you note, that work is required whether you roll your own or use Berkeley DB. I encourage you to think hard about the amount of work that's really required to produce a commercial-grade recovery and transaction system. This stuff is extremely hard to get right -- you need to design, code, and test for very high-concurrency, complex workloads. The log is a new source of contention, and will be a gate to performance. The log is also a new way to consume space endlessly, so you'll want to think about backup and checkpoint support. With Berkeley DB, you get both today. Our backup support permits you to do on-line backups. Backups don't acquire locks and don't force a shutdown. Testing this stuff is tricky. For example, you need to prove that you're able to survive a crash that interrupts the three internal page writes that you do in the btree access method on a page split. Postgres (when I wrote the Btree AM) carefully ordered those writes to guarantee no loss of data, but it was possible to crash with the children written and the parent lost. The result is an inefficiency in the tree structure that you'll never recover, but that you can deal with at read time. This is an example of a case that Berkeley DB gets right. The advantage of Berkeley DB is that we've got five years of commercial deployment behind us. The code is in production use, and is known to support terabyte-sized databases, hundreds of concurrent threads with arbitrary read/write mixes, and system and application crashes. 
By putting the log and the data on separate spindles, we are able to survive loss of either device without losing committed data. Big companies have made significant bets on the software by deploying it in mission-critical applications. It works. Plus, we're continuing to work on the code, and we're paid real money to do that. We're able to deliver significant new features and performance improvements about three times a year. All of that said, I'd boil Vadim's message down to this: + With Berkeley DB, you'd need to reimplement multi-version concurrency control, and that's an opportunity to introduce new bugs. + With PostgreSQL, you'll need to implement logging and recovery, and that's an opportunity to introduce new bugs. I don't think that either alternative presents insurmountable difficulties. Which you choose depends on the technical issues and on your willingness to integrate code from outside the project into PostgreSQL's internals, to a degree that you've never done before. Regards, mike === Subject: RE: [HACKERS] Berkeley DB... From: "Mikheev, Vadim" <vmikheev@SECTORBASE.COM> Date: Fri, 26 May 2000 14:04:49 -0700 > > Well, I've dropped index but INSERTs still take 70 sec and > > COPY just 1sec -:((( > > Well, for those that have fsync turned off we could actually > avoid most of the writes, couldn't we? Just leave the page > marked dirty. We would only need to write each new page once. > The problem as I see it is, that we don't have a good place > where the writes would actually be done. Now they are obviously > done after each insert. I've run test without fsync and with all inserts in *single* transaction - there should be no write after each insert... Vadim === Subject: Re: [HACKERS] Berkeley DB... From: Bruce Momjian <pgman@candle.pha.pa.us> Date: Fri, 26 May 2000 19:41:44 -0400 (EDT) 
> > > Well, I've dropped index but INSERTs still take 70 sec and > > > COPY just 1sec -:((( > > > > Well, for those that have fsync turned off we could actually > > avoid most of the writes, couldn't we? Just leave the page > > marked dirty. We would only need to write each new page once. > > The problem as I see it is, that we don't have a good place > > where the writes would actually be done. Now they are obviously > > done after each insert. > > I've run test without fsync and with all inserts in *single* > transaction - there should be no write after each insert... Watch out. I think Vadim is settled into San Francisco and is getting fired up again... :-) === Subject: RE: [HACKERS] Berkeley DB... From: "Hiroshi Inoue" <Inoue@tpf.co.jp> Date: Sat, 27 May 2000 12:11:58 +0900 -----Original Message----- > From: pgsql-hackers-owner@hub.org > [mailto:pgsql-hackers-owner@hub.org] On Behalf Of Mikheev, Vadim > > > We might have part of the story in the recently noticed fact that > > each insert/update query begins by doing a seqscan of pg_index. > > > > I have done profiles of INSERT in the past and not found any really > > spectacular bottlenecks (but I was looking at a test table with no > > indexes, so I failed to see the pg_index problem :-(). Last time > > I did it, I had these top profile entries for inserting 100,000 rows > > of 30 columns apiece: > > Well, I've dropped index but INSERTs still take 70 sec and > COPY just 1sec -:((( > Did you run vacuum after dropping indexes? Because DROP INDEX doesn't update relhasindex of pg_class, planner/executor may still look up pg_index. === Subject: RE: [HACKERS] Berkeley DB... From: "Mikheev, Vadim" <vmikheev@SECTORBASE.COM> Date: Sat, 27 May 2000 21:44:52 -0700 > > Well, I've dropped index but INSERTs still take 70 sec and > > COPY just 1sec -:((( > > > > Did you run vacuum after dropping indexes? > Because DROP INDEX doesn't update relhasindex of pg_class, > planner/executor may still look up pg_index. 
Actually, I dropped and re-created table without indices... Vadim === Subject: Re: [HACKERS] Berkeley DB... From: Tom Lane <tgl@sss.pgh.pa.us> Date: Sun, 28 May 2000 01:28:04 -0400 Mikheev, Vadim" <vmikheev@SECTORBASE.COM> writes: >>>> Well, I've dropped index but INSERTs still take 70 sec and >>>> COPY just 1sec -:((( Mebbe so, but you can't blame it all on parse/plan overhead. I did some experimentation on this with current sources, using a test case of inserting 100,000 rows of 16 columns (the regression "tenk1" table's contents repeated 10 times). Each test was started with a freshly created empty table. The initial runs were done with all postmaster options except -F defaulted. All numbers are wall-clock time in seconds; the "+" column is the time increase from the previous case: load via COPY, fsync off: 0 indexes 24.45s 1 index 48.88s + 24.43 2 indexes 62.65s + 13.77 3 indexes 96.84s + 34.19 4 indexes 134.09s + 37.25 load via INSERTs, fsync off, one xact (begin/end around all inserts): 0 indexes 194.95s 1 index 247.21s + 52.26 2 indexes 269.69s + 22.48 3 indexes 307.33s + 37.64 4 indexes 352.72s + 45.39 load via INSERTs, fsync off, separate transaction for each insert: 0 indexes 236.53s 1 index 295.96s + 59.43 2 indexes 323.40s + 27.44 [ got bored before doing 3/4 index cases ... ] load via INSERTs, fsync on, separate transactions: 0 indexes 5189.99s [ don't want to know how long it will take with indexes :-( ] So while the parse/plan overhead looks kinda bad next to a bare COPY, it's not anything like a 70:1 penalty. But an fsync per insert is that bad and worse. I then recompiled with -pg to learn more about where the time was going. One of the useful places to look at is calls to FileSeek, since mdread, mdwrite, and mdextend all call it. To calibrate these numbers, the table being created occupies 2326 pages and the first index is 343 pages. 
Inserts (all in 1 xact), no indexes: 0.00 0.00 1/109528 init_irels [648] 0.00 0.00 85/109528 mdread [592] 0.01 0.00 2327/109528 mdextend [474] 0.01 0.00 2343/109528 mdwrite [517] 0.23 0.00 104772/109528 _mdnblocks [251] [250] 0.0 0.24 0.00 109528 FileSeek [250] Inserts (1 xact), 1 index: 0.00 0.00 1/321663 init_irels [649] 0.00 0.00 2667/321663 mdextend [514] 0.10 0.00 55478/321663 mdread [277] 0.11 0.00 58096/321663 mdwrite [258] 0.38 0.00 205421/321663 _mdnblocks [229] [213] 0.1 0.60 0.00 321663 FileSeek [213] COPY, no indexes: 0.00 0.00 1/109527 init_irels [431] 0.00 0.00 84/109527 mdread [404] 0.00 0.00 2327/109527 mdextend [145] 0.00 0.00 2343/109527 mdwrite [178] 0.07 0.00 104772/109527 _mdnblocks [77] [83] 0.0 0.07 0.00 109527 FileSeek [83] COPY, 1 index: 0.00 0.00 1/218549 init_irels [382] 0.00 0.00 2667/218549 mdextend [220] 0.07 0.00 53917/218549 mdread [106] 0.08 0.00 56542/218549 mdwrite [99] 0.14 0.00 105422/218549 _mdnblocks [120] [90] 0.0 0.30 0.00 218549 FileSeek [90] The extra _mdnblocks() calls for the inserts/1index case seem to be from the pg_index scans in ExecOpenIndices (which is called 100000 times in the inserts case but just once in the COPY case). We know how to fix that. Otherwise the COPY and INSERT paths seem to be pretty similar as far as actual I/O calls go. The thing that jumps out here, however, is that it takes upwards of 50000 page reads and writes to prepare a 343-page index. Most of the write calls turn out to be from BufferReplace, which is pretty conclusive evidence that the default setting of -B 64 is not enough for this example; we need more buffers. 
At -B 128, inserts/0index seems about the same, inserts/1index traffic is 0.00 0.00 1/270331 init_irels [637] 0.01 0.00 2667/270331 mdextend [510] 0.06 0.00 29798/270331 mdread [354] 0.06 0.00 32444/270331 mdwrite [277] 0.40 0.00 205421/270331 _mdnblocks [229] [223] 0.1 0.52 0.00 270331 FileSeek [223] At -B 256, inserts/1index traffic is 0.00 0.00 1/221849 init_irels [650] 0.00 0.00 2667/221849 mdextend [480] 0.01 0.00 5556/221849 mdread [513] 0.01 0.00 8204/221849 mdwrite [460] 0.37 0.00 205421/221849 _mdnblocks [233] [240] 0.0 0.40 0.00 221849 FileSeek [240] At -B 512, inserts/1index traffic is 0.00 0.00 1/210788 init_irels [650] 0.00 0.00 25/210788 mdread [676] 0.00 0.00 2667/210788 mdextend [555] 0.00 0.00 2674/210788 mdwrite [564] 0.27 0.00 205421/210788 _mdnblocks [248] [271] 0.0 0.28 0.00 210788 FileSeek [271] So as long as the -B setting is large enough to avoid thrashing, there shouldn't be much penalty to making an index. I didn't have time to run the COPY cases but I expect they'd be about the same. Bottom line is that where I/O costs are concerned, the parse/plan overhead for INSERTs is insignificant except for the known problem of wanting to rescan pg_index for each INSERT. The CPU overhead is significant, at least if you're comparing no-fsync performance ... but as I commented before, I doubt we can do a whole lot better in that area for simple INSERTs. regards, tom lane === Subject: Re: [HACKERS] Berkeley DB... 
From: "Zeugswetter Andreas" <andreas.zeugswetter@telecom.at> Date: Fri, 26 May 2000 21:43:20 +0200 % cumulative self self total > > time seconds seconds calls ms/call ms/call name > > 30.08 290.79 290.79 _mcount > > 6.48 353.46 62.67 30702766 0.00 0.00 AllocSetAlloc > > 5.27 404.36 50.90 205660 0.25 0.25 write > > 3.06 433.97 29.61 30702765 0.00 0.00 MemoryContextAlloc > > 2.74 460.45 26.48 100001 0.26 0.74 yyparse > > 2.63 485.86 25.41 24300077 0.00 0.00 newNode > > 2.22 507.33 21.47 3900054 0.01 0.01 yylex > > 1.63 523.04 15.71 30500751 0.00 0.00 PortalHeapMemoryAlloc > > 1.31 535.68 12.64 5419526 0.00 0.00 hash_search > > 1.18 547.11 11.43 9900000 0.00 0.00 expression_tree_walker > > 1.01 556.90 9.79 3526752 0.00 0.00 SpinRelease > > > > While the time spent in memory allocation is annoying, that's only about > > ten mallocs per parsed data expression, so it's unlikely that we will be > > able to improve on it very much. (We could maybe avoid having *three* > > levels of subroutine call to do an alloc, though ;-).) Unless you are > > smarter than the flex and bison guys you are not going to be able to > > improve on the lex/parse times either. The planner isn't even showing > > up for a simple INSERT. Not much left, unless you can figure out how > > to write and commit a tuple with less than two disk writes. > It will be interesting to see the speed differences between the > 100,000 inserts above and those which have been PREPARE'd using > Karel Zak's PREPARE patch If we believe the above output, the win won't be very noticeable. It is the writes (and the Allocs) we have to get rid of. The above is much faster if you do: begin work; 100000 inserts ....; commit work; Andreas === Subject: Re: [HACKERS] Berkeley DB... From: Peter Eisentraut <peter_e@gmx.net> Date: Mon, 29 May 2000 00:18:01 +0200 (CEST) Tom Lane writes: > I have done profiles of INSERT in the past and not found any really > spectacular bottlenecks I am still at a loss on how to make profiles. 
The latest thing that happened to me is that the postmaster gave me a `Profiling timer expired' message and never started up. Any idea? === Subject: Re: [HACKERS] Berkeley DB... From: Tom Lane <tgl@sss.pgh.pa.us> Date: Sun, 28 May 2000 22:15:29 -0400 Peter Eisentraut <peter_e@gmx.net> writes: > I am still at a loss on how to make profiles. The latest thing that > happened to me is that the postmaster gave me a `Profiling timer expired' > message and never started up. Any idea? Dunno ... PROFILE=-pg works for me ... Normally there's a special startup file that the compiler is supposed to know to link instead of the usual crt0.o, when you link with -pg. Possibly there's something wrong with yours? === Subject: Re: [HACKERS] Berkeley DB... From: "Matthias Urlichs" <smurf@noris.de> Date: Fri, 26 May 2000 10:16:31 +0200 Hi, Mike Mascari: > DML statements vs. MySQL. As Vadim points out in his comparison > of COPY vs. INSERT, something is *wrong* with the time it takes > for PostgreSQL to parse, plan, rewrite, and optimize. In my thoroughly unscientific opinion, the problem may well be the fact that PostgreSQL recurses the whole process, i.e. it is looking up attributes of one table in a bunch of other tables. MySQL, by contrast, has three files per table -- one with the data, one with _all_ the indices, and one .frm file with all the other metadata you would ever want to know about a table. That metadata file is mapped into shared memory space by the first task that opens a table, and it stays there. The data and index files also stay open until they're flushed. Since MySQL is multithreaded, opening a new connection is extremely cheap. By contrast, PostgreSQL does more than 30 open() calls when I connect to it. (*) It's still lots faster than some other databases I might mention, though... Access control is done by a bunch of tables in the "mysql" database, but these are 100% cached. 
One nice side effect of this is that it's very easy to access tables from another database. Just say "select * from foo.bar". (*) The list: /data//pg_options /etc/passwd /etc/group /data//PG_VERSION /data//pg_database /data//base/test/PG_VERSION /data//base/test/pg_internal.init /data//pg_log /data//pg_variable /data//base/test/pg_class /data//base/test/pg_class_relname_index /data//base/test/pg_attribute /data//base/test/pg_attribute_relid_attnum_index /data//base/test/pg_trigger /data//base/test/pg_am /data//base/test/pg_index /data//base/test/pg_amproc /data//base/test/pg_amop /data//base/test/pg_operator /data//base/test/pg_index_indexrelid_index /data//base/test/pg_operator_oid_index /data//base/test/pg_index_indexrelid_index /data//base/test/pg_trigger_tgrelid_index /data//pg_shadow /data//pg_database /data//base/test/pg_proc /data//base/test/pg_proc_proname_narg_type_index /data//base/test/pg_type /data//base/test/pg_type_oid_index /data//base/test/pg_proc_oid_index /data//base/test/pg_rewrite /data//base/test/pg_user /data//base/test/pg_attribute_relid_attnam_index /data//base/test/pg_operator_oprname_l_r_k_index /data//base/test/pg_class_oid_index /data//base/test/pg_statistic /data//base/test/pg_statistic_relid_att_index === Subject: RE: [HACKERS] Berkeley DB... From: Andreas Zeugswetter <andreas.zeugswetter@telecom.at> Date: Sun, 28 May 2000 09:30:38 +0200 On Fri, 26 May 2000, Mikheev, Vadim wrote: > > > Well, I've dropped index but INSERTs still take 70 sec and > > > COPY just 1sec -:((( > > > > Well, for those that have fsync turned off we could actually > > avoid most of the writes, could'nt we ? Just leave the page > > marked dirty. We would only need to write each new page once. > > The problem as I see it is, that we don't have a good place > > where the writes would actually be done. Now they are obviously > > done after each insert. 
> > I've run test without fsync and with all inserts in *single* > transaction - there should be no write after each insert... Yes, but if you don't do the inserts in one big transaction and don't issue transaction statements (no begin or commit) then you get the behavior I described. === Subject: RE: [HACKERS] Berkeley DB... From: "Hiroshi Inoue" <Inoue@tpf.co.jp> Date: Mon, 29 May 2000 13:51:42 +0900 -----Original Message----- > From: Mikheev, Vadim [mailto:vmikheev@SECTORBASE.COM] > > > > Well, I've dropped index but INSERTs still take 70 sec and > > > COPY just 1sec -:((( > > > > > > > Did you run vacuum after dropping indexes? > > Because DROP INDEX doesn't update relhasindex of pg_class, > > planner/executor may still look up pg_index. > > Actually, I dropped and re-created table without indices... > Oops, aren't you testing in 6.5.3? ExecOpenIndices() always refers to pg_index in 6.5.x. Currently it doesn't refer to pg_index if relhasindex is false. === Subject: Re: [HACKERS] Berkeley DB... From: Karel Zak <zakkr@zf.jcu.cz> Date: Mon, 29 May 2000 16:57:04 +0200 (CEST) > It will be interesting to see the speed differences between the > 100,000 inserts above and those which have been PREPARE'd using > Karel Zak's PREPARE patch. Perhaps a generic query cache could be My test: postmaster: -F -B 2000 rows: 100,000 table: create table (data text); data: 37B for each line --- all is in one transaction native insert: 66.522s prepared insert: 59.431s - 11% faster IMHO parsing/optimizing is relatively easy for a simple INSERT. The query (plan) cache will probably save time for complicated SELECTs with functions ...etc. (like queries that need to look at system tables during parsing). 
For example: insert into tab values ('some data' || 'somedata' || 'some data'); native insert: 91.787s prepared insert: 45.077s - 50% faster (Note: This second test was faster, because I stopped the X-server and postgres had more memory :-) The best way for large and simple data inserting is (forever) COPY; no faster way exists. pg's query paths: native insert: parser -> planner -> executor -> storage prepared insert: parser (for execute stmt) -> executor -> storage copy: utils (copy) -> storage > amongst other things). I'm looking forward to when the 7.1 branch > occurs... :-) Me too. Karel === Subject: RE: [HACKERS] Berkeley DB... From: "Mikheev, Vadim" <vmikheev@SECTORBASE.COM> Date: Mon, 29 May 2000 15:08:51 -0700 > So while the parse/plan overhead looks kinda bad next to a bare COPY, > it's not anything like a 70:1 penalty. But an fsync per insert is Isn't it because your table has 16 columns and my table has only 2? > that bad and worse. Of course -:) 2Hiroshi - yes, I've used 6.5.3... ===
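For reference, the PREPARE'd insert path Karel benchmarks above can be sketched as follows. The exact syntax of his patch may differ; this is just the general shape of a prepared statement, with illustrative names:

```sql
-- parse/plan happens once, at PREPARE time
PREPARE ins (text) AS INSERT INTO tab VALUES ($1);

-- each subsequent row skips the parser/planner and goes
-- straight to the executor
EXECUTE ins ('some data');
EXECUTE ins ('more data');
```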