pgsql-like_and_indexes

This is part of The Pile, a partial archive of some open source mailing lists and newsgroups.



To: pgsql-general-postgresql.org@localhost.postgresql.org
Date: Thu, 05 Feb 2004 15:30:07 +0700
From: David Garamond <lists@zara.6.isreserved.com>
Subject: [GENERAL] Can LIKE use indexes or not?

use index (and ILIKE can't; so to do case-insensitive search you need to 
create a functional index on LOWER(field) and say: LOWER(field) LIKE 
'foo%').

However, EXPLAIN always says seq scan for the test data I'm using. I've 
done 'set enable_seqscan to off' and it still says seq scan. I was 
curious as to how the index will help this query:

db1=> set enable_seqscan to off;
SET
Time: 5.732 ms
db1=> explain select * from t where f like 'xx%';
                             QUERY PLAN
-------------------------------------------------------------------
  Seq Scan on t  (cost=100000000.00..100002698.90 rows=89 width=14)
    Filter: (f ~~ 'xx%'::text)
(2 rows)

db1=> explain select * from t where lower(f) like 'xx%';
                              QUERY PLAN
--------------------------------------------------------------------
  Seq Scan on t  (cost=100000000.00..100002893.68 rows=390 width=14)
    Filter: (lower(f) ~~ 'xx%'::text)
(2 rows)

The table is:

db1=> \d t
      Table "public.t"
  Column | Type | Modifiers
--------+------+-----------
  f      | text |
Indexes:
     "i1" unique, btree (lower(f))
     "i2" unique, btree (f)

It contains +- 250k rows of totally random 10-char-long strings 
(containing upper- & lowercase letters and numbers). Here's how the LIKE 
performs:

db1=> select * from t where f like 'xx%';
      f
------------
  xxEqfLZMkH
  xxBRRnLetJ
  ...
  xxFPYJEiYf
(98 rows)

Time: 452.613 ms

Would using an index potentially help the performance of this query, and 
if yes, how do I force Postgres to use the index?

db1=> select * from t where lower(f) like 'mmm%';
      f
------------
  MmmyEVmfSY
  MMmzolhHtq
  ...
  mMMWEQzlKm
(16 rows)

Time: 634.470 ms

-- 
dave


===

Date: Thu, 5 Feb 2004 08:58:50 -0000 (GMT)
Subject: Re: [GENERAL] Can LIKE use indexes or not?
From: "John Sidney-Woollett" <johnsw@wardbrook.com>
To: "David Garamond" <lists@zara.6.isreserved.com>
Cc: "pgsql-general" <pgsql-general@postgresql.org>

David Garamond said:
> Would using an index potentially help the performance of this query, and
> if yes, how do I force Postgres to use the index?
>
> db1=> select * from t where lower(f) like 'mmm%';

I suspect the fact that you're specifying the lower function on the column
data, ie lower(f), implies that the function has to be applied to every
row in the table in order to calculate the value prior to testing the like
condition.

I don't know enough about what you can and cannot do index-wise in PG, in
terms of creating an index based on a computed (upper/lower) value of a
column.

But you could consider adding an extra column to the table and a trigger
so that the trigger places an UPPER or LOWER version of the column "f"
into the new column.

Like searches would then be

select * from t where new_upper_f like upper('MMM%');

Provided that there is an index on the new column, new_upper_f, you should
avoid the full table scan. (I think, I haven't tested this out)...

===

Date: Thu, 05 Feb 2004 10:22:29 +0100
From: Jan Poslusny <pajout@gingerall.cz>
To: pgsql-general <pgsql-general@postgresql.org>
Subject: Re: [GENERAL] Can LIKE use indexes or not?

try this:
CREATE [ UNIQUE ] INDEX my_index ON t ( lower(f));

===

Date: Thu, 5 Feb 2004 09:20:18 -0000 (GMT)
Subject: Re: [GENERAL] Can LIKE use indexes or not?
From: "John Sidney-Woollett" <johnsw@wardbrook.com>
To: johnsw@wardbrook.com
Cc: "David Garamond" <lists@zara.6.isreserved.com>,

John Sidney-Woollett said:
> select * from t where new_upper_f like upper('MMM%');

I think I meant

select * from t where new_upper_f like 'MMM%';

or

select * from t where new_upper_f like upper('mmm%');

===

Date: Thu, 05 Feb 2004 17:43:31 +0800
To: David Garamond <lists@zara.6.isreserved.com>,
From: Lincoln Yeoh <lyeoh@pop.jaring.my>
Subject: Re: [GENERAL] Can LIKE use indexes or not?

If you use an exact = does it use the index?
e.g. explain select ... where lower(f)='xxxxxxxx'

If so it could be your locale setting.  On some versions of Postgresql like 
is disabled on non-C locales. On some versions of Postgresql on some 
platforms the default is a non-C locale. With version 7.4 you can 
workaround that: 
http://www.postgresql.org/docs/current/static/indexes-opclass.html

===

Date: Thu, 05 Feb 2004 17:25:00 +0700
From: David Garamond <lists@zara.6.isreserved.com>
To: Lincoln Yeoh <lyeoh@pop.jaring.my>
Cc: pgsql-general <pgsql-general@postgresql.org>
Subject: Re: [GENERAL] Can LIKE use indexes or not?

Lincoln Yeoh wrote:
> If you use an exact = does it use the index?
 > e.g. explain select ... where lower(f)='xxxxxxxx'

Yes it does.

> If so it could be your locale setting.  On some versions of Postgresql 
> like is disabled on non-C locales.

I'm using 7.4.1. These are the lines in postgresql.conf (it's basically 
pristine from the one created by initdb).

# These settings are initialized by initdb -- they may be changed
lc_messages = 'en_US.iso885915' #locale for system error message strings
lc_monetary = 'en_US.iso885915' #locale for monetary formatting
lc_numeric = 'en_US.iso885915'  #locale for number formatting
lc_time = 'en_US.iso885915'     #locale for time formatting

> On some versions of Postgresql on 
> some platforms the default is a non-C locale. With version 7.4 you can 
> workaround that: 
> http://www.postgresql.org/docs/current/static/indexes-opclass.html

Yes, that was the cause of the problem. I've now recreated the index 
using the varchar_pattern_ops:

db1=> create unique index i1 on t(i varchar_pattern_ops);
db1=> create unique index i2 on t(lower(i) varchar_pattern_ops);

and now EXPLAIN tells me the query uses Index scan:

db1=> explain select * from t where f like 'xx%';
                                       QUERY PLAN
--------------------------------------------------------------------------------------
  Index Scan using i1 on t  (cost=0.00..6.01 rows=322 width=14)
    Index Cond: ((f ~>=~ 'xx'::character varying) AND (f ~<~ 
'xy'::character varying))
    Filter: (f ~~ 'xx%'::text)
(3 rows)

db1=> explain select * from t where lower(f) like 'xx%';
                                              QUERY PLAN
----------------------------------------------------------------------------------------------------
  Index Scan using i2 on t  (cost=0.00..4049.64 rows=1421 width=14)
    Index Cond: ((lower(f) ~>=~ 'xx'::character varying) AND (lower(f) 
~<~ 'xy'::character varying))
    Filter: (lower(f) ~~ 'xx%'::text)
(3 rows)

> Hope that helps,

Yes it does, thanks. Apparently using the index does improve the speed:

db1=> select * from t where f like 'xx%';
      f
------------
  xxAGRrXrXr
  xxAwScNpWh
  ...
  xxyuFyyDtn
(98 rows)

Time: 9.679 ms

db1=> select * from t where lower(f) like 'xx%';
      f
------------
  xxaAvoarIZ
  XXadJWnXcK
  ...
  xXzynzWllI
(413 rows)

Time: 8.626 ms

===



the rest of The Pile (a partial mailing list archive)

doom@kzsu.stanford.edu