1. This site uses cookies. By continuing to use this site, you are agreeing to our use of cookies. Learn More.

  2. Anuncie Aqui ! Entre em contato fdantas@4each.com.br

[SQL] Aggregate functions vs result set limiting performance impact on views

Discussão em 'Outras Linguagens' iniciado por Stack, Setembro 13, 2021.

  1. Stack

    Stack Membro Participativo

    The problem


    I ran into an performance issue selecting the highest id from a view that joins two tables, depending on the select statement I execute.

    Here's a sample setup:

    CREATE TABLE test (
    id BIGSERIAL NOT NULL,
    joincol VARCHAR,
    CONSTRAINT pk PRIMARY KEY ("id")
    );


    CREATE TABLE test2(
    joincol VARCHAR
    );


    CREATE INDEX ON "test" ("id");
    CREATE INDEX ON "test" ("joincol");
    CREATE INDEX ON "test2" ("joincol");

    CREATE VIEW testview AS
    (
    SELECT table1."id" AS id,
    table1."joincol" AS t1charcol,
    table2."joincol" AS t2charcol
    FROM "test" AS table1,
    "test2" AS table2
    WHERE
    table1."joincol" = table2."joincol"
    );

    What I found out


    Now I'm executing two statements which result in completely different execution plans and also different runtimes. The following statement executes in less than 100ms. As far as I understand the execution plan, the runtime is independent of the rowcount, since postgres iterates the rows one by one (starting at the highest id, using the index) until a join on a row is possible and immediately returns.

    SELECT id FROM testview ORDER BY ID DESC LIMIT 1;


    However this one takes over 1 second on average (depending on rowcount), since the two tables are "joined completely", before postgres uses the index to select the highest id.

    SELECT MAX(id) FROM testview;


    Please refer to this sample on dbfiddle to check the explain plans: https://www.db-fiddle.com/f/bkMNeY6zXqBAYUsprJ5eWZ/1

    My real environment


    On my real environment table1 contains only a hand full of rows (< 100), having unique values in joincol. table2 contains up to ~10M rows, where joincol always matches a value of table1's joincol. Table2's joincol is not nullable.

    I use postgreSQL in version 13.

    The actual question


    My question basically is, why does postgres not recognize that it could use a Index Scan Backward on row basis for the second select? Is there anything I could improve on the tables/indexes?

    Continue reading...

Compartilhe esta Página