1. Anuncie Aqui ! Entre em contato fdantas@4each.com.br

[SQL] How to better structure this complex multi-join CTE-based query

Discussão em 'Outras Linguagens' iniciado por Stack, Outubro 5, 2024 às 11:03.

  1. Stack

    Stack Membro Participativo

    I am building a billing system for a service provider that works as follows:

    • Delivered services (deliveries) are charged by time
    • The hourly rate for a delivery depends on
      • The client who received the service
      • The role of the person that delivered the service
      • A possible override of that role for the delivery
      • The hourly rate for the role at the point of delivery

    Here is a simplified version of how I modelled this in Postgres:

    CREATE TABLE client (
    id TEXT PRIMARY KEY
    );

    CREATE TABLE role (
    id TEXT PRIMARY KEY
    );

    CREATE TABLE rate (
    id TEXT PRIMARY KEY,
    client TEXT NOT NULL REFERENCES client(id),
    role TEXT NOT NULL REFERENCES role(id),
    valid_from DATE NOT NULL,
    hourly_rate FLOAT8 NOT NULL
    );

    CREATE TABLE person (
    id TEXT PRIMARY KEY,
    default_role TEXT NOT NULL REFERENCES role(id)
    );

    CREATE TABLE delivery (
    id TEXT PRIMARY KEY,
    delivered DATE NOT NULL,
    client TEXT NOT NULL REFERENCES client(id),
    person TEXT NOT NULL REFERENCES person(id),
    role_override TEXT
    );


    Here is some sample data:

    INSERT INTO role(id)
    VALUES
    ('cheap-role'),
    ('expensive-role');

    INSERT INTO person(id,default_role)
    VALUES
    ('cheap-person','cheap-role'),
    ('expensive-person','expensive-role');

    INSERT INTO client(id)
    VALUES
    ('client-1'),
    ('client-2');

    INSERT INTO rate(id, client, role, valid_from, hourly_rate)
    VALUES
    ('c1-cheap-pre-2000','client-1','cheap-role','1900-01-01', 11),
    ('c1-cheap-post-2000','client-1','cheap-role','2000-01-01', 21),
    ('c1-expensive-pre-2000','client-1','expensive-role','1900-01-01', 101),
    ('c1-expensive-post-2000','client-1','expensive-role','2000-01-01', 201),
    ('c2-cheap-pre-2000','client-1','cheap-role','1900-01-01', 12),
    ('c2-cheap-post-2000','client-1','cheap-role','2000-01-01', 22),
    ('c2-expensive-pre-2000','client-1','expensive-role','1900-01-01', 102),
    ('c2-expensive-post-2000','client-1','expensive-role','2000-01-01', 202);

    INSERT INTO delivery(id, client, delivered, person, role_override)
    VALUES
    ('1900','client-1', '1950-1-1','cheap-person',NULL),
    ('1900-or','client-1', '1950-1-1','cheap-person','expensive-role'),
    ('2000','client-1','2050-1-1','cheap-person',NULL),
    ('2000-or','client-1','2050-1-1','cheap-person','expensive-role');


    I now want a query that returns deliveries with the correct (effective) hourly rate – that is then multiplied by the duration to compute the cost of the delivery.

    Here is my current solutions (using CTEs to avoid lots of coalesced sub-queries):

    WITH delivery_role AS (
    SELECT
    delivery.id AS delivery_id,
    delivery.delivered AS delivery_delivered,
    delivery.client AS client_id,
    delivery.role_override AS override_role,
    person.default_role AS default_role,
    COALESCE(delivery.role_override,
    person.default_role) AS effective_role
    FROM
    delivery
    JOIN person ON person.id = delivery.person
    ),
    delivery_rate AS (
    SELECT DISTINCT ON (delivery_role.delivery_id)
    delivery_role.delivery_id AS delivery_id,
    override_billing_rate.hourly_rate AS override_hourly_rate,
    override_billing_rate.valid_from AS override_valid_from,
    default_billing_rate.hourly_rate AS default_hourly_rate,
    default_billing_rate.valid_from AS default_valid_from,
    effective_billing_rate.hourly_rate AS effective_hourly_rate,
    effective_billing_rate.valid_from AS effective_valid_from
    FROM
    delivery_role
    JOIN rate AS effective_billing_rate ON delivery_role.effective_role = effective_billing_rate.role
    AND effective_billing_rate.valid_from <= delivery_role.delivery_delivered
    AND effective_billing_rate.client = delivery_role.client_id
    JOIN rate AS default_billing_rate ON delivery_role.default_role = default_billing_rate.role
    AND default_billing_rate.valid_from <= delivery_role.delivery_delivered
    AND default_billing_rate.client = delivery_role.client_id
    LEFT JOIN rate AS override_billing_rate ON delivery_role.override_role = override_billing_rate.role
    AND override_billing_rate.client = delivery_role.client_id
    AND override_billing_rate.valid_from <= delivery_role.delivery_delivered
    AND override_billing_rate.client = delivery_role.client_id
    ORDER BY
    delivery_role.delivery_id,
    effective_billing_rate.valid_from DESC
    )
    SELECT
    delivery.id AS delivery_id,
    delivery.client AS client,
    delivery_role.delivery_id AS role_delivery,
    delivery_rate.delivery_id AS rate_delivery,
    delivery_role.default_role AS default_role,
    delivery_role.override_role AS override_role,
    delivery_role.effective_role AS effective_role,
    delivery_role.client_id AS client,
    delivery.delivered AS delivered,
    delivery_rate.default_hourly_rate AS default_hourly_rate,
    delivery_rate.default_valid_from AS default_valid_from,
    delivery_rate.override_hourly_rate AS override_hourly_rate,
    delivery_rate.override_valid_from AS override_valid_from,
    delivery_rate.effective_hourly_rate AS effective_hourly_rate,
    delivery_rate.effective_valid_from AS effective_valid_from,
    delivery_rate.effective_hourly_rate IS NULL as missing_rate
    FROM
    delivery
    JOIN delivery_role ON delivery_role.delivery_id = delivery.id
    LEFT JOIN delivery_rate ON delivery_rate.delivery_id = delivery.id
    LEFT JOIN role AS billing_role ON billing_role.id = delivery_role.effective_role;


    This seems to work and would be fine if all I wanted to do was use the effective hourly rate. I would, however, also like to see the default rate that would have applied to the delivery if the role had not been overriden. This does not get computed correctly because of the DISTINCT ON I use to find the valid effective rate (by ordering by the valid_from date)

    So my questions are:

    1. Can I somehow see the correct default rate using this approach?
    2. Is there a generally better approach to solving this problem?

    Thanks!

    Here is a fiddle: https://www.db-fiddle.com/f/qT4shgSTeTaR2EFvrGL8c5/0

    Continue reading...

Compartilhe esta Página