1. Anuncie Aqui ! Entre em contato fdantas@4each.com.br

[SQL] How to Incrementally Append and Upsert Rows in Kedro directly to PostgreSQL DBt?

Discussão em 'Outras Linguagens' iniciado por Stack, Outubro 25, 2024 às 10:12.

  1. Stack

    Stack Membro Participativo

    I'm working on a Kedro project where I have a dataset defined in catalog.yml as follows:

    daily_stats_dataset:
    type: ${datasets.orm_table}
    orm_model: my_proj.schemas.sqla_schemas.DailyStats
    credentials: my_database


    monthly_stats_dataset:
    type: ${datasets.orm_table}
    orm_model: my_proj.schemas.sqla_schemas.MonthlyStats
    credentials: my_database


    I need to incrementally append new rows from my daily_stats_dataset to my monthly_stats_dataset. However the daily stats are aggregated each time and added to the monthly stats. This means that, when the monthly stats already exist, I need to overwrite that line. I want to ensure that I do not insert duplicate entries based on a date column. I would thus like to implement upsert functionality for existing rows.

    Is it possible to do this with Kedro, since the nodes look like this

    node(
    func=update_monthly_stats,
    inputs=daily_stats_dataset,
    outputs="monthly_stats_dataset",
    ),


    but the monthly_stats will be inserted, resulting in unique constraints to be violated.

    Continue reading...

Compartilhe esta Página