1. Anuncie Aqui ! Entre em contato fdantas@4each.com.br

[SQL] DBT Post_hook execution

Discussão em 'Outras Linguagens' iniciado por Stack, Novembro 8, 2024 às 06:32.

  1. Stack

    Stack Membro Participativo

    I'm encountering an issue with a macro in dbt that is meant to check if a table exists before executing a conditional delete. The macro looks like this:

    {% macro check_table_exists(project, staging_schema, dmt_table, audit_table) %}
    {%- set sql_query = "SELECT * FROM " ~ project ~ "." ~ staging_schema ~ ".INFORMATION_SCHEMA.TABLES WHERE table_name='" ~ dmt_table ~ "'" -%}
    {%- set result = dbt_utils.get_single_value(sql_query) -%}
    {{ log("result: " ~ result, info=true) }}
    {{ log(result is not none, info=true) }}

    {% if result is not none %}
    DELETE FROM {{ project ~ '.' ~ staging_schema ~ '.' ~ dmt_table }}
    WHERE date_extraction IN (
    SELECT DISTINCT date_extraction FROM {{ project ~ '.' ~ staging_schema ~ '.' ~ audit_table }}
    )
    {% else %}
    SELECT 1
    {% endif %}
    {% endmacro %}


    The macro is invoked in the following way:

    {% set table_exists = true %}
    {% set project = env_var('GCP_PROJECT_APP') %}
    {% set staging_schema = 'MP0_travel_exp' %}
    {% set audit_table = 'audit_dmt_fact_HotelLocFee_t' %}
    {% set dmt_table = 'dmt_fact_HotelLocFee_t' %}

    {{ config(
    post_hook=[check_table_exists(project, staging_schema, dmt_table, audit_table)]
    ) }}


    Issue: The check_table_exists macro always defaults to the else block, even when the table exists. Upon debugging, I found that the result variable first returns None and then later receives the table name. It seems that the post-hook executes with None, causing the macro to default to else and skip the intended DELETE operation.

    Is there a way to ensure that result correctly holds the table name from the start, or to delay the execution of the post_hook until the correct value is available? Any guidance on how to manage this behavior in dbt would be greatly appreciated.

    this is an exemple of Output with log :

    13:37:28 Registered adapter: bigquery=1.8.2
    13:37:29 result: None
    13:37:29 False
    13:37:29 result: None
    13:37:29 False
    13:37:29 [WARNING]: Configuration paths exist in your dbt_project.yml file which do not apply to any resources.
    There are 1 unused configuration paths:
    - models.mobi_pilot.MP0_exp.compliance
    13:37:29 Found 70 models, 2 seeds, 54 data tests, 47 sources, 613 macros
    13:37:29
    13:37:31 Concurrency: 16 threads (target='dev')
    13:37:31
    13:37:31 1 of 1 START sql table model MP0_travel_exp.audit_dmt_fact_HotelLocFee_t ....... [RUN]
    13:37:32 result: fco-data-sg-mobipilot-dev
    13:37:32 True
    13:37:37 1 of 1 OK created sql table model MP0_travel_exp.audit_dmt_fact_HotelLocFee_t .. [CREATE TABLE (3.8k rows, 907.0 KiB processed) in 5.81s]
    13:37:37
    13:37:37 Finished running 1 table model in 0 hours 0 minutes and 7.28 seconds (7.28s).
    13:37:37
    13:37:37 Completed successfully
    13:37:37
    13:37:37 Done. PASS=1 WARN=0 ERROR=0 SKIP=0 TOTAL=1


    FYI : I used different solutions for table verif such as adapter.get_relation

    Thank you!

    Continue reading...

Compartilhe esta Página