
[SQL] Databricks - Geospatial Data

Discussion in 'Other Languages' started by Stack, November 6, 2024 at 06:42.

  1. Stack

    Stack, Participating Member

    We have a shapefile that we loaded into a table. Using Mosaic, we stored our geocode values as strings; each value is a list of multipolygon values, for example:

    MULTIPOLYGON (((19.5059070000000006 59.4480390000000014, 19.5089129999999997 59.4449729999999974, 19.5087739999999989 59.4425499999999971, 19.5063069999999996 59.4386119999999991, 19.5048250000000003 59.4396679999999975, 19.5004950000000008 59.4395019999999974, 19.4944050000000004 59.4416360000000026, 19.4947769999999991 59.4430910000000026, 19.4993049999999997 59.4451729999999969, 19.4974220000000003 59.4462359999999990, 19.4984310000000001 59.4474480000000014, 19.5016140000000000 59.4471959999999982, 19.5059070000000006 59.4480390000000014)),


    Many latitude/longitude pairs are included in the list above. The column name is Geo.

    While transforming to our next stage, we want to find which multipolygon each individual record's latitude/longitude falls into, so we used the CASE expression below:

    case
        when st_contains(
            st_geomfromwkt(regs.geo),
            st_setsrid(
                st_point(base_hex.center_point_lon, base_hex.center_point_lat),
                4326
            )
        ) then 1
        else 0
    end as locate_region
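    For context, a point-in-polygon predicate like this typically runs inside a join between the points table and the polygons table. A minimal sketch of the full query (the table names regs and base_hex and their columns are assumed from the post; the cross join is what makes the 1M-row workload expensive, since every point is tested against every polygon):

    ```sql
    -- Sketch only: table and column names are assumed from the post.
    -- The cross join evaluates st_contains for every point/polygon pair.
    SELECT
        base_hex.center_point_lon,
        base_hex.center_point_lat,
        regs.geo,
        CASE
            WHEN st_contains(
                st_geomfromwkt(regs.geo),
                st_setsrid(
                    st_point(base_hex.center_point_lon, base_hex.center_point_lat),
                    4326
                )
            ) THEN 1
            ELSE 0
        END AS locate_region
    FROM base_hex
    CROSS JOIN regs;
    ```
    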


    The overall record count is 1 million records. We are trying to create a table from this query in Databricks SQL, and it has been running for more than an hour.

    Is there an alternative approach to the CASE expression above? Do we need any kind of indexing, and how can we improve the performance? Please advise.
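    One common way to avoid the all-pairs comparison is to pre-filter with a spatial grid index such as H3, which Databricks exposes as built-in SQL functions (h3_polyfillash3 and h3_longlatash3, available in recent runtimes). The sketch below is an assumption-laden illustration, not a verified solution: the resolution (8 here) is an arbitrary tuning choice, and because polyfill cells only approximate a polygon, the exact st_contains check should still be applied, but now only to the candidate pairs that share a cell:

    ```sql
    -- Sketch of an H3 pre-filter (assumes Databricks built-in H3 functions;
    -- resolution 8 is an arbitrary choice and should be tuned to polygon size).
    WITH poly_cells AS (
        SELECT
            regs.geo,
            explode(h3_polyfillash3(regs.geo, 8)) AS cell  -- cover each polygon with H3 cells
        FROM regs
    ),
    point_cells AS (
        SELECT
            base_hex.*,
            h3_longlatash3(center_point_lon, center_point_lat, 8) AS cell
        FROM base_hex
    )
    SELECT
        p.*,
        c.geo,
        -- exact check only on candidates that landed in the same cell
        CASE
            WHEN st_contains(
                st_geomfromwkt(c.geo),
                st_setsrid(st_point(p.center_point_lon, p.center_point_lat), 4326)
            ) THEN 1
            ELSE 0
        END AS locate_region
    FROM point_cells p
    JOIN poly_cells c USING (cell);  -- equi-join instead of a cross join
    ```

    The equi-join on the H3 cell can be shuffled and partitioned efficiently, so the expensive geometric predicate runs on a small candidate set rather than on all 1M x N pairs.
    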

