
[Python] Cloud Function running more than once because it takes too long to run [closed]

Discussion in 'Python' started by Stack, September 11, 2024 at 01:43.


    I'm currently working on a project that applies the same procedure to a lot of .csv files. Basically, I use one API to check whether the email addresses in certain rows of each .csv are valid, and I also use the ChatGPT API to correct some issues with company names and descriptions.

    For now, what I have is a Cloud Function that runs when a specific bucket changes (a file has finished being written) and uses that file, if it is a .csv, to do the whole process. The problem is that the function calls different APIs that take a lot of time, up to a whole hour if the .csv is long enough, and for now my only choice is to use those APIs.
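The trigger logic described above can be sketched roughly as follows. This is only an illustration, not the actual function: `handle_finalized_object` and `is_csv` are hypothetical names, and `data` is assumed to be a plain mapping that mirrors the `bucket` and `name` fields carried by a Cloud Storage "object finalized" event. In a real deployment the handler would be registered with the Functions Framework, and the slow API calls would go where the placeholder comment is.

```python
from typing import Mapping, Optional


def is_csv(object_name: str) -> bool:
    """Return True when the object name ends in .csv (case-insensitive)."""
    return object_name.lower().endswith(".csv")


def handle_finalized_object(data: Mapping[str, str]) -> Optional[str]:
    """Decide whether a finalized object should go through the pipeline.

    `data` stands in for the payload of a Cloud Storage "object finalized"
    event (assumed here to expose `bucket` and `name`).
    Returns the gs:// URI that would be processed, or None when skipped.
    """
    bucket = data.get("bucket")
    name = data.get("name")
    if not bucket or not name or not is_csv(name):
        # Not a .csv (or an incomplete payload): nothing to do.
        return None
    # Placeholder: the email-validation and OpenAI steps would run here.
    return f"gs://{bucket}/{name}"
```

For example, `handle_finalized_object({"bucket": "my-bucket", "name": "blabla.csv"})` yields a URI to process, while a `.txt` object is ignored. The bucket name here is made up.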

    I don't know why, at some point, Cloud Functions decides to start another execution of the function while the first one is still running. I'm not adding any file to the bucket, and I told Cloud Run that I want only 1 instance and 1 request per instance for now, so I don't understand why it starts again. Is this a limitation of Cloud Functions? Should I switch to another tool, or can it be fixed?

    The general configuration of the function is set like this:

    • Region: us-central1
    • Memory allocated: 512 MiB
    • CPU: 1
    • Timeout: 3,600 seconds
    • Minimum instances: 0
    • Maximum instances: 1
    • Concurrency: 1

    Here are some logs of what's happening:

    In this first part, the logs show that the first API finished its processing (it took at least 15 minutes). The next step should be to split that output into two dataframes and send them to ChatGPT for the corrections, but the OpenAI part takes so long that some weird POST requests appear, I don't know why, and then the function restarts.

    2024-09-09 17:06:28.482 ART - Successfully processed file.

    2024-09-09 17:06:28.482 ART - Splitting into the two final csvs...

    2024-09-09 17:06:28.576 ART - Successfully split the files.

    2024-09-09 17:06:28.577 ART - Initializing OpenAI API...

    2024-09-09 17:06:28.671 ART - Starting the prompts for each row...

    --- Here the OpenAI part is taking a long time, and these weird POSTs start to appear

    2024-09-09 17:06:39.764 ART - POST 429 14 B 0 ms APIs-Google; (+https://developers.google.com/webmasters/APIs-Google.html) url/?__GCP_CloudEventsMode=GCS_NOTIFICATION

    2024-09-09 17:14:13.795 ART - POST 429 14 B 0 ms APIs-Google; (+https://developers.google.com/webmasters/APIs-Google.html) url/?__GCP_CloudEventsMode=GCS_NOTIFICATION

    2024-09-09 17:23:15.269 ART - POST 429 14 B 0 ms APIs-Google; (+https://developers.google.com/webmasters/APIs-Google.html) url/?__GCP_CloudEventsMode=GCS_NOTIFICATION

    2024-09-09 17:29:51.327 ART - POST 429 14 B 0 ms APIs-Google; (+https://developers.google.com/webmasters/APIs-Google.html) url/?__GCP_CloudEventsMode=GCS_NOTIFICATION

    2024-09-09 17:37:40.456 ART - POST 200 130 B 1,552.3 s APIs-Google; (+https://developers.google.com/webmasters/APIs-Google.html) url/?__GCP_CloudEventsMode=GCS_NOTIFICATION
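To make the request entries above easier to compare, here is a small sketch that pulls the fields out of each one. It assumes (this is an interpretation, not something the console confirms) that every entry is laid out as HTTP method, status code, response size in bytes, then latency, with or without spaces between them. `parse_request_entry` is a hypothetical helper written for this post. Read that way, the first four entries are HTTP 429 ("Too Many Requests") responses and the last one is a 200 that took about 1,552 seconds.

```python
import re
from typing import Optional, Tuple

# Assumed layout: "POST" + status + size + "B" + latency + unit ("ms" or "s").
# The \s* parts let the pattern match both the spaced and the fused form.
_ENTRY = re.compile(
    r"POST\s*(\d{3})\s*([\d,.]+)\s*B\s*([\d,.]+)\s*(ms|s)"
)


def parse_request_entry(line: str) -> Optional[Tuple[int, int, float]]:
    """Return (status, size_in_bytes, latency_in_seconds), or None if no match."""
    match = _ENTRY.search(line)
    if match is None:
        return None
    status = int(match.group(1))
    size = int(match.group(2).replace(",", ""))
    latency = float(match.group(3).replace(",", ""))
    if match.group(4) == "ms":
        latency /= 1000.0  # normalise milliseconds to seconds
    return status, size, latency


rejected = parse_request_entry("17:06:39.764 ART - POST42914 B0 msAPIs-Google;")
accepted = parse_request_entry("17:37:40.456 ART - POST200130 B1,552.3 sAPIs-Google;")
print(rejected, accepted)
```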

    --- Here it just restarted the function with the same .csv

    2024-09-09 17:37:40.478 ART - Starting the process for the file: blabla.csv

    2024-09-09 17:37:40.478 ART - Creating Storage Client.

    2024-09-09 17:37:40.593 ART - Storage Client created successfully.

    2024-09-09 17:37:40.593 ART - Opening .csv...

    --- And as you continue reading the logs, the function basically kept running in the other instance and finished when the OpenAI part of the process finished, but the new instance that Google Cloud started for no apparent reason also ran.

    When the first API finishes downloading the report, the process moves on to the OpenAI step. The problem is that while that is happening, the function suddenly starts again, logging "Starting the process for the file: blabla.csv", and that file is the same one OpenAI is currently processing; it is just taking some time.

    So basically, I mainly need to know why Cloud Functions decides to run my function again instead of waiting for the other instance to finish, or at least killing that other instance.
