
[Python] AWS DynamoDB download speed with Boto3

Discussion in 'Python' started by Stack, October 25, 2024 at 07:42.


    I am using Python and Boto3 to download logs from a health-tracking device. I wrote a Python app that runs a paginated query for a given user and timeframe:


    from boto3.dynamodb.conditions import Key

    def paginated_query(self, tstart, tend, user):
        # Build a lazy page iterator; pages are requested only when it is iterated.
        paginator = self.datapoints_table.meta.client.get_paginator('query')
        page_iterator = paginator.paginate(
            TableName=self.table_name,
            KeyConditionExpression=Key('userId').eq(user) & Key('timestamp').between(tstart, tend)
        )
        return page_iterator



    The query itself works and executes within a few milliseconds.

    Then, to download the data and process it with Python, I iterate over the pages and collect the items into a list:

    page_iterator = self.paginated_query(tstart, tend, user)

    log_list = []
    for p in page_iterator:  # each iteration downloads one page
        items = p.get('Items', [])
        log_list.extend(items)
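To see where the time goes in a loop like the one above, the wait for each page can be timed separately from the work done on it. The following is only a sketch: `timed_pages` and `download_with_timing` are hypothetical helper names, not part of Boto3, and they accept any iterable of page dictionaries, such as the `page_iterator` returned by `paginated_query`.

```python
import time


def timed_pages(page_iterator):
    """Yield (page, seconds), where seconds is the time spent waiting for that page."""
    it = iter(page_iterator)
    while True:
        start = time.perf_counter()
        try:
            page = next(it)  # for a Boto3 paginator, the request happens here
        except StopIteration:
            return
        yield page, time.perf_counter() - start


def download_with_timing(page_iterator):
    """Collect all items and record how long each page took to arrive."""
    log_list, waits = [], []
    for page, seconds in timed_pages(page_iterator):
        log_list.extend(page.get("Items", []))
        waits.append(seconds)
    return log_list, waits
```

If the sum of `waits` accounts for nearly all of the elapsed time, the loop is waiting on page requests rather than on local processing.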


    This last snippet takes a very long time, depending on how much data I am trying to download. I tried processing the pages in parallel instead of sequentially in a for loop, but it made no difference to the execution time. I measured a maximum download rate from DynamoDB/AWS of 1.8 MB/s, which seems much slower than what my connection could achieve. I can only assume there is a bottleneck on the AWS side.
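One point worth noting about the parallel attempt: the pages of a single Query are chained, because each request needs the `LastEvaluatedKey` returned by the previous one, so they cannot be fetched concurrently. A possible way to get real parallelism is to split the timeframe into non-overlapping sub-ranges and run one independent paginated query per sub-range. The sketch below assumes integer timestamps and a low-level client such as `self.datapoints_table.meta.client` from the post; `split_range`, `parallel_fetch` and `make_fetcher` are hypothetical helper names. Whether this helps depends on where the bottleneck actually is (table capacity, network, or item deserialization).

```python
from concurrent.futures import ThreadPoolExecutor


def split_range(tstart, tend, parts):
    """Split the inclusive integer range [tstart, tend] into up to `parts`
    contiguous, non-overlapping inclusive sub-ranges."""
    total = tend - tstart + 1
    if total <= 0:
        return []
    parts = max(1, min(parts, total))
    step, extra = divmod(total, parts)
    ranges, lo = [], tstart
    for i in range(parts):
        hi = lo + step - 1 + (1 if i < extra else 0)
        ranges.append((lo, hi))
        lo = hi + 1  # BETWEEN is inclusive, so the next range starts one past hi
    return ranges


def parallel_fetch(fetch_range, tstart, tend, workers=8):
    """Run fetch_range(lo, hi) for each sub-range in its own thread and
    concatenate the results in time order."""
    ranges = split_range(tstart, tend, workers)
    if not ranges:
        return []
    with ThreadPoolExecutor(max_workers=len(ranges)) as pool:
        chunks = pool.map(lambda r: fetch_range(*r), ranges)  # map keeps input order
        return [item for chunk in chunks for item in chunk]


def make_fetcher(client, table_name, user):
    """Build a fetch_range(lo, hi) around the paginated query from the post.
    Assumption: `client` is the table resource's meta.client, as in the post."""
    from boto3.dynamodb.conditions import Key  # imported lazily; needs boto3

    def fetch_range(lo, hi):
        pages = client.get_paginator("query").paginate(
            TableName=table_name,
            KeyConditionExpression=Key("userId").eq(user) & Key("timestamp").between(lo, hi),
        )
        return [item for page in pages for item in page.get("Items", [])]

    return fetch_range
```

Usage would look like `parallel_fetch(make_fetcher(client, table_name, user), tstart, tend, workers=8)`. The sub-ranges are split evenly by time, so if the logs are unevenly distributed, some threads will finish much earlier than others.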

    Does anyone know how to make the data download faster?
