[Python] What is the fastest way to read in a large yaml file containing lists of lists?

Discussion in 'Python' started by Stack, September 12, 2024.

    I have a number of YAML files to read in, each containing a list of lists. Here is a way to generate some example data:

    from time import time
    import random
    import yaml

    # First make a list of lists
    N = 2**17
    lol = []
    for _ in range(N):
        lol.append([random.uniform(0, 2) for _ in range(10)])

    # Write the list of lists to a yaml file
    with open('data.yml', 'w') as outfile:
        yaml.dump(lol, outfile, default_flow_style=True)


    I want to read them in as quickly as possible. Unfortunately, PyYAML is slow.

    # Now time how long it takes to read it back in
    t = time()
    with open("data.yml", "r") as f:
        lol = yaml.safe_load(f)
    print(f"Reading took {round(time()-t, 2)} seconds")


    This takes over 60 seconds for me. The file is 27 MB in size.

    Is there a faster way to read in a YAML file of exactly this format?
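
    One possible direction, offered as a minimal sketch and not as a measured answer: a flow-style dump of plain floats, as produced by the example above, usually also happens to be valid JSON, so the standard-library json parser may be able to read it directly. The helper name load_lol is hypothetical. The sketch assumes the file contains only nested lists of finite numbers (no .inf or .nan, no YAML tags or anchors) and falls back to PyYAML, preferring the LibYAML-based CSafeLoader when the installation provides it.

```python
import json


def load_lol(path):
    """Load a flow-style YAML file of nested number lists.

    Hypothetical helper: tries the JSON parser first (an assumption that
    the file is also valid JSON), then falls back to PyYAML.
    """
    with open(path, "r") as f:
        text = f.read()
    try:
        # Flow-style output such as [[0.1, 1.5], [2.0e-05, 0.3]] is valid JSON.
        return json.loads(text)
    except json.JSONDecodeError:
        # Not valid JSON after all: use a real YAML parser.
        import yaml

        # CSafeLoader exists only when PyYAML was built with LibYAML.
        loader = getattr(yaml, "CSafeLoader", yaml.SafeLoader)
        return yaml.load(text, Loader=loader)
```

    Calling load_lol("data.yml") should return the same list of lists as yaml.safe_load for files of this shape. Whether it is fast enough would need to be timed on the real data.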
