[Python] Parallelize the validation of one epoch with the training of the next in PyTorch Lightning

Discussion in 'Python' started by Stack, October 1, 2024 at 09:02.

  1. Stack

    Stack Participating Member

    I am training a PyTorch Lightning model on a GPU. I validate it after every epoch, and each validation run takes about 30 minutes on a dataset big enough to give me a reliable estimate.

    I would like to speed up the process by running validation in parallel (on a CPU) while the next epoch trains on the GPU. Does PyTorch Lightning support this?

    The only solution I could think of is to run a parallel thread that waits for each epoch to finish (with the weights then stored in a checkpoint file) and validates from that checkpoint.
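
    A minimal sketch of that watcher idea, using only the standard library. The names `watch_checkpoints`, `validate_fn`, `ckpt_dir` and `poll_seconds` are hypothetical, and the sketch assumes that checkpoints are written as `*.ckpt` files and that a file is complete by the time it appears in the directory. It is not a built-in Lightning feature, just one way the polling could look:

```python
import threading
from pathlib import Path


def watch_checkpoints(ckpt_dir, validate_fn, stop_event, poll_seconds=1.0):
    """Call validate_fn(path) once for every new *.ckpt file in ckpt_dir.

    Runs until stop_event is set, then does one final scan so that a
    checkpoint written just before shutdown is not missed.
    """
    seen = set()
    while True:
        # Read the flag before scanning: if it is already set, this scan
        # is the final one.
        stopping = stop_event.is_set()
        for path in sorted(Path(ckpt_dir).glob("*.ckpt")):
            if path not in seen:
                seen.add(path)
                validate_fn(path)
        if stopping:
            break
        # Sleep, but wake up immediately if stop_event gets set.
        stop_event.wait(poll_seconds)


# Hypothetical wiring (assumptions, not tested against a real model):
#   stop = threading.Event()
#   worker = threading.Thread(
#       target=watch_checkpoints, args=("checkpoints/", validate_on_cpu, stop)
#   )
#   worker.start()
#   trainer.fit(model)      # training Trainer, e.g. with limit_val_batches=0
#   stop.set(); worker.join()
#
# where validate_on_cpu(path) might load the weights with
# MyModel.load_from_checkpoint(path, map_location="cpu") and run
# Trainer(accelerator="cpu").validate(model, dataloaders=val_loader).
```

    A thread shares the Python interpreter with the training loop, so running the same watcher in a separate process (or as a separate script) may interfere less with training. If checkpoints can be observed half-written, the watcher would also need a completeness check before loading them.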
