
[Python] PyTorch throwing GPU out of memory error in a bizarre way

Discussion in 'Python' started by Stack, October 3, 2024 at 17:42.

  1. Stack (Participating Member)

    I am trying to run a 770M-parameter model on an RTX 4070 (8 GB) with 64 GB of RAM, and I get this error:


    OutOfMemoryError: CUDA out of memory. Tried to allocate 146.00 MiB. GPU 0 has a total capacity of 8.00 GiB of which 0 bytes is free. Of the allocated memory 37.52 GiB is allocated by PyTorch, and 508.53 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation. See documentation for Memory Management (https://pytorch.org/docs/stable/notes/cuda.html#environment-variables)
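
    (For context, my understanding is that the allocator setting the message suggests has to be in place before CUDA is initialized, so setting it mid-script has no effect; a minimal sketch of applying it from Python, assuming nothing has touched the GPU yet:)

    import os

    # Must be set before the first CUDA call, or it is silently ignored
    os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "expandable_segments:True"

    import torch  # CUDA initializes only after the variable is set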

    I have a 300-sample dataset to fine-tune a t5-base model, which worked fine, but when I tried t5-large, it threw this error.

    The error appeared suddenly. Is there any practical fix?

    I tried torch.cuda.empty_cache() to clear the cache, as suggested, but it doesn't help.
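
    (As far as I know, torch.cuda.empty_cache() can only return cached blocks that nothing references anymore, so the usual pattern is to drop the references first; a rough sketch, where model and optimizer are placeholder names for whatever actually holds GPU tensors:)

    import gc
    import torch

    # Rebind (or del) whatever holds GPU tensors; these names are
    # placeholders for the real model/optimizer variables
    model, optimizer = None, None
    gc.collect()              # break any reference cycles still pinning tensors
    torch.cuda.empty_cache()  # return now-unreferenced cached blocks to the driver

    print(torch.cuda.memory_summary())  # what PyTorch still has allocated/reserved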

    The same t5-base model is now giving the same error, even though I ran it for training multiple times over the past few days without hitting it.
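
    (Since the report says 0 bytes free even for the smaller model, it may be worth checking what the driver itself sees before training starts; if most of the card is already gone at that point, a leftover process such as an old run or a notebook kernel is probably still holding memory. A quick check:)

    import torch

    # Free/total memory as reported by the CUDA driver for GPU 0,
    # versus what this process has allocated through PyTorch
    free, total = torch.cuda.mem_get_info(0)
    print(f"driver sees {free / 2**30:.2f} GiB free of {total / 2**30:.2f} GiB")
    print(f"this process allocated {torch.cuda.memory_allocated(0) / 2**30:.2f} GiB")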

