
[Python] Convert PDF page to image with PyPDF2 and BytesIO

Discussion in 'Python' started by Stack, September 12, 2024.

  1. Stack

    Stack, Participating Member

    I have a function that extracts a page from a PDF file via PyPDF2 and should convert that first page to a PNG (or JPG) with Pillow (the PIL fork).

    from PyPDF2 import PdfFileWriter, PdfFileReader
    import os
    from PIL import Image
    import io

    # Open PDF Source #
    app_path = os.path.dirname(__file__)
    src_pdf = PdfFileReader(open(os.path.join(app_path, "../../../uploads/%s" % filename), "rb"))

    # Get the first page of the PDF #
    dst_pdf = PdfFileWriter()
    dst_pdf.addPage(src_pdf.getPage(0))

    # Create BytesIO #
    pdf_bytes = io.BytesIO()
    dst_pdf.write(pdf_bytes)
    pdf_bytes.seek(0)

    file_name = "../../../uploads/%s_p%s.png" % (name, pagenum)
    img = Image.open(pdf_bytes)
    img.save(file_name, 'PNG')
    pdf_bytes.flush()


    That results in an error:


    OSError: cannot identify image file <_io.BytesIO object at 0x0000023440F3A8E0>

    I found some threads on a similar issue ("PIL open() method not working with BytesIO"), but I cannot see what I am doing wrong here, since I already call pdf_bytes.seek(0) before opening the buffer.

    Any hints appreciated.
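
One likely explanation, offered as a sketch rather than a confirmed diagnosis: the buffer written by dst_pdf.write() still contains PDF data, and Pillow can write PDF files but has no decoder for reading them, so Image.open() cannot identify the stream no matter where the read position is. The stdlib-only snippet below checks the leading bytes of a buffer to show which format it actually holds. The helper name sniff_format, the SIGNATURES table, and the stand-in byte string are hypothetical and used only for illustration.

```python
import io

# Leading "magic" bytes for a few common formats.
SIGNATURES = {
    b"%PDF-": "pdf",
    b"\x89PNG\r\n\x1a\n": "png",
    b"\xff\xd8\xff": "jpeg",
}


def sniff_format(buffer: io.BytesIO) -> str:
    """Guess the format held in the buffer from its first bytes (hypothetical helper)."""
    position = buffer.tell()   # remember where the caller left the cursor
    buffer.seek(0)
    header = buffer.read(8)
    buffer.seek(position)      # restore the cursor so the caller is unaffected
    for magic, name in SIGNATURES.items():
        if header.startswith(magic):
            return name
    return "unknown"


# Stand-in for the bytes that dst_pdf.write(pdf_bytes) would produce (assumption).
pdf_bytes = io.BytesIO(b"%PDF-1.3\n% rest of the document\n")
pdf_bytes.seek(0)
print(sniff_format(pdf_bytes))
```

If the buffer reports "pdf", the page has to be rasterized by a PDF renderer before Pillow can save it as PNG. Packages such as pdf2image (which relies on Poppler) or PyMuPDF are commonly used for that step; which one fits depends on what can be installed in your environment.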

