
[Python] Parse data from local html-file using bs4?

Discussion in 'Python' started by Stack, October 3, 2024 at 10:02.

  1. Stack

    Stack, Active Member

    I am trying to parse a local HTML document using the following code:

    import os, sys
    from bs4 import BeautifulSoup

    path = os.path.abspath(os.path.dirname(sys.argv[0]))
    fnHTML = os.path.join(path, "inp.html")
    page = open(fnHTML)
    soup = BeautifulSoup(page.read(), 'lxml')

    worker = soup.find("span")
    wHeadLine = worker.text.strip()
    wPara = worker.find_next("td").text.strip()
    print(wHeadLine)
    print(wPara)


    The output looks like this:

    Find your faves—faster
    We’ve made it easier than ever to see what’s on now and continue watching your recordings, favorite teams and more.


    But the text in the HTML file looks like this (see the picture):

    [IMG]

    Why is the text not printed with "—" and "We’ve"?
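
One likely explanation, offered as an assumption because neither inp.html nor the raw console output is shown: the file is saved as UTF-8, but open(fnHTML) is called without an encoding argument, so Python decodes it with the platform default (often cp1252 on Windows). Characters such as "—" and "’" occupy three bytes in UTF-8, and decoding them as cp1252 turns each one into a short run of unrelated characters. The sketch below reproduces that mismatch with the standard library only. SAMPLE and read_html are hypothetical names made up for this illustration, and the sample markup is a stand-in for the real file.

```python
import os
import tempfile

# Hypothetical sample standing in for inp.html (the real file is not shown).
SAMPLE = (
    "<span>Find your faves—faster</span>"
    "<table><tr><td>We’ve made it easier than ever</td></tr></table>"
)


def read_html(path, encoding="utf-8"):
    """Read an HTML file with an explicit encoding instead of the platform default."""
    with open(path, encoding=encoding) as page:
        return page.read()


# Write the sample as UTF-8 bytes, which is how most HTML files are saved.
with tempfile.NamedTemporaryFile("wb", suffix=".html", delete=False) as tmp:
    tmp.write(SAMPLE.encode("utf-8"))
    sample_path = tmp.name

# Decoding UTF-8 bytes as cp1252 simulates a Windows default and garbles the text.
garbled = read_html(sample_path, encoding="cp1252")
# Decoding with the encoding the file was actually saved in keeps the text intact.
correct = read_html(sample_path, encoding="utf-8")

os.remove(sample_path)
```

If this is the cause, passing encoding="utf-8" to open() in the original script should be enough, and the resulting string can be handed to BeautifulSoup as before. Another option is to open the file in binary mode ("rb") and pass the bytes to BeautifulSoup, which then tries to detect the encoding itself.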

