
[Python] Python Scraper Not Working Correctly [closed]

Discussion in 'Python' started by Stack, October 7, 2024.


    I am not familiar with scrapers, so this may be easy for the pros. I am scraping this page: https://www.casamance.com/en/catalog/product/view/id/44450/s/venizia-71460918/ The product comes in different colors; when you click a color swatch (an img), the color name and other data change. I want to collect that data into a dictionary like: color name - reference number - etc.

    I tried to do it with bs4 but couldn't finish it; when I print the "soup", there is no HTML in it.


    import requests
    from bs4 import BeautifulSoup

    # Function to scrape product details from Casamance
    def scrape_casamance_product(url):
        response = requests.get(url)
        soup = BeautifulSoup(response.content, 'html.parser')

        # Scrape product name
        try:
            name = soup.find('span', class_='base', attrs={'data-ui-id': 'page-title-wrapper'}).text.strip()
        except AttributeError:
            name = "Could not find 'name'"

        # Scrape subcategory
        try:
            subcategory = soup.find('span', class_='cs-m-product__subtitle').text.strip()
        except AttributeError:
            subcategory = "Could not find 'subcategory'"

        # Scrape color
        try:
            color = None
            list_items = soup.find_all('li', class_='cs-m-list_item')
            for item in list_items:
                label = item.find('span', class_='left')
                if label and label.text.strip() == 'Colors':
                    color = item.find('span', class_='right').text.strip()
                    break
            if not color:
                color = "Could not find 'color'"
        except AttributeError:
            color = "Could not find 'color'"

        # Scrape reference
        try:
            reference = soup.find_all('span', class_='right')[1].text.strip()
        except (AttributeError, IndexError):
            reference = "Could not find 'reference'"

        # Scrape collection
        try:
            collection = soup.find('span', class_='right_collection').a.text.strip()
        except AttributeError:
            collection = "Could not find 'collection'"

        # Find the <ul> with class 'limited' to get the width
        try:
            limited_list = soup.find('ul', class_='cs-m-list cs-m-list--table js-box_limitable limited')
            width = None

            if limited_list:
                # Iterate through <li> elements to find width
                for item in limited_list.find_all('li', class_='cs-m-list_item'):
                    left = item.find('span', class_='left').text.strip()
                    if left == 'Width':
                        width = item.find('span', class_='right').text.strip()
                        break

            if not width:
                width = "Could not find 'width'"
        except Exception as e:
            width = "Could not find 'width'"

        # Scrape image URL
        img_link = soup.find('li', class_='no-thumbs cs-m-list_item swiper-slide cs-m-carousel_item swatch-option image swiper-slide-active')
        print(img_link)

        # Return scraped data as a dictionary
        return {
            'name': name,
            'subcategory': subcategory,
            'color': color,
            'reference': reference,
            'collection': collection,
            'width': width,
            'img_link': img_link
        }

    # Example usage
    product_url = 'https://www.casamance.com/en/catalog/product/view/id/44443/s/venizia-71460102/'  # Replace with actual product URL
    product_data = scrape_casamance_product(product_url)

    # Print the output
    print("Scraped Product Data:")
    for key, value in product_data.items():
        print(f"{key}: {value}")



    This code gives me this output:

    Scraped Product Data:
    name: VENIZIA
    subcategory: Wallcovering CASAMANCE
    color: Pierre bleue
    reference: 71460918
    collection: PALATINO
    width: 137 cm / 54 Inches
    img_link: None

    So it is not right: I want to get every color on the page and then get the info specific to each color. Thanks in advance!
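For reference, the target structure (one entry per color, each holding that color's details) could be built roughly as below. This is only a sketch under an assumption: that each color has its own product URL, which the two URLs in this post (ids 44450 and 44443, with different reference numbers) seem to suggest. `collect_color_variants` is a hypothetical helper name, and `scrape` stands for any function that returns a dict with a 'color' key, such as `scrape_casamance_product` above.

```python
from typing import Callable, Dict, Iterable


def collect_color_variants(
    variant_urls: Iterable[str],
    scrape: Callable[[str], Dict[str, object]],
) -> Dict[str, Dict[str, object]]:
    """Scrape each variant URL and index the results by color name.

    Hypothetical helper: `scrape` is assumed to return a dict that
    contains a 'color' key, like scrape_casamance_product does.
    """
    variants: Dict[str, Dict[str, object]] = {}
    for url in variant_urls:
        data = dict(scrape(url))  # copy so the scraper's dict is not mutated
        # Use the color as the key; fall back to the URL if it is missing.
        color = str(data.pop('color', url))
        # If two pages report the same color (for example the
        # "Could not find 'color'" placeholder), keep both entries.
        if color in variants:
            color = f"{color} ({url})"
        data['url'] = url
        variants[color] = data
    return variants
```

Calling `collect_color_variants(urls, scrape_casamance_product)` with a list of per-color URLs would then give a dictionary keyed by color name. Gathering that list of URLs is the part still missing; it is not shown here because the markup of the color swatches is not included in this post.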

