
[Python] Using Beautifulsoup To Scrape the data from a worldmap and store this into a csv-file

Discussion in 'Python' started by Stack, October 25, 2024 at 12:32.

  1. Stack

    Stack Participating Member

    I am trying to scrape the data from the site https://www.startupblink.com/startups in order to grab all the startups. I think this is a good chance to do it with Python and Beautiful Soup.

    Technically, we can use Python and Beautiful Soup to scrape the data from the website https://www.startupblink.com/startups

    What is needed: here is an overview of the steps.

    First we send a GET request to the website using the requests library in Python, then we parse the HTML content of the response with Beautiful Soup.

    Next we find the HTML elements that contain the startup data we are interested in, using Beautiful Soup's find or find_all methods.
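    As a quick illustration of find and find_all, here is a self-contained snippet run against a made-up bit of HTML (the tags, class names, and values are hypothetical, just to show the API):

```python
from bs4 import BeautifulSoup

# Hypothetical markup mimicking a listings page, purely for demonstration
html = """
<div class="startup-list-item"><a class="startup-link" href="https://a.example">Alpha</a></div>
<div class="startup-list-item"><a class="startup-link" href="https://b.example">Beta</a></div>
"""

soup = BeautifulSoup(html, 'html.parser')

# find_all returns every matching element; find returns only the first match
items = soup.find_all('div', {'class': 'startup-list-item'})
print(len(items))                  # 2
print(items[0].find('a').text)     # Alpha
print(items[1].find('a')['href'])  # https://b.example
```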

    Afterwards we extract the relevant information from those elements using Beautiful Soup's string or get methods. Finally we store the data in a format of our choice, such as a CSV file or a database (note: if we used pandas this would be a bit easier, I gather).
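    On the pandas note: assuming the rows have already been extracted into dicts (the values below are placeholders, not real scraped data), the storing step could look like this:

```python
import pandas as pd

# Placeholder rows standing in for what the scraping loop would collect
rows = [
    {'Name': 'Alpha', 'Description': 'AI tools', 'Location': 'Berlin', 'Website': 'https://a.example'},
    {'Name': 'Beta', 'Description': 'Fintech', 'Location': 'Lisbon', 'Website': 'https://b.example'},
]

# DataFrame handles the header row and quoting for us
df = pd.DataFrame(rows)
df.to_csv('startup_data.csv', index=False)
print(df.shape)  # (2, 4)
```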

    Here are some first ideas to get this started:

    import requests
    from bs4 import BeautifulSoup
    import csv

    # Send an HTTP request to the website's URL and retrieve the HTML content
    url = 'https://www.startupblink.com/startups'
    response = requests.get(url)

    # Parse the HTML content using Beautiful Soup
    soup = BeautifulSoup(response.content, 'html.parser')

    # Find all the startup listings on the page
    startup_listings = soup.find_all('div', {'class': 'startup-list-item'})

    # Create a CSV file to store the extracted data
    with open('startup_data.csv', mode='w', newline='') as file:
        writer = csv.writer(file)
        writer.writerow(['Name', 'Description', 'Location', 'Website'])

        # Loop through each startup listing and extract the relevant information
        for startup in startup_listings:
            name = startup.find('a', {'class': 'startup-link'}).text.strip()
            description = startup.find('div', {'class': 'startup-description'}).text.strip()
            location = startup.find('div', {'class': 'startup-location'}).text.strip()
            website = startup.find('a', {'class': 'startup-link'})['href']

            # Write the extracted data to the CSV file
            writer.writerow([name, description, location, website])

    At this point I think I have to rework the code: I get back only a tiny CSV file of 35 bytes.

    I will have to run more tests to make sure I have the right approach.
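    A 35-byte file is about the size of the header row alone, which suggests find_all matched nothing. A small helper can check how many elements the selector actually finds in the static HTML (same URL and class name as in the script above; whether that class exists on the live page is an assumption):

```python
import requests
from bs4 import BeautifulSoup

def count_listings(html):
    # Count elements matching the selector the script above relies on
    soup = BeautifulSoup(html, 'html.parser')
    return len(soup.find_all('div', {'class': 'startup-list-item'}))

try:
    response = requests.get('https://www.startupblink.com/startups', timeout=30)
    print(response.status_code, count_listings(response.text))
except requests.RequestException as exc:
    print('request failed:', exc)

# If the count is 0, the listings are probably rendered by JavaScript, so the
# data would need to come from the site's JSON/XHR endpoint (visible in the
# browser's network tab) or via a browser-automation tool such as Selenium.
```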

