I am following this example for scraping a news website. When I check the type returned in the author's case, it is a `scrapy.selector.unified.SelectorList`. In my case, since the data of interest is enclosed in `<script>` tags, I managed to extract and parse it into a list with the Python code below:

```python
fetch('https://newswebsite.com/news/national')
data = re.findall("<script type=.application.ld.json. id=.listing-ld.>{.@graph..+?),.@context.:.http:..schema.org..<.script>", response.body.decode("utf-8"), re.S)

# Convert the list to a string before converting it to JSON
jsonData = json.loads(''.join(data))
```

Since this returns a list, I cannot keep following the example to implement item loaders.

Could you guide me on which Python concepts are in use in the code below, so that I can familiarise myself with them and adapt them to my use case? Why is the item loaded into the item loader before being parsed with the CSS selector (`.add_css`)?

```python
from itemloaders.processors import TakeFirst, MapCompose
from scrapy.loader import ItemLoader

class ChocolateProductLoader(ItemLoader):
    default_output_processor = TakeFirst()
    price_in = MapCompose(lambda x: x.split("£")[-1])
    url_in = MapCompose(lambda x: 'https://www.chocolate.co.uk' + x)
```

```python
import scrapy
from chocolatescraper.itemloaders import ChocolateProductLoader
from chocolatescraper.items import ChocolateProduct

class ChocolateSpider(scrapy.Spider):
    # The name of the spider
    name = 'chocolatespider'

    # These are the urls that we will start scraping
    start_urls = ['https://www.chocolate.co.uk/collections/all']

    def parse(self, response):
        products = response.css('product-item')

        for product in products:
            chocolate = ChocolateProductLoader(item=ChocolateProduct(), selector=product)
            chocolate.add_css('name', "a.product-item-meta__title::text")
            chocolate.add_css('price', 'span.price', re='<span class="price">\n <span class="visually-hidden">Sale price</span>(.*)</span>')
            chocolate.add_css('url', 'div.product-item-meta a::attr(href)')
            yield chocolate.load_item()

        next_page = response.css('[rel="next"] ::attr(href)').get()
        if next_page is not None:
            next_page_url = 'https://www.chocolate.co.uk' + next_page
            yield response.follow(next_page_url, callback=self.parse)
```
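For context on the two processor concepts the loader above uses: `MapCompose(f, g, ...)` builds an input processor that pipes every extracted value through the given functions in turn, and `TakeFirst` is an output processor that returns the first non-empty value when the item is loaded. A simplified pure-Python sketch of the idea (not the actual `itemloaders` source, which additionally flattens iterables and drops `None` results):

```python
def take_first(values):
    # Return the first value that is neither None nor an empty string.
    for v in values:
        if v is not None and v != '':
            return v

def map_compose(*functions):
    # Build a processor that pipes every input value through each
    # function in turn and collects the results.
    def processor(values):
        out = []
        for v in values:
            for f in functions:
                v = f(v)
            out.append(v)
        return out
    return processor

# Mirrors price_in = MapCompose(lambda x: x.split("£")[-1])
price_in = map_compose(lambda x: x.split("£")[-1])

print(price_in(["Sale £9.50", "£8.00"]))              # ['9.50', '8.00']
print(take_first(price_in(["Sale £9.50", "£8.00"])))  # '9.50'
```

So when the spider calls `add_css('price', ...)`, every matched value passes through `price_in`, and `default_output_processor = TakeFirst()` means `load_item()` keeps only the first cleaned value per field.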
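On adapting the pattern to already-parsed JSON: an `ItemLoader` is not limited to `add_css` — it also has `add_value()`, which accepts plain Python values and runs them through the same input/output processors, so no selector is needed. A hedged sketch of that idea (the `@graph`/`headline` layout and the `NewsArticleLoader`/`NewsArticle` names are assumptions, since the real JSON-LD structure isn't shown):

```python
import json

# Hypothetical JSON-LD payload shaped like the question's '@graph' list;
# the keys on your actual site will differ.
jsonData = json.loads('''
{"@graph": [
  {"@type": "NewsArticle", "headline": "Story one", "url": "/news/1"},
  {"@type": "NewsArticle", "headline": "Story two", "url": "/news/2"}
]}
''')

# Inside a Scrapy callback, the loader-based version might look like:
#
#   for node in jsonData["@graph"]:
#       loader = NewsArticleLoader(item=NewsArticle())  # hypothetical names
#       loader.add_value('headline', node.get('headline'))
#       loader.add_value('url', node.get('url'))
#       yield loader.load_item()
#
# The same traversal without Scrapy, yielding plain dicts:
items = [
    {"headline": node.get("headline"), "url": node.get("url")}
    for node in jsonData["@graph"]
]
print(items[0]["headline"])
```

This is why the example loads the item into the loader first: the loader is just a container that accumulates raw values (from CSS, XPath, or plain `add_value` calls) and applies the processors when `load_item()` is finally called.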