This article will expose how to block specific resources (HTTP requests, CSS, video, images) from loading in Playwright. Playwright is Puppeteer's successor with the ability to control Chromium, Firefox, and Webkit. So I'd call it the second one of the most widely used web scraping and automation tools with headless browser support.
Java is one of the most popular and high demanded programming languages nowadays. It allows creating highly-scalable and reliable services as well as multi-threaded data extraction solutions. Let's check out the main concepts of web scraping with Java and review the most popular libraries to setup your data extraction flow.
In this article, we'll take a look at how to submit forms using Playwright. This knowledge might be beneficial while scraping the web, as it allows to get the information from the target web page, which requires providing parameters before.
In this article, we will share several ideas on how to download files with Playwright. Automating file downloads can sometimes be confusing. You need to handle a download location, download multiple files simultaneously, support streaming, and even more. Unfortunately, not all the cases are well documented. Let's go through several examples and take a deep dive into Playwright's APIs used for file download.
In this article, we will learn how to create a simple e-commerce search API with multiple platform support: eBay and Amazon. AutoScraper and FastAPi provide the ability to create a powerful JSON API for the date. With Playwright's help, we'll extend our scraper and avoid blocking by using ScrapingAnt's web scraping API.
Web scraping a website with the actually supported or other browsers has a real benefit in ensuring that the scraper will not be banned by the fingerprint or the behavioral pattern. Playwright already provides full support for Chromium, Firefox, and WebKit out of the box without installing the browsers manually, but since most of the users out there use Google Chrome or Microsoft Edge instead of the open-source Chromium variant, in some scenarios, it's safer to use them to emulate a more realistic browser environment.
In this article, we'd like to share the current state of Playwright integration with Python and several helpful code snippets for understanding the code techniques.