Playwright can be considered Puppeteer's successor with a similar API, so many developers prefer it for extracting data from single-page applications and avoiding anti-scraping measures while automating their data mining tasks. However, it sets up proxy parameters differently from Puppeteer. Before June 2020, it was a huge problem to make a proxy work across all the browsers, but, luckily, the API has since been unified so that proxy options are passed via a browser's launch method. Let's try it out for all the browsers:
The proxy server used in the examples below may be outdated by the time you read this article. You can find the freshest proxies on our Free proxy page.
It's possible to pass the proxy settings in the proxy property of the options object given to the browser's launch method.
As a result, you should see similar output from each browser.
Natively, each browser has its own way of accepting proxy settings; for example, Firefox requires a profile configuration file to set up a browser proxy. The unified launch option hides these differences.
It's also possible to pass proxy settings via command-line arguments, as we do with Puppeteer. Below you can find an example of the Chromium proxy options:
Other browsers also allow you to set up proxy parameters in their own native way, but the behaviour may differ between operating systems and browser versions.
The methods above set up the proxy for the whole browser session, not for an individual request or page. In our previous article, we explained how to set up your own rotating proxy server and route each request through it separately.
To simplify your web scraper and leave more time for the data mining tasks themselves, you might want to get rid of the infrastructure hell and focus on what you really want to achieve: extracting the data.