Web browser automation with Python and Playwright

Oleg Kulyk

Co-Founder @ ScrapingAnt

In this article, we share the current state of Playwright's Python integration, along with several code snippets that illustrate the main techniques.

How to use Playwright for controlling Chromium, Firefox, or WebKit with Python

Playwright is a Python library to automate Chromium, Firefox and WebKit with a single API. Playwright is built to enable cross-browser web automation that is ever-green, capable, reliable and fast.

In comparison to other automation libraries like Selenium, Playwright offers:

  • Native emulation support for mobile devices
  • Cross-browser single API
  • Microsoft Open Source team maintenance
  • Scenarios that span multiple pages, domains, and iframes
  • Auto-wait for elements to be ready before executing actions (like click, fill)
  • Better developer experience by automatically installing the browsers
  • Native input events for mouse and keyboard or up-/downloading files
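
The auto-wait behaviour from the list above can be sketched as follows. This is a minimal example, assuming a current Playwright release (where the sync API lives in `playwright.sync_api`); the HTML snippet, the element IDs, and the helper name `click_late_button` are made up for illustration. The button is added to the page only after half a second, yet the code contains no explicit sleep, because `page.click` waits for the element to become actionable.

```python
# HTML for a throwaway test page (made up for this example): the button is
# added to the DOM only after 500 ms.
DELAYED_BUTTON_HTML = """
<div id="status">waiting</div>
<script>
  setTimeout(() => {
    const button = document.createElement('button');
    button.id = 'late';
    button.textContent = 'Click me';
    button.onclick = () => {
      document.getElementById('status').textContent = 'clicked';
    };
    document.body.appendChild(button);
  }, 500);
</script>
"""


def click_late_button(page):
    """Click a button that does not exist yet when the call is made."""
    page.set_content(DELAYED_BUTTON_HTML)
    # No sleep or polling loop: click() waits until '#late' is attached,
    # visible and enabled, or raises once the default timeout expires.
    page.click("#late")
    return page.inner_text("#status")


def main():
    # Imported here so the helper above stays importable without Playwright.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch()
        page = browser.new_page()
        print(click_late_button(page))
        browser.close()
```

To try it, install Playwright and its browsers (`pip install playwright`, then `playwright install`) and call `main()`.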

All of these features are also available in the Python integration. At the time of writing, Playwright for Python is in beta, but it already exposes most of the commonly used methods and functions. Since communication with browsers is mostly asynchronous, Playwright also provides an async interface alongside the synchronous one. You can pick whichever works best for you: the two are identical in capabilities and differ only in how you consume the API.

Also, most of these features are available in our API workers (except Microsoft maintenance).

Playwright Python examples

Let's look at the main Playwright features through the following examples:

Synchronous page screenshot

This code snippet navigates to scrapingant.com in Chromium, Firefox and WebKit, and saves 3 screenshots.

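# Recent Playwright releases expose the sync API in playwright.sync_api
# and use snake_case method names (new_page instead of newPage).
from playwright.sync_api import sync_playwright

with sync_playwright() as p:
    # Take one screenshot per browser engine.
    for browser_type in [p.chromium, p.firefox, p.webkit]:
        browser = browser_type.launch()
        page = browser.new_page()
        page.goto('http://scrapingant.com/')
        page.screenshot(path=f'scrapingant-{browser_type.name}.png')
        browser.close()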

Asynchronous page screenshot

The code snippet below does the same as above, but in an async way.

import asyncio
# Recent Playwright releases expose the async API in playwright.async_api.
from playwright.async_api import async_playwright

async def main():
    async with async_playwright() as p:
        # Same flow as the sync version, but every browser call is awaited.
        for browser_type in [p.chromium, p.firefox, p.webkit]:
            browser = await browser_type.launch()
            page = await browser.new_page()
            await page.goto('http://scrapingant.com/')
            await page.screenshot(path=f'scrapingant-{browser_type.name}.png')
            await browser.close()

asyncio.run(main())
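
Because the async interface returns awaitables, the three browsers do not have to be driven one after another. Here is a minimal sketch of taking the screenshots concurrently with `asyncio.gather`, assuming a current Playwright release where the async entry point lives in `playwright.async_api`. The helper names `screenshot_path`, `capture`, and `capture_all` are illustrative and not part of Playwright.

```python
import asyncio


def screenshot_path(browser_name: str) -> str:
    """Build the output file name for a given browser engine."""
    return f"scrapingant-{browser_name}.png"


async def capture(browser_type, url: str) -> str:
    """Launch one browser, save a screenshot of url, return the file path."""
    browser = await browser_type.launch()
    try:
        page = await browser.new_page()
        await page.goto(url)
        path = screenshot_path(browser_type.name)
        await page.screenshot(path=path)
        return path
    finally:
        # Close the browser even if navigation or the screenshot fails.
        await browser.close()


async def capture_all(url: str = "https://scrapingant.com/") -> list:
    # Imported here so the helpers above stay importable without Playwright.
    from playwright.async_api import async_playwright

    async with async_playwright() as p:
        # gather() schedules the three coroutines concurrently and returns
        # their results in the same order as the arguments.
        return await asyncio.gather(
            *(capture(bt, url) for bt in (p.chromium, p.firefox, p.webkit))
        )
```

With Playwright and its browsers installed, `asyncio.run(capture_all())` runs the three captures side by side and returns the list of saved file paths.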

How does ScrapingAnt look on a mobile browser?

Spoiler: Not as good as on a desktop one

But let's check it with a screenshot! The following snippet renders the page in a WebKit browser that emulates an iPhone:

from playwright.sync_api import sync_playwright

with sync_playwright() as p:
    # Built-in device descriptor (viewport, user agent, touch support, ...).
    iphone_11 = p.devices['iPhone 11 Pro']
    # headless=False opens a visible browser window.
    browser = p.webkit.launch(headless=False)
    context = browser.new_context(
        **iphone_11,
        locale='en-US'
    )
    page = context.new_page()
    page.goto('https://scrapingant.com')
    page.screenshot(path='scrapingant-iphone.png')
    browser.close()
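
The `**iphone_11` unpacking works because a device descriptor is a plain dictionary of context options. Below is a minimal sketch of combining such a descriptor with extra options. The descriptor values are placeholders rather than the real iPhone 11 Pro settings, and `context_options` is a hypothetical helper, not a Playwright function.

```python
# Placeholder descriptor with the same kind of keys that Playwright's Python
# device descriptors use; the values are NOT the real iPhone 11 Pro ones.
example_device = {
    "user_agent": "ExampleMobileBrowser/1.0",
    "viewport": {"width": 375, "height": 635},
    "device_scale_factor": 3,
    "is_mobile": True,
    "has_touch": True,
}


def context_options(device: dict, **overrides) -> dict:
    """Return a new dict of context options: descriptor plus overrides."""
    # Later keys win, so an override replaces the descriptor's value,
    # and the original descriptor is left untouched.
    return {**device, **overrides}


options = context_options(example_device, locale="en-US", color_scheme="dark")
print(sorted(options))
```

With a real descriptor from `p.devices`, the result could be passed on as `browser.new_context(**options)`. Merging into a dictionary first also avoids Python's "multiple values for keyword argument" error when an override has the same name as a key in the descriptor.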

Want more?

To learn more about the Python version of Playwright, visit the official GitHub page: https://github.com/microsoft/playwright-python

The original Node.js version (the Python and Node.js APIs look pretty much the same): https://github.com/microsoft/playwright

There is also a list of awesome Playwright resources: https://github.com/mxschmitt/awesome-playwright

The official website: https://playwright.dev/

Summary

The Python implementation of Playwright is still not as well known or as widely used as the original Node.js one, but Microsoft's maintenance and frequent releases keep making it more usable and more stable. Don't hesitate to help this awesome open-source library: if you encounter a bug or find a missing feature, feel free to file an issue on GitHub.

Our web scraping API runs thousands of headless browsers in the cloud, so you can connect to it and use it without setting anything up on your own.
