Skip to main content

24 posts tagged with "javascript"

View All Tags

· 17 min read
Oleg Kulyk

How to download a file with Puppeteer?

Puppeteer, a powerful Node.js library, allows developers to control Chrome or Chromium over the DevTools Protocol. Its high-level API facilitates a wide range of web automation tasks, including file downloads. This guide aims to provide a comprehensive overview of setting up Puppeteer for automated file downloads, using various methods and best practices to ensure efficiency and reliability. Whether you're scraping data, automating repetitive tasks, or handling protected content, Puppeteer offers robust tools to streamline the process.

To get started with Puppeteer, you'll need Node.js installed on your machine and a basic understanding of JavaScript and Node.js. Once installed, Puppeteer provides several ways to download files, including using the browser's fetch feature, simulating user interaction, leveraging the Chrome DevTools Protocol (CDP), and combining Puppeteer with HTTP clients like Axios. Each method has its unique advantages and is suited for different use cases.

Throughout this guide, we'll explore detailed steps for configuring Puppeteer for file downloads, handling various file types and MIME types, managing download timeouts, and implementing error handling. Additionally, we'll cover advanced topics such as handling authentication, managing dynamic content, and monitoring download progress. By following these best practices and considerations, you can create robust and efficient file download scripts using Puppeteer.

For more detailed code examples and explanations, you can refer to the Puppeteer API documentation and other relevant resources mentioned throughout this guide.

· 10 min read
Oleg Kulyk

How to Use Proxies with NodeJS Axios

When scraping websites for all important data, developers are looking for, above all, privacy and security. Using proxies is the best and most effective way to do so without the risk of exposing your IP address, making it less likely that those sites will ban your address in future visits.

Axios is a popular JavaScript Node.js library frequently used for website scraping. It’s a popular tool used widely due to its fast and accurate downloading of website content.

· 9 min read
Oleg Kulyk

Puppeteer Debugging and Troubleshooting - Best Practices

Puppeteer is a powerful tool for automating web testing and scraping. However, it is still subject to problems and bugs like any other software.

It's crucial to have a well-thought-out plan for solving issues in place for times like these.

In this post, we'll explore some of the best practices for Puppeteer debugging and troubleshooting with Puppeteer.

· 5 min read
Oleg Kulyk

How to download images with NodeJS?

Working with images in NodeJS extends your web scraping capabilities, from downloading the image with an URL to retrieving photo attributes like EXIF. How to achieve the image download and obtain the data?

This article is a part of the series on image downloading with different programming languages. Check out the other articles in the series:

· 6 min read
Oleg Kulyk

How to download a file with Playwright?

In this article, we will share several ideas on how to download files with Playwright. Automating file downloads can sometimes be confusing. You need to handle a download location, download multiple files simultaneously, support streaming, and even more. Unfortunately, not all the cases are well documented. Let's go through several examples and take a deep dive into Playwright's APIs used for file download.