40 posts tagged with "javascript"

How to use rotating proxies with Puppeteer

March 4, 2021 · 4 min read

Co-Founder @ ScrapingAnt

Puppeteer is a high-level API for controlling headless Chrome. Most things that you can do manually in the browser can also be done with Puppeteer, so it quickly became one of the most popular web scraping tools in Node.js and Python. Many developers use it to extract data from single-page applications (SPAs) because it can execute client-side JavaScript. In this article, we are going to show how to set up a proxy in Puppeteer and how to spin up your own rotating proxy server.
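As a rough illustration of the proxy setup mentioned above, here is a minimal sketch of client-side round-robin rotation. It is not necessarily the approach taken in the full post. The proxy URLs and the helper names `createProxyRotator` and `buildLaunchOptions` are hypothetical; the only real pieces are Chromium's `--proxy-server` flag and the `args` / `headless` options that `puppeteer.launch()` accepts.

```javascript
// Hypothetical proxy endpoints; replace with real ones.
const PROXIES = [
  'http://proxy-1.example.com:8080',
  'http://proxy-2.example.com:8080',
  'http://proxy-3.example.com:8080',
];

// Returns a function that yields the proxies in round-robin order.
function createProxyRotator(proxies) {
  if (proxies.length === 0) {
    throw new Error('proxy list is empty');
  }
  let index = 0;
  return () => {
    const proxy = proxies[index];
    index = (index + 1) % proxies.length; // wrap around at the end of the list
    return proxy;
  };
}

// Builds an options object for puppeteer.launch() that routes
// browser traffic through the given proxy via Chromium's --proxy-server flag.
function buildLaunchOptions(proxy) {
  return { headless: true, args: [`--proxy-server=${proxy}`] };
}

const nextProxy = createProxyRotator(PROXIES);
const firstOptions = buildLaunchOptions(nextProxy());
console.log(firstOptions.args[0]);
```

With Puppeteer installed, each new browser could then be started with `puppeteer.launch(buildLaunchOptions(nextProxy()))`, so that every launch goes out through the next proxy in the list.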

How to use Microsoft Edge with Playwright

December 1, 2020 · 4 min read

Oleg Kulyk

Co-Founder @ ScrapingAnt

Scraping a website with a real, widely used browser has a clear benefit: it helps keep the scraper from being banned because of its fingerprint or behavioral pattern. Playwright already provides full support for Chromium, Firefox, and WebKit out of the box, with no need to install the browsers manually. But since most users run Google Chrome or Microsoft Edge rather than the open-source Chromium build, in some scenarios it's safer to use those browsers to emulate a more realistic browser environment.
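One way to target a branded browser is Playwright's `channel` launch option, which may differ from the approach described in the full post. The sketch below only builds the options object; the helper name `buildChromiumLaunchOptions` and the friendly names in the map are hypothetical, and the chosen browser is assumed to be installed on the machine already.

```javascript
// Maps a friendly browser name to a Playwright `channel` value.
const CHANNELS = new Map([
  ['chrome', 'chrome'],
  ['edge', 'msedge'],
]);

// Builds an options object for chromium.launch() that asks Playwright
// to drive an installed branded browser instead of bundled Chromium.
function buildChromiumLaunchOptions(browserName) {
  const channel = CHANNELS.get(browserName);
  if (!channel) {
    throw new Error(`unsupported branded browser: ${browserName}`);
  }
  return { channel, headless: true };
}

const edgeOptions = buildChromiumLaunchOptions('edge');
console.log(edgeOptions.channel);
```

With Playwright installed, the result could be passed as `chromium.launch(edgeOptions)`.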

How to Collect Data from TikTok

September 13, 2020 · 3 min read

Oleg Kulyk

Co-Founder @ ScrapingAnt

With all the news about TikTok being sold to US companies, scraping TikTok data has become a more pressing issue, since the service might be shut down.

HTML Parsing Libraries - JavaScript

September 6, 2020 · 5 min read

Oleg Kulyk

Co-Founder @ ScrapingAnt

HTML is a simple structured markup language, and anyone who writes a web scraper has to deal with HTML parsing. The goal of this article is to help you find the right tool for HTML processing. We are not going to present libraries for more specific tasks, such as article extractors, product extractors, or web scrapers.

Open Source Javascript Web Scraping

July 20, 2020 · 4 min read

Oleg Kulyk

Co-Founder @ ScrapingAnt

In this article, I’d like to list some of the most popular open-source JavaScript projects that can be useful for web scraping. The list includes both libraries and standalone niche scrapers that target a particular site (Amazon, iTunes, Instagram, Google Play, etc.).

Scraping with millions of browsers or Puppeteer Cluster

July 14, 2020 · 3 min read

Oleg Kulyk

Co-Founder @ ScrapingAnt

In this article, we’d like to introduce an awesome open-source web scraping solution for running a pool of Chromium instances using Puppeteer.
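To give a feel for what a pool means here, the following is a generic sketch of a fixed-size worker pool in plain JavaScript. It is not the API of the solution the post introduces: `runPool` and `fakeScrape` are hypothetical names, and the stand-in task returns a string where a real task would open a page in one of the pooled Chromium instances.

```javascript
// Runs `task` over `items` with at most `poolSize` tasks in flight at once.
// Results are returned in the same order as the input items.
async function runPool(items, poolSize, task) {
  const results = new Array(items.length);
  let next = 0; // index of the next unclaimed item

  async function worker() {
    while (next < items.length) {
      const index = next++; // claim an item before awaiting anything
      results[index] = await task(items[index], index);
    }
  }

  // Start no more workers than there are items.
  const workerCount = Math.min(poolSize, items.length);
  await Promise.all(Array.from({ length: workerCount }, () => worker()));
  return results;
}

// Stand-in task; a real one would drive a browser page.
const fakeScrape = async (url) => `scraped:${url}`;

const poolDone = runPool(
  ['https://example.com/a', 'https://example.com/b', 'https://example.com/c'],
  2,
  fakeScrape
);
poolDone.then((results) => console.log(results));
```

Each worker pulls the next unclaimed URL as soon as it finishes the previous one, so a slow page does not hold up the rest of the queue.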

How to run Playwright on AWS Lambda

July 6, 2020 · 4 min read

Oleg Kulyk

Co-Founder @ ScrapingAnt

In this article, I’d like to share a quick guide on how to run Playwright inside AWS Lambda. There are plenty of similar guides for Puppeteer, but only a few cover its successor from Microsoft.

Top Popular JavaScript Libraries for Web Scraping in 2024

June 30, 2020 · 4 min read

Oleg Kulyk

Co-Founder @ ScrapingAnt

Top 5 Popular Javascript Libraries for Web Scraping in 2024

We’d like to continue our series of posts about the Top 5 Popular Libraries for Web Scraping in 2024 with a new programming language: JavaScript.

JS is a well-known language with wide adoption and strong community support. It can be used for both client-side and server-side scraping scripts, which makes it well suited to writing scrapers and crawlers.

Most of the advantages of these libraries can also be obtained through a web scraping API, and some of the libraries can be used together with one.

So let’s check them out.

AngularJS site scraping. The easy deal with Puppeteer and Headless Chrome.

June 4, 2020 · 3 min read

Oleg Kulyk

Co-Founder @ ScrapingAnt

AngularJS is a common framework for building modern single-page applications, but how easy is it to scrape sites built with it? Let’s find out.

Amazon Product Scraping. Relatively Easy.

May 26, 2020 · 4 min read

Oleg Kulyk

Co-Founder @ ScrapingAnt

In this article, I’d like to share my experience with Amazon product scraping. The well-known Amazon marketplace offers deals on thousands of product types from thousands of sellers. The amount of data available to scrape is enormous and can be used for:

Market price comparison
Price change tracking
Analyzing product reviews
Copyright check
Finding the best products for selling or dropshipping
A lot of data science and machine learning stuff