- Axios: Promise based HTTP client for the browser and node.js.
Features: XMLHttpRequests from the browser, HTTP requests from Node.js, Promise API, request and response interception, request and response transformation, automatic transforms for JSON data.
- Got: Human-friendly and powerful HTTP request library for Node.js.
Features: HTTP/2 support, Promise API, Stream API, Pagination API, Cookies (out-of-box), Progress events.
- Superagent: Small progressive client-side HTTP request library, and Node.js module with the same API, supporting many high-level HTTP client features.
Features: HTTP/2 support, Promise API, Stream API, Request cancelation, Follows redirects, Retries on failure, Progress events.
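All three clients expose a Promise API, so behavior such as retrying on failure can be layered on top when a client's feature list does not include it. The snippet below is a minimal sketch of that idea, not any library's built-in mechanism. It assumes a `client` object with a `get(url)` method that returns a promise resolving to an object with a `data` property (the shape of an Axios response). The names `getWithRetry`, `retries`, and `baseDelayMs` are hypothetical and chosen only for illustration.

```javascript
// Retry a GET request with exponential backoff.
// Assumption: `client.get(url)` returns a promise that resolves to an
// object with a `data` property, as an Axios response does.
// `getWithRetry`, `retries`, and `baseDelayMs` are hypothetical names.
async function getWithRetry(client, url, { retries = 3, baseDelayMs = 200 } = {}) {
  let lastError;
  for (let attempt = 0; attempt <= retries; attempt++) {
    try {
      const response = await client.get(url);
      return response.data;
    } catch (error) {
      lastError = error;
      if (attempt < retries) {
        // Double the wait after every failed attempt:
        // baseDelayMs, 2 * baseDelayMs, 4 * baseDelayMs, ...
        const delay = baseDelayMs * 2 ** attempt;
        await new Promise((resolve) => setTimeout(resolve, delay));
      }
    }
  }
  // Every attempt failed: surface the most recent error to the caller.
  throw lastError;
}
```

With Axios installed, a call might look like `getWithRetry(require("axios"), "https://example.com")`. A client whose responses use a different property (for example `body`) would need a small adapter around its `get` method.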
DOM manipulation and HTML parsing
- Cheerio: Fast, flexible & lean implementation of core jQuery designed specifically for the server.
- htmlparser2: A forgiving HTML/XML/RSS parser. The parser can handle streams and provides a callback interface.
This module started as a fork of the htmlparser module. The main difference is that htmlparser2 is intended to be used only with Node.js (it runs on other platforms using browserify). htmlparser2 was rewritten multiple times and, while it maintains an API that’s compatible with htmlparser in most cases, the projects no longer share any code.
- Puppeteer: Puppeteer is a Node library which provides a high-level API to control Chrome or Chromium over the DevTools Protocol. Puppeteer runs headless by default, but can be configured to run full (non-headless) Chrome or Chromium.
- Awesome resources for Puppeteer: https://github.com/transitive-bullshit/awesome-puppeteer
- Selenium: Selenium is an umbrella project encapsulating a variety of tools and libraries enabling web browser automation. Selenium specifically provides infrastructure for the W3C WebDriver specification — a platform and language-neutral coding interface compatible with all major web browsers.
- Playwright: Playwright is a Node library to automate Chromium, Firefox and WebKit with a single API. Playwright is built to enable cross-browser web automation that is ever-green, capable, reliable and fast.
- amazon-scraper: Useful tool to scrape product information from Amazon.
- app-store-scraper: Node.js module to scrape application data from the iTunes/Mac App Store.
- instagram-scraper: Since Instagram has removed the option to load public data through its API, this actor should help replace this functionality.
- google-play-scraper: Node.js module to scrape application data from the Google Play store.
- scrapedin: Scraper for LinkedIn full profile data. Unlike other scrapers, it’s working in 2020 with their new website.
- tiktok-scraper: Scrape and download useful information from TikTok.
Also, our scraping API is language agnostic, so you can try it even if you’re not very familiar with JS or Python.