How to Collect Data from TikTok

· 3 min read
Oleg Kulyk

With so much news about TikTok being sold to US companies, and with the service possibly being shut down, scraping TikTok data has become a more pressing question.

In short, TikTok now has a sizeable real-world influence, especially considering that a typical user spends almost an hour per day watching videos on the platform. With this in mind, it is important to understand what TikTok shows to millions of eyeballs every day. To do that, we will need some data.

This guide shows how to collect useful data from TikTok using the Python library TikTokApi. The following topics will be covered:

  1. 👤 Collecting videos posted by a user
  2. ❤️ Collecting videos liked by a user
  3. 🏷️ Collecting videos by hashtag
  4. 📈 Collecting trending videos
  5. 🧑🏽‍🤝‍🧑🏽 Collecting a list of users (by a seed account)

0. 🤓 Prepare the dependencies

Rather than repeat the setup steps here, please check the official TikTokApi GitHub repository for up-to-date instructions: https://github.com/davidteather/TikTok-Api#getting-started

Also, to scrape at scale, we suggest using proxies to avoid being banned by TikTok. Check our free proxies list: https://scrapingant.com/free-proxies/
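
As a rough illustration of the proxy suggestion above, the sketch below rotates through a list of proxies and moves on to the next one when a request fails. Everything in it is an assumption made for illustration: `fetch_with_proxies` and the `fetch` callable are hypothetical names, and how a proxy is actually passed to TikTokApi depends on the installed version, so check the repository linked above.

```python
import itertools
import time


def fetch_with_proxies(fetch, proxies, max_attempts=3, delay_seconds=1.0):
    """Call fetch(proxy) with proxies taken round-robin until one call succeeds.

    `fetch` is a hypothetical callable supplied by the caller: it receives one
    proxy address and either returns the scraped data or raises an exception.
    """
    if not proxies:
        raise ValueError("at least one proxy is required")
    proxy_pool = itertools.cycle(proxies)
    last_error = None
    for attempt in range(max_attempts):
        proxy = next(proxy_pool)
        try:
            return fetch(proxy)
        except Exception as error:  # broad on purpose: any failure moves on to the next proxy
            last_error = error
            if attempt < max_attempts - 1:
                time.sleep(delay_seconds)  # short pause before the next attempt
    raise RuntimeError(f"all {max_attempts} attempts failed") from last_error
```

The request itself stays outside the helper, so the same rotation logic can wrap any of the calls shown in the sections below.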

1. 👤 Collecting videos posted by a user

To scrape videos from Kourtney Kardashian's TikTok account (@kourtneykardashian), here is what we need to do in Python:

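from TikTokApi import TikTokApi

api = TikTokApi()
n_videos = 100  # maximum number of videos to request
username = 'kourtneykardashian'  # account name without the leading @
# Fetch the videos posted by this account
user_videos = api.byUsername(username, count=n_videos)

print(user_videos)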

The user_videos object is now a list of video dictionaries (up to 100, as requested), and the output of print(user_videos) will look like the following:

[
{
"id":"6842416492261248262",
"desc":"",
"createTime":1593124239,
"video":{
"id":"awesome",
"height":1024,
"width":576,
"duration":50,
"ratio":"720p",
"cover":"https://p16-sign-sg.tiktokcdn.com/obj/tos-maliva-p-0068/4b45d1820df44e81971cb2981f159cf8_1593124242?x-expires=1600174800&x-signature=anmH2YurlaeKKYUv3fbXt0IIcEA%3D",
"originCover":"https://p16-sign-sg.tiktokcdn.com/obj/tos-maliva-p-0068/f4521a4cf921460e908d300033f13b3e_1593124241?x-expires=1600174800&x-signature=fJrnLur6fsrLdLzRrPMmRk2USrc%3D",
"dynamicCover":"https://p16-sign-sg.tiktokcdn.com/obj/tos-maliva-p-0068/abb62e51195f430daf261120aed1a196_1593124242?x-expires=1600174800&x-signature=QW9I2K%2BosgTDSb5wnaBC0avUuQE%3D",
"playAddr":"https://v77.tiktokcdn.com/81609c19fb240081e71e7a1ddebc7435/5f5e7872/video/tos/useast2a/tos-useast2a-pve-0068/f44d75f329f74133a9005e0c53303ab0/?a=1233&br=2128&bt=1064&cr=0&cs=0&cv=1&dr=0&ds=3&er=&l=202009131351280101901860143105F323&lr=tiktok_m&mime_type=video_mp4&qs=0&rc=anVubmY8a3dsdTMzZjczM0ApNzlnNzlpPDw5N2ZoPGZpZmdnNWowXmleXm5fLS0wMTZzcy4tYl4uXzFiMGEzMTJhX2I6Yw%3D%3D&vl=&vr=",
"downloadAddr":"https://v77.tiktokcdn.com/81609c19fb240081e71e7a1ddebc7435/5f5e7872/video/tos/useast2a/tos-useast2a-pve-0068/f44d75f329f74133a9005e0c53303ab0/?a=1233&br=2128&bt=1064&cr=0&cs=0&cv=1&dr=0&ds=3&er=&l=202009131351280101901860143105F323&lr=tiktok_m&mime_type=video_mp4&qs=0&rc=anVubmY8a3dsdTMzZjczM0ApNzlnNzlpPDw5N2ZoPGZpZmdnNWowXmleXm5fLS0wMTZzcy4tYl4uXzFiMGEzMTJhX2I6Yw%3D%3D&vl=&vr=",
"shareCover":[
....
]
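
Each item in that list is a nested dictionary, so for analysis it usually helps to flatten it. The sketch below pulls a few of the fields visible in the output above into one flat row. The helper name `summarize_video` is hypothetical, and treating `createTime` as a Unix timestamp in seconds is an assumption based on the sample value.

```python
from datetime import datetime, timezone


def summarize_video(video_dict):
    """Pick a few fields from one video dictionary into a flat row."""
    video_meta = video_dict.get("video", {})
    created = video_dict.get("createTime")
    return {
        "id": video_dict.get("id"),
        "description": video_dict.get("desc", ""),
        # Assumption: createTime is a Unix timestamp in seconds (UTC)
        "created_at": (
            datetime.fromtimestamp(created, tz=timezone.utc).isoformat()
            if created is not None
            else None
        ),
        "duration": video_meta.get("duration"),
        "width": video_meta.get("width"),
        "height": video_meta.get("height"),
    }


# Trimmed copy of the first item from the output above
sample = {
    "id": "6842416492261248262",
    "desc": "",
    "createTime": 1593124239,
    "video": {"height": 1024, "width": 576, "duration": 50},
}
row = summarize_video(sample)
print(row)
```

With real data, `[summarize_video(v) for v in user_videos]` would give one row per video, ready to load into a table or a CSV file.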

2. ❤️ Collecting videos liked by a user

Let's continue working with Kourtney Kardashian's TikTok account and check which videos this account has liked:

from TikTokApi import TikTokApi

api = TikTokApi()
n_videos = 100  # maximum number of videos to request
username = 'kourtneykardashian'

# Fetch the videos this account has liked
liked_videos = api.userLikedbyUsername(username, count=n_videos)

print(liked_videos)

3. 🏷️ Collecting videos by hashtag

Let's check out what videos we can scrape by the #kardashian hashtag:

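from TikTokApi import TikTokApi

api = TikTokApi()
n_videos = 100  # maximum number of videos to request
hashtag = 'kardashian'  # hashtag without the leading #

# Fetch videos tagged with this hashtag
hashtag_videos = api.byHashtag(hashtag, count=n_videos)

print(hashtag_videos)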

4. 📈 Collecting trending videos

Maybe you just need to collect trending videos for content analysis. The API makes that pretty simple:

from TikTokApi import TikTokApi

api = TikTokApi()
n_videos = 100  # maximum number of videos to request
# Fetch the currently trending videos
trending_videos = api.trending(count=n_videos)

print(trending_videos)
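
Since content analysis usually happens after collection, it can be convenient to store the results on disk instead of only printing them. The following is a minimal sketch using only the standard library. The helper names `save_videos` and `load_videos` are hypothetical, and it assumes the collected list holds plain, JSON-serializable dictionaries like the ones printed earlier.

```python
import json
from pathlib import Path


def save_videos(videos, path):
    """Write a list of video dictionaries to a UTF-8 JSON file and return the count."""
    with Path(path).open("w", encoding="utf-8") as handle:
        # ensure_ascii=False keeps emoji and non-Latin descriptions readable
        json.dump(videos, handle, ensure_ascii=False, indent=2)
    return len(videos)


def load_videos(path):
    """Read the list of video dictionaries back from the JSON file."""
    with Path(path).open("r", encoding="utf-8") as handle:
        return json.load(handle)
```

For example, `save_videos(trending_videos, "trending.json")` would keep a snapshot of the trending feed that can be reloaded later with `load_videos("trending.json")`.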

5. 🧑🏽‍🤝‍🧑🏽 Collecting a list of users from a seed account

This is probably the most interesting part for analysis and machine learning experiments. For example, we'd like to get all the users suggested for Kourtney Kardashian so that we can retrieve their TikToks later:

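from TikTokApi import TikTokApi

api = TikTokApi()
n_suggestions = 100  # maximum number of suggested users to request
username = 'kourtneykardashian'
# Look up the user ID of the seed account
user_id = api.getUser(username)['userInfo']['user']['id']
# Request the users suggested for that account
suggested = api.getSuggestedUsersbyID(count=n_suggestions, user_id=user_id)

print(suggested)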
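
To grow the list beyond one round of suggestions, the same lookup can be repeated for each newly found user. The sketch below shows that idea as a breadth-first walk. It is deliberately independent of the library: `collect_users` and `fetch_suggested` are hypothetical names, and because the structure of the `suggested` response is not shown here, extracting user IDs from it is left to the callable you pass in.

```python
from collections import deque


def collect_users(seed_user_id, fetch_suggested, max_users=50):
    """Breadth-first walk over suggested users, starting from one seed account.

    `fetch_suggested` is a hypothetical callable: it takes a user ID and
    returns an iterable of suggested user IDs.
    """
    seen = {seed_user_id}  # never revisit the seed or an already collected user
    queue = deque([seed_user_id])
    collected = []
    while queue and len(collected) < max_users:
        current = queue.popleft()
        for suggested_id in fetch_suggested(current):
            if suggested_id in seen:
                continue
            seen.add(suggested_id)
            collected.append(suggested_id)
            queue.append(suggested_id)
            if len(collected) >= max_users:
                break
    return collected


# Toy graph standing in for real API responses
toy_graph = {"seed": ["a", "b"], "a": ["b", "c"], "b": ["seed", "d"], "c": [], "d": []}
print(collect_users("seed", lambda user_id: toy_graph.get(user_id, []), max_users=3))
```

The `max_users` cap keeps the walk bounded, which matters because every step costs one request against TikTok.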

Summary

TikTok provides a large amount of useful data that can be converted into a machine learning dataset or used for manual analysis. This kind of media research also helps in understanding how trends develop, which is useful when trying to grow a popular account.
