TikTok's possible sale to US companies has been all over the news, and the prospect of the service closing makes the question of scraping TikTok data more pressing.
How to scrape TikTok videos posted or liked by a user, collect a large user list from seed accounts, and collect trending videos, all with a simple API.
In short, TikTok now has a sizeable real-world influence, especially considering that a typical user spends almost an hour per day watching videos on the platform. With this in mind, it is important to understand what TikTok shows to millions of eyeballs every day. To do that, we will need some data.
So, in this guide, we'd like to show how to collect useful data from TikTok using the Python library TikTokApi. The following topics will be covered:
- Collecting videos posted by a user
- Collecting videos liked by a user
- Collecting videos by hashtag
- Collecting trending videos
- Collecting a list of users (by a seed account)
0. Prepare the dependencies
Rather than repeat the setup steps here, please check the official TikTokApi GitHub repository for up-to-date instructions: https://github.com/davidteather/TikTok-Api#getting-started
Also, to scrape at scale, we suggest using proxies to avoid a TikTok ban. Check our free proxies list: https://scrapingant.com/free-proxies/
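One simple way to spread requests across several proxies is round-robin rotation. The sketch below is only an illustration: the proxy addresses are placeholders, `make_rotator` is a hypothetical helper, and how the chosen proxy is handed to TikTokApi depends on the library version you have installed, so check its documentation for that step.

```python
from itertools import cycle


def make_rotator(proxies):
    """Return a function that yields the given proxies in round-robin order.

    Hypothetical helper for illustration; it is not part of TikTokApi.
    """
    if not proxies:
        raise ValueError("at least one proxy is required")
    pool = cycle(list(proxies))
    return lambda: next(pool)


# Placeholder addresses; replace them with proxies you can actually use.
next_proxy = make_rotator([
    "http://10.0.0.1:8080",
    "http://10.0.0.2:8080",
    "http://10.0.0.3:8080",
])

# Each request would pick the next proxy from the pool.
for _ in range(4):
    print(next_proxy())
```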
1. Collecting videos posted by a user
To scrape videos from Kourtney Kardashian's TikTok account (@kourtneykardashian), here is what we need to do in Python:
from TikTokApi import TikTokApi
api = TikTokApi()
n_videos = 100
username = 'kourtneykardashian'
user_videos = api.byUsername(username, count=n_videos)
print(user_videos)
The user_videos object is now a list of 100 video dictionaries, and the print(user_videos) output will look like the following:
[
{
"id":"6842416492261248262",
"desc":"",
"createTime":1593124239,
"video":{
"id":"awesome",
"height":1024,
"width":576,
"duration":50,
"ratio":"720p",
"cover":"https://p16-sign-sg.tiktokcdn.com/obj/tos-maliva-p-0068/4b45d1820df44e81971cb2981f159cf8_1593124242?x-expires=1600174800&x-signature=anmH2YurlaeKKYUv3fbXt0IIcEA%3D",
"originCover":"https://p16-sign-sg.tiktokcdn.com/obj/tos-maliva-p-0068/f4521a4cf921460e908d300033f13b3e_1593124241?x-expires=1600174800&x-signature=fJrnLur6fsrLdLzRrPMmRk2USrc%3D",
"dynamicCover":"https://p16-sign-sg.tiktokcdn.com/obj/tos-maliva-p-0068/abb62e51195f430daf261120aed1a196_1593124242?x-expires=1600174800&x-signature=QW9I2K%2BosgTDSb5wnaBC0avUuQE%3D",
"playAddr":"https://v77.tiktokcdn.com/81609c19fb240081e71e7a1ddebc7435/5f5e7872/video/tos/useast2a/tos-useast2a-pve-0068/f44d75f329f74133a9005e0c53303ab0/?a=1233&br=2128&bt=1064&cr=0&cs=0&cv=1&dr=0&ds=3&er=&l=202009131351280101901860143105F323&lr=tiktok_m&mime_type=video_mp4&qs=0&rc=anVubmY8a3dsdTMzZjczM0ApNzlnNzlpPDw5N2ZoPGZpZmdnNWowXmleXm5fLS0wMTZzcy4tYl4uXzFiMGEzMTJhX2I6Yw%3D%3D&vl=&vr=",
"downloadAddr":"https://v77.tiktokcdn.com/81609c19fb240081e71e7a1ddebc7435/5f5e7872/video/tos/useast2a/tos-useast2a-pve-0068/f44d75f329f74133a9005e0c53303ab0/?a=1233&br=2128&bt=1064&cr=0&cs=0&cv=1&dr=0&ds=3&er=&l=202009131351280101901860143105F323&lr=tiktok_m&mime_type=video_mp4&qs=0&rc=anVubmY8a3dsdTMzZjczM0ApNzlnNzlpPDw5N2ZoPGZpZmdnNWowXmleXm5fLS0wMTZzcy4tYl4uXzFiMGEzMTJhX2I6Yw%3D%3D&vl=&vr=",
"shareCover":[
....
]
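The raw dictionaries carry many fields, so it often helps to reduce each one to the few values you need. The sketch below assumes the field names shown in the sample output above (`id`, `desc`, `createTime`, and the nested `video` block); `summarize_video` is a hypothetical helper, and the sample dictionary is a trimmed copy of that output.

```python
from datetime import datetime, timezone


def summarize_video(video: dict) -> dict:
    """Keep a handful of fields from one video dictionary (hypothetical helper)."""
    media = video.get("video", {})
    created = video.get("createTime")
    return {
        "id": video.get("id"),
        "description": video.get("desc", ""),
        # createTime is a Unix timestamp in seconds; convert it to ISO 8601 (UTC).
        "created_at": (
            datetime.fromtimestamp(created, tz=timezone.utc).isoformat()
            if created is not None
            else None
        ),
        "duration_s": media.get("duration"),
        "width": media.get("width"),
        "height": media.get("height"),
    }


# Trimmed copy of the sample output, used here instead of a live request.
sample = {
    "id": "6842416492261248262",
    "desc": "",
    "createTime": 1593124239,
    "video": {"height": 1024, "width": 576, "duration": 50},
}

print(summarize_video(sample))
```

With real data, the same function could be applied to every item: `[summarize_video(v) for v in user_videos]`.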
2. Collecting videos liked by a user
Let's continue with Kourtney Kardashian's TikTok account and check which videos this account has liked:
from TikTokApi import TikTokApi
api = TikTokApi()
n_videos = 100
username = 'kourtneykardashian'
liked_videos = api.userLikedbyUsername(username, count=n_videos)
print(liked_videos)
3. Collecting videos by hashtag
Let's check out which videos we can scrape by the #kardashian hashtag:
from TikTokApi import TikTokApi
api = TikTokApi()
n_videos = 100
hashtag = 'kardashian'
hashtag_videos = api.byHashtag(hashtag, count=n_videos)
print(hashtag_videos)
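If you collect several hashtags, the same video can show up in more than one result list. A minimal sketch of merging such lists follows; it assumes every video dictionary has an `id` key, as in the sample output earlier, and `dedupe_videos` is a hypothetical helper rather than part of TikTokApi.

```python
def dedupe_videos(*batches):
    """Merge several lists of video dictionaries, keeping the first copy of each id."""
    seen = set()
    unique = []
    for batch in batches:
        for video in batch:
            video_id = video.get("id")
            # Skip entries without an id and entries we have already kept.
            if video_id is None or video_id in seen:
                continue
            seen.add(video_id)
            unique.append(video)
    return unique


# Stand-in data; in practice these would be results of separate hashtag queries.
first = [{"id": "1"}, {"id": "2"}]
second = [{"id": "2"}, {"id": "3"}]
print(len(dedupe_videos(first, second)))
```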
4. Collecting trending videos
Maybe you just need to collect trending videos for content analysis. The API makes that pretty simple:
from TikTokApi import TikTokApi
api = TikTokApi()
n_videos = 100
trending_videos = api.trending(count=n_videos)
print(trending_videos)
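Trending results change over time, so for content analysis you will probably want to store each batch rather than only print it. One option, sketched here, is appending to a JSON Lines file with the standard library; `save_jsonl` and `load_jsonl` are hypothetical helpers, and the demo writes stand-in data to a temporary directory.

```python
import json
import tempfile
from pathlib import Path


def save_jsonl(videos, path):
    """Append each video dictionary as one JSON line; return how many were written."""
    written = 0
    with Path(path).open("a", encoding="utf-8") as fh:
        for video in videos:
            fh.write(json.dumps(video, ensure_ascii=False) + "\n")
            written += 1
    return written


def load_jsonl(path):
    """Read back every non-empty line of a JSON Lines file."""
    with Path(path).open(encoding="utf-8") as fh:
        return [json.loads(line) for line in fh if line.strip()]


# Demo with stand-in data; pass trending_videos instead when running for real.
with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "trending.jsonl"
    save_jsonl([{"id": "1", "desc": "first"}, {"id": "2", "desc": "second"}], target)
    print(len(load_jsonl(target)))
```

Appending means repeated runs accumulate in one file, which keeps a history of what was trending at each collection time.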
5. Collecting a list of users from a seed account
This is probably the most interesting part for analysis and machine learning experiments. For example, we'd like to get all the users suggested for Kourtney Kardashian so that we can retrieve their TikToks later:
from TikTokApi import TikTokApi
api = TikTokApi()
n_suggestions = 100
username = 'kourtneykardashian'
# Look up the numeric user ID behind the username
user_id = api.getUser(username)['userInfo']['user']['id']
# Request the accounts suggested for that user
suggested = api.getSuggestedUsersbyID(count=n_suggestions, user_id=user_id)
print(suggested)
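To grow one seed account into a large user list, the same lookup can be repeated for each suggested user. The sketch below shows a breadth-first crawl under stated assumptions: `crawl_users` is a hypothetical helper, and `fetch_suggested` is any function you supply that takes a user ID and returns a list of suggested user IDs. In practice it would wrap the suggested-users call above and extract the IDs from its response, whose exact shape is not shown in this guide. A small fake graph stands in for the live service here.

```python
from collections import deque


def crawl_users(seed_id, fetch_suggested, max_users=100):
    """Breadth-first crawl from a seed user, returning IDs in discovery order."""
    found = [seed_id]
    seen = {seed_id}
    queue = deque([seed_id])
    while queue and len(found) < max_users:
        current = queue.popleft()
        for user_id in fetch_suggested(current):
            if user_id in seen:
                continue
            seen.add(user_id)
            found.append(user_id)
            queue.append(user_id)
            # Stop as soon as the requested number of users is reached.
            if len(found) >= max_users:
                return found
    return found


# Fake suggestion graph used in place of real API calls.
fake_graph = {"seed": ["u1", "u2"], "u1": ["u2", "u3"], "u2": ["seed"], "u3": []}
print(crawl_users("seed", lambda uid: fake_graph.get(uid, [])))
```

The `max_users` cap keeps the crawl bounded, which matters because suggestion graphs can be very large and every step costs a request.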
Summary
TikTok provides a large amount of useful data that can be turned into a machine learning dataset or analyzed manually. Media research of this kind also helps you understand trend dynamics, which is useful when trying to build a popular account.