Letterboxd allows users to list up to four favourite films, but there’s no built-in way to find people with similar tastes. To solve this, I developed a script to scrape ‘fans’ of specific films and identify overlapping favourites.
Scraping Fans
There is no public Letterboxd API (I requested access to the Letterboxd API beta in 2016, but never received a response), but I still want to find people with similar movie tastes. I therefore wrote a script that takes a Letterboxd movie identifier as an argument and scrapes the list of users who have listed this movie as one of their four favourites (also called “fans” on Letterboxd).
This script fetches every fan page of the given movie with curl, extracts user identifiers through a regex with grep, and appends them to a CSV file.
#!/bin/sh
movie=$1
page=1
while :; do
  fans=$(curl --silent "https://letterboxd.com/film/$movie/fans/page/$page/" | \
    grep -Po '(?<=a href="/)[^"]+(?=/\" class=\"name\")')
  [ -z "$fans" ] && break
  echo "$fans" >> "${movie}_fans.csv"
  page=$((page + 1))
done
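The -P flag (Perl-compatible regexes) is a GNU grep extension and is not available in every grep, for example the BSD grep shipped with macOS. As a minimal sketch, the same extraction can be approximated without lookarounds by matching the whole anchor fragment with -E and -o and then stripping the fixed text around the identifier with sed. The function name extract_fans and the sample markup are hypothetical; the markup is only inferred from the regex above, not copied from a real Letterboxd page.

```sh
#!/bin/sh
# Hypothetical helper: read HTML on stdin, print one user identifier per line.
extract_fans() {
  # Step 1: keep only the anchor fragments that the original regex targets.
  # Step 2: strip the fixed prefix and suffix around the identifier.
  grep -Eo 'a href="/[^"]+/" class="name"' |
    sed -e 's|^a href="/||' -e 's|/" class="name"$||'
}

# Assumed sample markup, inferred from the regex rather than from a real page.
sample='<td><a href="/alice/" class="name">Alice</a></td><td><a href="/bob_42/" class="name">Bob</a></td>'

# Prints the two made-up identifiers, one per line.
printf '%s\n' "$sample" | extract_fans
```

In the scraping loop, the curl output could be piped into extract_fans instead of grep -Po; the rest of the script would stay the same.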
As a test, we can look at “The Innocents”, which currently has 88 fans.
Running "./letterboxdFans.sh the-innocents-2021" produces a CSV named "the-innocents-2021_fans.csv" in the same folder.
Using the fish shell, we check the number of entries by counting newlines in the CSV with cat and count.
$ cat the-innocents-2021_fans.csv | count
88
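count is a fish builtin, so the command above does not work in other shells. A rough equivalent for plain sh, sketched here with a hypothetical helper named count_fans and made-up identifiers, uses wc -l:

```sh
#!/bin/sh
# Hypothetical helper: print the number of lines in the given file.
count_fans() {
  # Redirecting into wc keeps the file name out of the output;
  # the arithmetic expansion strips the padding some wc versions add.
  echo "$(( $(wc -l < "$1") ))"
}

# Demo on a throwaway file with three made-up identifiers.
tmp=$(mktemp)
printf '%s\n' alice bob_42 carol > "$tmp"
count_fans "$tmp"
rm -f "$tmp"
```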
Popular films reveal a limit, however. For Parasite, for example, we can scrape at most 6.4k fans even though the film has 102k of them. With 25 fans displayed per page, 6.4k fans correspond to 256 pages (25 × 256 = 6,400), which means that the fan pages are capped at page 256. Checking manually confirms that there are at most 256 pages of alphabetically ordered fans per film.
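The cap can be checked with a little shell arithmetic. This is only a sketch using the rounded figures from the text; the variable names are made up for illustration.

```sh
#!/bin/sh
per_page=25        # fans shown per page (from the text)
max_pages=256      # last fan page that returns results (from the text)
total_fans=102000  # approximate fan count for Parasite (from the text)

# Fans reachable through the paginated list.
visible=$((per_page * max_pages))

# Pages that would be needed to list every fan (ceiling division).
pages_needed=$(( (total_fans + per_page - 1) / per_page ))

echo "visible fans: $visible"
echo "pages needed for all fans: $pages_needed"
```

With these numbers, the 256 reachable pages cover 6,400 of the roughly 102,000 fans, so only a small, alphabetically early slice of the fan list can be scraped for very popular films.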
Finding Similar Profiles
After scraping up to 6.4k fans of each of my four favourite films and putting the resulting CSVs in a separate folder, we can find profiles that have at least two favourite films in common by looking for duplicate entries:
$ ls letterboxd_fans/
good-time_fans.csv paterson_fans.csv perfect-days-2023_fans.csv uncut-gems_fans.csv
$ cat letterboxd_fans/* | sort > merged.csv
$ uniq -c merged.csv | sort -nr | head -n 5
2 mandypixel
2 malcolmconger
2 louistoth
2 loshmy
2 ln42
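Because the scraping script appends to its CSV, running it twice for the same film would make every fan of that film look like a shared fan. The following is a minimal sketch of a slightly more defensive variant that de-duplicates each file first and prints only users found in at least two files. The function name shared_fans, the file names, and the identifiers are all made up for illustration.

```sh
#!/bin/sh
# Hypothetical helper: list users that appear in at least two of the
# per-film CSVs inside the given folder, most shared favourites first.
shared_fans() {
  for f in "$1"/*.csv; do
    sort -u "$f"   # de-duplicate within one film first
  done | sort | uniq -c | sort -nr | awk '$1 >= 2 { print $1, $2 }'
}

# Demo with made-up data in a throwaway folder.
dir=$(mktemp -d)
printf '%s\n' alice bob carol > "$dir/film-a_fans.csv"
printf '%s\n' bob carol carol > "$dir/film-b_fans.csv"   # carol listed twice
printf '%s\n' carol dave > "$dir/film-c_fans.csv"
shared_fans "$dir"
rm -rf "$dir"
```

Each output line holds the number of shared favourites followed by the user identifier, so users sharing three or four favourites appear at the top.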
Automatically Following Profiles
I attempted to automatically follow profiles that share at least two favourite films using Selenium, but only got as far as automatically visiting every profile with shared favourite films.
Current issues include not being able to make Selenium use an existing Firefox profile that is logged into my Letterboxd account, and not being able to automatically click the Follow button.
As this is just a quick one-time project, I manually logged in using the browser that pops up after starting my script, then let the script cycle through the profiles from the merged.csv list while I manually clicked the Follow button before the next profile page loaded. This allowed me to quickly follow a couple of profiles with common favourite movies.