Common Favourite Films on Letterboxd

Letterboxd allows users to list up to four favourite films, but there’s no built-in way to find people with similar tastes. To solve this, I developed a script to scrape ‘fans’ of specific films and identify overlapping favourites.

My favourite films on Letterboxd


Scraping Fans

There is no public Letterboxd API (I requested access to the Letterboxd API beta in 2016, but never received a response), and I still want to find people with similar movie tastes. So I wrote a script that takes a Letterboxd movie identifier as an argument and scrapes the list of users who have listed this movie as one of their four favourites (called “fans” on Letterboxd). The script curls every fan page of the given movie, extracts the user identifiers with a grep regex, and appends them to a CSV file.

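#!/bin/sh
# Usage: ./letterboxdFans.sh <movie-identifier>, e.g. the-innocents-2021

movie=$1
page=1

while :; do
    # Fetch one fan page and extract the user identifiers from the profile links
    # (grep -P requires a grep with PCRE support, such as GNU grep).
    fans=$(curl --silent "https://letterboxd.com/film/$movie/fans/page/$page/" | \
        grep -Po '(?<=a href="/)[^"]+(?=/\" class=\"name\")')
    # Stop at the first page that lists no fans.
    [ -z "$fans" ] && break
    echo "$fans" >> "${movie}_fans.csv"
    page=$((page + 1))
done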
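
To illustrate what the regex does, here is a minimal sketch that wraps the same grep call in a function and runs it on a single line of markup. The function name extract_fans and the sample anchor tag are assumptions made for this example (the tag is inferred from the pattern the regex matches, not copied from a real Letterboxd page), and grep -P needs a grep with PCRE support.

```shell
#!/bin/sh
# extract_fans: hypothetical helper that reads HTML on stdin and prints
# one user identifier per matching profile link.
extract_fans() {
    # The lookbehind anchors on 'a href="/' and the lookahead on '/" class="name"',
    # so only the identifier between them is printed (-o).
    grep -Po '(?<=a href="/)[^"]+(?=/\" class=\"name\")'
}

# Sample markup, assumed for illustration only.
sample='<a href="/exampleuser/" class="name">Example User</a>'
printf '%s\n' "$sample" | extract_fans
```

Links that do not carry class="name" directly after the href are ignored, which is what keeps other links on the page out of the CSV.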

As a test, we can look at “The Innocents”, which currently has 88 fans. Running "./letterboxdFans.sh the-innocents-2021" produces a CSV named "the-innocents-2021_fans.csv" in the same folder. Using the fish shell, we check the number of entries by counting the lines of the CSV with cat and count.

$ cat the-innocents-2021_fans.csv | count
88

Popular films expose a limit, however. Parasite, for example, has 102k fans, but only 6.4k of them can be scraped. As 25 fans are displayed per page, 6.4k fans correspond to 256 pages (25 × 256 = 6,400), which means that the fan pages are capped at page 256. Checking manually confirms that there are at most 256 pages of alphabetically ordered fans per film.
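
The cap arithmetic can be turned into a quick check for whether a scraped CSV was probably truncated. This is only a sketch: the two constants are the values observed above, the helper name hit_fan_cap is made up for this example, and Letterboxd may change either number at any time.

```shell
#!/bin/sh
# Values observed above, not documented by Letterboxd.
FANS_PER_PAGE=25
MAX_PAGES=256
MAX_FANS=$((FANS_PER_PAGE * MAX_PAGES))

# hit_fan_cap FILE: hypothetical helper that succeeds if FILE contains
# at least MAX_FANS lines, i.e. the scrape likely ran into the page cap.
hit_fan_cap() {
    lines=$(wc -l < "$1" | tr -d ' ')
    [ "$lines" -ge "$MAX_FANS" ]
}
```

For example, "hit_fan_cap some-film_fans.csv && echo truncated" would flag a film whose fan list is longer than what the fan pages expose.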


Finding Similar Profiles

After scraping up to 6.4k fans for each of my four favourite films and putting the resulting CSVs in a separate folder, we can find profiles that have at least two favourite films in common by looking for duplicate entries:

$ ls letterboxd_fans/
good-time_fans.csv
paterson_fans.csv
perfect-days-2023_fans.csv
uncut-gems_fans.csv
$ cat letterboxd_fans/* | sort > merged.csv
$ uniq -c merged.csv | sort -nr | head -n 5
2 mandypixel
2 malcolmconger
2 louistoth
2 loshmy
2 ln42
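
To list only the profiles that meet the threshold, without the counts, the pipeline can be wrapped in a small function. This is a sketch, not part of the original script: the name shared_fans and its optional minimum-count argument are assumptions, and each file is deduplicated first so that a user is counted at most once per film.

```shell
#!/bin/sh
# shared_fans DIR [MIN]: hypothetical helper that prints every user who
# appears in at least MIN (default 2) of the *_fans.csv files in DIR.
shared_fans() {
    dir=$1
    min=${2:-2}
    for f in "$dir"/*_fans.csv; do
        # Count each user once per film, even if a file was scraped twice.
        sort -u "$f"
    done | sort | uniq -c | awk -v min="$min" '$1 >= min { print $2 }'
}
```

Calling "shared_fans letterboxd_fans" prints the users with two or more shared favourites, and "shared_fans letterboxd_fans 3" narrows that down to three or more.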

Automatically Following Profiles

I attempted to automatically follow profiles that share at least two favourite films using Selenium, but only got as far as automatically visiting every profile with shared favourite films.

Current issues include not being able to make Selenium use an existing Firefox profile that is logged into my Letterboxd account, and not being able to automatically click the Follow button.

As this is just a quick one-time project, I logged in manually in the browser window that opens when my script starts. The script then cycled through the profiles from the merged.csv list automatically, and I clicked the Follow button by hand before the next profile page loaded. This allowed me to quickly follow a couple of profiles with common favourite movies.
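
The list of pages to cycle through can be produced in the shell as well. The sketch below is an assumption about how that step could look, not the Selenium script itself. The helper name profile_urls is made up, the input is expected to be sorted like merged.csv, and the URL pattern is inferred from the profile links matched by the scraping regex.

```shell
#!/bin/sh
# profile_urls FILE: hypothetical helper that prints one profile URL for
# every user listed more than once in the sorted FILE.
profile_urls() {
    # uniq -d keeps a single copy of each repeated line; sed wraps it in the URL.
    uniq -d "$1" | sed 's|.*|https://letterboxd.com/&/|'
}
```

Running "profile_urls merged.csv > profiles.txt" would give a plain list of URLs that a browser automation script, or a manual session, can work through one by one.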