Recommender Systems

A surprising amount of modern life is shaped by recommender systems: the films a streaming service surfaces, the products a shop suggests, the posts a feed shows you next. They're the engines of personalisation, and the idea at their core is genuinely elegant — predict how much a particular person will like a particular thing, from the patterns of what everyone has liked before, and rank accordingly.

It's worth knowing properly for two reasons. The mechanics tie beautifully back to dimensionality reduction — the central technique is the same latent-factor idea — and the failure modes (filter bubbles, feedback loops) are some of the most socially consequential in all of applied ML. This page is the landscape: the two great strategies, the matrix-factorisation engine, and what goes wrong.

Predicting what you'll like

Frame the problem as a giant, mostly-empty table: rows are users, columns are items, and each cell is how much that user likes that item — a rating, a click, a purchase. The catch is that the table is overwhelmingly sparse: any one person has interacted with a tiny fraction of all items. The recommender's job is to fill in the blanks — predict the missing cells — and then recommend the items it predicts you'll rate highest.
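To make the "mostly-empty table" concrete, here is a minimal sketch that stores only the observed cells and lists the blanks a recommender would have to fill. The user names, film titles, and ratings are made-up illustrative data, not taken from any real dataset.

```python
# Hypothetical toy data: only observed (user, item) ratings are stored.
ratings = {
    "ana": {"alien": 5, "arrival": 4},
    "ben": {"alien": 4, "paddington": 2},
    "cat": {"paddington": 5},
}

users = sorted(ratings)
items = sorted({item for row in ratings.values() for item in row})

# Fraction of the full user x item table that has no observation.
observed = sum(len(row) for row in ratings.values())
total_cells = len(users) * len(items)
sparsity = 1 - observed / total_cells

# The blanks to predict: every (user, item) pair without a rating.
missing = [(u, i) for u in users for i in items if i not in ratings[u]]
```

Even in this tiny table 4 of the 9 cells are blank; in a real catalogue the blank fraction is typically far higher.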

There are two fundamentally different ways to do that, and the difference is where they look for signal: at the items themselves, or at the crowd of other users.

Content-based filtering: more like what you liked

Content-based filtering looks at the items. It builds a profile of what you like from the features of things you've liked before, then recommends items with similar features. Liked several hard sci-fi films? Here's another tagged sci-fi. The item features can be explicit (genre, author, price) or learned text embeddings of descriptions, and "similar" is the same vector-distance idea from the NLP page.

Its strengths and weaknesses are two sides of one coin: it can recommend brand-new items the moment they're catalogued (it only needs their features), and it never needs other users — but it's trapped in your existing tastes. It can only ever suggest more of the same, never the delightful out-of-left-field find. For that, you need the crowd.
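One way to sketch content-based filtering, assuming hand-made feature vectors: average the features of the items a user liked into a profile, then rank unseen items by cosine similarity to that profile. The feature axes, film titles, and the helper names `user_profile` and `recommend` are illustrative assumptions, not a reference implementation.

```python
import math

# Hypothetical item features: [sci-fi, comedy, romance], each in 0..1.
item_features = {
    "alien": [1.0, 0.0, 0.0],
    "arrival": [0.9, 0.0, 0.2],
    "paddington": [0.0, 1.0, 0.1],
    "interstellar": [1.0, 0.0, 0.3],
    "notting_hill": [0.0, 0.6, 1.0],
}

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def user_profile(liked):
    """Average the feature vectors of the items the user liked."""
    vectors = [item_features[name] for name in liked]
    return [sum(column) / len(vectors) for column in zip(*vectors)]

def recommend(liked, top_n=2):
    """Rank unseen items by similarity to the user's profile."""
    profile = user_profile(liked)
    candidates = [name for name in item_features if name not in liked]
    candidates.sort(key=lambda name: cosine(profile, item_features[name]), reverse=True)
    return candidates[:top_n]

top_picks = recommend(["alien", "arrival"])
```

With two sci-fi films liked, the top pick is the remaining sci-fi title, which illustrates both the strength (no other users needed) and the "more of the same" limitation.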

Collaborative filtering: the wisdom of the crowd

Collaborative filtering ignores item features entirely and uses only the pattern of interactions — the wisdom of the crowd. The intuition: "people who agreed with you in the past will agree with you in the future." Two flavours:

User-user: find people whose tastes resemble yours, and recommend what they liked that you haven't seen.
Item-item: find items that tend to be liked by the same people, and recommend items similar (in that co-liking sense) to ones you rated highly ("Customers who bought this also bought…"). This flavour underpins many large-scale systems, partly because item-item similarities tend to change more slowly than individual users' tastes and can be computed ahead of time.

The magic of collaborative filtering is that it needs no knowledge of what the items actually are, only who interacted with what. That is also its weakness: with no interaction history for a user or an item it has nothing to work from, which is the cold-start problem covered below.
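A minimal item-item sketch, assuming binary "liked" data: two items are similar when largely the same users liked them, and a candidate is scored by its total similarity to what the user already liked. The users, titles, and the choice of cosine similarity over binary vectors are assumptions for illustration; real systems differ in how they weight and normalise.

```python
import math

# Hypothetical interactions: user -> set of items they liked.
likes = {
    "ana": {"alien", "arrival", "interstellar"},
    "ben": {"alien", "arrival"},
    "cat": {"paddington", "notting_hill"},
    "dan": {"alien", "interstellar", "paddington"},
}

def fans(item):
    """Users who liked the given item."""
    return {user for user, liked in likes.items() if item in liked}

def item_similarity(a, b):
    """Cosine similarity between two items' binary 'liked by' vectors."""
    fans_a, fans_b = fans(a), fans(b)
    if not fans_a or not fans_b:
        return 0.0
    return len(fans_a & fans_b) / math.sqrt(len(fans_a) * len(fans_b))

def recommend(user, top_n=2):
    """Score each unseen item by its summed similarity to the user's liked items."""
    seen = likes[user]
    all_items = set().union(*likes.values())
    scores = {
        candidate: sum(item_similarity(candidate, liked) for liked in seen)
        for candidate in all_items - seen
    }
    # Highest score first; ties broken alphabetically for a stable order.
    return sorted(scores, key=lambda item: (-scores[item], item))[:top_n]

picks_for_ben = recommend("ben")
```

Note that no item features appear anywhere: the only input is who liked what.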

Matrix factorisation: the latent-factor engine

The breakthrough that powered modern collaborative filtering, and a core component of the winning Netflix Prize solutions, is matrix factorisation. The idea: approximate the giant sparse user-item rating matrix $R$ (with $m$ users and $n$ items) as the product of two much smaller, dense matrices:

$$R_{m \times n} \approx U_{m \times k}\, V_{n \times k}^{\top}$$

Each user becomes a short vector of $k$ latent factors (a row of $U$), and so does each item (a row of $V$), with $k$ far smaller than the number of users or items. A predicted rating is just the dot product of a user's vector and an item's vector. The factors are learned, not labelled, though they sometimes line up with interpretable dimensions ("how much sci-fi", "how light-hearted"), and a user's score on a factor times an item's score on the same factor captures their match.
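As a worked instance with made-up numbers, take $k = 2$ hypothetical factors ("how much sci-fi", "how light-hearted"), a user vector $u = (0.9,\ 0.2)$ and an item vector $v = (0.8,\ 0.1)$. The predicted score is the sum of the per-factor matches:

$$\hat{r} = u \cdot v = 0.9 \times 0.8 + 0.2 \times 0.1 = 0.72 + 0.02 = 0.74$$

Almost all of the score comes from the first factor, where both the user and the item score highly.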

Figure: Matrix factorisation. The huge, sparse user-item matrix is approximated by two thin matrices: a short latent-factor vector per user and per item. A predicted rating is the dot product of the two.

This is the same low-rank, latent-dimension idea as PCA, compressing a huge matrix into a few meaningful dimensions, which is why the linear-algebra foundations pay off here directly. The practical difference is sparsity: rather than decomposing a complete matrix, you fit the factors only on the cells you do observe (typically by gradient descent or alternating least squares, with some regularisation), and the learned factors supply predictions for the rest.
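Fitting only on observed cells can be sketched with plain stochastic gradient descent. This is a toy under stated assumptions, not how any particular production system or Netflix Prize entry was built: the ratings, the hyperparameters (`epochs`, `lr`, `reg`), and the function name `factorise` are all illustrative choices.

```python
import random

# Hypothetical observed ratings as (user_index, item_index, rating) triples.
observed = [
    (0, 0, 5.0), (0, 1, 4.0),
    (1, 0, 4.0), (1, 2, 1.0),
    (2, 1, 5.0), (2, 2, 2.0),
    (3, 2, 5.0), (3, 0, 1.0),
]
n_users, n_items, k = 4, 3, 2

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def mse(U, V, cells):
    """Mean squared error over the observed cells only."""
    return sum((r - dot(U[u], V[i])) ** 2 for u, i, r in cells) / len(cells)

def factorise(cells, n_users, n_items, k, epochs=2000, lr=0.02, reg=0.02, seed=0):
    """Learn U (users x k) and V (items x k) by SGD on the observed cells."""
    rng = random.Random(seed)
    U = [[rng.uniform(0.0, 1.0) for _ in range(k)] for _ in range(n_users)]
    V = [[rng.uniform(0.0, 1.0) for _ in range(k)] for _ in range(n_items)]
    starting_error = mse(U, V, cells)
    for _ in range(epochs):
        for u, i, r in cells:
            err = r - dot(U[u], V[i])
            for f in range(k):
                uf, vf = U[u][f], V[i][f]
                # Gradient step on squared error with L2 regularisation.
                U[u][f] += lr * (err * vf - reg * uf)
                V[i][f] += lr * (err * uf - reg * vf)
    return U, V, starting_error

U, V, initial_mse = factorise(observed, n_users, n_items, k)
final_mse = mse(U, V, observed)

# A blank cell: user 0 never rated item 2, but the factors still give a score.
predicted = dot(U[0], V[2])
```

The prediction for the blank cell is not clipped to the rating scale here; a fuller version would usually add bias terms and hold out some ratings to check generalisation.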

The cold-start problem

Collaborative filtering's great weakness is the cold-start problem: it needs interaction history, and a brand-new user or item has none. You can't recommend to someone you know nothing about, and you can't surface a just-added item nobody has touched. It's the chicken-and-egg at the heart of every new recommender.

The standard fix is to go hybrid: lean on content-based methods, which need only item features, to surface new items; fall back on whatever is known up front about a new user (sign-up answers, or simply popular items); then shift to collaborative filtering as interaction history accumulates. Many production systems are hybrids precisely for this reason: each method covers the other's blind spot.
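The text does not prescribe how to "shift" between the two methods. One simple option, shown here purely as an assumption, is a weighted blend whose weight on the collaborative score grows with the number of interactions. The function name `hybrid_score` and the `ramp` parameter are hypothetical.

```python
def hybrid_score(content_score, collaborative_score, n_interactions, ramp=20):
    """Blend two scores, trusting collaborative filtering more as history grows.

    `ramp` is a hypothetical tuning knob: the interaction count at which the
    two sources are weighted equally.
    """
    weight = n_interactions / (n_interactions + ramp)
    return (1 - weight) * content_score + weight * collaborative_score

brand_new = hybrid_score(0.8, 0.2, n_interactions=0)       # content score only
halfway = hybrid_score(0.8, 0.2, n_interactions=20)        # equal weighting
established = hybrid_score(0.8, 0.2, n_interactions=2000)  # mostly collaborative
```

With no history the result is exactly the content-based score, and it drifts towards the collaborative score as interactions accumulate.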

Measuring success: it's about ranking

Evaluating recommenders is subtler than a single accuracy number, because what matters is the order of what you show, not a precise rating. The metrics are ranking-focused: precision@k (of the top k recommendations, what fraction were relevant?) and NDCG (normalised discounted cumulative gain, which also rewards putting the best items highest). You care about the top of the list; nobody scrolls to recommendation 200. Offline scores on logged data do not always carry over to live behaviour, so a live A/B test is the final judge.
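Both metrics are short enough to write out. The sketch below uses the common log2 position discount for DCG and divides precision by k even when fewer than k items are returned; other conventions exist. The item labels and relevance grades are made up.

```python
import math

def precision_at_k(recommended, relevant, k):
    """Fraction of the top-k recommendations that are relevant."""
    return sum(1 for item in recommended[:k] if item in relevant) / k

def dcg_at_k(gains, k):
    """Discounted cumulative gain: later positions count for less."""
    return sum(gain / math.log2(rank + 2) for rank, gain in enumerate(gains[:k]))

def ndcg_at_k(recommended, relevance, k):
    """DCG of the given order divided by the DCG of the ideal order."""
    gains = [relevance.get(item, 0) for item in recommended]
    ideal = sorted(relevance.values(), reverse=True)
    ideal_dcg = dcg_at_k(ideal, k)
    return dcg_at_k(gains, k) / ideal_dcg if ideal_dcg else 0.0

# Hypothetical graded relevance (higher is better); unlisted items count as 0.
relevance = {"a": 3, "c": 2, "e": 1}
recommended = ["a", "b", "c", "d"]

p3 = precision_at_k(recommended, set(relevance), k=3)
ndcg3 = ndcg_at_k(recommended, relevance, k=3)
```

Here two of the top three are relevant, so precision@3 is 2/3. NDCG@3 falls below 1 because an irrelevant item sits in second place and the relevant item "e" is missing from the list.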

Bubbles & feedback loops

Recommenders don't just predict behaviour — they shape it, and that creates problems bigger than any accuracy metric:

Popularity bias — the crowd's favourites get recommended most, so they get interacted with most, so they get recommended even more. The rich get richer, and niche items stay invisible.
Filter bubbles / echo chambers — by showing you ever-more of what you already engage with, the system narrows your world, which is benign for films and corrosive for news and opinion.
Feedback loops — the model's recommendations become the data it next trains on, so it learns from its own influence and can spiral. It's the reward-shaping problem in another guise: optimise raw engagement and you may amplify exactly the sensational or addictive content that maximises clicks, regardless of whether it's good for anyone.
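The dynamics in the list above can be seen in a deliberately crude simulation. It assumes a greedy policy that always shows the item with the most logged clicks, and it adds expected clicks rather than sampling, so the run is deterministic. The appeal values are invented, and this is not a model of any real system.

```python
# Hypothetical true appeal: the chance a user clicks an item once it is shown.
true_appeal = [0.3, 0.5, 0.9]

# Logged clicks so far; item 0 happens to have one early click.
clicks = [1.0, 0.0, 0.0]

def recommend(click_log):
    """Greedy policy: show whichever item has the most logged clicks."""
    return max(range(len(click_log)), key=lambda i: click_log[i])

shown_counts = [0, 0, 0]
for _ in range(100):
    shown = recommend(clicks)
    shown_counts[shown] += 1
    # Only the shown item can collect clicks; add its expected clicks.
    clicks[shown] += true_appeal[shown]
```

Item 0 is shown in all 100 rounds while item 2, the one users would like most, is never shown and so never collects the clicks that would promote it. Because the system trains on data it generated itself, the early accident is never corrected, which is why many systems deliberately reserve some exposure for exploration.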

Where it shows up in my work

Ranking, latent factors, and feedback loops

Recommender systems aren't a core government-analyst tool, but the machinery generalises to any ranking or prioritisation problem — surfacing the cases, items, or leads most worth attention from a long list, which is a shape that recurs constantly. The matrix-factorisation / latent-factor idea is the same low-rank compression as PCA, and the ranking metrics (precision@k) are how you judge any "show me the top N" system.

What carries over most, though, is the cautionary half. The feedback loop — a model trained on the consequences of its own past outputs — is a trap well beyond recommenders: any system that acts on the world and then learns from that changed world risks entrenching its own bias, which connects straight to fairness and the drift problem. Knowing the failure mode is what lets you watch for it.

Refresh in 60 seconds

A recommender fills in a sparse user × item matrix and ranks the predicted favourites.
Content-based (recommend similar items by features; handles new items, but traps you in your tastes) vs collaborative (use the crowd's interactions, "people like you liked"; item-item underpins many big systems).
Matrix factorisation ( $R \approx U V^\top$ ) learns short latent-factor vectors per user/item; rating = dot product. Same low-rank idea as PCA.
The cold-start problem (new user/item, no history) → go hybrid (content for new, collaborative as history grows).
Evaluate by ranking (precision@k, NDCG) — but mind the offline-online gap; judge by live A/B tests.
The big issues: popularity bias, filter bubbles, feedback loops — the recommender shapes behaviour, not just predicts it.

The content-vs-collaborative split, matrix-factorisation/cold-start framing, and filter-bubble / feedback-loop cautions reflect current recommender-systems references alongside coursework.