## Problem
We want to detect when a "new" URL is seen, create a collection out of it, and (in a later PR) send a notification about this new URL in the weekly digest.
I went back and forth on some decisions, but here's where they stand right now:
1. "New URLs": URLs that haven't been seen in 90 days. Made this decision because I was worried about the performance of checking all URLs in perpetuity. It also seems semi-reasonable: if a URL isn't seen for 3 months and then is seen again, that's still interesting?
2. Pattern matching: I was worried about params as part of URLs, so we do some regex stuff to try to group URLs together into one playlist when the only difference is an id/uuid/hash as part of the URL (i.e. /settings/2 will be grouped with /settings/3)
3. LIMITs / short circuits: again, seemed a semi-sensible tradeoff between exact accuracy and performance
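
A rough sketch of the grouping idea from point 2, for reviewers who want a mental model. The regexes, the `:id` / `:uuid` / `:hash` placeholders, and the `normalize_url` name are all assumptions for illustration; the actual patterns in this PR may differ. Dropping the query string and fragment is also an assumption of this sketch.

```python
import re
from urllib.parse import urlsplit

# Hypothetical patterns for id-like path segments (not the PR's real regexes).
_UUID = re.compile(
    r"^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$", re.I
)
_NUMERIC_ID = re.compile(r"^\d+$")
_HEX_HASH = re.compile(r"^[0-9a-f]{16,}$", re.I)


def normalize_url(url: str) -> str:
    """Collapse id-like path segments so similar URLs share one key."""
    parts = urlsplit(url)
    segments = []
    for segment in parts.path.split("/"):
        if _UUID.match(segment):
            segments.append(":uuid")
        elif _NUMERIC_ID.match(segment):
            segments.append(":id")
        elif _HEX_HASH.match(segment):
            segments.append(":hash")
        else:
            segments.append(segment)
    # Trailing slashes are stripped; query string and fragment are dropped.
    path = "/".join(segments).rstrip("/") or "/"
    if parts.netloc:
        return f"{parts.scheme}://{parts.netloc}{path}"
    return path  # relative URL such as "/settings/2"


if __name__ == "__main__":
    print(normalize_url("/settings/2"))
    print(normalize_url("/settings/3"))
```

With this sketch, `/settings/2` and `/settings/3` both normalize to the same key, so they would land in the same playlist.
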
## Changes
1. defined new synthetic playlist source `NewUrlsSyntheticPlaylistSource()`
2. This playlist source can actually return multiple playlists (one per new URL)
3. 2 caches -- normalized_url -> count (for list display), and normalized_url -> list of session ids (for populating the collection when a user clicks into it)
4. when user clicks on a collection to watch, use URL hash to look up session IDs for that collection in the cache
5. do lots of URL normalizing / pattern matching / logic to ensure we're grouping URLs together, and only showing URLs first seen in the last 14 days (and not otherwise within the last 90 days)
6. limit this to 20 playlists/URLs
7. gate all this behind a flag
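
The "first seen in the last 14 days, and not otherwise within the last 90 days" rule plus the 20-playlist limit can be sketched roughly like this. This is an in-memory illustration only, not the query this PR runs: the `find_new_urls` name, the `(normalized_url, session_id, seen_at)` tuple shape, and ranking by session count before applying the limit are all assumptions.

```python
from datetime import datetime, timedelta, timezone

NEW_WINDOW = timedelta(days=14)       # "first seen in the last 14 days"
LOOKBACK_WINDOW = timedelta(days=90)  # "not otherwise within the last 90 days"
MAX_PLAYLISTS = 20                    # "limit this to 20 playlists/URLs"


def find_new_urls(sightings, now):
    """Return {normalized_url: [session_id, ...]} for new URLs.

    `sightings` is an iterable of (normalized_url, session_id, seen_at)
    tuples. This input shape is hypothetical.
    """
    new_cutoff = now - NEW_WINDOW
    lookback_cutoff = now - LOOKBACK_WINDOW
    recent = {}          # URLs seen inside the 14-day window
    seen_earlier = set() # URLs seen between 90 and 14 days ago
    for url, session_id, seen_at in sightings:
        if seen_at < lookback_cutoff or seen_at > now:
            continue  # outside the 90-day lookback, ignore entirely
        if seen_at >= new_cutoff:
            recent.setdefault(url, []).append(session_id)
        else:
            seen_earlier.add(url)
    new_urls = {u: ids for u, ids in recent.items() if u not in seen_earlier}
    # Assumed ranking: most sessions first, URL as a stable tie-breaker.
    ranked = sorted(new_urls.items(), key=lambda item: (-len(item[1]), item[0]))
    return dict(ranked[:MAX_PLAYLISTS])


if __name__ == "__main__":
    now = datetime(2025, 11, 4, tzinfo=timezone.utc)
    sessions_by_url = find_new_urls(
        [("/a", "s1", now - timedelta(days=1)),
         ("/b", "s2", now - timedelta(days=3)),
         ("/b", "s3", now - timedelta(days=30))],
        now,
    )
    counts_by_url = {url: len(ids) for url, ids in sessions_by_url.items()}
    print(counts_by_url)
```

The two dicts at the end mirror the two caches described above: one holding counts for the list display, one holding session IDs for populating a collection when it is opened.
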
## How did you test this code?
had Claude help write lots of date-checking and URL-pattern-matching tests
the recording below shows me locally running a helper (to clear the cache, offscreen) before refreshing, and a ~3-4 second cold load time
[Screen Recording 2025-11-04 at 11.22.01 AM.mov](https://app.graphite.dev/user-attachments/video/f8803d5d-c241-4c8b-91f4-ee00b0a06394.mov)
## Changelog: (features only) Is this feature complete?
turns out you can't merge the top of a Graphite stack in GitHub without editing what you're merging it into, so https://github.com/PostHog/posthog/pull/40854/ was merged into the aether
reimplementing it here
-----
pairs with https://github.com/PostHog/posthog-cloud-infra/pull/5608
we expire exported assets in S3 on different timers
but we don't tell people
and we don't reflect that in how we set expiry in Postgres
(I'm not going to go back and fix existing exports in Postgres, since they've been mismatched with S3 for the longest time and nobody has complained)