* chore(recordings): use cooperative-sticky rebalance strategy
This should make rebalances and the resulting lag during deploys a
little less painful. I'm setting this as the global strategy; if we
ever want a different strategy for a specific consumer group, we can
make this configurable.
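As a minimal sketch, the librdkafka-style setting involved looks like the following (the group id and broker address are hypothetical placeholders, not the real deployment values):

```typescript
// Hypothetical consumer configuration using librdkafka property names,
// in the shape accepted by node-rdkafka's KafkaConsumer constructor.
const consumerConfig: Record<string, string> = {
    'group.id': 'session-recordings', // placeholder group id
    'metadata.broker.list': 'localhost:9092', // placeholder broker
    // Incremental (cooperative) rebalancing: only reassigned partitions
    // are revoked, so unaffected partitions keep consuming through a
    // deploy instead of the whole group stopping the world.
    'partition.assignment.strategy': 'cooperative-sticky',
}

console.log(consumerConfig['partition.assignment.strategy']) // → cooperative-sticky
```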
* disable rebalance_callback
* use node-rdkafka-acosom fork instead, for cooperative support
* test(plugin-server): use librdkafka for functional tests
While trying to port the session recordings to use node-librdkafka I
found it useful to first implement it in the functional tests.
* use object destructuring to make calls more self-explanatory
This removes the timekeeper library in favour of Jest fake timers.
It also creates the hub once and reuses it across all tests, which is
faster than creating a new hub for each test.
* chore(plugin-server): Add metrics for time of last processed message
Previously we have been alerting on Kafka consumer group offset lag.
However, what we really care about is the delay between messages being
written to Kafka and being processed by the plugin server.
By exposing the last processed timestamp as a gauge, we can alert when
the gap between that timestamp and now exceeds a threshold.
This alert does not require the plugin-server to be up in order to
trigger; it only needs some timestamp to have been registered, so it
also handles complete failure.
For the case where there are no messages past the committed offsets, we
would end up triggering the alert falsely unless we also take the
production rate into the topic into account.
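A minimal sketch of the alert condition described above (the function name and threshold are illustrative, not the actual alerting rule):

```typescript
// Given the gauge value (the Kafka timestamp of the last processed
// message), the alert fires when the gap between it and "now" exceeds
// the configured threshold.
function shouldAlert(lastProcessedTimestampMs: number, nowMs: number, thresholdMs: number): boolean {
    return nowMs - lastProcessedTimestampMs > thresholdMs
}

// 10 minutes of lag against a 5 minute threshold fires the alert.
console.log(shouldAlert(0, 600_000, 300_000)) // → true
```

Note that because the check only compares a stored timestamp against the current time, it still fires if the plugin-server dies entirely and the gauge goes stale.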
* wip
* wip
* fix imports order
* fix group id
* Add and use waitForExpect instead
* remove yarn.lock
* move comment
* wip
* chore: use pnpm to manage dependencies
* Fix CI errors
* Don't report Docker image size for external PRs
* Fix pnpm-lock.yaml formatting
* Fix module versions
* Ignore pnpm-lock.yaml
* Upgrade Cypress action for pnpm support
* Set up node and pnpm before Cypress
* Fix typescript issues
* Include patches directory in Dockerfile
* Fix Jest tests in CI
* Update lockfile
* Update lockfile
* Clean up Dockerfile
* Update pnpm-lock.yaml to reflect current package.json files
* remove yarn-error.log from .gitignore
* formatting
* update data exploration readme
* type jest.config.ts
* fix @react-hook issues for jest
* fix react-syntax-highlighter issues for jest
* fix jest issues from query-selector-shadow-dom
* fix transform ignore patterns and undo previous fixes
* add missing storybook peer dependencies
* fix nullish coalescing operator for storybook
* reorder storybook plugins
* update editor-update-tsd warning to new npm script
* use legacy ssl for chromatic / node 18 compatibility
* use pnpm for visual regression testing workflow
* use node 16 for chromatic
* add @babel/plugin-proposal-nullish-coalescing-operator as direct dependency
* try fix for plugin-server
* cleanup
* fix comment and warning
* update more comments
* update playwright dockerfile
* update plugin source types
* conditional image size reporting
* revert react-native instructions
* less restrictive pnpm versions
* use ref component name in line with style guide
Co-authored-by: Jacob Gillespie <jacobwgillespie@gmail.com>
* chore(ingestion): set ingested_event before delaying anon. events
Now that we delay anonymous events, we end up with a delay in the
onboarding flow, where we need to wait for the flag to be set before
informing the user that an event has successfully been captured.
This is a lesser cousin of
https://github.com/PostHog/posthog/pull/13191 which offers the added
feature of informing the user of the capture date of the latest event,
but has some more work to do re. performance. I would like to get this
in first to unblock the person-on-events re-enabling for new customers.
* Move call to only run when buffering
* feat: remove version from docker compose to support new spec
* feat: simplify the docker-compose setup so we do less version coordination
* update hobby bin
* bump docker-compose version for hobby for extends compat
* move ci to ubuntu-latest
* Revert "move ci to ubuntu-latest"
This reverts commit a0462adfecf182ca7398d809ebb49fac36110d63.
* use docker compose for github ci
* correct comments on base
* refactor(plugin-server): separate api from functional_tests
This just moves the API helpers to a separate file, so that we can
import them from other files.
* test(plugin-server): add functional tests for property definitions
I was going to take a stab at
https://github.com/PostHog/posthog/issues/12529 but I wasn't sure how
the definition bits worked, so I thought I'd add some tests first.
This doesn't just add tests but also:
1. starts demonstrating how we can split up the tests into
different files, thereby also allowing jest test isolation.
2. removes --runInBand, such that isolated tests can run in parallel
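A hypothetical shape of the functional-test Jest config after this change (the paths and worker count are illustrative, not the repo's actual settings):

```typescript
// One test file per area; with the runner fully decoupled from the
// running server, files no longer need --runInBand (a single worker)
// and can run in parallel workers instead.
const jestConfig = {
    testMatch: ['<rootDir>/functional_tests/**/*.test.ts'],
    maxWorkers: '50%', // parallel execution, replacing --runInBand
}

console.log(jestConfig.maxWorkers)
```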
* ci(plugin-server): add coverage for functional_tests
Now that I've completely separated the test runner from the running
server (i.e. they run in completely separate processes), we can pull
out a coverage report using `c8` (which uses Node's built-in V8
coverage tooling and outputs an Istanbul-compatible HTML format).
* wip
* wip
* wip
* wip
* wip
* chore(plugin-server): add perf test using generate_demo_data
Uses generate_demo_data as a basic perf test. Not perfect, but it's a
start. Still need to consider:
1. buffer functionality
2. testing specifically without the buffer functionality; there's
something around using Graphile Worker directly to delay events from
the management script, but IMO we shouldn't need that for
correctness of ingestion, only for the order of events.
* wip
* wip
* wip
* wip
* wip
* wip
* wip
* push events through plugin-server
* build plugin server dist
* run gen demo data earlier
* add timer
* change debug/print to logger debug/info
* update logs, remove workflow
This doesn't yet compare against the base branch, although that would
be great; as it stands it offers some useful insight into where we
might be missing coverage.
* refactor(plugin-server): split out plugin server functionality
To get better isolation we want to allow specific functionality to run
in separate pods. We already have the ingestion / async split, but there
are further divides we can make e.g. the cron style scheduler for plugin
server `runEveryMinute` tasks.
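A rough sketch of the idea, with hypothetical mode and capability names (the real plugin-server flags and mode names may differ):

```typescript
type Capability = 'ingestion' | 'asyncHandlers' | 'scheduler' | 'jobs'

// Each pod runs only the capabilities for its configured mode, so e.g.
// a misbehaving scheduled task cannot take ingestion down with it.
function capabilitiesForMode(mode: 'ingestion' | 'async' | 'full'): Capability[] {
    switch (mode) {
        case 'ingestion':
            return ['ingestion']
        case 'async':
            return ['asyncHandlers', 'scheduler', 'jobs']
        case 'full':
            return ['ingestion', 'asyncHandlers', 'scheduler', 'jobs']
    }
}

console.log(capabilitiesForMode('async'))
```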
* split jobs as well
* Also start Kafka consumers on processAsyncHandlers
* add status for async
* add runEveryMinute test
* avoid fake timers, just accept slower tests
* make e2e concurrent
* chore: also test ingestion/async split
* increase timeouts
* increase timeouts
* lint
* Add functional tests dir
* fix
* fix
* hack
* hack
* fix
* fix
* fix
* wip
* wip
* wip
* wip
* wip
* fix
* remove concurrency
* remove async-worker mode
* add async-handlers
* wip
* add modes to overrideWithEnv validation
* fix: async-handlers -> exports
* update comment
* refactor(plugin-server): use JSON logs when not in dev
To improve observability of the plugin-server, e.g. being able to
easily view all error logs, we enable JSON logs in production. This
should let us, for instance, parse for KafkaJS events like GROUP_JOIN,
but initially we can just use it for filtering on log level.
In development we will still get plain-text log lines.
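A minimal sketch of the dev/prod switch, assuming pino-style options (the actual logger wiring in the plugin-server may differ):

```typescript
// JSON logs in production; pretty-printed plain text in development.
function loggerOptions(nodeEnv: string | undefined): Record<string, unknown> {
    if (nodeEnv === 'development') {
        // pino-pretty renders human-readable plain-text lines
        return { transport: { target: 'pino-pretty' } }
    }
    // Default pino output is one JSON object per line; emitting a string
    // level label (rather than a number) makes filtering by level easier.
    return { formatters: { level: (label: string) => ({ level: label }) } }
}

console.log(JSON.stringify(loggerOptions('development')))
```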
* ensure stderr goes through pino-pretty as well
* output log level names not number
* update versions
* chore(historical-exports): Don't call exportEvents if no events to export
* Make historical export logging more accurate
* Give a progress percentage for export
* Only log & exportEvents if there are events to export
* Track voided promises
* Add more typing to vm upgrade
* Add first test for upgrades
* Test setupPlugin()
* feat(plugin-server): Use Snappy compression codec for kafka production
This helps avoid 'message too large' type errors (see
https://github.com/PostHog/posthog/pull/10968) by compressing in-flight
messages.
I would have preferred to use zstd, but the libraries did not compile
cleanly on my machine.
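For reference, this is roughly how a Snappy codec gets wired into KafkaJS (`kafkajs-snappy` is the codec package assumed here; the actual producer setup in the plugin-server may differ):

```typescript
import { Kafka, CompressionTypes, CompressionCodecs } from 'kafkajs'
import SnappyCodec from 'kafkajs-snappy'

// Register the Snappy codec once at startup; KafkaJS only ships gzip.
CompressionCodecs[CompressionTypes.Snappy] = SnappyCodec

const kafka = new Kafka({ brokers: ['localhost:9092'] }) // placeholder broker
const producer = kafka.producer()

export async function produceCompressed(topic: string, value: string): Promise<void> {
    await producer.connect()
    // Messages are compressed in flight, keeping large payloads below
    // the broker's message.max.bytes limit.
    await producer.send({ topic, compression: CompressionTypes.Snappy, messages: [{ value }] })
}
```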
* Update tests
* fix(autocapture): ensure `$elements` passed to `onEvent`
Before calling `onEvent`, the plugin server, among other things, deletes
the `$elements` associated with `$autocapture` from `event.properties`.
This means that, for instance, the S3 plugin doesn't include this data
in its dump.
We could also include other data like `elements_chain` that we also
store in `ClickHouse`, but I've gone for just including `elements` for
now, as `elements_chain` is derived from `elements` anyhow.
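A simplified sketch of the fix (the interface and helper names are illustrative, not the plugin-server's actual types):

```typescript
interface OnEventPayload {
    event: string
    properties: Record<string, any>
    elements?: any[]
}

// Instead of losing $elements when it is deleted from properties,
// surface it as a top-level `elements` field before handing the event
// to onEvent, so exporters like the S3 plugin still see element data.
function prepareForOnEvent(event: OnEventPayload): OnEventPayload {
    const { $elements, ...otherProperties } = event.properties
    return {
        ...event,
        properties: otherProperties,
        elements: $elements ?? event.elements,
    }
}

const prepared = prepareForOnEvent({
    event: '$autocapture',
    properties: { $elements: [{ tag_name: 'a' }], distinct_id: 'user-1' },
})
console.log(prepared.elements?.length) // → 1
```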
* revert .env changes, I'll do that separately
* run prettier
* update to scaffold 1.3.0
* fix lint
* chore: update scaffold to 1.3.1
* update scaffold
* chore(plugin-server): Consume from buffer topic
* Refactor `posthog` extension for buffering
* Properly form `bufferEvent` and don't throw error
* Add E2E test
* Test buffer more end-to-end and properly
* Put buffer-enabled test in a separate file
* Update each-batch.test.ts
* Test that the event goes through the buffer topic
* Fix formatting
* Refactor out `spyOnKafka()`
* Ensure reliability batching-wise
* Send heartbeats every so often
* Make test less flaky
* Commit offsets if necessary before sleep too
* Update tests
* Use seek-based mechanism (with KafkaJS 2.0.2)
* Add comment to clarify seeking
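The seek-based buffering above can be sketched as pure decision logic (names are illustrative; the real consumer also commits offsets and sends heartbeats before sleeping, per the commits above):

```typescript
type BufferDecision = { action: 'process' } | { action: 'seekAndSleep'; sleepMs: number }

// A message becomes due once its produce timestamp plus the buffer
// delay has passed; otherwise the consumer seeks back to the message's
// offset and sleeps until it is due, rather than dropping it.
function decide(messageTimestampMs: number, nowMs: number, bufferDelayMs: number): BufferDecision {
    const dueAtMs = messageTimestampMs + bufferDelayMs
    if (nowMs >= dueAtMs) {
        return { action: 'process' }
    }
    return { action: 'seekAndSleep', sleepMs: dueAtMs - nowMs }
}

console.log(decide(0, 30_000, 60_000)) // not yet due: sleep 30s more
```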
* Update each-batch.test.ts
* Make minor improvements
* Use built-in sharding for jest
* Set up caching for plugin server CI
Note: the caching scheme is reused from the Python backend tests.
* Upgrade jest to 28
* Cache yarn cache in plugin-server tests
* Test removing SAML for plugin-server dependencies
* Run docker-compose in background