* chore(plugin-server): Validate fetch hostnames
* Only apply Python host check on Cloud
* Update tests to use valid hook URLs
* Only apply plugin server host check in prod
* Update URLs in a couple more tests
* Only check hostnames on Cloud and remove port check
* Fix fetch mocking
* Roll out hostname guard per project
* Fix fetch call assertions
* Make `fetchHostnameGuardTeams` optional
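To make the intent of the hostname guard concrete, here is a hypothetical sketch of the kind of check these commits describe. The helper names, the IPv4-only range list, the `guardedTeams` parameter (mirroring `fetchHostnameGuardTeams`), and the use of Node 18's global `fetch` are all illustrative assumptions, not the actual implementation:

```ts
import { lookup } from 'node:dns/promises'
import { isIP } from 'node:net'

// IPv4 private/loopback/link-local ranges (IPv6 omitted for brevity).
const PRIVATE_RANGES = [/^127\./, /^10\./, /^192\.168\./, /^172\.(1[6-9]|2\d|3[01])\./, /^169\.254\./]

async function isHostnameAllowed(rawUrl: string): Promise<boolean> {
    const { hostname } = new URL(rawUrl)
    // Use the hostname directly if it is an IP literal, otherwise resolve it first.
    const address = isIP(hostname) ? hostname : (await lookup(hostname)).address
    return !PRIVATE_RANGES.some((range) => range.test(address))
}

// `guardedTeams` stands in for the optional per-project rollout set.
export async function guardedFetch(teamId: number, url: string, guardedTeams?: Set<number>): Promise<Response> {
    if (guardedTeams?.has(teamId) && !(await isHostnameAllowed(url))) {
        throw new Error(`Refusing to fetch potentially internal hostname: ${new URL(url).hostname}`)
    }
    return await fetch(url)
}
```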
* chore(plugin-server): remove piscina workers
Using Piscina workers introduces complexity that we would rather avoid.
It does offer the ability to scale work across multiple CPUs, but we can
achieve the same by starting multiple processes instead. It may also
provide some protection against deadlocking the worker process, which I
believe Piscina handles by killing and respawning worker processes, but
we have K8s liveness checks that will handle this as well.
This should simplify 1. Prometheus metrics exporting, and 2. using
node-rdkafka.
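As a rough illustration of the process-based approach described above, here is a minimal sketch using Node's built-in `cluster` module. `startPluginServer` is a hypothetical stand-in for whatever boots a single server instance, and the `WORKER_CONCURRENCY` variable is illustrative; the respawn-on-exit handling mirrors what K8s liveness checks would otherwise cover:

```ts
import cluster from 'node:cluster'
import os from 'node:os'

// Hypothetical placeholder for booting one plugin server instance.
async function startPluginServer(): Promise<void> {
    /* ... */
}

const workerCount = Number(process.env.WORKER_CONCURRENCY) || os.cpus().length

if (cluster.isPrimary) {
    // One process per CPU instead of Piscina worker threads.
    for (let i = 0; i < workerCount; i++) {
        cluster.fork()
    }
    // Respawn any process that dies; in K8s, liveness probes play a similar role.
    cluster.on('exit', (worker, code) => {
        console.warn(`worker ${worker.process.pid} exited with code ${code}, respawning`)
        cluster.fork()
    })
} else {
    void startPluginServer()
}
```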
* remove piscina from package.json
* use createWorker
* wip
* wip
* wip
* wip
* fix export test
* wip
* wip
* fix server stop tests
* wip
* mock process.exit everywhere
* fix health server tests
* Remove collectMetrics
* wip
* chore(recordings): use cooperative-sticky rebalance strategy
This should make rebalances and lag during deploys a little less
painful. I'm setting this as the globally used strategy; if we e.g.
want to use another strategy for a specific consumer group, we can make
this configurable.
* disable rebalance_callback
* use node-rdkafka-acosom fork instead, for cooperative support
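For reference, a minimal sketch of selecting the cooperative protocol in a node-rdkafka consumer config; the group id and topic name below are illustrative, not the actual values used:

```ts
import { KafkaConsumer } from 'node-rdkafka' // or the node-rdkafka-acosom fork mentioned above

// 'partition.assignment.strategy' is a standard librdkafka config key.
// 'cooperative-sticky' lets consumers keep most of their partitions across a
// rebalance instead of revoking everything up front (the eager protocol).
const consumer = new KafkaConsumer(
    {
        'group.id': 'example-consumer-group', // illustrative
        'metadata.broker.list': 'kafka:9092',
        'partition.assignment.strategy': 'cooperative-sticky',
    },
    { 'auto.offset.reset': 'earliest' }
)

consumer.connect()
consumer.on('ready', () => {
    consumer.subscribe(['example-topic']) // illustrative
    consumer.consume()
})
consumer.on('data', (message) => {
    // handle message.value ...
})
```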
* test(plugin-server): use librdkafka for functional tests
While trying to port the session recordings to node-librdkafka, I found
it useful to implement it in the functional tests first.
* use obj destructuring to make calls more self-explanatory
This removes the timekeeper library and uses jest fake timers instead.
This also creates the hub once and reuses it for all tests, which is
faster than creating a new hub for each test.
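A rough sketch of this pattern, assuming a `createHub` helper that returns the hub plus a close function (the import path and signature here are assumptions about the test setup, not verified):

```ts
import { createHub, Hub } from '../../src/utils/db/hub' // path is an assumption

jest.useFakeTimers()

let hub: Hub
let closeHub: () => Promise<void>

// Create the hub once for the whole file instead of once per test.
beforeAll(async () => {
    ;[hub, closeHub] = await createHub()
})

afterAll(async () => {
    await closeHub()
})

test('scheduled work runs after advancing the fake clock', () => {
    expect(hub).toBeDefined()
    const task = jest.fn()
    setTimeout(task, 60_000)
    jest.advanceTimersByTime(60_000) // no real waiting, unlike timekeeper
    expect(task).toHaveBeenCalled()
})
```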
* chore(plugin-server): Add metrics for time of last processed message
Previously we have been alerting on Kafka consumer group offset lag.
However, what we really care about is the delay between messages being
written to Kafka and being processed by the plugin server.
By exposing the timestamp of the last processed message as a gauge, we
can alert when the difference between that time and now exceeds a
threshold.
This alert does not require the plugin-server to be up to trigger, only
that some timestamp was registered at some point, so it also covers
complete failure.
Note that when there are no messages past the committed offsets, the
alert will fire unless we also take the production rate into the topic
into account.
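A minimal sketch of such a gauge using prom-client; the metric and label names are illustrative rather than the exact ones used:

```ts
import { Gauge } from 'prom-client'

// Gauge holding the Kafka message timestamp of the most recently processed
// message. An alert can then compare it against the current time, e.g.
// (time() * 1000 - latest_processed_timestamp_ms) > threshold in PromQL.
const lastProcessedTimestamp = new Gauge({
    name: 'latest_processed_timestamp_ms', // illustrative name
    help: 'Timestamp (ms) of the last Kafka message processed by this consumer.',
    labelNames: ['topic', 'partition', 'groupId'],
})

export function recordLastProcessed(topic: string, partition: number, groupId: string, messageTimestampMs: number): void {
    lastProcessedTimestamp.set({ topic, partition, groupId }, messageTimestampMs)
}
```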
* wip
* wip
* fix imports order
* fix group id
* Add and use waitForExpect instead
* remove yarn.lock
* move comment
* wip
* chore: use pnpm to manage dependencies
* Fix CI errors
* Don't report Docker image size for external PRs
* Fix pnpm-lock.yaml formatting
* Fix module versions
* Ignore pnpm-lock.yaml
* Upgrade Cypress action for pnpm support
* Set up node and pnpm before Cypress
* Fix typescript issues
* Include patches directory in Dockerfile
* Fix Jest tests in CI
* Update lockfile
* Update lockfile
* Clean up Dockerfile
* Update pnpm-lock.yaml to reflect current package.json files
* remove yarn-error.log from .gitignore
* formatting
* update data exploration readme
* type jest.config.ts
* fix @react-hook issues for jest
* fix react-syntax-highlighter issues for jest
* fix jest issues from query-selector-shadow-dom
* fix transform ignore patterns and undo previous fixes
* add missing storybook peer dependencies
* fix nullish coalescing operator for storybook
* reorder storybook plugins
* update editor-update-tsd warning to new npm script
* use legacy ssl for chromatic / node 18 compatibility
* use pnpm for visual regression testing workflow
* use node 16 for chromatic
* add @babel/plugin-proposal-nullish-coalescing-operator as direct dependency
* try fix for plugin-server
* cleanup
* fix comment and warning
* update more comments
* update playwright dockerfile
* update plugin source types
* conditional image size reporting
* revert react-native instructions
* less restrictive pnpm versions
* use ref component name in line with style guide
Co-authored-by: Jacob Gillespie <jacobwgillespie@gmail.com>
* chore(ingestion): set ingested_event before delaying anon. events
Now that we delay anonymous events, we end up having a delay in the
onboarding flow where we need to wait for the flag to be set before
informing the user that an event has successfully been captured.
This is a lesser cousin of
https://github.com/PostHog/posthog/pull/13191, which offers the added
feature of informing the user of the capture date of the latest event
but still has some work to do re. performance. I would like to get this
in first to unblock re-enabling person-on-events for new customers.
* Move call to only run when buffering
* feat: remove version from docker compose to support new spec
* feat: simplify the docker-compose setup so we do less version coordination
* update hobby bin
* bump docker-compose version for hobby for extends compat
* move ci to ubuntu-latest
* Revert "move ci to ubuntu-latest"
This reverts commit a0462adfecf182ca7398d809ebb49fac36110d63.
* use docker compose for github ci
* correct comments on base
* refactor(plugin-server): separate api from functional_tests
This just moves the api helpers to a separate file, so that they can be
imported from other files.
* test(plugin-server): add functional tests for property definitions
I was going to take a stab at
https://github.com/PostHog/posthog/issues/12529 but I wasn't sure how
the definition bits worked, so I thought I'd add some tests first.
This doesn't just add tests but also:
1. starts demonstrating how we can split up the tests into
different files, thereby also allowing jest test isolation.
2. removes --runInBand, such that isolated tests can run in parallel.
* ci(plugin-server): add coverage for functional_tests
Now that I've completely separated out the test runner from the running
server (i.e. they are in completely separate processes), we can pull out
a coverage report using `c8` (which uses Node's built-in V8 coverage
tooling and outputs an Istanbul-compatible HTML report).
* wip
* wip
* wip
* wip
* wip
* chore(plugin-server): add perf test using generate_demo_data
Uses generate_demo_data as a basic perf test. Not perfect, but it's a
start. Still need to consider:
1. buffer functionality
2. testing specifically without the buffer functionality; there's
something around using graphile worker directly to delay events from
the management script, but we shouldn't need to do this for
correctness of ingestion imo, only for the order of events.
* wip
* wip
* wip
* wip
* wip
* wip
* wip
* push events through plugin-server
* build plugin server dist
* run gen demo data earlier
* add timer
* change debug/print to logger debug/info
* update logs, remove workflow
Doesn't try to do any comparison to the base branch yet, although that
would be great, but as it stands it offers some useful insight into
where we might be missing coverage.
* refactor(plugin-server): split out plugin server functionality
To get better isolation, we want to allow specific functionality to run
in separate pods. We already have the ingestion / async split, but there
are further divides we can make, e.g. the cron-style scheduler for
plugin server `runEveryMinute` tasks.
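A hypothetical sketch of what such a mode split can look like: a single environment variable picks which capabilities a pod starts. The variable name, mode names, and capability flags below are illustrative, not the exact ones the server uses:

```ts
// Illustrative mode names and capability flags.
type PluginServerMode = 'ingestion' | 'async-handlers' | 'jobs' | 'scheduler' | null

interface Capabilities {
    ingestion?: boolean
    processAsyncHandlers?: boolean
    processJobs?: boolean
    pluginScheduledTasks?: boolean
}

function capabilitiesForMode(mode: PluginServerMode): Capabilities {
    switch (mode) {
        case 'ingestion':
            return { ingestion: true }
        case 'async-handlers':
            return { processAsyncHandlers: true }
        case 'jobs':
            return { processJobs: true }
        case 'scheduler':
            return { pluginScheduledTasks: true }
        default:
            // No mode set: run everything in a single process.
            return { ingestion: true, processAsyncHandlers: true, processJobs: true, pluginScheduledTasks: true }
    }
}

const mode = (process.env.PLUGIN_SERVER_MODE ?? null) as PluginServerMode
console.info('starting with capabilities', capabilitiesForMode(mode))
```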
* split jobs as well
* Also start Kafka consumers on processAsyncHandlers
* add status for async
* add runEveryMinute test
* avoid fake timers, just accept slower tests
* make e2e concurrent
* chore: also test ingestion/async split
* increase timeouts
* increase timeouts
* lint
* Add functional tests dir
* fix
* fix
* hack
* hack
* fix
* fix
* fix
* wip
* wip
* wip
* wip
* wip
* fix
* remove concurrency
* remove async-worker mode
* add async-handlers
* wip
* add modes to overrideWithEnv validation
* fix: async-handlers -> exports
* update comment
* refactor(plugin-server): use JSON logs when not in dev
To improve observability of the plugin-server, for instance easily being
able to view all error logs, we enable JSON logs in production. This
should, for example, let us easily parse for KafkaJS events like
GROUP_JOIN, but initially we can just use it for filtering down on log
level.
In development we will still get plain text log lines.
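A minimal sketch of this kind of setup with pino (the exact options used may differ): JSON in production, pino-pretty only in development, and level names rather than numbers in the output:

```ts
import pino from 'pino'

const isDev = process.env.NODE_ENV === 'development'

export const logger = pino({
    // Emit level names ("info", "error") instead of pino's numeric levels.
    formatters: {
        level: (label) => ({ level: label }),
    },
    // Pretty, colorized plain-text lines in dev; one JSON object per line otherwise.
    ...(isDev ? { transport: { target: 'pino-pretty', options: { colorize: true } } } : {}),
})

logger.info({ kafkaEvent: 'GROUP_JOIN' }, 'consumer joined group')
```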
* ensure stderr goes through pino-pretty as well
* output log level names not number
* update versions
* chore(historical-exports): Don't call exportEvents if no events to export
* Make historical export logging more accurate
* Give a progress percentage for export
* Only log & exportEvents if there are events to export
* Track voided promises
* Add more typing to vm upgrade
* Add first test for upgrades
* Test setupPlugin()