264 Commits

Author SHA1 Message Date
Brett Hoerner
4756d476be chore(plugin-server): bump node-rdkafka to 2.18.0 (librdkafka 2.3.0) (#18228) 2023-10-30 13:56:41 -06:00
Tiina Turban
b6efd06478 chore: Update plugin-scaffold to latest (#17930) 2023-10-17 12:26:35 +02:00
Brett Hoerner
d68fa14f10 chore(plugin-server): run node-rdkafka with cooperative rebalancing patched in (#17747)
* Remove node-rdkafka-acosom

* Add node-rdkafka

* Replace node-rdkafka-acosom imports with node-rdkafka

* Patch node-rdkafka with changes from https://github.com/PostHog/node-rdkafka/

* Add patch directions to README
2023-10-10 08:13:05 -06:00
Brett Hoerner
e6b9703b74 fix(plugin-server): move ipaddr dep from devDependencies to dependencies (#17528) 2023-09-19 08:46:15 -06:00
Michael Matloka
b7fe004d6b chore(plugin-server): Validate fetch hostnames (#17183)
* chore(plugin-server): Validate fetch hostnames

* Only apply Python host check on Cloud

* Update tests to use valid hook URLs

* Only apply plugin server host check in prod

* Update URLs in a couple more tests

* Only check hostnames on Cloud and remove port check

* Fix fetch mocking

* Roll out hostname guard per project

* Fix fetch call assertions

* Make `fetchHostnameGuardTeams` optional
2023-09-18 14:38:02 +02:00
Brett Hoerner
5b5d0d43a3 chore(plugin-server): make it easier to run multiple plugin-server instances locally (#17456)
* chore(plugin-server): allow customizing the HTTP server port

* chore(plugin-server): add NO_WATCH mode for development

* fix: http-server test
2023-09-15 08:55:46 -06:00
Xavier Vello
46b16d9db5 feat(plugin-server): add profiling http endpoints (#17214) 2023-08-28 16:31:39 +02:00
James Greenhill
bc05c1b8cd fix: Move to RE2 for chainToElements (#17198)
* fix: Move to RE2 for chainToElements

* update actual plugin-server chainToElements

* sort imports

* pull RE2 regex definitions into the module

* Update ci-plugin-server.yml

* bump

* bump re2 version

* moar mem

* update pnpm lock

---------

Co-authored-by: Xavier Vello <xavier.vello@gmail.com>
Co-authored-by: Xavier Vello <xavier@posthog.com>
2023-08-25 10:05:33 -07:00
Marius Andra
59eca63f30 fix: Revert "feat(hogvm): hogvm bytecode action matching in the plugin server" (#17036)
Revert "feat(hogvm): hogvm bytecode action matching in the plugin server (#16937)"

This reverts commit e0d2582e32.
2023-08-15 16:22:19 +01:00
Marius Andra
e0d2582e32 feat(hogvm): hogvm bytecode action matching in the plugin server (#16937) 2023-08-15 15:37:19 +02:00
Tihomir Valkanov
c33ff90b22 fix: use latest pnpm version (plugin-server) (#16724) 2023-08-11 11:21:14 +02:00
Ben White
d198daadf4 feat: Moved to gzipping on write stream instead of read (#16754) 2023-07-26 11:44:41 +02:00
Xavier Vello
5e28ad5c3d chore(plugin-server): patch unhandled exception in pg library (#15960) 2023-06-20 10:22:41 +02:00
Xavier Vello
57d8a60e6a chore(plugin-server): add support for sentry profiling (#15696) 2023-06-01 11:06:23 +02:00
Xavier Vello
44e3bda1ce chore(deps): upgrade vm2@3.9.18 to address CVE (#15567) 2023-05-16 14:26:27 +02:00
Harry Waye
7ba6fa7148 chore(plugin-server): remove piscina workers (#15327)
* chore(plugin-server): remove piscina workers

Using Piscina workers introduces complexity that we would rather
avoid. It does offer the ability to scale work across multiple CPUs,
but we can achieve this via starting multiple processes instead. It may
also provide some protection from deadlocking the worker process, which
I believe Piscina will handle by killing worker processes and
respawning, but we have K8s liveness checks that will also handle this.

This should simplify 1. prom metrics exporting, and 2. using
node-rdkafka.

* remove piscina from package.json

* use createWorker

* wip

* wip

* wip

* wip

* fix export test

* wip

* wip

* fix server stop tests

* wip

* mock process.exit everywhere

* fix health server tests

* Remove collectMetrics

* wip
2023-05-03 14:42:16 +00:00
Harry Waye
fad0fa3e76 chore(plugin-server): only reload app on src changes (#15343)
When you're working on tests, it's annoying to have to wait for the main
app to restart.
2023-05-03 07:26:12 +00:00
Harry Waye
84d5704b86 chore(plugin-server): use swc for plugin server dev (#15324)
Rather than using ts-node-dev which looks unmaintained now, we use
nodemon and SWC to make dev faster.
2023-05-02 12:21:24 +00:00
Xavier Vello
c4cd48b403 chore: upgrade pnpm to 8.3.1 (#15273) 2023-04-28 11:35:58 +02:00
Xavier Vello
c6cf1f7f10 chore: align prettier versions to fix CI (#15274) 2023-04-27 10:11:39 +00:00
James Greenhill
e0711e2dcf chore: bump vm2 due to CVE (#15268)
* chore: bump vm2 due to CVE

* add lock

* use the right version of pnpm
2023-04-26 14:30:26 -07:00
Harry Waye
96fe16fd3c chore(recordings): use cooperative-sticky rebalance strategy (#15260)
Revert "revert(recordings): use cooperative-sticky rebalance strategy (#15211)"

This reverts commit a40f01138e.
2023-04-26 13:09:13 +00:00
Ben White
fdb2c71a39 feat: S3 backed recording ingestion (take 2) (#14864) 2023-04-25 09:43:07 +00:00
Harry Waye
a40f01138e revert(recordings): use cooperative-sticky rebalance strategy (#15211)
Revert "chore(recordings): use cooperative-sticky rebalance strategy (#15197)"

This reverts commit 3eddb96b9b.
2023-04-24 15:06:33 +00:00
Harry Waye
3eddb96b9b chore(recordings): use cooperative-sticky rebalance strategy (#15197)
* chore(recordings): use cooperative-sticky rebalance strategy

This should make rebalances and lag during deploys a little less
painful. I'm setting this as the globally used strategy; if we later
want to use another strategy for a specific consumer group, we can make
this configurable.

* disable rebalance_callback

* use node-rdkafka-acosom fork instead, for cooperative support
2023-04-24 13:25:24 +00:00
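
As a rough illustration of the change described in the commit above: librdkafka selects the rebalance protocol through the `partition.assignment.strategy` property, and `cooperative-sticky` is one of its accepted values. The sketch below uses that strategy as the global default and allows a per-group override, as the message suggests could be added later. `buildConsumerConfig`, `strategyOverrides`, and the group names are hypothetical and not taken from the plugin-server code.

```typescript
// Hypothetical helper: not the plugin-server's actual configuration code.
type AssignmentStrategy = 'cooperative-sticky' | 'range' | 'roundrobin'

interface ConsumerConfig {
    'group.id': string
    'partition.assignment.strategy': AssignmentStrategy
}

// Global default, as the commit message describes.
const DEFAULT_STRATEGY: AssignmentStrategy = 'cooperative-sticky'

// Returns the config for one consumer group, applying an override if one exists.
function buildConsumerConfig(
    groupId: string,
    strategyOverrides: Partial<Record<string, AssignmentStrategy>> = {}
): ConsumerConfig {
    return {
        'group.id': groupId,
        'partition.assignment.strategy': strategyOverrides[groupId] ?? DEFAULT_STRATEGY,
    }
}
```

Calling `buildConsumerConfig('some-group')` with no overrides yields the cooperative-sticky default; passing `{ 'some-group': 'range' }` switches only that group.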
Xavier Vello
05a8d1f302 chore(plugin-server): collect prometheus metrics from piscina workers (#15002) 2023-04-11 13:27:07 +02:00
Harry Waye
a26a87cf40 chore: add profiler to session recordings load test (#14649)
chore: add profiler to session recordings load test
2023-03-09 14:07:54 +00:00
Harry Waye
a4a3a0c902 test(plugin-server): use librdkafka for functional tests (#14468)
* test(plugin-server): use librdkafka for functional tests

While trying to port the session recordings to use node-librdkafka I
found it useful to first implement it in the functional tests.

* use obj destructuring to make calls more self explanatory
2023-03-01 11:03:13 +00:00
Harry Waye
ac6b863c73 refactor: remove timekeeper, use jest fake timers, create hub once (#14166)
This removes the timekeeper library and uses jest fake timers instead.
This also creates the hub once and reuses it for all tests, which is
faster than creating a new hub for each test.
2023-02-09 11:30:41 +00:00
Xavier Vello
be607d50ad chore: add plugin-server prom metrics 1/n (#14014)
* update prom-client
* add kafka error and instrumentation function metrics
2023-01-31 14:58:14 +01:00
Xavier Vello
77d0125138 feat: set export batch size based on plugin settings (#13559) 2023-01-26 16:31:02 +01:00
timgl
d4f9790234 chore(plugin-server): Upgrade jsonwebtoken to 9.0.0 (#13900) 2023-01-26 15:27:02 +01:00
Guido Iaquinti
19d65fb005 chore(dockerfile): switch base image to non alpine (#13314) 2022-12-16 11:12:11 +01:00
Harry Waye
1e82569bbb chore(plugin-server): Add metrics for time of last processed message (#13350)
* chore(plugin-server): Add metrics for time of last processed message

Previously we have been alerting on Kafka consumer group offset lag.
However, really we care about the delay between messages being written
to Kafka and being processed by the plugin server.

By adding the last processed timestamp, as a gauge, we can then alert
if the gap between that time and now is greater than a threshold.

This alert would not require the plugin-server to be up to trigger, just
that there be some time registered so it handles complete failure also.

For the case that there are no messages past the committed offsets, we
will end up triggering the alert if we do not also take into
consideration the production rate into the topic.

* wip

* wip

* fix imports order

* fix group id

* Add and use waitForExpect instead

* remove yarn.lock

* move comment

* wip
2022-12-15 18:28:43 +00:00
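
The alert condition described in the commit above can be sketched as a small pure function. This is only an illustration of the reasoning in the message (compare the last processed time with now, and take the production rate into account so an idle topic does not trigger the alert); `ConsumerSnapshot`, `shouldAlert`, and the field names are hypothetical and do not come from the plugin-server or its alerting rules.

```typescript
// Hypothetical names throughout: a sketch of the alert condition only.
interface ConsumerSnapshot {
    // Gauge value in epoch milliseconds; null if no time has been registered yet.
    lastProcessedAtMs: number | null
    // Rate of messages being produced into the topic.
    producedPerSecond: number
}

function shouldAlert(snapshot: ConsumerSnapshot, nowMs: number, thresholdMs: number): boolean {
    if (snapshot.lastProcessedAtMs === null) {
        // Nothing registered yet, so there is no time to compare against.
        return false
    }
    if (snapshot.producedPerSecond <= 0) {
        // Idle topic: no messages past the committed offsets, so a stale
        // timestamp does not indicate a processing delay.
        return false
    }
    // Fire when the gap between the last processed time and now exceeds the threshold.
    return nowMs - snapshot.lastProcessedAtMs > thresholdMs
}
```

Because the check depends only on the stored timestamp and the current time, it can still fire when the consumer is completely down, which is the property the commit message calls out.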
Thomas Obermüller
4a30e78b22 chore: use pnpm to manage dependencies (closes #12635) (#13190)
* chore: use pnpm to manage dependencies

* Fix CI errors

* Don't report Docker image size for external PRs

* Fix pnpm-lock.yaml formatting

* Fix module versions

* Ignore pnpm-lock.yaml

* Upgrade Cypress action for pnpm support

* Set up node and pnpm before Cypress

* Fix typescript issues

* Include patches directory in Dockerfile

* Fix Jest tests in CI

* Update lockfile

* Update lockfile

* Clean up Dockerfile

* Update pnpm-lock.yaml to reflect current package.json files

* remove yarn-error.log from .gitignore

* formatting

* update data exploration readme

* type jest.config.ts

* fix @react-hook issues for jest

* fix react-syntax-highlighter issues for jest

* fix jest issues from query-selector-shadow-dom

* fix transform ignore patterns and undo previous fixes

* add missing storybook peer dependencies

* fix nullish coalescing operator for storybook

* reorder storybook plugins

* update editor-update-tsd warning to new npm script

* use legacy ssl for chromatic / node 18 compatibility

* use pnpm for visual regression testing workflow

* use node 16 for chromatic

* add @babel/plugin-proposal-nullish-coalescing-operator as direct dependency

* try fix for plugin-server

* cleanup

* fix comment and warning

* update more comments

* update playwright dockerfile

* update plugin source types

* conditional image size reporting

* revert react-native instructions

* less restrictive pnpm versions

* use ref component name in line with style guide

Co-authored-by: Jacob Gillespie <jacobwgillespie@gmail.com>
2022-12-12 10:28:06 +01:00
Harry Waye
8018aa7244 chore(ingestion): set ingested_event before delaying anon. events (#13236)
* chore(ingestion): set ingested_event before delaying anon. events

Now that we delay anonymous events, we end up having a delay in the
onboarding flow where we need to wait for the flag to be set before
informing the user that an event has successfully been captured.

This is a lesser cousin of
https://github.com/PostHog/posthog/pull/13191 which offers the added
feature of informing the user of the capture date of the latest event,
but has some more work to do re. performance. I would like to get this
in first to unblock the person-on-events re-enabling for new customers.

* Move call to only run when buffering
2022-12-09 14:38:16 +00:00
Xavier Vello
c0702b8b19 fix(ingestion): fix noisy 'Invalid unit value NaN' errors (#13027) 2022-11-30 15:46:42 +01:00
James Greenhill
696028e800 feat: simplify the docker-compose setup so we do less version coordinations (#12998)
* feat: remove version from docker compose to support new spec

* feat: simplify the docker-compose setup so we do less version coordinations

* update hobby bin

* bump docker-compose version for hobby for extends compat

* move ci to ubuntu-latest

* Revert "move ci to ubuntu-latest"

This reverts commit a0462adfecf182ca7398d809ebb49fac36110d63.

* use docker compose for github ci

* correct comments on base
2022-11-29 20:50:42 +00:00
Harry Waye
b010073ec4 test(plugin-server): add functional tests for property definitions (#12659)
* refactor(plugin-server): separate api from functional_tests

This just moves the api helpers to a separate file, such that we can
import from other files.

* test(plugin-server): add functional tests for property definitions

I was going to take a stab at
https://github.com/PostHog/posthog/issues/12529 but I wasn't sure how
the definition bits worked, so thought I'd add some tests first.

This doesn't just add tests but also:

 1. starts demonstrating how we can split up the tests into
    different files, thereby also allowing jest test isolation.
 2. removes --runInBand, such that isolated tests can run in parallel
2022-11-08 06:56:19 +00:00
Harry Waye
aa5231307e ci(plugin-server): add coverage for functional_tests (#12593)
* ci(plugin-server): add coverage for functional_tests

Now I've completely separated out the test runner from the running
server (i.e. they are in completely separate processes), we can pull out
a coverage report using `c8` (which uses Node's built-in V8 coverage
tooling and outputs an Istanbul-compatible HTML format).

* wip

* wip

* wip

* wip

* wip
2022-11-03 13:06:12 +00:00
Harry Waye
fff4483ff2 chore(plugin-server): add perf test using HedgeBox Matrix (#12392)
* chore(plugin-server): add perf test using generate_demo_data

Uses generate_demo_data as a basic perf test. Not perfect but it's a
start. Still need to consider:

 1. buffer functionality
 2. testing specifically without the buffer functionality, there's
    something around using graphile worker directly to delay events from
    the management script but we shouldn't need to do this to provide
    correctness of ingestion imo, rather only the order of events.

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* push events through plugin-server

* build plugin server dist

* run gen demo data earlier

* add timer

* change debug/print to logger debug/info

* update logs, remove workflow
2022-10-25 10:48:11 +00:00
Harry Waye
cb771f10d3 ci(plugin-server): add coverage output to plugin-server functional tests (#12361)
Doesn't try to do any comparison to base yet although that would be
great, but as it stands it offers some useful insights into where we
might be missing coverage.
2022-10-20 11:11:05 +01:00
Harry Waye
d3f9d865f5 refactor(plugin-server): split out plugin server functionality (#12191)
* refactor(plugin-server): split out plugin server functionality

To get better isolation we want to allow specific functionality to run
in separate pods. We already have the ingestion / async split, but there
are further divides we can make e.g. the cron style scheduler for plugin
server `runEveryMinute` tasks.

* split jobs as well

* Also start Kafka consumers on processAsyncHandlers

* add status for async

* add runEveryMinute test

* avoid fake timers, just accept slower tests

* make e2e concurrent

* chore: also test ingestion/async split

* increase timeouts

* increase timeouts

* lint

* Add functional tests dir

* fix

* fix

* hack

* hack

* fix

* fix

* fix

* wip

* wip

* wip

* wip

* wip

* fix

* remove concurrency

* remove async-worker mode

* add async-handlers

* wip

* add modes to overrideWithEnv validation

* fix: async-handlers -> exports

* update comment
2022-10-20 09:22:46 +00:00
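
One way to picture the split described in the commit above is a lookup from a process mode to the set of capabilities that process enables, so each pod runs only one slice of the functionality. The sketch below is an assumption-laden illustration: the mode names echo words used in the commit message, but `Mode`, `Capabilities`, `capabilitiesForMode`, and the individual capability flags are hypothetical and may not match the real plugin-server code.

```typescript
// Hypothetical types and names: an illustration of mode-based splitting only.
type Mode = 'ingestion' | 'async' | 'exports' | 'jobs' | 'scheduler'

interface Capabilities {
    ingestion?: boolean
    processAsyncHandlers?: boolean
    processPluginJobs?: boolean
    pluginScheduledTasks?: boolean
}

// With no mode set, a single process does everything.
const ALL_CAPABILITIES: Capabilities = {
    ingestion: true,
    processAsyncHandlers: true,
    processPluginJobs: true,
    pluginScheduledTasks: true,
}

// Each mode enables only the functionality that pod is responsible for.
const CAPABILITIES_BY_MODE: Record<Mode, Capabilities> = {
    ingestion: { ingestion: true },
    async: { processAsyncHandlers: true, processPluginJobs: true, pluginScheduledTasks: true },
    exports: { processAsyncHandlers: true },
    jobs: { processPluginJobs: true },
    scheduler: { pluginScheduledTasks: true },
}

function capabilitiesForMode(mode: Mode | null): Capabilities {
    return mode === null ? ALL_CAPABILITIES : CAPABILITIES_BY_MODE[mode]
}
```

A table like this keeps the decision about what a process runs in one place, which makes it straightforward to add a further split later.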
Yakko Majuri
c47a73165a feat(plugin-server): use graphile-worker crontab (#12242)
* yeet references to redlock

* rename jobs/ to graphile-worker/

* feat(plugin-server): use graphile-worker crontab

* remove debugging

* yeet redlock dependency

* remove legacy test

* Update comment

* Update plugin-server/src/main/pluginsServer.ts

Co-authored-by: Harry Waye <harry@posthog.com>

* address review, update tests

* fix old tests

* testing, testing

* maybe fix sigterm

Co-authored-by: Harry Waye <harry@posthog.com>
2022-10-18 11:44:41 -03:00
Karl-Aksel Puulmann
4d1c0e45fb feat(app-metrics): Gather and show information on errors (#12250)
* Add dependencies

* WIP: Gather error context

* Downgrade package

* Update tracking logic

* Query errors along with app metrics/export metrics

* last_seen

* Fetch and show errors table underneath metrics

* Sorting order

* Endpoint for fetching sample error details for an error

* Render error details drawer

* Render tabs in ErrorDetailsDrawer

* Tests for AppMetricsErrorsQuery

* Tests for AppMetricsErrorDetailsQuery

* Tests for historical_export_metrics

* Update existing app metrics API tests

* /error_details endpoint tests

* Update retries tests

* Update v2 historical exports tests

* Tests for app-metrics.ts

* run prettier

* Avoid reloading data on table sorting

* Fix fat-finger
2022-10-18 09:58:08 +03:00
dependabot[bot]
933cebb1b3 chore(deps): bump vm2 from 3.9.6 to 3.9.11 in /plugin-server (#12013)
Bumps [vm2](https://github.com/patriksimek/vm2) from 3.9.6 to 3.9.11.
- [Release notes](https://github.com/patriksimek/vm2/releases)
- [Changelog](https://github.com/patriksimek/vm2/blob/master/CHANGELOG.md)
- [Commits](https://github.com/patriksimek/vm2/compare/3.9.6...3.9.11)

---
updated-dependencies:
- dependency-name: vm2
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-10-07 09:39:09 -03:00
Harry Waye
8118ded811 refactor(plugin-server): use JSON logs when not in dev (#11330)
* refactor(plugin-server): use JSON logs when not in dev

To improve observability of the plugin-server, for instance easily being
able to view all error logs, we enable JSON logs in production. This
should let us, for instance, easily parse for KafkaJS events like
GROUP_JOIN, but initially we can just use it to filter by log
level.

In development we will still get a plain text log line.

* ensure stderr goes through pino-pretty as well

* output log level names not number

* update versions
2022-09-20 11:49:52 +01:00
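
To make the motivation in the commit above concrete, the sketch below emits one JSON object per line outside development and a plain text line in development, then filters the JSON lines by level name. It is a dependency-free illustration rather than the plugin-server's logger; `formatLogLine`, `filterByLevel`, and the field names are hypothetical.

```typescript
// Hypothetical helpers: a dependency-free sketch, not the plugin-server logger.
type Level = 'debug' | 'info' | 'warn' | 'error'

// In development, emit a plain text line; otherwise emit one JSON object per line.
function formatLogLine(level: Level, msg: string, isDev: boolean, time: Date = new Date()): string {
    if (isDev) {
        return `${time.toISOString()} ${level.toUpperCase()} ${msg}`
    }
    // The level is written as a name rather than a number so it is easy to filter on.
    return JSON.stringify({ level, time: time.toISOString(), msg })
}

// Structured lines can be filtered by level without fragile string matching.
function filterByLevel(jsonLines: string[], level: Level): string[] {
    return jsonLines.filter((line) => JSON.parse(line).level === level)
}
```

For example, passing a batch of production lines to `filterByLevel(lines, 'error')` returns only the error entries, which is the "view all error logs" case the commit message mentions.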
Karl-Aksel Puulmann
514c0caea5 chore(plugin-server): Historical exports fixes (#11464)
* chore(historical-exports): Don't call exportEvents if no events to export

* Make historical export logging more accurate

* Give a progress percentage for export

* Only log & exportEvents if there are events to export

* Track voided promises

* Add more typing to vm upgrade

* Add first test for upgrades

* Test setupPlugin()
2022-08-25 08:58:43 +03:00
Neil Kakkar
7356da1c94 chore: bump posthog-node version (#11435) 2022-08-24 11:55:54 +01:00
Karl-Aksel Puulmann
c52261b4cf chore(plugin-server): Update @maxmind/geoip2-node to 3.4.0 (#11285)
* update scaffold

* fix yarn.lock
2022-08-22 11:09:00 +03:00