Commit Graph

94 Commits

Author SHA1 Message Date
Brett Hoerner
30bafdd382 chore(plugin-server): kafka ack cleanup and metric (#21111)
* cleanup: remove unused team arg from registerLastStep

* cleanup: rename promises to ackPromises to make it clearer that's what they are

* cleanup(plugin-server): make waitForAck explicit/required

* add Kafka produce/ack metrics

* Clarify Kafka produce metric/labels
2024-03-25 13:01:15 +00:00
Michael Matloka
f02d045bf9 chore(environments): Add migration to backfill projects (#20887)
* chore(environments): Add migration to backfill projects

* Fix `noop`

* Add `project_id` to plugin server test setup

* Fix `project_id`

* Also add `posthog_project` to plugin server tests

* Update `createTeam`

* Fix func tests
2024-03-14 13:45:28 +01:00
Tiina Turban
9c0802a541 chore: skip flaky autocapture functional test (#20638)
chore: skip flaky autocapture test

We're not really touching that code atm, this test flakes quite a bit, let's skip it.
2024-02-29 14:24:33 +01:00
Xavier Vello
6117206ea2 feat(ingestion): pass PLUGIN_SERVER_MODE as pg app name (#20613) 2024-02-29 10:34:28 +01:00
Marius Andra
4f50326aec fix(taxonomy): don't convert numeric strings to numbers (#19774) 2024-01-22 17:58:43 +01:00
Tiina Turban
e391387794 feat: Remove exportEvents (#18682) 2023-12-05 14:40:05 +01:00
ted kaemming
9299aa09e5 fix(plugin-server): Remove Postgres-based plugin error logging in favor of existing ClickHouse-based approaches (#18764) 2023-11-27 10:41:36 -08:00
ted kaemming
f342f35f33 test(plugin-server): Reduce flakiness of plugin teardown functional tests (#18896) 2023-11-27 07:51:39 -08:00
Xavier Vello
538a1be24f chore(CI): remove functional_tests/exports-v2.test.ts (#18549) 2023-11-10 15:59:23 +01:00
Tiina Turban
e3298f897f feat: p-s to support composeWebhook (#18465) 2023-11-09 18:19:18 +01:00
ted kaemming
9ade506258 feat: Log ingest warning on messages that are too large (#18318) 2023-11-03 07:56:46 -07:00
Xavier Vello
68fd523c35 fix(ci): fix flaky 'plugins can use attachements' test (#18341) 2023-11-02 15:40:33 +01:00
Brett Hoerner
d16408784f fix(plugin-server): fix unicode null byte blowing up the pipeline (#18282) 2023-10-30 16:39:10 -06:00
Brett Hoerner
286b689998 chore(plugin-server): remove INGESTION_DELAY_WRITE_ACKS and workerMethods (#17932)
* chore: stop using piscina worker methods for runEventPipeline

* chore(plugin-server): remove INGESTION_DELAY_WRITE_ACKS

---------

Co-authored-by: Tiina Turban <tiina303@gmail.com>
2023-10-16 09:19:49 -06:00
Paul D'Ambra
31c1cdf301 chore: yeet CH recordings ingestion (#17572)
Removing ClickHouse based recordings

One big yeet for a man, a great yeet for humanity
2023-10-11 14:23:41 +01:00
Brett Hoerner
d68fa14f10 chore(plugin-server): run node-rdkafka with cooperative rebalancing patched in (#17747)
* Remove node-rdkafka-acosom

* Add node-rdkafka

* Replace node-rdkafka-acosom imports with node-rdkafka

* Patch node-rdkafka with changes from https://github.com/PostHog/node-rdkafka/

* Add patch directions to README
2023-10-10 08:13:05 -06:00
Ben White
05079fa1a7 feat: Save app properties and others to Person from events (#17393) 2023-09-14 12:57:54 +02:00
Xavier Vello
1b6628055d feat(plugin-server): allow to use several PG connection pools (#17001)
Co-authored-by: Tiina Turban <tiina303@gmail.com>
2023-08-24 11:09:10 +02:00
Ben White
d8df34f4ab feat: Replay events consumer (#16642) 2023-07-20 14:41:25 +00:00
Harry Waye
f901665bfa chore: make sure dlqs exist in function tests before consuming (#16550)
In CI it's often the case that we get an error saying the
topic-partition pair doesn't exist. This creates the topic explicitly.
2023-07-13 10:49:01 +01:00
Harry Waye
06bd75ee1e chore(plugin-server): we split onevent and webhooks consumers (#16511)
* Revert "Revert "chore(plugin-server): we split onevent and webhooks consumers" (#16510)"

This reverts commit 59af5b904d.

* remove capability

* fix load actions

* fix typing

* fix tests

* wip
2023-07-12 15:44:21 +00:00
Harry Waye
5b883cbda4 fix: historical test was missing a topic for one capture (#16493)
* fix: historical test was missing a topic for one capture

As a result we were processing out of order and the test was failing.

* remove comment
2023-07-11 14:37:35 +00:00
Harry Waye
9f0cf9f40f chore: make attachment test less flaky
We do this by making sure the plugin config and the attachment are
committed together.
2023-07-11 14:08:29 +00:00
Ben White
5a636f6bd1 feat: Optimise resource usage for blob ingester (#16478) 2023-07-11 15:11:36 +02:00
Tiina Turban
34f4f12d99 feat: backfill consumer (#16460)
* feat: backfill consumer

* Add functional test

* stop consumer

* fix test flake

---------

Co-authored-by: Harry Waye <harry@posthog.com>
2023-07-11 11:40:47 +00:00
Harry Waye
47fcd871bd chore: disable slack on zapier test (#16483)
There's a race condition in the test in that if we check the request
after slack has fired but before zapier has, we'll get a false negative.
2023-07-11 10:55:33 +00:00
Harry Waye
c85d94266c chore: disconnect consumer on error handling functional_tests (#16480)
If we don't, we end up with a bunch of errors in the logs about imports
happening after jest tests have finished.
2023-07-11 10:40:40 +00:00
Harry Waye
c6a2449d3a chore(plugin-server): just remove jobs-worker test, it's flaky (#16319)
It's also an optimization, as we already do not schedule the job if
the plugin is disabled. It just means that if loads of jobs were
scheduled to graphile before the plugin was disabled, it will take a
while to get through them.
2023-06-30 11:18:02 +01:00
Harry Waye
480b9724dc chore: simplify message size too large ingestion test (#16125)
We didn't need to test with so many events. And we add a check on the
DLQ just to make sure we actually did error out.
2023-06-18 20:36:33 +01:00
Harry Waye
924deae8dc fix(ingestion): add DLQ for non-retriable errors (#16124)
* fix(ingestion): add DLQ for non-retriable errors

This is due to
https://posthog.slack.com/archives/C0185UNBSJZ/p1687006425094159 which
is causing some lag on ingestion.

* fix error handling

* fix tests
2023-06-17 22:14:00 +01:00
Harry Waye
aa488321e6 chore: allow plugins to be made non-global but continue to work (#15860)
We want to be able to make plugins non-global, but we don't want to
break existing plugins. This commit adds a test to ensure that plugins
continue to work when they are made non-global.

Previously they would have been disabled by the `is_global` check.

TODO: what will be the impact of making this change?
2023-06-02 10:42:06 +01:00
Harry Waye
d8773d99d9 test: add tests for plugin secrets (#15754)
* test(plugin-server): add test for attachments

We didn't have one before, now we do!

* test: add tests for plugin secrets

Adds basic test for using configurations in plugins. This is a first
step towards getting lazy loading of plugins in safely:
https://github.com/PostHog/posthog/pull/15704
2023-05-26 18:00:45 +00:00
Harry Waye
685d1e2fc8 test(plugin-server): add test for attachments (#15752)
We didn't have one before, now we do!
2023-05-26 17:17:14 +00:00
Xavier Vello
6b0abd05af feat(jobs): don't execute queued jobs for disabled configs (#15738)
* feat(jobs): don't execute queued jobs for disabled configs

* add functional test for disabled plugins

---------

Co-authored-by: Harry Waye <harry@posthog.com>
2023-05-26 14:29:52 +00:00
Harry Waye
39224b018e chore: fix flaky dlq functional tests (#15439)
There is a race condition in these tests where the consumer isn't
consuming in time to pick up bad messages, so we ensure that we set the
offsets to the earliest messages.
2023-05-09 14:36:14 +01:00
Tiina Turban
011f600386 fix: functional tests flakiness (#15386) 2023-05-05 15:56:32 +02:00
Harry Waye
2f9e2928fe chore(plugin-server): use librdkafka producer everywhere (#15314)
* chore(plugin-server): use librdkafka producer everywhere

We saw some 10x improvements in the throughput for session recordings.
Hopefully there will be more improvements here as well, although it's a
little less clear cut.

I don't try to provide any improvements in guarantees around message
production here.

* we still need to enable snappy for kafkajs
2023-05-04 13:02:44 +00:00
Tiina Turban
a5544cf7e4 feat: Async handlers use person info from event (#15307) 2023-05-04 13:25:56 +02:00
Harry Waye
7ba6fa7148 chore(plugin-server): remove piscina workers (#15327)
* chore(plugin-server): remove piscina workers

Using Piscina workers introduces complexity that we would rather
avoid. It does offer the ability to scale work across multiple CPUs,
but we can achieve this via starting multiple processes instead. It may
also provide some protection from deadlocking the worker process, which
I believe Piscina will handle by killing worker processes and
respawning, but we have K8s liveness checks that will also handle this.

This should simplify 1. prom metrics exporting, and 2. using
node-rdkafka.

* remove piscina from package.json

* use createWorker

* wip

* wip

* wip

* wip

* fix export test

* wip

* wip

* fix server stop tests

* wip

* mock process.exit everywhere

* fix health server tests

* Remove collectMetrics

* wip
2023-05-03 14:42:16 +00:00
Paul D'Ambra
359177127d fix: push the buffer files storage down a level (#15295) 2023-05-02 08:12:52 +01:00
Harry Waye
9b4d455a29 docs(set/set_once): add $set/$set_once to docs (#15306)
This also adds a test to ensure we are capturing usage of $set/$set_once
at the top level of the event, as posthog-js uses this method.

This was initiated by the issue mentioned
[here](https://github.com/PostHog/posthog-js/issues/615).
2023-05-01 12:09:38 +00:00
Xavier Vello
013ac5cd93 chore(tests): use PoEv2 join instead of dict for functional_tests (#15188) 2023-04-28 16:27:35 +02:00
Harry Waye
96fe16fd3c chore(recordings): use cooperative-sticky rebalance strategy (#15260)
Revert "revert(recordings): use cooperative-sticky rebalance strategy (#15211)"

This reverts commit a40f01138e.
2023-04-26 13:09:13 +00:00
Paul D'Ambra
b75d560a55 fix: SESSION_RECORDING_BLOB_PROCESSING_TEAMS config handling (#15247)
* correct comment

* correct usage of enabled teams config

* turn on SESSION_RECORDING_BLOB_PROCESSING_TEAMS for all teams in CI

* skip failing tests
2023-04-26 13:23:31 +01:00
Harry Waye
3f4c0498df chore(plugin-server): remove recording forwarding (#15230)
We were forwarding events for backwards compatibility with the old
session recording system. Now that we've removed that, we can remove
this code.
2023-04-25 17:03:43 +01:00
Paul D'Ambra
261deda641 fix: simple functional tests for blob ingestion (#15225)
adding recorded ingestion functional tests was causing other functional tests to fail

now, it doesn't
2023-04-25 14:23:12 +00:00
Ben White
fdb2c71a39 feat: S3 backed recording ingestion (take 2) (#14864) 2023-04-25 09:43:07 +00:00
Harry Waye
a40f01138e revert(recordings): use cooperative-sticky rebalance strategy (#15211)
Revert "chore(recordings): use cooperative-sticky rebalance strategy (#15197)"

This reverts commit 3eddb96b9b.
2023-04-24 15:06:33 +00:00
Harry Waye
3eddb96b9b chore(recordings): use cooperative-sticky rebalance strategy (#15197)
* chore(recordings): use cooperative-sticky rebalance strategy

This should make rebalances and lag during deploys a little less
painful. I'm setting this as the globally used strategy; if we later
want to use another strategy for a specific consumer group, we can make
this configurable.

* disable rebalance_callback

* use node-rdkafka-acosom fork instead, for cooperative support
2023-04-24 13:25:24 +00:00
Harry Waye
0b64b3a79f chore(async-liveness): add async liveness check (#14811)
To make sure pods are restarted if they get stuck.
2023-03-17 18:12:56 +00:00