Feat: Add a generic webhook for sending event notifications #879

Open
wants to merge 19 commits into
base: main
Changes from 12 commits
1 change: 1 addition & 0 deletions .env.example
@@ -16,6 +16,7 @@ SLACK_BOT_USER_ACCESS_TOKEN=''
GOCD_WEBHOOK_SECRET=''
KAFKA_CONTROL_PLANE_WEBHOOK_SECRET=''
SENTRY_OPTIONS_WEBHOOK_SECRET=''
EXAMPLE_SERVICE_SECRET=''

# Silence some GCP noise
DRY_RUN=true
1 change: 1 addition & 0 deletions .env.test
@@ -19,6 +19,7 @@ SLACK_BOT_APP_ID="5678"
GOCD_WEBHOOK_SECRET="webhooksecret"
KAFKA_CONTROL_PLANE_WEBHOOK_SECRET="kcpwebhooksecret"
SENTRY_OPTIONS_WEBHOOK_SECRET="sentryoptionswebhooksecret"
EXAMPLE_SERVICE_SECRET="examplewebhooksecret"

# Other
GOCD_SENTRYIO_FE_PIPELINE_NAME="getsentry-frontend"
4 changes: 4 additions & 0 deletions src/README.md
@@ -20,6 +20,10 @@ Below are descriptions for how this application is organized. Each directory con

## Common Use Cases

## Generic Event Notifier

You can use this service to send a message to Sentry Slack or Datadog. Create a small PR that adds an HMAC secret for your use case, and your service can then send messages to Sentry Slack and Datadog via infra-hub. See [this README](webhooks/README.md) for more details.

### Adding a New Webhook

To add a new webhook, navigate to `webhooks` and follow the directions there. Most of the logic should be self-contained within the `webhooks` directory, with handlers in `brain` being appropriate if the webhook is for receiving event streams. To send a message to external sources, use the APIs in `api`.
7 changes: 0 additions & 7 deletions src/buildServer.ts
@@ -15,7 +15,6 @@ import { loadBrain } from '@utils/loadBrain';

import { SENTRY_DSN } from './config';
import { routeJobs } from './jobs';
import { SlackRouter } from './slack';

export async function buildServer(
logger: boolean | { prettyPrint: boolean } = {
@@ -96,11 +95,5 @@ export async function buildServer(
// Endpoints for Cloud Scheduler webhooks (Cron Jobs)
server.register(routeJobs, { prefix: '/jobs' });

server.post<{ Params: { service: string } }>(
'/slack/:service/webhook',
{},
SlackRouter(server)
);

brian-lou marked this conversation as resolved.
return server;
}
13 changes: 13 additions & 0 deletions src/config/secrets.ts
@@ -0,0 +1,13 @@
/*

This file contains secrets used for verifying incoming events from different HTTP sources.

*/

export const EVENT_NOTIFIER_SECRETS = {
// Follow the pattern below to add a new secret
// 'example-service': process.env.EXAMPLE_SERVICE_SECRET,
};
if (process.env.ENV !== 'production')
EVENT_NOTIFIER_SECRETS['example-service'] =
process.env.EXAMPLE_SERVICE_SECRET;
7 changes: 0 additions & 7 deletions src/slack/README.md

This file was deleted.

38 changes: 0 additions & 38 deletions src/slack/index.ts

This file was deleted.

23 changes: 23 additions & 0 deletions src/types/index.ts
@@ -1,5 +1,7 @@
import { IncomingMessage, Server, ServerResponse } from 'http';

import { EventAlertType } from '@datadog/datadog-api-client/dist/packages/datadog-api-client-v1';
import { Block, KnownBlock } from '@slack/types';
import { FastifyInstance } from 'fastify';

// e.g. the return type of `buildServer`
@@ -26,3 +28,24 @@ export interface KafkaControlPlaneResponse {
title: string;
body: string;
}

export type GenericEvent = {
source: string;
timestamp: number;
service_name?: string; // Official service registry name if applicable


What are you going to use this for? If I remember correctly, the service registry was supposed to be used to route messages to the right channel, but the channel is also passed as a parameter.

Contributor Author

Yes, the idea was to use the service registry if the service_name is supplied, but I'm not sure whether everything that uses this in the future will be associated with a service. If we can make that guarantee, I can remove the channel parameter.

@fpacifici fpacifici Nov 7, 2024

> If we can make that guarantee, I can remove the channel parameter

We cannot make this guarantee.

So the problem is the same as below. Please let's not have an API method with a lot of optional fields at the same level and rely on the user to know which combinations are valid and which are not.

Instead, I would suggest one of these two so that making mistakes is harder:

  • Have more granular methods: one for slack with smart routing that goes to service and one where we specify the channel. Though we need to think through which methods we are going to build upfront.
  • Have one method but provide different structured types for the different types of notification: slack with direct routing, slack with service routing (service is relevant only for slack), jira, datadog, etc.

data: {
title: string;
message: string;
channels: {
slack?: string[]; // list of Slack Channels
datadog?: string[]; // list of DD Monitors
jira?: string[]; // list of Jira Projects
bigquery?: string;
};
tags?: string[]; // Not used for Slack
misc: {
alertType?: EventAlertType; // Datadog alert type
blocks?: (KnownBlock | Block)[]; // Optional Slack blocks


It has been quite some time since we discussed this design.
I can no longer remember whether the generic event was supposed to know where to route the message, or whether it was supposed to be the simple API the user would use to route a message to a specific channel.

If it is the first (the user sends the message and logic in eng-pipes figures out where to route it), then it should not be a generic event and it should not contain channel-specific parameters (like tags and the list of channels).

If it is the simple one (the user picks the channel), then I don't think putting all the channel fields in the same message and API is a good design; you should have dedicated APIs per channel. It would be quite hard to understand which fields are supposed to be populated in which scenario.

If there are good reasons to have one API (still in the simple option, where eng-pipes does not provide any routing logic) and you want to support multiple channels in one API call, then a better API design would be to require a list of objects, each containing all the details needed for a specific channel:

channel[]
  
where channel is the union of 
Slack: {
   channel: string
   blocks
}

Datadog {
   monitor: string
   tags: string[]
}

...

If there are common fields they can be specified only once.
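
The list-of-objects shape described above could be sketched in TypeScript as a discriminated union. This is only an illustration: `SlackTarget`, `DatadogTarget`, `NotificationTarget`, `RoutedNotification`, the `kind` tag, and `describeTarget` are hypothetical names that do not appear in the PR, and `blocks` is typed loosely as a stand-in for the Slack block types.

```typescript
// Hypothetical sketch: none of these names exist in the PR under review.
type SlackTarget = {
  kind: 'slack';
  channel: string; // Slack channel ID
  blocks?: unknown[]; // stand-in for (KnownBlock | Block)[] from @slack/types
};

type DatadogTarget = {
  kind: 'datadog';
  monitor: string;
  tags: string[];
};

// Each target carries only the fields that make sense for its channel.
type NotificationTarget = SlackTarget | DatadogTarget;

// Common fields are specified once, next to the list of targets.
type RoutedNotification = {
  source: string;
  timestamp: number;
  title: string;
  message: string;
  targets: NotificationTarget[];
};

// The `kind` tag lets the compiler narrow each target to its own fields.
function describeTarget(target: NotificationTarget): string {
  switch (target.kind) {
    case 'slack':
      return `slack:${target.channel}`;
    case 'datadog':
      return `datadog:${target.monitor} [${target.tags.join(',')}]`;
    default: {
      // Compile-time exhaustiveness check: fails to type-check if a new kind is added.
      const unreachable: never = target;
      throw new Error(`Unknown target: ${JSON.stringify(unreachable)}`);
    }
  }
}

const exampleNotification: RoutedNotification = {
  source: 'example-service',
  timestamp: 0,
  title: 'This is an Example Notification',
  message: 'Random text here',
  targets: [
    { kind: 'slack', channel: 'C07EH2QGGQ5' },
    { kind: 'datadog', monitor: 'example-monitor', tags: ['source:example-service'] },
  ],
};

const describedTargets = exampleNotification.targets.map(describeTarget);
```

With a shape like this, a Slack target cannot carry Datadog tags and a Datadog target cannot carry Slack blocks, so invalid field combinations are rejected by the type checker rather than left to the caller.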

Contributor Author

I think the idea was to have a combination of both: a generic, simple API that a user can use to route a message to a specific channel, and a more complex webhook that has logic to route more complicated events.

I agree that the typing for the generic event payload can be improved; I will work on that.


Defining the public API of a service is a task that should be thought through carefully, as changing it afterwards, once clients are using it, is hard.
So feel free to take the time to think through the details of all the API methods you expect to create, and send a PR with only those types and empty business logic so we can review whether the whole API is coherent and meaningful.

};
};
};
30 changes: 30 additions & 0 deletions src/webhooks/README.md
@@ -3,6 +3,36 @@
* Webhooks in "production" are deployed to a Google Cloud Run instance, in the project `super-big-data`. Why? (TODO insert why)
* The webhook points to `https://product-eng-webhooks-vmrqv3f7nq-uw.a.run.app`

## Generic Event Notifier

The folder `generic-notifier` provides a generic webhook that can be used to send messages to Sentry Slack channels and Sentry Datadog. Using this webhook is simple.
brian-lou marked this conversation as resolved.

Go to `@/config/secrets.ts` and add an entry to the `EVENT_NOTIFIER_SECRETS` object. This entry should map the name of your service (for example, `example-service`) to an environment variable. [TODO: Fill in how to set the prod env var here]. Make a PR with this change and get it approved and merged.

Once this has been deployed, send a POST request to `https://product-eng-webhooks-vmrqv3f7nq-uw.a.run.app/event-notifier/v1` with a JSON payload matching the `GenericEvent` type defined in `@/types/index.ts`. Example:

```json
{
"source": "example-service", // This must match the mapping string you define in the EVENT_NOTIFIER_SECRETS obj
"timestamp": 0,
"service_name": "official_service_name",
"data": {
"title": "This is an Example Notification",
"message": "Random text here",
"tags": [
"source:example-service", "sentry-region:all", "sentry-user:bob"
],
"misc": {},
"channels": {
"slack": ["C07EH2QGGQ5"],
"jira": ["TEST"]
}
}
}
```

Additionally, you must compute the HMAC-SHA256 hash of the raw payload string using the secret key and attach it to the `Authorization` header, e.g. `Authorization: <Hash here>`.
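
As a rough sketch of the signing step, assuming Node's built-in `crypto` module and a hex-encoded digest (the encoding is an assumption; it is not stated here), a client could compute the hash as shown below. `signPayload` and `verifySignature` are hypothetical helper names, not functions from this repository. The header name is deliberately left out of the code: this README names `Authorization`, while the handler and tests in this diff pass `x-infra-hub-signature`.

```typescript
import { createHmac, timingSafeEqual } from 'node:crypto';

// Hypothetical helper: HMAC-SHA256 over the exact string that is sent as the request body.
// Hex output is an assumption; the encoding the server expects is not specified here.
function signPayload(rawPayload: string, secret: string): string {
  return createHmac('sha256', secret).update(rawPayload).digest('hex');
}

// Hypothetical server-side counterpart: recompute the digest and compare in constant time.
function verifySignature(rawPayload: string, signature: string, secret: string): boolean {
  const expected = Buffer.from(signPayload(rawPayload, secret), 'hex');
  const received = Buffer.from(signature, 'hex');
  // timingSafeEqual throws on length mismatch, so check lengths first.
  return expected.length === received.length && timingSafeEqual(expected, received);
}

// Sign the serialized string, not the object: re-serializing can change the bytes.
const rawBody = JSON.stringify({
  source: 'example-service',
  timestamp: 0,
  data: {
    title: 'This is an Example Notification',
    message: 'Random text here',
    misc: {},
    channels: { slack: ['C07EH2QGGQ5'] },
  },
});
const exampleSignature = signPayload(rawBody, 'examplewebhooksecret');
```

The value in `exampleSignature` is what would be attached to the signature header, and `rawBody` must be sent byte-for-byte as the request body for the hash to match.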

## Adding a webhook to GoCD event emitter

* goto [gocd](deploy.getsentry.net)
95 changes: 95 additions & 0 deletions src/webhooks/generic-notifier/generic-notifier.test.ts
@@ -0,0 +1,95 @@
import testInvalidPayload from '@test/payloads/generic-notifier/testInvalidPayload.json';
import testPayload from '@test/payloads/generic-notifier/testPayload.json';
import { createNotifierRequest } from '@test/utils/createGenericMessageRequest';

import { buildServer } from '@/buildServer';
import { DATADOG_API_INSTANCE } from '@/config';
import { bolt } from '@api/slack';

import { messageSlack } from './generic-notifier';

describe('generic messages webhook', function () {
let fastify;
beforeEach(async function () {
fastify = await buildServer(false);
});

afterEach(function () {
fastify.close();
jest.clearAllMocks();
});

it('correctly inserts generic notifier when stage starts', async function () {
jest.spyOn(bolt.client.chat, 'postMessage').mockImplementation(jest.fn());
jest
.spyOn(DATADOG_API_INSTANCE, 'createEvent')
.mockImplementation(jest.fn());
const response = await createNotifierRequest(fastify, testPayload);

expect(response.statusCode).toBe(200);
});

it('returns 400 for an invalid source', async function () {
const response = await fastify.inject({
method: 'POST',
url: '/event-notifier/v1',
payload: testInvalidPayload,
});
expect(response.statusCode).toBe(400);
});
it('returns 400 for invalid signature', async function () {
const response = await fastify.inject({
method: 'POST',
url: '/event-notifier/v1',
headers: {
'x-infra-hub-signature': 'invalid',
},
payload: testPayload,
});
expect(response.statusCode).toBe(400);
});

it('returns 400 for no signature', async function () {
const response = await fastify.inject({
method: 'POST',
url: '/event-notifier/v1',
payload: testPayload,
});
expect(response.statusCode).toBe(400);
});

describe('messageSlack tests', function () {
afterEach(function () {
jest.clearAllMocks();
});

it('writes to slack', async function () {
const postMessageSpy = jest.spyOn(bolt.client.chat, 'postMessage');
await messageSlack(testPayload);
expect(postMessageSpy).toHaveBeenCalledTimes(1);
const message = postMessageSpy.mock.calls[0][0];
expect(message).toEqual({
channel: '#aaaaaa',
text: 'Random text here',
unfurl_links: false,
});
});
});

it('checks that slack msg is sent', async function () {
const postMessageSpy = jest.spyOn(bolt.client.chat, 'postMessage');
const response = await createNotifierRequest(fastify, testPayload);

expect(postMessageSpy).toHaveBeenCalledTimes(1);

expect(response.statusCode).toBe(200);
});
it('checks that dd msg is sent', async function () {
const ddMessageSpy = jest.spyOn(DATADOG_API_INSTANCE, 'createEvent');
const response = await createNotifierRequest(fastify, testPayload);

expect(ddMessageSpy).toHaveBeenCalledTimes(1);

expect(response.statusCode).toBe(200);
});
});
84 changes: 84 additions & 0 deletions src/webhooks/generic-notifier/generic-notifier.ts
@@ -0,0 +1,84 @@
import { v1 } from '@datadog/datadog-api-client';
import * as Sentry from '@sentry/node';
import { FastifyReply, FastifyRequest } from 'fastify';
import moment from 'moment-timezone';

import { GenericEvent } from '@types';

import { bolt } from '@/api/slack';
import { DATADOG_API_INSTANCE } from '@/config';
import { EVENT_NOTIFIER_SECRETS } from '@/config/secrets';
import { extractAndVerifySignature } from '@/utils/auth/extractAndVerifySignature';

export async function genericEventNotifier(
request: FastifyRequest<{ Body: GenericEvent }>,
reply: FastifyReply
): Promise<void> {
try {
// If the webhook secret is not defined, throw an error
const { body }: { body: GenericEvent } = request;
if (
body.source === undefined ||
EVENT_NOTIFIER_SECRETS[body.source] === undefined
) {
reply.code(400).send('Invalid source or missing secret');
throw new Error('Invalid source or missing secret');
}

const isVerified = await extractAndVerifySignature(
request,
reply,
'x-infra-hub-signature',
EVENT_NOTIFIER_SECRETS[body.source]
);
if (!isVerified) {
// If the signature is not verified, return (since extractAndVerifySignature sends the response)
return;
}

await messageSlack(body);
await sendEventToDatadog(body, moment().unix());


Where are the other channels?

Contributor Author

The only other one was Jira, but I never got around to adding it: Alex was (I think?) working on the env setup, so I was waiting for him to finish that first.

reply.code(200).send('OK');
return;
} catch (err) {
console.error(err);
Sentry.captureException(err);
reply.code(500).send();
return;
}
}

export async function sendEventToDatadog(
message: GenericEvent,
timestamp: number
) {
if (message.data.channels.datadog) {


I'd like a pattern where you decide which methods to call in the genericEventNotifier function. That is the orchestrating function, which parses the message and triggers the relevant channels. Judging by their names, sendEventToDatadog and messageSlack should always send the message, not decide when not to send.
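
One way to read this suggestion, sketched under assumptions: the routing decision lives in a single place, and the per-channel senders send unconditionally. `NotifierEvent`, `Delivery`, `Senders`, `planDeliveries`, and `notify` are hypothetical names, and the event type is a simplified stand-in for `GenericEvent`, not the PR's actual code.

```typescript
// Hypothetical, simplified event shape; the PR's GenericEvent has more fields.
type NotifierEvent = {
  title: string;
  message: string;
  channels: { slack?: string[]; datadog?: string[] };
};

type Delivery = { kind: 'slack'; channel: string } | { kind: 'datadog' };

// Pure routing step: decides what should be sent, but sends nothing itself.
function planDeliveries(event: NotifierEvent): Delivery[] {
  const deliveries: Delivery[] = [];
  for (const channel of event.channels.slack ?? []) {
    deliveries.push({ kind: 'slack', channel });
  }
  if (event.channels.datadog !== undefined) {
    deliveries.push({ kind: 'datadog' });
  }
  return deliveries;
}

// Senders are injected and always send when called; they contain no "should I send?" checks.
type Senders = {
  slack: (event: NotifierEvent, channel: string) => Promise<void>;
  datadog: (event: NotifierEvent) => Promise<void>;
};

// Orchestrator: the only place that chooses which senders run.
async function notify(event: NotifierEvent, senders: Senders): Promise<void> {
  for (const delivery of planDeliveries(event)) {
    if (delivery.kind === 'slack') {
      await senders.slack(event, delivery.channel);
    } else {
      await senders.datadog(event);
    }
  }
}

const slackOnlyPlan = planDeliveries({
  title: 'This is an Example Notification',
  message: 'Random text here',
  channels: { slack: ['C07EH2QGGQ5'] },
});
```

Keeping the plan as a plain function also makes the routing testable without mocking Slack or Datadog clients.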

const params: v1.EventCreateRequest = {
title: message.data.title,
text: message.data.message,
alertType: message.data.misc.alertType,
dateHappened: timestamp,
tags: message.data.tags,
};
await DATADOG_API_INSTANCE.createEvent({ body: params });
}
}

export async function messageSlack(message: GenericEvent) {
if (message.data.channels.slack) {
for (const channel of message.data.channels.slack) {
const text = message.data.message;
try {
await bolt.client.chat.postMessage({
channel: channel,
blocks: message.data.misc.blocks,
text: text,
unfurl_links: false,
});
} catch (err) {
Sentry.setContext('msg:', { text });
Sentry.captureException(err);

Check warning on line 80 in src/webhooks/generic-notifier/generic-notifier.ts

Codecov / codecov/patch

src/webhooks/generic-notifier/generic-notifier.ts#L79-L80

Added lines #L79 - L80 were not covered by tests
}
}
}
}
4 changes: 4 additions & 0 deletions src/webhooks/index.ts
@@ -6,6 +6,7 @@ import { FastifyReply, FastifyRequest } from 'fastify';
import { Fastify } from '@/types';

import { bootstrapWebhook } from './bootstrap-dev-env/bootstrap-dev-env';
import { genericEventNotifier } from './generic-notifier/generic-notifier';
import { gocdWebhook } from './gocd/gocd';
import { kafkactlWebhook } from './kafka-control-plane/kafka-control-plane';
import { sentryOptionsWebhook } from './sentry-options/sentry-options';
@@ -54,6 +55,9 @@ export async function routeHandlers(server: Fastify, _options): Promise<void> {
server.post('/metrics/webpack/webhook', (request, reply) =>
handleRoute(webpackWebhook, request, reply, 'webpack')
);
server.post('/event-notifier/v1', (request, reply) =>
handleRoute(genericEventNotifier, request, reply, 'generic-notifier')
);

// Default handler for invalid routes
server.all('/metrics/*/webhook', async (request, reply) => {
5 changes: 5 additions & 0 deletions test/payloads/generic-notifier/testAdminPayload.json
@@ -0,0 +1,5 @@
{
"source": "admin",
"title": "this is a title",
"body": "this is a text body"
}
15 changes: 15 additions & 0 deletions test/payloads/generic-notifier/testBadPayload.json
@@ -0,0 +1,15 @@
{
"service_name": "official_service_name",
"data": {
"message": "Random text here",
"tags": [
"source:example-service", "sentry-region:all", "sentry-user:bob"
],
"misc": {},
"channels": {
"slack": ["#C07GZR8LA82"],
"datadog": ["example-proj-id"],
"jira": ["INC"]
}
}
}