feat: extract text content from message blocks in conversations_history #190
base: master
Messages using Slack's Block Kit format (emails forwarded via Slack, bot notifications from Grafana/Datadog, app messages) return empty text when retrieved via conversations_history. This extracts text from block structures so these messages are no longer empty. Closes korotovsky#186
Could you please provide a couple of screenshots/examples that illustrate before/after results?
@korotovsky Sure! Here are before/after examples based on real Slack messages.
Example 1: CloudWatch Alarm (attachment with Block Kit only)
This is a real message from an AWS monitoring bot. The message has
Attachment JSON (simplified):
{
"text": "",
"attachments": [{
"title": "",
"text": "",
"blocks": [
{"type": "section", "text": {"type": "mrkdwn", "text": "*<URL|:rotating_light: CloudWatch Alarm | MyAlarm | ap-northeast-1>*"}},
{"type": "section", "text": {"type": "mrkdwn", "text": "Threshold Crossed: 1 out of the last 1 datapoints [1.0] was greater than or equal to the threshold (1.0)"}},
{"type": "actions", "elements": [...]},
{"type": "section", "fields": [{"type": "mrkdwn", "text": "*Namespace*\nMyNamespace"}, {"type": "mrkdwn", "text": "*Metric*\nMyMetric"}]},
{"type": "section", "fields": [{"type": "mrkdwn", "text": "*Alarm State*\nALARM"}]},
{"type": "image", "...": "..."},
{"type": "context", "elements": [{"type": "mrkdwn", "text": "<!date^1738900000^{date_short_pretty} at {time}|2025-02-07>"}]}
]
}]
}
Before:
After: Block Kit content is extracted from
Example 2: Datadog Alert (legacy attachment format, no change)
A Datadog alert uses legacy attachment fields (
{
"text": "",
"attachments": [{
"title": "Triggered: ServerError - /ecs/my-service",
"text": "Host: /ecs/my-service\nLog status: error\nMore than 1 log event matched..."
}]
}
Before and After: same output (already extracted via legacy fields):
Example 3: Top-level Block Kit message (bot with
@derodero24 Thank you. What does the serialized CSV message look like in such cases?
@korotovsky Here's the actual serialized CSV output generated by running the code against representative Block Kit messages:
Example 1: CloudWatch Alarm (attachment with blocks only)
Before:
MsgID,UserID,UserName,RealName,Channel,ThreadTs,Text,Time,BotName,Cursor
1770433087.532289,U001,aws-bot,Amazon Q Developer,C001,,,2026-02-07T02:58:02Z,Amazon Q Developer,
After:
MsgID,UserID,UserName,RealName,Channel,ThreadTs,Text,Time,BotName,Cursor
1770433087.532289,U001,aws-bot,Amazon Q Developer,C001,,"Blocks: https://console.aws.amazon.com/cloudwatch - :rotating_light: CloudWatch Alarm MyAlarm ap-northeast-1 Account: 123456789012, Threshold Crossed: 1 out of the last 1 datapoints 1.0 07/02/26 02:57:00 was greater than or equal to the threshold 1.0 Namespace MyNamespace Metric MyMetric Alarm State ALARM",2026-02-07T02:58:02Z,Amazon Q Developer,
Example 2: Bot message with top-level blocks
Before:
MsgID,UserID,UserName,RealName,Channel,ThreadTs,Text,Time,BotName,Cursor
1770400000.000001,U002,deploy-bot,Deploy Bot,C002,,,2026-02-07T10:00:00Z,Deploy Bot,
After:
MsgID,UserID,UserName,RealName,Channel,ThreadTs,Text,Time,BotName,Cursor
1770400000.000001,U002,deploy-bot,Deploy Bot,C002,,Deploy Complete Service my-api deployed to production Deployed by CI/CD pipeline,2026-02-07T10:00:00Z,Deploy Bot,
Example 3: Legacy attachment (no blocks), unchanged
MsgID,UserID,UserName,RealName,Channel,ThreadTs,Text,Time,BotName,Cursor
1770433155.658769,U003,datadog,Datadog,C001,,Title: Triggered: ServerError Text: Host: /ecs/my-service Log status: error More than 1 log event matched,2026-02-07T02:59:15Z,Datadog,
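A note on the quoting in the rows above: the extracted text in Example 1 is wrapped in double quotes because it contains a comma, while the Example 2 text is not. The sketch below reproduces that behavior with Go's standard `encoding/csv` writer. It is only an illustration: the `toCSV` helper and the three-column layout are hypothetical, and it assumes the project's serializer follows the same quoting rules as `encoding/csv`.

```go
package main

import (
	"bytes"
	"encoding/csv"
	"fmt"
)

// toCSV is a hypothetical helper that serializes records with the
// standard library CSV writer and returns the result as a string.
func toCSV(records [][]string) (string, error) {
	var buf bytes.Buffer
	w := csv.NewWriter(&buf)
	// WriteAll writes every record, flushes, and reports any write error.
	if err := w.WriteAll(records); err != nil {
		return "", err
	}
	return buf.String(), nil
}

func main() {
	// Trimmed-down columns; the real output has ten columns.
	records := [][]string{
		{"MsgID", "Text", "BotName"},
		// Contains a comma, so the writer quotes the field.
		{"1770433087.532289", "Blocks: Account: 123456789012, Threshold Crossed", "Amazon Q Developer"},
		// No comma, quote, or newline, so the field is written bare.
		{"1770400000.000001", "Deploy Complete Service my-api", "Deploy Bot"},
	}
	out, err := toCSV(records)
	if err != nil {
		panic(err)
	}
	fmt.Print(out)
}
```

Because block text often contains commas, consumers of the CSV output should parse it with a real CSV reader rather than splitting on commas.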
Summary
Extract text from Slack Block Kit structures in messages retrieved via `conversations_history` and `conversations_replies`, so that block-only messages (emails, bot notifications, app messages) are no longer returned empty.
Closes #186
Problem
Messages using Block Kit format (e.g., Slack Email integration, Grafana/Datadog alerts) have an empty `Text` field. The existing `AttachmentsTo2CSV` extracts text from attachment fields but ignores:
- `blocks` array
- `attachments[].blocks` array
Solution
- `BlocksToText` function in `pkg/text/text_processor.go` extracts text from common block types:
  - `header.text`
  - `section.text` and `section.fields`
  - `rich_text` (sections, lists, quotes, preformatted)
  - `context.elements[].text`
- When `msg.Text` is empty, fall back to `BlocksToText(msg.Blocks)`. This avoids duplication, since Slack typically populates `text` as a plaintext fallback of blocks.
- `AttachmentToText` now also extracts `att.Blocks`
Changes
- `pkg/text/text_processor.go`: Add `BlocksToText` and rich text helper functions; call `BlocksToText` in `AttachmentToText`
- `pkg/handler/conversations.go`: Use block text as fallback in `convertMessagesFromHistory` and `convertMessagesFromSearch`
- `pkg/text/text_processor_test.go`: Add 15 test cases covering all supported block types, edge cases, and attachment integration
Testing
`go test -run TestUnit ./...`
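To make the approach described under Solution concrete, here is a minimal, self-contained sketch of block text extraction with the empty-text fallback. It is not the PR's implementation: it decodes raw Block Kit JSON with `encoding/json` instead of using the project's Slack client types, the names `blocksToText`, `messageText`, `block`, and `textObject` are hypothetical, `rich_text` handling is omitted for brevity, and mrkdwn markup is left in place rather than stripped.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// textObject mirrors a Block Kit text object: {"type": "mrkdwn", "text": "..."}.
type textObject struct {
	Type string `json:"type"`
	Text string `json:"text"`
}

// block is a hypothetical, trimmed-down view of a Block Kit block.
// Text and Elements stay raw because their shape depends on the block type.
type block struct {
	Type     string            `json:"type"`
	Text     json.RawMessage   `json:"text"`
	Fields   []textObject      `json:"fields"`
	Elements []json.RawMessage `json:"elements"`
}

// decodeText returns the "text" string of a text object, or "" if raw is
// missing or is not a text object.
func decodeText(raw json.RawMessage) string {
	var t textObject
	if len(raw) == 0 || json.Unmarshal(raw, &t) != nil {
		return ""
	}
	return t.Text
}

// blocksToText collects text from header, section, and context blocks and
// joins the pieces with single spaces. Other block types are skipped.
func blocksToText(rawBlocks []byte) (string, error) {
	var blocks []block
	if err := json.Unmarshal(rawBlocks, &blocks); err != nil {
		return "", err
	}
	var parts []string
	add := func(s string) {
		// Collapse newlines and repeated spaces so the result fits on one line.
		if s = strings.Join(strings.Fields(s), " "); s != "" {
			parts = append(parts, s)
		}
	}
	for _, b := range blocks {
		switch b.Type {
		case "header", "section":
			add(decodeText(b.Text))
			for _, f := range b.Fields {
				add(f.Text)
			}
		case "context":
			for _, el := range b.Elements {
				add(decodeText(el)) // image elements yield "" and are skipped
			}
		}
	}
	return strings.Join(parts, " "), nil
}

// messageText keeps a non-empty text as is and only falls back to block
// text when text is empty, which avoids duplicating content.
func messageText(text string, rawBlocks []byte) string {
	if text != "" || len(rawBlocks) == 0 {
		return text
	}
	out, err := blocksToText(rawBlocks)
	if err != nil {
		return ""
	}
	return out
}

// sample is a shortened, made-up block list modeled on the CloudWatch example.
const sample = `[
  {"type": "section", "text": {"type": "mrkdwn", "text": "Threshold Crossed"}},
  {"type": "actions", "elements": [{"type": "button", "text": {"type": "plain_text", "text": "Open"}}]},
  {"type": "section", "fields": [{"type": "mrkdwn", "text": "*Alarm State*\nALARM"}]},
  {"type": "context", "elements": [{"type": "mrkdwn", "text": "2025-02-07"}]}
]`

func main() {
	fmt.Println(messageText("", []byte(sample)))
	// Prints: Threshold Crossed *Alarm State* ALARM 2025-02-07
}
```

The `actions` block is skipped on purpose: its elements are buttons whose `text` is a nested object, which is why the sketch keeps `elements` raw and decodes them only for `context` blocks.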