@derodero24 derodero24 commented Feb 8, 2026

Summary

  • Add FilesToText() to extract email metadata (From, CC, Subject) from files[] when msg.Text is empty
  • Emails forwarded to Slack channels arrive with their content in files[] (filetype: "email") and an empty text field, so these messages currently produce completely empty text output. This PR fills that gap.

Ref #191 (partial fix — metadata only; email body requires upstream dependency update)
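
For reviewers who want the gist without opening the diff, here is a minimal, self-contained sketch of the extraction idea. It is not the code in this PR: the `File` and `EmailAddress` types are hypothetical stand-ins for the Slack file fields, and the output format (separators, `Name <address>` rendering) is an assumption chosen for readability. Any later sanitizing by ProcessText is not modeled.

```go
package main

import (
	"fmt"
	"strings"
)

// EmailAddress and File are hypothetical stand-ins for the email-related
// fields of a Slack file object; the real project types may differ.
type EmailAddress struct {
	Name    string
	Address string
}

type File struct {
	Filetype string
	Subject  string
	From     []EmailAddress
	Cc       []EmailAddress
}

// formatAddresses renders each entry as "Name <address>", or as whichever
// of the two parts is present, and joins the entries with "; ".
func formatAddresses(addrs []EmailAddress) string {
	parts := make([]string, 0, len(addrs))
	for _, a := range addrs {
		switch {
		case a.Name != "" && a.Address != "":
			parts = append(parts, a.Name+" <"+a.Address+">")
		case a.Address != "":
			parts = append(parts, a.Address)
		case a.Name != "":
			parts = append(parts, a.Name)
		}
	}
	return strings.Join(parts, "; ")
}

// FilesToText summarizes every file with filetype "email" as
// "From: ..., CC: ..., Subject: ...", skipping missing fields.
// It returns "" when no email file carries any metadata.
func FilesToText(files []File) string {
	var out []string
	for _, f := range files {
		if f.Filetype != "email" {
			continue // only forwarded emails are handled here
		}
		var fields []string
		if from := formatAddresses(f.From); from != "" {
			fields = append(fields, "From: "+from)
		}
		if cc := formatAddresses(f.Cc); cc != "" {
			fields = append(fields, "CC: "+cc)
		}
		if f.Subject != "" {
			fields = append(fields, "Subject: "+f.Subject)
		}
		if len(fields) > 0 {
			out = append(out, strings.Join(fields, ", "))
		}
	}
	return strings.Join(out, " | ")
}

func main() {
	files := []File{{
		Filetype: "email",
		Subject:  "Meeting Tomorrow",
		From:     []EmailAddress{{Name: "John Doe", Address: "john@example.com"}},
		Cc:       []EmailAddress{{Address: "team@example.com"}},
	}}
	fmt.Println(FilesToText(files))
}
```

Returning an empty string for non-email files keeps the function usable as a pure fallback: callers can test the result against "" and move on.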

Before / After

Slack API response (forwarded email)

{
  "text": "",
  "files": [{
    "filetype": "email",
    "subject": "Meeting Tomorrow",
    "from": [{"name": "John Doe", "address": "john@example.com"}],
    "cc": [{"address": "team@example.com"}]
  }]
}

conversations_history CSV output

Before:

MsgID,UserID,UserName,RealName,Channel,ThreadTs,Text,Time,BotName,Cursor
1770523338.574369,USLACKBOT,Email,Email,C001,,,2026-02-08T04:02:18Z,Email,

After:

MsgID,UserID,UserName,RealName,Channel,ThreadTs,Text,Time,BotName,Cursor
1770523338.574369,USLACKBOT,Email,Email,C001,,Email, From: John Doe - john at example.com, CC: team at example.com, Subject: Meeting Tomorrow,2026-02-08T04:02:18Z,Email,

Design decisions

Conflict with #190

This PR conflicts with #190 (Block Kit text extraction) at conversations.go L702, since both modify the same fallback logic. The resolution is to chain the fallbacks:

msgText := msg.Text
if msgText == "" {
    msgText = text.BlocksToText(msg.Blocks)   // #190
}
if msgText == "" {
    msgText = text.FilesToText(msg.Files)      // this PR
}
msgText += text.AttachmentsTo2CSV(msgText, msg.Attachments)

Whichever PR merges first, the other can be rebased with this 3-line addition.
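
If more fallbacks show up later, the chain above could also be written with a small helper. The sketch below is only an illustration of that alternative: `firstNonEmpty` is a hypothetical name that exists in neither PR, and the two closures are stubs standing in for text.BlocksToText and text.FilesToText.

```go
package main

import "fmt"

// firstNonEmpty returns primary when it is non-empty. Otherwise it calls
// the fallbacks in order and returns the first non-empty result, or ""
// when every fallback comes back empty. Fallbacks are evaluated lazily,
// so later ones never run once an earlier source produced text.
func firstNonEmpty(primary string, fallbacks ...func() string) string {
	if primary != "" {
		return primary
	}
	for _, fallback := range fallbacks {
		if s := fallback(); s != "" {
			return s
		}
	}
	return ""
}

func main() {
	// Stubs: pretend the message has no blocks but does carry an email file.
	blocksToText := func() string { return "" }
	filesToText := func() string { return "From: John Doe, Subject: Meeting Tomorrow" }

	msgText := firstNonEmpty("", blocksToText, filesToText)
	fmt.Println(msgText)
}
```

The precedence is the same as in the if-chain: msg.Text first, then blocks, then files, with the attachments step still appended afterwards. The plain if-chain is what this PR proposes; the helper is just one way to keep the call site flat.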

Test plan

  • 7 unit tests for FilesToText (filtering, all field combinations, edge cases)
  • 2 pipeline tests verifying output survives ProcessText/filterSpecialChars
  • Verified against real Slack email messages (with/without CC, with attachments, with inline images)

When emails are forwarded to Slack channels, message content is stored in
files[] with filetype "email" rather than in text or blocks. This adds
FilesToText() to extract From, CC, and Subject metadata as a fallback
when msg.Text is empty, so these messages no longer appear as blank rows
in conversations_history output.

Closes korotovsky#191