feat: add BotName, FileCount, HasMedia fields to message output#170

Merged
korotovsky merged 4 commits intokorotovsky:masterfrom
Flare576:feat-message-metadata
Jan 29, 2026


@Flare576
Contributor

@Flare576 Flare576 commented Jan 29, 2026

Summary

Adds metadata fields to message CSV output to help identify media-containing messages, and adds a new files_get tool for downloading file content. Addresses #88 and maintainer feedback.


Part 1: Message Metadata Fields

New Fields in Message Output:

| Field | Type | Source | Description |
| --- | --- | --- | --- |
| BotName | string | msg.BotProfile.Name | Bot name for bot messages (e.g., "giphy") |
| FileCount | int | len(msg.Files) | Count of attached files |
| HasMedia | bool | Files or image blocks | True if message contains media |

Example Output:

MsgID,UserID,...,BotName,FileCount,HasMedia,Cursor
1769722377.299629,U8T61HX4M,...,giphy,0,true,
1769722390.208069,UBHCH6943,...,,0,false,

Notes:

  • For SearchMessage results, only HasMedia is populated (via blocks check) since the search API doesn't return BotProfile or Files data
  • Uses existing slack.MBTImage constant to detect image blocks
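A minimal sketch of how the three fields could be derived, for readers skimming the thread. The `Message` and `Block` types below are hypothetical stand-ins rather than the real slack-go structs, and `blockTypeImage` simply mirrors the string value of `slack.MBTImage`; the actual implementation in this PR may differ.

```go
package main

import "fmt"

// Block and Message are hypothetical, trimmed-down stand-ins for the
// Slack message types; the real code operates on slack-go structs.
type Block struct {
	Type string // e.g. "image", "section"
}

type Message struct {
	BotName string
	Files   []string // attached file IDs
	Blocks  []Block
}

// blockTypeImage mirrors the value of slack.MBTImage.
const blockTypeImage = "image"

// hasMedia reports whether a message carries files or at least one image block.
func hasMedia(m Message) bool {
	if len(m.Files) > 0 {
		return true
	}
	for _, b := range m.Blocks {
		if b.Type == blockTypeImage {
			return true
		}
	}
	return false
}

func main() {
	giphy := Message{BotName: "giphy", Blocks: []Block{{Type: "image"}}}
	plain := Message{}
	// Prints BotName, FileCount, HasMedia for each message.
	fmt.Printf("%q %d %t\n", giphy.BotName, len(giphy.Files), hasMedia(giphy))
	fmt.Printf("%q %d %t\n", plain.BotName, len(plain.Files), hasMedia(plain))
}
```

This matches the giphy row in the example output: no files, but an image block, so FileCount is 0 while HasMedia is true.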

Part 2: files_get Tool (New)

Downloads file content by file ID, allowing LLMs to access file attachments.

Guardrail: Requires SLACK_MCP_FILES_TOOL=true environment variable.
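As an illustration only, an opt-in gate of this kind could look like the sketch below. The helper name `filesToolEnabled` and the exact comparison against the string "true" are assumptions; the PR may accept other truthy values.

```go
package main

import (
	"fmt"
	"os"
)

// filesToolEnabled is a hypothetical helper: it reports whether the opt-in
// variable is set to exactly "true". The lookup function is injected so the
// check can be exercised without touching the process environment.
func filesToolEnabled(getenv func(string) string) bool {
	return getenv("SLACK_MCP_FILES_TOOL") == "true"
}

func main() {
	if !filesToolEnabled(os.Getenv) {
		fmt.Println("files_get disabled; set SLACK_MCP_FILES_TOOL=true to enable it")
		return
	}
	fmt.Println("files_get enabled")
}
```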

Response Format:

{
  "file_id": "F0ABV8E0KGC",
  "filename": "example.txt",
  "mimetype": "text/plain",
  "size": 94,
  "encoding": "none",
  "content": "File content here..."
}

Design Decisions:

| Aspect | Choice | Reason |
| --- | --- | --- |
| Size limit | 5MB | Keep responses reasonable for LLM context windows |
| Text handling | Plain text (encoding: "none") | Directly usable without decoding |
| Binary handling | Base64 (encoding: "base64") | Standard way to embed binary in JSON |
| Text detection | Mimetype check | text/*, application/json, application/xml, etc. |

Text mimetypes treated as plain text:

  • text/*
  • application/json
  • application/xml
  • application/javascript
  • application/x-yaml
  • application/toml

Fixes #88

Adds metadata fields to help identify media-containing messages:
- BotName: populated from msg.BotProfile.Name for bot messages (e.g., 'giphy')
- FileCount: count of attached files
- HasMedia: true if message has files OR image blocks

This provides visibility into message types that was previously stripped
from the Slack API response, addressing user requests in issue korotovsky#88.

For SearchMessage results, only HasMedia is populated (via blocks) since
the search API doesn't return BotProfile or Files data.
@korotovsky
Owner

The original request #88 also asked for a tool to fetch the attachment itself. I guess plaintext attachments should be returned as text and blobs should be encoded as base64.

Adds ability to download file content by file ID, addressing maintainer
request on PR korotovsky#170.

- New files_get tool gated by SLACK_MCP_FILES_TOOL env var
- Text files (text/*, application/json, etc.) returned as plain text
- Binary files returned as base64-encoded content
- 5MB size limit to keep responses reasonable for LLM context
- Returns structured JSON: file_id, filename, mimetype, size, encoding, content
@korotovsky
Owner

How will the agent or MCP client get the file_id in the first place? I'm not sure we collect all available file IDs per message. Could you share a screenshot of how you used the files_get tool in combination with the conversation tool?

Enables agents to discover file IDs from conversation history,
completing the workflow for files_get tool usage.
@Flare576
Contributor Author

Good catch! I've added a FileIDs field (comma-separated) to the message output so agents can discover file IDs directly from conversation history.

Complete Workflow Example

Step 1: Get conversation history

conversations_history(channel_id="DBJCPJLF8", limit="20")

Response (CSV):

MsgID,UserID,...,FileCount,FileIDs,HasMedia,Cursor
1739821275.613919,UBHCH6943,...,1,F08DQ3DBAM8,true,
1727712637.419529,UBHCH6943,...,1,F07QCNL4XFS,true,
1769715839.885789,UBHCH6943,...,0,,false,

Step 2: Agent sees FileCount > 0 and extracts file ID from FileIDs column

Step 3: Fetch file content

files_get(file_id="F08DQ3DBAM8")

Response:

{
  "file_id": "F08DQ3DBAM8",
  "filename": "20250217_134047.jpg",
  "mimetype": "image/jpeg",
  "size": 825923,
  "encoding": "base64",
  "content": "/9j/4AAQSkZJRgABAQAA..."
}

For text files, encoding is "none" and content is plain text.
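For anyone consuming this from code rather than through an LLM, here is a rough client-side sketch of steps 2 and 3. The helper names are hypothetical, the sample CSV omits the columns elided above, and the column name `FileIDs` is the one used at this point in the thread.

```go
package main

import (
	"encoding/base64"
	"encoding/csv"
	"encoding/json"
	"fmt"
	"strings"
)

// fileIDsFromCSV collects every ID found in the FileIDs column.
func fileIDsFromCSV(data string) ([]string, error) {
	rows, err := csv.NewReader(strings.NewReader(data)).ReadAll()
	if err != nil {
		return nil, err
	}
	if len(rows) == 0 {
		return nil, nil
	}
	col := -1
	for i, name := range rows[0] {
		if name == "FileIDs" {
			col = i
		}
	}
	if col < 0 {
		return nil, fmt.Errorf("no FileIDs column in header")
	}
	var ids []string
	for _, row := range rows[1:] {
		if row[col] == "" {
			continue // message without attachments
		}
		ids = append(ids, strings.Split(row[col], ",")...)
	}
	return ids, nil
}

// decodeContent returns the raw bytes of a files_get response,
// undoing base64 when the encoding field says so.
func decodeContent(raw string) ([]byte, error) {
	var r struct {
		Encoding string `json:"encoding"`
		Content  string `json:"content"`
	}
	if err := json.Unmarshal([]byte(raw), &r); err != nil {
		return nil, err
	}
	if r.Encoding == "base64" {
		return base64.StdEncoding.DecodeString(r.Content)
	}
	return []byte(r.Content), nil
}

func main() {
	history := "MsgID,FileCount,FileIDs,HasMedia,Cursor\n" +
		"1739821275.613919,1,F08DQ3DBAM8,true,\n" +
		"1769715839.885789,0,,false,\n"
	ids, err := fileIDsFromCSV(history)
	fmt.Println(ids, err)

	body, err := decodeContent(`{"encoding":"none","content":"File content here..."}`)
	fmt.Println(string(body), err)
}
```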

The new FileIDs field is included in the latest commit.

@korotovsky
Owner

I'm not sure that separating FileIDs with commas is a good idea, since commas already delimit the CSV columns; I guess an LLM might hallucinate then. Maybe ;? Or another non-CSV delimiter, e.g. |.

@Flare576
Contributor Author

Just tested with a 4-file message - gocsv handles the quoting automatically:

MsgID,...,FileCount,FileIDs,HasMedia,Cursor
1769728398.924449,...,4,"F0ABEBYP33R,F0ABZDCPAQL,F0ABEBZE495,F0ACQ4FTHCY",true,
1739821275.613919,...,1,F08DQ3DBAM8,true,
1769715839.885789,...,0,,false,

Single file → no quotes needed. Multiple files → properly quoted. CSV parsers should handle both cases correctly.
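The quoting behaviour can be reproduced with the standard library alone. The sketch below uses `encoding/csv` directly rather than gocsv, so it only illustrates the CSV quoting rule, not the server's actual output path; `writeRows` is a hypothetical helper.

```go
package main

import (
	"encoding/csv"
	"fmt"
	"strings"
)

// writeRows serializes rows with the stdlib CSV writer, which quotes a field
// only when it needs it (for example, when it contains the delimiter).
func writeRows(rows [][]string) (string, error) {
	var sb strings.Builder
	w := csv.NewWriter(&sb)
	if err := w.WriteAll(rows); err != nil { // WriteAll also flushes
		return "", err
	}
	return sb.String(), nil
}

func main() {
	out, err := writeRows([][]string{
		{"MsgID", "FileCount", "FileIDs", "HasMedia", "Cursor"},
		{"1769728398.924449", "2", "F0ABEBYP33R,F0ABZDCPAQL", "true", ""},
		{"1739821275.613919", "1", "F08DQ3DBAM8", "true", ""},
	})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Print(out)
}
```

Only the multi-ID field comes out wrapped in double quotes; the single-ID and empty fields are written bare.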

@korotovsky
Owner

And let's rename the tool to attachment_get_data. That is consistent with Slack's naming, since they call these attachments, not files. It should also improve LLM performance on most models, since they were likely trained on the word "attachment".

…gy consistency

- Tool: files_get → attachment_get_data
- Field: FileIDs → AttachmentIDs
- Env var: SLACK_MCP_FILES_TOOL → SLACK_MCP_ATTACHMENT_TOOL

Per maintainer feedback - aligns with Slack's 'attachment' terminology
and may improve LLM performance due to training data prevalence.
@Flare576
Contributor Author

Done! Renamed for Slack terminology consistency:

| Old | New |
| --- | --- |
| files_get | attachment_get_data |
| FileIDs | AttachmentIDs |
| SLACK_MCP_FILES_TOOL | SLACK_MCP_ATTACHMENT_TOOL |

(FileCount kept as-is since it's internal/technical, not user-facing terminology)

@korotovsky
Owner

@Flare576 yeah, I know that the Go library quotes automatically in this case, as the CSV standard requires. I'm just trying to keep the CSV output as simple as possible so that low-grade models can understand it; for them, a different number of commas in the header and in the following rows might be a little confusing. This is my only concern.

WDYT?

@Flare576
Contributor Author

I totally see your point, but in this case you'd be introducing a non-standard delimiter (|, ;, etc.) into a well-known standard format (CSVs with quotes are pretty common) to "help" low-powered LLMs, and I think that would add another point of confusion to an already confusing system.

I think the problem is CSV, but I don't think this is the venue to solve it 😂

@korotovsky
Owner

OK, let's keep auto-quoting. By the way, CSV saves everyone a lot of tokens compared with native JSON.

@korotovsky
Owner

@Flare576 I appreciate the PRs on reactions and this one too. If you have spare time today or tomorrow, I'd appreciate it if you could cherry-pick the changes from PR #146, rebase them, and test the remaining test base. Then we could make a new release.

@korotovsky korotovsky merged commit 14b4cdc into korotovsky:master Jan 29, 2026
3 checks passed
aron-muon pushed a commit to aron-muon/slack-mcp-server that referenced this pull request Jan 30, 2026
Development

Successfully merging this pull request may close these issues.

Support reading attachments (e.g., images) in Slack MCP

2 participants