Chat
Prerequisites: Before reading this document, it’s recommended to understand the sessions and tasks section in Core Concepts.
Monstrum provides a streaming chat interface for real-time interaction with Bots. Bots can call tools, execute operations, and display results to you during conversations.
Overview
Monstrum’s chat system supports two entry points:
- Web Chat: Chat directly with a Bot in the Overview tab of the Bot detail page
- IM Channels: Chat with a Bot through external IM platforms like Slack, Feishu, Telegram, etc. (requires Gateway configuration)
Both methods use the same underlying session mechanism — only the message entry and exit points differ.
Web Chat
Starting a Conversation
- Go to Bot Management → Click a Bot → Enter the Bot detail page
- Type a message in the chat box at the bottom of the Overview tab
- Press Enter or click the send button
The Bot uses the configured LLM for reasoning and replies. Conversations support streaming output — you can see the Bot generating its response token by token.
Tool Call Display
When the Bot calls a tool during conversation, the interface displays:
- Tool Name: Shows which tool was called (e.g., web_search, ssh_execute)
- Execution Result: The result content returned by the tool
- You can expand to view the detailed parameters and full results of the tool call
In a single conversation, the Bot may perform multiple rounds of tool calls. For example, it might first search for information, then access a webpage based on search results, and finally summarize the content in its reply.
New Conversation
Click the New Conversation button to start a fresh session. A new session clears the current conversation context, and the Bot starts from scratch.
End Session
Click End Session to manually terminate the current session. When ending a session:
- If the Bot has automatic memory extraction enabled, it will extract important information from the conversation and save it as memories
- The session history is saved (if session persistence is enabled)
Session Lifecycle
Creation
When the first message arrives, the platform creates a new session. During creation:
- The Bot’s memories are loaded (global memories + current scope memories)
- Persisted conversation history is loaded (if available)
- The system prompt is constructed
Active
Each message is processed in FIFO order. When processing a message, the Bot:
- Constructs the system prompt (including resource summaries, memories, skills, etc.)
- Resolves the available tool list (Pre-LLM permission filtering)
- Calls the LLM for reasoning
- If the LLM returns a tool call → permission check → execute tool → continue reasoning
- Repeats until the LLM returns a final text response
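The processing loop above can be sketched in a few lines. This is a hypothetical illustration, not Monstrum's actual implementation: `call_llm` and `run_tool` are stand-in stubs, and the message shapes are assumptions.

```python
def call_llm(messages):
    # Stub LLM: requests one tool call, then produces a final answer.
    if not any(m["role"] == "tool" for m in messages):
        return {"tool_call": {"name": "web_search", "args": {"q": "monstrum"}}}
    return {"text": "Here is what I found."}

def run_tool(name, args, allowed_tools):
    # Permission check happens again at execution time (post-LLM).
    if name not in allowed_tools:
        raise PermissionError(f"tool {name!r} not permitted")
    return f"results for {args['q']}"

def process_message(user_text, allowed_tools, max_iterations=5):
    messages = [{"role": "system", "content": "..."},
                {"role": "user", "content": user_text}]
    for _ in range(max_iterations):
        reply = call_llm(messages)
        if "tool_call" in reply:
            call = reply["tool_call"]
            result = run_tool(call["name"], call["args"], allowed_tools)
            messages.append({"role": "tool", "content": result})
            continue  # feed the tool result back and keep reasoning
        return reply["text"]  # final text response ends the loop
    raise RuntimeError("Maximum execution rounds reached")
```

The `max_iterations` guard corresponds to the "Max iterations reached" error described later in this page.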
Expiration
Sessions automatically expire after 30 minutes of inactivity. On expiration:
- Automatic memory extraction is triggered (if enabled)
- Session history is saved to the database
- Session resources are released
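A minimal sketch of the idle-expiry check, assuming a 30-minute timeout as stated above (the `Session` class and hook ordering are illustrative, not Monstrum's real API):

```python
IDLE_TIMEOUT = 30 * 60  # 30 minutes of inactivity, per the docs

class Session:
    def __init__(self, now):
        self.last_activity = now
        self.expired = False

    def touch(self, now):
        # Any incoming message resets the idle clock.
        self.last_activity = now

    def check_expiry(self, now):
        if not self.expired and now - self.last_activity >= IDLE_TIMEOUT:
            self.expired = True
            # On expiry: extract memories (if enabled), save history,
            # then release session resources.
        return self.expired
```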
Session Persistence
Channel-level conversations support persistence. The next time a session is created, the previous conversation history is loaded.
When conversations accumulate more than 80 messages, the platform automatically compresses them: the LLM summarizes old messages into a digest, keeping the most recent 50 messages. The digest is injected at the beginning of the conversation as a “previously on” summary.
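The compression rule above (digest everything older than the most recent 50 messages once the history exceeds 80) can be sketched as follows; `summarize` stands in for the LLM summarization call, and the thresholds are taken from this page:

```python
COMPRESS_THRESHOLD = 80  # compression triggers above this many messages
KEEP_RECENT = 50         # most recent messages kept verbatim

def compress_history(messages, summarize):
    """Collapse old messages into a 'previously on' digest."""
    if len(messages) <= COMPRESS_THRESHOLD:
        return messages
    old, recent = messages[:-KEEP_RECENT], messages[-KEEP_RECENT:]
    digest = {"role": "system", "content": "Previously: " + summarize(old)}
    # The digest is injected at the beginning of the conversation.
    return [digest] + recent
```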
Streaming Response
Monstrum supports SSE (Server-Sent Events) streaming responses. Bot replies are pushed to the frontend token by token for real-time display.
Streaming responses include the following event types:
| Event Type | Description |
|---|---|
| text | Text fragments that progressively form the complete reply |
| tool_start | Tool call started |
| tool_result | Tool execution result |
| done | Response complete |
Intermediate tool call rounds (Bot calls tool, gets result, continues reasoning) are shown to the user as tool_start and tool_result events, while the final text reply streams token by token.
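A client can consume this stream with any SSE parser. The sketch below follows the standard SSE wire format (`event:` / `data:` fields, blank-line delimiters) and assumes JSON payloads with a `content` field for `text` events — the payload shape is an assumption, not documented Monstrum behavior:

```python
import json

def parse_sse(stream_lines):
    """Yield (event, payload) pairs from raw SSE lines."""
    event, data = "message", []
    for line in stream_lines:
        if line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            data.append(line[len("data:"):].strip())
        elif line == "":  # a blank line terminates one event
            if data:
                yield event, json.loads("\n".join(data))
            event, data = "message", []

def collect_reply(stream_lines):
    """Accumulate text fragments until the done event."""
    text = []
    for event, payload in parse_sse(stream_lines):
        if event == "text":
            text.append(payload["content"])
        elif event == "done":
            break
    return "".join(text)
```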
Session Error Handling
Bots may encounter the following exceptions during conversations. The platform displays corresponding messages in the chat interface:
| Exception | Message | Recommended Action |
|---|---|---|
| Token budget exhausted | “Token budget exhausted” | Adjust the budget in Bot settings or start a new conversation |
| Max iterations reached | “Maximum execution rounds reached” | Increase the iteration limit in Bot settings |
| Timeout | “Request timed out” | Retry or adjust the timeout duration |
| Execution error | “An error occurred during processing” | Check the Data Center logs for details |
IM Channel Chat
After connecting IM platforms through Gateways, users can chat with Bots in Slack, Feishu, Telegram, and other platforms.
Message Trigger Modes
| Mode | Description |
|---|---|
| Smart | In group chats, only replies when @Bot; in direct messages, replies to all messages |
| All Messages | Replies to all messages |
| @Bot Only | Only replies when explicitly @mentioned |
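The three trigger modes reduce to a small decision function. This is an illustrative sketch — the mode identifiers (`"smart"`, `"all"`, `"mention_only"`) are made up for the example, not Monstrum's internal values:

```python
def should_reply(mode, is_group, mentioned):
    """Decide whether the Bot replies to an incoming IM message."""
    if mode == "all":            # All Messages: always reply
        return True
    if mode == "mention_only":   # @Bot Only: explicit mention required
        return mentioned
    if mode == "smart":          # Smart: mention needed only in groups
        return mentioned if is_group else True
    raise ValueError(f"unknown trigger mode: {mode!r}")
```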
Group Chat vs Direct Message
- Direct Message: Bot has one-on-one conversations with users
- Group Chat: Bot can see the message flow in the group and participates when @mentioned
In group chat mode, the Bot’s system prompt automatically appends group chat-related instructions to help the Bot understand the group chat context.
Message Policies
In the Gateway configuration, you can set:
- Reply Scope: All / Direct messages only / Group chats only
- User Filtering: Allowed users / Blocked users
- Channel Filtering: Allowed channels / Blocked channels
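Taken together, these policy settings act as a filter in front of the Bot. A hypothetical sketch (field names and policy keys are assumptions for illustration):

```python
def passes_policy(msg, policy):
    """Return True if a message clears the Gateway's reply policy.

    msg:    {"user": str, "channel": str, "is_group": bool}
    policy: reply_scope plus optional allow/block lists.
    """
    scope = policy.get("reply_scope", "all")
    if scope == "direct_only" and msg["is_group"]:
        return False
    if scope == "group_only" and not msg["is_group"]:
        return False
    allowed_users = policy.get("allowed_users")
    if allowed_users is not None and msg["user"] not in allowed_users:
        return False
    if msg["user"] in policy.get("blocked_users", ()):
        return False
    allowed_channels = policy.get("allowed_channels")
    if allowed_channels is not None and msg["channel"] not in allowed_channels:
        return False
    if msg["channel"] in policy.get("blocked_channels", ()):
        return False
    return True
```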
See IM Gateways for details.
Task Mode
In addition to session-mode conversations, Bots also support task mode — single-execution tasks triggered via API, scheduled tasks, Bot-to-Bot delegation, or events.
Viewing Tasks
In the Execution History tab of the Bot detail page, you can view all Bot tasks, including:
- Task status: Pending / Running / Completed / Failed / Cancelled
- Task instruction
- Token consumption
- Execution duration
- Completion time
Click a task to view details, including the complete tool call chain and execution results.
Task Replay
In the Data Center’s task detail page, you can view the complete execution process of a task — each step’s reasoning output, tool calls, and execution results displayed on a timeline.
FAQ
Bot replies slowly
- Check the LLM provider’s API response speed
- If the Bot needs to call multiple tools, each tool call adds to the response time
- Consider lowering the Temperature or selecting a faster model
Conversation context lost
- Sessions automatically expire after 30 minutes of inactivity, after which a new session begins
- If session persistence is enabled, history is restored when a new session is created
- After 80+ messages, compression is triggered — early details may be lost in the summary
Bot keeps calling tools in a loop
- Check the max iterations setting and lower it as appropriate
- Check whether the system prompt correctly guides the Bot to terminate
- Consider using “planning” reasoning mode so the Bot plans before executing