Chat
Prerequisites: Before reading this document, it’s recommended to understand the sessions and tasks section in Core Concepts.
Monstrum provides a streaming chat interface for real-time interaction with Bots. Bots can call tools, execute operations, and display results to you during conversations.
Overview
Monstrum’s chat system supports two entry points:
- Web Chat: Chat directly with a Bot in the Overview tab of the Bot detail page
- IM Channels: Chat with a Bot through external IM platforms like Slack, Feishu, Telegram, etc. (requires Gateway configuration)
Both methods use the same underlying session mechanism — only the message entry and exit points differ.
Web Chat
Starting a Conversation
- Go to Bot Management → Click a Bot → Enter the Bot detail page
- Type a message in the chat box at the bottom of the Overview tab
- Press Enter or click the send button
The Bot uses the configured LLM for reasoning and replies. Conversations support streaming output — you can see the Bot generating its response token by token.
Tool Call Display
When the Bot calls a tool during conversation, the interface displays:
- Tool Name: Shows which tool was called (e.g., web_search, ssh_execute)
- Execution Result: The result content returned by the tool
- You can expand to view the detailed parameters and full results of the tool call
In a single conversation, the Bot may perform multiple rounds of tool calls. For example, it might first search for information, then access a webpage based on search results, and finally summarize the content in its reply.
New Conversation
Click the New Conversation button to start a fresh session. A new session clears the current conversation context, and the Bot starts from scratch.
End Session
Click End Session to manually terminate the current session. When ending a session:
- If the Bot has automatic memory extraction enabled, it will extract important information from the conversation and save it as memories
- The session history is saved (if session persistence is enabled)
Session Lifecycle
Creation
When the first message arrives, the platform creates a new session. During creation:
- The Bot’s memories are loaded (global memories + current scope memories)
- Persisted conversation history is loaded (if available)
- The system prompt is constructed
Active
Each message is processed in FIFO order. When processing a message, the Bot:
- Constructs the system prompt (including resource summaries, memories, skills, etc.)
- Resolves the available tool list (Pre-LLM permission filtering)
- Calls the LLM for reasoning
- If the LLM returns a tool call → permission check → execute tool → continue reasoning
- Repeats until the LLM returns a final text response
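The processing loop above can be sketched in a few lines. This is a hypothetical illustration, not Monstrum's actual implementation: `call_llm` and `run_tool` are stand-in stubs, and the message shapes are assumptions.

```python
def call_llm(messages):
    # Stub LLM: requests one tool call, then produces a final answer.
    if not any(m["role"] == "tool" for m in messages):
        return {"tool_call": {"name": "web_search", "args": {"q": "monstrum"}}}
    return {"text": "Here is what I found."}

def run_tool(name, args, allowed_tools):
    # Permission check happens again at execution time (post-LLM).
    if name not in allowed_tools:
        raise PermissionError(f"tool {name!r} not permitted")
    return f"results for {args['q']}"

def process_message(user_text, allowed_tools, max_iterations=5):
    messages = [{"role": "system", "content": "..."},
                {"role": "user", "content": user_text}]
    for _ in range(max_iterations):
        reply = call_llm(messages)
        if "tool_call" in reply:
            call = reply["tool_call"]
            result = run_tool(call["name"], call["args"], allowed_tools)
            messages.append({"role": "tool", "content": result})
            continue  # feed the tool result back and keep reasoning
        return reply["text"]  # final text response ends the loop
    raise RuntimeError("Maximum execution rounds reached")
```

The `max_iterations` guard corresponds to the "Max iterations reached" error described later in this page.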
Expiration
Sessions automatically expire after 30 minutes of inactivity. On expiration:
- Automatic memory extraction is triggered (if enabled)
- Session history is saved to the database
- Session resources are released
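A minimal sketch of the idle-expiry check, assuming a 30-minute timeout as stated above (the `Session` class and hook ordering are illustrative, not Monstrum's real API):

```python
IDLE_TIMEOUT = 30 * 60  # 30 minutes of inactivity, per the docs

class Session:
    def __init__(self, now):
        self.last_activity = now
        self.expired = False

    def touch(self, now):
        # Any incoming message resets the idle clock.
        self.last_activity = now

    def check_expiry(self, now):
        if not self.expired and now - self.last_activity >= IDLE_TIMEOUT:
            self.expired = True
            # On expiry: extract memories (if enabled), save history,
            # then release session resources.
        return self.expired
```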
Session Persistence
Channel-level conversations support persistence. The next time a session is created, the previous conversation history is loaded.
When conversations accumulate more than 80 messages, the platform automatically compresses them: the LLM summarizes old messages into a digest, keeping the most recent 50 messages. The digest is injected at the beginning of the conversation as a “previously on” summary.
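The compression rule above (digest everything older than the most recent 50 messages once the history exceeds 80) can be sketched as follows; `summarize` stands in for the LLM summarization call, and the thresholds are taken from this page:

```python
COMPRESS_THRESHOLD = 80  # compression triggers above this many messages
KEEP_RECENT = 50         # most recent messages kept verbatim

def compress_history(messages, summarize):
    """Collapse old messages into a 'previously on' digest."""
    if len(messages) <= COMPRESS_THRESHOLD:
        return messages
    old, recent = messages[:-KEEP_RECENT], messages[-KEEP_RECENT:]
    digest = {"role": "system", "content": "Previously: " + summarize(old)}
    # The digest is injected at the beginning of the conversation.
    return [digest] + recent
```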
Streaming Response
Monstrum supports SSE (Server-Sent Events) streaming responses. Bot replies are pushed to the frontend token by token for real-time display.
Streaming responses include the following event types:
| Event Type | Description |
|---|---|
| text | Text fragments that progressively form the complete reply |
| tool_start | Tool call started |
| tool_result | Tool execution result |
| done | Response complete |
Intermediate tool call rounds (Bot calls tool, gets result, continues reasoning) are shown to the user as tool_start and tool_result events, while the final text reply streams token by token.
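A client can consume this stream with any SSE parser. The sketch below follows the standard SSE wire format (`event:` / `data:` fields, blank-line delimiters) and assumes JSON payloads with a `content` field for `text` events — the payload shape is an assumption, not documented Monstrum behavior:

```python
import json

def parse_sse(stream_lines):
    """Yield (event, payload) pairs from raw SSE lines."""
    event, data = "message", []
    for line in stream_lines:
        if line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            data.append(line[len("data:"):].strip())
        elif line == "":  # a blank line terminates one event
            if data:
                yield event, json.loads("\n".join(data))
            event, data = "message", []

def collect_reply(stream_lines):
    """Accumulate text fragments until the done event."""
    text = []
    for event, payload in parse_sse(stream_lines):
        if event == "text":
            text.append(payload["content"])
        elif event == "done":
            break
    return "".join(text)
```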
Session Error Handling
Bots may encounter the following exceptions during conversations. The platform displays corresponding messages in the chat interface:
| Exception | Message | Recommended Action |
|---|---|---|
| Token budget exhausted | “Token budget exhausted” | Adjust the budget in Bot settings or start a new conversation |
| Max iterations reached | “Maximum execution rounds reached” | Increase the iteration limit in Bot settings |
| Timeout | “Request timed out” | Retry or adjust the timeout duration |
| Execution error | “An error occurred during processing” | Check the Data Center logs for details |
IM Channel Chat
After connecting IM platforms through Gateways, users can chat with Bots in Slack, Feishu, Telegram, and other platforms.
Message Trigger Modes
| Mode | Description |
|---|---|
| Smart | In group chats, only replies when @Bot; in direct messages, replies to all messages |
| All Messages | Replies to all messages |
| @Bot Only | Only replies when explicitly @mentioned |
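The three trigger modes reduce to a small decision function. This is an illustrative sketch — the mode identifiers (`"smart"`, `"all"`, `"mention_only"`) are made up for the example, not Monstrum's internal values:

```python
def should_reply(mode, is_group, mentioned):
    """Decide whether the Bot replies to an incoming IM message."""
    if mode == "all":            # All Messages: always reply
        return True
    if mode == "mention_only":   # @Bot Only: explicit mention required
        return mentioned
    if mode == "smart":          # Smart: mention needed only in groups
        return mentioned if is_group else True
    raise ValueError(f"unknown trigger mode: {mode!r}")
```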
Group Chat vs Direct Message
- Direct Message: Bot has one-on-one conversations with users
- Group Chat: Bot can see the message flow in the group and participates when @mentioned
In group chat mode, the Bot’s system prompt automatically appends group chat-related instructions to help the Bot understand the group chat context.
Message Policies
In the Gateway configuration, you can set:
- Reply Scope: All / Direct messages only / Group chats only
- User Filtering: Allowed users / Blocked users
- Channel Filtering: Allowed channels / Blocked channels
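Taken together, these policy settings act as a filter in front of the Bot. A hypothetical sketch (field names and policy keys are assumptions for illustration):

```python
def passes_policy(msg, policy):
    """Return True if a message clears the Gateway's reply policy.

    msg:    {"user": str, "channel": str, "is_group": bool}
    policy: reply_scope plus optional allow/block lists.
    """
    scope = policy.get("reply_scope", "all")
    if scope == "direct_only" and msg["is_group"]:
        return False
    if scope == "group_only" and not msg["is_group"]:
        return False
    allowed_users = policy.get("allowed_users")
    if allowed_users is not None and msg["user"] not in allowed_users:
        return False
    if msg["user"] in policy.get("blocked_users", ()):
        return False
    allowed_channels = policy.get("allowed_channels")
    if allowed_channels is not None and msg["channel"] not in allowed_channels:
        return False
    if msg["channel"] in policy.get("blocked_channels", ()):
        return False
    return True
```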
See IM Gateways for details.
Task Mode
In addition to session-mode conversations, Bots also support task mode — single-execution tasks triggered via API, scheduled tasks, Bot-to-Bot delegation, or events.
Viewing Tasks
In the Execution History tab of the Bot detail page, you can view all Bot tasks, including:
- Task status: Pending / Running / Completed / Failed / Cancelled
- Task instruction
- Token consumption
- Execution duration
- Completion time
Click a task to view details, including the complete tool call chain and execution results.
Task Replay
In the Data Center’s task detail page, you can view the complete execution process of a task — each step’s reasoning output, tool calls, and execution results displayed on a timeline.
FAQ
Bot replies slowly
- Check the LLM provider’s API response speed
- If the Bot needs to call multiple tools, each tool call adds to the response time
- Consider lowering the Temperature or selecting a faster model
Conversation context lost
- Sessions automatically expire after 30 minutes of inactivity, after which a new session begins
- If session persistence is enabled, history is restored when a new session is created
- After 80+ messages, compression is triggered — early details may be lost in the summary
Bot keeps calling tools in a loop
- Check the max iterations setting and lower it as appropriate
- Check whether the system prompt correctly guides the Bot to terminate
- Consider using “planning” reasoning mode so the Bot plans before executing