Context efficiency
Where tokens go
Parses 15 languages into skeletons: imports, type defs, function signatures with their line ranges. Adds 59 tok/turn but saves 224 tok/turn on reads, netting 165 tok/turn saved.
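To give a feel for what a skeleton looks like, here is a rough sketch for a single language. It uses Python's standard `ast` module, which is a stand-in: the parser maki actually uses for skeletons isn't described here, and the `skeleton` function and its output format are made up for illustration.

```python
import ast

SAMPLE = '''import os
from pathlib import Path

class Reader:
    def read(self, path):
        return Path(path).read_text()

def main():
    print(Reader().read(os.devnull))
'''

def skeleton(source: str) -> list[str]:
    """Return one line per import, class, or function, tagged with its line range."""
    entries = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.Import, ast.ImportFrom)):
            text = ast.get_source_segment(source, node)
        elif isinstance(node, ast.ClassDef):
            text = f"class {node.name}"
        elif isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            prefix = "async def" if isinstance(node, ast.AsyncFunctionDef) else "def"
            text = f"{prefix} {node.name}({ast.unparse(node.args)})"
        else:
            continue  # bodies, expressions, etc. are left out of the skeleton
        entries.append((node.lineno, f"{text}  [L{node.lineno}-{node.end_lineno}]"))
    # ast.walk is breadth-first, so sort back into source order
    return [line for _, line in sorted(entries)]

if __name__ == "__main__":
    print("\n".join(skeleton(SAMPLE)))
```

The function bodies are gone, but the line ranges remain, so a follow-up read can ask for just the lines it needs.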
A sandboxed Python interpreter with memory and time limits. All tools are exposed as async functions, so the model can asyncio.gather() a bunch of reads, grep the results, and only return what matters. Intermediate data never reaches your context.
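A script the model writes in the sandbox might look roughly like this. The `read` coroutine and the in-memory `FAKE_FILES` are hypothetical stand-ins so the sketch runs on its own; the real tool names and signatures aren't listed here.

```python
import asyncio

# Hypothetical file contents standing in for a real workspace.
FAKE_FILES = {
    "a.py": "import os\n# TODO: remove\n",
    "b.py": "print('ok')\n",
    "c.py": "x = 1  # TODO: rename\n",
}

async def read(path: str) -> str:
    """Assumed stand-in for an async read tool."""
    await asyncio.sleep(0)  # simulate I/O
    return FAKE_FILES[path]

async def find_todos(paths: list[str]) -> list[str]:
    # Issue all reads concurrently; gather returns results in input order.
    contents = await asyncio.gather(*(read(p) for p in paths))
    hits = []
    for path, text in zip(paths, contents):
        for lineno, line in enumerate(text.splitlines(), start=1):
            if "TODO" in line:
                hits.append(f"{path}:{lineno}: {line.strip()}")
    return hits  # only the matching lines leave the sandbox

if __name__ == "__main__":
    print("\n".join(asyncio.run(find_todos(list(FAKE_FILES)))))
```

Three files are read, but only the two matching lines come back. The full file contents stay inside the interpreter.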
The model picks weak, medium, or strong for each subagent. Haiku-tier for grep-heavy research, opus-tier for architecture. Subagents can be read-only or have full tool access.
The system prompt, tool descriptions, and examples are short. When context gets too long, maki compacts history automatically: it strips images and thinking blocks and summarizes older turns.
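The compaction step can be pictured with a toy version like the one below. Everything in it is an assumption made for illustration: the `Block` type, the `keep_recent` cutoff, stripping from every turn rather than only older ones, and the placeholder summary (a real system would presumably ask a model to write it).

```python
from dataclasses import dataclass

@dataclass
class Block:
    kind: str  # "text", "image", or "thinking"
    body: str

def compact(turns: list[list[Block]], keep_recent: int = 2) -> list[list[Block]]:
    """Drop images and thinking blocks, then fold older turns into one summary turn."""
    stripped = [[b for b in turn if b.kind == "text"] for turn in turns]
    split = max(len(stripped) - keep_recent, 0)
    older, recent = stripped[:split], stripped[split:]
    if not older:
        return recent
    # Placeholder summary: the first 40 characters of each older text block.
    summary = " / ".join(b.body[:40] for turn in older for b in turn)
    header = Block("text", f"Summary of {len(older)} earlier turns: {summary}")
    return [[header]] + recent

if __name__ == "__main__":
    history = [
        [Block("text", "read main.rs"), Block("image", "<png>")],
        [Block("thinking", "hmm"), Block("text", "found the bug")],
        [Block("text", "fix it")],
    ]
    for turn in compact(history, keep_recent=1):
        print([b.body for b in turn])
```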
User experience
What you get
Native binary. No JavaScript runtime, no React. Even the splash screen animation uses SIMD. Syntax highlighting runs on a background thread pool so it never blocks your input. Fits well on small laptop screens.
Philosophy: don't hide anything. Token count, cost, and model are always in the status bar. Each subagent gets its own chat window you can flip through with Ctrl-N/P. Ctrl-F for fuzzy search. /btw runs a side query without touching the current session. ! runs shell commands, !! runs them silently.
Bash commands are parsed with tree-sitter so maki knows what's actually being run. git diff && rm -rf / correctly flags both git and rm. Most agents only see git. Handles subshells, command substitution, pipes. Per-tool allow/deny rules, or --yolo to skip it all. SSRF protection on webfetch.
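The difference between looking at the first word and looking at every command in a chain can be shown with a small sketch. This one swaps in Python's standard `shlex` tokenizer instead of tree-sitter, so it is a much weaker stand-in: it handles `&&`, `||`, `;`, and pipes, but not subshells or command substitution, and it can't tell a quoted `"&&"` argument from a real operator. The function names are made up for illustration.

```python
import shlex

OPERATORS = {"&&", "||", ";", "|", "&"}

def first_word_only(line: str) -> list[str]:
    """Naive check: only the first word of the whole line."""
    return line.split()[:1]

def command_names(line: str) -> list[str]:
    """Return the command name at the start of each chained or piped command."""
    lexer = shlex.shlex(line, posix=True, punctuation_chars=True)
    lexer.whitespace_split = True
    names, expect_command = [], True
    for token in lexer:
        if token in OPERATORS:
            expect_command = True  # the next word starts a new command
        elif expect_command:
            names.append(token)
            expect_command = False
    return names

if __name__ == "__main__":
    line = "git diff && rm -rf /"
    print(first_word_only(line))  # ['git']
    print(command_names(line))    # ['git', 'rm']
```

An allow rule for git would wave the whole line through under the naive check. With per-command names, the rm can be matched against its own rule.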
Long-term memory that persists across sessions. Tell maki to remember something; sometimes it also picks things up on its own. Double-Escape to rewind. Plan mode restricts the agent to read-only. MCP servers over stdio or HTTP. Skills. 26 themes. Paste images. --print for headless (output is Claude Code-compatible).