Skip to content

Quick Start

Quickstart

Basic usage

import fast_rlm

result = fast_rlm.run("Generate 50 fruits and count number of r")
print(result["results"])
print(result["usage"])

The returned dict contains:

{
    "results": ...,        # the agent's final answer
    "log_file": "...",     # path to the JSONL log
    "usage": {
        "prompt_tokens": 12345,
        "completion_tokens": 678,
        "total_tokens": 13023,
        "cached_tokens": 5000,
        "reasoning_tokens": 200,
        "cost": 0.0342
    }
}

Arbitrarily long context

The key idea behind RLMs is that the prompt can be arbitrarily long — far beyond any model's context window. The agent explores it programmatically through the REPL rather than trying to fit it all into a single call.

import fast_rlm

transcripts = open("lex_fridman_all_transcripts.txt").read()  # millions of tokens

result = fast_rlm.run(
    "Here are the transcripts of all Lex Fridman podcasts. "
    "Summarize what the first 5 Machine Learning guests had to say about AGI.\n\n"
    + transcripts
)
print(result["results"])

The agent will write code to search, filter, and chunk the transcripts on its own — no manual splitting required.

With configuration

from fast_rlm import run, RLMConfig

config = RLMConfig.default()
config.primary_agent = "minimax/minimax-m2.5"
config.sub_agent = "minimax/minimax-m2.5"
config.max_depth = 5
config.max_money_spent = 2.0

result = run(
    "Count the r's in 50 fruit names",
    prefix="r_count",
    config=config,
)

Parameters

Parameter Type Default Description
query str (required) The question or context to process
prefix str None Log filename prefix (e.g. "r_count"r_count_2026-02-23T...)
config RLMConfig or dict None Config overrides (see Configuration)
verbose bool True Stream engine output to terminal

Quiet mode

To suppress all terminal output and just get the result:

result = fast_rlm.run("What is 2+2?", verbose=False)