Model Management — Dome Terminal Docs

Why local models — privacy, cost, and offline capability

The choice between local and remote AI isn't just technical — it's a question of what you value and how you work. Here's an honest comparison of what each option gives you.

Local AI

✓ Nothing leaves your machine: your journal, chart data, and strategies stay private

✓ No per-query usage cost after initial setup

✓ Works offline: no internet required once the model is downloaded

✓ Responds within seconds on capable hardware (GPU with 8GB+ VRAM)

– Slower on CPU-only machines, especially with larger models

– Smaller models have less reasoning depth than top remote models

– Requires disk space (350MB–8GB+ per model)

Remote AI

✓ Highest-quality reasoning: top-tier models available

✓ No local hardware requirements: works on any machine

✓ Fastest for complex, multi-step analysis

– Requires internet connectivity and is unavailable when your connection is down

– Prompt data is sent to the provider's servers

– Per-query cost depending on your plan

A practical approach for most traders

Use local AI for quick queries, private research, and anything where your journal or strategy details are part of the context. Use remote AI for complex, deep analysis tasks where the quality difference justifies sending data externally. You can switch between the two in the AI provider selector at any time — it's not a permanent choice.
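The rule of thumb above can be written as a small decision function. This is only an illustrative Python sketch: the field names and the `choose_provider` helper are hypothetical and not part of Dome Terminal, and it simplifies the guidance by always keeping sensitive context local.

```python
# Hypothetical field names; Dome Terminal's internal identifiers are not documented here.
SENSITIVE_FIELDS = {
    "journal_entries",
    "strategy_scripts",
    "account_balance",
    "open_positions",
}


def choose_provider(context_fields: set, needs_deep_analysis: bool) -> str:
    """Return "local" or "remote" following the rule of thumb above.

    Simplifying assumption: any sensitive field in the context forces local AI,
    even for deep analysis. In practice that trade-off is the user's call.
    """
    if context_fields & SENSITIVE_FIELDS:
        return "local"
    return "remote" if needs_deep_analysis else "local"
```

For example, `choose_provider({"journal_entries"}, needs_deep_analysis=True)` stays local because the journal is part of the context.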

Supported models

Dome Terminal supports any GGUF-format model compatible with Ollama. The models listed below are tested and recommended. Larger models generally produce better results but need more RAM or VRAM and respond more slowly.

Local Model Browser — Settings → AI → Local Models

Model	Profile	Download size	Context length	Best for	Status
Qwen 2.5 0.5B	Ultra-fast · CPU-friendly	350 MB	8K tokens	Quick summaries, short queries, active trading sessions	● Active
Qwen 2.5 1.5B	Balanced · Good for most tasks	1.0 GB	32K tokens	Market analysis, journal review, ORACLE conversations	↓ Download
7B+ Models	Deep reasoning · GPU recommended	4–8 GB	128K tokens	Complex multi-step analysis, AI Committee, deep journal analysis	↓ Download

Hardware	Recommended model tier	Why
CPU only, 8GB RAM	0.5B or 1.5B	Larger models will be slow (>30s per response) and may cause system instability under load.
CPU only, 16GB+ RAM	1.5B comfortably, 7B with patience	7B on CPU is usable for non-time-sensitive research. Expect 30–90 second response times.
GPU with 6–8GB VRAM	7B models	7B models fit in 6–8GB VRAM and produce responses in 5–15 seconds. A good balance of quality and speed.
GPU with 12GB+ VRAM	7B–13B models	Larger models produce meaningfully better reasoning. Response times are still fast (under 15 seconds).
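As a rough illustration, the table can be read as a lookup from available memory to model tier. The sketch below is a hypothetical Python helper, not a Dome Terminal API. It assumes that hardware falling between the table's rows (for example 10GB of VRAM, or 12GB of RAM) maps conservatively to the lower tier.

```python
def recommend_tier(ram_gb: float, vram_gb: float = 0.0) -> str:
    """Map hardware to the recommended model tier from the table above.

    Assumption: values between the table's rows fall back to the lower tier.
    """
    if vram_gb >= 12:
        return "7B–13B models"
    if vram_gb >= 6:
        return "7B models"
    if ram_gb >= 16:
        return "1.5B comfortably, 7B with patience"
    return "0.5B or 1.5B"
```

A CPU-only laptop with 8GB of RAM, `recommend_tier(8)`, lands on the 0.5B or 1.5B tier.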

Downloading a model

Models are downloaded directly from within Dome Terminal. No external tools or command-line steps are required for supported models.

01. Open Settings → AI → Local Models

Navigate to Settings (gear icon in the sidebar, or Ctrl+,) → AI tab → Local Models section. You'll see the model browser with all available models and their download status.

02. Find the model and click Download

Browse the available models, check the size and context length against your hardware specs (table above), and click ↓ Download. Make sure you have enough free disk space before starting — the download size is shown in the model browser.

03. Monitor progress in the status bar

A download progress indicator appears in the terminal status bar at the bottom of the screen. You can continue using the terminal normally during the download — it runs in the background.

Qwen 2.5 1.5B — qwen2.5-1.5b-instruct.Q4_K_M.gguf 72%

734 MB / 1.02 GB · ~2m 14s remaining

04. Download complete: model appears in Available Models

When the download finishes, the model status changes from ↓ Download to Set Active. You can now activate it for use. The model file stays on your machine and does not need to be re-downloaded.
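Step 02 asks you to confirm free disk space before downloading. If you want to check this outside the terminal, a minimal Python sketch follows. The `has_room_for` helper, the default directory, and the 10% headroom are assumptions made for illustration, because the location where Dome Terminal stores models is not stated here.

```python
import shutil


def has_room_for(download_bytes: int, target_dir: str = ".", headroom: float = 1.1) -> bool:
    """Return True if target_dir has enough free space for the download.

    headroom adds a safety margin (10% by default, an arbitrary choice).
    """
    free_bytes = shutil.disk_usage(target_dir).free
    return free_bytes >= download_bytes * headroom
```

For the 1.0 GB Qwen 2.5 1.5B download you might call `has_room_for(1_020_000_000, "/path/to/models")`, substituting the drive you actually store models on.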

Activating a model

Only one local model is active at a time. The active model is loaded into memory and used for all local AI features — ORACLE, AI Committee, and any other module that routes to local AI.


How to set the active model

In Settings → AI → Local Models, click Set Active next to any downloaded model. The terminal loads the model into memory — this takes between 5 and 30 seconds depending on model size and your hardware. A loading indicator appears in the AI status bar.

Once loaded, the AI provider indicator in the toolbar changes to show the active local model name. You can switch models at any time — the current model unloads and the new one loads. Switching mid-session does not lose conversation history.
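The behaviour described above (one active model at a time, with conversation history surviving a switch) can be pictured with a small model. This is a conceptual Python sketch only. `LocalModelSlot` and its attributes are hypothetical and say nothing about how Dome Terminal is implemented internally.

```python
class LocalModelSlot:
    """Holds at most one active model; history is stored outside the model."""

    def __init__(self, downloaded):
        self.downloaded = set(downloaded)
        self.active = None   # name of the currently loaded model, if any
        self.history = []    # conversation turns, independent of the model

    def set_active(self, name):
        """Switch the active model and return the name of the one it replaced."""
        if name not in self.downloaded:
            raise ValueError(f"{name} has not been downloaded")
        previous = self.active
        self.active = name   # stands in for unloading the old model and loading the new one
        return previous
```

Because `history` belongs to the slot rather than to a model, calling `set_active` leaves it untouched, which mirrors the statement that switching mid-session does not lose conversation history.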

Switch to a smaller model during active trading

During a live trading session, response speed often matters more than reasoning depth. Switch to the 0.5B or 1.5B model for quick queries while trading, and switch back to a larger model for deeper research when the market is closed. You can create a keyboard shortcut for model switching in Settings → Shortcuts.

Privacy controls — what the AI can see

Dome Terminal lets you control exactly what context is included when an AI model processes your queries. These settings are separate for local and remote models, so you can give local AI access to sensitive data while withholding that data from remote AI.

Local AI context

Active chart symbol and price

Quant Brain readings

Journal entries (all fields)

Strategy scripts

Account balance and PnL

Open positions

All data stays on your machine. Nothing is transmitted externally.

Remote AI context (defaults)

Active chart symbol and price

Quant Brain readings (anonymized)

Journal entries (mood, tags only)

Strategy scripts — blocked by default

Account balance — blocked by default

Live PnL — blocked by default

Sent fields go to the remote provider's servers. Adjust in Settings → AI → Privacy.

To adjust which fields are included in remote AI context, go to Settings → AI → Privacy and toggle each category. Changes apply immediately to all future queries — they do not affect queries already sent.
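Conceptually, the privacy toggles act as an allowlist applied to the context before a remote query is sent. The Python sketch below illustrates that idea under stated assumptions: the key names and the `filter_context` helper are hypothetical, the defaults simply mirror the remote list above, and unknown fields are treated as blocked.

```python
# Hypothetical keys mirroring the remote defaults listed above.
REMOTE_DEFAULTS = {
    "chart_symbol_and_price": True,
    "quant_brain_readings": True,    # sent anonymized, per the list above
    "journal_mood_and_tags": True,
    "journal_full_entries": False,   # only mood and tags are sent by default
    "strategy_scripts": False,
    "account_balance": False,
    "live_pnl": False,
}


def filter_context(context: dict, allowed: dict) -> dict:
    """Keep only fields whose toggle is on; unknown fields are dropped."""
    return {key: value for key, value in context.items() if allowed.get(key, False)}
```

Turning a category on corresponds to something like `{**REMOTE_DEFAULTS, "account_balance": True}`. Only queries built after that change would include the field, which matches the note that already-sent queries are unaffected.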

Using Ollama to manage models

If you already use Ollama to manage GGUF models on your machine, Dome Terminal can connect directly to Ollama and use its model library — you don't need to download the same models twice.


Configure Ollama in Settings → AI → Ollama

Navigate to Settings → AI → Ollama and enter the Ollama host address. If Ollama is running on the same machine, the default http://localhost:11434 will work without changes. Click Test Connection to verify.

Once connected, all models available in your Ollama library appear automatically in the Dome Terminal model browser alongside any natively downloaded models. You can activate, switch, and use them exactly the same way.
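If Test Connection fails, it can help to confirm independently that Ollama is reachable. Ollama's HTTP API lists locally available models at `GET /api/tags`. The Python sketch below queries that endpoint. The helper names are hypothetical, and this is not necessarily what Dome Terminal's own connection test does.

```python
import json
import urllib.request


def model_names(tags_payload: dict) -> list:
    """Extract model names from an Ollama /api/tags response body."""
    return [model["name"] for model in tags_payload.get("models", [])]


def list_ollama_models(host: str = "http://localhost:11434", timeout: float = 3.0) -> list:
    """Ask a running Ollama server which models it has available locally."""
    with urllib.request.urlopen(f"{host}/api/tags", timeout=timeout) as response:
        return model_names(json.load(response))
```

Calling `list_ollama_models()` raises `urllib.error.URLError` when nothing is listening on the host, which usually means Ollama is not running or the address is wrong.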

Ollama connection is optional

If you don't use Ollama, you don't need to configure it. Dome Terminal has its own built-in model downloader and manager. Ollama integration is for users who prefer to manage their model library with Ollama directly, or who want to use models that aren't in the Dome Terminal model browser.

Performance tips

Local AI performance depends heavily on your hardware and what else is running on your machine. These three tips make the biggest practical difference.

1. Close memory-heavy applications before loading large models

7B+ models require 4–8GB of RAM (or VRAM). If a browser with many tabs, a video editor, or another heavy application is using that memory, the model may load slowly, run slowly, or fail to load at all. Close what you don't need before activating a large model.

2. Restart the Local LLM service if responses become slow

After extended use, the local AI service can degrade in performance — especially on systems with limited RAM. Go to Settings → AI → Local Models and click Restart LLM Service. This unloads and reloads the active model cleanly without restarting the entire terminal.

3. Use a smaller model during active trading sessions

When you're actively watching the market, a fast 0.5B or 1.5B response in 2–3 seconds is more useful than a thorough 7B response that takes 40 seconds. Reserve larger models for pre-session research and post-session journal review when response speed is less critical.
