Changelog

What's new in Cloaked.

v2.0.0 May 20, 2026

Image Vision & V2.0

Image Vision — attach a photo to any chat with a vision-capable model and Cloaked describes, reads, or analyses what it sees. Translate menus, summarise screenshots, extract handwriting, walk through diagrams.
Four vision-capable models — Qwen 3.5 0.8B, 2B, 4B, and 9B now run multimodally through Apple’s MLX framework. Pick one and the photo picker appears in the chat composer.
On-device image processing — attached photos are downscaled to 1024 px and re-encoded locally; EXIF and GPS metadata are stripped before the model sees the file. Images never leave your device.
Photo picker in chat — single tap on the composer attaches a photo from your library. The button only appears when the active model supports vision.
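
The downscaling step described above can be illustrated with a small sketch of the size arithmetic. It assumes that "1024 px" refers to the longest edge and that smaller images are left at their original size; neither detail is stated in these notes, and the function name `downscaled_size` is hypothetical. Metadata stripping is not shown.

```python
def downscaled_size(width: int, height: int, max_edge: int = 1024) -> tuple[int, int]:
    """Return (width, height) scaled so the longest edge is at most max_edge.

    Assumption: the 1024 px limit applies to the longest edge and the
    aspect ratio is preserved. This is an illustration, not Cloaked's code.
    """
    if width <= 0 or height <= 0:
        raise ValueError("image dimensions must be positive")
    longest = max(width, height)
    if longest <= max_edge:
        return width, height  # already small enough; never upscale
    # Multiply before dividing so exact ratios stay exact.
    new_width = max(1, round(width * max_edge / longest))
    new_height = max(1, round(height * max_edge / longest))
    return new_width, new_height
```

Under these assumptions a 4032 × 3024 photo would be reduced to 1024 × 768, while an 800 × 600 screenshot would pass through unchanged.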

V2.0 milestone — Cloaked is now a fully private multimodal assistant: vision, web search, thinking mode, memory, and projects, all on-device.
Lean catalogue — 12 focused models from 5 AI labs (Meta, Microsoft, Alibaba, DeepSeek, Mistral AI). Qwen 3.5 2B is the recommended default on 8 GB devices.
In-app review prompt — quieter, kinder review request with a mail-to-support fallback if you’d rather give private feedback than a public rating.

Stranded “?” glyphs — partial UTF-8 bytes across token boundaries no longer leave U+FFFD replacement characters in streamed responses (most visible with emoji and CJK).
Text generation crash on vision models — sending a text-only message to a vision-capable model no longer crashes the generator. Both text and image paths now route through the multimodal processor correctly.
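
The stranded-glyph problem can be reproduced and avoided in a few lines. The sketch below is a general illustration in Python using the standard library's incremental decoder, not the app's actual fix; the helper name `stream_text` is hypothetical. The point is that a multi-byte character split across two chunks must be buffered until its remaining bytes arrive, rather than decoded chunk by chunk.

```python
import codecs

def stream_text(chunks):
    """Yield decoded text from byte chunks, holding back incomplete UTF-8 sequences."""
    decoder = codecs.getincrementaldecoder("utf-8")(errors="replace")
    for chunk in chunks:
        text = decoder.decode(chunk)  # incomplete trailing bytes stay buffered
        if text:
            yield text
    tail = decoder.decode(b"", final=True)  # flush; only here can U+FFFD appear for a cut-off tail
    if tail:
        yield tail

# An emoji (4 bytes in UTF-8) split across two chunks.
chunks = [b"hi \xf0\x9f", b"\x98\x80!"]

# Decoding each chunk on its own turns the split character into U+FFFD.
naive = "".join(c.decode("utf-8", errors="replace") for c in chunks)

# Decoding incrementally reassembles the character.
streamed = "".join(stream_text(chunks))
```

Here `naive` contains replacement characters, while `streamed` contains the intact emoji.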

v1.2.0 April 2, 2026

Ministral 3 3B — new edge-optimized model from Mistral AI with chat, coding, and multilingual capabilities (replaces Mistral 7B — smaller download, runs on more devices)
Tappable citation links — inline [1], [2] citations in web search responses now open the source URL directly in Safari
Configurable thinking budget — control how long models spend reasoning (256–4096 tokens) with a new Settings slider
Continuation generation — when the thinking budget is reached, the model transitions smoothly to producing an answer instead of cutting off
Download confirmation — new dialog when downloading models that may run slowly on your device
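
The thinking budget and continuation behaviour above can be sketched roughly as follows. Only the 256–4096 token range comes from these notes; the function names, the clamping of out-of-range values, and the `answer_from` callback are assumptions made for illustration, and how the app actually prompts the model to continue is not described here.

```python
from itertools import islice

MIN_BUDGET, MAX_BUDGET = 256, 4096  # range given in the release note

def clamp_budget(requested: int) -> int:
    """Keep a requested thinking budget inside the supported range (assumed behaviour)."""
    return max(MIN_BUDGET, min(MAX_BUDGET, requested))

def think_then_answer(thinking_tokens, answer_from, requested_budget):
    """Consume reasoning tokens up to the budget, then hand over to answering.

    thinking_tokens: iterable of reasoning tokens (hypothetical stand-in for the model).
    answer_from: callback that produces the answer from the kept reasoning (hypothetical).
    """
    budget = clamp_budget(requested_budget)
    kept = list(islice(thinking_tokens, budget))  # stop reasoning once the budget is reached
    return kept, answer_from(kept)  # continue into an answer instead of cutting off
```

For example, a request for 100 tokens would be raised to 256 in this sketch, and a model that would otherwise reason indefinitely would be moved on to answering after 256 reasoning tokens.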

Smarter model recommendations — Qwen 3.5 2B is now the default for 8 GB devices (better speed/quality tradeoff)
Leaner model library — removed legacy and redundant models for a focused catalog of 12 models from 5 AI labs
Memory management — RAM checks now run before loading models, preventing out-of-memory crashes
Onboarding — sub-8 GB devices default to Qwen 3.5 0.8B for a smoother first experience
Source pills — numbered labels (“1 · example.com”) now map directly to inline citations
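
The memory-related items above can be summarised in a short sketch. The two defaults (Qwen 3.5 0.8B below 8 GB, Qwen 3.5 2B at 8 GB) come from these notes; the treatment of devices above 8 GB, the `headroom` factor, and both function names are assumptions for illustration only.

```python
GIB = 1024 ** 3

def default_model(device_ram_gb: float) -> str:
    """Pick the onboarding default from device RAM.

    From the notes: sub-8 GB devices default to Qwen 3.5 0.8B, 8 GB devices to Qwen 3.5 2B.
    Assumption: devices with more than 8 GB keep the 2B default.
    """
    return "Qwen 3.5 0.8B" if device_ram_gb < 8 else "Qwen 3.5 2B"

def can_load(model_bytes: int, available_bytes: int, headroom: float = 1.2) -> bool:
    """Check memory before loading a model rather than failing mid-load.

    headroom is a hypothetical safety margin for runtime overhead beyond the weights.
    """
    return model_bytes * headroom <= available_bytes
```

Running the check before loading lets the app decline or warn up front instead of crashing when memory runs out.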

Memory errors now surface specific messages instead of generic “generation failed”
Fixed chat template detection for Ministral model IDs
Filled in missing translated strings across all 10 non-English languages
4B+ models now correctly show “May run slowly” on 8 GB iPhones

v1.1.0 March 14, 2026

Qwen 3.5 model series — 0.8B, 2B, 4B, and 9B variants with next-generation reasoning
Qwen 3 model series — 0.6B, 1.7B, 4B, and 8B with built-in thinking mode
Thinking mode toggle — switch between fast responses and chain-of-thought reasoning
Expanded model library — now 15 models from 5 AI labs

v1.0.0 January 15, 2026

On-device AI chat powered by Apple’s MLX framework
7 AI models at launch — Llama 3.2, Qwen 2.5, Phi-4 Mini, DeepSeek R1, Mistral 7B
Voice input — on-device speech recognition, no cloud processing
Text-to-speech output for hands-free use
Web search via DuckDuckGo — privacy-preserving, no search history
Projects — organize conversations with custom system prompts
Persistent memory — context that carries across conversations, stored locally
Markdown rendering with syntax-highlighted code blocks
6 accent color themes with light and dark mode
Siri Shortcuts integration
Full offline support — works in airplane mode after model download
Zero accounts — no sign-up, no API keys, no subscriptions