Changelog

What's new in Cloaked.

v2.0.0 May 20, 2026

Image Vision & V2.0

Added

  • Image Vision — attach a photo to any chat with a vision-capable model and Cloaked describes, reads, or analyses what it sees. Translate menus, summarise screenshots, extract handwriting, walk through diagrams.
  • Four vision-capable models — Qwen 3.5 0.8B, 2B, 4B, and 9B now run multimodally through Apple’s MLX framework. Pick one and the photo picker appears in the chat composer.
  • On-device image processing — attached photos are downscaled to 1024 px and re-encoded locally; EXIF and GPS metadata are stripped before the model sees the file. Images never leave your device.
  • Photo picker in chat — single tap on the composer attaches a photo from your library. The button only appears when the active model supports vision.
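
As a rough illustration of the downscaling step, the sketch below computes target dimensions for an attached photo. It assumes that "1024 px" refers to the longest side, that aspect ratio is preserved, and that smaller images are left untouched; none of this is confirmed by the release notes, and `fit_within` is a hypothetical helper written in Python, not Cloaked's actual code.

```python
def fit_within(width: int, height: int, max_side: int = 1024) -> tuple[int, int]:
    """Return (width, height) scaled so the longest side is at most max_side.

    Assumption: the limit applies to the longest side and images are never upscaled.
    """
    longest = max(width, height)
    if longest <= max_side:
        return width, height  # already small enough, keep as-is
    # Scale both sides by the same factor to preserve the aspect ratio.
    new_width = max(1, round(width * max_side / longest))
    new_height = max(1, round(height * max_side / longest))
    return new_width, new_height


landscape = fit_within(4032, 3024)  # typical 12 MP phone photo
portrait = fit_within(3024, 4032)
small = fit_within(800, 600)
```

Under these assumptions a 4032 × 3024 photo would be resized to 1024 × 768 before being re-encoded, while an 800 × 600 image would pass through at its original size.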

Improved

  • V2.0 milestone — Cloaked is now a fully private multimodal assistant: vision, web search, thinking mode, memory, and projects, all on-device.
  • Lean catalogue — 12 focused models from 5 AI labs (Meta, Microsoft, Alibaba, DeepSeek, Mistral AI). Qwen 3.5 2B is the recommended default on 8 GB devices.
  • In-app review prompt — quieter, kinder review request with a mail-to-support fallback if you’d rather give private feedback than a public rating.

Fixed

  • Stranded “?” glyphs — multi-byte UTF-8 characters split across token boundaries no longer leave U+FFFD replacement characters in streamed responses (most visible with emoji and CJK text).
  • Text generation crash on vision models — sending a text-only message to a vision-capable model no longer crashes the generator. Both text and image paths now route through the multimodal processor correctly.
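
The stranded-glyph fix is an instance of a general streaming problem: a token can end in the middle of a multi-byte UTF-8 character, so decoding each chunk on its own produces U+FFFD. The sketch below shows one common remedy, an incremental decoder that holds back incomplete bytes until the rest arrives. It uses Python's standard `codecs` module purely for illustration; `stream_text` is a hypothetical helper, and how Cloaked implements the fix is not stated here.

```python
import codecs
from typing import Iterable, Iterator


def stream_text(chunks: Iterable[bytes]) -> Iterator[str]:
    """Decode byte chunks to text, buffering incomplete UTF-8 sequences."""
    decoder = codecs.getincrementaldecoder("utf-8")(errors="replace")
    for chunk in chunks:
        # The decoder keeps a trailing partial character in its internal buffer.
        text = decoder.decode(chunk)
        if text:
            yield text
    # Flush at end of stream; a truly truncated character becomes U+FFFD here.
    tail = decoder.decode(b"", final=True)
    if tail:
        yield tail


# A 4-byte emoji split across two simulated token chunks.
emoji = "🙂".encode("utf-8")
chunks = [b"Hi " + emoji[:2], emoji[2:] + b"!"]

# Naive approach: decode each chunk independently.
naive = "".join(c.decode("utf-8", errors="replace") for c in chunks)
# Incremental approach: carry partial bytes over to the next chunk.
streamed = "".join(stream_text(chunks))
```

Here `naive` contains replacement characters where the emoji was split, while `streamed` reassembles the emoji intact, at the cost of delaying that character until its final byte arrives.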

v1.2.0 April 2, 2026

Ministral 3, Tappable Citations & Smarter Memory

Added

  • Ministral 3 3B — new edge-optimized model from Mistral AI with chat, coding, and multilingual capabilities (replaces Mistral 7B — smaller download, runs on more devices)
  • Tappable citation links — inline [1], [2] citations in web search responses now open the source URL directly in Safari
  • Configurable thinking budget — control how long models spend reasoning (256–4096 tokens) with a new Settings slider
  • Continuation generation — when the thinking budget is reached, the model transitions smoothly to producing an answer instead of cutting off
  • Download confirmation — new dialog when downloading models that may run slowly on your device

Improved

  • Smarter model recommendations — Qwen 3.5 2B is now the default for 8 GB devices (better speed/quality trade-off)
  • Leaner model library — removed legacy and redundant models for a focused catalog of 12 models from 5 AI labs
  • Memory management — RAM checks now run before loading models, preventing out-of-memory crashes
  • Onboarding — devices with less than 8 GB of RAM default to Qwen 3.5 0.8B for a smoother first experience
  • Source pills — numbered labels (“1 · example.com”) now map directly to inline citations

Fixed

  • Memory errors now surface specific messages instead of generic “generation failed”
  • Chat template detection for Ministral model IDs
  • Missing translated strings across all 10 non-English languages
  • 4B+ models now correctly show “May run slowly” on 8 GB iPhones

v1.1.0 March 14, 2026

Qwen 3.5 Series & Enhanced Reasoning

Added

  • Qwen 3.5 model series — 0.8B, 2B, 4B, and 9B variants with next-generation reasoning
  • Qwen 3 model series — 0.6B, 1.7B, 4B, and 8B with built-in thinking mode
  • Thinking mode toggle — switch between fast responses and chain-of-thought reasoning
  • Expanded model library — now 15 models from 5 AI labs

Improved

  • Model download reliability and resume support
  • Chat response streaming performance
  • Memory management for larger models
  • Model card UI with capability badges

Fixed

  • Occasional crash when switching models mid-conversation
  • Voice input not activating on first tap in some cases
  • Markdown rendering for nested code blocks

v1.0.0 January 15, 2026

Initial Release

Added

  • On-device AI chat powered by Apple’s MLX framework
  • 7 AI models at launch — Llama 3.2, Qwen 2.5, Phi-4 Mini, DeepSeek R1, Mistral 7B
  • Voice input — on-device speech recognition, no cloud processing
  • Text-to-speech output for hands-free use
  • Web search via DuckDuckGo — privacy-preserving, no search history
  • Projects — organize conversations with custom system prompts
  • Persistent memory — context that carries across conversations, stored locally
  • Markdown rendering with syntax-highlighted code blocks
  • 6 accent color themes with light and dark mode
  • Siri Shortcuts integration
  • Full offline support — works in airplane mode after model download
  • Zero accounts — no sign-up, no API keys, no subscriptions