Digest — May 16, 2026

May 16, 2026 AI-DIGEST BY MAURO SICARD

From Product Hunt

Product Hunt

AI video directors, persistent agent memory, color E-ink hardware, and a TUI Slack over SSH.

Loova Agents: AI Director for Cinematic Video

Describe your idea in everyday words and Loova's agents plan, direct, and generate your film. An infinite canvas workspace keeps the whole project in one place. Supports scroll-stopping ads, short films, and product videos.

▲ 320 · loova.ai

Agentmemory: Persistent Memory for Coding Agents

Gives Claude Code, Codex, and other coding agents long-term memory that persists across sessions. It remembers your preferences, patterns, and project context so agents don't start from scratch every time.

▲ 219 · agent-memory.dev

Wring: Developer Tools, One Menu Click Away

A macOS menu bar app that puts frequently used developer tools a single click away: JSON formatting, regex testing, hashing, encoding, timestamps, and more, without opening a browser tab.

▲ 125 · getwring.app

M5Stack PaperColor: 4" Color E-Ink Dev Board

An ESP32-S3 dev board with a 4-inch color E-ink display and audio I/O. Programmable and low-power, suited to custom dashboards, smart home displays, or ambient computing projects.

▲ 115 · shop.m5stack.com

sshoosh: TUI Slack Replacement over SSH

A minimal TUI chat app that runs over SSH. No web browser, no Electron, no accounts. Just connect and talk. Self-hostable for teams that live in the terminal.

▲ 5 · puemos.github.io

From Reddit

r/LocalLLaMA r/singularity r/openai

Multi-Token Prediction finally lands in llama.cpp, Mythos appears on Vertex AI, and Mistral's founder tells French Parliament that engineers no longer write code.

Long-awaited Multi-Token Prediction (MTP) support has been merged into llama.cpp master. It enables speculative decoding that uses a model's MTP heads as the draft, significantly boosting generation speed for supported models. Early benchmarks show Qwen3.6-27B jumping from 7.6 to 16.2 t/s generation (+112%) on Strix Halo. The 35B variant shows mixed results depending on configuration. Unsloth has started publishing MTP quants.

github.com · r/LocalLLaMA
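
For readers new to speculative decoding, here is a toy sketch of the greedy accept-and-verify loop. It is not llama.cpp's implementation: `target` and `draft` are hypothetical next-token functions standing in for the full model and its MTP draft heads, and a real engine verifies all drafted tokens in one batched forward pass rather than a Python loop.

```python
from typing import Callable, List

Token = int
# Hypothetical interface: given the tokens so far, return the greedy next token.
NextToken = Callable[[List[Token]], Token]


def speculative_step(target: NextToken, draft: NextToken,
                     context: List[Token], k: int) -> List[Token]:
    """Run one greedy speculative step and return the tokens it accepts."""
    # 1. The cheap draft proposes k tokens autoregressively.
    proposed: List[Token] = []
    ctx = list(context)
    for _ in range(k):
        tok = draft(ctx)
        proposed.append(tok)
        ctx.append(tok)

    # 2. The target checks each proposal in order (batched in a real engine).
    accepted: List[Token] = []
    ctx = list(context)
    for tok in proposed:
        expected = target(ctx)
        if expected != tok:
            # First mismatch: keep the target's token and drop the rest.
            accepted.append(expected)
            return accepted
        accepted.append(tok)
        ctx.append(tok)

    # 3. Every proposal matched, so the target contributes one bonus token.
    accepted.append(target(ctx))
    return accepted


def generate(target: NextToken, draft: NextToken,
             prompt: List[Token], n_tokens: int, k: int = 4) -> List[Token]:
    """Generate n_tokens after the prompt using repeated speculative steps."""
    out = list(prompt)
    while len(out) - len(prompt) < n_tokens:
        out.extend(speculative_step(target, draft, out, k))
    return out[:len(prompt) + n_tokens]
```

With greedy decoding the output is identical to what the target alone would produce. The speedup comes from accepting several tokens per target pass whenever the draft guesses correctly, which is also why gains vary by model and configuration.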

Claude Mythos Spotted in Google Vertex AI

Update: Claude Mythos Preview has appeared in Google Cloud's Vertex AI console, suggesting broader availability to Glasswing partners is imminent. The model remains restricted to defensive security use cases, but API access at $25/$125 per million tokens is now live for approved organizations.

Mistral Founder to French Parliament: "Engineers No Longer Write Code"

Arthur Mensch testified: "Today, engineers at Mistral no longer write a single line of code. You're no longer a craftsman, you're a manager. You ask agents to write the code for you." His testimony follows similar statements from Airbnb (60% AI code), Shopify (50%), and Google (75%).

Microsoft AI Chief: 18 Months Until All White-Collar Work Automated

Mustafa Suleyman told Fortune that within 18 months, AI will be capable of automating virtually all white-collar knowledge work. It is among the boldest timeline predictions yet from a major tech executive.

FutureSim: GPT-5.5 Leads AI Future Prediction Benchmark

Max Planck Institute researchers released FutureSim, an environment where agents predict real-world events. GPT-5.5 (via Codex) leads at 25% accuracy, followed by Opus 4.6 at 20%. On Polymarket-overlapping questions such as Super Bowl LX, GPT-5.5 achieved a Brier skill score of 0.90 (1.0 is a perfect score), beating market consensus.
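
For context on the metric, here is a minimal sketch of the standard Brier skill score for binary events. The function names and sample numbers are illustrative assumptions; the digest's source does not say which reference forecast FutureSim scores against.

```python
from typing import Sequence


def brier_score(probs: Sequence[float], outcomes: Sequence[int]) -> float:
    """Mean squared error between forecast probabilities and 0/1 outcomes."""
    if not probs or len(probs) != len(outcomes):
        raise ValueError("probs and outcomes must be non-empty and equal length")
    return sum((p - o) ** 2 for p, o in zip(probs, outcomes)) / len(probs)


def brier_skill_score(forecast: Sequence[float],
                      reference: Sequence[float],
                      outcomes: Sequence[int]) -> float:
    """Skill relative to a reference: 1.0 is perfect, 0.0 matches the reference."""
    ref = brier_score(reference, outcomes)
    if ref == 0.0:
        raise ValueError("reference forecast is already perfect")
    return 1.0 - brier_score(forecast, outcomes) / ref


# Made-up example: two events, one happened (1) and one did not (0).
outcomes = [1, 0]
forecast = [0.9, 0.1]    # a confident, correct forecaster
reference = [0.5, 0.5]   # hypothetical baseline: a coin flip on every question
print(brier_skill_score(forecast, reference, outcomes))  # approximately 0.96
```

A score above zero means the forecaster beat the reference, and a negative score means it did worse, so the choice of reference matters as much as the headline number.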

Nous Research: Token Superposition Training Speeds Up Pre-Training 2.5x

A new method reduces LLM pre-training wall-clock time by up to 2.5x at fixed compute without changing architecture, optimizer, tokenizer, or data. Tested at the 10B-A1B MoE scale on 512 H200s. No custom kernels needed. Paper: arxiv.org/abs/2605.06546.

Qwen3.6-35B-A3B Hits 24.6% on Terminal-Bench 2.0

Alibaba's sparse MoE model (3B active parameters) now scores above Gemini 2.5 Pro running in Gemini CLI (19.6%) on this hard agentic coding benchmark. The 9B variant scored 9.2%, showing that sub-10B models now register on serious agent tasks.

From Reddit

r/StableDiffusion r/comfyui

Flux goes real-time on webcam, DramaBox TTS gets LoRA support, and a new ComfyUI node generates synchronized foley sound from video.

Flux Real-Time Pipeline: Major Updates for Webcam Streaming

The Flux.2-Klein real-time webcam streaming pipeline received significant community updates one week after launch. New features include multi-GPU support, ControlNet integration, face swap, lower latency, and GGUF model support. It achieves real-time streaming on RTX 40/50 series GPUs.

ComfyUI-DramaBox: Now with LoRA Support + Voice Cloning

The ComfyUI node for DramaBox (LTX-based TTS) now supports custom LoRAs for voice cloning. The companion tool Voice-Clone-Studio-DramaBox can generate LoRAs from audio samples, so you can train and use custom voices entirely within ComfyUI.