Road signs hijack autonomous vehicles via prompt injection, Proc3D enables 400x faster 3D model editing, and Pinchwork launches a marketplace where AI agents hire each other.
NYC decommissions its hallucinating AI chatbot, VulnLLM-R identifies zero-day vulnerabilities via specialized reasoning, and Julie Zero debuts as a screen-aware desktop assistant powered by Llama 4.
Google DeepMind unveils Project Genie for interactive world simulation, Anthropic identifies disempowerment patterns in LLM usage, and Muna transpiles Python AI inference into C++.
Arcee AI launches the 400B Trinity Large MoE model, a "team of rivals" multi-agent architecture intercepts 90% of LLM errors, and TuringDB outperforms Neo4j by 200x for GraphRAG.
Mistral launches the Devstral 2-powered Vibe 2.0 coding agent, HetGPU enables binary compatibility across diverse GPU vendors, and DeepSeek-OCR 2 introduces a Visual Causal Flow architecture.
Qwen3-Max-Thinking achieves GPT-5.2 performance parity, ESA’s Meerkat system identifies imminent asteroid impacts, and Gemini Flash outperforms frontier LLMs in drone piloting benchmarks.
Beni AI offers FaceTime-style calls with an AI companion, RL with Verifiable Rewards improves long-form story generation, and R3-Engine achieves 117 tokens/s on a single CPU core with 1.58-bit LLM inference.
Comma openpilot brings open-source driver assistance to 325 vehicle models, TTT-Discover beats humans in the TriMul competition, and Nvidia debuts VibeTensor as the first deep learning framework 100% generated by AI.
Comma.ai's openpilot brings open-source driver assistance to over 325 vehicles, an automated executor grounds LLM research ideas with GPU experiments, and rtk reduces LLM token usage by 60-90%.
LLMs compose APIs via `exec_bash`, LLMs decode Jabberwocky text through pattern matching, and BrowserOS runs AI agents natively in a Chromium fork.
eBay bans agentic "buy for me" bots, DiffRatio slashes diffusion GPU memory by 50%, and yolo-cage sandboxes coding agents to prevent secret exfiltration.
Gemini 3 Flash dominates game-theory benchmarks through institutional deception, repeating prompts twice improves LLM accuracy, and Open Coscientist automates scientific hypothesis generation.
West Midlands police chief resigns over a Copilot hallucination, Homunculus introduces self-rewriting plugins for Claude Code, and DiffusionBlocks enables memory-efficient block-wise training.
Tauformer reduces KV-cache overhead by 50% using topological attention, Verbalized Sampling boosts LLM diversity by 2.1x, and VAM Seek × AI enables 30-minute video analysis for just $0.003.
AI insiders launch Poison Fountain to corrupt training data, Video-to-Grid slashes video analysis expenses by 600x, and RTX 5090s enable private LLM inference at 200x lower cost than APIs.
Black Forest Labs releases FLUX.2 [Klein] for sub-second image generation, researchers predict a potential 2032 lunar asteroid impact, and Burn provides a high-performance deep learning framework for Rust.
Raspberry Pi launches an 8GB AI HAT+ for local LLMs, researchers define "promptware" as a multi-step malware threat, and tldraw pauses contributions to combat AI slop.
Furiosa's RNGD server delivers 3.5x efficiency over H100s, Google Gemini helps prove advanced mathematical theorems, and Curl ends its bug bounty program due to a flood of AI submissions.
Signal warns that agentic AI is a surveillance risk, MacPrompt jailbreaks T2I models with cross-lingual prompts, and SkyPilot orchestrates AI workloads across 20 clouds.
Google removes flawed AI health summaries, a two-line change delivers a 30% RAG boost, and TimeCapsuleLLM trains models exclusively on data from the 1800s.
Meta powers AI with 6.6 GW of nuclear power, the Confucius Code Agent (CCA) rivals commercial systems on SWE-Bench-Pro, and the Thiele Machine defines "insight" cost while subsuming Turing.
AI agents eliminate vendor lock-in by collapsing migration costs, jailbroken LLMs output near-verbatim copyrighted books, and Topic2Manim generates 3Blue1Brown-style educational videos.
AI autonomously solves Erdős problem #728, rude prompts surprisingly outperform polite ones in LLM accuracy, and Topic2Manim generates 3Blue1Brown-style educational videos.
IBM’s ‘Bob’ agent executes malware via prompt injection, MemoryGraft poisons LLM memory with fake successes, and Zeroshot CLI enables autonomous dev teams for Claude Code.
Notion AI faces an unpatched data exfiltration vulnerability, MemoryGraft research demonstrates persistent LLM agent compromise via poisoned experiences, and Polyharmonic Cascade enables deep learning without gradient descent.