Thursday — March 6, 2025

California programmer's radical AI group linked to murders, researchers teach Llama reasoning from Qwen, and IPEX-LLM boosts AI performance on Intel GPUs.

News

Writing an LLM from scratch, part 8 – trainable self-attention

The author is working through Sebastian Raschka's book "Build a Large Language Model (from Scratch)" and blogging about the process; this installment covers section 3.4, on implementing self-attention with trainable weights. The post first gives a high-level overview of how GPT-type decoder-only transformer language models work, including tokenization, embedding, and self-attention, and then turns to scaled dot-product attention, a key component of the self-attention mechanism.
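To make the idea concrete, here is a minimal sketch of scaled dot-product self-attention with separate query, key, and value weight matrices. It is not taken from the book or the blog post: the helper names, the tiny dimensions, and the random weights (which stand in for trained parameters) are all assumptions, and it omits the causal mask that a GPT-style decoder would add.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row):
    """Numerically stable softmax over one list of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(x, w_q, w_k, w_v):
    """Return (context vectors, attention weights) for token embeddings x."""
    # Project each token embedding into query, key, and value spaces.
    q, k, v = matmul(x, w_q), matmul(x, w_k), matmul(x, w_v)
    d_k = len(k[0])
    # Attention scores are query-key dot products, scaled by sqrt(d_k).
    k_t = [list(col) for col in zip(*k)]
    scores = matmul(q, k_t)
    weights = [softmax([s / math.sqrt(d_k) for s in row]) for row in scores]
    # Each context vector is a weighted sum of the value vectors.
    return matmul(weights, v), weights


def random_matrix(rows, cols, rng):
    """Hypothetical stand-in for trained weights or real embeddings."""
    return [[rng.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
x = random_matrix(4, 3, rng)  # 4 tokens, embedding size 3 (illustrative)
w_q, w_k, w_v = (random_matrix(3, 2, rng) for _ in range(3))
context, weights = self_attention(x, w_q, w_k, w_v)
```

Each row of `weights` sums to 1, so every context vector is a convex combination of the value vectors.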

They wanted to save us from a dark AI future. Then six people were killed

Ziz, a 31-year-old computer programmer, faked her own death by drowning in California and went underground, only to be arrested in Maryland and charged with trespassing and illegal transportation of a firearm. Ziz was the central figure in a group of highly educated, militant vegans, many of whom were transgender, who shared a radical philosophy that took abstract questions from AI research to extreme conclusions, resulting in a string of violent incidents, including multiple murders, that have left at least six people dead.

Expanding AI Overviews and Introducing AI Mode

Google is expanding its AI Overviews feature, which is now used by over a billion people, and introducing a new experimental AI Mode in Google Search. The AI Overviews feature is being upgraded with Gemini 2.0, allowing it to provide faster and higher-quality responses to more complex questions, and will be rolled out to more users, including teens.

Dear Student: Yes, AI is here, you're screwed unless you take action

A student emailed anonymously for advice after using an AI tool called Composer to complete a complex coding task, an experience that left them feeling threatened by the potential for AI to replace human software engineers. The respondent advises the student to act rather than be discouraged, citing the cyclical nature of the software development industry and the importance of being a high-autonomy person who can adapt to changing circumstances.

Superintelligence Strategy

The rapid advancement of AI is transforming national security, and the potential development of superintelligence (AI that surpasses human capabilities) requires a coherent strategy to navigate this new era. A proposed three-part framework consists of deterrence through Mutual Assured AI Malfunction (MAIM), nonproliferation to prevent rogue actors from accessing weaponizable AI capabilities, and competitiveness through bolstering economies and militaries with AI.

Research

Cognitive Behaviors That Enable Self-Improving Reasoners

Researchers have found that some language models, such as Qwen, improve more under reinforcement learning because they already exhibit cognitive behaviors such as verification and backward chaining. Priming other models, such as Llama, with examples containing these reasoning behaviors yields substantial improvements and lets them match Qwen's performance, which highlights the importance of initial reasoning behaviors for a model's capacity to improve.

Hilbert 6th problem: derivation of fluid equations via Boltzmann kinetic theory

The paper derives the fundamental equations of fluid mechanics, including the compressible Euler and incompressible Navier-Stokes-Fourier equations, from the interactions of hard sphere particles undergoing elastic collisions. This achievement resolves a key aspect of Hilbert's sixth problem by linking Newton's laws to fluid equations through Boltzmann's kinetic theory, building on previous work to derive Boltzmann's equation in 2D and 3D spaces.

Evolutionary Multi-Agent Reinforcement Learning in Group Social Dilemmas

Reinforcement learning (RL) can be unpredictable in complex environments, particularly when multiple agents learn simultaneously. This matters in Public Goods Games, where agents must cooperate to achieve a common goal. The researchers studied how learning parameters affect cooperation levels among RL agents, finding that evolutionary pressures can select for varying levels of exploration and identifying conditions that influence cooperation in social dilemmas.
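For readers unfamiliar with the setting, the sketch below computes payoffs in the textbook linear Public Goods Game. It is not the paper's environment: the function name, the endowment of 1.0, and the multiplier of 2.0 are illustrative assumptions.

```python
def public_goods_payoffs(contributions, endowment=1.0, multiplier=2.0):
    """Payoffs in a standard linear public goods game (illustrative parameters)."""
    n = len(contributions)
    # The pooled contributions are multiplied and split equally among all agents.
    share = multiplier * sum(contributions) / n
    # Each agent keeps whatever it did not contribute, plus its share of the pool.
    return [endowment - c + share for c in contributions]


# Three cooperators contribute everything; one defector contributes nothing.
payoffs = public_goods_payoffs([1.0, 1.0, 1.0, 0.0])
print(payoffs)  # [1.5, 1.5, 1.5, 2.5]
```

The defector ends up with the highest payoff even though the group does best when everyone contributes, which is the tension the learning agents have to resolve.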

Evaluating Intelligence via Trial and Error

The Survival Game framework evaluates intelligence based on the number of failed attempts in a trial-and-error process, with fewer failures indicating higher intelligence, and is used to assess the capabilities of existing AI systems. Current AI systems are far from achieving the Autonomous Level of intelligence, particularly in complex tasks, and scaling up current technologies to reach this level would be impractically costly, highlighting the need for a deeper understanding of task mechanisms to develop more effective AI systems.
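The core measurement, counting failures before the first success, can be sketched in a few lines. This is only an illustration of that counting idea, not the paper's protocol: `count_failures`, the `attempt` callback, and the `max_attempts` cap are hypothetical names introduced here.

```python
def count_failures(attempt, max_attempts=1000):
    """Count failed attempts before the first success.

    `attempt` is a hypothetical zero-argument callable that returns True on
    success. If no attempt succeeds, the result equals `max_attempts`.
    """
    failures = 0
    for _ in range(max_attempts):
        if attempt():
            return failures
        failures += 1
    return failures


# Toy trial: two failures, then a success.
outcomes = iter([False, False, True])
failures = count_failures(lambda: next(outcomes))
print(failures)  # 2
```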

Training LLMs with Order-Centric Augmentation

Large language models (LLMs) struggle with logical reasoning due to their reliance on fixed sequential patterns, but a new order-centric data augmentation framework can help address this issue by introducing random premise shuffling and valid step reordering. This framework enables LLMs to develop a more flexible and generalized reasoning process, resulting in significantly enhanced reasoning performance and adaptability to diverse logical structures.
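A minimal sketch of the premise-shuffling half of this idea follows. The data layout, the function name, and the example premises are assumptions made for illustration; the paper's actual pipeline, including valid step reordering, is not reproduced here.

```python
import random


def shuffle_premises(premises, conclusion, n_variants=3, seed=0):
    """Create training variants that differ only in the order of the premises."""
    rng = random.Random(seed)  # seeded so the augmentation is reproducible
    variants = []
    for _ in range(n_variants):
        order = list(premises)  # copy so the original example is left untouched
        rng.shuffle(order)
        variants.append({"premises": order, "conclusion": conclusion})
    return variants


premises = [
    "All birds have wings.",
    "A sparrow is a bird.",
    "Anything with wings can flap.",
]
variants = shuffle_premises(premises, "A sparrow can flap.")
```

Because logical validity does not depend on premise order, every variant supports the same conclusion while exposing the model to a different surface ordering.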

Code

DeepSeek-R1-671B-Q4_K_M with 1 or 2 Arc A770 on Xeon

IPEX-LLM is an LLM acceleration library for Intel GPU, NPU, and CPU, providing seamless integration with various frameworks and models, including HuggingFace transformers, LangChain, and LlamaIndex, with support for 70+ optimized models and state-of-the-art LLM optimizations. The library offers features such as low-bit support, pipeline parallel inference, and finetuning capabilities, with demos available for running local LLMs on Intel Core Ultra iGPU, NPU, and Arc GPU.

Show HN: Free Pilot – AI Autocomplete plugin for Vim

Free Pilot is a fast, lightweight, and configurable AI completion plugin for Vim/Neovim that offers GitHub Copilot-like functionality for free or at a fraction of the cost, supporting both local and cloud models. It provides features such as real-time AI-powered code completion, support for local models via Ollama and cloud models via OpenRouter, and customizable behavior, making it a flexible and affordable alternative to expensive subscription-based services.

Show HN: Diff filtering, text mapping, and windowed transforms for LLM apps

The chopdiff library is a tool for transforming text documents, particularly for large language model (LLM) applications, allowing for parsing, diffing, and transforming text at the level of words, sentences, paragraphs, and "chunks". It provides features such as filtering diffs, backfilling information, and windowed transforms, with minimal dependencies, and can be used for tasks like inserting paragraph breaks, spell checking, and editing text while enforcing specific constraints.

Can LLMs accurately evaluate their own confidence?

An experiment compared a large language model's (LLM's) self-evaluated confidence with its actual confidence, derived from the LogP of a yes/no token, and found that LLMs tend to vastly underestimate their own confidence. The goal is to determine whether LogP can serve as a metric for detecting when a model is not confident in a particular decision, especially in high-stakes situations, by analyzing the correlation between self-evaluated and LogP-derived confidence.
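One plausible way to turn token log probabilities into a confidence score is sketched below. The renormalization over just the two answer tokens is an assumption made here, not necessarily what the experiment did, and `logprob_confidence` is a hypothetical helper name.

```python
import math


def logprob_confidence(logp_yes, logp_no):
    """Confidence in the more likely answer, renormalized over yes/no only."""
    p_yes = math.exp(logp_yes)
    p_no = math.exp(logp_no)
    # Assumption: ignore probability mass on every token other than yes/no.
    return max(p_yes, p_no) / (p_yes + p_no)


# Example with made-up log probabilities for the "yes" and "no" tokens.
confidence = logprob_confidence(math.log(0.9), math.log(0.1))
```

A score near 0.5 means the model is close to undecided between the two answers, while a score near 1.0 means it strongly prefers one of them.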

Show HN: Tablepilot – Open-source CLI tool designed to generate tables using AI

Tablepilot is a CLI tool that uses AI to generate tables, allowing users to create tables with missing columns that are automatically filled in, and also provides features such as fine-grained context control and flexible column data generation strategies. The tool can be installed and used to generate tables based on a TOML config file and a table schema JSON file, with options to export data as CSV files and switch between different LLMs and models.