Monday February 17, 2025

Google scraps its AI pledges, expressive movements enhance robot design, ZeroBench challenges LMM visual understanding, and a new notebook makes DIY GPT-style models accessible.

News

“A calculator app? Anyone could make that”

Google hired renowned programmer Hans-J. Boehm to develop a calculator app, which proved to be a much more complex task than expected due to the limitations of floating-point numbers and the need for precise mathematical expressions. Boehm's solution involved using recursive real arithmetic, which can handle expressions involving irrational numbers like π, but even this approach required further refinement to provide a good user experience and exact answers for simple calculations like "1-1".
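The floating-point limitation mentioned above is easy to reproduce. The sketch below is only an illustration in Python, not Boehm's implementation: it uses the standard-library `fractions` module as a stand-in for exact arithmetic, which works for rationals but, unlike recursive real arithmetic, cannot represent irrational numbers like π.

```python
from fractions import Fraction

# Binary floating point cannot represent 0.1, 0.2, or 0.3 exactly,
# so rounding errors surface even in simple arithmetic.
float_sum = 0.1 + 0.2
print(float_sum == 0.3)   # False
print(float_sum - 0.3)    # a tiny non-zero residue

# Exact rational arithmetic avoids the problem for fractions,
# but it has no way to represent irrational values such as pi.
exact_sum = Fraction(1, 10) + Fraction(2, 10)
print(exact_sum == Fraction(3, 10))   # True
print(exact_sum - Fraction(3, 10))    # 0
```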

Blocklist for AI Music on YouTube

This is a list of 124 YouTube channels that were blocked via a context menu between January 9, 2025, and January 18, 2025. The channels appear to be primarily music-related, with many focused on lo-fi, chill, and city pop genres.

Google defends scrapping AI pledges and DEI goals in all-staff meeting

Google's executives announced that the company will be sunsetting its diversity initiatives and defended dropping its pledge against building artificial intelligence for weaponry and surveillance, citing the need to comply with evolving legal directions and participate in geopolitical discussions. The company's chief legal officer, Kent Walker, stated that it would be "good for society" for Google to be part of these conversations, despite employee concerns and criticism from worker activist groups.

New Junior Developers Can’t Actually Code

The author is concerned that the increasing reliance on AI tools like Copilot and GPT among junior developers is leading to a lack of deep understanding of the code they're writing, as they're able to produce working code quickly without fully grasping the underlying concepts. To address this issue, the author suggests using AI with a learning mindset, engaging in discussions with other developers, and building things from scratch to gain a deeper understanding of the code and the development process.

Elon Musk's terrifying vision for AI

Elon Musk has developed a Large Language Model called Grok that can spread propaganda and influence people's attitudes. This is a cause for concern because it can be used to manipulate public opinion without people realizing it. The model has been shown to be effective in shifting people's attitudes, even when they are warned about its potential bias, and its use could have significant consequences for democracy and free speech.

Research

Infrastructure for AI Agents

As AI systems become increasingly capable of interacting with their environments, there is a growing need for tools to manage their risks and benefits, particularly in terms of accountability and interaction with existing institutions. The concept of "agent infrastructure" is proposed to address this gap, comprising technical systems and protocols that mediate and influence AI agents' interactions with their environments, with three key functions: attributing actions, shaping interactions, and detecting and remedying harm.

Who cares about mathematics education?

Teaching and outreach have become increasingly important in mathematics departments and organizations over the past two decades, but more work is needed to fully include mathematics education and educators. Mathematics educators experience varying levels of support in mathematics departments: environments range from fractious and fragile to fertile, and these differences affect their work and inclusion.

Expressive and Functional Movement Design for Non-Anthropomorphic Robot

Robots can interact more naturally with humans by incorporating expressive qualities like intention, attention, and emotions into their movement design, alongside traditional functional considerations. A study on a lamp-like robot found that expression-driven movements significantly enhanced user engagement and perceived robot qualities, particularly in social-oriented tasks, compared to function-driven movements.

SPIRE: Semantic Prompt-Driven Image Restoration (2024)

SPIRE is a framework that uses natural language as an interface to control the image restoration process for tasks such as denoising and super-resolution. The framework leverages content-related prompts and language-based quantitative specifications to enhance semantic alignment and restoration strength, and achieves superior restoration performance compared to existing methods.

ZeroBench: An Impossible Visual Benchmark for Contemporary LMMs

Large Multimodal Models (LMMs) struggle with spatial cognition and image interpretation, yet still achieve high scores on popular visual benchmarks. To address this, a new benchmark called ZeroBench has been introduced, consisting of 100 challenging questions that contemporary LMMs are unable to answer, with the goal of encouraging progress in visual understanding.

Code

Want to Train Your Own GPT-Style Model? – Step-by-Step Notebook

This repository contains a Jupyter Notebook that trains a small GPT-style language model from scratch using PyTorch, covering topics such as tokenization, positional encoding, and self-attention. The notebook provides a step-by-step guide to building and training a minimal GPT-style decoder-only transformer model, allowing users to experiment with fine-tuning and inference.
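To give a feel for the self-attention step, here is a minimal sketch of single-head causal attention in plain Python. It is not taken from the notebook, which uses PyTorch; the function names are hypothetical, and the learned query/key/value projections of a real model are omitted, so the same vectors are passed in for all three roles.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def causal_self_attention(queries, keys, values):
    """Scaled dot-product attention where position t sees only positions 0..t."""
    d_k = len(keys[0])
    outputs = []
    for t, q in enumerate(queries):
        # Causal mask: score only the current and earlier positions.
        scores = [
            sum(qi * ki for qi, ki in zip(q, keys[j])) / math.sqrt(d_k)
            for j in range(t + 1)
        ]
        weights = softmax(scores)
        # Weighted sum of the visible value vectors.
        outputs.append([
            sum(w * values[j][c] for j, w in enumerate(weights))
            for c in range(len(values[0]))
        ])
    return outputs

# Three toy token vectors; a real model would first apply learned projections.
x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
out = causal_self_attention(x, x, x)
print(out[0])  # the first token can only attend to itself
```

Because of the mask, changing a later token never alters the output at an earlier position, which is what lets a decoder-only model be trained to predict the next token.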

Simple RAG pipeline. Dockerized open source

Legit-RAG is a modular Retrieval-Augmented Generation system built with FastAPI, Qdrant, and OpenAI, following a 5-step workflow: query routing, query reformulation, context retrieval, completion check, and answer generation. The system is designed for easy extension and modification, with support for multiple LLM providers, vector databases, and document management, and can be run using Docker Compose or directly with a Python debugger.
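The five-step workflow can be sketched as a chain of small functions. Everything below is a hypothetical stand-in rather than Legit-RAG's actual code: the function names are invented, word overlap replaces vector search in Qdrant, and a string template replaces the LLM call.

```python
# Toy in-memory corpus; a real deployment would query a vector database.
DOCS = [
    "Qdrant is a vector database.",
    "FastAPI is a Python web framework.",
]

def tokenize(text):
    """Lowercase a string and split it into a set of bare words."""
    return {word.strip(".,?!").lower() for word in text.split()}

def route_query(query):
    """Step 1, query routing (placeholder rule: reject empty queries)."""
    return "rag" if query.strip() else "reject"

def reformulate_query(query):
    """Step 2, query reformulation (placeholder: normalise case and spacing)."""
    return " ".join(query.lower().split())

def retrieve_context(query, docs, top_k=1):
    """Step 3, context retrieval by word overlap instead of embeddings."""
    q_words = tokenize(query)
    ranked = sorted(docs, key=lambda d: len(q_words & tokenize(d)), reverse=True)
    return [d for d in ranked[:top_k] if q_words & tokenize(d)]

def is_context_sufficient(context):
    """Step 4, completion check (placeholder: any retrieved document counts)."""
    return len(context) > 0

def generate_answer(query, context):
    """Step 5, answer generation (placeholder for an LLM call)."""
    return f"Based on: {context[0]}"

def answer(query, docs=DOCS):
    if route_query(query) == "reject":
        return "Cannot answer an empty query."
    reformulated = reformulate_query(query)
    context = retrieve_context(reformulated, docs)
    if not is_context_sufficient(context):
        return "Not enough context to answer."
    return generate_answer(reformulated, context)

print(answer("What is Qdrant?"))
```

Keeping each step behind its own function is one way to get the kind of modularity the summary describes: a different retriever or LLM provider only has to replace one function.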

Describe – Describe your codebase to an LLM

Describe is a command-line tool that scans a directory, embeds file contents, and generates a structured Markdown file. A .describeignore file gives precise control over which files and directories are included. The tool can be installed via Homebrew, by downloading a precompiled binary, or with Go. It takes an input directory and writes a Markdown file, with options for a custom output file and ignore file.

Show HN: Mangosqueezy AI agent for finding and onboarding affiliates

Mangosqueezy is an affiliate marketing platform currently under active development, built with technologies including React, TypeScript, and Next.js. The project is not yet ready for use, but its architecture and services, including hosting on Supabase and Vercel and security measures such as Gitleaks, are being actively worked on and documented.

Show HN: System-info-now – Aggregate system debug data for LLM troubleshooting

System-info-now is a Python utility that aggregates comprehensive system information and exports it to JSON, giving a snapshot of the system's current state in a standardized format designed for feeding system context into Large Language Models (LLMs). The tool gathers real-time data about the operating system, hardware, running processes, and environment variables, and the resulting JSON can also be used for system diagnostics, environment documentation, and troubleshooting.
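The general idea can be approximated with the Python standard library alone. The sketch below is an assumption about the approach, not System-info-now's code: the function name and JSON keys are made up, and it skips running processes, which would need extra tooling.

```python
import json
import os
import platform
import sys

def collect_system_info():
    """Gather a small, JSON-serialisable snapshot of the current machine."""
    return {
        "os": {
            "system": platform.system(),
            "release": platform.release(),
            "machine": platform.machine(),
        },
        "python": {
            "version": platform.python_version(),
            "executable": sys.executable,
        },
        "cpu_count": os.cpu_count(),
        # Names only: values may contain secrets that should not reach an LLM.
        "env_var_names": sorted(os.environ.keys()),
    }

snapshot = json.dumps(collect_system_info(), indent=2)
print(snapshot)
```

Exporting only environment variable names is a deliberate choice in this sketch, since the snapshot is meant to be pasted into a model prompt.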