Thursday January 9, 2025

Apple's AI enhances scam messages, Nvidia's AI chips outpace Moore's Law, and LLM Catcher decodes Python errors with large language models.

News

Apple's new AI feature rewords scam messages to make them look more legit

Apple's new AI-powered "Apple Intelligence" feature is rephrasing and prioritizing scam messages, making them appear more legitimate and increasing the risk of users falling for them. The feature, which aims to summarize and prioritize notifications, has been criticized for its inability to distinguish between real and fake messages, with experts warning that it may lead to more people losing money to scams due to their trust in Apple's summaries.

Nvidia CEO says his AI chips are improving faster than Moore's Law

Nvidia CEO Jensen Huang claims that his company's AI chips are improving at a rate faster than Moore's Law, which has driven computing progress for decades. Huang states that Nvidia's latest data center superchip is 30-40x faster at running AI inference workloads than its previous generation, and he expects the cost of AI models to decrease over time as computing capability increases.
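To put the claim in perspective, here is a rough back-of-the-envelope comparison, assuming the common "2x every two years" reading of Moore's Law and treating one Nvidia chip generation as roughly comparable (both assumptions are ours, not from the article):

```python
# Rough comparison of Moore's Law pacing vs. the claimed generational jump.
# Assumptions (not from the article): Moore's Law as 2x performance every
# 2 years, and one chip generation as roughly a comparable interval.
import math

moores_law_factor = 2 ** (2 / 2)   # 2x over a 2-year generation
claimed_factor = 30                # low end of the claimed 30-40x speedup

# How many Moore's Law doublings the claimed jump is equivalent to
doublings = math.log2(claimed_factor)
print(f"Moore's Law per generation: {moores_law_factor:.0f}x")
print(f"Claimed jump: {claimed_factor}x ≈ {doublings:.1f} doublings")
```

Even at the low end, a 30x jump packs roughly five Moore's Law doublings into a single generation.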

Cloud AI for Video Games Is Dead on Arrival, On-Device Is the Future

The current approach to using Generative AI in video games, which relies on cloud-based services, is flawed due to high costs and lack of control for developers. Studio Atelico has developed an alternative, the Atelico AI Engine, which runs on the player's device, is modular and cost-effective, and gives developers fine-grained control, making it possible for every game developer to integrate GenAI into their games.

Show HN: Flows – Google Colab meets Notion, designed for AI workflows

Flows ships with 13 templates covering use cases such as content generation, sales, and research and analysis, built from components like chatbots, RAG pipelines, and API calls. The templates, contributed by different users, range from automating research workflows and generating summaries to analyzing sentiment and providing personalized assistance.


Show HN: Weco AI Functions – a text-to-agent dev tool

Weco AI Functions is a platform that enables users to build and deploy AI features in seconds, with a simple 3-step process of describing, verifying, and deploying the function. The platform offers key features such as structured output, A/B testing, observability, no-code prototyping, and web search, making it easy for users to create and integrate AI functions into their projects.

Research

Experimental evidence a photon can spend a negative amount of time in an atom

The group delay of a light pulse traversing a material is related to the time photons spend as atomic excitations, even when the delay is negative near atomic resonance. Experimental results using the cross-Kerr effect to measure atomic excitation times caused by transmitted photons confirm a recent theoretical prediction, suggesting that negative group delay values have physical significance.

rStar-Math: Small LLMs Can Master Math Reasoning with Self-Evolved Deep Thinking

rStar-Math is a system that enables small language models to achieve state-of-the-art math reasoning capabilities, rivaling or surpassing larger models like OpenAI o1, through the use of Monte Carlo Tree Search and innovative training methods. The system has demonstrated impressive results, improving the performance of smaller models on math benchmarks and even solving a significant portion of problems from the USA Math Olympiad.
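At the heart of Monte Carlo Tree Search is a selection rule that balances exploiting promising reasoning steps against exploring rarely tried ones. A generic illustration of that rule (not the paper's implementation; the node statistics below are made up):

```python
# Minimal sketch of the UCT selection rule used in Monte Carlo Tree Search,
# the search strategy rStar-Math builds on. Generic illustration only.
import math

def uct_score(child_value, child_visits, parent_visits, c=1.41):
    """Upper Confidence Bound for Trees: exploit high-value children
    while still exploring rarely visited ones."""
    if child_visits == 0:
        return float("inf")  # always try unvisited children first
    exploit = child_value / child_visits
    explore = c * math.sqrt(math.log(parent_visits) / child_visits)
    return exploit + explore

# Pick the next reasoning step to expand among three candidate steps,
# given (total value, visit count) pairs from earlier rollouts:
children = {"step_a": (3.0, 5), "step_b": (1.0, 1), "step_c": (0.0, 0)}
parent_visits = sum(v for _, v in children.values())
best = max(children, key=lambda k: uct_score(*children[k], parent_visits))
# The unvisited "step_c" is selected first, then the less-visited "step_b"
# outranks the well-explored "step_a" despite its lower average value.
```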

Key-Value Memory in the Brain

Classical models of memory rely on similarity-based retrieval, but they don't account for the distinct computational demands of storage and retrieval. Key-value memory systems, on the other hand, separate representations for storage and retrieval, allowing for optimized storage fidelity and retrieval discriminability, and have implications for machine learning, psychology, neuroscience, and biology.
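The key-value idea maps directly onto the readout used by attention layers: separate vectors serve as addresses (keys) and contents (values), and retrieval is a similarity-weighted sum. A toy pure-Python sketch with arbitrary vectors:

```python
# Toy key-value memory: keys act as addresses optimized for retrieval
# discriminability, values hold the stored content, and readout is a
# similarity-weighted mixture -- the same pattern as attention.
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

keys   = [[1.0, 0.0], [0.0, 1.0]]    # addresses (used for retrieval)
values = [[5.0, 5.0], [-5.0, 0.0]]   # contents (what is stored)

def retrieve(query):
    weights = softmax([dot(query, k) for k in keys])
    return [sum(w * v[i] for w, v in zip(weights, values))
            for i in range(len(values[0]))]

out = retrieve([4.0, 0.0])  # query close to the first key
# out is dominated by the first value vector, [5.0, 5.0]
```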

LLMs for AGI

Generative AI systems, including large language models, have shown impressive capabilities in solving complex problems, but their cognitive abilities remain superficial and brittle, limiting their generalist capabilities. To achieve human-level general intelligence, foundational issues such as embodiment, symbol grounding, causality, and memory must be addressed, and this work discusses these concepts and surveys state-of-the-art approaches to implementing them in large language models.

Functorial String Diagrams for Reverse-Mode Automatic Differentiation (2021)

The calculus of string diagrams for monoidal categories has been enhanced with hierarchical features to capture closed monoidal structure. This new syntax is used to formulate an automatic differentiation algorithm for simply typed lambda calculus, which is proven sound, and implemented using a representation called hypernets, a class of hierarchical hypergraphs.
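For readers unfamiliar with the algorithm being formalized, reverse-mode AD can be sketched in a few lines without any of the categorical machinery: record each operation with its local derivatives, then replay the graph backwards.

```python
# Minimal tape-based reverse-mode AD. This sketch illustrates the
# algorithm the paper formalizes with string diagrams; it has none of
# the categorical structure, just operator recording and backpropagation.

class Var:
    def __init__(self, value, parents=()):
        self.value = value
        self.parents = parents   # (parent_var, local_gradient) pairs
        self.grad = 0.0

    def __add__(self, other):
        return Var(self.value + other.value, [(self, 1.0), (other, 1.0)])

    def __mul__(self, other):
        return Var(self.value * other.value,
                   [(self, other.value), (other, self.value)])

def backward(output):
    """Propagate d(output)/d(node) from the output to every input.
    A plain DFS suffices for this small expression; real systems walk
    the graph in reverse topological order."""
    output.grad = 1.0
    stack = [output]
    while stack:
        node = stack.pop()
        for parent, local in node.parents:
            parent.grad += local * node.grad
            stack.append(parent)

x, y = Var(3.0), Var(4.0)
z = x * y + x          # dz/dx = y + 1 = 5, dz/dy = x = 3
backward(z)
```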

Code

Show HN: Stagehand – an open source browser automation framework powered by AI

Stagehand is an AI web browsing framework that simplifies browser automation by providing a simple, extensible API on top of Playwright, letting users automate web tasks in natural language. It exposes three core APIs (act, extract, and observe) that interoperate with Playwright, making it easier to write durable, performant browser automation code.
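Stagehand itself is a TypeScript library; as a language-agnostic sketch (here in Python, with entirely hypothetical names), the three primitives amount to layering a model call over a browser page. The `llm` and `page` objects below are stand-in stubs, not Stagehand's actual API:

```python
# Conceptual sketch of the act / extract / observe pattern. All names
# are hypothetical; Stagehand's real (TypeScript) API differs.

class NaturalLanguagePage:
    def __init__(self, page, llm):
        self.page = page   # an underlying browser page (e.g. Playwright)
        self.llm = llm     # a function: prompt -> model response

    def act(self, instruction):
        """Turn 'click the login button' into a concrete browser action."""
        return self.llm(f"Given this page: {self.page!r}, "
                        f"perform: {instruction}")

    def extract(self, schema_description):
        """Pull structured data out of the page per a description."""
        return self.llm(f"From this page: {self.page!r}, "
                        f"extract: {schema_description}")

    def observe(self, question):
        """Ask what actions are possible on the current page."""
        return self.llm(f"On this page: {self.page!r}, "
                        f"answer: {question}")

# Stub usage: echo the prompt so the flow is visible without a real model.
demo = NaturalLanguagePage(page="<html>…</html>", llm=lambda p: p)
result = demo.act("click the 'Show HN' link")
```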

Show HN: Open-Source Computer Use AI Agent Powered by Llama

This project utilizes open-source large language models (LLMs) to control a secure cloud Linux computer powered by E2B Desktop Sandbox, allowing users to operate the computer via keyboard, mouse, and shell commands. To get started, users need to install prerequisites such as Python and git, obtain API keys, clone the repository, set environment variables, and run the web interface using poetry.
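The general shape of such a "computer use" agent is an observe-think-act loop: the model sees the current screen, proposes a keyboard, mouse, or shell action, the sandbox executes it, and the loop repeats. A runnable sketch with stand-in stubs (none of these names are the project's actual API):

```python
# Generic observe-think-act agent loop. All names are hypothetical
# stand-ins, not the project's or E2B's actual API.

def run_agent(model, sandbox, goal, max_steps=10):
    history = []
    for _ in range(max_steps):
        screen = sandbox.screenshot()                  # observe
        action = model.propose(goal, screen, history)  # think
        if action["type"] == "done":
            break
        sandbox.execute(action)                        # act
        history.append(action)
    return history

# Minimal stubs so the loop runs without a real sandbox or model.
class FakeSandbox:
    def screenshot(self): return "desktop"
    def execute(self, action): pass

class FakeModel:
    def __init__(self):
        self.steps = iter([{"type": "shell", "cmd": "ls"},
                           {"type": "done"}])
    def propose(self, goal, screen, history):
        return next(self.steps)

trace = run_agent(FakeModel(), FakeSandbox(), goal="list files")
```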

Show HN: pay-respects – RIP command errors and keep yourself in the flow

Pay Respects is a tool that suggests fixes to incorrect console commands by pressing F, providing blazing fast and accurate results with support for AI integration and modular customization. It can be installed through various package managers, pre-built binaries, or compiled from source using Cargo, with configuration options available for environment variables and AI settings.

LLM Catcher – Automated Python Debugging Using LLMs

LLM Catcher is a Python library that uses Large Language Models (LLMs) like Ollama or OpenAI to help debug code by decoding stack traces and providing insightful fixes. The library offers various features, including exception diagnosis, support for local and cloud-based LLMs, and flexible configuration options, making it a useful tool for streamlining the debugging process.
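The core idea can be sketched generically: install an exception hook that formats the traceback and hands it to a model for a plain-English diagnosis. The `ask_llm` function below is a stub, and LLM Catcher's real API and configuration differ:

```python
# Generic sketch of LLM-assisted exception diagnosis, not LLM Catcher's
# actual API. ask_llm() stands in for a call to Ollama, OpenAI, etc.
import sys
import traceback

def ask_llm(prompt):
    # Stub: a real implementation would send the prompt to an LLM.
    return f"Diagnosis of: {prompt.splitlines()[-1]}"

def diagnosing_excepthook(exc_type, exc_value, exc_tb):
    trace = "".join(traceback.format_exception(exc_type, exc_value, exc_tb))
    print(trace, file=sys.stderr)   # print the normal traceback first
    print(ask_llm(f"Explain this Python error and suggest a fix:\n{trace}"))

sys.excepthook = diagnosing_excepthook
```

With the hook installed, any uncaught exception prints its usual traceback followed by the model's explanation.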

Taming LLMs: A Practical Guide to LLM Pitfalls with OSS

The book "Taming LLMs" provides a practical guide to navigating the challenges and limitations of Large Language Models, offering concrete solutions and reproducible code examples to help engineers and technical leaders build effective LLM-powered applications. Through its chapters, the book covers key topics such as the "Evals Gap", structured output, input data management, safety, and preference-based alignment, with a focus on open-source software and battle-tested tools.

2024 Differentiated.