Saturday — December 21, 2024

OpenAI's o3 system achieves a breakthrough in task adaptation, the new SpikeFI framework automates reliability analysis for spiking neural networks, and SkyPilot streamlines running AI workloads across diverse infrastructure.

News

A Gentle Introduction to Graph Neural Networks (2021)

Graphs are a natural way to represent real-world objects and their connections, and graph neural networks (GNNs) have been developed to operate on this type of data, with applications in areas such as antibacterial discovery, physics simulations, and recommendation systems. This article explores modern GNNs, explaining what kind of data is best represented as a graph, how graphs differ from other data types, and how to build a state-of-the-art GNN model.
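
As a rough illustration of how a GNN operates on graph data, the sketch below runs one round of message passing, in which every node adds up its neighbours' feature vectors. It is a minimal sketch and not code from the article: the function name `message_passing_step`, the toy graph, and the choice of plain sum aggregation are assumptions made for this example, and real GNN layers also apply learned weights and a nonlinearity.

```python
def message_passing_step(features, edges):
    """One round of sum aggregation over an undirected graph.

    features: dict mapping node name -> list of floats
    edges: iterable of (u, v) pairs, treated as undirected
    Returns a new dict in which each node's vector is its own vector
    plus the sum of its neighbours' vectors.
    """
    # Build an adjacency list from the edge list.
    neighbours = {node: [] for node in features}
    for u, v in edges:
        neighbours[u].append(v)
        neighbours[v].append(u)

    updated = {}
    for node, vec in features.items():
        agg = list(vec)  # start from the node's own features
        for nb in neighbours[node]:
            agg = [a + b for a, b in zip(agg, features[nb])]
        updated[node] = agg
    return updated


# Hypothetical three-node path graph: a - b - c
features = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [2.0, 2.0]}
edges = [("a", "b"), ("b", "c")]
result = message_passing_step(features, edges)
```

Stacking several such rounds lets information travel further across the graph, which is the basic idea behind deeper GNNs.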

OpenAI O3 breakthrough high score on ARC-AGI-PUB

OpenAI's new o3 system scored 75.7% on the ARC-AGI Semi-Private Evaluation set and 87.5% in a high-compute configuration, demonstrating a task-adaptation ability not previously seen in GPT-family models. This marks a qualitative shift in AI capabilities, but o3 is not yet considered AGI: it still fails on some easy tasks and will likely face challenges with the upcoming ARC-AGI-2 benchmark.

Study: Almost all leading AI chatbots show signs of cognitive decline

Almost all leading AI chatbots show signs of mild cognitive impairment in tests used to spot early signs of dementia, challenging the assumption that artificial intelligence will soon replace human doctors. The study found that "older" versions of chatbots performed worse on the tests, with scores ranging from 16 to 26 out of 30 on the Montreal Cognitive Assessment test.

UK Gov Open Consultation: Copyright and Artificial Intelligence

The UK government is seeking views on how to ensure the country's copyright framework supports both the creative industries and the AI sector, with a consultation running from December 17, 2024, to February 25, 2025. The consultation aims to address issues such as transparency, control, and access to high-quality material for AI training, as well as emerging issues like copyright protection for computer-generated works.

'Yes, I am a human': bot detection is no longer working

CAPTCHA, a system designed to prove that a user is human, is no longer effective now that AI can solve CAPTCHA challenges in milliseconds. As a result, developers are exploring new ways to verify humans, such as behavioral analysis, biometrics, and digital authentication certificates, to stay ahead of increasingly sophisticated bots.

Research

SpikeFI: A Fault Injection Framework for Spiking Neural Networks

Neuromorphic computing and spiking neural networks (SNNs) are gaining popularity for their energy efficiency and fast computation, but their reliability when deployed in hardware remains a concern. SpikeFI is a proposed fault injection framework for SNNs, built on the SLAYER PyTorch framework, that automates reliability analysis and test generation and offers a range of fault models and optimizations that speed up fault simulation.
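
To make the idea of fault injection concrete, the sketch below simulates a single leaky integrate-and-fire neuron and compares its healthy spike train with two faulty variants. This is an illustrative sketch only and does not use SpikeFI or SLAYER: the function `simulate_lif`, its parameters, and the two fault models shown (a "dead" neuron that never fires and a "saturated" neuron that always fires) are assumptions chosen for this example.

```python
def simulate_lif(inputs, threshold=1.0, leak=0.9, fault=None):
    """Simulate one leaky integrate-and-fire neuron over discrete steps.

    inputs: list of input currents, one per time step
    fault: None (healthy), "dead" (output forced to 0),
           or "saturated" (output forced to 1)
    Returns the output spike train as a list of 0/1 values.
    """
    potential = 0.0
    spikes = []
    for current in inputs:
        # Leak the membrane potential, then integrate the new input.
        potential = leak * potential + current
        fired = potential >= threshold
        if fired:
            potential = 0.0  # reset after crossing the threshold
        # Inject the fault by overriding the neuron's output.
        if fault == "dead":
            fired = False
        elif fault == "saturated":
            fired = True
        spikes.append(1 if fired else 0)
    return spikes


inputs = [0.6, 0.6, 0.6, 0.6]
healthy = simulate_lif(inputs)
dead = simulate_lif(inputs, fault="dead")
saturated = simulate_lif(inputs, fault="saturated")

# Count the time steps where the faulty output deviates from the healthy one.
mismatches = sum(h != d for h, d in zip(healthy, dead))
```

A reliability analysis would repeat this kind of comparison across many neurons and fault types and measure the effect on the network's final output.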

Compiling C to Safe Rust, Formalized

Researchers have developed a method to automatically translate C code to safe Rust, preserving Rust's memory safety guarantees, and applied it to formally verified C codebases, including the HACL* cryptographic library. The translated code, which includes an 80,000-line verified cryptographic library, demonstrates the feasibility of this approach with negligible performance impact.

On the Measure of Intelligence

To develop more intelligent artificial systems, a clear definition and evaluation of intelligence is necessary, allowing for comparisons between systems and humans. A new definition of intelligence based on Algorithmic Information Theory is proposed, focusing on skill-acquisition efficiency, and a benchmark called the Abstraction and Reasoning Corpus (ARC) is presented to measure human-like general fluid intelligence in AI systems.

Affirmative Resolution of Bourgain's Slicing Problem

The paper proves that for any convex body of volume one in n-dimensional space, there exists a hyperplane whose intersection with the body has (n-1)-dimensional volume greater than a universal constant, independent of the dimension. The proof combines several tools, including Milman's theory of M-ellipsoids and stability estimates for the Shannon-Stam inequality.
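
Written out symbolically, the statement reads as follows. The notation (K for the convex body, H for the hyperplane, c for the constant) is chosen here for illustration and may differ from the paper's.

```latex
\[
  K \subset \mathbb{R}^n \text{ convex},\quad \operatorname{Vol}_n(K) = 1
  \;\Longrightarrow\;
  \exists\, \text{hyperplane } H \subset \mathbb{R}^n :\quad
  \operatorname{Vol}_{n-1}(K \cap H) > c ,
\]
% c > 0 is a universal constant: it depends neither on K nor on the dimension n.
```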

Posterior Mean Matching: Generative Modeling Through Online Bayesian Inference

Posterior mean matching (PMM) is a new generative modeling method grounded in Bayesian inference, offering a flexible alternative to existing methods like diffusion models. PMM achieves competitive performance in tasks such as language modeling and image generation, and its mechanics can be applied to various data modalities using different conjugate pairs of distributions.
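
For readers unfamiliar with conjugate pairs, the sketch below shows the standard online posterior-mean update for one such pair: a Normal prior with a Normal likelihood of known noise variance. It is not the PMM algorithm itself, only the kind of closed-form Bayesian update that the summary refers to. The function name `update_normal_posterior` and the example numbers are assumptions for this illustration.

```python
def update_normal_posterior(mean, precision, observation, noise_precision):
    """Update a Normal posterior over an unknown mean after one observation.

    mean, precision: current posterior mean and precision (1 / variance)
    observation: new data point, modelled as Normal(unknown mean, 1 / noise_precision)
    Returns the new (mean, precision) pair.
    """
    # Precisions add; the new mean is a precision-weighted average.
    new_precision = precision + noise_precision
    new_mean = (precision * mean + noise_precision * observation) / new_precision
    return new_mean, new_precision


# Start from a Normal(0, 1) prior and process observations one at a time.
mean, precision = 0.0, 1.0
for y in [2.0, 2.0, 2.0]:
    mean, precision = update_normal_posterior(mean, precision, y, noise_precision=1.0)
```

Each observation pulls the posterior mean from the prior value toward the data; other conjugate pairs, such as Beta with Bernoulli, follow the same pattern with different update formulas.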

Code

Write a model to do AI problem solving in under 200 lines of code

This repository contains the open-source code and text for the book "Paradigms of Artificial Intelligence Programming: Case Studies in Common Lisp" by Peter Norvig, originally published in 1992. The repository includes the book's text in various formats, as well as the accompanying Lisp code files, which can be run interactively using a Common Lisp interpreter/compiler/environment.

Genesis, the AI enabled physics platform

Genesis is a physics platform designed for general-purpose robotics, embodied AI, and physical AI applications, featuring a universal physics engine, a lightweight and user-friendly simulation platform, and a powerful photo-realistic rendering system. It aims to lower the barrier to using physics simulations, unify state-of-the-art physics solvers, and minimize human effort in collecting and generating data for robotics and other domains.

Show HN: Instruct LLMs to do what you want in Ruby

Instruct is a Ruby gem that allows developers to interact with large language models (LLMs) in a natural and intuitive way, providing features such as safe prompting, flexible middleware, and streaming support. The gem is still in active development and not yet ready for production use, but it aims to simplify the process of working with LLMs by combining code, prompts, and completions in a flexible and powerful interface.

SkyPilot: Run AI and batch jobs on any infra (Kubernetes or 12 clouds)

SkyPilot is a framework for running AI and batch workloads on any infrastructure through a single, unified interface. It abstracts away infrastructure burdens, supports multiple clusters, clouds, and hardware types, and aims to cut cloud costs while maximizing GPU availability.