Sunday — November 3, 2024

The Colossus AI Supercluster boasts over 100,000 Nvidia H100 GPUs, Repomix simplifies feeding repositories to AI, and RingGesture speeds up AR text input with deep learning.

News

Venvstacks: Virtual Environment Stacks for Python

LM Studio has open-sourced a Python utility called venvstacks that allows for the creation of integrated sets of independently downloadable Python application environments. venvstacks enables the chaining of three layers of Python virtual environments - runtime, framework, and application - and allows for the sharing of dependencies between layers.
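
The layering idea can be pictured as a chained lookup in which each layer sees its own packages plus everything beneath it. The sketch below is only an analogy built on Python's standard `collections.ChainMap`; it is not the venvstacks API, and the `stack_layers` helper, package names, and versions are hypothetical.

```python
from collections import ChainMap

# Hypothetical layer contents (package -> version); illustrative only.
runtime = {"python": "3.12"}
framework = {"numpy": "2.1"}
application = {"my-app": "0.1"}


def stack_layers(*layers):
    """Return a read-through view of the layers; earlier layers win."""
    return ChainMap(*layers)


# The application layer resolves names from itself, then framework, then runtime.
app_env = stack_layers(application, framework, runtime)
# The framework layer never sees application-only packages.
framework_env = stack_layers(framework, runtime)
```

Here several application layers could reuse the same `framework` and `runtime` dictionaries, which mirrors the sharing of dependencies described above.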

Colossus AI Supercluster with over 100k Nvidia H100 GPUs

ServeTheHome takes viewers inside the xAI Colossus AI Supercluster, featuring over 100,000 Nvidia H100 GPUs and a liquid-cooled cluster from Supermicro. The post thanks Elon Musk and his teams for making the project possible.

Brute-Forcing the LLM Guardrails

The author explores the limits of AI by attempting to get a medical diagnosis from a large language model (LLM), specifically Google's Gemini 1.5 Pro. Although the model initially refuses to provide a diagnosis, the author uses prompt engineering and automation to eventually obtain valid-looking medical interpretations, including a formatted differential diagnosis.

Breaking the image: a 12th-century Ai Weiwei?

The Sussex muralist's painting, "The Deception of Eve and Adam," is a challenging and audacious work that subverts traditional art conventions, much like Ai Weiwei's provocative sequence of photos, "Dropping a Han Dynasty Urn." The muralist's use of trompe l'oeil hooks and a cloth-like scene may have been seen as threatening or deviant, earning them a place alongside art-martyrs like Lazaros Zographos and Saint George.

Apple Researchers Show Critical Flaw in AI

Apple researchers tested over 20 state-of-the-art artificial intelligence models with a simple arithmetic problem and found that they consistently got it wrong, while human schoolchildren were able to solve it correctly. The study highlights the limitations of AI systems, which don't truly "think" but rather match language patterns, and suggests that they may never be able to replicate human intelligence.

Research

Ring-Based Mid-Air Gesture Typing System Using Deep Learning Word Prediction

RingGesture is a ring-based mid-air gesture typing technique for lightweight AR glasses, utilizing electrodes and IMU sensors to track hand movements and translate them into cursor navigation. The system achieves an average text entry speed of 27.3 words per minute and is enhanced by a novel deep-learning word prediction framework, Score Fusion, which offers improved accuracy and input speed.

Length-Induced Embedding Collapse in Transformer-Based Models

Researchers have identified a phenomenon called Length Collapse, where longer text embeddings collapse into a narrow space, hurting performance in downstream tasks. They propose a solution called TempScale, which introduces a temperature in softmax() to mitigate this issue, and demonstrate its effectiveness in improving existing embedding models, especially on long text inputs.
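
A temperature in softmax simply rescales the scores before normalization. The sketch below shows that generic mechanism in plain Python; it is not the paper's implementation, and the summary does not say which temperature values or which layers TempScale uses, so the function names and numbers here are assumptions for illustration.

```python
import math


def softmax_with_temperature(scores, temperature=1.0):
    """Softmax over scores / temperature, computed in a numerically stable way."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [s / temperature for s in scores]
    peak = max(scaled)  # subtract the max so exp() cannot overflow
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(probs):
    """Shannon entropy in nats; lower means a more peaked distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


scores = [1.0, 2.0, 3.0]  # made-up attention scores
baseline = softmax_with_temperature(scores, temperature=1.0)
sharper = softmax_with_temperature(scores, temperature=0.5)
```

With these scores, the lower temperature concentrates more weight on the largest score, so `entropy(sharper)` is smaller than `entropy(baseline)`.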

SPANN: Highly-Efficient Billion-Scale Approximate Nearest Neighbor Search (2021)

SPANN is a memory-disk hybrid indexing and search system that handles large-scale databases by keeping the centroid points in memory and the large posting lists on disk. It achieves high recall at low latency: it reaches 90% recall about twice as fast as DiskANN, the state-of-the-art approximate nearest neighbor search (ANNS) solution, at the same memory cost, and does so in around one millisecond.
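
The centroid and posting-list split can be sketched as a toy inverted index: find the nearest centroids first, then scan only their posting lists. This is a minimal illustration and not SPANN's code; the `TinyHybridIndex` class and the `n_probe` parameter are hypothetical, and a plain dictionary stands in for the on-disk posting lists.

```python
import math


def l2(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


class TinyHybridIndex:
    def __init__(self, centroids):
        self.centroids = list(centroids)  # small, kept in memory
        # One posting list per centroid; a real system would store these on disk.
        self.posting_lists = {i: [] for i in range(len(self.centroids))}

    def add(self, vec_id, vec):
        """Append the vector to the posting list of its nearest centroid."""
        nearest = min(range(len(self.centroids)),
                      key=lambda i: l2(vec, self.centroids[i]))
        self.posting_lists[nearest].append((vec_id, vec))

    def search(self, query, k=1, n_probe=1):
        """Scan the posting lists of the n_probe nearest centroids only."""
        order = sorted(range(len(self.centroids)),
                       key=lambda i: l2(query, self.centroids[i]))
        candidates = [(l2(query, vec), vec_id)
                      for c in order[:n_probe]
                      for vec_id, vec in self.posting_lists[c]]
        return [vec_id for _, vec_id in sorted(candidates)[:k]]


index = TinyHybridIndex(centroids=[(0.0, 0.0), (10.0, 10.0)])
index.add(1, (0.5, 0.5))
index.add(2, (9.0, 9.0))
index.add(3, (1.0, 0.0))
```

Raising `n_probe` trades latency for recall, since more posting lists are read per query.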

Improving Neuron-Level Interpretability with White-Box Language Models

Researchers have developed a new transformer-like architecture called CRATE, which embeds sparse coding directly into the model to improve neural network interpretability. CRATE has shown significant improvements in neuron-level interpretability, up to 103% relative improvement, across various evaluation metrics and model sizes.

Smoothed asymptotics: from number theory to quantum field theory

Researchers have developed a new regularization scheme, called η regularization, for loop integrals in quantum field theory, inspired by Terence Tao's method of smoothed asymptotics. This scheme reveals a connection between eliminating divergences and preserving gauge invariance, and has led to a method for regularizing non-abelian gauge theories that preserves the Ward identity for the vacuum polarisation tensor.
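
For background, the smoothed asymptotics that inspired the scheme can be stated as follows. This is the number-theory starting point from Tao's discussion, not the paper's loop-integral formulas, and it assumes a smooth, compactly supported cutoff function η with η(0) = 1.

```latex
\sum_{n=1}^{\infty} \eta\!\left(\frac{n}{N}\right)
  = -\frac{1}{2} + C_{\eta,0}\,N + O\!\left(\frac{1}{N}\right),
\qquad
\sum_{n=1}^{\infty} n\,\eta\!\left(\frac{n}{N}\right)
  = -\frac{1}{12} + C_{\eta,1}\,N^{2} + O\!\left(\frac{1}{N}\right),
\qquad
C_{\eta,s} = \int_{0}^{\infty} x^{s}\,\eta(x)\,dx .
```

The divergent terms depend on the choice of η through the constants C, while the finite parts (-1/2 and -1/12) do not.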

Code

Awesome Generative AI Guide

This repository serves as a comprehensive hub for updates on generative AI research, interview materials, notebooks, and more, offering resources such as monthly best GenAI papers, interview prep, and free courses. The repository is regularly updated with new additions, including course materials, code repositories, and notebooks for developing generative AI applications.

Show HN: A browser extension for Claude/ChatGPT to edit your projects locally

The CodeSpin.AI Chrome Extension allows users to edit local projects using Claude and ChatGPT through the File System APIs on Chrome. To install, users must manually clone the project from GitHub, install dependencies, build the extension, and then load it as an unpacked extension in Chrome.

Repomix: Packs your entire repository into a single, AI-friendly file

Repomix is a tool that packs an entire repository into a single, AI-friendly file, making it easy to feed codebases to Large Language Models (LLMs) or other AI tools. It offers features such as AI-optimized formatting, token counting, and customizable configuration, and can be used with tools like Claude, ChatGPT, and Gemini.
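
The basic packing idea can be sketched in a few lines of standard-library Python. This is not Repomix's implementation or output format; the `pack_directory` and `rough_token_count` helpers, the separator line, and the suffix filter are all hypothetical, and the whitespace-based count is only a crude stand-in for real tokenization.

```python
from pathlib import Path


def pack_directory(root, suffixes=(".py", ".md", ".txt")):
    """Concatenate matching text files under root into one labeled string."""
    root = Path(root)
    parts = []
    for path in sorted(root.rglob("*")):
        if path.is_file() and path.suffix in suffixes:
            rel = path.relative_to(root).as_posix()
            text = path.read_text(encoding="utf-8", errors="replace")
            # Hypothetical separator so a model can tell files apart.
            parts.append(f"===== FILE: {rel} =====\n{text.rstrip()}\n")
    return "\n".join(parts)


def rough_token_count(text):
    """Whitespace-split word count, used here as a rough size estimate."""
    return len(text.split())
```

Calling `pack_directory(".")` in a project folder returns a single string that can be pasted into a chat with an LLM, and `rough_token_count` gives a quick sense of whether it is likely to fit.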

PodcastLM: An open-source AI podcast creator

PodcastLM is an open-source script that uses the Anthropic or Google Gemini API together with the ElevenLabs API to create AI-generated podcasts. To use it, users install the required libraries, set up API keys, and run the script with specific parameters to generate a podcast from a source document.

NucliaDB, the AI Search Database for RAG

NucliaDB is a robust, open-source database that allows storing and searching on unstructured data, utilizing vector, full text, and graph indexes. It is designed to index large datasets and provide multi-tenant support, with features such as text and semantic searches, data export, and role-based security.