Monday — November 11, 2024

Robots learn laundry tasks with π0 AI, LLMs' insights questioned via token noise in game theory, and Homemade GPT JS mimics minGPT in the browser with TensorFlow.js.

News

Physical Intelligence's first generalist policy AI can finally do your laundry

Researchers have developed a general-purpose robot foundation model called π0 (pi-zero), which is a step towards creating artificial physical intelligence that allows robots to perform various tasks with physical versatility similar to humans. This model is trained on a large and diverse dataset of dexterous tasks from multiple robots and inherits semantic knowledge from internet-scale pretraining, enabling it to control different robots and perform tasks through zero-shot prompting or fine-tuning.

IMG_0416

Between 2009 and 2012, Apple iPhones and iPod Touches allowed users to upload videos directly to YouTube from the Photos app, resulting in millions of videos being uploaded with default titles in the format "IMG_XXXX". These videos, often uploaded unintentionally or without editing, provide a unique and authentic glimpse into the lives of strangers.

Show HN: A Cursor for Video Editing

Frame AI is a video editing platform that allows users to describe their desired edits and instantly see the changes, while also offering collaborative features and version control. The platform combines AI-powered editing tools with a Git-style workflow, enabling users to work with others, track changes, and experiment freely.

TSMC to close door on producing advanced AI chips for China from Monday

TSMC is set to stop producing advanced AI chips for China from Monday.

Images of Spain's floods weren't made by AI. Trouble is, people think they were

A viral photo of a street in Valencia, Spain, devastated by a "rain bomb" was met with skepticism on social media, where many users took it for an AI-generated fake because of its vivid, surreal quality. The reaction highlights the growing problem of "AI slop": AI-generated images and text that are increasingly prevalent on social media platforms and are often created to profit from engagement algorithms.

Research

The Multiple Dimensions of Spuriousness in Machine Learning

Machine learning models are vulnerable to capturing unintended correlations in data, which can lead to issues with model performance, fairness, and robustness. Researchers have identified multiple dimensions of spuriousness, including relevance, generalizability, human-likeness, and harmfulness, which go beyond the traditional causal/non-causal distinction and highlight the complexities of addressing spuriousness in machine learning.

LLM outputs explained using Game Theory

Large language models (LLMs) have potential applications in simulating human behavior, but their validity as substitutes for human subjects is uncertain due to divergences in underlying processes and sensitivity to prompt variations. A novel approach using Shapley values from cooperative game theory reveals "token noise" effects, where LLM decisions are disproportionately influenced by tokens with minimal informative content, raising concerns about the robustness of insights obtained from LLMs.
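
As a rough illustration of the Shapley-value idea mentioned above, the sketch below computes exact Shapley values for the tokens of a tiny prompt. The scoring function `toy_score` and its weights are hypothetical stand-ins for a model's decision score, not the setup used in the paper; it is only meant to show how a low-information token such as "please" can end up with a large attribution.

```python
from itertools import permutations
from math import factorial


def shapley_values(players, value):
    """Exact Shapley values: average each player's marginal contribution
    over every ordering of the players. Players must be unique."""
    totals = {p: 0.0 for p in players}
    for order in permutations(players):
        coalition = frozenset()
        for p in order:
            with_p = coalition | {p}
            totals[p] += value(with_p) - value(coalition)
            coalition = with_p
    n_orders = factorial(len(players))
    return {p: total / n_orders for p, total in totals.items()}


def toy_score(tokens):
    """Hypothetical decision score given the set of tokens present."""
    score = 0.0
    if "cooperate" in tokens:
        score += 0.6  # the informative token
    if "please" in tokens:
        score += 0.3  # a low-information token with an outsized effect
    if "cooperate" in tokens and "now" in tokens:
        score += 0.1  # interaction term, shared between the two tokens
    return score


tokens = ["please", "cooperate", "now"]
phi = shapley_values(tokens, toy_score)
for token, contribution in phi.items():
    print(f"{token}: {contribution:.2f}")
```

Exact enumeration grows factorially with the number of tokens, so it is only practical for very short prompts; longer prompts would typically need a sampling-based approximation.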

Magnetic Field Evolution of Hot Exoplanets

Numerical simulations using the MESA code found that the magnetic field strength of gas giant planets depends on the convective energy flux from their interiors. The simulations showed that hot Jupiters' magnetic fields weaken over time, while hot Neptunes' fields die out after around 2 billion years, with factors like atmospheric mass and orbital separation also affecting the magnetic field strength.

Age Normalized Testosterone Peaks at Series B for Male Startup Founders

A study of 107 male Y Combinator founders found that age-normalized testosterone levels increased by 99.6% from pre-seed to Series B funding, then dropped by 42.2% after that stage. This suggests that early startup success boosts confidence and dominance, while later-stage pressures and stresses erode these feelings, or alternatively, that founders with higher testosterone are more likely to secure larger funding rounds.

Cell Balancing Paradigms: Advanced Types, Algorithms, and Optimization Framework

The operating efficiency of electric transportation, energy storage, and grids depends on the characteristics of the batteries they use, which are managed and maintained by a Battery Management System (BMS). A BMS measures and controls various parameters to optimize battery function, including cell balancing and charge/discharge processes, to protect the battery's health, safety, and performance.
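
To make "cell balancing" concrete, here is a minimal sketch of one simple passive-balancing decision rule, assuming the BMS reads per-cell voltages and bleeds any cell that sits more than a threshold above the weakest cell. The function name, the threshold, and the voltages are hypothetical and not taken from the paper, which surveys more advanced schemes.

```python
def cells_to_bleed(voltages, threshold=0.02):
    """Return indices of cells whose voltage exceeds the lowest cell's
    voltage by more than `threshold` volts (hypothetical passive rule)."""
    if not voltages:
        return []
    lowest = min(voltages)
    return [i for i, v in enumerate(voltages) if v - lowest > threshold]


# Hypothetical pack of four cells, voltages in volts.
pack = [3.70, 3.72, 3.65, 3.66]
print(cells_to_bleed(pack))
```

A real BMS would also account for temperature, current, and measurement noise before switching any bleed resistor.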

Code

Show HN: Chonkie – A Fast, Lightweight Text Chunking Library for RAG

Chonkie is a lightweight, fast, and feature-rich Python library for text chunking, designed for use in RAG (Retrieval-Augmented Generation) bots. It offers various chunking methods, including token, word, sentence, semantic, and SDPM chunking, and is easy to install and use.
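
For readers unfamiliar with token chunking, the sketch below shows the general idea of fixed-size chunks with overlap in plain Python. It does not use Chonkie's API; `chunk_tokens` is a hypothetical helper, and whitespace splitting stands in for a real tokenizer.

```python
def chunk_tokens(tokens, chunk_size, overlap=0):
    """Split a token list into chunks of at most `chunk_size` tokens,
    with consecutive chunks sharing `overlap` tokens."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in [0, chunk_size)")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break  # the last chunk already reaches the end
    return chunks


text = "retrieval augmented generation splits long documents into smaller pieces"
words = text.split()  # whitespace split as a stand-in tokenizer
chunks = chunk_tokens(words, chunk_size=4, overlap=1)
for chunk in chunks:
    print(" ".join(chunk))
```

The overlap keeps some shared context between neighbouring chunks, which can help retrieval when a relevant passage straddles a chunk boundary.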

AI GitHub Issue Resolving

The OpenHands GitHub Issue Resolver is a tool that uses open-source AI agents to automatically resolve GitHub issues. It is designed primarily to handle one issue at a time with high quality. It can be run through a GitHub Actions workflow or installed manually and run programmatically, and it supports various LLMs, including Anthropic's Claude and OpenAI's GPT-4.

Show HN: Putting together all the AI-powered web search software we know of


Homemade GPT JS – A Tensorflow.js Re-Implementation of MinGPT

A minimal TensorFlow.js re-implementation of Karpathy's minGPT, a Generative Pre-trained Transformer, is available in a single 300-line TypeScript file. The model can be trained, experimented with, and used to generate predictions directly in the browser using a GPU through the Homemade GPT playground.

LLM Prompt Tuning Playbook

This document is a playbook for tuning large language models (LLMs) through effective prompting strategies, aimed at readers who have had basic interactions with LLMs but lack a rigorous technical understanding. It provides mental models, concrete prescriptions, and a procedure for iterating on new system instructions, with the goal of consolidating and sharing helpful intuitions and practical techniques for prompting LLMs.