Wednesday — February 19, 2025
HP acquires Humane's AI software for $116 million, Valve releases Team Fortress 2 code, and the Deep Lake project tackles big data challenges in deep learning.
News
My LLM codegen workflow
The author describes a workflow for building small products with Large Language Models (LLMs) in three main steps: idea honing, planning, and execution. A conversational LLM is used to brainstorm and hone an idea into a solid specification, a reasoning model turns that spec into a detailed project plan, and code-generation LLMs then implement the project in discrete, reviewable loops.
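The execution step can be sketched as a loop over plan steps, each producing one reviewable artifact. This is a minimal illustration, not the author's actual tooling: `llm_complete`, the spec text, and the plan steps are all hypothetical stand-ins for a real code-generation model and a real project.

```python
# Hypothetical sketch of the "discrete loops" execution step; `llm_complete`,
# the spec, and the plan steps below are illustrative stand-ins.

def llm_complete(prompt: str) -> str:
    # Placeholder for a real code-generation LLM call.
    return f"# generated code for: {prompt.splitlines()[-1]}"

spec = "A small CLI tool that deduplicates lines in a text file."
plan_steps = [
    "Set up project scaffolding and a test harness",
    "Implement the core deduplication function",
    "Wire up the CLI entry point",
]

outputs = []
for i, step in enumerate(plan_steps, 1):
    # Each loop carries the full spec plus one plan step, so the model
    # works on a bounded, testable unit.
    prompt = f"Spec:\n{spec}\n\nStep {i} of {len(plan_steps)}: {step}"
    outputs.append(llm_complete(prompt))  # review and test before the next loop

print(len(outputs))  # one generated artifact per plan step
```

The point of the loop structure is that each iteration ends in something that can be run and reviewed before the next prompt is issued.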
The Generative AI Con
The author argues that the hype surrounding Large Language Models, such as ChatGPT, is a cynical bubble inflated by OpenAI CEO Sam Altman, and that anecdotal examples of people using these models do not prove their sustainability or profitability as a trillion-dollar industry. Despite ChatGPT's reported 300 million weekly users, the author suggests that this number is largely the result of media coverage and does not necessarily indicate a viable or essential product.
HP Acquires Humane's AI Software
HP Inc. has announced a definitive agreement to acquire key AI capabilities from Humane, including its AI-powered platform Cosmos and over 300 patents, for $116 million. The acquisition will accelerate HP's development of AI-powered devices and create an intelligent ecosystem across its products, forming a new AI innovation lab called HP IQ to build on this technology and shape the future of intelligent experiences.
Meta announces LlamaCon, its first generative AI dev conference on April 29
Meta has announced two upcoming events: LlamaCon, a developer conference focused on open source AI developments, which will take place on April 29, and Meta Connect, a conference for virtual and mixed reality developers, which will be held on September 17-18.
HP to Acquire Parts of Humane for $116M
HP Inc. has agreed to acquire certain assets from Humane Inc., the maker of the wearable Ai Pin, for $116 million, including the company's software platform, intellectual property, and most of its employees. The deal does not include Humane's Ai Pin device business, which will be wound down.
Research
Deep Lake: A Lakehouse for Deep Learning
Traditional data lakes provide a foundation for analytical workloads, but are not well-suited for deep learning applications involving non-tabular datasets such as images and videos. Deep Lake, an open-source lakehouse, addresses this limitation by storing complex data as tensors and streaming it to various frameworks, including PyTorch and TensorFlow, while maintaining the benefits of a traditional data lake.
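The streaming idea can be illustrated with a toy generator. This is a conceptual sketch only, not the actual Deep Lake API: samples are stored in fixed-size chunks and handed to the training loop lazily, rather than materializing the whole dataset in memory.

```python
# Conceptual sketch of chunk-based tensor streaming (illustrative only; not
# the actual Deep Lake API).

def stream_batches(samples, batch_size):
    """Yield successive batches without materializing the whole dataset."""
    for start in range(0, len(samples), batch_size):
        # In a real lakehouse, decode-on-read from object storage happens here.
        yield samples[start:start + batch_size]

# Toy "dataset" of image-like records; a real system would stream bytes from
# storage and hand framework-native tensors to PyTorch or TensorFlow.
dataset = [{"image": [0] * 4, "label": i % 10} for i in range(10)]
batches = list(stream_batches(dataset, batch_size=4))
print([len(b) for b in batches])  # -> [4, 4, 2]
```

The benefit is that datasets larger than RAM (or GPU memory) can feed a training loop at full throughput, which tabular-oriented data lakes do not handle well for images and videos.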
The Widespread Adoption of Large Language Model-Assisted Writing Across Society
The adoption of large language models (LLMs) for writing has surged since the release of ChatGPT in November 2022, with significant usage across various domains, including consumer complaints, corporate communications, job postings, and international organization press releases. By late 2024, LLM-assisted writing accounts for a substantial portion of text in these domains, ranging from around 10% in job postings to up to 24% in corporate press releases, reflecting a new reality of reliance on generative AI for communications.
Mamba-Shedder: Post-Transformer Compression for Efficient SSMs
Selective Structured State Space Models (SSMs) have been proposed as more efficient alternatives to Transformers, and this paper explores compressing SSM-based models to reduce their size and computational overhead. The proposed Mamba-Shedder solutions achieve a speedup of up to 1.4x during inference by eliminating redundancies with minimal impact on model performance.
Tensor evolution: A framework for fast tensor computations using recurrences
This paper presents a new mathematical framework, Tensor Evolution, which extends the Scalar Evolution optimization pass to analyze and optimize tensor expressions within loops, a common operation in high-performance computing and machine learning. The framework builds on the theory of Chain of Recurrences and can play a part in optimizing and analyzing computations in ML and HPC compilers, with potential applications beyond its initial scope.
SWE-Lancer: a benchmark of freelance software engineering tasks from Upwork
SWE-Lancer is a benchmark of over 1,400 freelance software engineering tasks valued at $1 million USD, encompassing both independent engineering tasks and managerial tasks. The benchmark evaluates model performance and finds that current models are still unable to solve the majority of tasks, with the dataset and evaluation tools made publicly available to facilitate future research into the economic impact of AI model development.
Code
Valve releases Team Fortress 2 code
The Source SDK 2013 repository contains the game code for Half-Life 2, HL2: DM, and Team Fortress 2, and provides instructions for building and distributing mods on Windows and Linux. To build a mod, users must clone the repository, install required software, and follow platform-specific build instructions, after which they can distribute their mod on or off Steam, subject to the terms of the SOURCE 1 SDK LICENSE.
Augment.vim: AI Chat and completion in Vim and Neovim
The Augment Vim/Neovim plugin provides inline code completions and multi-turn chat conversations tailored to a user's codebase, and can be installed and configured to work with any modern Vim or Neovim setup. Once installed, users can access features such as context-aware code completions, chat conversations, and workspace folder configuration to improve the accuracy and style of completions and chat responses.
Show HN: CLI AI assistant that remembers and sets goals
Elroy is a scriptable, memory-augmented AI personal assistant accessible from the command line, featuring long-term memory recall, goal tracking, and a simple scripting interface. It can be installed and used to process messages, create memories, and manage goals, with support for various AI models and customization options through configuration and scripting.
Cake: Distributed LLM and StableDiffusion inference for mobile desktop or server
Cake is a Rust framework for distributed inference of large models, allowing users to repurpose consumer hardware into a heterogeneous cluster of devices to run models that wouldn't normally fit in a single device's GPU memory. The project aims to make AI more accessible and democratic by leveraging planned obsolescence, and it supports various operating systems, architectures, and accelerations, including Linux, Windows, macOS, Android, and iOS.
New LLM Scaling Law
This paper presents a theoretical framework for understanding the relationship between expert granularity, cache efficiency, and bus bandwidth in Mixture-of-Experts (MoE) model architectures, demonstrating that increasing expert count while decreasing individual expert size can lead to exponentially improved cache efficiency. The framework suggests that models with smaller but more numerous experts could achieve superior performance while reducing memory requirements, enabling efficient deployment of large models without requiring full VRAM residency.