Monday — March 31, 2025

The IRS overhauls its technology after uncovering $21.1B in fraud, the Orion framework brings Homomorphic Encryption to deep learning, and PromptRepository proposes standardized unit tests for LLM prompts.

News

Literate Development: AI-Enhanced Software Engineering

The software development community keeps building LLM-powered tools for code generation, yet many developers still struggle to get substantial benefit from LLMs in their daily work. "Literate Development" is a proposed response: it treats comprehensive documentation as the central, authoritative source of information and advocates an iterative approach to LLM-assisted code generation.

Samsung Galaxy AI features can be set to on-device-only processing

The Samsung Galaxy S24's AI features can be set to process data only on the device, rather than sending it to Samsung's servers, by toggling the "Process data only on device" option in the Advanced Intelligence menu. However, this limits the available AI features to those that can be processed on-device, such as chat assist and translation, and disables features like summarizing and generative editing that require cloud processing.

The First LLM

The Large Language Model (LLM) revolution began in 2018 with GPT-1, published by Alec Radford and widely considered the first LLM, although Australian Jeremy Howard claims that his ULMFiT, published earlier the same year, was the first. Here an LLM is defined as a language model that has been trained with self-supervision as a "next word predictor" and can be easily adapted to many specific text-based tasks with state-of-the-art performance, without requiring architectural changes or large amounts of labeled data.

Agentic AI Needs Its TCP/IP Moment

The development of Agentic AI is hindered by the lack of shared protocols for communication, tool use, memory, and trust, which prevents agents from collaborating and sharing knowledge across different platforms and domains. To unlock the full potential of AI agents, an open, interoperable stack, referred to as the Internet of Agents, is needed, which requires standardization in nine key architectural dimensions, including tool use, communication, authentication, and knowledge exchange.

IRS to overhaul its tech after finding $21.1B in fraud in just two years

The IRS is overhauling its technology to combat financial crimes, which have been made easier by advancements in AI, after discovering $21.1 billion in fraud over a two-year period. The IRS' crime-fighting arm, IRS Criminal Investigation, is launching a new program called CI-FIRST to improve interactions with financial institutions and streamline the detection and reporting of financial crimes.

Research

Autonomous AI Agents Should Not Be Developed

This paper argues against developing fully autonomous AI agents, because the risks to people increase with the degree of autonomy. As AI agents are given more control, the potential benefits are outweighed by growing safety risks to human life and other important values.

Relax: Composable Abstractions for End-to-End Dynamic Machine Learning

The Relax compiler abstraction is designed to optimize dynamic machine learning workloads, particularly large language models, by introducing a unified representation of computational graphs and tensor programs. Relax enables dynamic shape-aware optimizations, resulting in competitive performance across various GPUs and allowing for deployment on a broader range of devices, including mobile phones and web browsers.

Orion: A Homomorphic Encryption Framework for Deep Learning [pdf]

Fully Homomorphic Encryption (FHE) has the potential to improve privacy and security by enabling computation on encrypted data, particularly in deep learning applications. The Orion framework addresses the challenges of implementing FHE-secured neural inference by automatically translating PyTorch neural networks into efficient FHE programs, achieving state-of-the-art performance and enabling the processing of larger and deeper networks.
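
To make "computation on encrypted data" concrete, here is a toy sketch in Python. It does not use Orion or its FHE scheme: it swaps in the much simpler Paillier cryptosystem, which is only additively homomorphic, and the tiny primes and function names are illustrative assumptions with no real security.

```python
import math
import random


def keygen(p=61, q=53):
    """Build a toy Paillier key pair from two small primes (insecure, demo only)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    mu = pow(lam, -1, n)  # modular inverse; valid because the generator is n + 1
    return n, (lam, mu)


def encrypt(n, m, rng=random):
    """Encrypt integer m (0 <= m < n) under public key n."""
    n2 = n * n
    while True:
        r = rng.randrange(1, n)
        if math.gcd(r, n) == 1:  # the random blinding factor must be invertible mod n
            break
    return (pow(n + 1, m, n2) * pow(r, n, n2)) % n2


def decrypt(n, priv, c):
    """Recover the plaintext from ciphertext c using the private key."""
    lam, mu = priv
    x = pow(c, lam, n * n)
    return ((x - 1) // n * mu) % n


def add_encrypted(n, c1, c2):
    """Multiplying ciphertexts adds the underlying plaintexts (mod n)."""
    return (c1 * c2) % (n * n)


if __name__ == "__main__":
    n, priv = keygen()
    total = add_encrypted(n, encrypt(n, 12), encrypt(n, 30))
    print(decrypt(n, priv, total))
```

The party calling add_encrypted never sees 12 or 30, yet the key holder decrypts their sum. Fully homomorphic schemes extend this idea to both addition and multiplication, which is what makes encrypted neural network inference possible in principle.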

LLM Interactive Optimization of Open Source Python Libraries (2023)

This paper presents case studies on utilizing large language models, specifically ChatGPT-4, to optimize source code in open-source Python libraries, finding significant performance improvements of up to 38 times with human expert interaction. The results suggest that large language models are a promising tool for code optimization, but require a human expert in the loop to achieve success, with surprisingly few iterations needed to achieve substantial improvements.

Gen AI for Artistic Style Transfer with Convolutional Neural Networks (2023)

Artistic style transfer is a technique that combines the content of one image with the style of another to create unique compositions, using Convolutional Neural Networks (CNNs) to separate and manipulate image content and style. The reported experiments indicate that the approach synthesizes high-quality images that blend content and style harmoniously and that it works across a range of styles and content.
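
One common way to separate content from style, assuming the classic formulation of Gatys et al. (the summary above does not confirm which losses this paper uses), compares raw CNN feature maps for content and Gram matrices of those feature maps for style. The sketch below uses plain Python lists in place of real CNN activations; the function names and the alpha and beta weights are illustrative assumptions.

```python
def gram_matrix(features):
    """features: list of C channels, each a flat list of N activations.

    Returns the C x C matrix of channel-to-channel inner products, which
    captures which features co-occur while discarding spatial layout.
    """
    return [[sum(a * b for a, b in zip(fi, fj)) for fj in features]
            for fi in features]


def content_loss(generated, content):
    """Half the squared difference between feature maps (layout matters)."""
    return 0.5 * sum((a - b) ** 2
                     for fg, fc in zip(generated, content)
                     for a, b in zip(fg, fc))


def style_loss(generated, style):
    """Normalized squared difference between Gram matrices (layout ignored)."""
    g, s = gram_matrix(generated), gram_matrix(style)
    c, n = len(generated), len(generated[0])
    squared = sum((g[i][j] - s[i][j]) ** 2
                  for i in range(c) for j in range(c))
    return squared / (4 * c ** 2 * n ** 2)


def total_loss(generated, content, style, alpha=1.0, beta=1000.0):
    """Weighted sum that an optimizer would minimize over the generated image."""
    return alpha * content_loss(generated, content) + beta * style_loss(generated, style)
```

In a real pipeline the feature maps come from several layers of a pretrained CNN, and the generated image is updated by gradient descent on the total loss.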

Code

Show HN: Agent – A Local Computer-Use Operator for macOS

Cua is an open-source framework that combines high-performance virtualization with AI agent capabilities, enabling secure and isolated environments for AI systems to interact with desktop applications. It offers two primary capabilities: high-performance virtualization for running macOS and Linux virtual machines, and a computer-use interface and agent framework for AI systems to observe and control these virtual environments.

Show HN: I made shopping AI chatbot from backend server of 289 API functions

The @samchon/shopping-backend project is an example backend server for a shopping mall, built using NestJS and Prisma, and is designed to demonstrate functional programming and test-driven development. The project provides a guide on how to utilize third-party libraries, such as typia, nestia, and prisma-markdown, to boost productivity, and also demonstrates how to build an A.I. chatbot using the @nestia/chat library.

Show HN: Standardising 'Unit Tests' for Prompts

The PromptRepository is a framework for managing, testing, and evaluating Large Language Model (LLM) prompts, allowing for systematic prompt engineering, validation, and testing to create more reliable AI applications. It includes features such as storing prompts in JSON files, automatic validation of required parameters, and generating unit tests and evaluations for prompts, with the goal of making it easier to develop and test LLM-based applications.
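
The idea of storing prompts in JSON, validating required parameters, and unit testing them can be sketched in a few lines of Python. This is not PromptRepository's actual API: the JSON layout, the "summarize" prompt, and the helper names below are hypothetical and chosen only for illustration.

```python
import json
import string

# Hypothetical prompt file contents; a real project would load this from disk.
PROMPTS_JSON = """
{
  "summarize": {
    "template": "Summarize in {max_words} words: {text}",
    "required": ["text", "max_words"]
  }
}
"""


def load_prompts(raw):
    """Parse the JSON document into a dict of prompt specs."""
    return json.loads(raw)


def placeholders(template):
    """Return the set of named fields used in a format-style template."""
    return {name for _, name, _, _ in string.Formatter().parse(template) if name}


def render(prompts, name, **params):
    """Fill a prompt template, rejecting calls that omit required parameters."""
    spec = prompts[name]
    missing = [p for p in spec["required"] if p not in params]
    if missing:
        raise ValueError(f"missing parameters for {name!r}: {missing}")
    return spec["template"].format(**params)


def test_declared_params_match_template():
    """Every prompt must declare exactly the fields its template uses."""
    for name, spec in load_prompts(PROMPTS_JSON).items():
        assert placeholders(spec["template"]) == set(spec["required"]), name


def test_missing_param_is_rejected():
    prompts = load_prompts(PROMPTS_JSON)
    try:
        render(prompts, "summarize", text="hello")
    except ValueError:
        return
    raise AssertionError("expected ValueError for missing max_words")
```

The test functions can be called directly or collected by a test runner. Checks like these catch template and parameter drift before any model is called; evaluating the quality of model outputs is a separate step.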

AI-Assisted Software Development: Trends and Lessons by Don Syme [pdf]


ML/AI Coding Interview Problems

A collection of AI coding practice problems to help prepare for GenAI and machine learning coding interviews, using code drawn from public resources or generated by Large Language Models (LLMs).