Tuesday — January 14, 2025

AI model training may shift away from massive data centers as distributed methods evolve, Panopticon AI launches an open-source military simulation platform to advance research, and Vision-Large Language Models show vulnerability to hidden virus signatures in images.

News

AI Engineer Reading List

A curated list of around 50 "required reads" for AI engineers, covering large language models, benchmarks, prompting, retrieval-augmented generation, and more. The list favors practicality and relevance to working engineers, and is divided into sections such as Frontier LLMs, Benchmarks and Evals, and Prompting, ICL & Chain of Thought. The article also stresses the history and context behind concepts such as information retrieval, and recommends books, tutorials, and podcasts for further learning.

Training AI models might not need enormous data centres

Training AI models may no longer require enormous data centers, as new methods allow training to be distributed across multiple devices, potentially eliminating the need for dedicated hardware. Tech leaders such as Elon Musk and Mark Zuckerberg are currently competing to build the largest clusters of graphics processing units (GPUs) to train AI models, but this approach may become outdated as more efficient methods emerge.

WH Executive Order Affecting Chips and AI Models

The White House argues that the US must lead the transition to artificial intelligence to ensure national security and economic strength, and a new Interim Final Rule on Artificial Intelligence Diffusion aims to achieve this by streamlining licensing hurdles and setting security standards for AI technology. The rule establishes mechanisms to control the diffusion of US AI technology, including restrictions on sales to certain countries and entities, while giving trusted allies and partners flexibility to access and benefit from that technology.

CEO of AI Music Company Says People Don't Like Making Music

Mikey Shulman, CEO of AI music generator company Suno AI, believes that making music is not enjoyable for most people due to the time and practice required, and thinks his company's technology can make music creation more accessible to a wider audience. However, his views have been criticized for misunderstanding the value of the creative process and the joy of learning and mastering a skill, with some arguing that the challenge and effort involved in making music are a key part of its appeal.

Mark Zuckerberg says AI could soon do the work of Meta's midlevel engineers

Mark Zuckerberg stated that Meta will start using AI to automate the work of midlevel software engineers this year, and may eventually outsource all coding on its apps to AI. This development could significantly affect the job market, as midlevel software engineers at Meta currently earn close to mid-six figures in total compensation.

Research

MLKAPS: Machine Learning and Adaptive Sampling for HPC Kernel Auto-Tuning

MLKAPS is a tool that uses machine learning and adaptive sampling to automate the tuning of High-Performance Computing (HPC) kernel hyperparameters for varying inputs and environments. It has been shown to outperform state-of-the-art auto-tuning tools on highly optimized kernels: on Intel MKL's dgetrf (LU) and dgeqrf (QR) kernels it achieves geometric-mean speedups of 1.30x and 1.18x, respectively.
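For readers unfamiliar with the metric, a geometric-mean speedup averages per-input speedup ratios multiplicatively, so a single large outlier does not dominate the result. The sketch below is only an illustration of that arithmetic: the function name and the timings are made up for this example and are not taken from the MLKAPS paper.

```python
import math


def geomean_speedup(baseline_times, tuned_times):
    """Geometric mean of per-input speedups (baseline time / tuned time)."""
    if len(baseline_times) != len(tuned_times) or not baseline_times:
        raise ValueError("need two non-empty lists of equal length")
    ratios = [b / t for b, t in zip(baseline_times, tuned_times)]
    # Average in log space, then exponentiate back.
    return math.exp(sum(math.log(r) for r in ratios) / len(ratios))


# Hypothetical timings in seconds for three input sizes (illustrative only).
baseline = [2.0, 4.0, 8.0]
tuned = [1.0, 4.0, 4.0]
print(round(geomean_speedup(baseline, tuned), 3))
```

With these made-up numbers the per-input speedups are 2, 1, and 2, so the geometric mean is the cube root of 4, or about 1.59, slightly below the arithmetic mean of about 1.67.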

Infecting Generative AI with Viruses

Researchers tested the security of Vision-Large Language Models (VLM/LLM) by embedding the EICAR test file within JPEG images and uploading them to various LLM platforms, including OpenAI and Google. The experiments showed that the EICAR signature could be hidden in image metadata, extracted, and potentially executed within LLM environments, demonstrating vulnerabilities in file handling and execution capabilities.

LLM forecasters rapidly approaching human-level performance

Researchers have developed a method to instantly evaluate the performance of forecasting AI models by measuring the consistency of their predictions on related questions, using a metric based on arbitrage. This approach has been shown to correlate with traditional forecasting benchmarks, and a new consistency benchmark has been created to provide long-term evaluation of forecasting models, with results to be resolved in 2028.
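To make the arbitrage idea concrete, here is a minimal sketch of the simplest possible consistency check, on a question and its negation. It is an assumption-laden illustration, not the paper's actual metric: the function name and the scoring rule (profit per unit payout) are chosen for this example. If a forecaster's probabilities for "A" and "not A" do not sum to 1, a bettor trading at those prices can lock in a guaranteed profit.

```python
def negation_arbitrage(p_event, p_negation):
    """Guaranteed profit per unit payout from inconsistent P(A) and P(not A).

    If the two prices sum to less than 1, buying both contracts costs less
    than the 1 unit one of them must pay out. If they sum to more than 1,
    selling both collects more than the 1 unit that must be paid out.
    """
    for p in (p_event, p_negation):
        if not 0.0 <= p <= 1.0:
            raise ValueError("probabilities must lie in [0, 1]")
    return abs(p_event + p_negation - 1.0)


# Hypothetical forecasts (illustrative only).
print(negation_arbitrage(0.6, 0.4))  # consistent pair
print(negation_arbitrage(0.7, 0.5))  # inconsistent pair
```

A score of zero means the pair is consistent; larger scores mean a larger exploitable inconsistency. The appeal of this kind of check is that it needs no resolved outcomes, which is why it can be computed instantly.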

VideoRAG: Retrieval-Augmented Generation over Video Corpus

Retrieval-Augmented Generation (RAG) approaches have been limited to textual information, with some considering images, but largely overlooking videos as a rich source of multimodal knowledge. The VideoRAG framework addresses this by dynamically retrieving relevant videos and utilizing both visual and textual information to generate outputs, outperforming relevant baselines and showcasing its effectiveness.

Kajal: Extracting Grammar of a Source Code Using Large Language Models

Kajal is a novel approach that automatically infers grammar from domain-specific language (DSL) code snippets by leveraging Large Language Models through prompt engineering and few-shot learning. The approach reaches 60% accuracy with few-shot learning and offers a promising route to automating DSL grammar extraction, with further improvement and validation left to future work.

Code

Show HN: A blocklist to remove spam and bad websites from search results

The Bad Website Blocklist is a curated list of low-quality websites, including AI-generated content and spam sites, that aims to remove them from search results to make room for more informative and useful articles. To use the blocklist, users can install the uBlacklist extension, subscribe to the blocklist, and automatically stay up-to-date with the latest blocked websites.

Show HN: Panopticon AI – Open-source platform for military AI research

Panopticon AI is an open-source, web-based military simulation platform compatible with OpenAI Gym, aiming to advance military AI research. The project's code is available under the Apache 2.0 license, with full documentation and contribution guidelines available on the platform's website.
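For context on what Gym compatibility implies, the sketch below shows the reset/step loop that Gym-style environments expose, using the classic four-value step return (observation, reward, done, info). Everything here is a hypothetical stand-in written for this digest: `ToyGridEnv` and `run_episode` are not part of Panopticon AI or of the Gym library, and Panopticon's real environment classes may differ.

```python
class ToyGridEnv:
    """Hypothetical 1-D environment following the classic Gym interface."""

    def __init__(self, target=3):
        self.target = target
        self.position = 0

    def reset(self):
        # Start each episode at the origin and return the first observation.
        self.position = 0
        return self.position

    def step(self, action):
        # action is +1 (move right) or -1 (move left).
        self.position += action
        done = self.position == self.target
        reward = 1.0 if done else -0.1
        return self.position, reward, done, {}


def run_episode(env, policy, max_steps=20):
    """Run one episode and return (total reward, final observation)."""
    obs = env.reset()
    total = 0.0
    for _ in range(max_steps):
        obs, reward, done, _info = env.step(policy(obs))
        total += reward
        if done:
            break
    return total, obs


# A trivial policy that always moves right (illustrative only).
total_reward, final_obs = run_episode(ToyGridEnv(), lambda obs: 1)
print(total_reward, final_obs)
```

Because the agent only sees `reset` and `step`, the same loop can in principle drive any environment that follows this convention, which is the practical benefit of targeting the Gym interface.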

Show HN: Bash-my-AWS adds bmai <missing command> to generate functions

Bash-my-AWS is a set of CLI commands for managing Amazon Web Services resources, providing simple and memorable commands for listing and acting on resources. The commands are designed to be Unix pipeline friendly, allowing for easy integration with other Unix commands, and also offer features like shell command completion and convenient shortcuts.

AI Assistant with file and shell access

Rob the Robot is an AI-powered tool that can read and write files, and execute shell commands based on user prompts, but it requires caution as it has access to the user's shell and can potentially cause harm. To use Rob the Robot, an OpenAI API key is needed, which should be set as an environment variable, and the bin directory should be added to the system's PATH.

Start Machine Learning in 2025

This guide provides a comprehensive resource for individuals with little to no background in programming, math, or machine learning to become experts in the field for free. The guide includes a wide range of resources, such as YouTube videos, online courses, articles, and books, as well as tips and advice for learning and practicing machine learning, including repetition and seeking out multiple sources of information.