Saturday January 18, 2025

AI sparks ethical debates as a cartoonist is arrested for AI-generated CSAM, Microsoft's red teaming of more than 100 products offers critical insights into securing generative AI, and the tongue-in-cheek SLOP project claims to generate Python slop at 1 billion tokens per second.

News

Thoughts on a month with Devin

A new AI company launched in March 2024 with $21 million in Series A funding and introduced Devin, a fully autonomous software engineer that can chat with users, learn new technologies, and deploy applications. After a month of thorough testing, however, the results were mixed: Devin completed some tasks, such as API integrations and building small functional applications, but struggled with others, failing 14 of the 20 tasks attempted and falling short of its promise to revolutionize software development.

Let's talk about AI and end-to-end encryption

The rise of end-to-end encrypted communications has significantly improved privacy, but the growing use of AI in messaging and phone systems may threaten that progress, because AI models often need access to plaintext data to function effectively. This raises hard questions about whether end-to-end encryption can coexist with the increasing demand for AI-powered features.

Using ChatGPT is not bad for the environment

The claims that AI models like ChatGPT have a significant environmental impact are often exaggerated and misleading, with some statements being entirely incorrect. In reality, the emissions produced by ChatGPT and other large language models are relatively small compared to other daily activities, and individuals who avoid using them due to environmental concerns may be missing out on a valuable tool that can also aid in climate research and education.
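
To get a feel for the scale the article is arguing about, here is a back-of-envelope comparison; the per-query energy, grid intensity, and driving figures below are illustrative assumptions, not numbers taken from the piece.

```python
# Back-of-envelope comparison of one ChatGPT query against driving one mile.
# All inputs are illustrative assumptions, not measured values.

WH_PER_QUERY = 3.0             # assumed energy per ChatGPT query, watt-hours
GRID_G_CO2_PER_KWH = 400.0     # assumed grid carbon intensity, g CO2 per kWh
G_CO2_PER_MILE_DRIVEN = 400.0  # assumed emissions for driving one mile in a gas car

query_g_co2 = WH_PER_QUERY / 1000.0 * GRID_G_CO2_PER_KWH
print(f"~{query_g_co2:.1f} g CO2 per query")                                   # ~1.2 g
print(f"~{G_CO2_PER_MILE_DRIVEN / query_g_co2:.0f} queries per mile driven")   # ~333
```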

Under new law, cops bust famous cartoonist for AI-generated CSAM

A Pulitzer Prize-winning cartoonist, Darrin Bell, has been arrested in California for possessing AI-generated child sexual abuse material (CSAM) under a new state law that took effect on January 1. The law treats AI-generated CSAM as inherently harmful even when no actual victim is depicted, and Bell is being held on $1 million bail after police found evidence of computer-generated CSAM on his account.

Rumor About GPT-5 Changes Everything

The author proposes a hypothesis that OpenAI has already trained GPT-5 but is keeping it internal, using it to generate synthetic data that improves smaller, cheaper models such as GPT-4o rather than releasing it publicly. The theory draws on the example of Anthropic's Claude Opus 3.5, which was reportedly trained but never released, and instead used to generate synthetic data that boosted the cheaper Claude Sonnet 3.6 model through a process called distillation.
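
The distillation pipeline the author describes has roughly the following shape; this is a minimal sketch with stand-in Teacher and Student classes (all names and methods here are hypothetical, not OpenAI's or Anthropic's actual tooling).

```python
# Minimal sketch of distillation via synthetic data: a large internal "teacher"
# model answers prompts, and a smaller, cheaper "student" model is fine-tuned
# on those answers. Both classes are stand-ins, not real APIs.

class Teacher:
    def generate(self, prompt: str) -> str:
        # placeholder for a large, unreleased model (e.g. a hypothetical GPT-5)
        return f"high-quality answer to: {prompt}"

class Student:
    def fine_tune(self, pairs: list[tuple[str, str]]) -> None:
        # placeholder for supervised fine-tuning of the cheaper public model
        print(f"fine-tuning on {len(pairs)} synthetic (prompt, answer) pairs")

prompts = ["Explain transformers.", "Write a sorting function.", "Summarize this paper."]
teacher, student = Teacher(), Student()

synthetic_data = [(p, teacher.generate(p)) for p in prompts]  # teacher output never ships
student.fine_tune(synthetic_data)                             # only the student is released
```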

Research

Lessons from Red Teaming 100 Generative AI Products

Microsoft's experience red teaming over 100 generative AI products has yielded eight key lessons, including the importance of understanding system capabilities and the human element in red teaming, as well as the limitations and challenges of securing AI systems. The company is sharing its internal threat model ontology and practical recommendations to help align red teaming efforts with real-world risks and address the ongoing challenges of securing AI systems.

How is Google using AI for internal code migrations?

Google has been using large language models (LLMs) for code migration, and their experience shows that LLMs can significantly reduce the time needed for migrations and lower barriers to starting and completing migration programs. The company is sharing its insights from applying LLM-based code migration in an enterprise context, with the goal of providing useful information to other industry practitioners using machine learning in software engineering.

Predicting Human Brain States with Transformer

Researchers used functional magnetic resonance imaging (fMRI) data and a transformer architecture to predict resting-state brain activity, achieving accurate predictions up to 5.04 seconds ahead based on the preceding 21.6 seconds of data. The results point toward generative models that learn the functional organization of the human brain, with the generated fMRI brain states reflecting the architecture of the functional connectome.
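
Those time spans correspond to a repetition time (TR) of 0.72 seconds, i.e. 30 input frames predicting the next 7. Below is a minimal sketch of how such input/target windows could be sliced from an fMRI time series; the TR, region count, and array shapes are assumptions for illustration, not the paper's code.

```python
import numpy as np

TR = 0.72            # assumed repetition time: 30 * 0.72 = 21.6 s in, 7 * 0.72 = 5.04 s out
N_IN, N_OUT = 30, 7  # input frames and predicted frames

def make_windows(series: np.ndarray) -> tuple[np.ndarray, np.ndarray]:
    """Slice a (time, regions) fMRI series into (input, target) window pairs
    for autoregressive training of a transformer-style predictor."""
    inputs, targets = [], []
    for t in range(len(series) - N_IN - N_OUT + 1):
        inputs.append(series[t : t + N_IN])
        targets.append(series[t + N_IN : t + N_IN + N_OUT])
    return np.stack(inputs), np.stack(targets)

# Toy example: 200 time points of random "resting state" data over 379 brain regions.
fake_fmri = np.random.randn(200, 379)
X, Y = make_windows(fake_fmri)
print(X.shape, Y.shape)  # (164, 30, 379) (164, 7, 379)
```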

Mathematics of the daily word game Waffle

The daily word game Waffle involves the combinatorics of permutations, which makes some games easy to solve and others extremely challenging. Sorting a permutation of the 21 squares requires at least 21 minus the number of orbits (cycles) swaps, so a perfect 10-swap solution needs the letters to form exactly 11 orbits; and since 11 orbits of length 2 or more would span 22 squares, at least one orbit must have length 1, i.e. a letter already in its correct place.
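
A small sketch of the underlying counting, assuming the misplaced letters define an ordinary permutation of the 21 squares (repeated letters on real Waffle boards complicate this slightly):

```python
def orbits(perm: list[int]) -> int:
    """Count the orbits (cycles, including fixed points) of a permutation,
    where perm[i] is the square that the letter currently at square i belongs to."""
    seen, count = set(), 0
    for start in range(len(perm)):
        if start in seen:
            continue
        count += 1
        i = start
        while i not in seen:
            seen.add(i)
            i = perm[i]
    return count

def min_swaps(perm: list[int]) -> int:
    """A swap changes the cycle count by exactly one, and the solved board has
    len(perm) cycles (all fixed points), so n - orbits swaps are needed and suffice."""
    return len(perm) - orbits(perm)

# Toy 6-square example: two 2-cycles and two fixed points -> 4 orbits, 2 swaps.
example = [1, 0, 3, 2, 4, 5]
print(orbits(example), min_swaps(example))  # 4 2

# A perfect Waffle needs min_swaps == 10 over 21 squares, i.e. exactly 11 orbits.
```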

Generating particle physics Lagrangians with transformers

Researchers used a transformer model to predict Lagrangians, which describe the interactions of fundamental particles, by treating them as complex linguistic expressions. The model achieved high accuracy (over 90%) in constructing Lagrangians for up to six matter fields, and was able to generalize and internalize concepts such as group representations and conjugation operations.

Code

Show HN: Open-source framework to deploy personalized computer-use agents

TankWork is an open-source desktop agent framework that lets AI perceive and control a computer through computer vision and system-level interactions, executing voice and text commands and processing the screen in real time. The framework offers direct computer control, computer-vision screen analysis, voice interaction, customizable agents, and real-time feedback, and is aimed at developers and researchers building autonomous desktop agents.

Show HN: ReProm – CLI to bundle code and structure into one Markdown for AI

The project's README could not be retrieved, so little detail is available beyond the title: ReProm is a command-line tool for bundling a codebase's code and directory structure into a single Markdown file to use as context for AI assistants.
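
Since the README itself was unavailable, the following is only a generic sketch of the idea in the title, walking a project directory and emitting a single Markdown bundle; it is not ReProm's actual implementation, and the function name and defaults are made up for illustration.

```python
from pathlib import Path

FENCE = "`" * 3  # build Markdown fences programmatically to keep this example self-contained

def bundle_to_markdown(root: str, extensions: tuple[str, ...] = (".py", ".md", ".toml")) -> str:
    """Walk a project directory and emit one Markdown document containing the
    file tree plus every matching file in its own fenced code block."""
    root_path = Path(root)
    files = sorted(p for p in root_path.rglob("*") if p.is_file() and p.suffix in extensions)

    parts = ["# Project bundle", "", "## Structure", ""]
    parts += [f"- {p.relative_to(root_path)}" for p in files]
    for p in files:
        parts += ["", f"## {p.relative_to(root_path)}", "",
                  f"{FENCE}{p.suffix.lstrip('.')}",
                  p.read_text(errors="replace"),
                  FENCE]
    return "\n".join(parts)

if __name__ == "__main__":
    print(bundle_to_markdown("."))  # paste the output into an AI chat as context
```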

World's most advanced Python AI. Generating Python slop at 1B tokens/sec

SLOP is a tongue-in-cheek project billed as the world's most advanced Python AI, generating Python "slop" at 1 billion tokens per second. It is installed with "cargo install --path ." and then invoked with a command such as "slop fullstack_app.py --count 42069" to churn out, for example, a full-stack application.

MiniCPM-O 2.6, GPT-4o Level MLLM for Vision, Speech and Multimodal on Your Phone

MiniCPM-o is a series of on-device ("end-side") multimodal large language models (MLLMs) that take image, video, text, and audio as inputs and produce high-quality text and speech outputs. The latest model, MiniCPM-o 2.6, achieves performance comparable to GPT-4o-202405 in vision, speech, and multimodal live streaming, and supports bilingual real-time speech conversation, emotion/speed/style control, and end-to-end voice cloning.

Transformer²: dynamic weight adaptation in LLMs

Transformer² is a novel self-adaptation framework for large language models (LLMs) that adapts to unseen tasks in real-time by selectively adjusting components of their weight matrices. The framework uses a two-pass mechanism, identifying task properties and then dynamically mixing task-specific "expert" vectors to obtain targeted behavior for incoming prompts.
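
A minimal numpy sketch of the adaptation step as described, where per-task "expert" vectors rescale a weight matrix's singular values and the first pass supplies the mixing weights; the shapes, mixing values, and function names are illustrative, not the paper's code.

```python
import numpy as np

def adapt_weights(W: np.ndarray, expert_vectors: list[np.ndarray], mix: np.ndarray) -> np.ndarray:
    """Second pass of the two-pass scheme: blend task-specific expert vectors
    and use the blend to rescale W's singular values."""
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    z = sum(w * v for w, v in zip(mix, expert_vectors))  # mixture chosen by the first pass
    return U @ np.diag(s * z) @ Vt

rng = np.random.default_rng(0)
W = rng.standard_normal((64, 64))                         # one weight matrix of the base LLM
experts = [rng.uniform(0.5, 1.5, 64) for _ in range(3)]   # e.g. math / code / reasoning experts
mix = np.array([0.7, 0.2, 0.1])                           # task-identification output (pass one)

W_adapted = adapt_weights(W, experts, mix)
print(W_adapted.shape)  # (64, 64): same layer, behavior nudged toward the detected task
```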
