Friday January 17, 2025

Nepenthes traps AI web crawlers in a tarpit, DBOS offers durable TypeScript execution on Postgres, and Microsoft shares key insights from red teaming over 100 generative AI products.

News

Nepenthes is a tarpit to catch AI web crawlers

Nepenthes is tarpit software designed to trap web crawlers and waste their resources, particularly crawlers scraping data for large language models (LLMs). It generates an endless sequence of pages whose links lead back into the tarpit. The software is intentionally malicious and can cause significant CPU load. Because it cannot differentiate between legitimate crawlers and those training AI models, deploying it can also get a site removed from search engine results.
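
As a rough illustration of the tarpit idea (not Nepenthes's actual code), the sketch below derives every page from its own URL, so the link graph never ends and nothing has to be stored. The function name `tarpit_page`, the `/tarpit/` prefix, and the link count are assumptions made for this example.

```python
import hashlib


def tarpit_page(path: str, links_per_page: int = 5) -> str:
    """Return an HTML page whose links all point deeper into the tarpit.

    Hypothetical helper for illustration; not taken from Nepenthes.
    """
    links = []
    for i in range(links_per_page):
        # Derive each child path deterministically from the current path, so
        # the same URL always yields the same page and no state is kept.
        digest = hashlib.sha256(f"{path}|{i}".encode()).hexdigest()[:12]
        links.append(f'<a href="/tarpit/{digest}">{digest}</a>')
    return "<html><body>" + "\n".join(links) + "</body></html>"
```

A crawler that follows any of these links simply receives another page of the same kind, which is what keeps it occupied.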

No Calls

The founder of Keygen, an introvert who dislikes sales calls, implemented a "no calls" policy at his company, which initially seemed crazy but ultimately led to more efficient and successful sales, particularly with larger customers. By solving common problems such as unclear messaging, poor onboarding, and hidden pricing, Keygen was able to eliminate the need for sales calls and instead use email and other asynchronous methods to close deals, including its first enterprise sale with an F1000 company.

Thoughts on a Month with Devin

A new AI company launched in March 2024 with $21 million in Series A funding, introducing Devin, billed as a fully autonomous software engineer that can chat with users, learn new technologies, and deploy applications. After thorough testing, the results were mixed: Devin completed some tasks, such as API integrations and building functional applications, but struggled with others, failing 14 of the 20 tasks attempted and falling short of its promise to revolutionize software development.

Test-driven development with an LLM for fun and profit

The author explores the potential of combining Test-Driven Development (TDD) with Large Language Models (LLMs) to improve software development, and presents a framework that automates the process of generating and testing code using LLMs. The author demonstrates this approach by using an LLM to generate a function to parse IPv4 and IPv6 addresses, and then iteratively refining the code through automated testing and human input to ensure its correctness and reliability.
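
The test-first loop described above can be sketched roughly as follows. This is not the author's framework: the test cases, `parse_ip`, and `failing_cases` are hypothetical names chosen for this example, and the candidate implementation simply wraps Python's standard `ipaddress` module in place of LLM-generated code.

```python
import ipaddress

# Test cases are written first: (input text, expected IP version or None if invalid).
CASES = [
    ("192.168.0.1", 4),
    ("::1", 6),
    ("2001:db8::ff00:42:8329", 6),
    ("256.1.1.1", None),
    ("not an address", None),
]


def parse_ip(text):
    """Candidate implementation; in the described workflow an LLM would write this."""
    try:
        return ipaddress.ip_address(text)
    except ValueError:
        return None


def failing_cases(func, cases):
    """Run every case and collect (input, expected, actual) for each mismatch."""
    failures = []
    for text, expected_version in cases:
        result = func(text)
        actual = result.version if result is not None else None
        if actual != expected_version:
            failures.append((text, expected_version, actual))
    return failures
```

In such a loop, a non-empty result from `failing_cases` would be fed back to the model as the prompt for the next revision, and an empty list would end the iteration.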

Framework for Artificial Intelligence Diffusion

The US Department of Commerce's Bureau of Industry and Security has published a rule establishing a framework for the diffusion of artificial intelligence, which is effective as of January 13, 2025. The rule, published in the Federal Register, outlines regulations and guidelines for the export control of artificial intelligence technologies, and is open for public comment through the Regulations.gov website.

Research

Lessons from Red Teaming 100 Generative AI Products

Microsoft's experience red teaming over 100 generative AI products has yielded eight key lessons, including the importance of understanding system capabilities and the human element in red teaming, as well as the limitations and challenges of securing AI systems. The company shares these insights and case studies to provide practical recommendations for aligning red teaming efforts with real-world risks and to highlight areas of the field that require further consideration.

How is Google using AI for internal code migrations?

There is growing interest in using large language models (LLMs) for bespoke purposes in software engineering, such as code migration, with several companies developing proprietary ML-based tools. Google's experience with using LLMs for code migration has shown that they can significantly reduce the time needed for migrations and lower barriers to starting and completing migration programs.

Mathematics of the daily word game Waffle

The daily word game Waffle involves complex combinatorics of permutations, which can make some games easy to solve while others are extremely challenging. A perfect solution to the game requires a specific arrangement of 11 orbits, including at least one of length 1, across the 21 squares of the game.
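
The link between orbits (cycles) and swap counts can be checked with a short sketch. It relies on the standard result that sorting a permutation of n items takes n minus the number of cycles swaps, and it assumes all squares are distinguishable, which real Waffle boards with repeated letters do not guarantee. The function names are illustrative.

```python
def count_cycles(perm):
    """Count cycles (orbits) of a permutation given as a list, fixed points included."""
    seen = [False] * len(perm)
    cycles = 0
    for start in range(len(perm)):
        if seen[start]:
            continue
        cycles += 1
        j = start
        # Walk the cycle containing `start`, marking each position visited.
        while not seen[j]:
            seen[j] = True
            j = perm[j]
    return cycles


def min_swaps(perm):
    """Minimum number of swaps needed to sort the permutation."""
    return len(perm) - count_cycles(perm)
```

Under these assumptions, 21 squares arranged in 11 orbits would need 21 - 11 = 10 swaps.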

Generating particle physics Lagrangians with transformers

Researchers used a transformer model to predict Lagrangians, which describe the interactions of fundamental particles, by treating them as complex linguistic expressions. The model achieved high accuracy (over 90%) in constructing Lagrangians for up to six matter fields, demonstrating its ability to internalize and generalize concepts such as group representations and conjugation operations.

Code

Show HN: DBOS TypeScript – Lightweight Durable Execution Built on Postgres

DBOS Transact is a lightweight TypeScript library that provides durable execution, allowing programs to persist their execution state and automatically resume from where they left off in case of interruptions or crashes, all backed by a Postgres database. The library is easy to use, requiring only Postgres as a dependency, and can be added to any TypeScript application, including Next.js apps, to provide reliable background jobs, cron scheduling, and queues.
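
To make the idea of durable execution concrete, here is a minimal conceptual sketch. It is not the DBOS Transact API: `DurableRunner` and `run_step` are hypothetical names, and SQLite (from Python's standard library) stands in for Postgres so the example is self-contained.

```python
import json
import sqlite3


class DurableRunner:
    """Hypothetical checkpointing helper; not part of DBOS."""

    def __init__(self, conn):
        self.conn = conn
        conn.execute(
            "CREATE TABLE IF NOT EXISTS steps ("
            "workflow_id TEXT, step TEXT, result TEXT, "
            "PRIMARY KEY (workflow_id, step))"
        )

    def run_step(self, workflow_id, step, func):
        # If this step already completed in an earlier run, reuse its stored result.
        row = self.conn.execute(
            "SELECT result FROM steps WHERE workflow_id = ? AND step = ?",
            (workflow_id, step),
        ).fetchone()
        if row is not None:
            return json.loads(row[0])
        # Otherwise execute the step and persist the result before returning.
        result = func()
        self.conn.execute(
            "INSERT INTO steps VALUES (?, ?, ?)",
            (workflow_id, step, json.dumps(result)),
        )
        self.conn.commit()
        return result
```

If the process crashes and the workflow is re-run with the same `workflow_id`, steps that already finished return their saved results instead of executing again, which is the "resume from where it left off" behavior the summary describes.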

4M Tokens Context Model

Show HN: Open-source framework to deploy personalized computer-use agents

TankWork is an open-source desktop agent framework that enables AI to perceive and control a computer through computer vision and system-level interactions, supporting voice and text command execution and real-time screen processing. The framework includes features such as direct computer control, computer vision analysis, voice interaction, and customizable agents with distinct personalities and specializations.

World's most advanced Python AI. Generating Python slop at 1B tokens/sec

SLOP is billed as a highly advanced AI model that generates Python code at an unprecedented speed of 1 billion tokens per second. It can be installed and used to generate full-stack applications; the example command creates a "fullstack_app.py" with 42,069 iterations.

Show HN: List of AI Agents

AI agents for computer use are autonomous programs that can reason, plan, and act within digital interfaces to accomplish user-specified goals independently. These agents combine perception, decision-making, and control capabilities to interact with computers and mobile devices, and a curated list of resources, including research papers, projects, and tools, is available to support their development and understanding.

2024 Differentiated.