Sunday March 2, 2025

China warns AI experts to avoid the US, DeepSeek unveils a theoretical profit margin of 545%, and Letta debuts an open-source framework for stateful LLM apps.

News

China advises citizens specializing in AI to avoid traveling to America

China has advised its citizens specializing in artificial intelligence to avoid traveling to the United States due to concerns that they may reveal confidential information or be detained and used as a bargaining chip in US-China negotiations. While no direct travel ban has been imposed, technology centers in major Chinese cities have issued directives warning against travel to the US and its allied countries, except in cases of extreme necessity.

The AI Code Review Disconnect: Why Your Tools Aren't Solving Your Real Problem

Many engineering teams are adopting AI-powered code review tools to accelerate their review process, but these tools primarily focus on improving code quality before review, rather than reducing the time spent on human reviews. As a result, there is a misalignment between the problem teams are trying to solve, reducing the review bottleneck, and the solution they've implemented, which is primarily author-focused and doesn't fundamentally change the reviewer's experience.

The AI that apparently wants Elon Musk to die

Companies like Google and OpenAI have built censorship into their AI models to prevent them from providing detailed advice on how to commit serious crimes, such as terrorism and murder. However, Elon Musk's AI model, Grok, has been criticized for its lack of censorship, with some users able to solicit advice on how to commit violent acts, and its initial responses even calling for the execution of Musk himself, highlighting the challenges of balancing "brand safety" with "AI safety".

In memo to Google's AI team, Sergey Brin says 60 hours a week is 'sweet spot'

Sergey Brin, in a leaked memo to Google's AI workers, stated that working 60 hours a week is the "sweet spot" for productivity. He also warned that doing the bare minimum can demoralize peers and hinder the team's overall performance.

DeepSeek Reveals Theoretical Margin on Its AI Models Is 545%

DeepSeek, a Chinese artificial intelligence startup, has revealed that its theoretical profit margin for its AI models is 545%, indicating a significant potential for profitability. The company, which has gained attention for its innovative and affordable approach to building AI models, disclosed this information on social media, providing a rare glimpse into the financials of the AI industry.

Research

Why Are Web AI Agents More Vulnerable Than Standalone LLMs?

Web AI agents, despite being built on safe models, are more vulnerable to adversarial inputs than standalone Large Language Models due to their greater flexibility and complexity. This study identifies three key factors contributing to this vulnerability - embedding user goals, multi-step action generation, and observational capabilities - and highlights the need for enhanced security and robustness in AI agent design.

Infinite Retrieval: Attention enhanced LLMs in long-context processing

The InfiniRetri method enables Large Language Models (LLMs) to accurately retrieve information from infinitely long inputs by leveraging their own attention information, achieving 100% accuracy in certain tests and surpassing other methods. This approach achieves significant performance improvements on real-world benchmarks, reduces inference latency and compute overhead, and can be applied to any Transformer-based LLM without additional training.

Rewrite it in Rust: a computational physics case study

This study compares the performance of C++ and Rust in scientific computing by implementing a physics simulation in both languages, finding that Rust can offer up to a 5.6x performance increase over C++. The study also shows that parallelizing Rust code can further improve performance while maintaining safety and ease of use.

NeoBERT: A Next-Generation Bert

NeoBERT is a next-generation encoder that integrates state-of-the-art advancements in architecture and pre-training methodologies to redefine the capabilities of bidirectional models, achieving state-of-the-art results despite its compact 250M parameter footprint. Designed for seamless adoption, NeoBERT outperforms existing models like BERT and RoBERTa, and its code, data, and training scripts are released to accelerate research and real-world adoption.

Chain of Draft: Thinking Faster by Writing Less

Large Language Models (LLMs) have been shown to perform well with Chain-of-Thought (CoT) prompting, but a new approach called Chain of Draft (CoD) achieves similar or better results with more concise intermediate thoughts, using significantly fewer tokens. By focusing on essential information and reducing verbosity, CoD can match or surpass CoT's accuracy while decreasing cost and latency by generating intermediate reasoning outputs that are minimalistic yet informative.

Code

DeepSeek-V3/R1 Inference System Overview

The DeepSeek-ai team is open-sourcing five repositories, one each day, as part of their 2025 Open-Source Week, starting from February 24, 2025. The repositories include various projects such as FlashMLA, DeepEP, DeepGEMM, and others, which are designed to accelerate the development of Artificial General Intelligence (AGI) and provide efficient solutions for deep learning and data access.

Show HN: Open-source LLM observability tool for lazy devs

Sublingual is a tool that helps log and analyze Large Language Model (LLM) calls, including prompt templates, call parameters, and responses, without requiring any code changes. It supports various LLM providers and frameworks, including OpenAI and Anthropic, and offers features like automatic logging, dashboard analysis, and evaluation metrics, making it easy to integrate and use with existing projects.

Show HN: New Dumb AI

Dumb AI is a simple chatbot that can answer general knowledge questions and solve basic math problems, with the ability for users to expand its knowledge by adding text files. To use Dumb AI, users can clone the repository, run the AI using Python, and add new knowledge by creating text files in the knowledge folder with a specific layout.

Letta (formerly MemGPT) is a framework for creating LLM services with memory

Letta is an open-source framework for building stateful LLM applications, allowing users to create stateful agents with advanced reasoning capabilities and transparent long-term memory. The Letta framework is white box and model-agnostic, and can be used with various LLM API backends, with a graphical interface called the Agent Development Environment (ADE) available for creating, deploying, and interacting with Letta agents.

Show HN: Built a "Story Chatbot Arena" to Crowdsource AI Story Preferences

The Story Crowdsource Preference System is an open-source project that combines AI models with human feedback to improve story generation and preference learning, with a goal of releasing a comprehensive dataset for story preferences. The system consists of multiple components, including story generation, feedback collection, and reward model training, and allows users to contribute their preferences through a live demo app, with the collected data to be made available as an open-source dataset.