Wednesday — March 26, 2025
Aggressive AI crawlers force global blocks amid infrastructure strain, researchers showcase cost-saving MoE models for AI scaling, and VGGT advances 3D scene inference with fast processing and low GPU usage.
News
Devs say AI crawlers dominate traffic, forcing blocks on entire countries
Aggressive AI crawlers from companies like Amazon and OpenAI are overwhelming open source infrastructure, causing downtime and increased bandwidth costs, with some projects seeing up to 97% of their traffic coming from these bots. In response, developers are fighting back with measures like custom-built proof-of-work challenge systems, but these solutions can also cause delays for legitimate users, highlighting the need for AI companies to be more respectful of open source resources.
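The proof-of-work defenses mentioned above work by making each new visitor spend a little CPU before the server answers, which is negligible for a human reader but costly for a crawler hammering thousands of pages. The sketch below is a minimal illustration of that idea, not the code of any particular project; the hash scheme and the DIFFICULTY_BITS setting are illustrative assumptions.

```python
import hashlib
import secrets

DIFFICULTY_BITS = 16  # illustrative; real systems tune difficulty per client behavior

def issue_challenge() -> str:
    """Server side: hand each new visitor a random challenge string."""
    return secrets.token_hex(16)

def solve(challenge: str) -> int:
    """Client side: brute-force a nonce whose hash has the required leading zero bits."""
    target = 1 << (256 - DIFFICULTY_BITS)
    nonce = 0
    while True:
        digest = hashlib.sha256(f"{challenge}:{nonce}".encode()).digest()
        if int.from_bytes(digest, "big") < target:
            return nonce
        nonce += 1

def verify(challenge: str, nonce: int) -> bool:
    """Server side: a single hash check verifies the client's work."""
    digest = hashlib.sha256(f"{challenge}:{nonce}".encode()).digest()
    return int.from_bytes(digest, "big") < (1 << (256 - DIFFICULTY_BITS))

challenge = issue_challenge()
print(verify(challenge, solve(challenge)))  # True
```

The asymmetry is the point: verification is one hash for the server, while solving costs thousands of hashes per request for a bot, which is also why the same mechanism can delay legitimate users on slow devices.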
An AI bubble threatens Silicon Valley, and all of us
OpenAI's plan to build a $500 billion supercluster, Project Stargate, in pursuit of artificial general intelligence (AGI) is threatened by DeepSeek R1, a generative AI model developed by DeepSeek, an AI lab backed by a Chinese hedge fund, which matches the performance of OpenAI's models at a significantly lower cost. The emergence of DeepSeek R1 poses an existential threat to OpenAI's business model and could burst the speculative bubble around generative AI, which Silicon Valley hype has inflated to the point that hundreds of billions of dollars are at stake.
AI bots are destroying Open Access
AI companies are using bots to scrape data from open-access websites, such as libraries and scholarly publishers, to train their large language models, causing significant strain on these websites and threatening their ability to provide access to quality information. The bots are aggressive, numerous, and mindless, using up server resources and forcing the websites to block entire countries or use commercial services to outsource bot-blocking, resulting in lost access to thousands of books and other content.
We chose LangGraph to build our coding agent
Qodo, a company building AI coding assistants, chose LangGraph as the framework for their coding agent due to its flexibility and ability to create opinionated workflows while maintaining adaptability. LangGraph's graph-based approach allows for the creation of a state machine that defines a workflow, with nodes representing discrete steps and edges defining transitions between them, making it easy to recalibrate the structure of the flows as new, more powerful models are released.
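To make the graph-as-state-machine idea concrete, here is a minimal sketch of a LangGraph workflow with plan, write, and review steps. The state fields and node functions are stand-ins invented for illustration, not Qodo's agent, and it assumes a recent langgraph release where StateGraph, START, and END are importable from langgraph.graph.

```python
from typing import TypedDict
from langgraph.graph import StateGraph, START, END

class AgentState(TypedDict):
    task: str
    plan: str
    code: str
    approved: bool

# Stand-in node functions; a real agent would call an LLM inside each step.
def plan_step(state: AgentState) -> dict:
    return {"plan": f"outline changes for: {state['task']}"}

def write_code(state: AgentState) -> dict:
    return {"code": "# generated patch would go here"}

def review(state: AgentState) -> dict:
    return {"approved": True}

builder = StateGraph(AgentState)
builder.add_node("plan", plan_step)
builder.add_node("write", write_code)
builder.add_node("review", review)
builder.add_edge(START, "plan")        # entry point
builder.add_edge("plan", "write")      # fixed transitions...
builder.add_edge("write", "review")
builder.add_conditional_edges(         # ...and a conditional one: loop until approved
    "review", lambda s: END if s["approved"] else "write"
)

graph = builder.compile()
result = graph.invoke({"task": "add a unit test", "plan": "", "code": "", "approved": False})
```

Because the workflow lives in the nodes and edges rather than in ad hoc control flow, swapping in a newer model or reordering steps means editing the graph, not rewriting the agent.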
Show HN: Feudle – A daily puzzle game built with AI
Feudle is a game where players guess the most popular responses to a given question, with correct answers appearing on the board along with their response count. The game is available to play daily, with options to share results, view statistics, and participate in a community through social media platforms like Instagram and Reddit.
Research
Scaling a 300B Mixture-of-Experts LING LLM Without Premium GPUs
The technical report presents Ling-Lite and Ling-Plus, two Mixture-of-Experts (MoE) large language models that achieve performance comparable to leading industry models while using fewer parameters and less compute. It proposes methods for making AI development more efficient and accessible, and demonstrates that large-scale MoE models can be trained effectively on lower-performance devices, yielding cost savings of approximately 20%.
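The trick MoE architectures exploit is that only a few "expert" feed-forward blocks run for each token, so total parameter count can grow much faster than per-token compute. The PyTorch sketch below is a generic top-k MoE layer for illustration only; the sizes, router, and routing loop are assumptions and are not Ling's actual architecture or training stack.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MoELayer(nn.Module):
    """Minimal top-k Mixture-of-Experts feed-forward layer (illustrative only)."""
    def __init__(self, d_model=512, d_ff=2048, n_experts=8, top_k=2):
        super().__init__()
        self.experts = nn.ModuleList([
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
            for _ in range(n_experts)
        ])
        self.router = nn.Linear(d_model, n_experts)
        self.top_k = top_k

    def forward(self, x):                              # x: (tokens, d_model)
        logits = self.router(x)                        # score every expert per token
        weights, idx = logits.topk(self.top_k, dim=-1)
        weights = F.softmax(weights, dim=-1)
        out = torch.zeros_like(x)
        for k in range(self.top_k):                    # only top-k experts run per token
            for e, expert in enumerate(self.experts):
                mask = idx[:, k] == e
                if mask.any():
                    out[mask] += weights[mask, k:k + 1] * expert(x[mask])
        return out

x = torch.randn(16, 512)
print(MoELayer()(x).shape)  # torch.Size([16, 512])
```

Since each token touches only top_k of the n_experts blocks, most of the model's weights sit idle on any given forward pass, which is what makes very large MoE models feasible on cheaper hardware.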
Communication-Efficient Language Model Training Scales Reliably and Robustly
The DiLoCo approach, which relaxes synchronization demands in distributed training, scales predictably and robustly with model size and, when well tuned, can outperform traditional data-parallel training. Its benefits include larger optimal batch sizes, improved downstream generalization, and better evaluation loss, making it a promising approach for training large language models under a fixed compute budget.
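"Relaxed synchronization" here means each worker runs many local optimizer steps and the replicas exchange parameters only occasionally; the averaged parameter change is then treated as a pseudo-gradient for an outer optimizer. Below is a single-process toy sketch of that two-level structure; the model, data, and hyperparameters are made up, and a real DiLoCo run executes the inner loops on separate accelerators in parallel.

```python
import copy
import torch
import torch.nn as nn

# Toy model and synthetic data; the point is the inner/outer optimization structure.
global_model = nn.Linear(10, 1)
outer_opt = torch.optim.SGD(global_model.parameters(), lr=0.7, momentum=0.9, nesterov=True)

H, WORKERS = 50, 4   # inner steps between syncs, number of replicas (illustrative)

for outer_step in range(10):
    deltas = [torch.zeros_like(p) for p in global_model.parameters()]
    for _ in range(WORKERS):                         # stands in for parallel workers
        local = copy.deepcopy(global_model)
        inner_opt = torch.optim.AdamW(local.parameters(), lr=1e-3)
        for _ in range(H):                           # H local steps, no communication
            x = torch.randn(32, 10)
            loss = ((local(x) - x.sum(dim=1, keepdim=True)) ** 2).mean()
            inner_opt.zero_grad(); loss.backward(); inner_opt.step()
        for d, g, l in zip(deltas, global_model.parameters(), local.parameters()):
            d += (g.detach() - l.detach()) / WORKERS  # averaged parameter change
    outer_opt.zero_grad()
    for p, d in zip(global_model.parameters(), deltas):
        p.grad = d                                   # treat the averaged delta as a gradient
    outer_opt.step()                                 # infrequent outer (synchronized) update
```

Communication happens once every H steps instead of every step, which is why the approach suits training runs spread across poorly connected hardware.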
Optimizing ML training with metagradient descent
Researchers have developed a gradient-based approach to optimize the training process of large-scale machine learning models, introducing an algorithm for efficiently calculating metagradients and a framework for effective optimization. This approach, called metagradient descent, achieves significant improvements in dataset selection, defends against data poisoning attacks, and automatically finds competitive learning rate schedules.
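A metagradient is the gradient of a final evaluation loss with respect to something that shaped training itself, obtained by differentiating through the training steps. The toy sketch below differentiates a small regression run with respect to its (log) learning rate by brute-force unrolling; it illustrates only the concept, whereas the paper's contribution is an algorithm that makes such gradients tractable at large scale and applies them to dataset selection and poisoning defenses.

```python
import torch

torch.manual_seed(0)
X, y = torch.randn(64, 5), torch.randn(64, 1)      # toy training set
Xv, yv = torch.randn(32, 5), torch.randn(32, 1)    # toy validation set

log_lr = torch.tensor(-3.0, requires_grad=True)    # meta-parameter: the log learning rate

def train_then_eval(log_lr, steps=20):
    w = torch.zeros(5, 1, requires_grad=True)
    lr = log_lr.exp()
    for _ in range(steps):
        loss = ((X @ w - y) ** 2).mean()
        (g,) = torch.autograd.grad(loss, w, create_graph=True)
        w = w - lr * g                   # keep each update inside the autograd graph
    return ((Xv @ w - yv) ** 2).mean()   # evaluation loss after training

for meta_step in range(25):
    meta_loss = train_then_eval(log_lr)
    (meta_grad,) = torch.autograd.grad(meta_loss, log_lr)   # the metagradient
    with torch.no_grad():
        log_lr -= 0.1 * meta_grad                           # metagradient descent step
```

Naive unrolling like this blows up in memory and compute for real models, which is exactly the bottleneck the paper's efficient metagradient computation targets.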
Causal Emergence 2.0: Quantifying emergent complexity
A new theory of emergence has been introduced, which treats the different scales of a complex system as slices of a higher-dimensional object, allowing for the identification of unique causal contributions at each scale. The theory provides a framework for understanding the causal workings of complex systems across multiple scales, and introduces a measure of emergent complexity that quantifies the distribution of causal workings across a system's hierarchy of scales.
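For background, the earlier causal-emergence framework this work builds on scores each scale with effective information: the mutual information between a uniformly intervened-on present state and the resulting next state. The NumPy sketch below computes it for an invented micro-scale transition matrix and a coarse-grained macro description of the same system; treat it as intuition for the original measure, since the 2.0 theory generalizes beyond it.

```python
import numpy as np

def effective_information(T):
    """EI of a transition matrix: mutual information between a uniformly
    intervened-on present state and the resulting next state."""
    T = np.asarray(T, dtype=float)
    avg = T.mean(axis=0)                              # effect distribution under uniform do()
    with np.errstate(divide="ignore", invalid="ignore"):
        kl = np.where(T > 0, T * np.log2(T / avg), 0.0).sum(axis=1)
    return kl.mean()

# Micro scale: three noisy, interchangeable states plus one isolated state.
micro = np.array([
    [1/3, 1/3, 1/3, 0.0],
    [1/3, 1/3, 1/3, 0.0],
    [1/3, 1/3, 1/3, 0.0],
    [0.0, 0.0, 0.0, 1.0],
])

# Macro scale: group the three noisy states into one; the dynamics become deterministic.
macro = np.array([
    [1.0, 0.0],
    [0.0, 1.0],
])

print(effective_information(micro))  # ≈ 0.81 bits
print(effective_information(macro))  # 1.0 bit — more causal power at the coarser scale
```

When the macro description carries more effective information than the micro one, the framework calls that causal emergence; the new theory asks how such causal workings are distributed across all scales at once.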
SeeLe: A Unified Acceleration Framework for Real-Time Gaussian Splatting
The 3D Gaussian Splatting (3DGS) rendering technique is hindered by limited hardware resources on mobile platforms, but the proposed SeeLe framework accelerates the 3DGS pipeline with two GPU-oriented techniques: hybrid preprocessing and contribution-aware rasterization. Together they achieve a 2.6× speedup and 32.3% model reduction while maintaining superior rendering quality, making 3DGS practical on resource-constrained mobile devices.
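SeeLe's optimizations live inside the GPU rasterizer, but the intuition behind "contribution-aware" processing shows up already in the standard front-to-back compositing loop 3DGS uses: once accumulated transmittance is low, further Gaussians add almost nothing to the pixel and can be skipped. The sketch below illustrates only that intuition; it is not SeeLe's hybrid preprocessing or its rasterizer, and the thresholds are arbitrary.

```python
import numpy as np

def composite_pixel(colors, alphas, min_contribution=1e-3):
    """Front-to-back alpha compositing of one pixel's depth-sorted Gaussians.
    Splats whose contribution (transmittance * alpha) falls below
    `min_contribution` are skipped, and the loop stops once the pixel is
    effectively opaque -- the intuition behind contribution-aware pruning."""
    out = np.zeros(3)
    transmittance = 1.0
    for color, alpha in zip(colors, alphas):
        contribution = transmittance * alpha
        if contribution < min_contribution:
            continue                       # barely visible splat: skipping it saves work
        out += contribution * color
        transmittance *= (1.0 - alpha)
        if transmittance < 1e-4:           # early termination: nothing behind is visible
            break
    return out

colors = np.random.rand(100, 3)
alphas = np.random.rand(100) * 0.3
print(composite_pixel(colors, alphas))
```

On a GPU the same idea has to be applied per tile and per thread block, which is where the engineering effort of a framework like SeeLe actually goes.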
Code
VGGT: Visual Geometry Grounded Transformer
VGGT (Visual Geometry Grounded Transformer) is a neural network that can infer 3D attributes of a scene, including camera parameters, depth maps, and 3D point tracks, from one or multiple views of the scene. The model can be used for various tasks, including 3D reconstruction, tracking, and single-view reconstruction, and has shown competitive or better results compared to state-of-the-art methods, with fast processing times and reasonable GPU memory usage.
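Typical usage is a single forward pass over one or more views, from which camera, depth, and point predictions are read out. The import paths, helper names, and checkpoint id in the snippet below are assumptions about the repository's published interface rather than verified API, and the image paths are placeholders; treat it as a sketch of the workflow, not the exact calls.

```python
import torch
# Module paths and names below are assumptions based on the project's documentation.
from vggt.models.vggt import VGGT
from vggt.utils.load_fn import load_and_preprocess_images

device = "cuda" if torch.cuda.is_available() else "cpu"
model = VGGT.from_pretrained("facebook/VGGT-1B").to(device)   # checkpoint id assumed

images = load_and_preprocess_images(
    ["scene/view_01.png", "scene/view_02.png", "scene/view_03.png"]  # placeholder paths
).to(device)

with torch.no_grad():
    predictions = model(images)   # expected to include cameras, depth maps, point tracks
```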
Open source AI agent helper to let it SEE what it's doing
Vibe-Eyes is an MCP server that enables Large Language Models (LLMs) to "see" what's happening in browser-based games and applications by capturing and vectorizing canvas content and debug information. The system uses a client-server architecture, where a lightweight browser client sends canvas snapshots and debug data to a Node.js server, which then makes the information available to LLMs through the Model Context Protocol (MCP).
Show HN: Surf – open-source Web Access for LLMs
SURF is a self-deployable API that bridges the gap between Large Language Models and the web, enabling them to search the web and process content with minimal setup. It features powerful HTML processing, intelligent web search with multiple search providers, and is designed for easy integration with LLMs, offering flexible output formats and customizable result counts.
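Once deployed, an API like this is typically consumed over plain HTTP by whatever orchestrates the LLM. The snippet below is purely hypothetical: the base URL, the /search endpoint, and the parameter names are invented for illustration, since the actual routes and options are defined by the SURF deployment itself.

```python
import requests

# Hypothetical endpoint and parameters -- the real routes come from the SURF deployment.
BASE_URL = "http://localhost:8000"

resp = requests.post(
    f"{BASE_URL}/search",
    json={
        "query": "latest LangGraph release notes",
        "max_results": 5,          # illustrative: a customizable result count
        "format": "markdown",      # illustrative: one of the flexible output formats
    },
    timeout=30,
)
resp.raise_for_status()
for item in resp.json().get("results", []):
    print(item)
```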
AI-powered Notetaker for doctors? Any feedback?
Notetaker AI is a transcription and summarization tool for professionals that turns recordings into concise, structured notes. It offers smart transcription, multiple summary formats, flexible deployment, GPU acceleration, and customizable configuration, and can be set up and used through an API, a Gradio UI, or Docker.
Show HN: I built a Reddit MCP server for faster and better research in Claude
This is a Reddit MCP server that allows users to browse, search, and read Reddit content, with features like detailed parameter validation and rate limiting protection. The server is currently read-only, but users can upvote an issue or send a pull request to add write features, and it can be installed into Claude Desktop or used with other MCP clients.