Tuesday — March 25, 2025
Springer faces backlash over AI-generated content in a $160 book, Google's LLMs accelerate code migration efficiency, and LatencyAI emerges as a performance engineer for optimizing Python code.
News
DeepSeek-V3-0324
DeepSeek-V3-0324 is a text-generation model available on the Hugging Face platform, with 685 billion parameters and support for multiple tensor types and inference providers. It is used in 26 Spaces and has several finetunes and quantizations available, though its README is empty and the Hub reports no downloads for the past month.
$160 book by Springer about advanced nanovaccines contains AI-generated text
John Mark Ockerbloom discovered that a book published by Springer, "Advanced Nanovaccines for Cancer Immunotherapy", appears to be of poor quality and may have been generated by an AI language model. The book's content is questionable, and its $160 price tag has raised concerns about the publisher's priorities. Ockerbloom and others are discussing the issue on Mastodon, criticizing the lack of quality control and suggesting that the book's publication may be a sign of a larger problem in academic publishing.
Why we chose LangGraph to build our coding agent
Qodo, a company building AI coding assistants, chose LangGraph as the framework for their coding agent due to its flexibility and ability to create opinionated workflows while maintaining adaptability. LangGraph's graph-based approach allows for the creation of a state machine for the agent, with nodes and edges defining the workflow, and its declarative API simplifies the development process, making it easy to reason about and modify the code.
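The graph-as-state-machine idea can be sketched in a few lines of plain Python: nodes are functions that read and update a shared state, and edges (possibly conditional) name the next node to run. This is an illustrative stand-in, not LangGraph's actual API, which uses StateGraph, add_node, add_edge, and a compile step.

```python
# Minimal sketch of an agent workflow as a state machine: nodes transform a
# shared state dict, edges pick the next node. Illustrative only; LangGraph's
# real API differs.
END = "__end__"

class AgentGraph:
    def __init__(self):
        self.nodes = {}   # name -> function(state) -> state
        self.edges = {}   # name -> next node name, or a router function

    def add_node(self, name, fn):
        self.nodes[name] = fn

    def add_edge(self, src, dst):
        self.edges[src] = dst

    def run(self, entry, state):
        current = entry
        while current != END:
            state = self.nodes[current](state)
            nxt = self.edges[current]
            current = nxt(state) if callable(nxt) else nxt
        return state

# A toy coding-agent loop: plan -> generate -> review, looping back to
# generate until the (stubbed) review passes.
graph = AgentGraph()
graph.add_node("plan", lambda s: {**s, "plan": f"edit {s['task']}"})
graph.add_node("generate", lambda s: {**s, "attempts": s.get("attempts", 0) + 1})
graph.add_node("review", lambda s: {**s, "ok": s["attempts"] >= 2})
graph.add_edge("plan", "generate")
graph.add_edge("generate", "review")
graph.add_edge("review", lambda s: END if s["ok"] else "generate")

result = graph.run("plan", {"task": "fix bug"})
print(result["attempts"])  # loops until review passes, so 2
```

The conditional edge out of "review" is the part that makes the workflow opinionated yet adaptable: the loop structure is fixed, but the routing decision is made at runtime from the state.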
Sail's MCP Server: Spark Analytics for LLM Agents
The 0.2.3 release of Sail includes a server for the Model Context Protocol (MCP), letting LLM agents interact directly with the compute engine. By pairing an LLM's flexibility with precise data processing, users can hold interactive, context-aware conversations with their data systems, deriving insights and making decisions more quickly, with the potential to deliver value across entire organizations.
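MCP is built on JSON-RPC 2.0, and a tool invocation travels as a tools/call request. The sketch below shows that message shape; the tool name "execute_query" and its arguments are hypothetical, since Sail's server defines its own tool schema.

```python
import json

# Sketch of the JSON-RPC 2.0 message an MCP client sends to invoke a tool.
# The tool name and arguments below are assumptions for illustration, not
# Sail's actual tool schema.
def make_tool_call(request_id, tool, arguments):
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }

req = make_tool_call(1, "execute_query",
                     {"sql": "SELECT count(*) FROM events"})
print(json.dumps(req, indent=2))
```

An agent framework serializes messages like this over the server's transport (stdio or HTTP) and feeds the tool result back into the model's context.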
The Missing Data Infrastructure for Physical AI
Rerun has raised $17 million in seed funding to build a data stack for Physical AI, which has the potential to transform the economy by leveraging computer vision and robotics. The company aims to create a new database and cloud data platform that will help teams run more experiments faster, building on its existing open-source framework for logging and visualizing multimodal data that has been adopted by companies like Meta, Google, and Hugging Face.
Research
Bold: Boolean Logic Deep Learning
This paper proposes a novel approach to deep learning by introducing Boolean weights and inputs, allowing for efficient training in the Boolean domain using Boolean logic. The approach achieves state-of-the-art results in various tasks, including image classification and natural language understanding, while significantly reducing energy consumption during both training and inference.
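The energy savings come from replacing floating-point multiply-accumulate with bit operations. A toy "Boolean neuron" makes the idea concrete: the multiply becomes XNOR (1 when input and weight bits agree) and the accumulate becomes a popcount plus a threshold. This illustrates the general binary-arithmetic idea, not the paper's specific training rule.

```python
# Toy Boolean neuron: inputs and weights are bits, XNOR replaces multiply,
# popcount-plus-threshold replaces the accumulator. Illustrative sketch of
# the general idea; the paper's training method in the Boolean domain is
# more involved.
def boolean_neuron(inputs, weights):
    agree = sum(1 for x, w in zip(inputs, weights) if x == w)  # XNOR + popcount
    return 1 if agree * 2 >= len(inputs) else 0                # majority threshold

print(boolean_neuron([1, 0, 1, 1], [1, 1, 1, 0]))  # 2 of 4 agree -> fires 1
```

On hardware, the XNOR and popcount steps map to single wide instructions over packed bit vectors, which is where the inference-time energy reduction comes from.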
Sentence-Level Reward Model Can Better Aligning LLM from Human Preference
The effectiveness of aligning large language models (LLMs) with human preferences relies on the performance of reward models, which assign scores to generated responses. This paper proposes an intermediate-grained reward model that assigns scores to individual sentences within a response, outperforming traditional response-level reward models and achieving state-of-the-art results on common benchmarks.
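The intermediate granularity is easy to picture: split a response into sentences, score each one, then aggregate. The scorer below is a stub standing in for the learned reward model, and the aggregation choice is illustrative, not the paper's exact method.

```python
import re

# Sketch of sentence-level rewards: score each sentence separately, then
# aggregate into a response-level signal. score_fn is a stub; in the paper
# it is a trained reward model.
def sentence_scores(response, score_fn):
    sentences = re.split(r"(?<=[.!?])\s+", response.strip())
    return [(s, score_fn(s)) for s in sentences if s]

def response_reward(scored, agg=min):
    # Aggregating with min penalizes a single bad sentence that a
    # response-level score might average away.
    return agg(score for _, score in scored)

stub = lambda s: 0.2 if "unsafe" in s else 0.9
scored = sentence_scores("Here is the answer. This step is unsafe.", stub)
print(response_reward(scored))  # the bad sentence dominates: 0.2
```

A response-level model sees only the aggregate; scoring per sentence localizes exactly which span earned the penalty, which is the extra signal the paper exploits.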
How is Google using AI for internal code migrations?
There has been a growing interest in using large language models (LLMs) in software engineering, particularly for bespoke purposes such as code migration. Google's experience with using LLMs for code migration has shown that it can significantly reduce the time needed for migrations and lower barriers to starting and completing migration programs.
Every Flop Counts: Scaling a 300B LLM Without Premium GPUs
The report presents two large-scale Mixture of Experts (MoE) models, Ling-Lite and Ling-Plus, which achieve comparable performance to industry benchmarks despite having fewer parameters and requiring lower computational resources. The researchers propose innovative methods to optimize model architecture, training processes, and evaluation efficiency, demonstrating that large MoE models can be trained on lower-performance devices with significant cost savings, up to 20% reduction in computing costs.
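The cost savings of MoE rest on sparse routing: a gate scores every expert per token, only the top-k experts actually run, and their outputs are mixed by the renormalized gate weights. The sketch below shows that routing step with toy experts; the numbers and k are illustrative, not the Ling models' configuration.

```python
import math

# Minimal sketch of Mixture-of-Experts top-k routing. Only k experts
# execute per token, which is why MoE compute stays low relative to the
# total parameter count. Toy values; not the Ling models' configuration.
def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def moe_forward(token, experts, gate_logits, k=2):
    probs = softmax(gate_logits)
    top = sorted(range(len(experts)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in top)
    # Combine only the selected experts, renormalizing their gate weights.
    return sum(probs[i] / norm * experts[i](token) for i in top)

experts = [lambda x: x + 1, lambda x: x * 2, lambda x: x - 3]
out = moe_forward(10.0, experts, gate_logits=[2.0, 1.0, -1.0], k=2)
print(round(out, 3))
```

Because the third expert's weights are never touched for this token, a 300B-parameter model can run a forward pass with the compute of a much smaller dense model, which is what makes training on lower-performance devices plausible.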
Can Large Vision Language Models Read Maps Like a Human?
MapBench is a dataset of over 1,600 pixel-based map path-finding problems designed to test the navigation capabilities of large vision-language models (LVLMs) in outdoor scenarios. The dataset challenges state-of-the-art LVLMs, revealing limitations in their spatial reasoning and decision-making, and is available along with its code for further evaluation and research.
Code
Open-source AI agent helper to let it SEE what it's doing
Vibe-Eyes is an MCP server that enables Large Language Models (LLMs) to "see" what's happening in browser-based games and applications by capturing and vectorizing canvas content and debug information. The system uses a client-server architecture, where a lightweight browser client sends canvas snapshots and debug data to a Node.js server, which then makes the information available to LLMs through the Model Context Protocol (MCP).
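The client side of that architecture amounts to packaging a canvas snapshot with recent debug output and shipping it to the server. The sketch below shows a plausible payload shape; the field names are assumptions, not Vibe-Eyes' actual wire format.

```python
import base64, json

# Sketch of a client-to-server snapshot payload: a canvas capture plus
# recent console output, which the server would expose to LLMs over MCP.
# Field names are hypothetical, not Vibe-Eyes' real protocol.
def build_snapshot(canvas_png_bytes, console_lines):
    return {
        "type": "snapshot",
        "image": base64.b64encode(canvas_png_bytes).decode("ascii"),
        "console": console_lines[-50:],  # keep only recent debug lines
    }

payload = build_snapshot(b"\x89PNG...", ["frame 1 ok", "player x=42"])
print(json.dumps(payload)[:60])
```

Base64-encoding the image keeps the whole payload JSON-serializable, which matters because MCP messages are JSON-RPC.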
LatencyAI – AI Performance Engineer
LatencyAI is an experimental AI agent that optimizes Python code for better performance using techniques such as GPU offloading and latency hiding. To use LatencyAI, users can install the library, set an API key, and run the optimization tool on their script, which will iteratively profile, optimize, and benchmark the code to produce an optimized version.
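The profile, optimize, benchmark cycle can be sketched as a simple accept-if-faster loop. The optimizer below is a stub standing in for the LLM rewrite step; this is an illustration of the loop's shape, not LatencyAI's implementation.

```python
import time

# Sketch of an iterative optimize loop: benchmark a candidate rewrite and
# keep it only if it measures faster. The propose() step is a stub for the
# AI-driven rewrite; this is not LatencyAI's actual code.
def benchmark(fn, arg, reps=5):
    start = time.perf_counter()
    for _ in range(reps):
        fn(arg)
    return time.perf_counter() - start

def optimize_loop(fn, arg, propose, rounds=3):
    best, best_t = fn, benchmark(fn, arg)
    for _ in range(rounds):
        candidate = propose(best)      # stand-in for the LLM rewrite
        t = benchmark(candidate, arg)
        if t < best_t:                 # accept only measured wins
            best, best_t = candidate, t
    return best

slow = lambda n: sum([i * i for i in range(n)])            # builds a list
fast_version = lambda f: (lambda n: sum(i * i for i in range(n)))  # generator
best = optimize_loop(slow, 100_000, fast_version)
print(best(10))  # either variant computes 285
```

Gating acceptance on a fresh benchmark rather than trusting the rewrite is the key safeguard: an AI-proposed "optimization" is only kept if it demonstrably wins.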
AI code review comments for GitHub pull requests
AI-PR-Reviewer is a tool that automatically analyzes code in GitHub pull request diffs and generates insightful code review comments using various AI models, including OpenAI, Azure OpenAI, and Anthropic Claude. It can be easily integrated with GitHub CI workflows and allows for customizable configuration through environment variables and a configuration file.
Show HN: HolmesGPT – OSS AI Agent for On-Call and Observability
HolmesGPT is an AI-powered tool that helps respond to alerts faster by automatically fetching logs, traces, and metrics, determining if issues are application or infrastructure related, and finding upstream root causes. It connects AI models with live observability data and organizational knowledge, and can be integrated with various tools such as Kubernetes, Grafana, and Prometheus, to investigate alerts and provide insights.
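The triage flow described (pull related evidence, then classify the layer) can be sketched with stub data sources and rules; HolmesGPT drives the real version with an LLM and live integrations such as Kubernetes, Prometheus, and Grafana.

```python
# Sketch of alert triage: gather logs mentioning the alerting service, then
# classify the likely layer. Data sources and rules here are stubs; the
# real tool uses an LLM over live observability data.
def classify_alert(alert, logs, node_pressure):
    evidence = [l for l in logs if alert["service"] in l]
    if node_pressure > 0.9:
        return "infrastructure", evidence
    if any("OOMKilled" in l or "exception" in l.lower() for l in evidence):
        return "application", evidence
    return "unknown", evidence

verdict, ev = classify_alert(
    {"service": "checkout", "name": "HighErrorRate"},
    ["checkout: Exception in payment handler", "search: ok"],
    node_pressure=0.4,
)
print(verdict)  # application
```

Returning the evidence alongside the verdict mirrors what makes such tools useful on-call: the responder sees why the classification was made, not just the label.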
Show HN: Browserbase MCP Server – Automate Browsers at Scale in Your Own Cloud
The Model Context Protocol (MCP) is an open protocol that enables seamless integration between LLM applications and external data sources and tools, providing a standardized way to connect LLMs with the context they need. The Browserbase MCP Server offers cloud browser automation capabilities, allowing LLMs to interact with web pages, take screenshots, and execute JavaScript in a cloud browser environment using tools like Browserbase, Puppeteer, and Stagehand.