Monday — April 21, 2025

Gemma 3 brings AI to consumer GPUs with optimized QAT models, you can now run LLMs inside a PDF file with llm.pdf, and new research introduces a linearity theorem for improving LLM quantization accuracy.

News

Gemma 3 QAT Models: Bringing AI to Consumer GPUs

The latest generation of open models, Gemma 3, has been optimized with Quantization-Aware Training (QAT) to reduce memory requirements while maintaining high quality, allowing it to run on consumer-grade GPUs like the NVIDIA RTX 3090. This optimization enables users to run powerful models like Gemma 3 27B locally, with dramatic reductions in VRAM requirements, such as from 54 GB to 14.1 GB, making it more accessible to a wider range of devices.
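
The VRAM figures above can be roughly sanity-checked with back-of-the-envelope arithmetic. The sketch below assumes a nominal 27 billion parameters and counts only the weights (no KV cache or activations); the function name is illustrative, not part of any Gemma tooling. A 16-bit format gives 54 GB, matching the quoted figure, while a naive 4-bit estimate gives 13.5 GB. The gap to the quoted 14.1 GB is not explained in the summary; it may reflect quantization metadata or parts of the model kept at higher precision, but that is an assumption.

```python
def weight_memory_gb(num_params: float, bits_per_weight: float) -> float:
    """Approximate memory for model weights alone, in GB (10^9 bytes)."""
    return num_params * bits_per_weight / 8 / 1e9


# Assumption: a nominal 27e9 parameters for Gemma 3 27B.
params_27b = 27e9

bf16_gb = weight_memory_gb(params_27b, 16)  # 16-bit weights
int4_gb = weight_memory_gb(params_27b, 4)   # naive 4-bit weights, no overhead

print(f"16-bit weights: {bf16_gb:.1f} GB")
print(f"4-bit weights:  {int4_gb:.1f} GB")
```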

Show HN: JuryNow – Get an anonymous instant verdict from 12 real people

JuryNow lets users put a question with two possible answers to a jury of 12 real people and receive an anonymous verdict. After signing up or logging in, users can pose their own questions and also serve on the jury for questions from others.

The skill of the future is not 'AI', but 'Focus'

Large Language Models (LLMs) can be powerful tools for engineers, automating tasks and generating code, but they should be used wisely due to their limitations, such as hallucinations, inconsistencies, and biases. Over-reliance on LLMs can lead to engineers losing their problem-solving skills, particularly for novel challenges, and a balance is needed between using LLMs and understanding the reasoning behind their solutions to maintain human ingenuity and mastery of algorithms.

If you use AI to write me that note, don't expect me to read it

The author, a journalist, laments the increasing use of AI-generated content on platforms like LinkedIn, where users are relying on tools like ChatGPT to create posts that summarize others' work without adding much original thought. The author finds this trend not only unoriginal but also disrespectful, as it implies that the reader's time and attention are not valued, and that the effort to craft thoughtful, human-written content is no longer necessary.

The AI skeptic's guide to AI collaboration

Many people are skeptical of AI, seeing it as a threat to human work and creativity, but this skepticism often stems from a misunderstanding of what AI is capable of. AI is not a replacement for human thought and insight, but rather a collaborative tool that can assist and augment human work, and its true potential can be realized when used thoughtfully and in conjunction with human expertise and judgment.

Research

Pushing the Limits of LLM Quantization via the Linearity Theorem

Quantizing large language models has become a standard way to reduce their memory and computational costs, but existing methods lack theoretical justification and may use sub-optimal metrics. This paper presents a new approach, including a "linearity theorem" and two novel applications, which enables improved accuracy-compression trade-offs and outperforms prior data-free approaches on various language models.

Measuring Global Migration Flows Using Online Data

Researchers used Facebook user data to estimate country-to-country migration flows, finding that 39.1 million people migrated internationally in 2022, with significant changes in migration patterns during the COVID-19 pandemic and in response to global events like the Russian invasion of Ukraine. The estimates, which closely match existing high-quality measures of migration, will be made publicly available to support research and policy interventions, offering a more comprehensive and timely understanding of global migration trends.

Nudge Theory: Users Weaken Their Behavior Change Regimen over Time (2021)

Users of the HabitLab platform, which helps reduce online browsing time, tend to start with challenging interventions but over time opt for easier ones. Despite this, many users still intend to return to the more challenging interventions, as evidenced by their repeated requests to reassess the difficulty level on their next visit rather than saving their preference for easier options.

Measuring Virality

Social media posts can spread quickly and potentially threaten public dialogue if they contain misleading content, making early detection crucial. Researchers have proposed a new metric to identify viral tweets, finding that a tweet is more likely to be viral if the ratio of retweets to its author's followers exceeds a threshold of 2.16, and also developed a transformers-based model to detect viral tweets with an F1 score of 0.79.
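
As a minimal sketch of the threshold rule described above, the check reduces to a single ratio comparison. The function names are hypothetical, and the paper's exact definition (for example, the time window over which retweets are counted) may differ from this simplified version.

```python
# Threshold quoted in the summary above; everything else here is illustrative.
VIRALITY_THRESHOLD = 2.16


def retweet_follower_ratio(retweets: int, followers: int) -> float:
    """Ratio of a tweet's retweets to its author's follower count."""
    if followers <= 0:
        raise ValueError("followers must be positive")
    return retweets / followers


def is_likely_viral(retweets: int, followers: int,
                    threshold: float = VIRALITY_THRESHOLD) -> bool:
    """Flag a tweet whose retweet-to-follower ratio exceeds the threshold."""
    return retweet_follower_ratio(retweets, followers) > threshold


print(is_likely_viral(5000, 2000))  # ratio 2.5
print(is_likely_viral(300, 2000))   # ratio 0.15
```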

Impact of Triangular-Toothed Gears on Functionality of the Antikythera Mechanism

The Antikythera Mechanism's performance is affected by its triangular tooth profiles and manufacturing inaccuracies, with the latter significantly increasing the likelihood of gear jamming or disengagement. Simulations suggest that the reported manufacturing errors would have made the mechanism non-functional, leading to questions about the accuracy of the reported error values and the possibility that the actual errors were smaller.

Code

Awesome-consensus: A bibliography for protocol design

The literature on consensus is vast and explores many aspects of Byzantine agreement, including foundational concepts, communication complexity, and network models. Recent research has focused on optimizing communication cost, matching quadratic lower bounds on message complexity, and developing protocols that adapt to different network conditions and fault scenarios.

Llm.pdf: Run LLMs inside a PDF file

The llm.pdf project demonstrates that a Large Language Model (LLM) can run entirely within a PDF file, by compiling the inference code to asm.js with Emscripten and embedding the model in the PDF as base64. A Python script lets users create their own PDFs with custom LLMs, and the project provides guidelines for choosing compatible models and optimizing performance.
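
To illustrate only the base64 embedding step, the sketch below encodes a blob of bytes as a JavaScript string literal and decodes it again. This is not the project's actual script: the function names, the `MODEL_B64` variable, and the stand-in bytes are all hypothetical, and a real model file would be far larger.

```python
import base64


def embed_as_js_string(data: bytes, var_name: str = "MODEL_B64") -> str:
    """Return a JavaScript declaration holding `data` as a base64 string."""
    encoded = base64.b64encode(data).decode("ascii")
    return f'var {var_name} = "{encoded}";'


def decode_embedded(encoded: str) -> bytes:
    """Recover the original bytes from a base64 string."""
    return base64.b64decode(encoded)


# Stand-in for a model file (hypothetical data, 256 bytes).
fake_model = bytes(range(256))
snippet = embed_as_js_string(fake_model)

# Pull the base64 payload back out of the declaration and round-trip it.
payload = snippet.split('"')[1]
assert decode_embedded(payload) == fake_model
print(len(fake_model), "bytes ->", len(payload), "base64 characters")
```

Base64 grows the data by roughly a third, which is one reason smaller models are easier to embed this way.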

Show HN: @mcp-it/fastify – Auto-generate MCP tools from Fastify APIs

The @mcp-it/fastify plugin allows you to expose your Fastify API routes as tools consumable by Model Context Protocol (MCP) clients, enabling AI assistants to interact with your API directly. The plugin automatically discovers your Fastify routes, leverages Fastify's schema system to generate complete tool definitions, and supports features like automatic route discovery, schema utilization, and multiple transports.

Show HN: LLM Shell Tools – AI-powered command line helpers (open source + local)

The LLM Shell Tools collection enhances the command line experience by utilizing Large Language Models to provide helpful suggestions and improved commit messages. The tools include a "command not found" hook and an enhanced git commit command, and can be installed and configured using a series of scripts and environment variable settings, with customization options available for model and parameter settings.

Cyckle-AI: An intuitive local AI chatbot

Cyckle is a local AI chatbot that is compiled from source using a makefile, with dependencies installed through a script. It uses the phi3 model by default and has options to use other models. The program is controlled through commands for modifying tokens, swapping models, displaying information, and quitting. It requires a processor with AVX2 support and at least 4 GB of RAM.