Tuesday — February 4, 2025

EU bans AI with 'unacceptable risk,' OpenEuroLLM pushes for transparent AI in Europe, and Klarity offers insights into LLM output uncertainty.

News

Anthropic: "Applicants should not use AI assistants"

Anthropic's job application form asks applicants not to use AI assistants during the application process, because the company wants to evaluate each applicant's personal interest and unassisted communication skills. The application also includes an essay question, "Why do you want to work at Anthropic?", which is valued highly and is expected to be answered in 200-400 words.

AI systems with 'unacceptable risk' are now banned in the EU

The European Union has banned the use of AI systems deemed to pose "unacceptable risk" or harm, with regulators able to fine companies up to €35 million or 7% of their annual revenue for non-compliance. The ban, which is part of the EU's AI Act, prohibits the use of AI for certain activities such as social scoring, manipulating people's decisions, and exploiting vulnerabilities, with some exceptions allowed for law enforcement and medical purposes.

Open Euro LLM: Open LLMs for Transparent AI in Europe

The OpenEuroLLM project is a collaborative effort between 20 leading European research institutions, companies, and EuroHPC centres to develop next-generation open-source language models, aiming to advance European AI capabilities and strengthen competitiveness in the global market. The project will create transparent and compliant open-source models that preserve linguistic and cultural diversity, enabling European companies to develop high-quality products and services in the era of AI, while aligning with European values and regulatory frameworks.

Google removed 2.36M apps from Google Play using AI threat detection

In a Google Online Security Blog post dated January 29, 2025, titled "How we kept the Google Play & Android app ecosystems safe in 2024", Google reviews its app-security work over the past year. The full article gives the details behind the 2.36 million figure in the headline.

First place in Tetris 99 using computer vision, classical AI, a lot of free time

The authors created a program, dubbed "Jeff," to play Tetris 99, an online multiplayer game for the Nintendo Switch, using computer vision to determine the state of the board and a depth-first search algorithm to find the best next block placement. Jeff was able to consistently place in the top 15 players and occasionally achieve first place, with its performance documented in a video showcasing its autonomous gameplay.
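
The depth-first search over block placements can be illustrated with a deliberately simplified sketch. This is not Jeff's code: the board is reduced to a list of column heights, pieces are plain rectangles rather than tetrominoes, and the scoring weights in `evaluate` are hypothetical choices made only for illustration.

```python
def place(heights, width, height, x):
    """Drop a width x height rectangle with its left edge at column x.
    Returns the new column heights and the number of empty cells trapped beneath it."""
    base = max(heights[x:x + width])
    holes = sum(base - h for h in heights[x:x + width])
    new_heights = list(heights)
    for col in range(x, x + width):
        new_heights[col] = base + height
    return new_heights, holes

def evaluate(heights, holes):
    """Hypothetical heuristic: penalize tall stacks, uneven surfaces, and holes."""
    bumpiness = sum(abs(a - b) for a, b in zip(heights, heights[1:]))
    return -(max(heights) + bumpiness + 4 * holes)

def search(heights, queue, holes=0):
    """Depth-first search over every placement of the queued pieces.
    Returns (best_score, column_for_first_piece)."""
    if not queue:
        return evaluate(heights, holes), None
    (width, height), rest = queue[0], queue[1:]
    best_score, best_x = float("-inf"), None
    for x in range(len(heights) - width + 1):
        new_heights, new_holes = place(heights, width, height, x)
        score, _ = search(new_heights, rest, holes + new_holes)
        if score > best_score:
            best_score, best_x = score, x
    return best_score, best_x

# Choose where to drop a 2x2 block, looking one piece ahead (a 1x1 block).
print(search([0, 0, 2, 2], [(2, 2), (1, 1)]))
```

A real player would also enumerate rotations and line clears; the sketch only shows how looking ahead through the piece queue changes which first placement is chosen.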

Research

Fanar: An Arabic-Centric Multimodal Generative AI Platform

Fanar is a platform for Arabic-centric multimodal generative AI systems, featuring two large language models, Fanar Star and Fanar Prime, which support language, speech, and image generation tasks. The platform offers various capabilities, including customized retrieval augmented generation systems, speech recognition, voice and image generation, and an attribution service. It was developed by the Qatar Computing Research Institute with sponsorship from Qatar's Ministry of Communications and Information Technology.

Efficient Reasoning with Hidden Thinking

The Heima framework is proposed to improve the efficiency of Chain-of-Thought (CoT) reasoning in Multimodal Large Language Models (MLLMs) by condensing intermediate reasoning steps into compact hidden representations. Experimental results show that Heima achieves higher generation efficiency and maintains or improves zero-shot task accuracy, while also allowing for the reconstruction of reasoning processes that resemble the original CoTs.

Small Language Models (SLMs) Can Still Pack a Punch: A Survey

Researchers have found that smaller language models, with 1 to 8 billion parameters, can perform as well as or even outperform larger models, challenging the notion that massive scale is the only path forward. These Small Language Models (SLMs) can be designed to balance performance, efficiency, scalability, and cost, and can be tailored for general or specific tasks, offering a promising alternative to larger models.

Using Read Promotion and Mixed Isolation Levels for Serializable Execution

The paper proposes a theory that determines the lowest isolation level each transaction program can run at in a mixed-isolation-level setting while still ensuring serializable executions and preserving integrity constraints. The authors use the theory to develop an optimization method that improves throughput while maintaining serializability by modifying application code in a semantics-preserving way.

A Comprehensive Survey of the Lean 4 Theorem Prover

This survey examines Lean 4, a cutting-edge interactive theorem prover and functional programming language, analyzing its design, capabilities, and applications in formal verification and mathematics. The survey highlights Lean 4's advantages in proof automation, performance, and usability through comparisons and case studies, and explores its growing ecosystem and impact on formal methods and mathematical formalization.

Code

Show HN: Klarity – Open-source tool to analyze uncertainty/entropy in LLM output

Klarity is a tool for analyzing uncertainty in generative model outputs, combining raw probability analysis and semantic understanding to provide insights into model behavior during text generation. It offers features such as dual entropy analysis, semantic clustering, and AI-powered analysis, and provides a structured JSON analysis of generation patterns, with support for Hugging Face Transformers and various tested target models.
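
As a rough illustration of what entropy over raw token probabilities measures, here is a minimal sketch in plain Python. It does not use Klarity's API; the function names and the example distributions are assumptions made up for this illustration.

```python
import math

def token_entropy(probs):
    """Shannon entropy, in bits, of one next-token probability distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def mean_entropy(step_distributions):
    """Average per-token entropy across all generation steps."""
    return sum(token_entropy(d) for d in step_distributions) / len(step_distributions)

# Hypothetical next-token distributions over a four-token vocabulary.
confident = [0.97, 0.01, 0.01, 0.01]   # the model strongly prefers one token
uncertain = [0.25, 0.25, 0.25, 0.25]   # the model has no preference

print(token_entropy(confident))
print(token_entropy(uncertain))
print(mean_entropy([confident, uncertain]))
```

Low entropy at a step means the probability mass sits on one token, while high entropy means the model was choosing among several plausible continuations.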

AI-Hedge Fund (With DeepSeek R1) – Fincept Terminal (OSS)

Fincept Terminal is a command-line tool for investors and financial professionals, offering technical analysis, fundamental analysis, sentiment analysis, and portfolio management. It can be installed via PyPI and also includes dynamic asset searching, economic data analysis, and a robo advisor, with real-time data and customizable terminal settings listed as upcoming features.

Show HN: Open-source version of OpenAI's Deep Research

Open Deep Research is an open-source clone of OpenAI's Deep Research experiment that uses Firecrawl's extract and search with a reasoning model to research the web. The project is built with Next.js, React Server Components, and the AI SDK, and can be deployed to Vercel or run locally with environment variables.

Show HN: CodeCapy – A PR bot that tests your code

CodeCapy is a PR bot that automatically detects new pull requests, generates natural language end-to-end UI tests based on code changes, and executes tests in isolated Scrapybara instances. To get started with CodeCapy, users can connect their GitHub repositories on the CodeCapy dashboard, install the bot directly on GitHub, and configure test environments using a capy.yaml file.

Show HN: Gave Claude LSD SQL

The LSD MCP server lets users connect Claude, a language model, to the internet and retrieve information from websites using LSD SQL, a domain-specific language for the web. By following the provided quickstart guide, users can install and configure the server, enabling Claude to write and run LSD SQL queries and retrieve data from websites.