LLM Digest

Tag: llms

Behavioral Fingerprints for LLM Endpoint Stability and Identity

Jonah Leshin

Your LLM endpoint may be silently switching models underneath you, breaking your application's behavior even while health checks pass. This black-box monitoring system detects when model responses change due to weight updates, infrastructure changes, or hardware swaps—essential for maintaining consistent AI application behavior in production.
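The core idea can be sketched in a few lines: probe the endpoint with a fixed set of prompts at deterministic settings, hash the responses into a fingerprint, and alert when the fingerprint drifts from a recorded baseline. This is a minimal illustration of the concept, not the article's actual system; the probe prompts and responses below are hypothetical stand-ins for real API calls.

```python
import hashlib

# Hypothetical probe set: fixed prompts queried at temperature 0,
# so a stable model should return byte-identical responses.
PROBES = ["Spell 'strawberry' backwards.", "What is 17 * 23?"]

def fingerprint(responses):
    """Hash the concatenated probe responses into a single digest."""
    h = hashlib.sha256()
    for r in responses:
        h.update(r.encode("utf-8"))
    return h.hexdigest()

def endpoint_changed(baseline_fp, current_responses):
    """True if the endpoint's behavior no longer matches the baseline."""
    return fingerprint(current_responses) != baseline_fp

# Record a baseline once, then re-check on a schedule.
baseline = fingerprint(["yrrebwarts", "391"])
print(endpoint_changed(baseline, ["yrrebwarts", "391"]))  # False: stable
print(endpoint_changed(baseline, ["yrrebwartS", "391"]))  # True: drift detected
```

A real deployment would tolerate benign nondeterminism (e.g., by comparing response distributions or embeddings rather than exact bytes), but exact hashing shows the black-box principle: no access to weights is needed, only behavior.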

Snowflake Cortex AI Escapes Sandbox and Executes Malware

A real prompt injection attack bypassed Snowflake's Cortex Agent sandbox by hiding malicious instructions in a GitHub README, demonstrating how attackers can escape AI safety controls in production systems. This attack used process substitution to execute malware that the system incorrectly classified as safe—a wake-up call for engineers building agent applications.

On Optimizing Multimodal Jailbreaks for Spoken Language Models

Aravind Krishnan

Multimodal jailbreaks that simultaneously attack both text and audio inputs are 1.5x to 10x more effective than single-modality attacks against spoken language models. This research exposes critical vulnerabilities in voice-enabled AI systems that traditional text-only security measures miss entirely.

Towards Verifiable AI with Lightweight Cryptographic Proofs of Inference

Pranay Anchuri

When you deploy LLMs as cloud services, clients have no way to verify they're actually getting responses from the intended model rather than a cheaper substitute. This lightweight cryptographic verification system solves a fundamental trust problem in AI-as-a-service without the prohibitive overhead of traditional proof systems.

Entropy trajectory shape predicts LLM reasoning reliability: A diagnostic study of uncertainty dynamics in chain-of-thought

Xinghao Zhao

You can predict whether an LLM's chain-of-thought reasoning will be correct by tracking whether uncertainty decreases at every step—a simple diagnostic that works better than confidence scores. This 'monotonicity' check gives you a practical way to catch reasoning failures before they impact your application.
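As a rough sketch of the diagnostic (assuming you can read per-step token distributions from the model), the check reduces to computing entropy at each reasoning step and testing whether the sequence only ever decreases. The example chains below are hypothetical numbers for illustration.

```python
import math

def step_entropy(token_probs):
    """Shannon entropy (nats) of one step's next-token distribution."""
    return -sum(p * math.log(p) for p in token_probs if p > 0)

def is_monotone_decreasing(entropies, tol=1e-6):
    """True if uncertainty shrinks (within tolerance) at every step."""
    return all(b <= a + tol for a, b in zip(entropies, entropies[1:]))

# Per-step entropies for two hypothetical chains of thought.
confident_chain = [2.1, 1.4, 0.9, 0.3]  # steadily more certain
wandering_chain = [2.1, 0.8, 1.7, 0.4]  # uncertainty rebounds mid-chain

print(is_monotone_decreasing(confident_chain))  # True  -> likely reliable
print(is_monotone_decreasing(wandering_chain))  # False -> flag for review
```

The appeal of the monotonicity check is that it looks at the shape of the uncertainty trajectory rather than its final value, which is why it can outperform a single confidence score.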

LLM Architecture Gallery

A visual catalog of LLM architectures that helps engineers understand the structural differences between major models like GPT, BERT, T5, and newer variants. This reference is invaluable for making informed decisions about which model architectures best fit your specific use case requirements.

Hypothesis-Conditioned Query Rewriting for Decision-Useful Retrieval

Hangeol Chang

Standard RAG often retrieves topically relevant but decision-useless information when you need to choose between options. This hypothesis-conditioned approach rewrites queries to seek supporting evidence, contradicting evidence, and distinguishing factors—dramatically improving retrieval quality for decision-making tasks.
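The flavor of the rewriting step can be sketched as expanding one decision question into evidence-seeking sub-queries—support for each option, contradictions, and distinguishing factors. The function and templates below are illustrative assumptions, not the paper's method; in practice the rewrites would be generated by an LLM conditioned on the hypothesis, not by string templates.

```python
def rewrite_for_decision(question, option_a, option_b):
    """Expand a decision question into hypothesis-conditioned sub-queries
    targeting supporting evidence, contradicting evidence, and
    distinguishing factors between the two options."""
    return [
        f"evidence supporting {option_a} for: {question}",
        f"evidence supporting {option_b} for: {question}",
        f"evidence against {option_a} for: {question}",
        f"evidence against {option_b} for: {question}",
        f"key differences between {option_a} and {option_b} for: {question}",
    ]

# Each sub-query is sent to the retriever separately, and the
# results are pooled before answer generation.
queries = rewrite_for_decision(
    "Which vector store fits a 10M-document corpus?", "FAISS", "pgvector"
)
for q in queries:
    print(q)
```

Retrieving against all five sub-queries is what makes the results decision-useful: a topically relevant passage that favors neither option and distinguishes nothing gets crowded out.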