Verda Blog

NVFP4 Explained: How NVIDIA Blackwell Unlocks Low-Precision Floating Point

NEW AI research

Multi-Head Latent Attention: Benefits in Memory and Computation

NEW AI research

FLUX on B200 vs H100: Real-Time Image Inference with WaveSpeedAI

AI research Apr 8, 2025

DeepSeek-V3 + SGLang: Inference Optimization

AI research Apr 4, 2025

DeepSeek + SGLang: Multi-Head Latent Attention

AI research Mar 12, 2025

Multi Data Center Training: Prime Intellect

AI research Feb 28, 2025