Paper: Exposing the Unsaid: Visualizing Hidden LLM Bias through Stochastic Path Aggregation
Problem
Large Language Models (LLMs) are known to harbor biases, but those biases can be hard to spot. Traditional ways of checking LLM outputs, such as looking at single responses or relying on automated metrics, often miss subtle biases hidden within the model's probability distributions. This is because LLMs generate text stochastically: they don't always choose the most likely next word, so important bias can lurk in less common generation paths that a single response never reveals.
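To make the point about stochastic generation concrete, here is a minimal sketch in Python. It is a toy illustration and not the paper's method: the distribution `NEXT_TOKEN_PROBS`, its token names, and its probabilities are made up, and the helper functions are hypothetical. It contrasts always picking the most likely token with drawing many samples from the same distribution.

```python
import random
from collections import Counter

# Hypothetical next-token distribution (made-up numbers, for illustration only).
NEXT_TOKEN_PROBS = {"common": 0.70, "less_common": 0.25, "rare": 0.05}


def greedy_pick(probs):
    """Return the single most likely token; the result never varies."""
    return max(probs, key=probs.get)


def sample_pick(probs, rng):
    """Draw one token at random, weighted by its probability."""
    tokens = list(probs)
    weights = [probs[t] for t in tokens]
    return rng.choices(tokens, weights=weights, k=1)[0]


def sample_counts(probs, n, seed=0):
    """Sample n tokens with a seeded generator and count each outcome."""
    rng = random.Random(seed)
    return Counter(sample_pick(probs, rng) for _ in range(n))


greedy = greedy_pick(NEXT_TOKEN_PROBS)
counts = sample_counts(NEXT_TOKEN_PROBS, 1000)

print("greedy choice:", greedy)
print("sampled counts:", dict(counts))
```

Inspecting only the greedy choice shows one token, while the sampled counts also surface the lower-probability alternatives. In a real model, each of those alternatives starts a different generation path, which is where a bias could sit unnoticed by a single-response check.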