The Kullback-Leibler divergence measures how many "nats" of information we gained by moving from the prior to the posterior. A high divergence indicates that the evidence was highly surprising, or informative, relative to our prior.
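As a minimal sketch of that calculation (the prior and posterior values here are illustrative, not taken from the text), the divergence is computed with the natural logarithm, which is what makes the units nats:

```python
import numpy as np

# Hypothetical prior over two hypotheses and the posterior after an update.
prior = np.array([0.5, 0.5])       # P(H)
posterior = np.array([0.9, 0.1])   # P(H | E)

# D_KL(posterior || prior): expected log-ratio under the posterior.
# Using the natural log gives the result in nats.
kl_nats = np.sum(posterior * np.log(posterior / prior))
print(f"Information gained: {kl_nats:.3f} nats")  # ~0.368 nats
```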
[Figure: the geometric space of $P(H, E)$, drawn as a rectangle whose width is $P(H)$ and whose height is $P(E|H)$.]
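Assuming the figure is meant to be read as area = width × height, the product rule makes the geometry explicit:

$$P(H, E) = P(H)\,P(E|H)$$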
Renormalization Logic
Bayesian inference is the process of carving out the region where the evidence ($E$) is true, and then stretching that sub-region until it represents our entire new universe of possibility.
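A minimal sketch of this carve-and-stretch step over a discrete set of hypotheses (the specific numbers are illustrative, not from the text): the "carving" keeps only the joint mass $P(H, E)$, and the "stretching" divides by the total carved-out mass $P(E)$ so the posterior sums to 1.

```python
import numpy as np

# Hypothetical discrete hypothesis space.
prior = np.array([0.5, 0.3, 0.2])        # P(H): width of each region
likelihood = np.array([0.8, 0.4, 0.1])   # P(E | H): height within each region

# Carve out the region where E is true: the joint mass P(H, E).
joint = prior * likelihood

# Stretch that sub-region back to total probability 1 by dividing by P(E).
evidence = joint.sum()                   # P(E), the total carved-out area
posterior = joint / evidence             # P(H | E)

print(posterior)  # [0.741, 0.222, 0.037] -- renormalized to sum to 1
```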