I just realized that if the ‘gumbelmax trick’ (look it up) is defined by taking an argmax over Gumbels, and the equivalent ‘expmin trick’ takes an argmin over Exponentials, then taking an argmax over Betas (precisely equivalent to the previous two) gives what really should be called the ‘betamax trick’ 📹
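For anyone who wants to check the "precisely equivalent" part, here is a minimal NumPy sketch. It assumes the usual forms of each trick: Gumbel noise added to log-probabilities, Exponentials with rate p_i, and Beta(p_i, 1) variables built as U^(1/p_i). The function name `sample_three_ways` is made up for illustration. All three keys come from the same uniforms, so they are monotone transforms of each other and should pick the same index.

```python
import numpy as np

def sample_three_ways(p, rng):
    """Draw one categorical sample three ways from the same uniforms.

    Hypothetical helper for illustration; p is a vector of positive weights.
    """
    p = np.asarray(p, dtype=float)
    # Uniforms in the open interval (0, 1) so that log(u) and log(e) stay finite.
    u = rng.uniform(low=np.finfo(float).tiny, high=1.0, size=p.shape)
    e = -np.log(u)                        # e ~ Exponential(1)

    gumbel_key = np.log(p) - np.log(e)    # log p + Gumbel(0, 1), since -log(e) is Gumbel
    exp_key = e / p                       # Exponential with rate p
    beta_key = u ** (1.0 / p)             # Beta(p, 1), equal to exp(-e / p)

    return (
        int(np.argmax(gumbel_key)),       # gumbelmax
        int(np.argmin(exp_key)),          # expmin
        int(np.argmax(beta_key)),         # betamax
    )

rng = np.random.default_rng(0)
p = [0.2, 0.3, 0.5]
draws = [sample_three_ways(p, rng) for _ in range(20000)]
agree = all(g == e == b for g, e, b in draws)
freqs = np.bincount([g for g, _, _ in draws], minlength=len(p)) / len(draws)
print(agree, freqs)
```

Since gumbel_key = -log(e / p) and beta_key = exp(-e / p), both argmaxes are just the expmin argmin in disguise, and the empirical frequencies should land close to p.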
8 months ago