Excited to share that our paper on Density-Guided Response Optimization has been accepted to ACM FAccT!
We show that local density in embedding space encodes a recoverable community preference signal, which can be used for alignment without explicit preference annotations.
3 months ago