Flavio Calmon
@fcalmon.bsky.social
📤 141
📥 20
📝 1
Associate Professor @Harvard SEAS. Information theorist, but only asymptotically.
New paper on discretion in AI “alignment”: check out @maartenbuyl.bsky.social’s thread below!
10 months ago
reposted by
Flavio Calmon
Maarten Buyl
10 months ago
9/n Full paper here: 🔗 arxiv.org/abs/2502.10441. Huge thanks to my amazing team of co-authors: @hadikh.bsky.social, @lucasmpaes.bsky.social, @claudiomv.bsky.social, @caiocvm.bsky.social, and @fcalmon.bsky.social. Done at @harvard.edu
AI Alignment at Your Discretion
In AI alignment, extensive latitude must be granted to annotators, either human or algorithmic, to judge which model outputs are `better' or `safer.' We refer to this latitude as alignment discretion....
https://arxiv.org/abs/2502.10441
reposted by
Flavio Calmon
Maarten Buyl
10 months ago
AI is built to “be helpful” or “avoid harm”, but which principles should it prioritize and when? We call this alignment discretion. As Asimov's stories show: balancing such principles for AI behavior is tricky. In fact, we find that AI has its own set of priorities. (comic by @xkcd.com) 🧵👇
reposted by
Flavio Calmon
Bogdan Kulynych
12 months ago
The standard practice in differential privacy of targeting ε at small δ is extremely lossy for interpreting the level of privacy protection. For many real-world algorithms (e.g., for DP-SGD), we can do much better! We show how in the #NeurIPS2024 paper: arxiv.org/abs/2407.02191
Short summary👇
Attack-Aware Noise Calibration for Differential Privacy
Differential privacy (DP) is a widely used approach for mitigating privacy risks when training machine learning models on sensitive data. DP mechanisms add noise during training to limit the risk of i...
https://arxiv.org/abs/2407.02191
reposted by
Flavio Calmon
Bogdan Kulynych
12 months ago
This is joint work with Felipe Gomez, Georgios Kaissis, @fcalmon.bsky.social, and @carmelatroncoso.bsky.social
Happy to chat about it online, and in 🇨🇦+🇺🇸 next two weeks:
- At the #NeurIPS2024 Friday Dec. 13 evening poster session.
- Will also present in more detail on Tuesday Dec. 17 at Harvard.