Simon Lermen
@simonlermen.bsky.social
📤 144
📥 330
📝 27
I work on AI safety and AI in cybersecurity
pinned post!
Happy to share my
matsprogram.org
project that I have been working on in the last couple of months. We explore how LLMs can be used for large-scale deanonymization online.
add a skeleton here at some point
7 days ago
0
4
0
reposted by
Simon Lermen
koenfucius
3 days ago
“I didn’t write that” “Yes you did” Research by
@simonlermen.bsky.social
et al shows LLMs can deanonymize pseudonymous users of online platforms using unstructured content (eg link pseudonymous Hacker News posts with LinkedIn profiles or interview transcripts):
buff.ly/bAdgQpx
0
0
1
Happy to share my
matsprogram.org
project that I have been working on in the last couple of months. We explore how LLMs can be used for large-scale deanonymization online.
add a skeleton here at some point
7 days ago
0
4
0
Our paper on AI-powered spear phishing, co-authored with
@fredheiding.bsky.social
, has been accepted at the ICML 2025 Workshop on Reliable and Responsible Foundation Models!
openreview.net/pdf?id=f0uFp...
loading . . .
https://openreview.net/pdf?id=f0uFpuea1s
8 months ago
0
1
1
Grok's DeepSearch was launched with Zero safety features, you can ask it about assasslnations, dru*gs. This has been online for a few days now with no changes.
about 1 year ago
0
2
0
I published a human study with
@fredheiding.bsky.social
We use AI agents built from GPT-4o and Claude 3.5 Sonnet to search the web for available information on a target and use this for highly personalized phishing messages. achieved click-through rates above 50%
www.lesswrong.com/posts/GCHyDK...
loading . . .
Human study on AI spear phishing campaigns — LessWrong
TL;DR: We ran a human subject study on whether language models can successfully spear-phish people. We use AI agents built from GPT-4o and Claude 3.5…
https://www.lesswrong.com/posts/GCHyDKfPXa5qsG2cP/human-study-on-ai-spear-phishing-campaigns
about 1 year ago
0
4
1
I'll be at the SafeGenAI workshop on Sunday presenting on research I did on safety in AI agents. I will talk about results from these two blog posts:
www.lesswrong.com/posts/ZoFxTq...
And:
www.lesswrong.com/posts/Lgq2Dc...
loading . . .
Current safety training techniques do not fully transfer to the agent setting — LessWrong
TL;DR: We are presenting three recent papers which all share a similar finding, i.e. the safety training techniques for chat models don’t transfer we…
https://www.lesswrong.com/posts/ZoFxTqWRBkyanonyb/current-safety-training-techniques-do-not-fully-transfer-to
about 1 year ago
0
4
0
reposted by
Simon Lermen
Arthur Conmy
over 1 year ago
I'm very bullish on automated research engineering soon, but even I was surprised that AI agents are twice as good as humans with 5+ years of experience or from a top AGI or safety lab at doing tasks in 2 hours. Paper:
metr.org/AI_R_D_Evalu...
loading . . .
https://metr.org/AI_R_D_Evaluation_Report.pdf
1
8
1
you reached the end!!
feeds!
log in