Alex Becker
@gputhief.bsky.social
👤 79
👥 138
📝 44
Safeguards @ Anthropic
San Francisco
Blog: https://alexcbecker.net/blog.html
Bad sign
2 days ago
1
5
1
AI2 did a near 1:1 comparison between pure transformer and hybrid archs:
allenai.org/papers/olmo-...
Pretraining: hybrid gated deltanet clearly wins
RL: mixed at best
They also point out theoretical limitations of transformers' fixed circuit depths
1/2
https://allenai.org/papers/olmo-hybrid
2 days ago
1
8
1
reposted by
Alex Becker
norvid_studies
24 days ago
no offense but if your sole metric for belief evaluation is "which of these make me feel the best" you're just epistemically completely turbofucked. maybe there's a nicer way to phrase that but that's the gist. like I'd love to believe I'm impervious to disease, and bullets. but
6
30
2
At long last we have built Her, from the classic Sci-Fi movie Don't Build Her.
www.minimax.io/news/a-deep-...
A Deep Dive into the MiniMax-M2-her
https://www.minimax.io/news/a-deep-dive-into-the-minimax-m2-her-2
5 days ago
0
4
0
I'm a simple man; I see Mickens, I repost.
6 days ago
2
1
0
reposted by
Alex Becker
James
6 days ago
WELL WELL WELL NOT SO EASY TO FIND A PLAN TO TERMINATE A CONFLICT THAT DOESN'T SUCK SHIT HUH?
20
995
129
Proud of Anthropic for holding the line, hope folks at other labs will look closely at what they're agreeing to wrt the DoD. A shame because we need a strong, rational DoD, not one looking to fight imaginary culture war enemies.
www.anthropic.com/news/stateme...
Statement on the comments from Secretary of War Pete Hegseth
Anthropic's response to the Secretary of War and advice for customers
https://www.anthropic.com/news/statement-comments-secretary-war
8 days ago
1
12
1
I may not have gotten that Vercept job, but I did end up at the same place starting on the same day. Funny how life works!
www.anthropic.com/news/acquire...
Anthropic acquires Vercept to advance Claude's computer use capabilities
Anthropic is an AI safety and research company that's working to build reliable, interpretable, and steerable AI systems.
https://www.anthropic.com/news/acquires-vercept
10 days ago
1
13
0
Tomorrow will be my first day at Anthropic, where I'm joining the Safeguards team to work on prompt injection! I've written an info dump on it here:
alexcbecker.net/blog/prompt-...
Alex Becker - What's Next for Prompt Injection
https://alexcbecker.net/blog/prompt-injection-before-anthropic.html
12 days ago
5
55
2