Craig Sherstan
@craigsherstan.bsky.social
👤 18
👥 3
13
AI Research Scientist - Reinforcement Learning. Tokyo based.
reposted by
Craig Sherstan
Marlos C. Machado
12 days ago
This paper has now been accepted
@neuripsconf.bsky.social
! Huge congratulations, Hon Tik (Rick) Tse and Siddarth Chandrasekar.
0
7
3
Really cool opening at DeepMind right now for someone to explore "what comes after AGI" (closes Friday Sept 26, 2025):
job-boards.greenhouse.io/deepmind/job...
Research Scientist, Post-AGI Research
London, UK
https://job-boards.greenhouse.io/deepmind/jobs/6789253?gh_src=d39b99f31
8 days ago
1
1
0
reposted by
Craig Sherstan
James MacGlashan
20 days ago
Staff level link:
sonyglobal.wd1.myworkdayjobs.com/en-US/SonyGl...
0
2
1
reposted by
Craig Sherstan
James MacGlashan
20 days ago
Sony AI is hiring game integration engineers! We do awesome RL applications to modern video games. If that excites you, check out the posting! We have positions for senior and staff level developers. Senior dev link (staff level link next in 🧵):
sonyglobal.wd1.myworkdayjobs.com/en-US/SonyGl...
Senior AI Integration Engineer for Game AI
Sony AI America, a branch of Sony AI, is a remotely distributed organization spread across the U.S. and Canada. Sony AI is Sony's new research organization pursuing the mission to use AI to unleash hu...
https://sonyglobal.wd1.myworkdayjobs.com/en-US/SonyGlobalCareers/job/Remote---Virginia/Senior-AI-Integration-Engineer-for-Game-AI-2_JR-118197
1
26
7
reposted by
Craig Sherstan
James MacGlashan
about 1 month ago
The Sony AI Game AI team has reinforcement learning internship openings for 2026! It is remote for people in the US & Canada, mixed remote/onsite in Europe (onsite in Zurich), and onsite in Tokyo. If you want to work on RL with cool applications, sign up!
ai.sony/joinus/job-r...
Reinforcement Learning Research Intern 2026 for Game AI - Sony AI
https://ai.sony/joinus/job-roles/Reinforcement_Learning_Research_Intern_2026_for_GameAI/
1
7
5
timed coding tests -> stress -> fight or flight -> loss of fine motor control -> ca n'ptt typp
27 days ago
0
1
0
Monday's talk "Two Tales of Reward Design: GT Sophy and Factored Value Functions" is online:
youtu.be/PrsKX5ZWt_4
#RL
#GranTurismo
#GTSophy
Two Tales of Reward Design: GT Sophy and Factored Value Functions
YouTube video by Craig Sherstan
https://youtu.be/PrsKX5ZWt_4
about 1 month ago
0
2
2
Thanks to
@marloscmachado.bsky.social
for the invite to speak at the University of Alberta today. Hopefully someone remembers my main point: Reward design is really important. :)
about 1 month ago
0
3
0
reposted by
Craig Sherstan
James MacGlashan
about 1 month ago
"A scientist believes..." isn't noteworthy unless it can be followed by "because science shows..."
0
1
1
I just came across a technical article written by *Dr.* So-and-so. Seeing Dr. made me more impressed and then I remembered "I'm a Dr. too". Inbuilt biases :P I'm definitely going to start using my Dr. title.
2 months ago
0
1
0
Learning to Reason without External Rewards
arxiv.org/pdf/2505.19590
LLM finetuning is done ONLY using internal reward (model confidence) with no external grounding reward. That means the LLM had to already know how to solve the problems.
https://arxiv.org/pdf/2505.19590
3 months ago
0
2
0
Sometimes I imagine a world where all the friends that I'm trying to coordinate use the same messaging app. One can dream...
3 months ago
0
0
0
I was playing with a couple of emotion detection models today. Apparently my resting face is one of: disgust, anger, sad and my happy face is contempt :P
5 months ago
0
1
0
Cortical Labs combines human neurons with silicon computing for a cool
#cyborg
computer!!! And you can buy one, or use their cloud service.
corticallabs.com
Cortical Labs
We've combined lab-grown neurons with silicon chips and made it available to anyone, for the first time ever.
https://corticallabs.com/
7 months ago
0
0
0
reposted by
Craig Sherstan
James MacGlashan
7 months ago
Let's go! Really psyched to have Barto and Sutton win the Turing award for Reinforcement Learning! Their work shaped my career in such profound ways.
www.nytimes.com/2025/03/05/t...
Turing Award Goes to A.I. Pioneers Andrew Barto and Richard Sutton
Andrew Barto and Richard Sutton developed reinforcement learning, a technique vital to chatbots like ChatGPT.
https://www.nytimes.com/2025/03/05/technology/turing-award-andrew-barto-richard-sutton.html
0
7
4
doi.org/10.48550/arX...
Really cool approach: an agentic LLM generates its own actions as Python functions. However, this is NOT doing RL - there is no reward over which the system optimizes.
#rl
#llm
DynaSaur: Large Language Agents Beyond Predefined Actions
Existing LLM agent systems typically select actions from a fixed and predefined set at every step. While this approach is effective in closed, narrowly-scoped environments, we argue that it presents t...
https://doi.org/10.48550/arXiv.2411.01747
10 months ago
0
0
0
I'm giving a keynote on building GT Sophy - our reinforcement learning based racing agent for Gran Turismo. Tuesday Nov 26, 2024 8:40 JST (online). Computers and Games Conference.
#rl
#granturismo
#sonyai
#gtsophy
10 months ago
0
8
1