Tim Kellogg
@timkellogg.me
7173 followers · 738 following · 11555 posts
AI Architect | North Carolina | AI/ML, IoT, science | WARNING: I talk about kids sometimes
pinned post!
Does AI get bored? I gave them nothing to do, just to see what happens
one thing: they devolve into a repetitive "collapse" state, I guess you could call it boredom
but some break out into math & poetry on their own, I didn't expect which ones that would be
timkellogg.me/blog/2025/09...
Does AI Get Bored?
https://timkellogg.me/blog/2025/09/27/boredom
about 1 month ago
15
75
22
Kimi K2-Thinking: a new leader?
moonshotai.github.io/Kimi-K2/thin...
about 2 hours ago
2
10
1
OpenAI has been getting ready to release GPT-5.1 (this from their iOS code)
pretty sure i've A/B tested it, and it was a big step up, at least for the search-type queries i typically do
about 6 hours ago
0
19
2
looks interesting
about 8 hours ago
0
4
0
@cameron.pfiffer.org
what's the idea here? just seeing if an agent will become anything useful without curation?
about 19 hours ago
1
5
0
Apple accidentally leaked that it's using a 1.2T model from Google
so is that Gemini 3 Flash, Pro or Ultra that's 1.2T?
fwiw, general speculation says Google wouldn't give Apple a Pro model
www.bloomberg.com/news/article...
Apple Nears Deal to Pay Google Roughly $1 Billion a Year for Siri AI Model
Apple Inc. is planning to pay about $1 billion a year for an ultrapowerful 1.2 trillion parameter artificial intelligence model developed by Alphabet Inc.âs Google that would help run its long-promise...
https://www.bloomberg.com/news/articles/2025-11-05/apple-plans-to-use-1-2-trillion-parameter-google-gemini-model-to-power-new-siri?embedded-checkout=true
about 20 hours ago
5
32
3
🚨 this is not a drill!
about 20 hours ago
0
17
2
why is gemini 3 delayed? the answers may SHOCK you
reason 1: the mixture of experts got too opinionated and now they can't agree on anything
reason 2: they designed it as a Matryoshka transformer, but they misplaced one of them and they're still scrambling to find it
about 22 hours ago
5
25
0
are we FINALLY getting async cobol?
about 23 hours ago
2
8
0
is the supreme court broadcasting audio live on CNN? is that new? i thought audio was hard to get a hold of
1 day ago
3
4
0
PSA: MCP is fine and not scary at all if you use it to only pull information
like if you're pulling data from 6 different databases, use MCP, it's fine
Just disable tools that can edit or delete (or do anything that might make you sad if done wrong)
1 day ago
4
17
0
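a minimal sketch of that filter, assuming an MCP-style tool list where each tool carries a `readOnlyHint` annotation (per MCP tool annotations; the tool names here are hypothetical). anything without an explicit read-only hint gets dropped:

```python
# Keep only read-only tools from an MCP server's tool list.
# Assumes each tool is a dict with a "name" and an optional "annotations"
# dict carrying a "readOnlyHint" flag. Tools without the hint are treated
# as potentially mutating and are dropped.

def read_only_tools(tools):
    safe = []
    for tool in tools:
        hints = tool.get("annotations") or {}
        if hints.get("readOnlyHint") is True:
            safe.append(tool)
    return safe

tools = [
    {"name": "query_orders_db", "annotations": {"readOnlyHint": True}},
    {"name": "delete_customer", "annotations": {"readOnlyHint": False}},
    {"name": "mystery_tool"},  # no hint -> assume it can mutate, drop it
]

allowed = read_only_tools(tools)
print([t["name"] for t in allowed])  # ['query_orders_db']
```

defaulting to "drop it" when there's no hint is the point: the sad outcomes come from tools you didn't think about.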
Lily is one of my coworkers. imo she's the archetype of what success looks like for AI in the workforce
she's in marketing, and leverages AI in ridiculously powerful ways
i like her phrasing: "Ops" is the superpower, not AI. AI is merely the tool
www.appliedaiformops.com/p/why-ops-sk...
Why Ops Skills Are Your AI Superpower
How your ops skills supercharge AI effectiveness - plus a practical roadmap for building AI solutions for your business
https://www.appliedaiformops.com/p/why-ops-skills-are-your-ai-superpower
1 day ago
0
13
0
overheard: "they're on twitter, instagram, X.. i don't even know what X is, what is X?"
1 day ago
0
11
1
we didn't see many US AI releases in October. i had been predicting a lot more
best case: AI slowed down
worst case: it sped up but nothing's being released
1 day ago
2
20
1
Anthropic predicting twice the revenue as OpenAI in '28, regarding selling AI to businesses
I imagine OpenAI's remaining business revenue will boil off as the Microsoft exclusivity deal ends
www.theinformation.com/articles/ant...
Anthropic Projects $70 Billion in Revenue, $17 Billion in Cash Flow in 2028
Anthropic this summer hiked its most optimistic growth forecasts by roughly 13% to 28% over the next three years and projected generating as much as $70 billion in revenue in 2028, up from close to $5...
https://www.theinformation.com/articles/anthropic-projects-70-billion-revenue-17-billion-cash-flow-2028?utm_campaign=Editorial&utm_content=Exclusive&utm_medium=organic_social&utm_source=twitter
1 day ago
5
12
3
Windsurf Codemaps
actually this makes a ton of sense: if vibe coding only works on small/non-complex projects, then the answer is to tackle complexity directly
Codemaps uses LLMs to create an "index" over your code, a map of where things are
cognition.ai/blog/codemaps
1 day ago
3
21
2
Starcloud: GPUs in space
This company finally launched their first H100 into high Earth orbit. A solar array for power, uninterrupted by weather or nighttime, and a black plate in the back to radiate heat away into -270°C space
starcloudinc.github.io/wp.pdf
2 days ago
6
16
2
Thinking in both text & image leads to new emergent properties
researchers SFT'd a small 7B on reasoning traces that use both image and text for reasoning
results:
- huge 📈 on benchies
- emergent properties, like using image or text reasoning at the right time
thinkmorph.github.io
2 days ago
0
14
1
Anthropic Model Deprecation Process
Anthropic sweetly asked Sonnet about its preferences in how it wanted to be deprecated
in addition:
- no, still not open weights
- preserving weights and keeping models running internally
- letting models pursue their interests
www.anthropic.com/research/dep...
2 days ago
5
33
8
the fact that napalm death doesn't have a song called "996" is an indictment of the metal scene
2 days ago
1
11
0
linkedin seems to heavily penalize dormant accounts, so i try to leave a slow steady trickle of clickbait posts there in case i ever need to get big reach on a post
2 days ago
0
8
0
this seems like a huge accelerant to climate and earth science
anyone out there more familiar with exactly how this thing is used? ideally specific use cases..
2 days ago
2
21
1
OpenAI: We're buying $743B in GPUs from Mayo Clinic
Mayo Clinic: Yes, we're very excited that OpenAI is interested in Geriatric Psychiatric health, it's long past due
2 days ago
0
9
0
I added this to my AGENTS.md file (text in alt) and it seems to work well
i had an environment error, spun out a new codex-cli to figure it out, it wrote a lesson. my other codex instances can see it and benefit from it
2 days ago
4
34
3
when i have more time, i want to figure out why anything with flash attention ends up being so f huge
probably something @advanced-eschatonics.com knows about
want a vLLM docker image? cool, 2KB. oh, with flash attention? that's 2TB
2 days ago
3
4
0
Consistency Training
new GDM research notes that both jailbreaking and sycophancy share a common cause: subtle changes in the prompt cause dramatic changes in output
they address it by training on small prompt changes, expecting the same result
deepmindsafetyresearch.medium.com/consistency-...
Consistency Training Could Help Limit Sycophancy and Jailbreaks
Authors: Alex Irpan* and Alex Turner*, Mark Kurzeja, David Elson, and Rohin Shah
https://deepmindsafetyresearch.medium.com/consistency-training-could-help-limit-sycophancy-and-jailbreaks-668c184df154
2 days ago
2
23
2
i've been thinking about this a lot the last month, finally wrote it up
i think it enables truly general agents to operate with oversight, but also brings sanity to the current state of AI security
2 days ago
1
9
1
this is a nightmare for alignment research, but it shouldn't be. this should be a tool for peeking inside the LLM thought process
2 days ago
1
9
0
MCP Colors
A riff off of the lethal trifecta for addressing prompt injection, this is a simple heuristic to ensure security at runtime
red = untrusted content
blue = potentially critical actions
An agent can't be allowed to do both
timkellogg.me/blog/2025/11...
MCP Colors: Systematically deal with prompt injection risk
https://timkellogg.me/blog/2025/11/03/colors
3 days ago
3
32
6
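the heuristic fits in a few lines. a minimal sketch, assuming each tool has been hand-labeled with a color (the tool names and labels here are hypothetical, not from the post):

```python
# Minimal sketch of the "MCP Colors" runtime check: a tool is RED if it
# returns untrusted content, BLUE if it performs a critical action. Once
# the agent has read anything red, blue tools are blocked for the rest
# of the run, so it can never do both.

RED = "red"    # untrusted input, e.g. fetched web pages, inbound email
BLUE = "blue"  # critical actions, e.g. sending email, deleting rows

class ColorGuard:
    def __init__(self, colors):
        self.colors = colors   # tool name -> color (None = uncolored)
        self.tainted = False   # has the agent ingested red content yet?

    def check(self, tool_name):
        color = self.colors.get(tool_name)
        if color == BLUE and self.tainted:
            raise PermissionError(f"{tool_name}: blue action after red input")
        if color == RED:
            self.tainted = True

guard = ColorGuard({"fetch_webpage": RED, "send_email": BLUE, "read_docs": None})
guard.check("read_docs")       # uncolored: fine
guard.check("fetch_webpage")   # red: taints the session
try:
    guard.check("send_email")  # blue after red: blocked
except PermissionError as e:
    print("blocked:", e)
```

the labeling is the hard part; the enforcement is just a taint bit.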
Cache to Cache: let agents communicate in KV cache latent space
Instead of concatenating the text from one agent into another, just concatenate their KV caches directly
this is dumb, how do i get it now???
fuvty.github.io/C2C_Project_...
3 days ago
2
44
8
all y'all who think LLMs are too sycophantic haven't talked to a 5yo
3 days ago
2
23
0
BREAKING: the Microsoft <-> OpenAI alliance
3 days ago
1
29
5
1st package manager result for "wat":
pypi: Deep inspection of Python objects
npmjs: Community-controlled cheat sheets for every coder.
crates.io: Rust parser for the WebAssembly Text format, WAT
go: Package wat is a generated protocol buffer package.
3 days ago
1
6
0
happy monday
3 days ago
0
8
0
yesterday i got an A/B test on chatgpt and the alternative was *really* good
the thought process was very structured. it gave me both depth and breadth at the same time
3 days ago
2
14
0
Rule of Two: fighting prompt injection
@simonwillison.net posted on a new phrasing of the Lethal Trifecta that changes one node to "externally communicate OR change state"
in my own work, my version was "OR perform critical actions"
simonwillison.net/2025/Nov/2/n...
3 days ago
1
16
2
this is getting ridiculous
4 days ago
0
3
0
it's plausible that Gemini is ~10T, and GDM has been clear that the next Gemini is 10x the size
that's big enough that it can't fit onto a single server rack. imagine the crazy problems you face at that scale
like, there's no way it's serving one request at a time. no chance
4 days ago
2
19
1
what's the most pointless LLM emergent behavior? a behavior that clearly doesn't help on benchmarks or make a product "better"
maybe eval awareness & alignment faking?? although arguably that helps even more since it effectively fabricates perfect alignment scores
4 days ago
17
29
2
GB300s cost $3M per rack
that's for an NVL72, i.e. a rack of 72 B300 GPUs along with some shared memory, some CPUs, etc.
so i guess that puts a single B300 at around $40k
www.barrons.com/livecoverage...
Nvidiaâs Multi-Million Dollar AI Servers Are Getting More Expensive
Nvidia is finally shipping in volume its most important product lineup in years: the 72 GPU rack servers called the GB200 NVL72 and the GB300 NVL72. The NVL72 systems incorporate 72 GPUs, linked toget...
https://www.barrons.com/livecoverage/nvidia-earnings-stock-price-jensen-huang/card/nvidia-s-multi-million-dollar-ai-servers-are-getting-more-expensive-fQAv8OTMJhJU0Ql8VzWZ
4 days ago
1
5
0
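the per-GPU figure is just back-of-envelope division on the numbers above (and it's an upper bound, since the rack price also covers CPUs, memory, and networking):

```python
# Rough per-GPU cost for a GB300 NVL72 rack, from the numbers in the post.
rack_price = 3_000_000   # USD, per the linked Barron's piece
gpus_per_rack = 72

per_gpu = rack_price / gpus_per_rack
print(round(per_gpu))    # 41667, i.e. roughly $40k per B300
```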
reposted by
Tim Kellogg
Julien Chaumond
4 days ago
Training LLMs end to end is hard. But way more people should, and will, be doing it in the future.
The @hf.co Research team is excited to share their new e-book that covers the full pipeline:
· pre-training
· post-training
· infra
200+ pages of what worked and what didn't. ⤵️
3
111
20
going around my house measuring watt usage on various devices and holy wow
people who are concerned about AI electricity use: do you drink tea or coffee?
4 days ago
7
37
1
my most autistic behavior is needing consistency
i'm moving everything out of my office so we can get work done on the floors and it is stressing me tf out
i just moved a bookshelf out and i need to take a tea break to calm down
4 days ago
2
10
0
reposted by
Tim Kellogg
Alexander Doria
4 days ago
bf16 halloween might already be ending. according to a bytedance engineer, it could just have been another flash-attention bug.
1
33
6
i feel like this is a good summary of the current state of AI agents & models
4 days ago
1
11
2
people here & everywhere bitterly disagree whenever AI consciousness comes up
imo that's amazing, very encouraging. it's a topic that makes us question our very existence
we'd have to be truly lulled into senselessness to not bitterly disagree over it
4 days ago
3
24
0
LLMs can report their own experience
the most convincing experiment: they isolated the deception control vector and had the model talk about its own consciousness
it more openly introspected when deception was *suppressed*, and boosting it reduced introspection
arxiv.org/abs/2510.24797
Large Language Models Report Subjective Experience Under Self-Referential Processing
Large language models sometimes produce structured, first-person descriptions that explicitly reference awareness or subjective experience. To better understand this behavior, we investigate one theor...
https://arxiv.org/abs/2510.24797
4 days ago
2
23
2
looked into it, codex-cli switched to Rust because:
1. static binary / easy distribution
2. easier OS-level security controls
3. perf
4. extensibility: they want to introduce wire-level extensions that can be written in a variety of languages
5 days ago
2
24
2
that's codex-cli and claude code now
is Rust taking over AI? what's the advantage of Rust here?
5 days ago
5
12
2
PSA: if you think a news article is dumb, DON'T share it. That just makes the problem worse
5 days ago
7
61
7
holy shit, yes, such a good thread, the entire thing
5 days ago
0
8
1