LLMs can retrieve knowledge, but can they connect it in *creative* ways to solve problems?
Introducing CresOWLve 🦉, a new benchmark that evaluates creative problem-solving over real-world knowledge, using puzzles that require multiple creative thinking strategies.