Latent Distance Alignment in Large Language Models as World Models
Sunday 21.12, 13:00 - 13:30
- Graduate Student Seminar
Bloomfield 527
Abstract: Large Language Models (LLMs) have shown surprising abilities in reasoning and decision-making, making them attractive tools for planning tasks expressed in natural language. A common assumption is that distances in foundation model embedding spaces reflect meaningful semantic or structural similarity, potentially allowing efficient greedy search. We first explore whether embeddings generated by these models exhibit a property we term Latent Distance Alignment, where geometric distances in latent space correspond to goal distances in classical planning tasks. Extensive experiments across thirteen benchmark domains show that this property does not hold reliably for off-the-shelf embeddings, revealing that raw latent distances systematically fail to reflect the true cost of planning. To bridge this gap, we introduce a learnable component: a neural network transition function trained to predict the embedding of a resulting state given a current state and action. We evaluate this learned dynamics model on two specific tasks: State Identification, where the model must identify the true resulting state from among a set of candidates that includes negatives, and Action Disambiguation, where the model must determine the correct action by verifying which action's prediction is most similar to the actual outcome. Our results demonstrate partial success on both tasks, with performance varying significantly across domains. This suggests that while raw embeddings lack necessary structure, learning a transition function within the latent space can partially recover the dynamics required for planning, though domain-specific challenges remain.
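The latent transition function and the State Identification task described above can be sketched roughly as follows. This is a minimal illustrative sketch, not the speaker's actual implementation: the MLP architecture, the class and function names, and the use of cosine similarity as the matching score are all assumptions made for the example.

```python
import torch
import torch.nn as nn

class LatentTransitionModel(nn.Module):
    """Hypothetical transition function: given the embedding of the current
    state and the embedding of an action, predict the embedding of the
    resulting state. Architecture (a small MLP) is an assumption."""

    def __init__(self, dim: int, hidden: int = 256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(2 * dim, hidden),
            nn.ReLU(),
            nn.Linear(hidden, dim),
        )

    def forward(self, state_emb: torch.Tensor, action_emb: torch.Tensor) -> torch.Tensor:
        # Concatenate state and action embeddings and map to a next-state embedding.
        return self.net(torch.cat([state_emb, action_emb], dim=-1))


def identify_state(model: LatentTransitionModel,
                   state_emb: torch.Tensor,
                   action_emb: torch.Tensor,
                   candidates: torch.Tensor) -> int:
    """State Identification sketch: among a batch of candidate next-state
    embeddings (true state plus negatives), return the index of the
    candidate most similar to the model's prediction."""
    with torch.no_grad():
        pred = model(state_emb, action_emb)                        # (dim,)
        sims = nn.functional.cosine_similarity(
            pred.unsqueeze(0), candidates, dim=-1)                 # (n_candidates,)
    return int(sims.argmax())


# Toy usage with random embeddings (placeholders for LLM-produced vectors).
torch.manual_seed(0)
model = LatentTransitionModel(dim=8)
state = torch.randn(8)
action = torch.randn(8)
candidates = torch.randn(5, 8)
best = identify_state(model, state, action, candidates)
```

Action Disambiguation would reuse the same model in the opposite direction: predict a next-state embedding for each candidate action and select the action whose prediction best matches the observed outcome embedding.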