Self-Evolving LLM Memory Extraction Across Heterogeneous Tasks
Abstract
Persistent LLM assistants need to decide what information from past interactions should become useful memory, but the right memory extraction behavior changes substantially across personalization, problem-solving, and agentic settings. This work studies heterogeneous memory extraction through BEHEMOTH, a benchmark built from 18 existing datasets and evaluated with downstream utility. The paper shows that fixed extraction prompts and self-evolving methods designed for homogeneous task distributions do not reliably transfer across task categories. It introduces CluE, a cluster-based self-evolving strategy that groups examples by extraction scenario, analyzes each group separately, and synthesizes cross-cluster guidance to update the extraction prompt. Across BEHEMOTH, CluE improves generalization over prior self-evolving frameworks.
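The cluster-then-synthesize loop described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the clustering key, the per-cluster analysis, and the synthesis step are all assumptions (in CluE these would involve LLM calls rather than string formatting), and every function name here is hypothetical.

```python
from collections import defaultdict

def cluster_by_scenario(examples):
    """Group training examples by extraction scenario.
    Assumes each example carries a 'scenario' label; CluE's actual
    grouping criterion may differ."""
    clusters = defaultdict(list)
    for ex in examples:
        clusters[ex["scenario"]].append(ex)
    return clusters

def analyze_cluster(scenario, members):
    """Placeholder for per-cluster analysis (an LLM call in practice):
    summarize what extraction behavior this scenario needs."""
    return f"[{scenario}] observed {len(members)} example(s); adjust extraction accordingly."

def evolve_prompt(prompt, examples):
    """One self-evolution step: analyze each cluster separately, then
    fold the per-cluster findings back into the extraction prompt.
    (Cross-cluster synthesis is an LLM call in practice; here we append.)"""
    clusters = cluster_by_scenario(examples)
    insights = [analyze_cluster(s, exs) for s, exs in sorted(clusters.items())]
    return prompt + "\n" + "\n".join(insights)

examples = [
    {"scenario": "personalization"},
    {"scenario": "agentic"},
    {"scenario": "personalization"},
]
print(evolve_prompt("Extract useful memories from the interaction.", examples))
```

Iterating `evolve_prompt` over successive batches gives the self-evolving behavior: the prompt accumulates guidance specialized per scenario while remaining a single shared artifact across task categories.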
BibTeX
@misc{yang2026selfevolvingllmmemoryextraction,
  title={Self-Evolving LLM Memory Extraction Across Heterogeneous Tasks},
  author={Yuqing Yang and Tengxiao Liu and Wang Bill Zhu and Taiwei Shi and Linxin Song and Robin Jia},
  year={2026},
  eprint={2604.11610},
  archivePrefix={arXiv},
  primaryClass={cs.CL},
  url={https://arxiv.org/abs/2604.11610}
}