AI Agent Papers 2026
← Collections
Compiles a corpus offline into a hierarchical tree of Agent Skills that the LLM agent navigates at query time, replacing retrieval with skill-tree traversal.
Investigates routing agent memory queries to different processing tiers based on query difficulty to control the cost-accuracy trade-off at runtime.
Proposes a shared memory bank with a learned controller that decides what information is worth passing between parallel agent teams to reduce redundant work.
Explores converting a corpus into atomic QA pairs offline to resolve multi-hop questions with just two LLM calls regardless of hop count.
Examines breaking financial RAG answers into atomic facts and verifying each against retrieved documents using reinforcement learning rewards.
Surveys graph-based memory architectures for agents, covering extraction, storage, retrieval, and how memory evolves over time.
Proposes a multi-agent system for inventory management that retrieves similar past decisions to adapt ordering across various supply chain scenarios.
Explores replacing flat chunk-based RAG with graph experts that understand entity relationships, causality, and process flows for structured documents like SOPs.
Investigates letting agents save step-by-step procedural skills from past runs and reuse them later without retraining to reduce repeated computation.
Proposes an agentic method for aggregation queries over unstructured text that tries to find all matching evidence, breaking the task into disambiguation, filtering, and aggregatio…
Proposes an agentic RAG framework that uses reflection and memory-based refinement to generate diverse answers for open-ended questions.
Proposes joint optimization of planning and execution in agentic RAG by modeling the system as a cooperative multi-agent team with shared backbone and outcome-based rewards.
Proposes process-supervised reinforcement learning for RAG that uses MCTS-based step-level rewards to identify and fix flawed reasoning steps in multi-hop retrieval.
Introduces an episodic memory framework where assistant agents maintain uncompressed memory contexts while a master agent orchestrates global planning, replacing destructive memory…
Proposes a tiered memory service for agentic LLM systems that uses masked mixture-of-experts routing to probe only eligible memory shards under a fixed budget.
Explores adaptive query optimization in RAG using reinforcement learning to dynamically decide when to split complex queries into sub-queries and fuse the retrieved results.
Introduces an adaptive agentic Graph-RAG framework that verifies evidence sufficiency and progressively escalates retrieval effort, mapping graph signals back to source text to han…
Investigates augmenting multimodal LLMs with a trainable memory gate that decides which observations to retain, update, or discard during online embodied agent exploration.
Proposes a multi-agent memory framework with hierarchical granularity, adaptive query routing, consistency verification, and targeted memory refresh for long-term agent interaction…
Examines when iterative retrieval-reasoning loops outperform static gold-context RAG in scientific multi-hop QA, diagnosing failure modes across retrieval coverage, hypothesis drif…
Introduces a dependency-aware search framework that uses GRPO reinforcement learning to teach LLMs to decompose questions with dependency relationships and store intermediate resul…
Proposes a biologically-inspired agent memory architecture with adaptive exponential decay, LLM-guided conflict resolution, and intelligent memory fusion across a dual-layer hierar…
Explores two fusion operators for Graph RAG that combine graph-aware reranking with semantic-topological expansion to improve retrieval accuracy and generation quality.
Proposes a generator-aligned reranking and pruning module for RAG that selects evidence using utility signals and filters weak or harmful passages before context truncation.
Introduces a step-by-step reasoning reranking agent for RAG that distinguishes semantically similar but logically irrelevant passages in retrieval-augmented question answering.
Introduces a multi-agent RAG framework that coordinates sequential and parallel inference-time scaling under unified context management to prevent contamination and improve multi-h…
Proposes a nugget-augmented generation system that constructs a bank of Q&A nuggets from retrieved documents to guide extraction, selection, and report generation with citation pro…
Introduces a hybrid RAG architecture combining query augmentation, agentic routing, and structured retrieval that merges vector and graph-based techniques with context unification …
Presents a systematic study of metadata-aware retrieval strategies for RAG, comparing prefix, suffix, unified embedding, and late-fusion approaches with field-level ablations on em…
Proposes a hierarchical global-to-local retrieval strategy for GraphRAG with beam search-optimized re-ranking and a compact LLM integration module trained via dynamic-weighting rei…
Introduces an agentic memory system that indexes trajectory steps with structured contextual intent cues and retrieves history by intent compatibility to reduce interference in lon…
Proposes a structure-informed and diversity-constrained context bubble construction framework for RAG that preserves document structure and balances relevance, coverage, and redund…
Introduces a dual-architecture RAG framework that routes narrative through dense retrievers and tabular data through a cell-aware late interaction mechanism to preserve spatial rel…
Defines a class of memory systems for long-horizon agents that maintain persistent, temporally chained internal state instead of stateless RAG lookups, specifying the architectural…
Surveys foundation agent memory organized by substrate (internal/external), cognitive mechanism (episodic, semantic, working, procedural), and subject (agent- vs user-centric).
Surveys memory in LLMs and multimodal LLMs across implicit, explicit, and agentic paradigms, covering cross-modal integration and challenges like capacity, alignment, and factual c…
Decomposes memory management into atomic CRUD operations and learns an autonomous policy via SFT + RL to study whether learnable memory outperforms static-workflow methods on long-…
Feeds explicit document quality signals (relevance score, ranking, QPP) into RAG generation to study whether exposing retrieval metadata makes the model more robust to noisy contex…
Benchmarks vector-only, LLM-extracted KG, and AST-derived graph pipelines for code RAG, comparing correctness and indexing cost across deterministic and LLM-based graph constructio…
Proposes an agentic RAG framework that dynamically decides whether to retrieve new evidence or reason over existing context at each step, aiming to eliminate redundant retrieval.
Proposes a training-free RAG decoding method that treats retrieved documents as isolated "experts" and aggregates their logits via retrieval-aware contrastive decoding to recover c…
Proposes a query-aware agentic memory system that achieves sub-linear retrieval through temporal and semantic DAG-Tag indexing with an embedding-tag co-consolidation mechanism for …
Proposes treating memory abstraction as a learnable cognitive skill, training a memory copilot via DPO to determine how memories should be structured, abstracted, and reused across…
Introduces a temporal semantic memory framework that organizes memories by actual occurrence time rather than dialogue time and consolidates temporally continuous information into …
Proposes an agent-centric architecture inspired by Physarum polycephalum where the agent autonomously decides when to consolidate learnings and prune raw interaction history to man…
Proposes a reason-and-construct paradigm for GraphRAG that dynamically builds query-specific evidence graphs by instantiating facts from a latent relation pool and discarding distr…
Introduces a plug-and-play RAG framework that disentangles semantic match from factual consistency and estimates self-answerability to make the conflict-resolution decision process…
Proposes a construction-integration approach for multi-hop RAG that preserves multiple evidence chains via iterative triple construction and adaptively expands context granularity …
Proposes a working memory framework that constructs structured episodic narratives from conversational fragments, consolidates memories with momentum, and semanticizes peripheral f…
Proposes an adaptive RAG framework that uses entropy-based gating to bypass vector database retrieval when model uncertainty is low, triggering expensive chunk retrieval only when …
Proposes a decoupled multi-agent RAG framework for multi-hop QA with a Plan-Retrieve-Inspect-Solve-Memoize architecture and two-stage GRPO optimization to address retrieval collaps…
Proposes a framework for user-controllable memory reliance in long-term agent interactions, modeling memory dependence as an explicit and steerable dimension.
Proposes proactive memory extraction using self-questioning feedback loops instead of one-off static summarization to recover missing information and correct errors iteratively.
Proposes a hierarchical memory architecture with a Topic Loom that groups consecutive same-topic dialogue turns into coherent memory boxes and links them via long-range event-timel…
Proposes a multi-graph agentic memory architecture that represents memories across orthogonal semantic, temporal, causal, and entity graphs with policy-guided traversal for retriev…
Proposes a hippocampus-inspired memory architecture for AI assistants that fuses RL-trained short-term memory extraction with partitioned long-term memory for personalization.
Proposes a three-stage memory framework based on semantic lossless compression with structured compression, online semantic synthesis, and intent-aware retrieval planning.
The emergence of writable, cross-session persistent memory in LLM agents introduces a qualitatively different threat landscape from conventional input-centric security concerns, ch…
Memory & RAG 2604.16548 notes β†’ πŸ’¬ Tier 2 필독 (loom μž‘μ—…μž). λ©”λͺ¨λ¦¬ κ±°λ²„λ„ŒμŠ€λ₯Ό 5개 primitive둜 ν˜•μ‹ν™”:…
Long-context Large Language Models, despite their expanded capacity, require careful working memory management to mitigate attention dilution during long-horizon tasks. Yet existin…
Memory & RAG 2510.12635 notes β†’ πŸ’¬ Tier 2. λ©”λͺ¨λ¦¬λ₯Ό action으둜 β€” 둱호라이즌 νƒœμŠ€ν¬μ—μ„œ μ»¨ν…μŠ€νŠΈ 자율 νλ ˆμ΄μ…˜. …