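In the previous article, I covered how to persist conversation history to third-party storage with the Microsoft Agent Framework. That solved two core pain points: in-memory history is easily lost, and it cannot be shared across instances. As conversations grow longer, a new problem surfaces: historical messages keep accumulating in long sessions and can easily exceed the model's context window, causing failed calls, slower responses, and soaring token costs.

Today, building on third-party conversation storage, we add the built-in chat history reducers. Microsoft.Extensions.AI natively provides two reducers, so with no custom development the Agent can automatically slim down a long conversation history, keeping the key context while staying within the model's limit. With persistence and reduction together, a long-conversation Agent becomes ready for production use.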
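Once an Agent supports conversation persistence, users may hold many consecutive turns (customer support, code debugging, casual chat), but every large model has a fixed context window (for example, 4k tokens for GPT-3.5 and 128k tokens for GPT-4o):
- When too many messages accumulate and the total token count exceeds the limit, the model call fails outright;
- Even below the limit, a large amount of redundant history increases the model's workload and noticeably slows responses;
- History in third-party storage grows without bound, occupying storage over the long term and raising operating costs.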
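What we need here is a smart slimming tool: the chat history reducer (Chat Reducer). Microsoft anticipated this scenario and built two core reducers into Microsoft.Extensions.AI. They work out of the box, so there is no need to reinvent the wheel. Together they cover the needs of different long-conversation scenarios, and my Demo code supports switching between them by toggling comments. The overview below combines the official definitions with practical usage.

Reducer type: MessageCountingChatReducer
Core logic: limits the number of non-system messages and keeps the latest N; always keeps the first system message; excludes function call / function result messages
Constructor parameters (as used in the code): maxMessageCount: the maximum number of non-system messages to keep
Use case: simple Q&A and casual chat, where the message count needs precise control

Reducer type: SummarizingChatReducer
Core logic: automatically summarizes older messages once the conversation exceeds a threshold; keeps system messages plus the latest N original messages; excludes function-related messages
Constructor parameters (as used in the code): chatClient: the model client used for summarization; maxMessageCountBeforeSummarization: the threshold that triggers summarization; maxSummaryCount: the maximum number of summaries
Use case: complex long conversations, where the semantics of the context must be preserved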
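Key supplement: the official core characteristics of the two reducers

1. MessageCountingChatReducer
Official definition, distilled:
It limits the number of non-system messages in a conversation, keeping the most recent messages and the first system message (if present). It excludes messages that contain function calls or function results, and suits scenarios where the size of the chat history must be constrained (for example, to fit a model's context limit).
In short: it is a "precise trimming" tool. It keeps only the most recent messages and does no semantic processing, so it is fast and consumes no extra tokens.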
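To make the "precise trimming" idea concrete, here is a minimal sketch of count-based reduction. It is not the library's implementation: `Msg` and `CountTrimmer` are hypothetical stand-ins for `ChatMessage` and the reducer, and the treatment of function-related messages simply follows the description above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Demo: keep the system prompt plus the 2 most recent non-system messages.
var history = new List<Msg>
{
    new Msg("system", "You tell short jokes."),
    new Msg("user", "Joke 1?"),
    new Msg("assistant", "Here is joke 1."),
    new Msg("user", "Joke 2?"),
    new Msg("assistant", "Here is joke 2."),
};

foreach (var m in CountTrimmer.Trim(history, 2))
{
    Console.WriteLine($"{m.Role}: {m.Text}");
}

// Hypothetical stand-in for ChatMessage; IsFunctionRelated marks
// function call / function result messages.
public record Msg(string Role, string Text, bool IsFunctionRelated = false);

public static class CountTrimmer
{
    // Keeps the first system message (if any) and the latest
    // maxMessageCount non-system, non-function messages.
    public static List<Msg> Trim(IEnumerable<Msg> messages, int maxMessageCount)
    {
        var all = messages.ToList();
        var firstSystem = all.FirstOrDefault(m => m.Role == "system");

        var latest = all
            .Where(m => m.Role != "system" && !m.IsFunctionRelated)
            .TakeLast(maxMessageCount);

        var result = new List<Msg>();
        if (firstSystem is not null)
        {
            result.Add(firstSystem);
        }
        result.AddRange(latest);
        return result;
    }
}
```

Because nothing here calls a model, the cost is a single pass over the list, which is why this style of reducer adds no token consumption.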
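2. SummarizingChatReducer
Official definition, distilled:
It reduces a collection of chat messages to summary form. When the conversation exceeds a specified length, it automatically summarizes older messages, preserving context while reducing the message count. It keeps system messages, and function-related messages are excluded from the summary.
In short: it is a "smart compression" tool. It uses the model to condense older messages into a summary, which shrinks the history without losing its core meaning, and suits very long, complex conversations.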
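The "smart compression" flow can be sketched in the same spirit. This is an illustration under stated assumptions, not the library's code: `Msg`, `SummaryTrimmer`, and the `summarize` delegate are hypothetical, the summary is stored as an assistant message purely for the demo, and a fake summarizer stands in for the model call so the sketch runs offline.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

// Fake summarizer; a real one would send the older messages to the chat model.
Func<IReadOnlyList<Msg>, Task<string>> fakeSummarizer =
    older => Task.FromResult($"(summary of {older.Count} earlier messages)");

var history = new List<Msg>
{
    new Msg("system", "You tell short jokes."),
    new Msg("user", "Joke 1?"),
    new Msg("assistant", "Here is joke 1."),
    new Msg("user", "Joke 2?"),
    new Msg("assistant", "Here is joke 2."),
};

var reduced = await SummaryTrimmer.ReduceAsync(history, 2, fakeSummarizer);
foreach (var m in reduced)
{
    Console.WriteLine($"{m.Role}: {m.Text}");
}

// Hypothetical stand-in for ChatMessage.
public record Msg(string Role, string Text);

public static class SummaryTrimmer
{
    // Keeps system messages and the latest keepLatest messages verbatim,
    // and replaces everything older with a single summary message.
    public static async Task<List<Msg>> ReduceAsync(
        IReadOnlyList<Msg> messages,
        int keepLatest,
        Func<IReadOnlyList<Msg>, Task<string>> summarize)
    {
        var system = messages.Where(m => m.Role == "system").ToList();
        var rest = messages.Where(m => m.Role != "system").ToList();

        // Below the threshold there is nothing to compress.
        if (rest.Count <= keepLatest)
        {
            return messages.ToList();
        }

        var older = rest.Take(rest.Count - keepLatest).ToList();
        var latest = rest.Skip(rest.Count - keepLatest).ToList();
        var summary = await summarize(older);

        var result = new List<Msg>(system);
        result.Add(new Msg("assistant", summary));
        result.AddRange(latest);
        return result;
    }
}
```

The trade-off against count-based trimming is visible in the signature: the reducer needs a way to call the model, so each reduction costs an extra request.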
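Key question: why define IChatClient explicitly, unlike in the previous article?
This line in the code is the key adaptation:

    IChatClient chatClient = new OpenAIClient(...).GetChatClient(modelName).AsIChatClient();

The reasons are straightforward:
- SummarizingChatReducer has to call the model to generate summaries, so it must be given an IChatClient instance;
- OpenAIClient.GetChatClient returns the OpenAI SDK's own ChatClient type, which AsIChatClient() converts to the general-purpose interface for compatibility;
- Even if you only use MessageCountingChatReducer, defining it explicitly keeps the code tidy and means switching reducers later requires no major changes.

This Demo builds on the optimized code to implement "conversation persistence + a choice between two reducers". The core goals: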
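- Conversation history is stored in an external vector store (persisted, not lost, and recoverable after a restart);
- The two built-in reducers can be swapped seamlessly, and the storage is updated automatically after each reduction;
- On exit, the reduction result is verified by printing the history messages that were kept.
The walkthrough below proceeds in two steps: dependency preparation, then a breakdown of the core code.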
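1. Core dependencies: confirm versions and install the required packages
Make sure the project references the following packages:

    <PackageReference Include="Microsoft.Agents.AI.OpenAI" Version="1.0.0-preview.251110.2" />
    <PackageReference Include="Microsoft.SemanticKernel.Connectors.InMemory" Version="1.67.1-preview" />

2. Core code breakdown: the full path from storage to Agent
(1) VectorChatMessageStore: the bridge between third-party storage and the reducers
This class is the core. It handles conversation persistence and also integrates the reducer logic, and it works with both built-in reducers: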
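    internal sealed class VectorChatMessageStore : ChatMessageStore
    {
        private readonly VectorStore _vectorStore;                   // external storage backend
        public string? ThreadDbKey { get; private set; }             // unique conversation id
        public IChatReducer? ChatReducer { get; }                    // reducer instance (either built-in type)
        public ChatReducerTriggerEvent ReducerTriggerEvent { get; }  // when reduction is triggered

        // Constructor without a reducer (compatible with the earlier scenario)
        public VectorChatMessageStore(VectorStore vectorStore, JsonElement serializedStoreState, JsonSerializerOptions? jsonSerializerOptions = null)
        {
            this._vectorStore = vectorStore ?? throw new ArgumentNullException(nameof(vectorStore));
            // Deserialize the conversation id (supports resuming a thread)
            if (serializedStoreState.ValueKind is JsonValueKind.String)
            {
                this.ThreadDbKey = serializedStoreState.Deserialize<string>();
            }
        }

        // Constructor with a reducer (the core adaptation)
        public VectorChatMessageStore(
            IChatReducer chatReducer,
            VectorStore vectorStore,
            JsonElement serializedStoreState,
            JsonSerializerOptions? jsonSerializerOptions = null,
            ChatReducerTriggerEvent reducerTriggerEvent = ChatReducerTriggerEvent.BeforeMessagesRetrieval)
            : this(vectorStore, serializedStoreState, jsonSerializerOptions)
        {
            this.ChatReducer = chatReducer;
            this.ReducerTriggerEvent = reducerTriggerEvent;
        }

        // Core method: reduce automatically when messages are added, then sync the storage
        public override async Task AddMessagesAsync(IEnumerable<ChatMessage> messages, CancellationToken cancellationToken = default)
        {
            this.ThreadDbKey ??= Guid.NewGuid().ToString("N"); // generate the conversation id on first save
            // "ChatHistory" is an assumed collection name
            var collection = this._vectorStore.GetCollection<string, ChatHistoryItem>("ChatHistory");
            await collection.EnsureCollectionExistsAsync(cancellationToken);

            // --- Core chat history reduction logic ---
            // 1. Read the existing history (returned newest first)
            var chatHistoryItems = collection.GetAsync(
                x => x.ThreadId == this.ThreadDbKey, int.MaxValue,
                new() { OrderBy = x => x.Descending(y => y.Timestamp) },
                cancellationToken);
            List<ChatMessage> chatHistoryMessages = [];
            await foreach (var record in chatHistoryItems)
            {
                chatHistoryMessages.Add(JsonSerializer.Deserialize<ChatMessage>(record.SerializedMessage!)!);
            }
            // Restore chronological order so the reducer sees the newest messages last
            chatHistoryMessages.Reverse();

            // 2. Merge the existing history with the new messages
            chatHistoryMessages.AddRange(messages);

            // 3. Trigger reduction right after adding (works with either reducer)
            if (this.ReducerTriggerEvent is ChatReducerTriggerEvent.AfterMessageAdded && this.ChatReducer is not null)
            {
                chatHistoryMessages = (await this.ChatReducer.ReduceAsync(chatHistoryMessages, cancellationToken).ConfigureAwait(false)).ToList();
            }

            // --- Sync the reduced history back to third-party storage ---
            // Delete the old data to avoid redundancy. Note: this drops the whole
            // collection, including records that belong to other conversations.
            await collection.EnsureCollectionDeletedAsync(cancellationToken);
            await collection.EnsureCollectionExistsAsync(cancellationToken);
            // Store the reduced messages
            await collection.UpsertAsync(chatHistoryMessages.Select(x => new ChatHistoryItem
            {
                Key = this.ThreadDbKey + x.MessageId,       // unique message key (conversation id + message id)
                Timestamp = DateTimeOffset.UtcNow,          // storage timestamp
                ThreadId = this.ThreadDbKey,                // owning conversation
                SerializedMessage = JsonSerializer.Serialize(x), // serialized message
                MessageText = x.Text                        // message text (for retrieval)
            }), cancellationToken);
        }

        // Read the reduced history
        public override async Task<IEnumerable<ChatMessage>> GetMessagesAsync(CancellationToken cancellationToken = default)
        {
            var collection = this._vectorStore.GetCollection<string, ChatHistoryItem>("ChatHistory");
            await collection.EnsureCollectionExistsAsync(cancellationToken);
            var records = collection
                .GetAsync(x => x.ThreadId == this.ThreadDbKey, int.MaxValue,
                    new() { OrderBy = x => x.Descending(y => y.Timestamp) },
                    cancellationToken);
            List<ChatMessage> messages = [];
            await foreach (var record in records)
            {
                messages.Add(JsonSerializer.Deserialize<ChatMessage>(record.SerializedMessage!)!);
            }
            messages.Reverse(); // return in ascending time order, as the Agent expects
            return messages;
        }

        // Serialize the conversation state (supports thread persistence)
        public override JsonElement Serialize(JsonSerializerOptions? jsonSerializerOptions = null)
        {
            return JsonSerializer.SerializeToElement(this.ThreadDbKey);
        }

        // Vector store record model: defines how a message is stored
        private sealed class ChatHistoryItem
        {
            [VectorStoreKey] public string? Key { get; set; }                 // unique key
            [VectorStoreData] public string? ThreadId { get; set; }           // conversation id
            [VectorStoreData] public DateTimeOffset? Timestamp { get; set; }  // timestamp
            [VectorStoreData] public string? SerializedMessage { get; set; }  // serialized message
            [VectorStoreData] public string? MessageText { get; set; }        // message text
        }
    }

Key design highlights: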
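- Compatibility: the IChatReducer interface accommodates both built-in reducers, so switching between them requires no change to the storage class;
- Data consistency: after reduction the old stored data is deleted first and the new data inserted afterwards, so what is persisted is the slimmed-down history;
- Trigger timing: the enum offers AfterMessageAdded (reduce right after adding) and BeforeMessagesRetrieval (reduce before querying). The store shown above only acts on AfterMessageAdded, so BeforeMessagesRetrieval would need equivalent handling in GetMessagesAsync.

(2) Agent integration: choose one of the two reducers, ready to copy
The code uses a comment-toggle design, so neither option requires major changes: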
Option 1: MessageCountingChatReducer (enabled by default, count-based trimming)

    public static async Task DemoAsync(string apiKey, string modelName, string endpoint)
    {
        var clientOptions = new OpenAIClientOptions { Endpoint = new Uri(endpoint) };
        // Define IChatClient explicitly so the reducers can be swapped
        IChatClient chatClient = new OpenAIClient(new ApiKeyCredential(apiKey), clientOptions)
            .GetChatClient(modelName)
            .AsIChatClient();

        var agent = chatClient
            .CreateAIAgent(new ChatClientAgentOptions
            {
                Instructions = "你是一个擅长讲笑话的Agent,回复简洁有趣", // the first system message is kept
                Name = "ZerekZhang",
                ChatMessageStoreFactory = ctx =>
                {
                    // MessageCountingChatReducer: keep at most 2 non-system messages
                    return new VectorChatMessageStore(
                        new MessageCountingChatReducer(2),
                        new InMemoryVectorStore(), // third-party storage (can be replaced with Redis)
                        ctx.SerializedState,
                        ctx.JsonSerializerOptions,
                        ChatReducerTriggerEvent.AfterMessageAdded // reduce right after adding messages
                    );
                }
            });

        // Thread serialization + resume (the core of conversation persistence)
        AgentThread thread = agent.GetNewThread();
        JsonElement serializedThread = thread.Serialize(); // serialized thread state (can go to a database or file)
        AgentThread resumedThread = agent.DeserializeThread(serializedThread); // resume the thread (simulates a service restart)

        // Interaction loop
        while (true)
        {
            var userInput = Console.ReadLine();
            if (userInput == "Exit")
            {
                // Verify the reduction: print the history messages that were kept
                var messageStore = resumedThread.GetService<VectorChatMessageStore>()!;
                var messages = await messageStore.GetMessagesAsync();
                Console.WriteLine("\n缩减后的历史消息:");
                foreach (var item in messages)
                {
                    Console.WriteLine($"{item.Role}:{item.Text}");
                }
                break;
            }
            var response = await agent.RunAsync(userInput, resumedThread);
            Console.WriteLine($"Agent Output:{response}\n");
        }
    }

Option 2: SummarizingChatReducer (uncomment to use, semantic summarization)

    // After uncommenting, replace the Agent creation logic of Option 1
    var agent = chatClient.CreateAIAgent(new ChatClientAgentOptions
    {
        Instructions = "你是一个擅长讲笑话的Agent,回复简洁有趣",
        Name = "ZerekZhang",
        ChatMessageStoreFactory = ctx =>
        {
            // SummarizingChatReducer: summarize once there are more than 2 messages, keep at most 10 summaries
            return new VectorChatMessageStore(
                new SummarizingChatReducer(
                    chatClient, // model client used for summarization
                    2,          // maxMessageCountBeforeSummarization: more than 2 messages triggers summarization
                    10          // maxSummaryCount: keep at most 10 summaries (avoids redundant summaries)
                ),
                new InMemoryVectorStore(),
                ctx.SerializedState,
                ctx.JsonSerializerOptions,
                ChatReducerTriggerEvent.AfterMessageAdded
            );
        }
    });

- MessageCountingChatReducer: set maxMessageCount according to the model's context window (for example, 5 to 8 messages for GPT-3.5);
- SummarizingChatReducer: choose maxMessageCountBeforeSummarization so that the retained messages take up roughly one third of the model's context window, balancing semantic retention against token consumption;
- Choosing the trigger timing:
  - If you need to audit the complete history: use BeforeMessagesRetrieval (store the full history, reduce at query time);
  - If you need to save storage: use AfterMessageAdded (store the reduced history);
- Fallback logic: in production, add a fallback for when the reducer call fails (for example, keep the latest 10 messages by default).
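The fallback advice above can be sketched as a small wrapper. This is one possible shape under stated assumptions, not part of the Demo: `SafeReducer` and its delegate-based signature are hypothetical, messages are plain strings to keep the example self-contained, and in the real store the same try/catch would wrap the call to `ChatReducer.ReduceAsync`.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

// Simulate a reducer whose summarization call fails (for example, a model timeout).
Func<IReadOnlyList<string>, Task<IReadOnlyList<string>>> failingReducer =
    _ => throw new TimeoutException("model unavailable");

var history = Enumerable.Range(1, 15).Select(i => $"message {i}").ToList();
var reduced = await SafeReducer.ReduceWithFallbackAsync(history, failingReducer, 10);
Console.WriteLine($"{reduced.Count} messages kept, first: {reduced[0]}");

public static class SafeReducer
{
    // Runs the reducer; if it throws, falls back to keeping the latest
    // fallbackCount messages so the conversation can continue.
    public static async Task<IReadOnlyList<string>> ReduceWithFallbackAsync(
        IReadOnlyList<string> messages,
        Func<IReadOnlyList<string>, Task<IReadOnlyList<string>>> reduce,
        int fallbackCount = 10)
    {
        try
        {
            return await reduce(messages);
        }
        catch (Exception ex) when (ex is not OperationCanceledException)
        {
            // Cancellation is left to propagate; every other failure degrades gracefully.
            Console.Error.WriteLine($"Reducer failed, keeping latest {fallbackCount}: {ex.Message}");
            return messages.TakeLast(fallbackCount).ToList();
        }
    }
}
```

The fallback is deliberately the cheap count-based strategy: it needs no model call, so it still works when the failure is caused by the model itself being unavailable.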
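With this Demo code, achieving "conversation persistence + long conversations that stay within the limit" takes just 3 steps:
- Dependency preparation: make sure Microsoft.Agents.AI is referenced, so the two built-in reducers are available;
- Storage adaptation: use VectorChatMessageStore and pass the reducer through its constructor, so reduction and storage synchronization are handled automatically;
- Agent configuration: choose a reducer by scenario (MessageCounting for simple scenarios, Summarizing for complex long conversations), and define IChatClient explicitly so the two can be swapped.
This approach solves the "easily lost, not shared" problem of conversation history and uses Microsoft's native capabilities to work within the model's context limit. No custom development is needed, development efficiency improves considerably, and a long-conversation Agent can realistically go to production.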

    using Microsoft.Agents.AI;
    using Microsoft.Extensions.AI;
    using Microsoft.Extensions.VectorData;
    using Microsoft.SemanticKernel.Connectors.InMemory;
    using OpenAI;
    using System.ClientModel;
    using System.Text.Json;
    using static Microsoft.Agents.AI.InMemoryChatMessageStore;
    using VectorStore = Microsoft.Extensions.VectorData.VectorStore;

    namespace AgentDemo
    {
    #pragma warning disable OPENAI001
    #pragma warning disable MEAI001
        /// <summary>
        /// Third-party storage for Agent conversation history
        /// </summary>
        internal static partial class AgentConversationSaveBase
        {
            public static async Task DemoAsync(string apiKey, string modelName, string endpoint)
            {
                var clientOptions = new OpenAIClientOptions { Endpoint = new Uri(endpoint) };
                IChatClient chatClient = new OpenAIClient(new ApiKeyCredential(apiKey), clientOptions).GetChatClient(modelName).AsIChatClient();

                var agent = chatClient
                    .CreateAIAgent(new ChatClientAgentOptions
                    {
                        Instructions = "你是一个擅长讲笑话的Agent",
                        Name = "ZerekZhang",
                        ChatMessageStoreFactory = ctx =>
                        {
                            return new VectorChatMessageStore(new MessageCountingChatReducer(2), new InMemoryVectorStore(), ctx.SerializedState, ctx.JsonSerializerOptions, ChatReducerTriggerEvent.AfterMessageAdded);
                        }
                    });

                //// Demo using SummarizingChatReducer
                //var agent = chatClient.CreateAIAgent(new ChatClientAgentOptions
                //{
                //    Instructions = "你是一个擅长讲笑话的Agent",
                //    Name = "ZerekZhang",
                //    ChatMessageStoreFactory = ctx =>
                //    {
                //        return new VectorChatMessageStore(new SummarizingChatReducer(chatClient, 2, 10), new InMemoryVectorStore(), ctx.SerializedState, ctx.JsonSerializerOptions, ChatReducerTriggerEvent.AfterMessageAdded);
                //    }
                //});

                AgentThread thread = agent.GetNewThread();
                JsonElement serializedThread = thread.Serialize();
                AgentThread resumedThread = agent.DeserializeThread(serializedThread);

                while (true)
                {
                    var userInput = Console.ReadLine();
                    if (userInput == "Exit")
                    {
                        // On exit, query the stored history (verifies the reduction)
                        var messageStore = resumedThread.GetService<VectorChatMessageStore>()!;
                        var messages = await messageStore.GetMessagesAsync();
                        Console.WriteLine("\n缩减后的历史消息:");
                        foreach (var item in messages)
                        {
                            Console.WriteLine($"{item.Role}:{item.Text}");
                        }
                        break;
                    }
                    var response = await agent.RunAsync(userInput, resumedThread);
                    Console.WriteLine($"Agent Output:{response}\n");
                }
            }
        }

        internal sealed class VectorChatMessageStore : ChatMessageStore
        {
            private readonly VectorStore _vectorStore;
            public string? ThreadDbKey { get; private set; }
            public IChatReducer? ChatReducer { get; }
            public ChatReducerTriggerEvent ReducerTriggerEvent { get; }

            public VectorChatMessageStore(VectorStore vectorStore, JsonElement serializedStoreState, JsonSerializerOptions? jsonSerializerOptions = null)
            {
                this._vectorStore = vectorStore ?? throw new ArgumentNullException(nameof(vectorStore));
                if (serializedStoreState.ValueKind is JsonValueKind.String)
                {
                    this.ThreadDbKey = serializedStoreState.Deserialize<string>();
                }
            }

            public VectorChatMessageStore(IChatReducer chatReducer, VectorStore vectorStore, JsonElement serializedStoreState, JsonSerializerOptions? jsonSerializerOptions = null, ChatReducerTriggerEvent reducerTriggerEvent = ChatReducerTriggerEvent.BeforeMessagesRetrieval)
                : this(vectorStore, serializedStoreState, jsonSerializerOptions)
            {
                this.ChatReducer = chatReducer;
                this.ReducerTriggerEvent = reducerTriggerEvent;
            }

            public override async Task AddMessagesAsync(IEnumerable<ChatMessage> messages, CancellationToken cancellationToken = default)
            {
                this.ThreadDbKey ??= Guid.NewGuid().ToString("N");
                // "ChatHistory" is an assumed collection name
                var collection = this._vectorStore.GetCollection<string, ChatHistoryItem>("ChatHistory");
                await collection.EnsureCollectionExistsAsync(cancellationToken);

                // --- Reduce the chat history when adding messages ---
                var chatHistoryItems = collection.GetAsync(x => x.ThreadId == this.ThreadDbKey, int.MaxValue, new() { OrderBy = x => x.Descending(y => y.Timestamp) }, cancellationToken);
                List<ChatMessage> chatHistoryMessages = [];
                await foreach (var record in chatHistoryItems)
                {
                    chatHistoryMessages.Add(JsonSerializer.Deserialize<ChatMessage>(record.SerializedMessage!)!);
                }
                // Records arrive newest first; restore chronological order before merging
                chatHistoryMessages.Reverse();
                chatHistoryMessages.AddRange(messages);
                if (this.ReducerTriggerEvent is ChatReducerTriggerEvent.AfterMessageAdded && this.ChatReducer is not null)
                {
                    chatHistoryMessages = (await this.ChatReducer.ReduceAsync(chatHistoryMessages, cancellationToken).ConfigureAwait(false)).ToList();
                }

                // --- Sync the reduced chat history back to third-party storage ---
                // Note: this drops the whole collection, including other conversations' records.
                await collection.EnsureCollectionDeletedAsync(cancellationToken);
                await collection.EnsureCollectionExistsAsync(cancellationToken);
                await collection.UpsertAsync(chatHistoryMessages.Select(x => new ChatHistoryItem
                {
                    Key = this.ThreadDbKey + x.MessageId,
                    Timestamp = DateTimeOffset.UtcNow,
                    ThreadId = this.ThreadDbKey,
                    SerializedMessage = JsonSerializer.Serialize(x),
                    MessageText = x.Text
                }), cancellationToken);
            }

            public override async Task<IEnumerable<ChatMessage>> GetMessagesAsync(CancellationToken cancellationToken = default)
            {
                var collection = this._vectorStore.GetCollection<string, ChatHistoryItem>("ChatHistory");
                await collection.EnsureCollectionExistsAsync(cancellationToken);
                var records = collection
                    .GetAsync(x => x.ThreadId == this.ThreadDbKey, int.MaxValue, new() { OrderBy = x => x.Descending(y => y.Timestamp) }, cancellationToken);
                List<ChatMessage> messages = [];
                await foreach (var record in records)
                {
                    messages.Add(JsonSerializer.Deserialize<ChatMessage>(record.SerializedMessage!)!);
                }
                messages.Reverse();
                return messages;
            }

            public override JsonElement Serialize(JsonSerializerOptions? jsonSerializerOptions = null)
            {
                return JsonSerializer.SerializeToElement(this.ThreadDbKey);
            }

            private sealed class ChatHistoryItem
            {
                [VectorStoreKey] public string? Key { get; set; }
                [VectorStoreData] public string? ThreadId { get; set; }
                [VectorStoreData] public DateTimeOffset? Timestamp { get; set; }
                [VectorStoreData] public string? SerializedMessage { get; set; }
                [VectorStoreData] public string? MessageText { get; set; }
            }
        }
    #pragma warning restore OPENAI001
    #pragma warning restore MEAI001
    }

Source: opendotnet
