[Hands-On Guide] Calling Large Language Models (LLMs) with Python: From Basics to Practice


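In today's fast-moving AI landscape, large language models (LLMs) have become a core tool for developers building intelligent applications. Whether for chatbots, text generation, or question answering, calling an LLM can greatly speed up development. This article takes a hands-on approach and explains how to call LLMs from Python, covering mainstream platform APIs, locally deployed models, and general-purpose framework integration, to help developers get started quickly.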

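Python's rich library ecosystem, concise syntax, and broad compatibility make it the first-choice language for calling LLMs. Whether you call a cloud API or a locally deployed model, Python offers a mature toolchain that lets developers integrate models quickly without digging into the underlying principles.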

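Cloud LLM APIs are the most common way to call a model: no local hardware setup is needed, and a simple HTTP request gives access to mainstream models such as GPT and ERNIE Bot (文心一言).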

OpenAI provides a well-specified RESTful API that can be called directly with the requests library:

    import requests

    def call_gpt(prompt, api_key):
        url = "https://api.openai.com/v1/chat/completions"
        headers = {
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        }
        data = {
            "model": "gpt-3.5-turbo",
            "messages": [
                {"role": "system", "content": "You are a professional technical assistant"},
                {"role": "user", "content": prompt},
            ],
            "temperature": 0.7,  # output randomness (the API accepts 0 to 2)
            "max_tokens": 500,   # upper bound on the generated length
        }
        response = requests.post(url, headers=headers, json=data, timeout=60)
        response.raise_for_status()  # surface HTTP errors before parsing
        # .json() is a method and must be called to parse the body
        return response.json()["choices"][0]["message"]["content"]

    # Usage example
    print(call_gpt("Explain what a large language model is", "your_api_key"))
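Indexing straight into `["choices"][0]` fails with an unhelpful `KeyError` when the service returns an error body instead of a completion. The following is a minimal sketch of a more defensive parser; `extract_reply` is a hypothetical helper name, and the error shape `{"error": {"message": ...}}` is an assumption about the response body that should be checked against the provider's documentation.

```python
from __future__ import annotations

from typing import Any


def extract_reply(body: dict[str, Any]) -> str:
    """Return the assistant text from a parsed chat-completions response.

    Raises RuntimeError when the body reports an error or has no choices.
    """
    error = body.get("error")
    if error:
        # Assumed error shape: {"error": {"message": "..."}}
        if isinstance(error, dict):
            message = error.get("message", "unknown error")
        else:
            message = str(error)
        raise RuntimeError(f"API error: {message}")

    choices = body.get("choices") or []
    if not choices:
        raise RuntimeError("response contained no choices")
    return choices[0]["message"]["content"]
```

It would be applied to the parsed body, for example `extract_reply(response.json())`, so that a rejected key or exhausted quota shows up as a readable message.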

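Models from Chinese providers, such as Baidu's ERNIE Bot, usually require you to obtain an access token first and then send the request: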

    import requests

    def get_ernie_token(api_key, secret_key):
        """Fetch an ERNIE Bot access token."""
        url = f"https://aip.baidubce.com/oauth/2.0/token?grant_type=client_credentials&client_id={api_key}&client_secret={secret_key}"
        # .json() is a method and must be called to parse the body
        return requests.get(url, timeout=30).json().get("access_token")

    def call_ernie(prompt, token):
        url = f"https://aip.baidubce.com/rpc/2.0/ai_custom/v1/wenxinworkshop/chat/completions_pro?access_token={token}"
        data = {"messages": [{"role": "user", "content": prompt}]}
        response = requests.post(url, json=data, timeout=60)
        return response.json().get("result")

    # Usage example
    token = get_ernie_token("your_api_key", "your_secret_key")
    print(call_ernie("Describe the application scenarios of natural language processing", token))
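Requesting a fresh token before every call adds an extra round trip. One possible approach is to cache the token and refetch it shortly before it expires. The sketch below is illustrative only: `TokenCache` is a hypothetical class, and it assumes the caller supplies a `fetch` function that returns the token together with its lifetime in seconds (for instance, taken from an `expires_in` field if the token response provides one).

```python
from __future__ import annotations

import time
from typing import Callable


class TokenCache:
    """Cache an access token and refetch it shortly before it expires."""

    def __init__(
        self,
        fetch: Callable[[], tuple[str, float]],
        margin: float = 60.0,
        clock: Callable[[], float] = time.time,
    ) -> None:
        self._fetch = fetch    # hypothetical: returns (token, lifetime_seconds)
        self._margin = margin  # refresh this many seconds before expiry
        self._clock = clock    # injectable so the cache can be tested
        self._token: str | None = None
        self._expires_at = 0.0

    def get(self) -> str:
        now = self._clock()
        if self._token is None or now >= self._expires_at - self._margin:
            token, lifetime = self._fetch()
            self._token = token
            self._expires_at = now + lifetime
        return self._token
```

Each call to `get()` then returns the cached value until the safety margin is reached, at which point `fetch` runs once more.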

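For sensitive data or environments without network access, deploying a model locally is the better choice. Taking the open-source framework llama.cpp as an example, you can run models such as LLaMA and Mistral: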

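1. Download a model file (for example, a quantized model in .gguf format).
2. Start the llama.cpp server (recent llama.cpp builds name this binary llama-server):

    ./server -m model.gguf -c 2048

3. Call it over HTTP:

    import requests

    def call_local_llm(prompt):
        url = "http://localhost:8080/completion"
        data = {
            "prompt": f"User: {prompt}\nAssistant:",
            "n_predict": 300,
            "temperature": 0.8,
            "stop": ["\nUser:"],  # stop before the model invents another user turn
        }
        response = requests.post(url, json=data, timeout=120)
        # .json() is a method and must be called to parse the body
        return response.json().get("content")

    # Usage example
    print(call_local_llm("Recommend a Python book for beginners"))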
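Because the completion endpoint takes a single raw prompt string, a multi-turn conversation has to be flattened into text by the caller. The following is a minimal sketch of that idea, reusing the turn-marker-plus-stop-sequence pattern from the request above. The helper names `build_prompt` and `build_request` and the default tags are assumptions for illustration, not part of llama.cpp.

```python
from __future__ import annotations


def build_prompt(
    history: list[tuple[str, str]],
    question: str,
    user_tag: str = "User:",
    assistant_tag: str = "Assistant:",
) -> str:
    """Flatten earlier (user, assistant) turns plus a new question into one prompt."""
    lines: list[str] = []
    for user_text, assistant_text in history:
        lines.append(f"{user_tag} {user_text}")
        lines.append(f"{assistant_tag} {assistant_text}")
    lines.append(f"{user_tag} {question}")
    lines.append(assistant_tag)  # left open for the model to complete
    return "\n".join(lines)


def build_request(
    history: list[tuple[str, str]],
    question: str,
    user_tag: str = "User:",
    assistant_tag: str = "Assistant:",
) -> dict:
    """Build a request body whose stop sequence matches the user tag."""
    return {
        "prompt": build_prompt(history, question, user_tag, assistant_tag),
        "n_predict": 300,
        "temperature": 0.8,
        "stop": [f"\n{user_tag}"],
    }
```

Deriving the stop sequence from the same tag that marks user turns keeps the two in sync, so the model's output is cut off before it starts writing the next user turn.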

Frameworks such as LangChain make it possible to switch between models seamlessly and reduce code coupling:

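    from langchain_openai import ChatOpenAI
    from langchain_community.llms import LlamaCpp
    from langchain.chains import LLMChain
    from langchain.prompts import PromptTemplate

    # Initialize the model (swap between a cloud and a local model here)
    llm = ChatOpenAI(model_name="gpt-3.5-turbo", api_key="your_key")
    # Local model alternative:
    # llm = LlamaCpp(model_path="./model.gguf", n_ctx=2048)

    # Define the prompt template
    prompt = PromptTemplate(
        input_variables=["question"],
        template="Answer concisely: {question}",
    )

    # Create the chain. Note: LLMChain and chain.run are deprecated in recent
    # LangChain releases, which favor composing `prompt | llm` and calling .invoke().
    chain = LLMChain(llm=llm, prompt=prompt)
    print(chain.run("What is the difference between machine learning and deep learning?"))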
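The decoupling idea does not depend on any particular framework. As a rough, framework-free illustration (this is not how LangChain is implemented), each backend can be treated as a plain function from prompt text to reply text and selected by name. `ModelRouter` and its method names are hypothetical.

```python
from __future__ import annotations

from typing import Callable

# A backend is any callable that maps a prompt string to a reply string.
Backend = Callable[[str], str]


class ModelRouter:
    """Apply one shared prompt template and dispatch to a named backend."""

    def __init__(self, template: str = "Answer concisely: {question}") -> None:
        self._template = template
        self._backends: dict[str, Backend] = {}

    def register(self, name: str, backend: Backend) -> None:
        self._backends[name] = backend

    def ask(self, name: str, question: str) -> str:
        if name not in self._backends:
            raise KeyError(f"unknown backend: {name}")
        prompt = self._template.format(question=question)
        return self._backends[name](prompt)
```

Functions such as `call_local_llm` could be registered directly, since they already take a prompt and return text. A function that needs extra arguments, such as an API key, would first be wrapped, for example with `functools.partial`.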

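This article introduced three core ways to call LLMs from Python: quick integration through cloud APIs, privacy-preserving local deployment, and unified access through a framework. Choose the approach that fits your scenario: cloud APIs for fast validation, local deployment for private or offline use, and framework integration for complex applications. With these methods in hand, you can build AI applications far more efficiently.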