EasyControl for AI Art Is Here: Miyazaki's "Ghibli" Style, Open Source and Free to Use

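Hello to all AI art enthusiasts and creators! Have you ever run into this? The latest AI image models (transformer-based DiT models such as FLUX) produce great results, but precisely controlling what they generate, for example fixing a character's pose, preserving a specific face, or controlling the layout, is a real struggle. Either generation is slow, or the moment you add control, the effect of your favorite LoRA (a specific character or art style) disappears.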

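Earlier UNet-based models had great tools such as ControlNet and IP-Adapter, but for the more powerful DiT models, efficient and flexible control has remained a hard problem. Now there is good news: meet EasyControl, a new framework built specifically to solve this problem. Its goal is to make conditional control of DiT models efficient, flexible, and convenient.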

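You can think of EasyControl as a "universal control kit" tailored for DiT-based image generators. It gives you precise control without heavily modifying the base model and without sacrificing much speed.

How does it do this? It relies on three key techniques: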

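Lightweight "control plugins" (Condition Injection LoRA modules): EasyControl provides something like plugins. Each one handles a single control signal (such as pose, depth, or facial features). It works independently and is plug-and-play, and most importantly it does not conflict with your base model or with other custom models (such as your favorite character or style LoRA). Even though each module is trained on a single condition only, modules can later be combined zero-shot (for example pose + face + style), and the results stay harmonious.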
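To make the "plugin" idea concrete, here is a minimal pure-Python sketch of a low-rank (LoRA) update that is applied only to tokens going through the condition branch, while ordinary tokens still see the frozen base weight. The names `matvec`, `project`, and `pose_lora` are hypothetical and the numbers are toy values; this illustrates the general LoRA mechanism under those assumptions and is not EasyControl's actual implementation.

```python
def matvec(matrix, vector):
    """Multiply a matrix (given as a list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def project(token, base_weight, lora=None, scale=1.0):
    """Project one token; add the low-rank LoRA delta only if `lora` is given."""
    out = matvec(base_weight, token)
    if lora is not None:
        down, up = lora  # down: rank x dim, up: dim x rank
        delta = matvec(up, matvec(down, token))
        out = [o + scale * d for o, d in zip(out, delta)]
    return out


base_weight = [[1.0, 0.0], [0.0, 1.0]]      # frozen base projection (identity here)
pose_lora = ([[1.0, 1.0]], [[1.0], [0.0]])  # hypothetical rank-1 control module

noise_out = project([1.0, 2.0], base_weight)                 # base path, no LoRA
cond_out = project([1.0, 2.0], base_weight, lora=pose_lora)  # condition branch

print(noise_out)  # [1.0, 2.0]
print(cond_out)   # [4.0, 2.0]
```

Because the base weight is never modified, the base model's behavior on ordinary tokens is unchanged, which is one plausible reading of why such modules can coexist with other custom LoRAs.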

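Smart "size magic" (Position-Aware Training Paradigm): this technique teaches EasyControl during training to handle condition images at different resolutions. As a result, you can freely choose the output size and aspect ratio at generation time instead of being locked to a fixed one, and computation becomes more efficient as well.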
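The article does not spell out how positions are handled, so the following is only a sketch of one way to read "position-aware": when a condition image is encoded on a smaller token grid than the output, each condition token's index is rescaled so that it points at the matching location on the output grid. The function name `interpolated_positions` and the grid sizes are assumptions for illustration.

```python
def interpolated_positions(target_grid, cond_grid):
    """Map every condition-token index onto the coordinate system of the target grid."""
    target_h, target_w = target_grid
    cond_h, cond_w = cond_grid
    scale_h = target_h / cond_h
    scale_w = target_w / cond_w
    return [
        [(row * scale_h, col * scale_w) for col in range(cond_w)]
        for row in range(cond_h)
    ]


# Output is a 64x48 token grid; the condition image was downsized to 32x24 tokens.
positions = interpolated_positions(target_grid=(64, 48), cond_grid=(32, 24))
print(positions[1][1])    # (2.0, 2.0)
print(positions[31][23])  # (62.0, 46.0)
```

With this kind of mapping, the condition can use fewer tokens (cheaper attention) while still lining up spatially with an output of any size or aspect ratio.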

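A fast "accelerator" (causal attention + KV cache): EasyControl caches the key/value information of the condition tokens so that it does not have to be recomputed, which significantly reduces the wait when generating an image.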
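The link between causal attention and caching can be sketched as follows. If condition tokens are only allowed to attend to their own group, their keys and values do not depend on the noisy image tokens that change at every denoising step, so they can be computed once and reused. The helper `build_causal_mask` and the class `ConditionKVCache` are hypothetical names for this illustration; the exact masking rules in EasyControl may differ.

```python
def build_causal_mask(n_main, cond_sizes):
    """mask[q][k] is True when query token q may attend to key token k.

    The first n_main tokens (text + noisy image) see everything; each
    condition group sees only itself.
    """
    total = n_main + sum(cond_sizes)
    mask = [[False] * total for _ in range(total)]
    for q in range(n_main):
        mask[q] = [True] * total
    start = n_main
    for size in cond_sizes:
        for q in range(start, start + size):
            for k in range(start, start + size):
                mask[q][k] = True
        start += size
    return mask


class ConditionKVCache:
    """Compute condition keys/values on first use, then reuse them."""

    def __init__(self, compute_kv):
        self._compute_kv = compute_kv
        self._bank = None
        self.calls = 0

    def get(self, cond_tokens):
        if self._bank is None:
            self._bank = self._compute_kv(cond_tokens)
            self.calls += 1
        return self._bank

    def clear(self):
        self._bank = None


mask = build_causal_mask(n_main=2, cond_sizes=[2, 1])

# Stand-in for the real key/value projection of the condition tokens.
cache = ConditionKVCache(lambda tokens: [(t, t) for t in tokens])
for _ in range(25):  # 25 denoising steps, a single KV computation
    kv = cache.get(["c0", "c1"])
print(cache.calls)  # 1
```

Clearing the cache between images matters because a new condition image needs fresh keys and values, which is what the `clear_cache` helper in the demo script later in this article appears to do.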

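Condition signals are injected into the Diffusion Transformer (DiT) through a newly introduced condition branch, which encodes the condition tokens together with the lightweight, plug-and-play Condition Injection LoRA modules.

Figure: overview of the EasyControl framework

During training, each condition is trained separately: the condition image is resized to a lower resolution and trained with the position-aware training paradigm, which makes flexible-resolution training efficient. The framework incorporates a causal attention mechanism, which enables a key-value (KV) cache and significantly improves inference efficiency. This design also makes it easy to integrate multiple Condition Injection LoRA modules for robust, harmonious multi-condition generation.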

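Many people love using LoRA models to recreate the dreamy animation style of Hayao Miyazaki (Studio Ghibli). The usual problem has been that once you try to control the pose with a tool like ControlNet, the Ghibli style drifts or is noticeably weakened.

EasyControl's big advantage is compatibility. Because its control modules are lightweight and independent, it can apply control (guiding a pose, locking the subject) while preserving as much as possible of the custom LoRA you have loaded. In other words, you can use EasyControl to control the content of the image precisely and still let your favorite Miyazaki-style LoRA do its job.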

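Best of all, the EasyControl team has open-sourced it! You can find the code and models on GitHub and Hugging Face. That means you can download it today, try it out, and integrate it into your AI art workflow to create images that are both precisely controlled and full of artistic style.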

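In short, if you want to:

control the latest AI image models (the DiT architecture) more precisely;

use custom character or style LoRAs (such as the Ghibli style) at the same time;

and generate images faster and more flexibly;

then EasyControl is a tool you should not miss. It is efficient, flexible, and highly compatible, so give it a try and take your AI creations to the next level!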

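Since EasyControl is open source, readers who like working with code can follow the code on GitHub to run and deploy the model locally. Below is the code for Ghibli-style generation: upload your own image to generate and control a Ghibli-style picture.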

import spaces
import os
import json
import time
import torch
from PIL import Image
from tqdm import tqdm
import gradio as gr
from safetensors.torch import save_file
from src.pipeline import FluxPipeline
from src.transformer_flux import FluxTransformer2DModel
from src.lora_helper import set_single_lora, set_multi_lora, unset_lora

# Base FLUX model and the local directory that holds the control LoRA weights
base_path = "black-forest-labs/FLUX.1-dev"
lora_base_path = "./checkpoints/models"

# Load the pipeline, swap in the repository's transformer, and move it to the GPU
pipe = FluxPipeline.from_pretrained(base_path, torch_dtype=torch.bfloat16)
transformer = FluxTransformer2DModel.from_pretrained(
    base_path, subfolder="transformer", torch_dtype=torch.bfloat16
)
pipe.transformer = transformer
pipe.to("cuda")


def clear_cache(transformer):
    # Clear the key/value bank held by each attention processor
    for name, attn_processor in transformer.attn_processors.items():
        attn_processor.bank_kv.clear()


@spaces.GPU
def single_condition_generate_image(prompt, spatial_img, height, width, seed, control_type):
    # Pick the LoRA file for the requested control type
    if control_type == "Ghibli":
        lora_path = os.path.join(lora_base_path, "Ghibli.safetensors")
    else:
        raise gr.Error(f"Unsupported control type: {control_type}")
    set_single_lora(pipe.transformer, lora_path, lora_weights=[1], cond_size=512)

    # Use the uploaded image as the spatial condition, if one was provided
    spatial_imgs = [spatial_img] if spatial_img else []
    image = pipe(
        prompt,
        height=int(height),
        width=int(width),
        guidance_scale=3.5,
        num_inference_steps=25,
        max_sequence_length=512,
        # gr.Number returns a float, while manual_seed expects an int
        generator=torch.Generator("cpu").manual_seed(int(seed)),
        subject_images=[],
        spatial_images=spatial_imgs,
        cond_size=512,
    ).images[0]
    clear_cache(pipe.transformer)
    return image


control_types = ["Ghibli"]

with gr.Blocks() as demo:
    gr.Markdown("# Ghibli Studio Control Image Generation with EasyControl")
    gr.Markdown("The model is trained on **only 100 real Asian faces** paired with **GPT-4o-generated Ghibli-style counterparts**, and it preserves facial features while applying the iconic anime aesthetic.")
    gr.Markdown("Generate images using EasyControl with Ghibli control LoRAs.（Due to hardware constraints, only low-resolution images can be generated. For high-resolution (1024+), please set up your own environment.）")
    gr.Markdown("**[Attention!!]**：The recommended prompts for using Ghibli Control LoRA should include the trigger words: `Ghibli Studio style, Charming hand-drawn anime-style illustration`")
    gr.Markdown("If you like this demo, please give us a star (github: [EasyControl](https://github.com/Xiaojiu-z/EasyControl))")

    with gr.Tab("Ghibli Condition Generation"):
        with gr.Row():
            with gr.Column():
                prompt = gr.Textbox(label="Prompt", value="Ghibli Studio style, Charming hand-drawn anime-style illustration")
                spatial_img = gr.Image(label="Ghibli Image", type="pil")  # upload an image file
                height = gr.Slider(minimum=256, maximum=1024, step=64, label="Height", value=768)
                width = gr.Slider(minimum=256, maximum=1024, step=64, label="Width", value=768)
                seed = gr.Number(label="Seed", value=42)
                # Default to the only available control type so the dropdown is never empty
                control_type = gr.Dropdown(choices=control_types, label="Control Type", value="Ghibli")
                single_generate_btn = gr.Button("Generate Image")
            with gr.Column():
                single_output_image = gr.Image(label="Generated Image")

    single_generate_btn.click(
        single_condition_generate_image,
        inputs=[prompt, spatial_img, height, width, seed, control_type],
        outputs=single_output_image,
    )

demo.queue().launch()
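If you call the generation function from your own script rather than through the web UI, it can help to enforce the same constraints the demo's widgets impose: sizes between 256 and 1024 in steps of 64, and a prompt that contains the recommended trigger words. The helpers below, `snap` and `prepare_request`, are hypothetical conveniences written for this article and are not part of the EasyControl repository.

```python
TRIGGER = "Ghibli Studio style, Charming hand-drawn anime-style illustration"


def snap(value, minimum=256, maximum=1024, step=64):
    """Clamp to the slider range, then round to a nearest multiple of `step`."""
    clamped = max(minimum, min(maximum, int(value)))
    return minimum + round((clamped - minimum) / step) * step


def prepare_request(prompt, height, width):
    """Prepend the trigger words if they are missing and normalize the size."""
    if TRIGGER.lower() not in prompt.lower():
        prompt = f"{TRIGGER}, {prompt}" if prompt.strip() else TRIGGER
    return prompt, snap(height), snap(width)


prompt, height, width = prepare_request("a girl in a park", 700, 1300)
print(height, width)  # 704 1024
```

The resulting values can then be passed to `single_condition_generate_image` together with your image, a seed, and the control type `"Ghibli"`.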

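EasyControl also supports other kinds of control, such as image size control, human pose control, and generating images from sketches. For more applications, see the official GitHub repository and the open-source models on Hugging Face.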