CycleUser

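Alibaba's newly released Z-Image model produces good-looking images, but it appears to need at least 16 GB of VRAM. This post shows how to run Z-Image on the CPU instead.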

1. Environment Setup

First, create a Conda virtual environment:

conda create -n zimage python=3.12
conda activate zimage

Then install the dependencies:

pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu130
pip install git+https://github.com/huggingface/diffusers
pip install transformers
pip install modelscope
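
Before moving on, it can help to confirm that the four packages actually landed in the active environment. The following is a minimal sketch that is not part of the original walkthrough: it uses only the standard library's importlib.metadata, and the helper name installed_versions is hypothetical.

```python
from importlib import metadata

# Distribution names as passed to pip in the commands above.
PACKAGES = ["torch", "diffusers", "transformers", "modelscope"]


def installed_versions(names):
    """Map each distribution name to its installed version, or None if it is missing."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions(PACKAGES).items():
        print(f"{name}: {version or 'NOT INSTALLED'}")
```

Any package reported as NOT INSTALLED should be reinstalled inside the zimage environment before continuing.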

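Next, use modelscope to download the model into a specific folder. The commands below place it in ~/Z-Image-Turbo/. Save the script from the next section in the home directory as well, so that it can load the model from the 'Z-Image-Turbo' folder sitting next to it. The reference code in the official documentation downloads from Hugging Face, which can run into network access problems. The model is about 34 GB, so make sure there is enough free disk space.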

cd ~/ # switch to the working directory
modelscope download --model Tongyi-MAI/Z-Image-Turbo --local_dir 'Z-Image-Turbo'
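
Since the download is large, a quick look at free disk space before running the command above can save a failed transfer. The sketch below is an illustrative addition, not part of the original steps: the helper has_room is a hypothetical name, and the 34 GB figure from the text is treated as GiB for simplicity.

```python
import shutil
from pathlib import Path

GIB = 1024 ** 3
REQUIRED_GIB = 34  # approximate model size quoted in the text, treated here as GiB


def has_room(path, required_gib):
    """Return True if the filesystem holding `path` has at least `required_gib` GiB free."""
    free_bytes = shutil.disk_usage(path).free
    return free_bytes >= required_gib * GIB


if __name__ == "__main__":
    home = Path.home()
    free_gib = shutil.disk_usage(home).free / GIB
    print(f"Free space under {home}: {free_gib:.1f} GiB")
    if not has_room(home, REQUIRED_GIB):
        print(f"Less than {REQUIRED_GIB} GiB free; the download may not fit.")
```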

2. Running the Code

Run the following code:

import torch
from diffusers import ZImagePipeline

pipe = ZImagePipeline.from_pretrained(
    "Z-Image-Turbo",
    torch_dtype=torch.bfloat16,
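    low_cpu_mem_usage=True,  # low-memory loading mode; with it, Z-Image runs in under 32 GB of RAM
)
pipe.to("cpu")  # run on the CPU to avoid running out of VRAM

# Prompt (in English): "A panda holding a newspaper whose text reads
# 'China wins the World Cup'; the panda is overjoyed."
prompt = "一只熊猫拿着报纸，报纸上的文字是“中国队勇夺世界杯”，熊猫特别高兴"
# Generate the image
image = pipe(
    prompt=prompt,  # text prompt
    height=1024,  # image height in pixels
    width=1024,  # image width in pixels
    num_inference_steps=9,  # This actually results in 8 DiT forwards
    guidance_scale=0.0,  # Guidance should be 0 for the Turbo models
    generator=torch.Generator("cpu").manual_seed(42),  # the generator is also switched to the CPU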
).images[0]

image.save("result.png")

Running the code above prints progress output like the following:

Loading pipeline components...:   0%|                                      | 0/5 [00:00<?, ?it/s]`torch_dtype` is deprecated! Use `dtype` instead!
Loading checkpoint shards: 100%|██████████████████████████████████| 3/3 [00:00<00:00, 185.14it/s]
Loading checkpoint shards: 100%|███████████████████████████████████| 3/3 [00:01<00:00,  1.52it/s]
Loading pipeline components...: 100%|██████████████████████████████| 5/5 [00:02<00:00,  2.26it/s]
100%|██████████████████████████████████████████████████████████████| 9/9 [04:41<00:00, 31.24s/it]

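Because everything runs on the CPU, generation is slower than on a GPU. On my laptop with an AMD Ryzen 9 7945HX and 128 GB of RAM, memory usage stayed under 30 GB and the run took about 280 seconds. The result looks decent.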
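
The total time can be cross-checked against the last progress line, which reports 9 iterations at about 31.24 s each. Here is a small sketch of that arithmetic, using only figures taken from the log above (model loading adds a couple of seconds on top):

```python
steps = 9                 # num_inference_steps from the script
seconds_per_step = 31.24  # s/it reported on the last progress line

total_seconds = steps * seconds_per_step
minutes, seconds = divmod(round(total_seconds), 60)

print(f"{total_seconds:.2f} s, about {minutes}:{seconds:02d}")  # 281.16 s, about 4:41
```

This matches the 04:41 shown in the progress bar, so on this machine the run time scales roughly linearly with the number of inference steps.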

The generated image is saved as result.png.