Wan2GP API Server

from magicmars35

Skill for AI agents: a wrapper that generates videos through a local Wan2GP instance.

Wan2GP Agentic Skill

Wan2GP Agentic Skill allows Linux AI agents such as OpenClaw, Hermes, or any Python-based agent to generate videos through a Windows PC running Wan2GP on the local network.

Concept

The concept is simple:

Linux AI Agent
 |
 | HTTP API
 v
Windows PC with Wan2GP + GPU
 |
 | FastAPI server
 | - video generation API
 | - built-in queue monitor
 | - MP4 download endpoint
 v
Generated MP4 video
 |
 v
Returned to the agent

The project contains two main parts:

  • a FastAPI server to install on the Wan2GP Windows machine

  • a Python skill to install on the AI agent machines

The monitoring interface is served directly by the FastAPI server. No separate web server is required.

Features

  • text to video generation

  • image to video generation

  • start image + end image to video generation

  • audio to video generation

  • audio + reference image to video generation

  • audio + reference image + LoRA generation

  • single universal Wan2GP template file

  • server-side mode routing

  • job queue tracking

  • job status monitoring

  • built-in HTML monitoring dashboard

  • automatic MP4 download after generation

  • requester IP tracking

  • requester user-agent tracking

  • fixed Wan2GP model on the server side

  • optional non-blocking job submission for agents

Generation modes

t2v            text to video
i2v            image to video
i2v_end        start image + end image to video
s2v            sound/audio to video
s2v_i2v        sound/audio + reference image to video
s2v_i2v_lora   sound/audio + reference image + LoRA
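
The mode table above can be read as a small decision rule over the inputs that accompany the prompt. The sketch below is illustrative only: `choose_mode` and `generate_endpoint` are hypothetical helper names, not functions from the skill, and the handling of unusual combinations (for example an end image without a start image) is an assumption.

```python
def choose_mode(
    has_image: bool = False,
    has_end_image: bool = False,
    has_audio: bool = False,
    wants_lora: bool = False,
) -> str:
    """Return the generation mode name for a combination of inputs."""
    if has_audio:
        if has_image:
            # LoRA is only offered together with audio + reference image.
            return "s2v_i2v_lora" if wants_lora else "s2v_i2v"
        return "s2v"
    if has_image:
        return "i2v_end" if has_end_image else "i2v"
    # Assumption: anything else falls back to plain text to video.
    return "t2v"


def generate_endpoint(mode: str) -> str:
    """Map a mode name to its POST path, following the /generate/<mode> pattern."""
    return f"/generate/{mode}"
```

For example, `choose_mode(has_image=True, has_audio=True)` selects `s2v_i2v`, which corresponds to `POST /generate/s2v_i2v`.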

Current version highlights

This version uses one universal Wan2GP JSON template instead of one template per generation mode.

The server loads:

ltx2_template_universal.json

Then it applies the correct mode controls automatically:

  • image prompt type

  • audio prompt type

  • start image

  • end image

  • audio guide

  • LoRA activation

  • prompt enhancer

  • multimodal generation type

This keeps the configuration cleaner and avoids maintaining several nearly identical JSON files.
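
As a rough illustration of the "one template, per-mode controls" idea, the sketch below copies a base settings dictionary and overlays a small set of mode-specific flags. The key names (`use_image`, `use_end_image`, `use_audio`, `use_lora`) and the `apply_mode` function are hypothetical placeholders; the real template keys and the server's actual routing code may look quite different.

```python
import copy


def _flags(image=False, end_image=False, audio=False, lora=False):
    # Hypothetical setting names, for illustration only. They are NOT the
    # real keys of ltx2_template_universal.json.
    return {
        "use_image": image,
        "use_end_image": end_image,
        "use_audio": audio,
        "use_lora": lora,
    }


# One small override table replaces several nearly identical template files.
MODE_OVERRIDES = {
    "t2v": _flags(),
    "i2v": _flags(image=True),
    "i2v_end": _flags(image=True, end_image=True),
    "s2v": _flags(audio=True),
    "s2v_i2v": _flags(image=True, audio=True),
    "s2v_i2v_lora": _flags(image=True, audio=True, lora=True),
}


def apply_mode(template: dict, mode: str) -> dict:
    """Return a copy of the universal template with the mode flags applied."""
    if mode not in MODE_OVERRIDES:
        raise ValueError(f"unknown mode: {mode}")
    settings = copy.deepcopy(template)  # never mutate the shared template
    settings.update(MODE_OVERRIDES[mode])
    return settings
```

Copying before updating keeps the loaded template reusable across jobs, so one request's flags cannot leak into the next.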

The server also includes a built-in monitoring dashboard:

http://SERVER_IP:7861/monitor?token=YOUR_SECRET_TOKEN

Alias:

http://SERVER_IP:7861/ui?token=YOUR_SECRET_TOKEN

Fixed model

The server is designed to use one fixed model:

LTX-2 2.3 Distilled 1.1 22B

Internal Wan2GP model identifier:

ltx2_22B_distilled_1_1

The model is intentionally locked on the server side to prevent agents from switching models or launching unexpected heavy generations.

Repository structure

wan2gp_agentic_skill/
│
├── README.md
│
├── wan2gp_server/
│   ├── wan2gp_api_server.py
│   └── ltx2_template_universal.json
│
└── wan2gp_video_agent_skill/
    ├── wan2gp_skill.py
    └── SKILL.md

The logic is:

  • wan2gp_server goes on the Windows PC running Wan2GP

  • wan2gp_video_agent_skill goes on the Linux AI agent machines

  • the monitoring dashboard is built into the FastAPI server

Built-in monitoring dashboard

The FastAPI server includes its own monitoring dashboard.

Open:

http://192.168.1.53:7861/monitor?token=YOUR_SECRET_TOKEN

Or:

http://192.168.1.53:7861/ui?token=YOUR_SECRET_TOKEN

The dashboard displays:

  • API status

  • loaded model

  • total jobs

  • active jobs

  • completed jobs

  • failed jobs

  • job status

  • queue position

  • generation mode

  • progress

  • current phase

  • current step

  • requester IP

  • requester user-agent

  • prompt excerpt

  • input files

  • LoRA information

  • generation duration

  • MP4 download link

The dashboard accepts the token as a URL query parameter because a browser cannot attach an Authorization: Bearer ... header when a page is opened directly from the address bar.
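
An agent that wants to hand the user a monitor link could build it as follows. This is a minimal sketch: `monitor_url` is a hypothetical helper, and the default port 7861 is taken from the example URLs in this README. URL-encoding the token keeps the link valid if the token contains characters such as `&` or spaces.

```python
from urllib.parse import urlencode


def monitor_url(host: str, token: str, port: int = 7861, path: str = "/monitor") -> str:
    """Build the dashboard URL with the token passed as a query parameter."""
    query = urlencode({"token": token})  # percent-encodes unsafe characters
    return f"http://{host}:{port}{path}?{query}"


print(monitor_url("192.168.1.53", "YOUR_SECRET_TOKEN"))
print(monitor_url("192.168.1.53", "YOUR_SECRET_TOKEN", path="/ui"))
```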

What to tell the agents

Add this to the system prompt or skill configuration of your agent:

You have access to a skill called Wan2GP Video.

Use this skill whenever the user asks for video generation.

Choose the mode automatically:

- text only: t2v
- image + prompt: i2v
- start image + end image + prompt: i2v_end
- audio + prompt: s2v
- audio + image + prompt: s2v_i2v
- audio + image + explicit LoRA request: s2v_i2v_lora

After submitting the job, retrieve the job_id, monitor progress with get_job_status, wait until the job is complete, download the generated MP4, then return the video file to the user.

When useful, provide the user with the built-in monitor URL so they can follow the queue visually.

Main API endpoints

GET /health
GET /model
GET /jobs
GET /jobs/{job_id}
GET /download/{job_id}/{filename}
GET /monitor?token=YOUR_SECRET_TOKEN
GET /ui?token=YOUR_SECRET_TOKEN
GET /monitor/download/{job_id}/{filename}?token=YOUR_SECRET_TOKEN

POST /generate/t2v
POST /generate/i2v
POST /generate/i2v_end
POST /generate/s2v
POST /generate/s2v_i2v
POST /generate/s2v_i2v_lora

Protected API endpoints require a Bearer token:

Authorization: Bearer YOUR_SECRET_TOKEN
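
From Python, the header can be attached with the standard library alone. The sketch below is an illustration, not the skill's own client code: `api_request` and `get_json` are hypothetical names, and it assumes that `GET /jobs` is one of the protected endpoints and that the server answers with JSON.

```python
import json
import urllib.request


def api_request(base_url: str, path: str, token: str) -> urllib.request.Request:
    """Prepare a GET request carrying the Bearer token."""
    return urllib.request.Request(
        base_url.rstrip("/") + path,
        headers={"Authorization": f"Bearer {token}"},
    )


def get_json(base_url: str, path: str, token: str, timeout: float = 10.0):
    """Send the request and decode the body, assuming a JSON response."""
    request = api_request(base_url, path, token)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


# Usage (needs a running server on the LAN):
# jobs = get_json("http://192.168.1.53:7861", "/jobs", "YOUR_SECRET_TOKEN")
```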

The browser monitoring endpoints accept the token as a query parameter:

?token=YOUR_SECRET_TOKEN

Endpoint details

GET /health

Returns basic API status and available modes.

GET /model

Returns the fixed model information, default generation settings, template file path, and supported mode controls.

GET /jobs

Returns all jobs currently known to the API server.

Jobs are stored in memory. If the server restarts, the job history is cleared.

GET /jobs/{job_id}

Returns a single job with runtime fields such as queue_position and short_status.
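
An agent typically polls this endpoint until the job finishes. The loop below is a sketch under stated assumptions: the README names `queue_position` and `short_status`, but the `status` field and its terminal values `completed` and `failed` are guesses that may not match the real response. `wait_for_job` is a hypothetical helper; `fetch_job` stands for any callable that performs `GET /jobs/{job_id}` and returns the decoded job as a dictionary.

```python
import time
from typing import Callable

# Assumed terminal values of an assumed "status" field; adjust to the
# values the server actually returns.
DONE_STATES = {"completed", "failed"}


def wait_for_job(
    fetch_job: Callable[[], dict],
    poll_seconds: float = 5.0,
    timeout_seconds: float = 1800.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll until the job reaches a terminal state or the timeout expires."""
    deadline = time.monotonic() + timeout_seconds
    while True:
        job = fetch_job()
        if job.get("status") in DONE_STATES:
            return job
        if time.monotonic() >= deadline:
            raise TimeoutError("job did not finish before the timeout")
        sleep(poll_seconds)
```

Passing `fetch_job` and `sleep` as parameters keeps the loop independent of the HTTP client and makes it easy to exercise without a server.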

GET /download/{job_id}/{filename}

Downloads a generated MP4 using Bearer token authentication.
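
A download over this endpoint might look like the sketch below. `download_url` and `download_mp4` are hypothetical helpers, and where `job_id` and `filename` come from (presumably the job record) is an assumption. The response is streamed to disk so a large MP4 is never held fully in memory.

```python
import shutil
import urllib.request
from pathlib import Path
from urllib.parse import quote


def download_url(base_url: str, job_id: str, filename: str) -> str:
    """Build /download/{job_id}/{filename} with both path segments escaped."""
    safe_job = quote(job_id, safe="")
    safe_name = quote(filename, safe="")
    return f"{base_url.rstrip('/')}/download/{safe_job}/{safe_name}"


def download_mp4(base_url: str, job_id: str, filename: str, token: str,
                 dest_dir: str = ".") -> Path:
    """Stream the generated MP4 to dest_dir using Bearer authentication."""
    request = urllib.request.Request(
        download_url(base_url, job_id, filename),
        headers={"Authorization": f"Bearer {token}"},
    )
    # Keep only the final name component so the file stays inside dest_dir.
    target = Path(dest_dir) / Path(filename).name
    with urllib.request.urlopen(request, timeout=60) as response, open(target, "wb") as out:
        shutil.copyfileobj(response, out)
    return target
```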

GET /monitor

Displays the built-in HTML queue dashboard.

GET /monitor/download/{job_id}/{filename}

Downloads a generated MP4 from the browser dashboard using the token query parameter.

Prompting recommendations for LTX 2.3

For LTX 2.3 video generation, write the prompt as a clear cinematic direction, not as a list of keywords.

Recommended structure:

  • shot type

  • camera movement

  • environment

  • lighting

  • subject

  • visible action

  • mood expressed through physical details

  • audio or dialogue when needed

Example:

The camera starts in a tight cinematic close-up, then slowly pushes forward as the woman raises her eyes toward the lens. Warm studio light reflects softly on her face. She breathes in, pauses, and says in French, "Je crois que j'ai enfin compris." Her voice is quiet and sincere. After the line, she gives a small uncertain smile while the background remains softly blurred.

For Image-to-Video, do not redescribe what is already visible in the reference image. Describe only:

  • camera movement

  • subject movement

  • environmental changes

  • lighting changes

  • facial expression changes

  • dialogue or sound timing

If the video contains dialogue, include the dialogue directly inside the prompt exactly where it happens in the scene.

Important notes

The Wan2GP API server must be running before agents can generate videos.

The first generation after startup may be slower if the model has to be loaded into VRAM.

Jobs are stored in memory on the API server.

If the API server is restarted, previous job_id values will no longer be available.

Use the built-in monitor to follow the queue visually while agents submit generation jobs.

Security

This project is designed for local network or VPN usage.

Do not expose the Wan2GP API directly to the public Internet.

Recommendations:

  • use a long Bearer token

  • do not publish real tokens on GitHub

  • restrict the Windows firewall to agent IP addresses if possible

  • keep Wan2GP LAN-only

  • use the placeholder YOUR_SECRET_TOKEN instead of a real token in public files

  • avoid committing local IPs or private infrastructure details if the repository is public
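
One way to produce a long Bearer token is Python's standard `secrets` module. This is a suggestion rather than something the project prescribes; `new_bearer_token` is a hypothetical helper name.

```python
import secrets


def new_bearer_token(num_bytes: int = 32) -> str:
    """Return a URL-safe random token built from num_bytes of randomness."""
    return secrets.token_urlsafe(num_bytes)


if __name__ == "__main__":
    # Print once, then store the value outside the repository.
    print(new_bearer_token())
```

Because the result uses only URL-safe characters, the same token works unchanged in the Authorization header and in the ?token= query parameter.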

Disclaimer

This project is a wrapper around Wan2GP.

It does not include Wan2GP, AI models, model weights, LoRA files, or any third-party generation assets.

Respect the licenses and terms of use of Wan2GP, the models, and the LoRA files you use.