
Commit f10fb06

Add ACE Step notebook
1 parent dd4bfae commit f10fb06

6 files changed, +2780 -0 lines changed


.ci/skipped_notebooks.yml

Lines changed: 8 additions & 0 deletions
@@ -574,3 +574,11 @@
   skips:
     - os:
         - macos-13
+- notebook: notebooks/ace-step-music-generation/ace-step-music-generation.ipynb
+  skips:
+    - python:
+        - "3.9"
+        - "3.11"
+        - "3.12"
+    - os:
+        - macos-13
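
For readers unfamiliar with the skip list, here is a minimal Python sketch of how a CI runner might interpret the new entry. The matching rule (skip the notebook when any listed condition matches the job) and the `should_skip` helper are assumptions made for illustration; the repository's actual CI scripts are not shown in this commit.

```python
# Hypothetical sketch: interpreting one entry of .ci/skipped_notebooks.yml.
# The entry mirrors the YAML added in this commit; the matching rule
# (skip when ANY listed condition matches the job) is an assumption.

ENTRY = {
    "notebook": "notebooks/ace-step-music-generation/ace-step-music-generation.ipynb",
    "skips": [
        {"python": ["3.9", "3.11", "3.12"]},
        {"os": ["macos-13"]},
    ],
}


def should_skip(entry: dict, python_version: str, os_name: str) -> bool:
    """Return True if any skip condition in the entry matches the CI job."""
    job = {"python": python_version, "os": os_name}
    for condition in entry["skips"]:
        for key, values in condition.items():
            if job.get(key) in values:
                return True
    return False


print(should_skip(ENTRY, "3.10", "ubuntu-22.04"))  # False
print(should_skip(ENTRY, "3.12", "ubuntu-22.04"))  # True
print(should_skip(ENTRY, "3.10", "macos-13"))      # True
```

Under this reading, the notebook would run only on Python versions not listed (for example 3.10) and on runners other than `macos-13`.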

.ci/spellcheck/.pyspelling.wordlist.txt

Lines changed: 11 additions & 0 deletions
@@ -52,6 +52,8 @@ autogenerated
 AutoModelForXxx
 autoregressive
 autoregressively
+AutoEncoder
+AutoEncoders
 AutoTokenizer
 AWQ
 awq
@@ -200,6 +202,7 @@ denoises
 denoising
 denormalization
 denormalized
+demucs
 depainting
 deployable
 DepthAnything
@@ -230,6 +233,7 @@ DIT
 DiT
 DiT’s
 DiT’s
+DiTs
 DL
 DocLayNet
 docling
@@ -290,6 +294,8 @@ FastDraft
 FastSAM
 FC
 feedforward
+FeedForward
+FFN
 FFmpeg
 FIL
 FEIL
@@ -606,6 +612,7 @@ MRPC
 mRoPE
 msi
 MTVQA
+mT
 multiarchitecture
 Multiclass
 multiclass
@@ -703,6 +710,7 @@ opset
 optimizable
 Orca
 otsl
+OSNet
 OTSL
 OuteTTS
 outpainting
@@ -778,6 +786,7 @@ PowerShell
 PPYOLOv
 PR
 Prateek
+PLR
 pre
 Precisions
 precomputed
@@ -942,6 +951,7 @@ SmolVLM
 softmax
 softvc
 SoftVC
+SongGen
 SOTA
 SoTA
 soundfile
@@ -1122,6 +1132,7 @@ Vladlen
 VOC
 Vocoder
 vocoder
+vocoding
 VQ
 VQA
 VQGAN

notebooks/ace-step-music-generation/README.md

Lines changed: 33 additions & 0 deletions
@@ -0,0 +1,33 @@
# Music generation using ACE Step and OpenVINO

[ACE-Step](https://ace-step.github.io/) is a novel open-source foundation model for music generation that overcomes key limitations of existing approaches and achieves state-of-the-art performance through a holistic architectural design. Current methods face inherent trade-offs between generation speed, musical coherence, and controllability. ACE-Step bridges this gap by integrating diffusion-based generation with Sana’s Deep Compression AutoEncoder (DCAE) and a lightweight linear transformer. The model achieves superior musical coherence and lyric alignment across melody, harmony, and rhythm metrics. Moreover, ACE-Step preserves fine-grained acoustic details, enabling advanced control mechanisms such as voice cloning, lyric editing, remixing, and track generation (e.g., lyric2vocal, singing2accompaniment).

ACE-Step adapts a text-to-image diffusion framework for music generation. The core generative model is a diffusion model operating on a compressed mel spectrogram latent representation. This process is guided by conditioning information from three specialized encoders: a text prompt encoder, a lyric encoder, and a speaker encoder. Embeddings from these encoders are concatenated and integrated into the diffusion model via cross-attention mechanisms.
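
The conditioning path described above can be illustrated with a small, dependency-free Python sketch. It is not the ACE-Step implementation: the toy embedding values, the concatenation along the token axis, and the single-head attention without learned projections are all assumptions chosen to keep the example short.

```python
import math

# Illustrative sketch only, NOT the ACE-Step code. Assumptions: toy 2-dim
# embeddings, concatenation along the token axis, and single-head attention
# with no learned query/key/value projections.


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross_attention(queries, context):
    """Each query token attends over all context tokens (keys == values)."""
    scale = 1.0 / math.sqrt(len(queries[0]))
    dim = len(context[0])
    outputs = []
    for q in queries:
        weights = softmax([dot(q, k) * scale for k in context])
        outputs.append(
            [sum(w * v[d] for w, v in zip(weights, context)) for d in range(dim)]
        )
    return outputs


# Made-up encoder outputs: lists of token embeddings.
text_emb = [[1.0, 0.0]]
lyric_emb = [[0.0, 1.0], [0.5, 0.5]]
speaker_emb = [[1.0, 1.0]]

# Concatenate the three conditioning sequences into one context sequence.
context = text_emb + lyric_emb + speaker_emb

# Made-up latent tokens standing in for the compressed mel spectrogram latents.
latent_tokens = [[0.2, 0.1], [0.0, 0.3], [0.4, 0.4]]

conditioned = cross_attention(latent_tokens, context)
print(len(context), len(conditioned), len(conditioned[0]))  # 4 3 2
```

Each latent token ends up as a weighted mix of the text, lyric, and speaker tokens, which is the sense in which the encoders guide the diffusion model in this simplified picture.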

ACE-Step can be used to generate original music from text descriptions, to remix music and transfer style, and to edit song lyrics. The model offers a set of controllable features that allow users to precisely control the generation process and enable targeted modifications to existing audio material, as well as perform specialized generation tasks through fine-tuning.

<img src="https://raw.githubusercontent.com/ACE-Step/ACE-Step/main/assets/ACE-Step_framework.png" width=90% style="display: block; margin: auto;" />

More details about the model can be found using the following resources: [project page](https://ace-step.github.io/), [paper](https://arxiv.org/abs/2506.00045), [original repository](https://github.com/ace-step/ACE-Step).


## Notebook Contents

This notebook demonstrates how to convert the ACE Step model and run music generation or editing with OpenVINO.

The tutorial consists of the following steps:

- Install prerequisites
- Download the ACE Step model and run inference
- Convert the model to IR format and run inference with OpenVINO
- Download and apply a LoRA, then generate audio with it
- Interactive demo


## Installation Instructions

This is a self-contained example that relies solely on its own code.<br>
We recommend running the notebook in a virtual environment. You only need a Jupyter server to start.
For details, please refer to [Installation Guide](../../README.md).

<img referrerpolicy="no-referrer-when-downgrade" src="https://static.scarf.sh/a.png?x-pxid=5b5a4db0-7875-4bfb-bdbd-01698b5b1a77&file=notebooks/ace-step-music-generation/README.md" />

notebooks/ace-step-music-generation/ace-step-music-generation.ipynb

Lines changed: 888 additions & 0 deletions
Large diffs are not rendered by default.
