[Inference Providers] Async calls for text-to-video with fal.ai #1292
base: main
Conversation
Co-authored-by: Julien Chaumond <[email protected]>
…e.js into async-calls-falai
Looks good 🤩
```ts
async function pollFalResponse(res: FalAiOutput, args: TextToVideoArgs, options?: Options): Promise<Blob> {
	const requestId = res.request_id;
	if (!requestId) {
		throw new InferenceOutputError("No request ID found in the response");
	}
	let status = res.status;
	const { url, info } = await makeRequestOptions(args, { ...options, task: "text-to-video" });
	const baseUrl = url?.split("?")[0] || "";
	const query = url?.includes("_subdomain=queue") ? "?_subdomain=queue" : "";

	const statusUrl = `${baseUrl}/requests/${requestId}/status${query}`;
	const resultUrl = `${baseUrl}/requests/${requestId}${query}`;

	while (status !== "COMPLETED") {
		await delay(1000);
		const statusResponse = await fetch(statusUrl, { headers: info.headers });

		if (!statusResponse.ok) {
			throw new Error(`HTTP error! status: ${statusResponse.status}`);
		}
		status = (await statusResponse.json()).status;
	}

	const resultResponse = await fetch(resultUrl, { headers: info.headers });
	const result = await resultResponse.json();
	const isValidOutput =
		typeof result === "object" &&
		!!result &&
		"video" in result &&
		typeof result.video === "object" &&
		!!result.video &&
		"url" in result.video &&
		typeof result.video.url === "string" &&
		isUrl(result.video.url);
	if (!isValidOutput) {
		throw new InferenceOutputError("Expected { video: { url: string } }");
	}
	const urlResponse = await fetch(result.video.url);
	return await urlResponse.blob();
}
```
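For reference, `delay` above is a small helper that is not shown in this snippet. A minimal sketch, assuming it simply resolves after a fixed number of milliseconds (the actual util in the repo may differ):

```ts
// Minimal sketch of a delay helper used to wait between polling iterations.
function delay(ms: number): Promise<void> {
	return new Promise((resolve) => setTimeout(resolve, ms));
}
```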
I think it would make sense to move this util into the `src/providers/fal-ai.ts` file.
ah yes, we should probably do the same for:
`async function pollBflResponse(url: string, outputType?: "url" | "blob"): Promise<Blob> {`
Addressed in cf2d1ac. I had to move `FalAiOutput` to `src/providers/fal-ai.ts` to avoid an import cycle. Maybe we should move every provider-specific output type to their respective provider files, wdyt?
> maybe we should move every provider-specific output type to their respective provider files, wdyt?
Agree with this, and in particular their output type + the logic to validate and parse the response. Same as the `get_response`'s in Python. Out of scope for this PR though.
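To make that concrete, here is a hypothetical sketch of a provider-owned output type plus the logic to validate and parse the response, mirroring the role of the Python client's `get_response`. All names below are illustrative, not the actual huggingface.js API.

```ts
// Hypothetical provider-owned output type and response-parsing step (illustrative only).
interface FalAiTextToVideoOutput {
	video: { url: string };
}

function isFalAiTextToVideoOutput(result: unknown): result is FalAiTextToVideoOutput {
	const candidate = result as FalAiTextToVideoOutput;
	return (
		typeof result === "object" &&
		result !== null &&
		typeof candidate.video === "object" &&
		candidate.video !== null &&
		typeof candidate.video.url === "string"
	);
}

async function getFalAiTextToVideoResponse(result: unknown): Promise<Blob> {
	if (!isFalAiTextToVideoOutput(result)) {
		throw new Error("Expected { video: { url: string } }");
	}
	// Fetch the generated video from the returned URL and hand it back as a Blob.
	const videoResponse = await fetch(result.video.url);
	return await videoResponse.blob();
}
```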
Co-authored-by: Simon Brandeis <[email protected]>
Co-authored-by: Simon Brandeis <[email protected]>
🔥
…e.js into async-calls-falai
…e.js into async-calls-falai
What does this PR do?
This PR adds asynchronous polling to the fal.ai text-to-video generation. This allows running inference with models that may take > 2 min to generate results. The other motivation behind this PR is to align the Python and JS clients; the Python client has already been merged into main: huggingface/huggingface_hub#2927.
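For context, this is roughly what the change enables on the caller side. The snippet below is a usage sketch, assuming the standalone `textToVideo` helper exported by `@huggingface/inference` and placeholder values for the token and prompt; it is not taken from this PR.

```ts
import { textToVideo } from "@huggingface/inference";

// Placeholder token and prompt; the model is the one used for local testing below.
const video: Blob = await textToVideo({
	accessToken: "hf_...",
	provider: "fal-ai",
	model: "tencent/HunyuanVideo",
	inputs: "A young cat running on the beach",
});
```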
Main Changes

- Replaced the `baseUrl` property with a `makeBaseUrl()` function across all providers. This is needed to be able to customize the base URL based on the task: we want to use `FAL_AI_API_BASE_URL_QUEUE` for `text-to-video` only. I'm not convinced it's the simplest and best way to do that (see the sketch below).
- Added `pollFalResponse()` for `text-to-video` (similarly to what is done with BFL for `text-to-image`).

Any refactoring suggestions are welcome! I'm willing to spend some additional time to make provider-specific updates easier to implement and better align our two clients 🙂
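To illustrate the per-task base URL selection described in the first bullet above, here is a hypothetical sketch. Only `FAL_AI_API_BASE_URL_QUEUE` is named in this PR; the other constant, the URL values, and the exact signature are assumptions, not code from the actual change.

```ts
// Hypothetical sketch: pick a different fal.ai base URL depending on the task.
// The URL values below are assumptions based on fal.ai's public endpoints.
const FAL_AI_API_BASE_URL = "https://fal.run";
const FAL_AI_API_BASE_URL_QUEUE = "https://queue.fal.run";

export const makeBaseUrl = (task?: string): string =>
	task === "text-to-video" ? FAL_AI_API_BASE_URL_QUEUE : FAL_AI_API_BASE_URL;
```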
Btw, I did not update the VCR tests, as we've discussed that it'd be best to remove the VCR for `text-to-video`. Maybe we should remove them here? EDIT: removed the `text-to-video` tests in f8a6386.
I've tested it locally with tencent/HunyuanVideo, for which the generation takes more than 2 min, and it works fine:
fal-ai-output.mp4