Hey everyone,
I’m Darshan Hiranandani, and I’m currently looking for a model that can convert audio or speech input into full-body gestures. The idea is to generate a gesture mesh (or pose sequence) that can then be used to drive Stable Diffusion, with a reference image as input.
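To make the Stable Diffusion side of the idea concrete, here is roughly what I have in mind for the conditioning step, assuming I already have each gesture frame rendered as an OpenPose-style skeleton image (the file names are placeholders, and the audio-to-gesture stage that would produce those frames is exactly the part I’m asking about):

```python
import torch
from diffusers import ControlNetModel, StableDiffusionControlNetImg2ImgPipeline
from diffusers.utils import load_image

# OpenPose-conditioned ControlNet on top of SD 1.5
controlnet = ControlNetModel.from_pretrained(
    "lllyasviel/sd-controlnet-openpose", torch_dtype=torch.float16
)
pipe = StableDiffusionControlNetImg2ImgPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    controlnet=controlnet,
    torch_dtype=torch.float16,
).to("cuda")

# Placeholder inputs: a reference image of the subject, and one rendered
# gesture/pose frame produced by whatever speech-to-gesture model I end up using.
reference = load_image("reference.png")
pose = load_image("pose_frame_000.png")

frame = pipe(
    prompt="a person giving a talk, full body",
    image=reference,        # img2img init image (the reference)
    control_image=pose,     # ControlNet conditioning (the gesture pose)
    strength=0.6,
    num_inference_steps=30,
).images[0]
frame.save("out_000.png")
```

This is only a rough sketch of the second half of the pipeline; what I’m missing is the first half, i.e. a model that maps the audio to the per-frame gestures/mesh.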
Has anyone come across any existing models or solutions that can do this? Any suggestions, resources, or insights on how to approach this would be really helpful.
Thanks in advance for your help!

Regards,
Darshan Hiranandani