With the release of Wan 2.2 FUN, I was curious to give it a try. It turns out that even with proper references, it is still extremely time-consuming to generate a cohesive scene from several points of view: 4 days to refine image references that don't contradict each other, 5 more days to generate a proper animation for each reference, and 1 day to combine everything and add sound. And that's for a 30-second animation.
Knowing a little bit about 3D, I'd say that's comparable to the time Blender NSFW artists spend. The same rule applies: a better result requires more effort. AI doesn't solve that completely; it's just a sophisticated tool. I will look for other approaches and hope the tools get better.
It's my first time doing NSFW sound design; comments are welcome.
SFX: https://x.com/OpenNSFWSP
SFX Pack: AudioElk
VA Pack: https://x.com/geministarsign1
VA Pack: https://x.com/NyaughtyMeow