Multimodal
A single model takes images and text as input and produces images and speech as output, working across modalities.
Section: advanced-techniques · scene id multimodal · tutorial 03-advanced-techniques/07-multimodal