# Multi Modals

A Multi-Modal is a specialized custom component that uses a generative model, such as a diffusion model or a speech generator, to create image or voice output from a simple text prompt. Two Multi-Modals are currently available: OpenAITextToImage and OpenAITextToSpeech.

#### OpenAITextToImage

This component uses the `dall-e-3` model from OpenAI under the hood. It generates an image from the parameters listed below.

**Parameters**

* **OpenAI API Key:** Key used to authenticate and access the OpenAI API.
* **Prompt:** A Prompt template or ChatPrompt that contains the instructions describing the image to generate.
* **Quality of the Image:** This refers to the visual quality of the image, which can be either standard or High Definition (HD).

**Example Usage**

<figure><img src="https://3489179498-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FjlRjmrZThWiLNO9xTu9d%2Fuploads%2F9GPAYIPMfqCVVdspIHwk%2FScreenshot%202024-05-21%20at%205.28.00%E2%80%AFPM.png?alt=media&#x26;token=b0d2b152-b7fe-4dbd-a48e-83de2ee887d4" alt=""><figcaption><p>using the OpenAITextToImage to generate an image</p></figcaption></figure>

The OpenAITextToImage component requires an `OpenAI API Key`, which you can get from <https://platform.openai.com/>. The Prompt can be specified using a simple `PromptTemplate`. The `Quality of the Image` can be set to either HD or standard. This component returns an `Output`.
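For readers who want to see roughly what a request like this looks like outside the visual editor, here is a minimal Python sketch. It assumes the official `openai` Python package (version 1 or later) and is not the component's actual implementation; the helper names `build_image_request` and `generate_image_url` are hypothetical and exist only for illustration.

```python
from typing import Any, Dict

# Quality values accepted by the dall-e-3 model.
ALLOWED_QUALITIES = {"standard", "hd"}


def build_image_request(prompt: str, quality: str = "standard") -> Dict[str, Any]:
    """Assemble keyword arguments for an image-generation call (hypothetical helper)."""
    quality = quality.strip().lower()
    if quality not in ALLOWED_QUALITIES:
        raise ValueError(f"quality must be one of {sorted(ALLOWED_QUALITIES)}")
    if not prompt.strip():
        raise ValueError("prompt must not be empty")
    # dall-e-3 generates one image per request, so n is fixed at 1.
    return {"model": "dall-e-3", "prompt": prompt, "quality": quality, "n": 1}


def generate_image_url(api_key: str, prompt: str, quality: str = "standard") -> str:
    """Send the request to OpenAI and return the URL of the generated image."""
    # Imported lazily so the request builder works without the SDK installed.
    from openai import OpenAI

    client = OpenAI(api_key=api_key)
    response = client.images.generate(**build_image_request(prompt, quality))
    return response.data[0].url
```

Calling `generate_image_url("<your key>", "A lighthouse at dusk", "hd")` would correspond to filling in the three fields of the component. Validation is kept in a separate function so that a bad quality value fails before any network request is made.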

#### OpenAITextToSpeech

This component uses the `tts-1` model from OpenAI under the hood. It converts text into speech in a voice that you select.

**Parameters**

* **OpenAI API Key:** Key used to authenticate and access the OpenAI API.
* **Text Input:** The text to convert to speech.
* **Choose a Voice:** The voice (vocal tone) used to generate the speech.

**Example Usage**

<figure><img src="https://3489179498-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FjlRjmrZThWiLNO9xTu9d%2Fuploads%2FsMaDEpYCtO8rAzvLwMul%2FScreenshot%202024-05-21%20at%205.37.43%E2%80%AFPM.png?alt=media&#x26;token=8d047ebf-ec9d-43fa-9dfc-f334265daeef" alt=""><figcaption><p>using the OpenAITextToSpeech to generate a voice speech</p></figcaption></figure>

The OpenAITextToSpeech component requires an `OpenAI API Key`, which you can get from <https://platform.openai.com/>. The `Text Input` field holds the text that gets converted to speech. The `Choose a Voice` option selects the vocal tone of the output speech. This component returns an `Output`.
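The same idea can be sketched for text-to-speech. This example again assumes the official `openai` Python package (version 1 or later) and does not claim to mirror the component's internals; `build_speech_request`, `synthesize_speech`, and the default output path `speech.mp3` are hypothetical names chosen for illustration. The voice list is the set of built-in voices originally offered with `tts-1`.

```python
from typing import Dict

# Built-in voices originally offered with the tts-1 model.
VOICES = ("alloy", "echo", "fable", "onyx", "nova", "shimmer")


def build_speech_request(text: str, voice: str = "alloy") -> Dict[str, str]:
    """Assemble keyword arguments for a text-to-speech call (hypothetical helper)."""
    voice = voice.strip().lower()
    if voice not in VOICES:
        raise ValueError(f"voice must be one of {VOICES}")
    if not text.strip():
        raise ValueError("text must not be empty")
    return {"model": "tts-1", "voice": voice, "input": text}


def synthesize_speech(api_key: str, text: str, voice: str = "alloy",
                      out_path: str = "speech.mp3") -> str:
    """Send the request to OpenAI and write the returned audio bytes to out_path."""
    # Imported lazily so the request builder works without the SDK installed.
    from openai import OpenAI

    client = OpenAI(api_key=api_key)
    response = client.audio.speech.create(**build_speech_request(text, voice))
    with open(out_path, "wb") as audio_file:
        audio_file.write(response.content)
    return out_path
```

Here `text` plays the role of the `Text Input` field and `voice` the role of `Choose a Voice`; the function returns the path of the audio file it wrote.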
