What are the most common types of AI models?
Large Language Models (LLMs): Text in, text out (note that code is also a language). Examples include GPT-4, GPT-3, Charlie 1, and Claude 3.5.
Diffusion Models: Typically text to multimedia, such as images, video, and audio. Examples include Stable Diffusion, Flux, Stable Video, and Midjourney.
Text to Speech (TTS): Converts text to spoken audio. Examples include ElevenLabs.
Audio to Text: Converts audio, or video with an audio track, to text. Examples include OpenAI Whisper.
Multimodal Models: These differ in that they can typically understand multiple modalities of data as input and generate multiple modalities as output. Most multimodal models are currently just different models stitched together with LangChain or other language-driven architectures.
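To make the "stitched together" idea concrete, here is a minimal sketch of a voice assistant built from three single-purpose models that pass text between them. Everything in it is an assumption for illustration: the `fake_*` functions are hypothetical stand-ins rather than real model calls, and `StitchedVoiceAssistant` is an invented name, not part of LangChain or any other library.

```python
from dataclasses import dataclass
from typing import Callable


# Hypothetical stand-ins for real models. Each one is just a function from
# one modality to another; in practice each would wrap a call to a real model.
def fake_audio_to_text(audio: bytes) -> str:
    """Stand-in for an audio-to-text model: treats the bytes as UTF-8 'speech'."""
    return audio.decode("utf-8")


def fake_llm(prompt: str) -> str:
    """Stand-in for an LLM: text in, text out."""
    return f"You said: {prompt}"


def fake_tts(text: str) -> bytes:
    """Stand-in for a TTS model: treats the encoded text as the 'audio'."""
    return text.encode("utf-8")


@dataclass
class StitchedVoiceAssistant:
    """Audio in, audio out, built from three models glued together by text."""

    transcribe: Callable[[bytes], str]
    generate: Callable[[str], str]
    speak: Callable[[str], bytes]

    def respond(self, audio_in: bytes) -> bytes:
        text_in = self.transcribe(audio_in)  # audio -> text
        text_out = self.generate(text_in)    # text -> text
        return self.speak(text_out)          # text -> audio


assistant = StitchedVoiceAssistant(fake_audio_to_text, fake_llm, fake_tts)
reply = assistant.respond(b"hello")
print(reply)  # b'You said: hello'
```

Text is the glue in this design: every stage either consumes or produces a string, so any one model can be swapped out without touching the others.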