Version: Latest (dev)

Supported Models

ServerlessLLM supports a range of language models from Hugging Face (HF) Transformers. This page lists the models and model architectures currently supported by ServerlessLLM.

To test a model, add it to supported_models.json inside /ServerlessLLM/tests/inference_tests; GitHub Actions will then automatically test whether or not it is supported.
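As a rough illustration of that step, the sketch below appends a model ID to a JSON file if it is not already listed. It assumes the file holds a flat JSON array of Hugging Face model IDs; the actual schema of supported_models.json is not described on this page and may differ, so inspect the file before adapting this. The helper name add_model is hypothetical and not part of ServerlessLLM.

```python
import json
from pathlib import Path


def add_model(json_path, model_id):
    """Append model_id to a JSON list of models if it is not already present.

    ASSUMPTION: the file is a flat JSON array of Hugging Face model IDs.
    The real schema of supported_models.json may differ; check it first.
    Returns True if the file was changed, False if the model was already listed.
    """
    path = Path(json_path)
    # Start from an empty list when the file does not exist yet.
    models = json.loads(path.read_text()) if path.exists() else []
    if not isinstance(models, list):
        raise ValueError("expected a JSON array; adapt this sketch to the real schema")
    if model_id in models:
        return False
    models.append(model_id)
    path.write_text(json.dumps(models, indent=2) + "\n")
    return True
```

For example, add_model("tests/inference_tests/supported_models.json", "facebook/opt-6.7b") would add the OPT model from the table below, under the same schema assumption.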

Text-only Language Models

| Architecture | Models | Example HF Models | vLLM | Transformers |
| --- | --- | --- | --- | --- |
| OPTForCausalLM | OPT, OPT-IML | facebook/opt-6.7b | | |

Vision Language Models

| Architecture | Models | Example HF Models | vLLM | Transformers |
| --- | --- | --- | --- | --- |
| Qwen2VLForConditionalGeneration | Qwen2VL | Qwen/Qwen2-VL-2B-Instruct | | |