Allow serving llama models with tensor parallel #2670
