If --runner pooling has been set (manually or automatically) but the model does not implement the [VllmModelForPooling][vllm.model_executor.models.VllmModelForPooling] interface, vLLM will attempt to ...
Pooling API

Our Pooling API (/pooling) is similar to LLM.encode and applies to all types of pooling models. The input format is the same as that of the Embeddings API, but the output data can contain an ...
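As a rough illustration of the request shape, the sketch below builds a POST request for /pooling using only the Python standard library. It relies on the statement above that the input format matches the Embeddings API, so the body carries `model` and `input` fields. The base URL `http://localhost:8000`, the model name, and the helper names `build_pooling_request` and `send_pooling_request` are assumptions made for this example and are not part of vLLM.

```python
import json
import urllib.request


def build_pooling_request(model, inputs, base_url="http://localhost:8000"):
    """Build (but do not send) a POST request for the /pooling endpoint.

    The body uses the Embeddings-style fields "model" and "input".
    base_url is an assumed local server address; adjust it to your deployment.
    """
    payload = {"model": model, "input": inputs}
    return urllib.request.Request(
        url=f"{base_url}/pooling",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_pooling_request(request):
    """Send the request and decode the JSON response (needs a running server)."""
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    # "my-pooling-model" is a placeholder, not a real model identifier.
    req = build_pooling_request("my-pooling-model", ["Hello, world!"])
    print(req.full_url)
    print(req.data.decode("utf-8"))
```

Building the request separately from sending it makes it possible to inspect the payload without a live server. To get an actual result, pass the request to `send_pooling_request` while a server with a pooling model is running.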