Overview
BentoML packages models and their business logic into deployable archives called “bentos”, which can be served over REST or gRPC or deployed to Kubernetes, AWS Lambda, and other platforms.
Key Capabilities
- CLI & Python SDK for packaging
- High-performance runner abstraction
- Deploy to BentoCloud, SageMaker, Kubernetes
- Adapters for Hugging Face, vLLM, OpenLLM
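
To make the packaging idea concrete, here is a minimal sketch of how a piece of business logic might be wrapped as a service with the Python SDK. It assumes BentoML 1.2 or later, where services are declared with the `@bentoml.service` and `@bentoml.api` decorators; older releases use a different, runner-based API. The names `normalize`, `Normalizer`, and `run` are hypothetical and stand in for a real model call.

```python
# Sketch only: assumes BentoML 1.2+ (class-based `@bentoml.service` API).
try:
    import bentoml
except ImportError:
    bentoml = None  # BentoML not installed; the plain function below still works


def normalize(text: str) -> str:
    """Hypothetical business logic; a real service would call a model here."""
    return text.strip().upper()


# Define the service only when the class-based API is available.
if bentoml is not None and callable(getattr(bentoml, "service", None)):

    @bentoml.service
    class Normalizer:  # hypothetical service name
        @bentoml.api
        def run(self, text: str) -> str:
            # Each @bentoml.api method is exposed as an HTTP endpoint.
            return normalize(text)
```

If this is saved as `service.py` in an environment with BentoML installed, running `bentoml serve service:Normalizer` should start a local server that exposes `run` over HTTP. Keeping the logic in a plain function, separate from the service class, is a design choice of this sketch so that it can be tested without a running server.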