Deploy open models with TGI on Cloud Run
Tutorial: How to deploy Gemma 2 on Cloud Run with TGI → https://goo.gle/3Yoztjh
Get started with Cloud Run GPU → https://goo.gle/4ec7mJS
Docs: Text Generation Inference → https://goo.gle/4e7qusz
Serve text generation requests with fast token generation at a fraction of the cost of traditional methods. Watch along and learn how to deploy the Gemma 2 model to Cloud Run using Hugging Face Text Generation Inference (TGI) with Wietse Venema (Google) and Alvaro Bartolome (Hugging Face).
More resources:
Gemma 2 (9b) on the Hugging Face Hub → https://goo.gle/3C1vX6R
Hugging Face Deep Learning Containers for Google Cloud → https://goo.gle/3BPaYUM
Watch more Google Cloud: Building with Hugging Face → https://goo.gle/BuildWithHuggingFace
Subscribe to Google Cloud Tech → https://goo.gle/GoogleCloudTech
#GoogleCloud #HuggingFace
Speakers: Wietse Venema, Alvaro Bartolome
Products Mentioned: Gemma, Hugging Face, Cloud Run