BigQuery provides access to a variety of LLMs for text and embedding generation, including Google’s Gemini models and Google-managed models from partners like Anthropic and Mistral. Using Gemini models and Google-managed partner models in BigQuery is simple: you create the model with the foundation model name and run inference directly in SQL queries. Today, we are bringing this same simplicity and power to any model you choose from Hugging Face or Vertex AI Model Garden.
With the launch of managed third-party generative AI inference in BigQuery (Preview), you can now run open models using just two SQL statements.
This new capability delivers four key benefits:
Simplicity: Deploy any supported open model with a single CREATE MODEL SQL statement and the model ID string (e.g., google/gemma-3-1b-it). BigQuery automatically provisions the compute resources with default configurations.
Cost control: The endpoint_idle_ttl option automatically undeploys idle endpoints, so you only pay while the model is serving your workloads.
Customization: Tune deployment options in the CREATE MODEL statement to meet your performance and cost needs.
Easy cleanup: Dropping the model automatically removes all associated Vertex AI resources.
Let’s take a look at the process of creating and utilizing an open model.
Step 1: Create a BigQuery managed open model
To use an open model from Hugging Face or Vertex AI Model Garden, use a CREATE MODEL statement along with the open model ID. It typically takes a few minutes for the query to complete, depending on the model size and machine types.
Hugging Face models
Specify the option hugging_face_model_id in the format provider_name/model_name, for example sentence-transformers/all-MiniLM-L6-v2.
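As a minimal sketch, the statement might look like the following; the dataset and model name (bqml_models.minilm_embedder) are placeholders, and default deployment settings apply unless you add more options:

```sql
-- Deploy a Hugging Face embedding model as a BigQuery-managed open model.
-- Dataset and model names are placeholders; defaults are used for deployment.
CREATE OR REPLACE MODEL `bqml_models.minilm_embedder`
OPTIONS (
  hugging_face_model_id = 'sentence-transformers/all-MiniLM-L6-v2'
);
```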
Vertex AI Model Garden models
Specify the option model_garden_model_name in the format publishers/publisher/models/model_name@model_version. For example, publishers/google/models/gemma3@gemma-3-1b-it.
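A corresponding sketch for a Model Garden deployment, again with a placeholder dataset and model name:

```sql
-- Deploy the Gemma 3 1B instruction-tuned model from Vertex AI Model Garden.
CREATE OR REPLACE MODEL `bqml_models.gemma3_1b`
OPTIONS (
  model_garden_model_name = 'publishers/google/models/gemma3@gemma-3-1b-it'
);
```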
For demanding workloads, you can customize deployment settings (machine types, replica counts, endpoint idle time) to improve scalability and manage costs. You can also use Compute Engine reservations to secure GPU instances for consistent performance. See CREATE MODEL syntax for all the options.
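As a rough illustration, a customized deployment could look like the sketch below. Only endpoint_idle_ttl is named in this post; machine_type, min_replica_count, and max_replica_count are assumed option names, so check the CREATE MODEL syntax reference for the exact spellings and supported values:

```sql
-- Illustrative customized deployment; option names other than
-- endpoint_idle_ttl are assumptions modeled on typical Vertex AI settings.
CREATE OR REPLACE MODEL `bqml_models.gemma3_1b`
OPTIONS (
  model_garden_model_name = 'publishers/google/models/gemma3@gemma-3-1b-it',
  machine_type = 'g2-standard-8',         -- assumed GPU machine type
  min_replica_count = 1,                  -- assumed autoscaling bounds
  max_replica_count = 4,
  endpoint_idle_ttl = INTERVAL 10 HOUR    -- undeploy after 10 idle hours
);
```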
Step 2: Run batch inference
Once the CREATE MODEL job above finishes, you can use the model with AI.GENERATE_TEXT (for LLM inference) or AI.GENERATE_EMBEDDING (for embedding generation) on your data in BigQuery.
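For example, batch inference over a table of prompts might look like the sketch below. The table, column, and parameter names (product_reviews, review_text, max_output_tokens) are placeholders, and the argument structure is assumed to mirror BigQuery's existing table-valued generation functions, so verify the exact signatures in the function reference:

```sql
-- Batch text generation with the managed Gemma model created in Step 1.
SELECT *
FROM AI.GENERATE_TEXT(
  MODEL `bqml_models.gemma3_1b`,
  (SELECT review_text AS prompt FROM `bqml_models.product_reviews`),
  STRUCT(256 AS max_output_tokens)   -- assumed generation parameter
);

-- Batch embedding generation with the MiniLM model created in Step 1.
SELECT *
FROM AI.GENERATE_EMBEDDING(
  MODEL `bqml_models.minilm_embedder`,
  (SELECT review_text AS content FROM `bqml_models.product_reviews`)
);
```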
Vertex AI endpoint lifecycle management and cost control
BigQuery offers flexible controls over Vertex AI endpoint lifecycle and costs through both automated and manual options.
Automated control: The endpoint_idle_ttl option enables automated resource recycling. If the model isn’t used for the specified duration (e.g., INTERVAL 10 HOUR), BigQuery automatically undeploys the Vertex AI endpoint for you, stopping the associated endpoint charges.
Manual control: You can also deploy or undeploy the endpoint on demand using the ALTER MODEL statement.
Easy resource cleanup
When you are done using a model, simply drop it. BigQuery automatically cleans up all associated Vertex AI resources (like the endpoint and model) for you, so you are no longer charged for them.
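For example, assuming the placeholder model name used above:

```sql
-- Dropping the model also removes the deployed Vertex AI endpoint and model.
DROP MODEL IF EXISTS `bqml_models.gemma3_1b`;
```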
BigQuery’s new managed inference capability fundamentally changes how data teams access and use third-party gen AI models. By consolidating the entire model lifecycle management into a familiar SQL interface, we’re removing the operational friction and making powerful open models accessible to every BigQuery user, from data analysts to AI/ML engineers. For comprehensive documentation and tutorials, please refer to the following resources:
Read the documentation: Creating automatically-deployed open models
Try the text generation tutorial: Generate text with the Gemma model
Try the embedding generation tutorial: Generate text embeddings with open models
We look forward to seeing what you build!