May 27, 2026

Google Gemini PaLM API: migration, models, and usage

Reading time :  
6
 min
Rithul Palazhi
Rithul Palazhi

Google Gemini PaLM API: migration, models, and how to use them

If you've searched for "Google Gemini PaLM API," you're likely navigating the transition from Google's older PaLM 2 models to the current Gemini family. The short answer: PaLM API endpoints are deprecated, Gemini is the successor, and the migration path runs through Google's Generative AI documentation. As of 2025, Gemini 1.5 Pro processes up to 2 million tokens of context — the largest window of any production LLM, per Google DeepMind. On CodeWords, you access Gemini models natively, no API key setup required.

Unlike generic AI automation posts, this guide shows real CodeWords workflows — not just theory.

TL;DR: - PaLM API is deprecated; migrate to Gemini API via Google AI Studio or Vertex AI - Gemini 1.5 Pro offers 2M-token context and multimodal input (text, images, video, audio) - CodeWords provides built-in Gemini access — skip API key management and start building workflows immediately

What happened to the PaLM API?

Google launched PaLM 2 in May 2023 as its flagship large language model, accessible through the PaLM API in Google AI Studio. By December 2023, Google announced Gemini as the direct successor. PaLM API endpoints entered deprecation in early 2024, and Google's migration guide directs developers to switch to the Gemini API.

The technical shift isn't dramatic — both APIs use similar request/response structures — but the model architecture underneath changed significantly. PaLM was text-only (with separate vision models). Gemini is natively multimodal: it processes text, images, audio, and video in a single model call. If your existing code calls generateText or chat on PaLM endpoints, you'll need to update to Gemini's generateContent method and swap model identifiers.

For automation builders using AI powered development tools, this matters because multimodal input unlocks workflows that text-only models couldn't handle — processing receipts, analyzing screenshots, or transcribing meeting recordings.

Which Gemini models are available today?

Google's Gemini lineup as of 2025:

Gemini 2.0 Flash — the latest fast model optimized for high-throughput tasks. Supports multimodal input and agentic capabilities including tool use and code execution. Ideal for automation pipelines where speed matters more than maximum reasoning depth.

Gemini 1.5 Pro — the high-capability model with a 2-million-token context window. Handles long documents, entire codebases, or hour-long video files in a single prompt. Best for deep research workflows and complex analysis.

Gemini 1.5 Flash — a lighter, faster variant of 1.5 Pro with a 1-million-token window. Good balance of capability and cost for production workloads.

All models are accessible through two paths: Google AI Studio (free tier with rate limits) and Vertex AI (enterprise-grade with SLAs). Or, skip both and use CodeWords where Gemini access is built into the platform alongside OpenAI and Anthropic models.

How do you access Gemini API without managing keys?

The standard path requires creating a Google Cloud project, enabling the Generative Language API, generating an API key in Google AI Studio, and storing it securely. For Vertex AI, you need a service account with Google credentials and Google OAuth 2.0 configured. This setup takes 15–30 minutes and introduces secrets management overhead.

CodeWords eliminates this entirely. The platform provides native access to Gemini (alongside OpenAI and Anthropic) as part of its runtime. When you build a workflow through Cody — the AI assistant — and ask it to "use Gemini 1.5 Pro to summarize this document," the model call happens through CodeWords' managed infrastructure. No API key creation, no environment variables, no secrets rotation.

This is particularly valuable for teams running multiple models in the same pipeline. A common pattern on CodeWords: use Gemini 1.5 Pro for long-context analysis (processing a 200-page PDF), then pass extracted data to GPT-4o for structured output formatting, then trigger a Slack notification with the results. Three model providers, zero key management.

How do you build a Gemini-powered workflow on CodeWords?

Here's a practical example — a research automation that monitors industry news:

  1. Schedule a trigger. Set a cron job (CodeWords handles scheduling natively) to run every morning.
  2. Scrape sources. Use Firecrawl or the AI Web Agent to pull content from target publications.
  3. Analyze with Gemini. Pass the collected articles to Gemini 1.5 Pro. Its long context window handles dozens of articles in a single call. Prompt it to identify trends, extract key statistics, and flag items relevant to your business.
  4. Format and distribute. Structure the output as a digest and push it to Slack, email via Google Drive integration, or an Airtable base.

You describe this to Cody in conversation. The platform generates a FastAPI Python microservice, provisions an ephemeral sandbox, wires the integrations, and deploys — all from the description. Browse CodeWords templates for pre-built patterns like this.

The same pattern works for automated content creation pipelines, competitive monitoring, and deep research workflows where Gemini's context window is the differentiator.

When should you use Gemini vs. other models?

Model selection matters for automation. Here's a practical framework:

Choose Gemini 1.5 Pro when: - Your input exceeds 128K tokens (most other models' maximum) - You need multimodal analysis — images, PDFs, audio, or video alongside text - You're processing entire codebases or lengthy documents

Choose GPT-4o when: - You need structured JSON output with high reliability - Function calling and tool use are central to your workflow - You want the broadest ecosystem of fine-tuned variants

Choose Claude (Anthropic) when: - Nuanced long-form writing or analysis is the goal - You need careful instruction following in complex, multi-step prompts - Safety and refusal calibration matter for your use case

On CodeWords, switching between these is a one-line change in your workflow, or simply telling Cody which model to use. AI tools for software development work best when you can mix models per task rather than being locked to one provider.

How does pricing compare across Gemini tiers?

According to Google's pricing page, Gemini 1.5 Flash is among the most cost-effective production LLMs — significantly cheaper per token than GPT-4o for equivalent tasks. Gemini 1.5 Pro costs more but justifies it with the 2M-token window. Google AI Studio offers a free tier with generous rate limits for prototyping.

On CodeWords, model costs are bundled into platform pricing, which simplifies budgeting — you pay one bill rather than tracking usage across three providers.

FAQ

Can I still use PaLM API endpoints? Google has deprecated PaLM API endpoints. Existing calls may still function during the transition period, but new projects should use the Gemini API exclusively. Follow Google's migration guide for the specific code changes required.

Do I need a Google Cloud account to use Gemini? For direct API access, yes — either through Google AI Studio (lighter setup) or Vertex AI (enterprise). On CodeWords, no — Gemini access is built into the platform.

Is Gemini good for code generation? Yes. Gemini 2.0 Flash and 1.5 Pro both support code generation and execution. For AI powered code generation tools, Gemini's long context window is especially useful when the model needs to understand large codebases before generating new code.

How does Gemini handle rate limits? Google AI Studio enforces requests-per-minute and tokens-per-minute limits that vary by model and tier. Vertex AI offers higher limits with pay-as-you-go pricing. CodeWords manages rate limiting and retries automatically within its workflow automation platform.

From API migration to workflow leverage

The PaLM-to-Gemini transition isn't just a model swap — it's an expansion in what automated workflows can process. Multimodal input and million-token context windows mean automations that previously required multiple specialized tools now collapse into a single model call. That changes the economics of building AI-powered pipelines, especially for small teams who can't afford dedicated ML infrastructure.

Try Gemini-powered workflows on CodeWords — bring the idea, skip the API key ceremony.

Contents
Ready to try CodeWords?
Get started free
Sign in
Sign in