AI starter kit: choose the right one for your stack
An AI starter kit is a pre-packaged combination of models, infrastructure, and orchestration tools that gets you from zero to running AI workflows in hours instead of weeks. The concept gained traction when n8n released its self-hosted AI starter kit on GitHub — a Docker Compose file bundling Ollama, Qdrant, and n8n into a single deployment. Since then, the category has expanded to cover everything from fully self-hosted stacks to cloud-managed platforms.
According to a 2025 a16z survey on enterprise AI infrastructure, 58% of companies experimenting with AI spend more time on infrastructure setup than on building actual AI features. A 2025 Sequoia Capital AI report found that the median time from “we should use AI” to “AI is running in production” is 4.5 months for custom builds versus 2 weeks for teams using starter kits or managed platforms.
Unlike generic AI automation posts, this guide shows real CodeWords workflows — not just theory. You will learn which AI starter kit approach matches your constraints and how to get running quickly.
Related reading: self-hosted AI starter kit, AI workflow automation, open-source workflow automation platform, locally hosted LLM, CodeWords integrations, CodeWords templates, CodeWords pricing.
TL;DR
- AI starter kits come in three flavors: self-hosted (full control, high maintenance), cloud-hybrid (balance of control and convenience), and platform-managed (fastest start, least infrastructure overhead).
- The n8n self-hosted AI starter kit (Ollama + Qdrant + n8n) is the most popular open-source option. It works well for experimentation but requires GPU hardware and DevOps skills for production.
- CodeWords is a platform-managed AI starter kit — LLM access, integrations, execution infrastructure, and workflow building are included. No hardware, no Docker, no API key management.
What is inside an AI starter kit?
Every AI starter kit, regardless of deployment model, includes three layers. Think of them as brain, memory, and nervous system.
The brain: model runtime. This is where inference happens — the component that accepts prompts and returns completions. Options range from self-hosted runtimes to managed API access.
- Ollama: Self-hosted. Run ollama pull llama3 and you have a local inference server. Simple setup, limited throughput for production.
- vLLM: Self-hosted, high-throughput. PagedAttention engine handles concurrent requests efficiently. Production-grade for teams with GPU infrastructure.
- Managed APIs: OpenAI, Anthropic, Google Gemini. No infrastructure to manage. CodeWords provides access to all three without requiring you to create API accounts or manage keys.
The memory: vector database. Stores embeddings for retrieval-augmented generation (RAG), semantic search, and context management.
- Qdrant: Rust-based, fast, Docker-ready. The default choice in most self-hosted kits.
- Chroma: Python-native, easy to embed in applications. Good for prototyping.
- Pinecone: Managed service. No infrastructure, but data leaves your environment.
The nervous system: orchestration. Connects the model, memory, and external tools into workflows.
- n8n: Visual workflow builder, self-hosted. The n8n AI starter kit bundles orchestration with Ollama and Qdrant.
- LangChain: Python framework for building LLM applications. More code-intensive, more flexible.
- CodeWords: Managed orchestration. Describe the workflow to Cody, and it generates and deploys a serverless microservice. Handles LLM calls, integrations, state, and scheduling.
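To make the three layers concrete, here is a minimal sketch in plain Python. It is not how any of the tools above are implemented: the keyword-counting embedder, the in-memory store, and the echo model are hypothetical stand-ins for a real embedding model, a vector database such as Qdrant, and a model runtime such as Ollama. The only point is to show how the orchestration function wires memory and brain together.

```python
import math
from typing import Callable, List, Tuple

Vector = List[float]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class InMemoryVectorStore:
    """Toy stand-in for the memory layer (Qdrant, Chroma, Pinecone)."""

    def __init__(self) -> None:
        self._items: List[Tuple[Vector, str]] = []

    def add(self, vector: Vector, text: str) -> None:
        self._items.append((vector, text))

    def search(self, query: Vector, top_k: int = 1) -> List[str]:
        # Rank stored texts by similarity to the query vector.
        ranked = sorted(self._items, key=lambda item: cosine(query, item[0]), reverse=True)
        return [text for _, text in ranked[:top_k]]


def toy_embed(text: str) -> Vector:
    """Hypothetical embedder: counts three keywords instead of calling a model."""
    words = text.lower().split()
    return [float(words.count(k)) for k in ("docker", "gpu", "slack")]


def echo_model(prompt: str) -> str:
    """Hypothetical model runtime: returns the prompt it was given."""
    return prompt


def answer(
    question: str,
    embed: Callable[[str], Vector],
    store: InMemoryVectorStore,
    generate: Callable[[str], str],
    top_k: int = 1,
) -> str:
    """Orchestration layer: embed the question, retrieve context, call the model."""
    context = store.search(embed(question), top_k=top_k)
    prompt = "Context:\n" + "\n".join(context) + "\n\nQuestion: " + question
    return generate(prompt)
```

Swapping any layer means replacing one argument to answer: a real embedding call for toy_embed, a database client for the store, or an API call for echo_model.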
How do you set up the n8n self-hosted AI starter kit?
The n8n AI starter kit is one of the most widely forked AI starter kits on GitHub. Here is the minimal viable setup.
Prerequisites: Docker and Docker Compose installed. A machine with at least 16GB RAM. For GPU inference, an NVIDIA GPU with 8GB+ VRAM.
Step 1: Clone and configure.
git clone https://github.com/n8n-io/self-hosted-ai-starter-kit.git
cd self-hosted-ai-starter-kit
cp .env.example .env
Edit .env to set your n8n credentials and any custom configuration.
Step 2: Start the stack.
docker compose up -d
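This launches the stack's containers: Ollama (model runtime), Qdrant (vector database), and n8n (orchestration), along with the PostgreSQL database that n8n uses for storage. Check the repository README before running the command: recent versions of the kit place the Ollama service behind Docker Compose profiles, so you may need docker compose --profile cpu up -d or docker compose --profile gpu-nvidia up -d for Ollama to start. The first startup takes several minutes as Docker pulls the images.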
Step 3: Pull a model.
docker exec -it ollama ollama pull llama3
For machines without a GPU, use smaller models: phi3 or mistral run on CPU with acceptable latency for development.
Step 4: Build your first workflow.
Open n8n at http://localhost:5678. Create a workflow that:
- Accepts a webhook trigger
- Sends the input to Ollama for processing
- Stores the result in Qdrant
- Returns the response
This gives you a working AI pipeline in under 30 minutes — assuming your hardware cooperates.
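If you want to see what those four steps look like outside the n8n canvas, the sketch below does the same thing in plain Python against the two services' HTTP APIs. It assumes Ollama is reachable on its default port 11434 and Qdrant on its default REST port 6333. The collection name workflow_results is hypothetical, and the collection must already exist with a vector size that matches the embedding model. This is an illustration of the data flow, not the workflow n8n generates.

```python
import json
import urllib.request
import uuid
from typing import List

OLLAMA_URL = "http://localhost:11434"  # Ollama's default port (assumed reachable)
QDRANT_URL = "http://localhost:6333"   # Qdrant's default REST port (assumed reachable)
COLLECTION = "workflow_results"        # hypothetical collection name


def build_generate_request(prompt: str, model: str = "llama3") -> dict:
    """Payload for Ollama's POST /api/generate with streaming disabled."""
    return {"model": model, "prompt": prompt, "stream": False}


def build_upsert_request(vector: List[float], prompt: str, completion: str) -> dict:
    """Payload for Qdrant's PUT /collections/{name}/points."""
    point = {
        "id": str(uuid.uuid4()),  # Qdrant accepts UUID strings as point IDs
        "vector": list(vector),
        "payload": {"prompt": prompt, "completion": completion},
    }
    return {"points": [point]}


def call_json(url: str, payload: dict, method: str = "POST") -> dict:
    """Send a JSON body and decode the JSON response."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        method=method,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=120) as response:
        return json.loads(response.read().decode("utf-8"))


def handle_webhook(text: str, model: str = "llama3") -> str:
    """Webhook body in, completion out, with the result stored in Qdrant."""
    completion = call_json(
        f"{OLLAMA_URL}/api/generate", build_generate_request(text, model)
    )["response"]
    vector = call_json(
        f"{OLLAMA_URL}/api/embeddings", {"model": model, "prompt": text}
    )["embedding"]
    call_json(
        f"{QDRANT_URL}/collections/{COLLECTION}/points",
        build_upsert_request(vector, text, completion),
        method="PUT",
    )
    return completion
```

Calling handle_webhook("Summarize this ticket") only works once the stack is up and a model has been pulled. The two build_ functions can be checked without any services running.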
When should you self-host versus use a managed platform?
The decision depends on three factors: data sensitivity, volume, and team capacity.
Self-host when:
- Regulated data must stay on-premises (healthcare, finance, government)
- You need predictable costs at high volume (10,000+ inference calls/day)
- You have DevOps capacity to maintain GPU infrastructure, model updates, and security patches
- You want to fine-tune models on proprietary data
Use a managed platform when:
- You need frontier model quality (GPT-4o, Claude Sonnet, Gemini Pro)
- Your team lacks DevOps capacity for infrastructure maintenance
- You want to ship AI workflows in days, not months
- Integration with external services matters more than model customization
Go hybrid (most teams land here):
- Self-host embeddings and classification (fast, cheap, data stays local)
- Use managed APIs for generation and reasoning (better quality, no maintenance)
- Use CodeWords for orchestration — it connects to both self-hosted models via custom API calls and managed LLMs natively
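The checklist above can be turned into a rough decision helper. The sketch below is one possible encoding, not a rule from any vendor: the field names are hypothetical, the 10,000 calls/day threshold comes from the list above, and the tie-breaking logic (anything that pulls toward self-hosting but lacks DevOps capacity or needs frontier models lands on hybrid) is an assumption you should adjust.

```python
from dataclasses import dataclass


@dataclass
class Constraints:
    """Hypothetical summary of a team's situation."""
    regulated_data: bool         # data must stay on-premises
    calls_per_day: int           # expected inference volume
    has_devops_capacity: bool    # someone can maintain GPU infrastructure
    needs_frontier_models: bool  # quality bar requires managed frontier models


def recommend_deployment(c: Constraints, high_volume_threshold: int = 10_000) -> str:
    """Return 'self-hosted', 'hybrid', or 'managed' from the three factors."""
    pulls_toward_self_host = c.regulated_data or c.calls_per_day >= high_volume_threshold
    if not pulls_toward_self_host:
        return "managed"
    if c.has_devops_capacity and not c.needs_frontier_models:
        return "self-hosted"
    # A reason to self-host exists, but something also pulls toward managed.
    return "hybrid"
```

For example, a team with regulated data, modest volume, DevOps capacity, and no need for frontier models gets "self-hosted", while a high-volume team that also needs frontier models gets "hybrid".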
How does CodeWords work as a managed AI starter kit?
CodeWords collapses the three-layer AI stack into a single platform. Instead of configuring Docker, managing API keys, and writing orchestration code, you describe what you want and Cody builds it.
What is included out of the box:
- LLM access: OpenAI, Anthropic, and Google Gemini — no API key setup, no billing management, no rate limit headaches
- Execution infrastructure: Serverless FastAPI microservices in ephemeral E2B sandboxes. Each workflow runs in isolation.
- 500+ integrations: Composio and Pipedream connectors plus native Slack, WhatsApp, Airtable, and Google Drive
- Web scraping: Firecrawl and AI Web Agent for data collection
- Search APIs: SearchAPI.io and Perplexity for research workflows
- State management: Redis-based persistence for multi-step and monitoring workflows
- UI generation: Next.js interfaces at *.codewords.run for dashboards and internal tools
Example: research assistant workflow.
Describe to Cody: “Monitor Hacker News for posts about AI infrastructure. When a post gets 100+ points, summarize the article, extract key claims, and post the summary to #ai-research on Slack with a link to the original.”
Cody generates a scheduled workflow that scrapes Hacker News, filters by score, uses an LLM to summarize, and posts to Slack. Deployed and running in minutes, not weeks.
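For a sense of the logic such a workflow has to contain, here is a hand-written sketch of the filtering and formatting steps. It is not the code Cody generates and does not use any CodeWords API. It assumes the public Hacker News Firebase API for fetching and a Slack incoming-webhook style payload for posting. The keyword list is hypothetical, and the summary is passed in as a string because the LLM call is out of scope here.

```python
import json
import urllib.request
from typing import Dict, List

HN_API = "https://hacker-news.firebaseio.com/v0"  # public Hacker News API
KEYWORDS = ("ai infrastructure", "llm", "gpu", "inference")  # hypothetical topic filter


def is_relevant(item: Dict, min_score: int = 100) -> bool:
    """True when a post clears the score bar and its title matches a keyword."""
    title = item.get("title", "").lower()
    return item.get("score", 0) >= min_score and any(k in title for k in KEYWORDS)


def format_slack_message(item: Dict, summary: str) -> Dict:
    """Build an incoming-webhook payload; the webhook's configuration picks the channel."""
    link = item.get("url") or f"https://news.ycombinator.com/item?id={item['id']}"
    text = f"*{item['title']}* ({item['score']} points)\n{summary}\n<{link}|Original post>"
    return {"text": text}


def fetch_json(url: str):
    """GET a URL and decode its JSON body."""
    with urllib.request.urlopen(url, timeout=10) as response:
        return json.loads(response.read().decode("utf-8"))


def fetch_top_items(limit: int = 30) -> List[Dict]:
    """Fetch the current top stories; deleted items come back as null and are skipped."""
    ids = fetch_json(f"{HN_API}/topstories.json")[:limit]
    items = (fetch_json(f"{HN_API}/item/{item_id}.json") for item_id in ids)
    return [item for item in items if item]
```

A scheduler would call fetch_top_items, keep the items where is_relevant is true, summarize each one with an LLM, and post format_slack_message(item, summary) to the webhook URL.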
What hardware do you need for self-hosted AI starter kits?
If you choose the self-hosted path, hardware is the gating factor.
For development and prototyping:
- 16GB RAM, any modern CPU, no GPU required
- Use quantized models (Q4_K_M) via Ollama or llama.cpp
- Expect 5–15 tokens/second on CPU — slow but functional
- Total cost: $0 if using existing hardware
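Quantization is what makes the 16GB tier workable, and the arithmetic is simple enough to check yourself. The sketch below estimates the memory needed for model weights alone as parameters times bits per weight. The 1.2 overhead factor for the KV cache and runtime is an assumption, and the effective bits per weight of a given quantization format varies, so pass in the figure for the format you actually use.

```python
def estimate_weight_memory_gb(params_billions: float, bits_per_weight: float) -> float:
    """Memory for the weights alone, in GB (1 GB = 1e9 bytes)."""
    total_bytes = params_billions * 1e9 * bits_per_weight / 8
    return total_bytes / 1e9


def estimate_total_memory_gb(
    params_billions: float,
    bits_per_weight: float,
    overhead_factor: float = 1.2,  # assumed allowance for KV cache and runtime
) -> float:
    """Weights plus a rough, assumed overhead multiplier."""
    return estimate_weight_memory_gb(params_billions, bits_per_weight) * overhead_factor


def fits(params_billions: float, bits_per_weight: float, available_gb: float) -> bool:
    """True when the estimated total fits in the available RAM or VRAM."""
    return estimate_total_memory_gb(params_billions, bits_per_weight) <= available_gb
```

At 16 bits per weight, a 7B model needs about 14 GB for weights and a 70B model about 140 GB, which is why the largest models exceed a single 80GB GPU unless they are quantized or split across devices. At 4 bits, the same 7B model needs about 3.5 GB for weights.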
For production with small models (7B–13B parameters):
- 32GB RAM, NVIDIA GPU with 12GB+ VRAM (RTX 4070 Ti or better)
- 30–60 tokens/second with GPU acceleration
- Total cost: $1,500–2,500 for GPU hardware
For production with large models (70B+ parameters):
- 128GB+ RAM, 80GB+ VRAM (A100 or H100)
- Multi-GPU setups with NVLink for models that exceed single-GPU VRAM
- Total cost: $15,000–40,000 for hardware, or $2–4/hour on cloud GPU instances
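The buy-versus-rent figures above imply a break-even point, which the following sketch computes. It is deliberately simplistic: it compares purchase price against hourly rental only and ignores power, cooling, staff time, and depreciation, all of which push the real break-even later. The 24-hour day and 30-day month defaults are assumptions.

```python
def break_even_hours(hardware_cost: float, cloud_rate_per_hour: float) -> float:
    """GPU-hours of cloud rental that cost the same as buying the hardware."""
    return hardware_cost / cloud_rate_per_hour


def break_even_months(
    hardware_cost: float,
    cloud_rate_per_hour: float,
    hours_per_day: float = 24.0,   # assumed utilization
    days_per_month: float = 30.0,  # assumed month length
) -> float:
    """Months of rental at the given utilization before buying would have been cheaper."""
    hours = break_even_hours(hardware_cost, cloud_rate_per_hour)
    return hours / (hours_per_day * days_per_month)
```

With the ranges quoted above, $15,000 of hardware against $4/hour breaks even after 3,750 GPU-hours, or roughly 5.2 months of round-the-clock use. At the other extreme, $40,000 against $2/hour takes 20,000 hours. At 8 hours of use per day, those periods triple.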
For most teams, the managed path through CodeWords costs less than a single GPU — and includes frontier model access, integrations, and execution infrastructure.
FAQ
What is the easiest AI starter kit for beginners?
CodeWords requires no infrastructure setup — describe a workflow and it runs. For self-hosted experimentation, the n8n AI starter kit with Ollama is the most documented option with an active community.
Can I switch from self-hosted to managed later?
Yes. The architectural patterns (RAG, classification, generation) are the same. The orchestration layer changes. If you build workflows in n8n, you can recreate them in CodeWords by describing the same logic to Cody. Your data and models are portable.
Do I need to know Python to use an AI starter kit?
For self-hosted kits, basic Python or Docker knowledge helps. For CodeWords, no coding is required — Cody generates the code from natural language descriptions. You can inspect and modify the generated Python if you want to.
How do AI starter kits handle model updates?
Self-hosted: you pull new model versions manually or via cron. Managed: the platform handles model updates. CodeWords automatically provides access to the latest model versions from OpenAI, Anthropic, and Google.
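For the self-hosted case, the cron approach can be as small as the sketch below. It assumes the Ollama container is named ollama, as in the setup step earlier, and re-runs ollama pull for each model, which fetches a newer version when one is available. The model list simply reuses the names mentioned in this guide. The -it flags are dropped because cron jobs have no terminal attached.

```python
import subprocess
from typing import Callable, List

MODELS = ["llama3", "phi3", "mistral"]  # models named in this guide; adjust as needed


def build_pull_command(model: str, container: str = "ollama") -> List[str]:
    """Command that pulls a model inside the running Ollama container (name assumed)."""
    return ["docker", "exec", container, "ollama", "pull", model]


def update_models(models: List[str], runner: Callable = subprocess.run) -> List[str]:
    """Pull each model in turn and return the names that failed."""
    failed = []
    for model in models:
        result = runner(build_pull_command(model), check=False)
        if result.returncode != 0:
            failed.append(model)
    return failed
```

Save this in a script that calls update_models(MODELS), exits non-zero when the returned list is not empty, and is scheduled with a weekly crontab entry. The runner parameter exists so the function can be exercised without Docker.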
The kit is the starting line, not the finish
An AI starter kit removes the infrastructure barrier. It does not remove the design barrier — understanding which problems AI solves well, structuring prompts for reliable output, building error handling around non-deterministic systems. The teams that get value from AI starter kits are the ones who spend less time configuring Docker and more time designing workflows.
Pick the deployment model that matches your constraints. Then focus on the workflow.
Start building AI workflows in CodeWords — no infrastructure setup, no API key management, ready in minutes.




