AI Platform & Cloud Infrastructure Architect
Designing AI-enabled cloud platforms — from agentic infrastructure and MCP servers to self-healing systems — for safety-critical and regulated industries.
By Ajin Baby · 15+ years architecting cloud systems · two-time founder turned architect. Writing and shipping open-source code at the intersection of AI, cloud, and reliability.
Recent Posts
- What are vector embeddings?
  A short primer on vector embeddings — the numerical representation that lets a computer treat 'the meaning of this text' as something it can search, cluster, and compare. Covers what an embedding actually is, how similarity works, why embedding-model choice matters more than retrieval tuning, and the production failure modes you only catch through evaluation.
- What is function calling (tool use)?
  A short primer on function calling — the mechanism that lets an LLM decide to invoke an external function while your code does the actual work. Covers the JSON-schema contract, the request/response loop, parallel and forced tool calls, and why every production AI agent in 2026 is built on this primitive.
- What is prompt caching?
  A short primer on prompt caching — the LLM-provider feature that drops the cost of a repeated long prompt by 50–90% and cuts its latency roughly in half. Covers how prefix matching works, the TTL economics across Anthropic / OpenAI / Google, where caching helps and where it quietly does not, and the operational gotchas that determine whether your hit rate is 90% or 9%.
- The CAP theorem in AI-native distributed systems
  CAP didn't get repealed when LLMs showed up. But the costs of choosing C, A, or P shift when the datastore behind the system is a vector index, a context graph, or a model-served retrieval layer. A short revisit of the trade-offs, framed for teams building AI-enabled infrastructure.
- NAS vs SAN for GPU workloads — what changed when AI showed up
  The classical NAS-vs-SAN decision was about file vs block, Ethernet vs Fibre Channel, and how much you wanted to pay. GPU training and inference rewrote the question. Here's how the calculus shifts when your storage has to keep an A100 or H100 cluster fed.
- What is an AI agent? A primer for cloud engineers
  A short primer on AI agents — the perceive-reason-act loop, what separates an agent from a one-shot LLM call, the classical agent types (reflex, model-based, goal-based, utility-based, learning), and how they map onto the agents running in modern SRE and platform tooling.