
AI Workflow Automation with n8n and the OpenAI API

How we cut an agency's RFP turnaround time by 80% using n8n orchestration, structured OpenAI outputs, and careful prompt engineering.

Tags: AI · Automation · OpenAI · n8n

For the RFP Rocket project at Nettra Media, the goal was simple: take a process that took an account manager several hours and compress it to minutes. Here's the technical breakdown of how we did it.

The pipeline at a glance

The flow looked like this: a Slack message triggers an n8n webhook → n8n pulls the RFP brief from Google Drive → the brief is chunked and sent to the OpenAI API → structured JSON responses are assembled into a Word doc → the finished packet is posted back to Slack and filed in Drive.
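In n8n this flow is a chain of nodes, but the orchestration logic reads more clearly as plain code. The sketch below is a hypothetical rendering of those stages as injected async functions — `fetchBrief`, `chunkBrief`, `draftSections`, `assembleDoc`, and `deliver` are illustrative names, not part of the actual workflow:

```javascript
// Hypothetical sketch of the pipeline stages as plain async functions.
// Each stage is injected so the orchestration can be read (and tested)
// in isolation from the Drive, OpenAI, and Slack integrations.
async function runRfpPipeline(slackEvent, stages) {
  const brief = await stages.fetchBrief(slackEvent.fileId); // Google Drive
  const chunks = stages.chunkBrief(brief);                  // split long briefs
  const sections = [];
  for (const chunk of chunks) {
    sections.push(await stages.draftSections(chunk));       // OpenAI call
  }
  const doc = stages.assembleDoc(sections);                 // docxtemplater
  await stages.deliver(doc, slackEvent.channel);            // Slack + Drive
  return doc;
}
```

Keeping each stage isolated also mirrors how the n8n workflow is wired: one node per stage, so a failure in any step is visible and retryable on its own.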

Stack

n8n (self-hosted), OpenAI gpt-4o, Google Drive API, Slack webhooks, and docxtemplater for Word assembly. Total infrastructure cost: ~$15/month on a small VPS.

Getting consistent output from the model

The hardest part wasn't hooking up the APIs — it was making the model produce output that was reliable enough to drop straight into a client document. Two techniques made the biggest difference:

1. Structured outputs (JSON mode)

We used OpenAI's response_format: { type: "json_object" } and spelled out the expected keys and their contents in the system prompt. (JSON mode also requires the word "JSON" to appear somewhere in the prompt, which the key description satisfies.) This eliminated the "the model added a preamble paragraph" class of failures entirely.

{
  "model": "gpt-4o",
  "response_format": { "type": "json_object" },
  "messages": [
    {
      "role": "system",
      "content": "Return a JSON object with keys: executive_summary, scope_of_work, pricing_narrative. Each value is a string of 2-3 professional paragraphs."
    },
    { "role": "user", "content": rfpBrief }
  ]
}
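One thing JSON mode guarantees is that the reply parses; it does not guarantee the keys you asked for are present, so we still validate before templating. A minimal sketch of that pattern — `buildRfpRequest` and `parseRfpResponse` are our illustrative names, not part of the workflow:

```javascript
// Hypothetical helpers around the payload shown above. JSON mode makes
// JSON.parse safe to attempt, but key names still need checking by us.
const SYSTEM_PROMPT =
  'Return a JSON object with keys: executive_summary, scope_of_work, ' +
  'pricing_narrative. Each value is a string of 2-3 professional paragraphs.';

function buildRfpRequest(rfpBrief) {
  return {
    model: 'gpt-4o',
    response_format: { type: 'json_object' },
    messages: [
      { role: 'system', content: SYSTEM_PROMPT },
      { role: 'user', content: rfpBrief },
    ],
  };
}

function parseRfpResponse(raw) {
  const data = JSON.parse(raw);
  for (const key of ['executive_summary', 'scope_of_work', 'pricing_narrative']) {
    if (typeof data[key] !== 'string') throw new Error(`missing key: ${key}`);
  }
  return data;
}
```

The validation step is what lets the output drop straight into docxtemplater without a human checking the shape first.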

2. Few-shot examples in the system prompt

We included one complete example of a well-written RFP response in the system prompt. The model's tone and formatting immediately converged on the agency's house style without any further tuning.
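Mechanically, this can be as simple as appending the example input/output pair to the base system prompt. A sketch, with `withFewShotSystemPrompt` and the "Example input/output" labels as our own conventions, not the agency's actual prompt:

```javascript
// Hypothetical helper that folds one worked example into the system
// prompt. The example teaches tone and structure without fine-tuning.
function withFewShotSystemPrompt(basePrompt, exampleBrief, exampleResponse) {
  return (
    basePrompt +
    '\n\nExample input:\n' + exampleBrief +
    '\n\nExample output:\n' + exampleResponse
  );
}
```

An alternative design is to send the example as a fabricated user/assistant message pair instead; both approaches tend to steer tone, and the system-prompt version keeps the messages array simple inside n8n.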

Managing API quotas in n8n

n8n's built-in retry logic handles transient 429s, but we also added an explicit rate limiter before every OpenAI call, using n8n's Wait node with a short fixed delay. For document-heavy batches we split the workflow into a parent (scheduling) and child (per-document) execution so failures were isolated.
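If you call the API from a Code node or an external script instead, you have to reproduce that retry behavior yourself. A hedged sketch of exponential backoff on retryable status codes — `withRetry` is our illustrative wrapper, not an n8n API:

```javascript
// Hypothetical retry wrapper mirroring n8n's built-in behavior:
// retry 429s and 5xx errors with exponential backoff, capped attempts.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

async function withRetry(fn, { attempts = 3, baseDelayMs = 1000 } = {}) {
  for (let i = 0; i < attempts; i++) {
    try {
      return await fn();
    } catch (err) {
      const retryable = err.status === 429 || err.status >= 500;
      if (!retryable || i === attempts - 1) throw err;
      await sleep(baseDelayMs * 2 ** i); // 1s, 2s, 4s, ...
    }
  }
}
```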

Rate limits

OpenAI's tier-based rate limits are enforced in per-minute windows (requests and tokens per minute). If you're processing multiple documents in a loop, add a 1–2 second delay between calls even if you're within your TPM limit. Burst overages are the most common failure mode in production.
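The delay-between-calls pattern above can be sketched in a few lines; `processSequentially` is an illustrative name, and 1500 ms is an assumed default, not a measured optimum:

```javascript
// Sketch of a throttled batch loop: a fixed pause between calls smooths
// out bursts even when total token usage stays under the TPM limit.
const pause = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

async function processSequentially(items, handler, delayMs = 1500) {
  const results = [];
  for (const [i, item] of items.entries()) {
    if (i > 0) await pause(delayMs); // no pause before the first call
    results.push(await handler(item));
  }
  return results;
}
```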

Results

The agency went from 4–6 hours per RFP to under 30 minutes. The remaining time is human review, which is exactly where it should be.

Don't try to automate the human decision — automate the tedious information-gathering and formatting that surrounds it. That's where the 80% time savings live.

Ship fast, tune later

The first version had no few-shot examples and no JSON schema. It still cut turnaround time by 60%. You don't need a perfect prompt to get value — ship, collect feedback, and iterate.