Nightly re-index of Confluence + GitLab wikis into a vector store

Runs every night to pull changed Confluence pages and GitLab wiki pages, then chunks and embeds them.

Category: AI & RAG
Engine: sim
Difficulty: advanced
Trigger: schedule
Steps: 5
Setup: ~25 min

How it runs

The automated pipeline, trigger to output.

  • Trigger: Nightly schedule
  • Action: Fetch changed Confluence + GitLab pages
  • Logic: Split into upserts vs. deletes by change type
  • Action: Chunk and embed pages with OpenAI
  • Output: Upsert/prune vectors in Postgres pgvector

What it does

Keeps your retrieval index current. On a nightly schedule it fetches pages updated since the last run from Confluence and GitLab wikis, splits them into chunks, generates embeddings, and upserts the vectors into a Postgres pgvector table. Deleted pages are pruned so stale answers don't resurface.
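
As a rough illustration of the chunk-and-upsert idea described above, here is a minimal sketch. It assumes fixed-size overlapping character windows and a chunk ID derived from the page ID and chunk position, so a re-run overwrites rows instead of duplicating them. The table name `doc_chunks`, its columns, the 1536 vector dimension, the `%s` placeholder style, and the window sizes are all hypothetical choices for illustration, not what the workflow itself uses.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    chunk_id: str  # stable key, so a re-run overwrites instead of duplicating
    page_id: str
    index: int
    text: str


def chunk_page(page_id: str, text: str, size: int = 1000, overlap: int = 200) -> list[Chunk]:
    """Split text into overlapping character windows (sizes are illustrative)."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must be >= 0 and smaller than size")
    step = size - overlap
    chunks: list[Chunk] = []
    for index, start in enumerate(range(0, len(text), step)):
        piece = text[start:start + size]
        # Hypothetical ID scheme: hash of page ID plus chunk position.
        chunk_id = hashlib.sha256(f"{page_id}:{index}".encode()).hexdigest()[:32]
        chunks.append(Chunk(chunk_id, page_id, index, piece))
        if start + size >= len(text):
            break  # this window already reached the end of the page
    return chunks


# Hypothetical pgvector table; the dimension must match the embedding model you pick.
CREATE_SQL = """
CREATE EXTENSION IF NOT EXISTS vector;
CREATE TABLE IF NOT EXISTS doc_chunks (
    chunk_id    text PRIMARY KEY,
    page_id     text NOT NULL,
    chunk_index integer NOT NULL,
    content     text NOT NULL,
    embedding   vector(1536) NOT NULL
);
"""

# The embedding is passed as a text literal such as '[0.1, 0.2, ...]' and cast.
UPSERT_SQL = """
INSERT INTO doc_chunks (chunk_id, page_id, chunk_index, content, embedding)
VALUES (%s, %s, %s, %s, %s::vector)
ON CONFLICT (chunk_id) DO UPDATE
SET content = EXCLUDED.content, embedding = EXCLUDED.embedding;
"""

# When an edit shortens a page, remove chunks past the new chunk count.
PRUNE_TAIL_SQL = "DELETE FROM doc_chunks WHERE page_id = %s AND chunk_index >= %s;"

# Removing a deleted page drops all of its chunks.
DELETE_PAGE_SQL = "DELETE FROM doc_chunks WHERE page_id = %s;"
```

One design point worth noting: because IDs depend on chunk position, an edited page that becomes shorter would leave orphaned trailing chunks unless something like `PRUNE_TAIL_SQL` runs after the upsert.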

When to use it

Use it whenever you run a RAG answer bot over engineering docs and need retrieval to reflect edits made during the day. Run it as its own scheduled flow so the question-answering flows stay fast and never have to embed documents at query time.

How it works

  1. A nightly schedule trigger starts the run.
  2. The flow queries Confluence and GitLab for pages with an updated timestamp newer than the last successful run.
  3. A change filter splits results into upserts (new/edited) and deletes (removed pages).
  4. Each changed page is chunked and embedded via OpenAI.
  5. Vectors are upserted into the Postgres pgvector store; tombstoned pages are deleted.
  6. The run records its completion timestamp in Postgres for the next incremental pass.
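
The incremental logic in the steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not the workflow's actual implementation: the `Change` record, `plan_run`, `run_once`, and the two callbacks are hypothetical names, and the fetch, embedding, and database calls are left to the caller.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Callable, Iterable, Optional


@dataclass(frozen=True)
class Change:
    source: str           # "confluence" or "gitlab" (illustrative labels)
    page_id: str
    updated_at: datetime
    deleted: bool = False  # True when the page was removed (a tombstone)


def plan_run(changes: Iterable[Change], last_run: datetime) -> tuple[list[Change], list[Change]]:
    """Keep changes newer than the watermark and split them into (upserts, deletes)."""
    fresh = [c for c in changes if c.updated_at > last_run]
    upserts = [c for c in fresh if not c.deleted]
    deletes = [c for c in fresh if c.deleted]
    return upserts, deletes


def run_once(
    changes: Iterable[Change],
    last_run: datetime,
    embed_and_upsert: Callable[[Change], None],
    delete_page: Callable[[Change], None],
    now: Optional[datetime] = None,
) -> datetime:
    """Process one incremental pass and return the watermark to persist."""
    started = now if now is not None else datetime.now(timezone.utc)
    upserts, deletes = plan_run(changes, last_run)
    for change in upserts:
        embed_and_upsert(change)  # chunk, embed, and write vectors
    for change in deletes:
        delete_page(change)       # remove every chunk for the page
    # Returned only if nothing above raised, so a failed run keeps the old watermark.
    return started
```

The sketch returns the time the run started rather than the time it finished. That is a conservative assumption so that pages edited while the run is in progress are picked up again the next night; storing the completion time, as the steps describe, works the same way mechanically.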

Set it up

What you configure once, before turning it on.

  1. Connect Confluence: spaces, pages, blueprints.
  2. Connect GitLab: repos, MRs, pipelines, registry.
  3. Connect OpenAI: models, embeddings, files.
  4. Connect Postgres: any Postgres URL (query, write, migrate).
  5. Set each agent's model: we leave models unset so you pick the tier (fast + cheap, or top-quality).
  6. Tune it to your data: edit the prompts, filters, and field mappings so it matches how your team works.
  7. Test, then turn it on: run once against a sample, confirm the output, then enable the trigger.
