OpsGuide: Multi-Tenant Workflow Automation SaaS

Tech Stack:
React, TypeScript, FastAPI, GraphQL, Playwright, AWS

A multi-tenant SaaS platform for designing, executing, and governing operational workflows with AI-assisted form automation


Project Overview

OpsGuide is a multi-tenant SaaS product for modeling operational workflows, automating repetitive tasks, and managing execution from a single interface. The product combines a React dashboard, a FastAPI + GraphQL backend, browser automation jobs, and AWS infrastructure to support workflow execution, AI-assisted scenario generation, usage governance, and organization-based access control.

I worked on the project as a full-stack developer from April 2025 to February 2026, contributing across the web application, backend APIs, and the automation pipeline that powers form submission workflows.

Link to the product: OpsGuide
Company page: PORTAMENT


Tech Stack

  • Frontend: React, TypeScript, Vite, Tailwind CSS, Apollo Client
  • Backend: Python, FastAPI, Strawberry GraphQL, SQLAlchemy
  • Automation: Node.js, TypeScript, Playwright, AWS Batch
  • Data & Infra: PostgreSQL, Redis, S3, ECS Fargate, Lambda, Step Functions, Terraform
  • Auth & Platform: Auth0, CloudFront, ALB, CloudWatch

Product Context

The product sits in a difficult space between traditional SaaS CRUD screens and long-running automation jobs. A normal web app request/response model is not enough when users need to:

  • manage organizations, scenarios, and target companies
  • launch automation that may run asynchronously in the cloud
  • inspect execution status and failures later
  • keep usage limits, access control, and tenant isolation consistent

That made the project interesting from a full-stack perspective because the UI, API, data model, and execution layer all had to fit together cleanly.

What I Worked On

My work spanned full-stack product delivery, including:

  • building and refining product screens in the React application
  • wiring frontend flows to GraphQL queries and mutations
  • supporting backend logic for workflow execution, scenario handling, and organization-scoped data
  • working on the automation pipeline that submits jobs to browser-based execution workers
  • helping keep the product usable across operational dashboards, configuration pages, and result tracking flows

Key Engineering Decisions

1. GraphQL for dashboard-heavy product flows

OpsGuide has many screens where the UI needs related data from multiple domains at once: organizations, form targets, companies, scenarios, results, and usage information. Using GraphQL made sense because the frontend could request exactly the shape needed for a screen instead of stitching together many REST calls.
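
As a rough illustration of that point, one dashboard screen might fetch its related data in a single request. The sketch below builds a GraphQL-over-HTTP request body; the query name, field names, and the `build_dashboard_request` helper are hypothetical and are not taken from the actual OpsGuide schema.

```python
import json

# Hypothetical query: one round trip returns everything a dashboard screen needs.
# Field and type names are illustrative assumptions, not the real OpsGuide schema.
DASHBOARD_QUERY = """
query Dashboard($orgId: ID!) {
  organization(id: $orgId) {
    name
    usage { used limit }
    scenarios { id name lastResult { status finishedAt } }
  }
}
"""


def build_dashboard_request(org_id: str) -> str:
    """Serialize the JSON body for a GraphQL POST request."""
    body = {"query": DASHBOARD_QUERY, "variables": {"orgId": org_id}}
    return json.dumps(body)
```

With REST, the same screen would likely need separate calls for the organization, its usage, and its scenarios, with the client joining the results.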

2. Separate web requests from automation execution

Long-running browser automation jobs are a bad fit for synchronous API handling. The architecture separates user-triggered requests from execution workers by pushing automation to AWS Batch and related orchestration components. That keeps the web app responsive while making retries, status tracking, and failure handling more manageable.
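
The separation described above can be sketched with a submit/worker/status split. This is a minimal in-memory sketch under stated assumptions: the `Job` record, the `JOBS` dictionary, and the `WORK_QUEUE` are stand-ins for a database and a managed batch service, and none of the names come from the OpsGuide codebase.

```python
import uuid
from dataclasses import dataclass
from queue import Queue


@dataclass
class Job:
    job_id: str
    scenario_id: str
    status: str = "QUEUED"


# In-memory stand-ins (assumptions): a real system would use a database for
# job records and a managed queue or batch service for dispatch.
JOBS: dict[str, Job] = {}
WORK_QUEUE: Queue = Queue()


def submit_job(scenario_id: str) -> str:
    """Called by the request handler: record the job, enqueue it, return at once."""
    job = Job(job_id=uuid.uuid4().hex, scenario_id=scenario_id)
    JOBS[job.job_id] = job
    WORK_QUEUE.put(job.job_id)
    return job.job_id


def run_next_job(execute) -> None:
    """Called by a worker: take one job and record its outcome."""
    job = JOBS[WORK_QUEUE.get()]
    job.status = "RUNNING"
    try:
        execute(job.scenario_id)
        job.status = "SUCCEEDED"
    except Exception:
        job.status = "FAILED"


def get_status(job_id: str) -> str:
    """Polled by the UI after submission."""
    return JOBS[job_id].status
```

The request handler only ever calls `submit_job` and `get_status`, so its latency does not depend on how long the browser automation takes.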

3. Multi-tenant data boundaries from the start

Because the product serves multiple organizations, tenant boundaries had to be part of the design, not an afterthought. Organization-aware data models, permission checks, and usage tracking keep operational data isolated and make billing and usage-limit logic enforceable.
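
One common way to make tenant boundaries hard to bypass is to require the organization id on every data access call. The sketch below assumes a hypothetical `Scenario` record and `ScenarioRepository` class backed by a plain list; the real system would apply the same filter in its database queries.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scenario:
    id: str
    organization_id: str
    name: str


class ScenarioRepository:
    """Every read requires an organization id, so callers cannot skip the tenant filter."""

    def __init__(self, rows: list[Scenario]) -> None:
        self._rows = rows

    def list_for_org(self, organization_id: str) -> list[Scenario]:
        return [r for r in self._rows if r.organization_id == organization_id]

    def get_for_org(self, organization_id: str, scenario_id: str) -> Scenario:
        for row in self._rows:
            if row.id == scenario_id and row.organization_id == organization_id:
                return row
        # Same error whether the row is missing or belongs to another tenant,
        # so the response does not reveal that the id exists elsewhere.
        raise LookupError(f"scenario {scenario_id} not found")
```

Because there is no method that reads without an organization id, a forgotten filter becomes a missing argument at the call site rather than a silent data leak.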

4. Hybrid AI + rule-based scenario generation

One of the more interesting parts of the platform is automatic scenario generation for forms. A purely LLM-driven approach would be expensive and less predictable, so the system uses rule-based detection for common cases and falls back to AI for more complex form structures. That is a better engineering tradeoff than forcing every case through the most expensive path.
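
The rules-first, AI-fallback idea can be sketched as follows. The keyword table, the field types, and the `ai_fallback` callable are all illustrative assumptions; the actual detection rules and model integration are not described in this write-up.

```python
from typing import Callable, Optional

# Hypothetical keyword rules mapping a form field label to a semantic type.
RULES: dict[str, tuple[str, ...]] = {
    "email": ("email", "e-mail", "mail address"),
    "phone": ("phone", "mobile"),
    "company": ("company", "organization"),
}


def classify_by_rules(label: str) -> Optional[str]:
    """Cheap deterministic path: return a field type, or None if no rule matches."""
    text = label.lower()
    for field_type, keywords in RULES.items():
        if any(keyword in text for keyword in keywords):
            return field_type
    return None


def classify_field(label: str, ai_fallback: Callable[[str], str]) -> tuple[str, str]:
    """Return (field_type, source); the AI callable runs only when rules fail."""
    matched = classify_by_rules(label)
    if matched is not None:
        return matched, "rule"
    return ai_fallback(label), "ai"
```

Returning the source alongside the result makes it possible to measure how often the expensive path runs, which is the number that matters for cost control.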

Full-Stack Challenges

  • Async execution model: users expect immediate feedback, but the actual work may happen later in distributed workers
  • Workflow visibility: results, failures, retries, and state transitions need to be understandable in the UI
  • Tenant safety: organization boundaries affect queries, mutations, limits, and permissions
  • Dynamic websites: external forms change often, so automation needs to be resilient instead of brittle
  • Cost vs capability: AI-assisted generation is useful, but it has to be controlled carefully

Outcome

This project strengthened the skills most relevant to my full-stack profile: product-facing frontend development, backend API work, domain modeling, and integration with asynchronous execution systems. It is the kind of system where a feature is only complete when the UI, API, data flow, and background execution all work together.

Lessons Learned

  • Designing for async operations changes both backend implementation and frontend UX decisions
  • Multi-tenant systems become much easier to maintain when organization boundaries are explicit everywhere
  • Workflow products need clear status modeling; otherwise the product feels unreliable even when jobs succeed
  • AI features are more useful in production when they are combined with deterministic rules and guardrails
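
To make the point about status modeling concrete, here is a minimal sketch of an explicit execution state machine. The status names and the transition table are assumptions chosen for illustration, not the actual OpsGuide model.

```python
from enum import Enum


class ExecutionStatus(Enum):
    QUEUED = "queued"
    RUNNING = "running"
    SUCCEEDED = "succeeded"
    FAILED = "failed"
    RETRYING = "retrying"


# Allowed transitions (an assumed model, not the actual OpsGuide state machine).
TRANSITIONS: dict[ExecutionStatus, set[ExecutionStatus]] = {
    ExecutionStatus.QUEUED: {ExecutionStatus.RUNNING},
    ExecutionStatus.RUNNING: {ExecutionStatus.SUCCEEDED, ExecutionStatus.FAILED},
    ExecutionStatus.FAILED: {ExecutionStatus.RETRYING},
    ExecutionStatus.RETRYING: {ExecutionStatus.RUNNING},
    ExecutionStatus.SUCCEEDED: set(),
}


def transition(current: ExecutionStatus, target: ExecutionStatus) -> ExecutionStatus:
    """Return the new status, or raise if the move is not allowed."""
    if target not in TRANSITIONS[current]:
        raise ValueError(f"cannot move from {current.value} to {target.value}")
    return target
```

With a closed set of states and transitions, the UI can map each status to one clear label, and impossible histories such as a job going from queued straight to succeeded are rejected at write time.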

Attachments

  • Execution Flow List
  • Flow Designer / Visual Editor
  • Execution History