120 changes: 120 additions & 0 deletions docs/remote-troubleshoot-action.md
@@ -0,0 +1,120 @@
# Remote Troubleshoot Execution Action

The Remote Troubleshoot Execution Action is an optional feature that surfaces a button in the task properties panel when a pipeline execution is in a problematic state. Clicking it opens a modal where the user can add context, then submits a structured payload to a configurable endpoint — allowing the deployment operator to integrate with an external troubleshooting or on-call system.

The feature is intentionally OSS-friendly: it renders nothing when unconfigured, and the payload schema is generic enough to adapt to any HTTP-based integration.

## How it works

1. When a task node is selected in the run view, the button appears only if all of the following hold:
   - `window.__TANGLE_REMOTE_TROUBLESHOOT_ACTION__` is set (see [Configuration](#configuration))
   - The hostname is not `localhost`, `127.0.0.1`, or a name ending in `.local`
   - The task has an execution ID and a known status
   - The execution status is immediately eligible (`FAILED`, `CANCELLED`, `SYSTEM_ERROR`), or has been observed as `PENDING` / `QUEUED` for at least 5 minutes
2. After a successful submission, a localStorage record is written keyed by `(runId, executionId)`. Subsequent views of the same execution show "[buttonText] session opened [time]." in place of the button, which prevents duplicate requests.
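
To illustrate the de-duplication step, here is a minimal sketch of how such a record could be written and read back. The key prefix, the ISO-string format of `requestedAt`, and the helper names `saveRecord`, `getRecord`, and `createMemoryStore` are assumptions made for illustration. The real helpers live in `@/utils/remoteTroubleshootStorage` and may differ.

```typescript
// Hypothetical record shape; only the `requestedAt` field is implied by the component.
interface TroubleshootRecord {
  requestedAt: string; // assumed to be an ISO-8601 timestamp
}

// Minimal subset of the Web Storage API, so the sketch also runs outside a browser.
interface KeyValueStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

const KEY_PREFIX = "remote-troubleshoot"; // assumed prefix, not taken from the source

// One key per (runId, executionId) pair.
function recordKey(runId: string, executionId: string): string {
  return `${KEY_PREFIX}:${runId}:${executionId}`;
}

function saveRecord(
  store: KeyValueStore,
  runId: string,
  executionId: string,
  now: Date = new Date(),
): void {
  const record: TroubleshootRecord = { requestedAt: now.toISOString() };
  store.setItem(recordKey(runId, executionId), JSON.stringify(record));
}

function getRecord(
  store: KeyValueStore,
  runId: string,
  executionId: string,
): TroubleshootRecord | null {
  const raw = store.getItem(recordKey(runId, executionId));
  if (raw === null) return null;
  try {
    const parsed: unknown = JSON.parse(raw);
    if (typeof parsed === "object" && parsed !== null) {
      const requestedAt = (parsed as { requestedAt?: unknown }).requestedAt;
      if (typeof requestedAt === "string") return { requestedAt };
    }
  } catch {
    // A corrupt entry is treated as if no request had been made.
  }
  return null;
}

// In-memory stand-in for window.localStorage, useful in tests.
function createMemoryStore(): KeyValueStore {
  const data = new Map<string, string>();
  return {
    getItem: (key) => data.get(key) ?? null,
    setItem: (key, value) => {
      data.set(key, value);
    },
  };
}
```

In a browser, `window.localStorage` can be passed as the `store` argument, since it satisfies the `KeyValueStore` shape.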

## Configuration

Set `window.__TANGLE_REMOTE_TROUBLESHOOT_ACTION__` to a `RemoteTroubleshootActionConfig` object before the React app mounts. Because the config is static deploy-time data, inline it directly in your `index.html` with a synchronous `<script>`. This sets the global at HTML parse time, so it is ready before the app mounts without a render-blocking network round-trip:

```html
<script>
  window.__TANGLE_REMOTE_TROUBLESHOOT_ACTION__ = {
    endpointUrl: "https://your-backend.example.com/api/troubleshoot",
    buttonText: "Get help with this execution",
    modalTitle: "Get help with this execution",
    modalDescription:
      "Describe the problem or add any context, then submit to notify your support channel.",
    successTitle: "Request submitted",
    successMessage:
      "Your request has been submitted. Someone will follow up shortly.",
    source: "my-deployment",
  };
</script>
```

Leaving the global unset (the default) disables the feature entirely — the button renders nothing.

### Config shape

```typescript
interface RemoteTroubleshootActionConfig {
  /** URL the payload will be POSTed to. */
  endpointUrl: string;

  /** Label shown on the button and used as the default modal title. */
  buttonText: string;

  /** Optional modal title (defaults to buttonText). */
  modalTitle?: string;

  /** Optional modal description shown above the comments textarea. */
  modalDescription?: string;

  /** Optional heading shown in the success state (defaults to "Request submitted"). */
  successTitle?: string;

  /**
   * Optional body shown in the success state
   * (defaults to "Your request has been submitted successfully.").
   */
  successMessage?: string;

  /**
   * Optional source tag included in the payload (defaults to "tangle-ui").
   * Use this to distinguish requests from different deployments.
   */
  source?: string;
}
```
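
The defaults described in the comments above can be summarized as a small resolver. This is a sketch only: `resolveConfig` and `ResolvedTroubleshootConfig` are hypothetical names, and the component applies these fallbacks inline rather than through a helper like this one.

```typescript
interface RemoteTroubleshootActionConfig {
  endpointUrl: string;
  buttonText: string;
  modalTitle?: string;
  modalDescription?: string;
  successTitle?: string;
  successMessage?: string;
  source?: string;
}

// Hypothetical "fully resolved" shape: every field with a default is required here.
interface ResolvedTroubleshootConfig {
  endpointUrl: string;
  buttonText: string;
  modalTitle: string;
  modalDescription: string | undefined; // has no default
  successTitle: string;
  successMessage: string;
  source: string;
}

// Applies the documented fallbacks to a partially specified config.
function resolveConfig(
  config: RemoteTroubleshootActionConfig,
): ResolvedTroubleshootConfig {
  return {
    endpointUrl: config.endpointUrl,
    buttonText: config.buttonText,
    modalTitle: config.modalTitle ?? config.buttonText,
    modalDescription: config.modalDescription,
    successTitle: config.successTitle ?? "Request submitted",
    successMessage:
      config.successMessage ?? "Your request has been submitted successfully.",
    source: config.source ?? "tangle-ui",
  };
}
```

Because the fallbacks use `??`, an explicitly configured empty string is kept as-is; only `undefined` (or `null`) triggers a default.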

### Minimal example

Only `endpointUrl` and `buttonText` are required:

```html
<script>
  window.__TANGLE_REMOTE_TROUBLESHOOT_ACTION__ = {
    endpointUrl:
      "https://your-backend.example.com/api/remote_troubleshoot/execution",
    buttonText: "Get help with this execution",
  };
</script>
```

## Payload

The button POSTs the following JSON body to `endpointUrl`:

```json
{
  "execution_id": "abc123",
  "user_email": "user@example.com",
  "pipeline_run_id": "run-456",
  "pipeline_run_url": "https://your-deployment.example.com/runs/run-456",
  "execution_url": "https://your-deployment.example.com/runs/run-456?nodeId=MyTask",
  "additional_comments": "Optional user-provided context.",
  "source": "tangle-ui"
}
```

| Field | Description |
| --------------------- | ------------------------------------------------------------ |
| `execution_id` | ID of the specific task execution |
| `user_email` | User's email, taken from `user.id` (empty if unavailable) |
| `pipeline_run_id` | ID of the parent pipeline run |
| `pipeline_run_url` | URL of the run view page |
| `execution_url` | Deep-link to the specific task node via `?nodeId=<taskName>` |
| `additional_comments` | Free-text context entered by the user in the modal |
| `source` | Deployment identifier from the config `source` field (defaults to `tangle-ui`) |
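
The two URL fields can be derived from the current page address. The sketch below takes the address as a parameter so it can run outside a browser; `buildRunUrl` is a hypothetical name, and passing `href` explicitly (instead of reading `window.location`) is a choice made here for testability.

```typescript
// URL of the run view page: origin plus path, without query string or fragment.
function buildRunUrl(href: string): string {
  const url = new URL(href);
  return `${url.origin}${url.pathname}`;
}

// Deep link to a task node: drops any existing query and sets only `nodeId`.
function buildExecutionUrl(href: string, taskName: string): string {
  const url = new URL(href);
  url.search = "";
  url.searchParams.set("nodeId", taskName);
  return url.toString();
}

const currentHref =
  "https://your-deployment.example.com/runs/run-456?tab=logs";

console.log(buildRunUrl(currentHref));
// → https://your-deployment.example.com/runs/run-456
console.log(buildExecutionUrl(currentHref, "MyTask"));
// → https://your-deployment.example.com/runs/run-456?nodeId=MyTask
```

`URLSearchParams` percent-encodes the task name, so names containing spaces or reserved characters still produce a valid link.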

The endpoint is expected to return any `2xx` status on success. Any non-`2xx` response causes the modal to return to the input state so the user can retry.
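
On the receiving side, an integration will likely want to validate the body before acting on it. The following is a framework-agnostic sketch, not part of this feature: `isTroubleshootPayload` and `statusForBody` are hypothetical names, and the choice of `202` and `400` is an assumption (any `2xx` signals success to the UI).

```typescript
interface TroubleshootPayload {
  execution_id: string;
  user_email: string;
  pipeline_run_id: string;
  pipeline_run_url: string;
  execution_url: string;
  additional_comments: string;
  source: string;
}

const PAYLOAD_FIELDS = [
  "execution_id",
  "user_email",
  "pipeline_run_id",
  "pipeline_run_url",
  "execution_url",
  "additional_comments",
  "source",
] as const;

// Type guard: true only when every documented field is present as a string.
function isTroubleshootPayload(value: unknown): value is TroubleshootPayload {
  if (typeof value !== "object" || value === null) return false;
  const candidate = value as Record<string, unknown>;
  return PAYLOAD_FIELDS.every((field) => typeof candidate[field] === "string");
}

// Maps a raw request body to an HTTP status code for the response.
function statusForBody(rawBody: string): number {
  try {
    return isTroubleshootPayload(JSON.parse(rawBody)) ? 202 : 400;
  } catch {
    return 400; // body was not valid JSON
  }
}
```

A handler in any HTTP framework could call `statusForBody` on the request body and forward valid payloads to the on-call or ticketing system.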

## Visibility logic

| Status | Visible after |
| -------------- | ------------- |
| `FAILED` | Immediately |
| `CANCELLED` | Immediately |
| `SYSTEM_ERROR` | Immediately |
| `PENDING` | 5 minutes |
| `QUEUED` | 5 minutes |
| All others | Never |

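The 5-minute timer starts when the component first observes a `PENDING` or `QUEUED` status, not when the execution started. It resets if the status leaves those two states or the component remounts. Elapsed time is checked on a 10-second interval rather than on every render, so the button can appear up to about 10 seconds after the threshold passes.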
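
The table above can be expressed as a pure predicate. This is a sketch under the assumption that the caller tracks `firstObservedAt` (a millisecond timestamp, or `null` if no timer-eligible status has been seen); `isStatusEligible` is a hypothetical name, and the component itself keeps this state in React refs.

```typescript
const ALWAYS_ELIGIBLE = new Set(["FAILED", "CANCELLED", "SYSTEM_ERROR"]);
const TIMER_ELIGIBLE = new Set(["PENDING", "QUEUED"]);
const THRESHOLD_MS = 5 * 60 * 1000;

// Returns true when the button may be shown for the given status.
function isStatusEligible(
  executionStatus: string | undefined,
  firstObservedAt: number | null,
  now: number,
): boolean {
  if (executionStatus === undefined) return false;
  if (ALWAYS_ELIGIBLE.has(executionStatus)) return true;
  if (!TIMER_ELIGIBLE.has(executionStatus) || firstObservedAt === null) {
    return false;
  }
  // PENDING / QUEUED become eligible once the threshold has elapsed.
  return now - firstObservedAt >= THRESHOLD_MS;
}
```

Keeping the rule in a pure function like this would make the threshold behavior easy to unit-test without fake timers.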
@@ -0,0 +1,192 @@
import "@/config/remoteTroubleshootAction";

import { useEffect, useRef, useState } from "react";

import { Button } from "@/components/ui/button";
import { InlineStack } from "@/components/ui/layout";
import { Paragraph } from "@/components/ui/typography";
import type { RemoteTroubleshootActionConfig } from "@/config/remoteTroubleshootAction";
import {
  getRemoteTroubleshootRecord,
  saveRemoteTroubleshootRecord,
} from "@/utils/remoteTroubleshootStorage";
import { getUserDetails } from "@/utils/user";

import { RemoteTroubleshootDialog } from "./RemoteTroubleshootDialog";

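// Statuses for which the button is offered as soon as they are seen.
const ALWAYS_ELIGIBLE_STATUSES = new Set([
  "CANCELLED",
  "SYSTEM_ERROR",
  "FAILED",
]);
// Statuses that become eligible only after PENDING_THRESHOLD_MS has elapsed.
const TIMER_ELIGIBLE_STATUSES = new Set(["PENDING", "QUEUED"]);
const PENDING_THRESHOLD_MS = 5 * 60 * 1000;
const POLL_INTERVAL_MS = 10 * 1000;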

// The feature is disabled on local development hosts.
function isLocalEnvironment(): boolean {
  const h = window.location.hostname;
  return h === "localhost" || h === "127.0.0.1" || h.endsWith(".local");
}

// Returns the deploy-time config, or null when the feature is not configured.
function getConfig(): RemoteTroubleshootActionConfig | null {
  return window.__TANGLE_REMOTE_TROUBLESHOOT_ACTION__ ?? null;
}

interface RemoteTroubleshootButtonProps {
  runId: string;
  executionId: string | undefined;
  taskName: string;
  status: string | undefined;
}

export function RemoteTroubleshootButton({
  runId,
  executionId,
  taskName,
  status,
}: RemoteTroubleshootButtonProps) {
  const config = getConfig();
  const [dialogOpen, setDialogOpen] = useState(false);
  const [timerReady, setTimerReady] = useState(false);
  // Timestamp (ms) at which a timer-eligible status was first observed.
  const firstObservedRef = useRef<number | null>(null);
  const intervalRef = useRef<ReturnType<typeof setInterval> | null>(null);

  const isTimerStatus =
    status !== undefined && TIMER_ELIGIBLE_STATUSES.has(status);
  const isAlwaysEligible =
    status !== undefined && ALWAYS_ELIGIBLE_STATUSES.has(status);

  useEffect(() => {
    // Leaving PENDING/QUEUED resets the timer and stops polling.
    if (!isTimerStatus) {
      firstObservedRef.current = null;
      setTimerReady(false);
      if (intervalRef.current !== null) {
        clearInterval(intervalRef.current);
        intervalRef.current = null;
      }
      return;
    }

    // Start timing the first time a timer-eligible status is observed.
    // Note: the start time is kept per component instance, so it is not reset
    // if executionId changes while the status stays timer-eligible.
    if (firstObservedRef.current === null) {
      firstObservedRef.current = Date.now();
    }

    // Marks the timer as ready once the threshold has elapsed, then stops polling.
    const check = () => {
      if (
        firstObservedRef.current !== null &&
        Date.now() - firstObservedRef.current >= PENDING_THRESHOLD_MS
      ) {
        setTimerReady(true);
        if (intervalRef.current !== null) {
          clearInterval(intervalRef.current);
          intervalRef.current = null;
        }
      }
    };

    check();

    if (!timerReady && intervalRef.current === null) {
      intervalRef.current = setInterval(check, POLL_INTERVAL_MS);
    }

    return () => {
      if (intervalRef.current !== null) {
        clearInterval(intervalRef.current);
        intervalRef.current = null;
      }
    };
  }, [isTimerStatus, timerReady]);

  // Render nothing unless the feature is configured and the execution is eligible.
  if (
    !config ||
    isLocalEnvironment() ||
    !status ||
    !executionId ||
    (!isAlwaysEligible && !(isTimerStatus && timerReady))
  ) {
    return null;
  }

  // A stored record means a request was already submitted for this execution.
  const existingRecord = getRemoteTroubleshootRecord(runId, executionId);

  if (existingRecord) {
    const requestedAt = new Date(existingRecord.requestedAt);
    const formatted = requestedAt.toLocaleString(undefined, {
      month: "short",
      day: "numeric",
      hour: "numeric",
      minute: "2-digit",
    });
    return (
      <InlineStack gap="1" className="px-1 py-1">
        <Paragraph size="xs" tone="subdued">
          {config.buttonText} session opened {formatted}.
        </Paragraph>
      </InlineStack>
    );
  }

  // Deep link to this task node: current URL with the query replaced by ?nodeId=<taskName>.
  const buildExecutionUrl = () => {
    const url = new URL(window.location.href);
    url.search = "";
    url.searchParams.set("nodeId", taskName);
    return url.toString();
  };

  const handleSubmit = async (additionalComments: string) => {
    let userEmail = "";
    try {
      const user = await getUserDetails();
      // user.id is sent as user_email; this assumes the id holds the user's email.
      userEmail = user.id ?? "";
    } catch {
      // leave empty if unavailable
    }

    const payload = {
      execution_id: executionId,
      user_email: userEmail,
      pipeline_run_id: runId,
      pipeline_run_url: `${window.location.origin}${window.location.pathname}`,
      execution_url: buildExecutionUrl(),
      additional_comments: additionalComments,
      source: config.source ?? "tangle-ui",
    };

    const response = await fetch(config.endpointUrl, {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify(payload),
    });

    // Any non-2xx response is surfaced as an error so the user can retry.
    if (!response.ok) {
      throw new Error(`Remote troubleshoot request failed: ${response.status}`);
    }

    // Record the request only after the endpoint has accepted it.
    saveRemoteTroubleshootRecord(runId, executionId);
  };

  return (
    <>
      <Button
        variant="outline"
        className="w-full"
        onClick={() => setDialogOpen(true)}
      >
        {config.buttonText}
      </Button>
      <RemoteTroubleshootDialog
        open={dialogOpen}
        title={config.modalTitle ?? config.buttonText}
        description={config.modalDescription}
        successTitle={config.successTitle ?? "Request submitted"}
        successMessage={
          config.successMessage ??
          "Your request has been submitted successfully."
        }
        onSubmit={handleSubmit}
        onClose={() => setDialogOpen(false)}
      />
    </>
  );
}