Command Center

Single Dashboard. Every Incident. Every Fix. Complete Visibility.

Unified interface for infrastructure health monitoring, incident timelines, playbook management, and compliance reporting. Role-based views for executives, engineers, and auditors. Real-time updates, historical analysis, and audit trail access—all in one place.

[Start Free (3 Nodes)] [Explore Dashboard →]

1 dashboard: Complete visibility. No tool-switching required.

4 views: Role-based interfaces. Executive, Technical, Reports, Settings.

Real-time: Live updates. <5 second latency from incident to display.

Multi-tenant: MSP ready. Single dashboard, 150+ clients.

Four Views for Four Audiences

Command Center has four primary views, each optimized for different user needs. Executives see business impact. Engineers see real-time incidents. Compliance officers see audit evidence. Administrators configure the platform.

Audience: CTOs, VPs Engineering, Executives

Executive Dashboard

MTTR (30d)

92s

↓ 12% vs last month

Autonomous Resolution

87%

↑ 3% · industry avg: 0%

Engineering Time Saved

427 hrs

= 2.7 FTE / month

Cost Avoidance / mo

$34,160

vs. manual resolution

[Chart: MTTR Trend (10 weeks), W1 to W10]

[Chart: Incident Volume (Weekly). Series: Total, Autonomous (87%)]

498 hosts online (99.6%)

Compliance: 100%
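
The Engineering Time Saved and Cost Avoidance cards are consistent with a simple conversion. The sketch below reproduces them under two assumptions that are not stated on this page and are only inferred from the figures shown: 160 working hours per FTE-month and a blended rate of $80 per engineering hour.

```python
# Assumed constants: inferred from the dashboard figures, not documented values.
HOURS_PER_FTE_MONTH = 160
HOURLY_RATE_USD = 80


def fte_equivalent(hours_saved: float) -> float:
    """Convert saved engineering hours into FTE-months, rounded to one decimal."""
    return round(hours_saved / HOURS_PER_FTE_MONTH, 1)


def cost_avoidance(hours_saved: float) -> int:
    """Estimate the cost of the same hours if the work had been done manually."""
    return int(hours_saved * HOURLY_RATE_USD)


print(fte_equivalent(427))  # 2.7
print(cost_avoidance(427))  # 34160
```

If your own loaded hourly rate differs, only `HOURLY_RATE_USD` needs to change.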

What Executives Do Here

Review monthly/quarterly trends (not individual incidents)
Share metrics with board/investors
Justify platform ROI with saved engineering hours
Export executive summary reports (PDF, 1-pager)

Frequency: Weekly check-in (5 minutes)

Single Dashboard, 150+ Clients

MSPs manage dozens or hundreds of clients. SentienGuard eliminates context switching with one multi-tenant dashboard—client filtering, data isolation, per-client reporting, and role-based access across your entire portfolio.

The MSP Challenge

Managing 150 clients = 150 different dashboards:

Datadog: 150 accounts

PagerDuty: 150 instances

New Relic: 150 logins

Constant context-switching

SentienGuard Solution

Multi-Tenant Dashboard

Client: All Clients (150) ▾ | Filter by tag ▾

Total Clients

150

3,600 hosts

Incidents (24h)

41 auto · 6 manual

Portfolio Uptime

99.8%

148 nominal

Clients Requiring Attention

Beta Industries · 2 pending approvals [Action]

Gamma LLC · 1 failed playbook [Review]

148 clients: all systems nominal

Client: Acme Corp

hosts

incidents (7d)

78s

avg MTTR

Client Isolation (Security)

Data Segregation

Client A data:
  - Audit logs: s3://sentienguard-logs/client-a/
  - Metrics: timeseries-db/client-a/
  - Playbooks: namespace=client-a

Client B data:
  - Audit logs: s3://sentienguard-logs/client-b/
  - Metrics: timeseries-db/client-b/
  - Playbooks: namespace=client-b

Result: Client A cannot see Client B's data (database-level isolation)
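
As an illustration of the layout above, the following sketch derives every storage location from a single validated client ID, so one tenant's prefix can never be built from another tenant's input. The helper name `tenant_locations` and the ID pattern are hypothetical; only the three path shapes come from the listing above.

```python
import re

# Hypothetical rule: lowercase letters, digits, and single hyphens only.
_CLIENT_ID = re.compile(r"[a-z0-9]+(?:-[a-z0-9]+)*")


def tenant_locations(client_id: str) -> dict:
    """Return the per-client storage locations for audit logs, metrics, playbooks."""
    # Reject anything that could escape the tenant prefix, such as "../client-b".
    if not _CLIENT_ID.fullmatch(client_id):
        raise ValueError(f"invalid client id: {client_id!r}")
    return {
        "audit_logs": f"s3://sentienguard-logs/{client_id}/",
        "metrics": f"timeseries-db/{client_id}/",
        "playbooks": f"namespace={client_id}",
    }


print(tenant_locations("client-a")["audit_logs"])
```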

User Permissions

# MSP engineer assigned to specific clients
users:
  - email: engineer1@msp.com
    role: Remediation Authority
    clients: [client-a, client-b, client-c]  # Can only access these 3

  - email: engineer2@msp.com
    role: Remediation Authority
    clients: [client-d, client-e]  # Can only access these 2

  - email: manager@msp.com
    role: Administrator
    clients: all  # Can access all 150 clients

Access Control Enforcement

Engineer1 logs in:
  - Dashboard shows only: Client A, Client B, Client C
  - Cannot see: Client D, Client E, ... Client Z
  - Cannot switch to unauthorized clients (dropdown filtered)

Engineer1 attempts to access Client D URL directly:
  - Result: 403 Forbidden
  - Audit log: Unauthorized access attempt recorded
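
The enforcement behavior described above can be sketched as a small access gate. This is an illustration only, not SentienGuard's implementation: the `AccessGate` class, the event name, and the in-memory audit list are assumptions, while the user assignments mirror the User Permissions example.

```python
from dataclasses import dataclass, field

ALL_CLIENTS = "all"

# Assignments copied from the User Permissions example.
ASSIGNMENTS = {
    "engineer1@msp.com": {"client-a", "client-b", "client-c"},
    "engineer2@msp.com": {"client-d", "client-e"},
    "manager@msp.com": ALL_CLIENTS,
}


@dataclass
class AccessGate:
    audit_log: list = field(default_factory=list)

    def visible_clients(self, email: str, portfolio: list) -> list:
        """Clients shown in the dropdown for this user."""
        allowed = ASSIGNMENTS.get(email, set())
        if allowed == ALL_CLIENTS:
            return sorted(portfolio)
        return sorted(c for c in portfolio if c in allowed)

    def check(self, email: str, client_id: str) -> int:
        """Return an HTTP-style status; record every denied attempt."""
        allowed = ASSIGNMENTS.get(email, set())
        if allowed == ALL_CLIENTS or client_id in allowed:
            return 200
        self.audit_log.append(
            {"event": "unauthorized_access_attempt", "user": email, "client": client_id}
        )
        return 403


gate = AccessGate()
print(gate.check("engineer1@msp.com", "client-d"))  # 403
print(gate.audit_log)
```

Filtering the dropdown is a convenience; the `check` call on every request is what actually blocks a direct URL.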

MSP Reporting

Per-Client Report — Acme Corp

January 2026

Hosts: 24
Uptime: 99.8%
Auto resolved: 41 (87%)
Avg MTTR: 78s
Time saved: 35 hrs
Cost avoided: $2,800

Aggregate MSP Report

January 2026

Clients: 150
Hosts: 3,600
Time saved: 5,412 hrs
FTE equiv: 3.4 FTE
Clients/eng: 12.5 (↑ from 10)
Gross margin: 86% (↑ from 68%)

Find Any Incident in Seconds

Full-text search across all log fields, advanced multi-filter combinations, saved filter sets for recurring queries, and a visual timeline for time-based investigation.

Search & Filtering

Search incidents, hosts, playbooks…

[Last 7 days] [Production] [Success] [Autonomous] [+ Add filter]

47 results matching all filters

prod-db-03 · disk_cleanup_prod_db · 2 min · Resolved

prod-db-05 · disk_cleanup_prod_db · 1 hr · Resolved

prod-db-07 · disk_cleanup_prod_db · 3 hrs · Resolved

+ 44 more results

Saved Filter Sets

Production failures (last 24h)

env:production · result:failed · 24h

Manual approvals (this week)

actor:user · action:approval · 7d

HIPAA systems (Q4 2025)

tag:hipaa · Oct–Dec 2025
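
A saved filter set is essentially a named combination of field matches and a time window. The sketch below shows one way such a filter could be applied to incident records. The dictionary layout, the `apply_filter` helper, and the incident field names are assumptions made for illustration; the two filter definitions follow the examples above.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical in-memory form of two of the saved filter sets listed above.
SAVED_FILTERS = {
    "Production failures (last 24h)": {
        "fields": {"env": "production", "result": "failed"},
        "window": timedelta(hours=24),
    },
    "Manual approvals (this week)": {
        "fields": {"actor": "user", "action": "approval"},
        "window": timedelta(days=7),
    },
}


def apply_filter(incidents: list, saved: dict, now: datetime) -> list:
    """Keep incidents inside the time window that match every field (AND logic)."""
    cutoff = now - saved["window"]
    return [
        inc
        for inc in incidents
        if inc["time"] >= cutoff
        and all(inc.get(key) == value for key, value in saved["fields"].items())
    ]


now = datetime(2026, 2, 10, 17, 0, tzinfo=timezone.utc)
incidents = [
    {"host": "prod-api-07", "env": "production", "result": "failed",
     "time": now - timedelta(hours=3)},
    {"host": "staging-web-02", "env": "staging", "result": "failed",
     "time": now - timedelta(minutes=30)},
]
matches = apply_filter(incidents, SAVED_FILTERS["Production failures (last 24h)"], now)
print([inc["host"] for inc in matches])
```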

Visual Incident Timeline — Feb 10, 2026

14:00

prod-api-07 · k8s_pod_restart

Resolved (23s)

15:00

prod-db-03 · disk_cleanup

Resolved (87s)

15:30

prod-db-05 · postgres_reset

Awaiting approval

16:30

staging-web-02 · ssl_cert_renewal

Failed — CA unreachable

Live Dashboard, <5 Second Latency

WebSocket connections push updates to all connected clients in real time. The incident feed, metrics, approval requests, and health map all update live, with no manual refresh needed.

WebSocket Connection (Client-Server)

Browser opens dashboard:
  1. Establish WebSocket connection to control.sentienguard.com
  2. Subscribe to real-time incident feed
  3. Receive updates as incidents occur

Incident detected at 14:35:43:
  1. Control plane detects anomaly (14:35:43.124Z)
  2. WebSocket broadcast to all connected clients (14:35:43.289Z)
  3. Dashboard updates incident feed (14:35:43.450Z)

Total latency: 326ms (detection to display)
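
The flow above has two parts: a fan-out from the control plane to every connected client, and a latency measured from detection to display. The sketch below models both with the standard library only. It is not the platform's code: `BroadcastHub` uses in-process `asyncio` queues as a stand-in for real WebSocket connections, and the event fields are hypothetical.

```python
import asyncio
from datetime import datetime, timedelta

TIME_FORMAT = "%H:%M:%S.%fZ"


def latency_ms(start: str, end: str) -> int:
    """Whole milliseconds between two same-day timestamps like 14:35:43.124Z."""
    delta = datetime.strptime(end, TIME_FORMAT) - datetime.strptime(start, TIME_FORMAT)
    return delta // timedelta(milliseconds=1)


class BroadcastHub:
    """Fan-out stand-in: one queue per connected dashboard client."""

    def __init__(self):
        self._subscribers = set()

    def subscribe(self) -> asyncio.Queue:
        queue = asyncio.Queue()
        self._subscribers.add(queue)
        return queue

    def broadcast(self, event: dict) -> int:
        """Deliver the event to every subscriber; return how many received it."""
        for queue in self._subscribers:
            queue.put_nowait(event)
        return len(self._subscribers)


async def demo():
    hub = BroadcastHub()
    wall_display, laptop = hub.subscribe(), hub.subscribe()
    hub.broadcast({"host": "prod-db-03", "playbook": "disk_cleanup", "status": "executing"})
    return await wall_display.get(), await laptop.get()


print(asyncio.run(demo()))
print(latency_ms("14:35:43.124Z", "14:35:43.450Z"))  # 326
```

Using the timestamps from the example, detection to broadcast is 165 ms and broadcast to display is 161 ms, which adds up to the 326 ms total.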

What Updates in Real-Time

1. Incident Feed (Live)

New — 0 sec ago

prod-db-03 · disk_cleanup

Executing…

↓ 87 seconds later

Updated — 1 min ago

prod-db-03 · disk_cleanup

Resolved (87s MTTR)

2. Metrics (Live)

Hosts online: 498 → 499 (new host added)

Incidents (24h): 47 → 48 (new incident)

MTTR (30d): 92s → 91s (improving)

3. Approval Requests (Live)

Slack & Dashboard simultaneously

prod-db-05 · postgres_connection_reset

[Approve] [Deny]

4. Health Map (Live)

prod-db-07: High disk (anomaly detected)

↓ after resolution

prod-db-07: Healthy (resolved)

Notification Preferences

Dashboard

Banner for approvals

Sound for critical failures

Badge count for unread

Slack

Approval requests

Failed playbooks

Successful resolutions

Daily summary (8 AM)

Failed playbooks

Individual incidents

Full Dashboard Access on Mobile

Responsive layouts for on-call engineers. Approve playbooks, view incident feeds, and check infrastructure health from your phone. Push notifications for approval requests.

Incident Feed

prod-db-03

disk_cleanup

Resolved (87s)

prod-db-05

postgres_reset

Awaiting Approval

staging-web-02

ssl_cert_renewal

Failed

Approval Flow

Approval Required

prod-db-05

postgres_connection_reset

Connection pool: 98%

[Approve] [Deny]

↓ after tap

Approved

Playbook executing now…
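
Because an approval request can be answered from more than one place (the dashboard, Slack, or a phone), it helps to think of it as a small state machine. The sketch below is an assumption about how such a request might behave, not documented behavior: in particular, the rule that the first decision wins and later taps are ignored is hypothetical, as are the class and field names.

```python
from enum import Enum


class State(Enum):
    PENDING = "Awaiting Approval"
    APPROVED = "Approved"
    DENIED = "Denied"


class ApprovalRequest:
    def __init__(self, host: str, playbook: str):
        self.host = host
        self.playbook = playbook
        self.state = State.PENDING
        self.decided_by = None

    def decide(self, user: str, approve: bool, channel: str) -> bool:
        """Record the first decision only; return False if already decided."""
        if self.state is not State.PENDING:
            return False
        self.state = State.APPROVED if approve else State.DENIED
        self.decided_by = (user, channel)
        return True


request = ApprovalRequest("prod-db-05", "postgres_connection_reset")
request.decide("alice.jones", approve=True, channel="mobile")
request.decide("bob.chen", approve=False, channel="slack")  # ignored, already decided
print(request.state.value, request.decided_by)
```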

Metrics Dashboard

Today

Incidents: 12 · Resolved: 92%

This Week

MTTR: 92s · Auto: 87%

Infrastructure

498/500 hosts · 1 alert

Create, Edit, Test Playbooks in Dashboard

Built-in YAML editor with syntax highlighting, auto-completion, real-time validation, linting, dry-run testing, and full version control. No external tools needed.

Built-In YAML Editor Features

Syntax highlighting (YAML keywords, values, comments)
Auto-completion (playbook fields, step actions)
Real-time validation and error checking
Linting (best practice suggestions)
Version control (save multiple versions)

Playbook Creation Workflow

Step 1: Template

Create New Playbook

Dashboard → Playbooks → [Create New Playbook]

Template selection:
  - Blank playbook
  - Disk cleanup template
  - Service restart template
  - Kubernetes template
  - Custom command template

Selected: Service restart template

Step 2: Edit YAML

Playbook YAML Editor

name: custom_app_restart
version: 1.0.0
description: |
  Restart custom application when memory exceeds 90%.
  Gracefully stops app, clears cache, restarts, verifies health.

metadata:
  tags: ["memory", "restart", "application"]
  author: "alice.jones@company.com"
  created: "2026-02-10"

trigger:
  metric: memory_usage
  threshold: "> 90%"
  duration: 5m

approval_gate:
  required: true  # First deployment, require approval
  notify_channel: "#ops-production"

steps:
  - name: stop_application
    action: ssh_command
    command: "systemctl stop custom-app"
    timeout: 30s
    rollback: "systemctl start custom-app"

  - name: clear_cache
    action: ssh_command
    command: "rm -rf /var/cache/custom-app/*"
    timeout: 10s

  - name: start_application
    action: ssh_command
    command: "systemctl start custom-app"
    timeout: 30s
    rollback: "systemctl stop custom-app"

verification:
  - type: http
    url: "http://localhost:8080/health"
    expected_status: 200
    retry: 3
    retry_delay: 10s

  - type: metric
    metric: memory_usage
    threshold: "< 85%"

notes: |
  Initial version. Monitor success rate over 10 runs before enabling
  autonomous execution (approval_gate: required: false).

Step 3: Validate

YAML Validation

Click: [Validate YAML]

Validation results:
  ✅ Syntax: Valid YAML
  ✅ Schema: All required fields present
  ✅ Commands: Syntax valid
  ✅ Rollback: Defined for critical steps
  ⚠️  Warning: approval_gate.required=true (needs approval each time)

Lint suggestions:
  💡 Consider adding health check timeout (currently unlimited)
  💡 Add tags for better searchability

[Fix Warnings] [Save Anyway]
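
The kinds of checks that [Validate YAML] reports for custom_app_restart could be approximated as follows. This is a sketch under stated assumptions: the playbook is taken to be already parsed into a Python dict, the list of required fields is a guess at the schema, and which steps count as critical is supplied by the caller rather than derived from any documented rule.

```python
# Assumed schema: the real list of required fields is not documented here.
REQUIRED_FIELDS = ("name", "version", "trigger", "steps", "verification")


def validate_playbook(playbook: dict, critical_steps: set) -> dict:
    """Return schema and rollback errors plus non-blocking warnings."""
    errors, warnings = [], []

    for key in REQUIRED_FIELDS:
        if key not in playbook:
            errors.append(f"missing required field: {key}")

    # Every step the caller marks as critical must define a rollback command.
    for step in playbook.get("steps", []):
        if step.get("name") in critical_steps and "rollback" not in step:
            errors.append(f"step {step.get('name')}: no rollback defined")

    if playbook.get("approval_gate", {}).get("required"):
        warnings.append("approval_gate.required=true (needs approval each time)")

    # Lint-style hint: an HTTP health check without a timeout can hang.
    for check in playbook.get("verification", []):
        if check.get("type") == "http" and "timeout" not in check:
            warnings.append("http verification has no timeout")

    return {"errors": errors, "warnings": warnings}


playbook = {
    "name": "custom_app_restart",
    "version": "1.0.0",
    "trigger": {"metric": "memory_usage", "threshold": "> 90%", "duration": "5m"},
    "approval_gate": {"required": True},
    "steps": [
        {"name": "stop_application", "rollback": "systemctl start custom-app"},
        {"name": "clear_cache"},
        {"name": "start_application", "rollback": "systemctl stop custom-app"},
    ],
    "verification": [{"type": "http", "url": "http://localhost:8080/health"}],
}
print(validate_playbook(playbook, {"stop_application", "start_application"}))
```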

Step 4: Test (Dry-Run)

Dry-Run Results

Click: [Test Playbook]
Target host: staging-app-01 (dropdown)
Mode: Dry-run (no actual execution)
Click: [Run Test]

Dry-run results:
  ✅ Step 1: stop_application (simulated, 0.2s)
  ✅ Step 2: clear_cache (simulated, 0.1s)
  ✅ Step 3: start_application (simulated, 0.3s)
  ✅ Verification: http health check (simulated, PASS)
  ✅ Verification: memory check (simulated, PASS)

Total estimated duration: 87 seconds
Estimated success probability: 94% (based on similar playbooks)
[Save Playbook] [Deploy to Production]
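
The essential property of a dry run is that steps are walked in order without any command reaching the target host. A minimal sketch of that idea, assuming a hypothetical `run_playbook` function that receives the real executor as a parameter:

```python
def run_playbook(playbook: dict, execute, dry_run: bool = True) -> list:
    """Walk the steps in order; only call `execute` when dry_run is False."""
    results = []
    for number, step in enumerate(playbook["steps"], start=1):
        if dry_run:
            results.append((number, step["name"], "simulated"))
        else:
            execute(step["command"])
            results.append((number, step["name"], "executed"))
    return results


steps = {
    "steps": [
        {"name": "stop_application", "command": "systemctl stop custom-app"},
        {"name": "clear_cache", "command": "rm -rf /var/cache/custom-app/*"},
        {"name": "start_application", "command": "systemctl start custom-app"},
    ]
}

sent_commands = []  # stands in for an SSH executor
for line in run_playbook(steps, sent_commands.append, dry_run=True):
    print(line)
print(sent_commands)  # stays empty in dry-run mode
```

Passing the executor in, rather than calling it directly, makes it easy to confirm in tests that dry-run mode never touches a host.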

Step 5: Deploy

Save & Deploy

Click: [Save Playbook]

Playbook saved:
  - Name: custom_app_restart
  - Version: 1.0.0
  - Status: Active
  - Approval: Required (until proven)

Next steps:
  1. Trigger manually on staging to validate
  2. Review execution logs
  3. After 5 successful runs, consider autonomous mode

[Trigger Manually] [View in Library]

Playbook Version Control

Version History — disk_cleanup_prod_db

Current: v1.4.2

v1.4.2 (Current) · 2026-02-01

Added verification retry logic

47 runs · 96% success · by bob.chen

v1.4.1 · 2026-01-15

Increased timeout for log rotation step

142 runs · 94% success · by alice.jones

v1.4.0 · 2025-12-10

Added hash verification step

203 runs · 92% success · by bob.chen

+ 8 more versions

Diff View (v1.4.1 → v1.4.2)

name: disk_cleanup_prod_db
-version: 1.4.1
+version: 1.4.2
description: Clear disk space on production databases

steps:
  - name: clear_temp_files
    action: ssh_command
    command: "find /tmp -type f -mtime +7 -delete"
-   timeout: 30s
+   timeout: 60s  # Increased timeout for large temp directories

verification:
  - type: metric
    metric: disk_usage
    threshold: "< 80%"
+   retry: 3  # NEW: Retry verification 3 times
+   retry_delay: 10s

Changelog:
  + Added retry logic for health verification
  + Increased timeout for temp file deletion
  + Improved success rate by 2 percentage points (96% vs 94%)
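
A diff view like the one above can be produced from two stored versions with a standard unified diff. The sketch below uses Python's `difflib` on an abbreviated copy of the two versions; how the dashboard actually computes its diff is not stated, so treat this as one plausible approach. The `playbook_diff` helper is hypothetical.

```python
import difflib

# Abbreviated copies of the two versions shown in the diff view.
V1_4_1 = """\
name: disk_cleanup_prod_db
version: 1.4.1
steps:
  - name: clear_temp_files
    timeout: 30s
verification:
  - type: metric
    metric: disk_usage
""".splitlines()

V1_4_2 = """\
name: disk_cleanup_prod_db
version: 1.4.2
steps:
  - name: clear_temp_files
    timeout: 60s
verification:
  - type: metric
    metric: disk_usage
    retry: 3
    retry_delay: 10s
""".splitlines()


def playbook_diff(old: list, new: list, old_label: str, new_label: str) -> list:
    """Return unified-diff lines between two playbook versions."""
    return list(
        difflib.unified_diff(old, new, fromfile=old_label, tofile=new_label, lineterm="")
    )


for line in playbook_diff(V1_4_1, V1_4_2, "v1.4.1", "v1.4.2"):
    print(line)
```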

Common Questions

Can multiple people use the dashboard at the same time?

Yes. Unlimited concurrent users. Each sees real-time updates independently. Perfect for NOC (Network Operations Center) wall displays, team collaboration, or distributed teams.

What happens if the dashboard loses its internet connection?

The dashboard enters offline mode (banner notification) and shows the last known state. It reconnects automatically when the connection is restored, and missed updates are loaded on reconnect. No data loss.

Can I share dashboard views with people who don't have an account?

Yes. Generate shareable links with read-only access. Options include specific incident timelines, infrastructure health maps, and compliance reports. Set expiration from 1 day to never. No login required for read-only links.

How can I export data?

Multiple export options: CSV (all incidents with filters applied), JSON (machine-readable for custom tooling), PDF (executive summaries, compliance reports), API (programmatic access for custom dashboards).

Can I customize the dashboard layout?

Yes (Enterprise tier). Drag-and-drop customization: move incident feed, health map, playbook performance widgets. Hide irrelevant sections like anomaly detection or compliance widgets. Save custom layouts for different workflows.

How is Command Center different from Datadog?

Command Center shows actions taken, not just metrics. Datadog says "Disk 91%" (observation). Command Center says "Disk 91% → 72% via disk_cleanup_prod_db (87s)" (observation + action + outcome). Focus: what we did about problems, not just what problems exist.

See Command Center in Action

Deploy agents, open Command Center, watch incidents resolve in real-time, review audit trail, generate reports.

First Login Experience

3 agents deployed and connected

Baseline learning in progress (Day 2 of 7)

First autonomous resolution pending

What to explore:

Incident Timeline
Infrastructure Health
Playbook Library (50+)
Settings & RBAC

[Start Free (3 Nodes)] [Watch Demo Video →]

Free tier: 3 nodes, full Command Center access, unlimited users, no credit card.