SentienGuard

Command Center

Single Dashboard. Every Incident. Every Fix. Complete Visibility.

Unified interface for infrastructure health monitoring, incident timelines, playbook management, and compliance reporting. Role-based views for executives, engineers, and auditors. Real-time updates, historical analysis, and audit trail access—all in one place.

1 dashboard · Complete visibility · No tool-switching required
4 views · Role-based interfaces · Executive, Technical, Reports, Settings
Real-time · Live updates · <5 second latency from incident to display
Multi-tenant · MSP ready · Single dashboard, 150+ clients

Four Views for Four Audiences

Command Center has four primary views, each optimized for different user needs. Executives see business impact. Engineers see real-time incidents. Compliance officers see audit evidence. Administrators configure the platform.

Audience: CTOs, VPs Engineering, Executives

Executive Dashboard

MTTR (30d)

92s

↓ 12% vs last month

Autonomous Resolution

87%

↑ 3% · industry avg: 0%

Engineering Time Saved

427 hrs

= 2.7 FTE / month

Cost Avoidance / mo

$34,160

vs. manual resolution

[Chart: MTTR Trend (10 weeks), x-axis W1 to W10]

[Chart: Incident Volume (Weekly), series: Total, Autonomous (87%)]
498 hosts online (99.6%)
Compliance: 100%

What Executives Do Here

  • Review monthly/quarterly trends (not individual incidents)
  • Share metrics with board/investors
  • Justify platform ROI with saved engineering hours
  • Export executive summary reports (PDF, 1-pager)

Frequency: Weekly check-in (5 minutes)
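
The Engineering Time Saved and Cost Avoidance tiles above are consistent with a simple conversion. The sketch below reproduces them under two assumptions that the page does not state: 160 working hours per FTE per month and a loaded engineering rate of $80 per hour. Both constants are inferred from the displayed figures and may not match how SentienGuard actually computes them.

```python
# Assumptions (not stated on the page, inferred from the displayed numbers):
HOURS_PER_FTE_MONTH = 160   # 427 hrs / 160 rounds to 2.7 FTE
HOURLY_RATE_USD = 80        # $34,160 / 427 hrs = $80 per hour


def fte_equivalent(hours_saved: float) -> float:
    """Convert saved engineering hours into monthly FTE equivalents."""
    return round(hours_saved / HOURS_PER_FTE_MONTH, 1)


def cost_avoidance(hours_saved: float) -> int:
    """Estimate the monthly cost avoided versus manual resolution."""
    return round(hours_saved * HOURLY_RATE_USD)


hours = 427
print(f"{hours} hrs = {fte_equivalent(hours)} FTE / month")
print(f"Cost avoidance: ${cost_avoidance(hours):,}")
```

Changing either constant to match your own staffing model shifts both figures proportionally.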

Single Dashboard, 150+ Clients

MSPs manage dozens or hundreds of clients. SentienGuard eliminates context switching with one multi-tenant dashboard—client filtering, data isolation, per-client reporting, and role-based access across your entire portfolio.

The MSP Challenge

Managing 150 clients = 150 different dashboards:

Datadog: 150 accounts
PagerDuty: 150 instances
New Relic: 150 logins
Constant context-switching
SentienGuard Solution
Multi-Tenant Dashboard
Client: All Clients (150) ▾ · Filter by tag ▾

Total Clients

150

3,600 hosts

Incidents (24h)

47

41 auto · 6 manual

Portfolio Uptime

99.8%

148 nominal

Clients Requiring Attention

Beta Industries · 2 pending approvals [Action]
Gamma LLC · 1 failed playbook [Review]
148 clients: all systems nominal

Client: Acme Corp

24

hosts

12

incidents (7d)

78s

avg MTTR

Client Isolation (Security)

Data Segregation
Client A data:
  - Audit logs: s3://sentienguard-logs/client-a/
  - Metrics: timeseries-db/client-a/
  - Playbooks: namespace=client-a

Client B data:
  - Audit logs: s3://sentienguard-logs/client-b/
  - Metrics: timeseries-db/client-b/
  - Playbooks: namespace=client-b

Result: Client A cannot see Client B's data (database-level isolation)
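
As a rough illustration of the naming layout above, the following sketch derives per-client storage locations from a client ID. It only mirrors the path convention shown; the function names and the client-ID pattern are hypothetical, and the sketch is not the isolation mechanism itself, which the page describes as database-level.

```python
import re

# Hypothetical rule for client IDs: lowercase words joined by hyphens.
CLIENT_ID_PATTERN = re.compile(r"[a-z0-9]+(?:-[a-z0-9]+)*")


def tenant_locations(client_id: str) -> dict:
    """Build the per-client locations following the layout shown above."""
    if not CLIENT_ID_PATTERN.fullmatch(client_id):
        # Rejects values such as "../client-b" that could escape the prefix.
        raise ValueError(f"invalid client id: {client_id!r}")
    return {
        "audit_logs": f"s3://sentienguard-logs/{client_id}/",
        "metrics": f"timeseries-db/{client_id}/",
        "playbooks": f"namespace={client_id}",
    }


def belongs_to(client_id: str, audit_log_path: str) -> bool:
    """Check that an audit-log path sits under the client's own prefix."""
    return audit_log_path.startswith(tenant_locations(client_id)["audit_logs"])


print(tenant_locations("client-a")["audit_logs"])
print(belongs_to("client-a", "s3://sentienguard-logs/client-b/2026-01.log"))
```

The trailing slash in each prefix matters: without it, `client-a` would also match a path under `client-ab`.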
User Permissions
# MSP engineer assigned to specific clients
users:
  - email: engineer1@msp.com
    role: Remediation Authority
    clients: [client-a, client-b, client-c]  # Can only access these 3

  - email: engineer2@msp.com
    role: Remediation Authority
    clients: [client-d, client-e]  # Can only access these 2

  - email: manager@msp.com
    role: Administrator
    clients: all  # Can access all 150 clients
Access Control Enforcement
Engineer1 logs in:
  - Dashboard shows only: Client A, Client B, Client C
  - Cannot see: Client D, Client E, ... Client Z
  - Cannot switch to unauthorized clients (dropdown filtered)

Engineer1 attempts to access Client D URL directly:
  - Result: 403 Forbidden
  - Audit log: Unauthorized access attempt recorded
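
The enforcement behavior described above can be sketched as a small access check. This is a minimal illustration, assuming an in-memory user table and audit log; the `User` class, `access_client`, and `visible_clients` are hypothetical names and not part of any SentienGuard API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class User:
    email: str
    role: str
    clients: frozenset = frozenset()
    all_clients: bool = False  # mirrors `clients: all` in the permissions file


engineer1 = User("engineer1@msp.com", "Remediation Authority",
                 frozenset({"client-a", "client-b", "client-c"}))
manager = User("manager@msp.com", "Administrator", all_clients=True)

audit_log = []  # stand-in for the real audit trail


def visible_clients(user: User, portfolio: list) -> list:
    """Clients that should appear in the user's dashboard dropdown."""
    return sorted(c for c in portfolio if user.all_clients or c in user.clients)


def access_client(user: User, client_id: str) -> int:
    """Return an HTTP-style status code and record denied attempts."""
    if user.all_clients or client_id in user.clients:
        return 200
    audit_log.append({
        "event": "unauthorized_access_attempt",
        "user": user.email,
        "client": client_id,
    })
    return 403


portfolio = ["client-a", "client-b", "client-c", "client-d", "client-e"]
print(visible_clients(engineer1, portfolio))
print(access_client(engineer1, "client-d"))
print(audit_log[-1]["event"])
```

Filtering the dropdown and checking direct URL access are kept as separate functions because the second check must hold even when the first is bypassed.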

MSP Reporting

Per-Client Report — Acme Corp

January 2026

Hosts: 24
Uptime: 99.8%
Auto resolved: 41 (87%)
Avg MTTR: 78s
Time saved: 35 hrs
Cost avoided: $2,800

Aggregate MSP Report

January 2026

Clients: 150
Hosts: 3,600
Time saved: 5,412 hrs
FTE equiv: 3.4 FTE
Clients/eng: 12.5 (↑ from 10)
Gross margin: 86% (↑ from 68%)

Find Any Incident in Seconds

Full-text search across all log fields, advanced multi-filter combinations, saved filter sets for recurring queries, and a visual timeline for time-based investigation.

Search & Filtering
Search incidents, hosts, playbooks…
Search
Last 7 days · Production · Success · Autonomous · + Add filter

47 results matching all filters

prod-db-03 · disk_cleanup_prod_db · 2 min · Resolved
prod-db-05 · disk_cleanup_prod_db · 1 hr · Resolved
prod-db-07 · disk_cleanup_prod_db · 3 hrs · Resolved
+ 44 more results

Saved Filter Sets

Production failures (last 24h)

env:production · result:failed · 24h

Manual approvals (this week)

actor:user · action:approval · 7d

HIPAA systems (Q4 2025)

tag:hipaa · Oct–Dec 2025
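
To make the saved filter sets above concrete, here is a minimal sketch of how such an expression could be parsed and applied. It assumes tokens separated by `·`, `key:value` field filters, and relative windows such as `24h` or `7d`. The incident records and the function names are hypothetical, and named ranges like `Oct–Dec 2025` are deliberately not handled.

```python
import re
from datetime import datetime, timedelta
from typing import Optional

WINDOW = re.compile(r"(\d+)([hd])")  # e.g. "24h" or "7d"


def parse_filter_set(expr: str) -> tuple:
    """Split a filter expression into field filters and a time window."""
    fields: dict = {}
    window: Optional[timedelta] = None
    for token in (t.strip() for t in expr.split("·")):
        match = WINDOW.fullmatch(token)
        if match:
            amount, unit = int(match.group(1)), match.group(2)
            window = timedelta(hours=amount) if unit == "h" else timedelta(days=amount)
        elif ":" in token:
            key, value = token.split(":", 1)
            fields[key] = value
        else:
            raise ValueError(f"unsupported filter token: {token!r}")
    return fields, window


def apply_filter_set(incidents: list, expr: str, now: datetime) -> list:
    """Keep incidents that match every field filter and fall inside the window."""
    fields, window = parse_filter_set(expr)
    return [
        inc for inc in incidents
        if all(inc.get(k) == v for k, v in fields.items())
        and (window is None or now - inc["time"] <= window)
    ]


# Hypothetical sample data for illustration only.
incidents = [
    {"host": "prod-api-07", "env": "production", "result": "success",
     "time": datetime(2026, 2, 10, 14, 0)},
    {"host": "staging-web-02", "env": "staging", "result": "failed",
     "time": datetime(2026, 2, 10, 16, 30)},
    {"host": "prod-db-09", "env": "production", "result": "failed",
     "time": datetime(2026, 2, 10, 9, 0)},
    {"host": "prod-db-11", "env": "production", "result": "failed",
     "time": datetime(2026, 2, 8, 9, 0)},
]

now = datetime(2026, 2, 10, 17, 0)
matches = apply_filter_set(incidents, "env:production · result:failed · 24h", now)
print([inc["host"] for inc in matches])
```

All conditions are combined with AND, matching the "results matching all filters" wording in the search panel.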

Visual Incident Timeline — Feb 10, 2026

14:00

prod-api-07 · k8s_pod_restart

Resolved (23s)

15:00

prod-db-03 · disk_cleanup

Resolved (87s)

15:30

prod-db-05 · postgres_reset

Awaiting approval

16:30

staging-web-02 · ssl_cert_renewal

Failed — CA unreachable

Live Dashboard, <5 Second Latency

WebSocket connections push updates to all connected clients in real-time. Incident feed, metrics, approval requests, and health maps all update live—no manual refresh needed.

WebSocket Connection (Client-Server)
Browser opens dashboard:
  1. Establish WebSocket connection to control.sentienguard.com
  2. Subscribe to real-time incident feed
  3. Receive updates as incidents occur

Incident detected at 14:35:43:
  1. Control plane detects anomaly (14:35:43.124Z)
  2. WebSocket broadcast to all connected clients (14:35:43.289Z)
  3. Dashboard updates incident feed (14:35:43.450Z)

Total latency: 326ms (detection to display)
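
The 326ms figure follows directly from the three timestamps in the trace. The short sketch below recomputes each hop; the helper names are illustrative, and because the trace gives only the time of day, the parser ignores the date and would not handle an incident that spans midnight.

```python
from datetime import datetime, timedelta


def parse_ts(ts: str) -> datetime:
    """Parse a time-of-day stamp such as '14:35:43.124Z' (date omitted)."""
    return datetime.strptime(ts.rstrip("Z"), "%H:%M:%S.%f")


def latency_ms(start: str, end: str) -> float:
    """Elapsed milliseconds between two stamps on the same day."""
    return (parse_ts(end) - parse_ts(start)) / timedelta(milliseconds=1)


detected = "14:35:43.124Z"   # control plane detects anomaly
broadcast = "14:35:43.289Z"  # WebSocket broadcast to connected clients
displayed = "14:35:43.450Z"  # dashboard updates incident feed

print(f"detect -> broadcast:  {latency_ms(detected, broadcast):.0f} ms")
print(f"broadcast -> display: {latency_ms(broadcast, displayed):.0f} ms")
print(f"total:                {latency_ms(detected, displayed):.0f} ms")
```

In this example, detection to broadcast takes 165 ms and broadcast to display takes 161 ms, well inside the 5 second target.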

What Updates in Real-Time

1. Incident Feed (Live)

New — 0 sec ago

prod-db-03 · disk_cleanup

Executing…

↓ 87 seconds later

Updated — 1 min ago

prod-db-03 · disk_cleanup

Resolved (87s MTTR)

2. Metrics (Live)

Hosts online: 498 → 499 (new host added)
Incidents (24h): 47 → 48 (new incident)
MTTR (30d): 92s → 91s (improving)

3. Approval Requests (Live)

Slack & Dashboard simultaneously

prod-db-05 · postgres_connection_reset

[Approve] [Deny]

4. Health Map (Live)

prod-db-07: High disk (anomaly detected)
↓ after resolution
prod-db-07: Healthy (resolved)

Notification Preferences

Dashboard

Banner for approvals
Sound for critical failures
Badge count for unread

Slack

Approval requests
Failed playbooks
Successful resolutions

Email

Daily summary (8 AM)
Failed playbooks
Individual incidents

Full Dashboard Access on Mobile

Responsive layouts for on-call engineers. Approve playbooks, view incident feeds, and check infrastructure health from your phone. Push notifications for approval requests.

Incident Feed

prod-db-03

disk_cleanup

Resolved (87s)

prod-db-05

postgres_reset

Awaiting Approval

staging-web-02

ssl_cert_renewal

Failed

Approval Flow

Approval Required

prod-db-05

postgres_connection_reset

Connection pool: 98%

[Approve] [Deny]
↓ after tap

Approved

Playbook executing now…

Metrics Dashboard

Today

Incidents: 12 · Resolved: 92%

This Week

MTTR: 92s · Auto: 87%

Infrastructure

498/500 hosts · 1 alert

Create, Edit, Test Playbooks in Dashboard

Built-in YAML editor with syntax highlighting, auto-completion, real-time validation, linting, dry-run testing, and full version control. No external tools needed.

Built-In YAML Editor Features

  • Syntax highlighting (YAML keywords, values, comments)
  • Auto-completion (playbook fields, step actions)
  • Real-time validation and error checking
  • Linting (best practice suggestions)
  • Version control (save multiple versions)

Playbook Creation Workflow

Step 1: Template
Create New Playbook
Dashboard → Playbooks → [Create New Playbook]

Template selection:
  - Blank playbook
  - Disk cleanup template
  - Service restart template
  - Kubernetes template
  - Custom command template

Selected: Service restart template
Step 2: Edit YAML
Playbook YAML Editor
name: custom_app_restart
version: 1.0.0
description: |
  Restart custom application when memory exceeds 90%.
  Gracefully stops app, clears cache, restarts, verifies health.

metadata:
  tags: ["memory", "restart", "application"]
  author: "alice.jones@company.com"
  created: "2026-02-10"

trigger:
  metric: memory_usage
  threshold: "> 90%"
  duration: 5m

approval_gate:
  required: true  # First deployment, require approval
  notify_channel: "#ops-production"

steps:
  - name: stop_application
    action: ssh_command
    command: "systemctl stop custom-app"
    timeout: 30s
    rollback: "systemctl start custom-app"

  - name: clear_cache
    action: ssh_command
    command: "rm -rf /var/cache/custom-app/*"
    timeout: 10s

  - name: start_application
    action: ssh_command
    command: "systemctl start custom-app"
    timeout: 30s
    rollback: "systemctl stop custom-app"

verification:
  - type: http
    url: "http://localhost:8080/health"
    expected_status: 200
    retry: 3
    retry_delay: 10s

  - type: metric
    metric: memory_usage
    threshold: "< 85%"

notes: |
  Initial version. Monitor success rate over 10 runs before enabling
  autonomous execution (approval_gate: required: false).
Step 3: Validate
YAML Validation
Click: [Validate YAML]

Validation results:
  ✅ Syntax: Valid YAML
  ✅ Schema: All required fields present
  ✅ Commands: Syntax valid
  ✅ Rollback: Defined for critical steps
  ⚠️  Warning: approval_gate.required=true (needs approval each time)

Lint suggestions:
  💡 Consider adding health check timeout (currently unlimited)
  💡 Add tags for better searchability

[Fix Warnings] [Save Anyway]
Step 4: Test (Dry-Run)
Dry-Run Results
Click: [Test Playbook]
Target host: staging-app-01 (dropdown)
Mode: Dry-run (no actual execution)
Click: [Run Test]

Dry-run results:
  ✅ Step 1: stop_application (simulated, 0.2s)
  ✅ Step 2: clear_cache (simulated, 0.1s)
  ✅ Step 3: start_application (simulated, 0.3s)
  ✅ Verification: http health check (simulated, PASS)
  ✅ Verification: memory check (simulated, PASS)

Total estimated duration: 87 seconds
Estimated success probability: 94% (based on similar playbooks)
[Save Playbook] [Deploy to Production]
Step 5: Deploy
Save & Deploy
Click: [Save Playbook]

Playbook saved:
  - Name: custom_app_restart
  - Version: 1.0.0
  - Status: Active
  - Approval: Required (until proven)

Next steps:
  1. Trigger manually on staging to validate
  2. Review execution logs
  3. After 5 successful runs, consider autonomous mode

[Trigger Manually] [View in Library]
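
The schema, rollback, and approval-gate checks listed under YAML Validation can be approximated with a few lines of code. The sketch below works on a playbook that has already been parsed into a dictionary (abbreviated from custom_app_restart). The lists of required fields are assumptions made for illustration, since the page does not publish the playbook schema.

```python
# Assumed required fields; the actual playbook schema is not given on the page.
REQUIRED_FIELDS = ("name", "version", "trigger", "steps", "verification")
REQUIRED_STEP_FIELDS = ("name", "action", "command")


def validate_playbook(playbook: dict) -> dict:
    """Return schema errors and softer warnings for a parsed playbook."""
    errors, warnings = [], []
    for field_name in REQUIRED_FIELDS:
        if field_name not in playbook:
            errors.append(f"missing required field: {field_name}")
    for index, step in enumerate(playbook.get("steps", []), start=1):
        for field_name in REQUIRED_STEP_FIELDS:
            if field_name not in step:
                errors.append(f"step {index}: missing {field_name}")
        if "rollback" not in step:
            warnings.append(f"step {step.get('name', index)}: no rollback defined")
    if playbook.get("approval_gate", {}).get("required"):
        warnings.append("approval_gate.required=true (needs approval each time)")
    for check in playbook.get("verification", []):
        if check.get("type") == "http" and "timeout" not in check:
            warnings.append("http verification has no timeout")
    return {"errors": errors, "warnings": warnings}


playbook = {
    "name": "custom_app_restart",
    "version": "1.0.0",
    "trigger": {"metric": "memory_usage", "threshold": "> 90%", "duration": "5m"},
    "approval_gate": {"required": True, "notify_channel": "#ops-production"},
    "steps": [
        {"name": "stop_application", "action": "ssh_command",
         "command": "systemctl stop custom-app",
         "rollback": "systemctl start custom-app"},
        {"name": "clear_cache", "action": "ssh_command",
         "command": "rm -rf /var/cache/custom-app/*"},
        {"name": "start_application", "action": "ssh_command",
         "command": "systemctl start custom-app",
         "rollback": "systemctl stop custom-app"},
    ],
    "verification": [
        {"type": "http", "url": "http://localhost:8080/health",
         "expected_status": 200, "retry": 3},
        {"type": "metric", "metric": "memory_usage", "threshold": "< 85%"},
    ],
}

report = validate_playbook(playbook)
print("errors:", report["errors"])
for warning in report["warnings"]:
    print("warning:", warning)
```

Errors block a save while warnings do not, which loosely corresponds to the [Fix Warnings] and [Save Anyway] choices in the editor.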

Playbook Version Control

Version History — disk_cleanup_prod_db

Current: v1.4.2

v1.4.2 (Current) · 2026-02-01

Added verification retry logic

47 runs · 96% success · by bob.chen

v1.4.1 · 2026-01-15

Increased timeout for log rotation step

142 runs · 94% success · by alice.jones

v1.4.0 · 2025-12-10

Added hash verification step

203 runs · 92% success · by bob.chen

+ 8 more versions

Diff View (v1.4.1 → v1.4.2)
name: disk_cleanup_prod_db
-version: 1.4.1
+version: 1.4.2
description: Clear disk space on production databases

steps:
  - name: clear_temp_files
    action: ssh_command
    command: "find /tmp -type f -mtime +7 -delete"
-   timeout: 30s
+   timeout: 60s  # Increased timeout for large temp directories

verification:
  - type: metric
    metric: disk_usage
    threshold: "< 80%"
+   retry: 3  # NEW: Retry verification 3 times
+   retry_delay: 10s

Changelog:
  + Added retry logic for health verification
  + Increased timeout for temp file deletion
  + Improved reliability by 2 percentage points (96% vs 94% success rate)
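
A diff view of this kind can be reproduced with Python's standard `difflib` module. The sketch below compares two abbreviated copies of the playbook. It illustrates the idea only and does not claim to be how Command Center generates its diffs.

```python
import difflib

V1_4_1 = """\
name: disk_cleanup_prod_db
version: 1.4.1
steps:
  - name: clear_temp_files
    action: ssh_command
    command: "find /tmp -type f -mtime +7 -delete"
    timeout: 30s
verification:
  - type: metric
    metric: disk_usage
    threshold: "< 80%"
"""

V1_4_2 = """\
name: disk_cleanup_prod_db
version: 1.4.2
steps:
  - name: clear_temp_files
    action: ssh_command
    command: "find /tmp -type f -mtime +7 -delete"
    timeout: 60s
verification:
  - type: metric
    metric: disk_usage
    threshold: "< 80%"
    retry: 3
    retry_delay: 10s
"""

# lineterm="" keeps difflib from appending newlines to the header lines.
diff = list(difflib.unified_diff(
    V1_4_1.splitlines(), V1_4_2.splitlines(),
    fromfile="v1.4.1", tofile="v1.4.2", lineterm="",
))
print("\n".join(diff))

# Count changed lines, skipping the "---" and "+++" file headers.
added = [line for line in diff if line.startswith("+") and not line.startswith("+++")]
removed = [line for line in diff if line.startswith("-") and not line.startswith("---")]
print(f"{len(added)} lines added, {len(removed)} lines removed")
```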

Common Questions

Multiple users can view the dashboard at the same time, with no limit on concurrent users. Each user sees real-time updates independently, which suits NOC (Network Operations Center) wall displays, team collaboration, and distributed teams.

If the internet connection drops, the dashboard enters offline mode (banner notification) and shows the last known state. It reconnects automatically when the connection is restored and loads any missed updates on reconnect, so no data is lost.

Dashboard views can be shared through read-only links. Options include specific incident timelines, infrastructure health maps, and compliance reports. Expiration can be set from 1 day to never, and read-only links require no login.

Data can be exported in several formats: CSV (all incidents with filters applied), JSON (machine-readable for custom tooling), PDF (executive summaries, compliance reports), and API (programmatic access for custom dashboards).

The dashboard layout can be customized on the Enterprise tier. Drag-and-drop customization lets you move the incident feed, health map, and playbook performance widgets, and hide irrelevant sections such as anomaly detection or compliance widgets. Custom layouts can be saved for different workflows.

Command Center shows actions taken, not just metrics. Datadog says "Disk 91%" (observation). Command Center says "Disk 91% → 72% via disk_cleanup_prod_db (87s)" (observation + action + outcome). Focus: what we did about problems, not just what problems exist.

See Command Center in Action

Deploy agents, open Command Center, watch incidents resolve in real-time, review audit trail, generate reports.

First Login Experience
3 agents deployed and connected
Baseline learning in progress (Day 2 of 7)
First autonomous resolution pending

What to explore:

Incident Timeline · Infrastructure Health · Playbook Library (50+) · Settings & RBAC

Free tier: 3 nodes, full Command Center access, unlimited users, no credit card.