Traditional monitoring agent:
Listens on port 8125 (StatsD metrics)
Listens on port 8126 (APM traces)
Listens on port 9090 (HTTP metrics endpoint)
Attack surface: 3 inbound ports
SentienGuard agent:
Listens on: NOTHING
Initiates outbound to: control.sentienguard.com:443
Attack surface: 0 inbound ports

Agent Architecture
Lightweight. Outbound-Only. Zero Inbound Attack Surface.
50 MB agent binary, <100 MB resident memory, <0.5% CPU steady-state. Outbound HTTPS only—no listening ports, no inbound connections, no VPN dependency. Deploys in 2 minutes via curl, Helm, or Docker.
Four Principles That Define Agent Architecture
Outbound-Only Communication
Why: Inbound ports = attack surface. Every listening port is a potential entry point for exploitation.
Implementation
- Agent initiates all connections to control plane
- Protocol: HTTPS on port 443 (outbound)
- No inbound ports opened on host
- No listening services exposed
- Firewall-friendly (works through NAT, proxies, corporate firewalls)
What This Prevents
- Port scanning attacks (no services to discover)
- Remote code execution via exposed endpoints
- Lateral movement (compromised agent can't accept commands from attacker)
- Network-based exploits targeting agent services
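The outbound-only pattern can be sketched in Go (the language the agent is compiled from). This is an illustrative sketch: the `/v1/heartbeat` path and `newHeartbeatRequest` helper are assumptions, not documented API.

```go
package main

import (
	"bytes"
	"fmt"
	"net/http"
)

// newHeartbeatRequest builds the outbound request the agent sends.
// Direction is the point: the agent is always the HTTP client, never a
// server. Nothing in the agent calls net.Listen or http.ListenAndServe,
// so there is no inbound port to scan or exploit.
// The /v1/heartbeat path is illustrative, not a documented endpoint.
func newHeartbeatRequest(controlPlane string, payload []byte) (*http.Request, error) {
	return http.NewRequest(http.MethodPost, controlPlane+"/v1/heartbeat", bytes.NewReader(payload))
}

func main() {
	req, err := newHeartbeatRequest("https://control.sentienguard.com", []byte(`{"cpu":12.4}`))
	if err != nil {
		panic(err)
	}
	fmt.Println(req.Method, req.URL.String()) // POST https://control.sentienguard.com/v1/heartbeat
}
```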
Minimal Footprint
Why: Production servers have constrained resources. Agent overhead must be negligible.
Implementation
- Binary size: 50 MB (statically compiled Go)
- Resident memory: <100 MB RSS (steady-state)
- CPU usage: <0.5% average, 2% peak during playbook execution
- Disk usage: 200 MB (binary + logs + cache)
- Network bandwidth: ~2 KB/s average outbound (metrics batched every 30s, ~50 KB per batch)
Stateless Operation
Why: With operational state stored centrally, an agent restart means zero data loss.
Implementation
- Metrics sent to control plane immediately (not buffered locally)
- Playbook execution state reported in real-time
- Configuration fetched from control plane on startup
- Historical data stored in control plane time-series DB
Idempotent Execution
Why: Network failures, timeouts, and retries must not cause duplicate actions.
Implementation
- Playbook validation rejects non-idempotent operations
- State checks before execution ("if disk already <80%, skip cleanup")
- Conditional steps ("only if service not running")
- Health verification after execution (confirms desired state reached)
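A minimal Go sketch of the state-check principle, using the 80% disk threshold from the example above (`shouldRunCleanup` and the sample values are illustrative):

```go
package main

import "fmt"

// shouldRunCleanup sketches "state checks before execution": the step
// fires only when observed state differs from desired state, so
// re-running it after a network retry is a no-op.
// The 80% threshold mirrors the "if disk already <80%, skip cleanup"
// example; diskUsagePercent would come from the metrics collector.
func shouldRunCleanup(diskUsagePercent float64) bool {
	const threshold = 80.0
	return diskUsagePercent >= threshold
}

func main() {
	fmt.Println(shouldRunCleanup(91.4)) // true: incident, run cleanup
	fmt.Println(shouldRunCleanup(72.1)) // false: already healthy, skip
}
```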
/opt/sentienguard/
├── bin/
│ └── sentienguard-agent # 50 MB
├── config/
│ └── agent.yaml # 2 KB
├── logs/
│ └── agent.log # 10-50 MB (rotated)
└── cache/
└── playbooks/ # 10-20 MB

| Agent | Binary Size | Memory (RSS) | CPU (avg) |
|---|---|---|---|
| Datadog | 200 MB | 200-300 MB | 1-2% |
| New Relic | 150 MB | 150-250 MB | 1-3% |
| Prometheus Node Exporter | 20 MB | 50-80 MB | 0.3-0.5% |
| SentienGuard | 50 MB | <100 MB | <0.5% |
# Agent crashes or restarts
systemctl restart sentienguard-agent
# What happens:
# 1. Agent reconnects to control plane (<5 seconds)
# 2. Fetches current configuration
# 3. Resumes metric collection from current state
# 4. No data loss (metrics already sent)
# 5. No incident interruption (state on control plane)

# Idempotent playbook step
- name: clear_temp_files
command: "find /tmp -type f -mtime +7 -delete"
# First run: Deletes 100 files
# Second run: Deletes 0 files (already gone)
# Third run: Deletes 0 files
# Result: Same outcome regardless of runs

# BAD - not idempotent
- name: increment_counter
command: "echo $(( $(cat /var/counter 2>/dev/null || echo 0) + 1 )) > /var/counter"
# First run: counter = 1
# Second run: counter = 2 (wrong!)
# Problem: Side effects accumulate

Three Ways to Deploy Agents
Choose the deployment model that fits your infrastructure. All options deliver the same agent binary with identical capabilities.
Supported Platforms
# One-line install
curl -sSL https://get.sentienguard.com/install | bash
# What this does:
# 1. Downloads agent binary (50 MB, GPG-signed)
# 2. Verifies GPG signature
# 3. Installs to /opt/sentienguard/
# 4. Creates systemd service
# 5. Starts agent, enables auto-start on boot
# Verify installation
systemctl status sentienguard-agent
# Expected: "active (running)"
# View logs
tail -f /var/log/sentienguard/agent.log
# Expected: "Connected to control plane, heartbeat every 30s"

# Download binary and its detached signature manually
wget https://releases.sentienguard.com/agent/v1.4.2/sentienguard-agent-linux-amd64
wget https://releases.sentienguard.com/agent/v1.4.2/sentienguard-agent-linux-amd64.sig
# Verify signature
gpg --verify sentienguard-agent-linux-amd64.sig
# Install
sudo cp sentienguard-agent-linux-amd64 /opt/sentienguard/bin/sentienguard-agent
sudo chmod +x /opt/sentienguard/bin/sentienguard-agent
# Configure
sudo tee /opt/sentienguard/config/agent.yaml <<EOF
api_key: YOUR_API_KEY
control_plane: https://control.sentienguard.com
environment: production
EOF
# Create systemd service
sudo tee /etc/systemd/system/sentienguard-agent.service <<EOF
[Unit]
Description=SentienGuard Agent
After=network.target
[Service]
Type=simple
User=sentienguard
ExecStart=/opt/sentienguard/bin/sentienguard-agent --config /opt/sentienguard/config/agent.yaml
Restart=always
RestartSec=10
[Install]
WantedBy=multi-user.target
EOF
# Start service
sudo systemctl daemon-reload
sudo systemctl enable sentienguard-agent
sudo systemctl start sentienguard-agent

2 minutes
Install time
200 MB
Disk space
<100 MB
Memory
<0.5%
CPU
Metrics Collected Every 30 Seconds
Agents collect infrastructure, process, Kubernetes, and application metrics using kernel-level instrumentation with negligible performance overhead.
Infrastructure Metrics
CPU
- Usage per core (user, system, idle, iowait)
- Load average (1min, 5min, 15min)
- Context switches per second
- CPU throttling events (cgroups)
Memory
- Total, used, available, free
- Swap usage
- Memory pressure (PSI)
- Cache and buffer usage
Disk
- Usage per filesystem (%, bytes)
- Inode usage (%, count)
- Read/write IOPS
- Read/write throughput (MB/s)
- I/O wait time
Network
- Bytes in/out per interface
- Packets in/out
- Packet loss rate
- Network errors (dropped, collisions)
- Connection count (TCP, UDP)
eBPF (extended Berkeley Packet Filter) for kernel-level metrics. /proc filesystem for process metrics. sysfs for system metrics. Negligible overhead (collection happens in kernel space).
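The /proc interface is plain text, so collection is simple parsing. Below is a sketch of parsing a /proc/meminfo-style snapshot: the field names are real kernel fields, but `parseMemInfo` is illustrative, not the agent's actual collector.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseMemInfo extracts fields from /proc/meminfo-style text, the
// interface the agent reads for memory metrics. Values are in kB, as
// the kernel reports them. (This sketch parses a string; a live agent
// would read /proc/meminfo from the filesystem.)
func parseMemInfo(text string) map[string]int64 {
	out := make(map[string]int64)
	for _, line := range strings.Split(text, "\n") {
		fields := strings.Fields(line)
		if len(fields) < 2 {
			continue
		}
		key := strings.TrimSuffix(fields[0], ":")
		if v, err := strconv.ParseInt(fields[1], 10, 64); err == nil {
			out[key] = v
		}
	}
	return out
}

func main() {
	sample := "MemTotal:       16384000 kB\nMemAvailable:    5242880 kB"
	m := parseMemInfo(sample)
	fmt.Println(m["MemTotal"], m["MemAvailable"]) // 16384000 5242880
}
```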
Process Metrics
Per-Process Data
- Process count (total, running, sleeping, zombie)
- CPU usage per process
- Memory (RSS, VSZ) per process
- Open file descriptors per process
- Process state (running, sleeping, zombie, stopped)
Service Health
- systemd service status (active, inactive, failed)
- Service restart count
- Service uptime
- Service memory usage
Kubernetes Metrics
If Applicable

Pod-Level
- Pod status (Running, Pending, Failed, CrashLoopBackOff)
- Container restarts
- Resource usage (CPU, memory per container)
- Pod events (OOMKilled, Evicted, etc.)
Node-Level
- Node status (Ready, NotReady, SchedulingDisabled)
- Allocatable resources vs capacity
- Pod count per node
- Node conditions (DiskPressure, MemoryPressure, PIDPressure)
Deployment-Level
- Desired vs available replicas
- Rollout status
- Deployment events
Kubernetes API via ServiceAccount credentials. Metrics from kubelet (node-level). Events from API server.
Application Context
Optional

OpenTelemetry Integration
- Request rate
- Error rate
- Latency percentiles (p50, p95, p99)
- Active requests
Database Connections
- Active connection count (PostgreSQL/MySQL)
- Query performance monitoring
HTTP Endpoints
- Health check endpoint monitoring
- Response status codes
{
"process": "postgresql",
"pid": 1234,
"cpu_percent": 12.4,
"memory_rss_mb": 2048,
"memory_percent": 12.8,
"open_files": 147,
"state": "running",
"uptime_seconds": 864000
}

What Agents DON'T Collect
- × Application logs (only infrastructure events)
- × User data or PII
- × Application source code
- × Environment variables with secrets
- × Custom business metrics (unless explicitly configured)
Defense-in-Depth Security Architecture
Five layers of security from network to secrets. Each layer independently prevents a class of attacks, so compromise of one layer doesn't compromise the system.
Outbound-Only Communication
Agent → Control Plane: Outbound HTTPS (443), TLS 1.3, certificate pinning. No inbound connections accepted.
What This Prevents
- Network-based attacks (no listening ports)
- Reverse shells (agent can't accept inbound connections)
- Port scanning (no services to discover)
TLS 1.3 with Certificate Pinning
SentienGuard CA certificate hash embedded in agent binary at compile time. Connection refused if mismatch—no fallback to CA trust chain.
What This Prevents
- Man-in-the-middle attacks
- Certificate authority compromise
- Rogue control plane impersonation
Cryptographic Playbook Signing
Ed25519 signatures on every playbook. Agent verifies signature, checks timestamp freshness (<5 min), and confirms target host before execution.
What This Prevents
- Unauthorized playbook injection
- Replay attacks (timestamp freshness check)
- Playbook tampering
- Wrong-host execution
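The signature check can be demonstrated with Go's standard crypto/ed25519 package. This is a generic sketch of sign-then-verify, not SentienGuard's actual verification code; the timestamp and target-host checks are covered separately in verifyPlaybook below.

```go
package main

import (
	"crypto/ed25519"
	"crypto/rand"
	"fmt"
)

// signAndVerify sketches the signing model: the control plane signs the
// playbook bytes with its private key, and the agent verifies with the
// public key embedded in its binary. A single flipped byte invalidates
// the signature, which is what blocks playbook tampering.
func signAndVerify(payload []byte) (authentic, tamperedOK bool) {
	pub, priv, err := ed25519.GenerateKey(rand.Reader)
	if err != nil {
		panic(err)
	}
	sig := ed25519.Sign(priv, payload)
	authentic = ed25519.Verify(pub, payload, sig)

	tampered := append([]byte(nil), payload...)
	tampered[0] ^= 0xFF // attacker flips one byte
	tamperedOK = ed25519.Verify(pub, tampered, sig)
	return authentic, tamperedOK
}

func main() {
	ok, bad := signAndVerify([]byte(`{"playbook":"disk_cleanup_prod_db"}`))
	fmt.Println(ok, bad) // true false
}
```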
Non-Root Execution
Agent runs as dedicated sentienguard user. Sudo with explicit allow-list for commands requiring root.
What This Prevents
- Privilege escalation
- System-wide damage (limited to sentienguard user permissions)
- Lateral movement (can't modify other services)
Secret Management
API keys: file permissions (0600). SSH keys: AWS Secrets Manager / Azure Key Vault / HashiCorp Vault with just-in-time retrieval. Cloud credentials: IAM roles, no static keys.
What This Prevents
- Secrets in version control
- Secrets in logs (automatic redaction)
- Long-lived credentials (automatic rotation)
# Agent needs ONLY outbound HTTPS
iptables -A OUTPUT -p tcp --dport 443 -j ACCEPT
# Allow replies to agent-initiated connections (required for outbound HTTPS to work)
iptables -A INPUT -m conntrack --ctstate ESTABLISHED,RELATED -j ACCEPT
iptables -A INPUT -j DROP   # Deny all new inbound
# Note: also allow DNS (port 53) if control.sentienguard.com must resolve
iptables -A OUTPUT -j DROP  # Deny all other outbound

// Pseudocode: Agent TLS configuration
tlsConfig := &tls.Config{
MinVersion: tls.VersionTLS13,
InsecureSkipVerify: false,
VerifyPeerCertificate: func(rawCerts [][]byte,
verifiedChains [][]*x509.Certificate) error {
// Verify certificate matches pinned hash
expectedHash := "sha256:a3f8b9c2d1e4..."
actualHash := sha256(rawCerts[0])
if actualHash != expectedHash {
return errors.New("certificate pinning failed")
}
return nil
},
}

{
"playbook": "disk_cleanup_prod_db",
"version": "1.4.2",
"target_host": "prod-db-03",
"timestamp": "2026-02-10T14:35:43Z",
"steps": ["..."],
"signature": "ed25519:a8f3b2c1d9e4f5a6..."
}

func verifyPlaybook(payload Playbook) error {
// 1. Extract signature
signature := payload.Signature
// 2. Verify with control plane public key
publicKey := loadEmbeddedPublicKey()
valid := ed25519.Verify(publicKey, payload.Bytes(), signature)
if !valid {
return errors.New("invalid signature")
}
// 3. Check timestamp freshness (<5 minutes)
age := time.Since(payload.Timestamp)
if age > 5*time.Minute {
return errors.New("playbook too old, possible replay attack")
}
// 4. Verify target host matches
if payload.TargetHost != agent.Hostname {
return errors.New("playbook not intended for this host")
}
return nil // All checks passed
}

# Agent runs as dedicated user (not root)
useradd -r -s /bin/false sentienguard
chown -R sentienguard:sentienguard /opt/sentienguard/
# Systemd service runs as sentienguard user
[Service]
User=sentienguard
Group=sentienguard

# /etc/sudoers.d/sentienguard
# Default deny first: in sudoers, later entries override earlier ones,
# so the deny rule must precede the allow-list
sentienguard ALL=(ALL) !ALL
# Allow specific commands only
sentienguard ALL=(ALL) NOPASSWD: /usr/bin/systemctl restart *
sentienguard ALL=(ALL) NOPASSWD: /usr/bin/find /tmp -type f -mtime +7 -delete
sentienguard ALL=(ALL) NOPASSWD: /usr/sbin/logrotate -f /etc/logrotate.conf

- name: restart_database
action: ssh_command
command: |
  # Retrieve the secret just-in-time; never written to disk or config
  PASSWORD=$(aws secretsmanager get-secret-value \
    --secret-id prod-db-password \
    --query SecretString \
    --output text)
  sudo systemctl restart postgresql
  # Verify the database accepts connections before reporting success
  PGPASSWORD="$PASSWORD" psql -h localhost -U postgres -c "SELECT 1" >/dev/null
secrets:
  - aws_secret: prod-db-password # Retrieved just-in-time

From Install to Updates
Five-stage lifecycle covering installation, steady-state operation, incident response, offline resilience, and automatic updates.
Installation
2 minutes
Download GPG-signed binary, install to /opt/sentienguard/, create systemd service, connect to control plane, register host, receive configuration, begin metric collection.
Normal Operation
30s heartbeat
Every 30 seconds: collect metrics (CPU, memory, disk, network, processes), batch and send to control plane via HTTPS POST, check for pending playbook executions. CPU: <0.5%, Memory: ~80 MB, Network: ~50 KB per heartbeat.
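The batch-and-flush heartbeat cycle can be sketched as follows. `metricBatch` and the sample format are illustrative; in the real agent a 30-second ticker would drive the flush and the HTTPS POST (omitted here).

```go
package main

import "fmt"

// metricBatch sketches the heartbeat cycle: samples accumulate in
// memory between heartbeats, and every tick the batch is flushed in a
// single HTTPS POST. A time.Ticker firing every 30 seconds would call
// flush in the real agent.
type metricBatch struct {
	samples []string
}

func (b *metricBatch) add(sample string) {
	b.samples = append(b.samples, sample)
}

// flush returns the pending samples and resets the batch, so each
// heartbeat sends every sample exactly once.
func (b *metricBatch) flush() []string {
	out := b.samples
	b.samples = nil
	return out
}

func main() {
	var b metricBatch
	b.add("cpu=12.4")
	b.add("mem=68.2")
	b.add("disk=72.1")
	fmt.Println(len(b.flush())) // 3 samples sent this heartbeat
	fmt.Println(len(b.flush())) // 0: nothing pending until next tick
}
```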
Incident Response
10-90 seconds
Control plane detects anomaly, selects and signs playbook, sends to agent. Agent verifies Ed25519 signature, checks timestamp freshness, executes steps sequentially, captures stdout/stderr, verifies health, reports results.
Offline Resilience
Up to 24 hours
If control plane unreachable: continue collecting metrics locally (cached 5 min), execute cached playbooks if incidents detected, queue audit logs, retry connection every 30 seconds. Max 24h offline before pausing execution.
Updates
~5s downtime
Weekly release cycle: control plane notifies agents, agent downloads new binary, verifies GPG signature, replaces binary, restarts service, reconnects. Rollback available if new version breaks.
curl -sSL https://get.sentienguard.com/install | bash
# Behind the scenes:
# 1. Download agent binary from releases.sentienguard.com
# 2. Verify GPG signature (pub key embedded in install script)
# 3. Copy binary to /opt/sentienguard/bin/
# 4. Generate default config
# 5. Create systemd service
# 6. Start service, enable auto-start on boot

Control plane down → Disk fills → Agent detects anomaly
→ Checks cache for disk_cleanup playbook
→ Found (executed 2 days ago, cached)
→ Executes from cache
→ Incident resolved
→ Queues audit log for upload when online

[2026-02-10 14:35:12] INFO: Heartbeat sent (30s interval)
[2026-02-10 14:35:12] INFO: Metrics: cpu=12.4%, mem=68.2%, disk=72.1%
[2026-02-10 14:35:12] INFO: No pending playbooks
[2026-02-10 14:35:42] INFO: Heartbeat sent (30s interval)
[2026-02-10 14:35:42] INFO: Anomaly detected: disk_usage=91.4% (4.8σ)
[2026-02-10 14:35:43] INFO: Playbook received: disk_cleanup_prod_db
[2026-02-10 14:35:43] INFO: Signature verified, executing playbook
[2026-02-10 14:37:09] INFO: Playbook completed successfully (87s)
[2026-02-10 14:37:09] INFO: Health verification: disk_usage=72.1% (PASS)

# agent.yaml
updates:
automatic: true
schedule: "weekly" # Check every Sunday 2 AM
window: "02:00-06:00" # Only update during window

# agent.yaml
updates:
automatic: false
# Update manually:
$ sentienguard-agent update
$ systemctl restart sentienguard-agent
# Rollback if needed:
$ sentienguard-agent rollback
$ systemctl restart sentienguard-agent

Execution Isolation (Stage 3)
Concurrency
One playbook at a time (serialized)
Queuing
New requests queued if one running
Timeout
5 minutes max (configurable per playbook)
Failure
Automatic rollback if health check fails
Deploy Anywhere: AWS, GCP, Azure, On-Prem
Same agent binary, same capabilities, every environment. Cloud-native credential integration for each provider.
Supported Services
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": [
"ec2:DescribeInstances",
"ec2:DescribeTags",
"cloudwatch:PutMetricData",
"secretsmanager:GetSecretValue"
],
"Resource": "*"
}
]
}

#!/bin/bash
# Install agent on EC2 instance launch
curl -sSL https://get.sentienguard.com/install | bash
echo "api_key: $SENTIENGUARD_API_KEY" >> /opt/sentienguard/config/agent.yaml
systemctl start sentienguard-agent

Monitoring the Monitors
Who monitors the monitoring agent? Agent health dashboard tracks status, version distribution, and resource usage across your entire fleet.
Agent Status (per host)
Status
Online / Offline
Last heartbeat
Timestamp
Version
Installed
Uptime
Duration
Resources
CPU, Memory, Disk
┌─────────────────────────────────────────────────────────┐
│ Agent Health (500 nodes) │
├─────────────────────────────────────────────────────────┤
│ ✅ Online: 498 │
│ ⚠️ Offline: 2 │
│ - prod-db-12 (offline 15min, heartbeat timeout) │
│ - staging-web-03 (offline 2h, host unreachable) │
├─────────────────────────────────────────────────────────┤
│ Agent Version Distribution: │
│ v1.4.2: 487 nodes (97%) │
│ v1.4.1: 11 nodes (2%) [Update available] │
│ v1.3.9: 2 nodes (1%) [Critical update needed] │
├─────────────────────────────────────────────────────────┤
│ Resource Usage (avg across all agents): │
│ CPU: 0.4% Memory: 82 MB Network: 48 KB/batch │
└─────────────────────────────────────────────────────────┘

Troubleshooting Commands
# Check service status
systemctl status sentienguard-agent
# Check network connectivity
curl -v https://control.sentienguard.com/health
# Check logs
tail -f /var/log/sentienguard/agent.log
# Test API key
sentienguard-agent test-connection

# Check what agent is doing
strace -p $(pgrep sentienguard-agent)
# Check if playbook running
ps aux | grep sentienguard
# View recent playbook executions
sentienguard-agent playbook-history

# Restart agent
systemctl restart sentienguard-agent
# If still offline, check control plane connectivity
ping control.sentienguard.com
telnet control.sentienguard.com 443
# Check firewall rules
iptables -L -n | grep 443

Common Questions
Does the agent require root?
No. Agent runs as dedicated sentienguard user (non-root). Some playbooks require root commands (service restarts, disk operations)—use sudo with explicit allow-list for these commands only. See deployment docs for sudo configuration.
What happens if the agent crashes?
Systemd automatically restarts the agent (RestartSec=10s). Agent reconnects to control plane, fetches current config, resumes metric collection. No data loss—metrics already sent to control plane before crash. Operational state stored in control plane, not agent.
Can we keep all data on-premises?
Yes, with Enterprise tier. Deploy on-premises control plane in your data center. Agents communicate with internal control plane (not cloud). All data stays within your network. Contact sales for air-gapped deployment architecture.
How much network bandwidth does the agent use?
~2 KB/s outbound average. Metrics batched every 30 seconds (~50 KB per batch). Playbook downloads negligible (10-50 KB per playbook). Total: 150-250 MB/day per agent. For 500 agents: 75-125 GB/day outbound from your infrastructure.
Does the agent work behind an HTTP proxy?
Yes. Set HTTP_PROXY and HTTPS_PROXY environment variables, then restart the agent. Agent respects standard proxy environment variables.
Can specific hosts be excluded from a playbook?
Yes. Playbook metadata includes exclusions (e.g., host_pattern: "*.prod.*" to never run on production). Or disable via dashboard: Playbooks → disk_cleanup → Disable on prod-db-03.
What if a playbook makes things worse?
Every playbook includes rollback steps. If health verification fails, agent automatically reverts changes. Example: Playbook restarts wrong service → health check fails → rollback restarts original service. Complete audit trail shows what happened for post-incident review.
Deploy Your First Agent
Install agent on Linux server in 2 minutes. Watch metrics flow to dashboard. Import playbook library. Trigger test incident. See autonomous resolution.
curl -sSL https://get.sentienguard.com/install | bash

Free tier: 3 agents, unlimited playbooks, full audit logs, no credit card.