SentienGuard

Agent Architecture

Lightweight. Outbound-Only. Zero Inbound Attack Surface.

50 MB agent binary, <100 MB resident memory, <0.5% CPU steady-state. Outbound HTTPS only—no listening ports, no inbound connections, no VPN dependency. Deploys in 2 minutes via curl, Helm, or Docker.

  • 50 MB: agent binary size (compiled Go binary)
  • <100 MB: resident memory (RSS), steady-state operation
  • 0 ports: inbound listening, outbound HTTPS (443) only
  • 2 min: install to first metric (one-line deployment)

Four Principles That Define Agent Architecture

Outbound-Only Communication

Why: Inbound ports = attack surface. Every listening port is a potential entry point for exploitation.

Implementation

  • Agent initiates all connections to control plane
  • Protocol: HTTPS on port 443 (outbound)
  • No inbound ports opened on host
  • No listening services exposed
  • Firewall-friendly (works through NAT, proxies, corporate firewalls)

What This Prevents

  • Port scanning attacks (no services to discover)
  • Remote code execution via exposed endpoints
  • Lateral movement (compromised agent can't accept commands from attacker)
  • Network-based exploits targeting agent services

Minimal Footprint

Why: Production servers have constrained resources. Agent overhead must be negligible.

Implementation

  • Binary size: 50 MB (statically compiled Go)
  • Resident memory: <100 MB RSS (steady-state)
  • CPU usage: <0.5% average, 2% peak during playbook execution
  • Disk usage: 200 MB (binary + logs + cache)
  • Network bandwidth: <100 KB/s outbound (metrics batched every 30s)

Stateless Operation

Why: Operational state stored centrally means agent restart = zero data loss.

Implementation

  • Metrics sent to control plane in each 30-second batch (not stored long-term on the host)
  • Playbook execution state reported in real-time
  • Configuration fetched from control plane on startup
  • Historical data stored in control plane time-series DB

Idempotent Execution

Why: Network failures, timeouts, retries must not cause duplicate actions.

Implementation

  • Playbook validation rejects non-idempotent operations
  • State checks before execution ("if disk already <80%, skip cleanup")
  • Conditional steps ("only if service not running")
  • Health verification after execution (confirms desired state reached)
Attack Surface Comparison
Traditional monitoring agent:
  Listens on port 8125 (StatsD metrics)
  Listens on port 8126 (APM traces)
  Listens on port 9090 (HTTP metrics endpoint)
  Attack surface: 3 inbound ports

SentienGuard agent:
  Listens on: NOTHING
  Initiates outbound to: control.sentienguard.com:443
  Attack surface: 0 inbound ports
Installation Footprint
/opt/sentienguard/
├── bin/
│   └── sentienguard-agent          # 50 MB
├── config/
│   └── agent.yaml                  # 2 KB
├── logs/
│   └── agent.log                   # 10-50 MB (rotated)
└── cache/
    └── playbooks/                  # 10-20 MB
Agent                      Binary Size   Memory (RSS)   CPU (avg)
Datadog                    200 MB        200-300 MB     1-2%
New Relic                  150 MB        150-250 MB     1-3%
Prometheus Node Exporter   20 MB         50-80 MB       0.3-0.5%
SentienGuard               50 MB         <100 MB        <0.5%
Stateless: Restart Behavior
# Agent crashes or restarts
systemctl restart sentienguard-agent

# What happens:
# 1. Agent reconnects to control plane (<5 seconds)
# 2. Fetches current configuration
# 3. Resumes metric collection from current state
# 4. No data loss (metrics already sent)
# 5. No incident interruption (state on control plane)
Idempotent (Good)
# Idempotent playbook step
- name: clear_temp_files
  command: "find /tmp -type f -mtime +7 -delete"

# First run: Deletes 100 files
# Second run: Deletes 0 files (already gone)
# Third run: Deletes 0 files
# Result: Same outcome regardless of runs
Non-Idempotent (Avoided)
# BAD - not idempotent
- name: increment_counter
  command: "echo $(( $(cat /var/counter) + 1 )) > /var/counter"

# First run: counter = 1
# Second run: counter = 2 (wrong!)
# Problem: Side effects accumulate
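The state-check, conditional-step, and health-verification pattern described under Idempotent Execution can be sketched in Go. This is a minimal illustration, not the agent's actual code: the Step type, RunStep, and newDiskCleanup are hypothetical names, and disk usage is simulated with an in-memory integer so the example runs without touching a filesystem.

```go
package main

import (
	"errors"
	"fmt"
)

// Step is a hypothetical playbook step. Satisfied reports whether the
// desired state already holds; Apply performs the change.
type Step struct {
	Name      string
	Satisfied func() bool
	Apply     func() error
}

// RunStep skips the step when the desired state already holds, applies it
// otherwise, and then verifies that the desired state was reached.
// The boolean result reports whether Apply was called.
func RunStep(s Step) (bool, error) {
	if s.Satisfied() {
		return false, nil // state check passed: nothing to do
	}
	if err := s.Apply(); err != nil {
		return true, fmt.Errorf("%s: apply failed: %w", s.Name, err)
	}
	if !s.Satisfied() {
		return true, errors.New(s.Name + ": health verification failed")
	}
	return true, nil
}

// newDiskCleanup builds a simulated cleanup step over an in-memory disk
// usage percentage ("if disk already <80%, skip cleanup").
func newDiskCleanup(usagePercent *int) Step {
	return Step{
		Name:      "disk_cleanup",
		Satisfied: func() bool { return *usagePercent < 80 },
		Apply:     func() error { *usagePercent = 72; return nil },
	}
}

func main() {
	usage := 91
	step := newDiskCleanup(&usage)
	// Retrying the same step is safe: only the first run changes anything.
	for run := 1; run <= 3; run++ {
		applied, err := RunStep(step)
		fmt.Printf("run %d: applied=%v err=%v usage=%d%%\n", run, applied, err, usage)
	}
}
```

Because the check runs before the action, a retry after a timeout or dropped connection finds the desired state already in place and does nothing.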

Three Ways to Deploy Agents

Choose the deployment model that fits your infrastructure. All options deliver the same agent binary with identical capabilities.

Supported Platforms

  • Ubuntu 20.04+ (x86_64, ARM64)
  • CentOS 7+ (x86_64)
  • Debian 10+ (x86_64, ARM64)
  • RHEL 8+ (x86_64, ARM64)
One-Line Install
# One-line install
curl -sSL https://get.sentienguard.com/install | bash

# What this does:
# 1. Downloads agent binary (50 MB, GPG-signed)
# 2. Verifies GPG signature
# 3. Installs to /opt/sentienguard/
# 4. Creates systemd service
# 5. Starts agent, enables auto-start on boot

# Verify installation
systemctl status sentienguard-agent
# Expected: "active (running)"

# View logs
tail -f /var/log/sentienguard/agent.log
# Expected: "Connected to control plane, heartbeat every 30s"
Manual Install (Air-Gapped)
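# Download binary on a machine with internet access, then copy it to the target host
wget https://releases.sentienguard.com/agent/v1.4.2/sentienguard-agent-linux-amd64

# Verify signature (the detached .sig file must sit next to the binary)
gpg --verify sentienguard-agent-linux-amd64.sig sentienguard-agent-linux-amd64

# Create the service user and directories
sudo useradd -r -s /bin/false sentienguard
sudo mkdir -p /opt/sentienguard/bin /opt/sentienguard/config

# Install
sudo cp sentienguard-agent-linux-amd64 /opt/sentienguard/bin/sentienguard-agent
sudo chmod +x /opt/sentienguard/bin/sentienguard-agent

# Configure (the file holds the API key, so restrict it to the service user)
sudo tee /opt/sentienguard/config/agent.yaml > /dev/null <<EOF
api_key: YOUR_API_KEY
control_plane: https://control.sentienguard.com
environment: production
EOF
sudo chown -R sentienguard:sentienguard /opt/sentienguard/
sudo chmod 600 /opt/sentienguard/config/agent.yaml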

# Create systemd service
sudo tee /etc/systemd/system/sentienguard-agent.service <<EOF
[Unit]
Description=SentienGuard Agent
After=network.target

[Service]
Type=simple
User=sentienguard
ExecStart=/opt/sentienguard/bin/sentienguard-agent --config /opt/sentienguard/config/agent.yaml
Restart=always
RestartSec=10

[Install]
WantedBy=multi-user.target
EOF

# Start service
sudo systemctl daemon-reload
sudo systemctl enable sentienguard-agent
sudo systemctl start sentienguard-agent

  • Install time: 2 minutes
  • Disk space: 200 MB
  • Memory: <100 MB
  • CPU: <0.5%

Metrics Collected Every 30 Seconds

Agents collect infrastructure, process, Kubernetes, and application metrics using kernel-level instrumentation, within the agent's <0.5% steady-state CPU budget.

Infrastructure Metrics

CPU

  • Usage per core (user, system, idle, iowait)
  • Load average (1min, 5min, 15min)
  • Context switches per second
  • CPU throttling events (cgroups)

Memory

  • Total, used, available, free
  • Swap usage
  • Memory pressure (PSI)
  • Cache and buffer usage

Disk

  • Usage per filesystem (%, bytes)
  • Inode usage (%, count)
  • Read/write IOPS
  • Read/write throughput (MB/s)
  • I/O wait time

Network

  • Bytes in/out per interface
  • Packets in/out
  • Packet loss rate
  • Network errors (dropped, collisions)
  • Connection count (TCP, UDP)

Collection sources: eBPF (extended Berkeley Packet Filter) for kernel-level metrics, the /proc filesystem for process metrics, and sysfs for system metrics. Kernel-space collection keeps overhead low.

Process Metrics

Per-Process Data

  • Process count (total, running, sleeping, zombie)
  • CPU usage per process
  • Memory (RSS, VSZ) per process
  • Open file descriptors per process
  • Process state (running, sleeping, zombie, stopped)

Service Health

  • systemd service status (active, inactive, failed)
  • Service restart count
  • Service uptime
  • Service memory usage

Kubernetes Metrics (If Applicable)

Pod-Level

  • Pod status (Running, Pending, Failed, CrashLoopBackOff)
  • Container restarts
  • Resource usage (CPU, memory per container)
  • Pod events (OOMKilled, Evicted, etc.)

Node-Level

  • Node status (Ready, NotReady, SchedulingDisabled)
  • Allocatable resources vs capacity
  • Pod count per node
  • Node conditions (DiskPressure, MemoryPressure, PIDPressure)

Deployment-Level

  • Desired vs available replicas
  • Rollout status
  • Deployment events

Kubernetes API via ServiceAccount credentials. Metrics from kubelet (node-level). Events from API server.

Application Context (Optional)

OpenTelemetry Integration

  • Request rate
  • Error rate
  • Latency percentiles (p50, p95, p99)
  • Active requests

Database Connections

  • Active connection count (PostgreSQL/MySQL)
  • Query performance monitoring

HTTP Endpoints

  • Health check endpoint monitoring
  • Response status codes
Example Process Metric
{
  "process": "postgresql",
  "pid": 1234,
  "cpu_percent": 12.4,
  "memory_rss_mb": 2048,
  "memory_percent": 12.8,
  "open_files": 147,
  "state": "running",
  "uptime_seconds": 864000
}
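A consumer of this record could decode it as follows. This is a sketch under assumptions: the field names come from the example above, while the ProcessMetric struct and ParseProcessMetric function are hypothetical names chosen for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// ProcessMetric mirrors the fields of the example record.
type ProcessMetric struct {
	Process       string  `json:"process"`
	PID           int     `json:"pid"`
	CPUPercent    float64 `json:"cpu_percent"`
	MemoryRSSMB   int     `json:"memory_rss_mb"`
	MemoryPercent float64 `json:"memory_percent"`
	OpenFiles     int     `json:"open_files"`
	State         string  `json:"state"`
	UptimeSeconds int64   `json:"uptime_seconds"`
}

// ParseProcessMetric decodes one JSON metric record.
func ParseProcessMetric(data []byte) (ProcessMetric, error) {
	var m ProcessMetric
	err := json.Unmarshal(data, &m)
	return m, err
}

// sample is the example record shown above.
const sample = `{
  "process": "postgresql",
  "pid": 1234,
  "cpu_percent": 12.4,
  "memory_rss_mb": 2048,
  "memory_percent": 12.8,
  "open_files": 147,
  "state": "running",
  "uptime_seconds": 864000
}`

func main() {
	m, err := ParseProcessMetric([]byte(sample))
	if err != nil {
		fmt.Println("decode error:", err)
		return
	}
	// 864000 seconds is 10 days of uptime.
	fmt.Printf("%s (pid %d) up %d days, state %s\n",
		m.Process, m.PID, m.UptimeSeconds/86400, m.State)
}
```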

What Agents DON'T Collect

  • Application logs (only infrastructure events)
  • User data or PII
  • Application source code
  • Environment variables with secrets
  • Custom business metrics (unless explicitly configured)

Defense-in-Depth Security Architecture

Five layers of security from network to secrets. Each layer independently prevents a class of attacks, so compromise of one layer doesn't compromise the system.

1. Outbound-Only Communication

Agent → Control Plane: Outbound HTTPS (443), TLS 1.3, certificate pinning. No inbound connections accepted.

What This Prevents

  • Network-based attacks (no listening ports)
  • Reverse shells (agent can't accept inbound connections)
  • Port scanning (no services to discover)
2. TLS 1.3 with Certificate Pinning

SentienGuard CA certificate hash embedded in agent binary at compile time. Connection refused if mismatch—no fallback to CA trust chain.

What This Prevents

  • Man-in-the-middle attacks
  • Certificate authority compromise
  • Rogue control plane impersonation
3. Cryptographic Playbook Signing

Ed25519 signatures on every playbook. Agent verifies signature, checks timestamp freshness (<5 min), and confirms target host before execution.

What This Prevents

  • Unauthorized playbook injection
  • Replay attacks (timestamp freshness check)
  • Playbook tampering
  • Wrong-host execution
4. Non-Root Execution

Agent runs as dedicated sentienguard user. Sudo with explicit allow-list for commands requiring root.

What This Prevents

  • Privilege escalation
  • System-wide damage (limited to sentienguard user permissions)
  • Lateral movement (can't modify other services)
5. Secret Management

API keys: file permissions (0600). SSH keys: AWS Secrets Manager / Azure Key Vault / HashiCorp Vault with just-in-time retrieval. Cloud credentials: IAM roles, no static keys.

What This Prevents

  • Secrets in version control
  • Secrets in logs (automatic redaction)
  • Long-lived credentials (automatic rotation)
Firewall Configuration
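# Agent needs ONLY outbound HTTPS (plus DNS to resolve the control plane)
iptables -A OUTPUT -p udp --dport 53 -j ACCEPT
iptables -A OUTPUT -p tcp --dport 443 -j ACCEPT
# Allow replies to connections this host initiated (required for the TLS handshake)
iptables -A INPUT -m conntrack --ctstate ESTABLISHED,RELATED -j ACCEPT
iptables -A INPUT -j DROP     # Deny all new inbound
iptables -A OUTPUT -j DROP    # Deny all other outbound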
TLS 1.3 + Certificate Pinning
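// Sketch: Agent TLS configuration
// (uses crypto/sha256, crypto/tls, crypto/x509, encoding/hex, errors)
tlsConfig := &tls.Config{
    MinVersion:         tls.VersionTLS13,
    InsecureSkipVerify: false,
    VerifyPeerCertificate: func(rawCerts [][]byte,
        verifiedChains [][]*x509.Certificate) error {
        if len(rawCerts) == 0 {
            return errors.New("no peer certificate presented")
        }
        // Hash the first presented certificate (DER bytes) and compare
        // it with the pinned value
        expectedHash := "sha256:a3f8b9c2d1e4..."
        sum := sha256.Sum256(rawCerts[0])
        actualHash := "sha256:" + hex.EncodeToString(sum[:])
        if actualHash != expectedHash {
            return errors.New("certificate pinning failed")
        }
        return nil
    },
}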
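The pin comparison itself can be isolated and exercised without a live TLS connection. The sketch below assumes the pin is stored as "sha256:" followed by the hex digest of the certificate's DER bytes, as in the snippet above; Fingerprint and PinMatches are hypothetical helper names, and placeholder byte strings stand in for real certificates.

```go
package main

import (
	"crypto/sha256"
	"crypto/subtle"
	"encoding/hex"
	"fmt"
)

// Fingerprint returns the pin string for DER-encoded certificate bytes,
// in the "sha256:<hex>" form.
func Fingerprint(der []byte) string {
	sum := sha256.Sum256(der)
	return "sha256:" + hex.EncodeToString(sum[:])
}

// PinMatches reports whether the certificate bytes hash to the pinned value.
// The comparison is constant-time so it does not leak how many leading
// characters matched.
func PinMatches(der []byte, pinned string) bool {
	actual := Fingerprint(der)
	return subtle.ConstantTimeCompare([]byte(actual), []byte(pinned)) == 1
}

func main() {
	// Placeholder bytes stand in for a real DER-encoded certificate.
	trusted := []byte("example certificate bytes")
	pin := Fingerprint(trusted)

	fmt.Println("trusted cert accepted:", PinMatches(trusted, pin))
	fmt.Println("rogue cert accepted:", PinMatches([]byte("rogue certificate bytes"), pin))
}
```

A helper like PinMatches would be called from inside VerifyPeerCertificate with the raw certificate bytes received during the handshake.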
Signed Playbook Payload
{
  "playbook": "disk_cleanup_prod_db",
  "version": "1.4.2",
  "target_host": "prod-db-03",
  "timestamp": "2026-02-10T14:35:43Z",
  "steps": ["..."],
  "signature": "ed25519:a8f3b2c1d9e4f5a6..."
}
Signature Verification (Pseudocode)
func verifyPlaybook(payload Playbook) error {
    // 1. Extract signature
    signature := payload.Signature

    // 2. Verify with control plane public key
    //    (payload.Bytes() must be the signed content, excluding the signature field)
    publicKey := loadEmbeddedPublicKey()
    if !ed25519.Verify(publicKey, payload.Bytes(), signature) {
        return errors.New("invalid signature")
    }

    // 3. Check timestamp freshness (<5 minutes); a negative age means the
    //    timestamp lies in the future and is rejected as well
    age := time.Since(payload.Timestamp)
    if age < 0 || age > 5*time.Minute {
        return errors.New("playbook timestamp not fresh, possible replay attack")
    }

    // 4. Verify target host matches
    if payload.TargetHost != agent.Hostname {
        return errors.New("playbook not intended for this host")
    }

    return nil  // All checks passed
}
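A self-contained version of these checks, using Go's standard crypto/ed25519 package, might look like the following. It is a sketch under stated assumptions: the wire format of the signed bytes is not specified on this page, so signedBytes uses a simple newline-delimited encoding invented for the example, and the key pair is generated on the spot rather than embedded in a binary. Sign, Verify, demoKeyPair, and demoPlaybook are hypothetical names.

```go
package main

import (
	"crypto/ed25519"
	"crypto/rand"
	"errors"
	"fmt"
	"time"
)

const maxAge = 5 * time.Minute // freshness window

// Playbook carries the fields the checks need.
type Playbook struct {
	Name       string
	TargetHost string
	Timestamp  time.Time
	Steps      []byte
	Signature  []byte
}

// signedBytes builds the message covered by the signature. Host and
// timestamp are included so neither can be altered without invalidating it.
// The encoding is an assumption made for this example.
func signedBytes(p Playbook) []byte {
	header := p.Name + "\n" + p.TargetHost + "\n" +
		p.Timestamp.UTC().Format(time.RFC3339) + "\n"
	return append([]byte(header), p.Steps...)
}

// Sign returns a copy of p with its Signature field set.
func Sign(priv ed25519.PrivateKey, p Playbook) Playbook {
	p.Signature = ed25519.Sign(priv, signedBytes(p))
	return p
}

// Verify checks signature, freshness, and target host, in that order.
func Verify(pub ed25519.PublicKey, p Playbook, hostname string, now time.Time) error {
	if !ed25519.Verify(pub, signedBytes(p), p.Signature) {
		return errors.New("invalid signature")
	}
	if age := now.Sub(p.Timestamp); age < 0 || age > maxAge {
		return errors.New("timestamp outside freshness window")
	}
	if p.TargetHost != hostname {
		return errors.New("playbook not intended for this host")
	}
	return nil
}

// demoKeyPair generates a throwaway key pair for the example.
func demoKeyPair() (ed25519.PublicKey, ed25519.PrivateKey) {
	pub, priv, err := ed25519.GenerateKey(rand.Reader)
	if err != nil {
		panic(err)
	}
	return pub, priv
}

// demoPlaybook signs a playbook resembling the payload shown earlier.
func demoPlaybook(priv ed25519.PrivateKey) Playbook {
	return Sign(priv, Playbook{
		Name:       "disk_cleanup_prod_db",
		TargetHost: "prod-db-03",
		Timestamp:  time.Date(2026, 2, 10, 14, 35, 43, 0, time.UTC),
		Steps:      []byte(`["..."]`),
	})
}

func main() {
	pub, priv := demoKeyPair()
	p := demoPlaybook(priv)
	now := p.Timestamp.Add(10 * time.Second)

	fmt.Println("valid:     ", Verify(pub, p, "prod-db-03", now))
	fmt.Println("wrong host:", Verify(pub, p, "prod-db-04", now))
	fmt.Println("stale:     ", Verify(pub, p, "prod-db-03", now.Add(time.Hour)))

	p.TargetHost = "prod-db-04" // tampering after signing breaks the signature
	fmt.Println("tampered:  ", Verify(pub, p, "prod-db-04", now))
}
```

Covering the host and timestamp with the signature is what makes the freshness and wrong-host checks meaningful: an attacker who edits either field on a captured payload invalidates the signature.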
Non-Root Execution
# Agent runs as dedicated user (not root)
useradd -r -s /bin/false sentienguard
chown -R sentienguard:sentienguard /opt/sentienguard/

# Systemd service runs as sentienguard user
[Service]
User=sentienguard
Group=sentienguard
Sudo Allow-List
# /etc/sudoers.d/sentienguard
# Allow specific commands only. sudo denies anything not listed here.
sentienguard ALL=(ALL) NOPASSWD: /usr/bin/systemctl restart *
sentienguard ALL=(ALL) NOPASSWD: /usr/bin/find /tmp -type f -mtime +7 -delete
sentienguard ALL=(ALL) NOPASSWD: /usr/sbin/logrotate -f /etc/logrotate.conf

# No trailing "!ALL" rule: sudoers applies the last matching entry, so a
# final "!ALL" would override the allow rules above.
Secret Retrieval (Just-in-Time)
- name: restart_database
  action: ssh_command
  command: |
    PASSWORD=$(aws secretsmanager get-secret-value \
      --secret-id prod-db-password \
      --query SecretString \
      --output text)
    echo "$PASSWORD" | sudo -S systemctl restart postgresql
  secrets:
    - aws_secret: prod-db-password  # Retrieved just-in-time

From Install to Updates

Five-stage lifecycle covering installation, steady-state operation, incident response, offline resilience, and automatic updates.

1. Installation (2 minutes)

Download GPG-signed binary, install to /opt/sentienguard/, create systemd service, connect to control plane, register host, receive configuration, begin metric collection.

2. Normal Operation (30s heartbeat)

Every 30 seconds: collect metrics (CPU, memory, disk, network, processes), batch and send to control plane via HTTPS POST, check for pending playbook executions. CPU: <0.5%, Memory: ~80 MB, Network: ~50 KB per heartbeat.

3. Incident Response (10-90 seconds)

Control plane detects anomaly, selects and signs playbook, sends to agent. Agent verifies Ed25519 signature, checks timestamp freshness, executes steps sequentially, captures stdout/stderr, verifies health, reports results.

4. Offline Resilience (up to 24 hours)

If control plane unreachable: continue collecting metrics locally (cached 5 min), execute cached playbooks if incidents detected, queue audit logs, retry connection every 30 seconds. Max 24h offline before pausing execution.

5. Updates (~5s downtime)

Weekly release cycle: control plane notifies agents, agent downloads new binary, verifies GPG signature, replaces binary, restarts service, reconnects. Rollback available if new version breaks.

Stage 1: Installation
curl -sSL https://get.sentienguard.com/install | bash

# Behind the scenes:
# 1. Download agent binary from releases.sentienguard.com
# 2. Verify GPG signature (pub key embedded in install script)
# 3. Copy binary to /opt/sentienguard/bin/
# 4. Generate default config
# 5. Create systemd service
# 6. Start service, enable auto-start on boot
Stage 4: Offline Resilience
Control plane down → Disk fills → Agent detects anomaly
→ Checks cache for disk_cleanup playbook
→ Found (executed 2 days ago, cached)
→ Executes from cache
→ Incident resolved
→ Queues audit log for upload when online
Stage 2: Normal Operation Logs
[2026-02-10 14:35:12] INFO: Heartbeat sent (30s interval)
[2026-02-10 14:35:12] INFO: Metrics: cpu=12.4%, mem=68.2%, disk=72.1%
[2026-02-10 14:35:12] INFO: No pending playbooks
[2026-02-10 14:35:42] INFO: Heartbeat sent (30s interval)
[2026-02-10 14:35:42] INFO: Anomaly detected: disk_usage=91.4% (4.8σ)
[2026-02-10 14:35:43] INFO: Playbook received: disk_cleanup_prod_db
[2026-02-10 14:35:43] INFO: Signature verified, executing playbook
[2026-02-10 14:37:09] INFO: Playbook completed successfully (86s)
[2026-02-10 14:37:09] INFO: Health verification: disk_usage=72.1% (PASS)
Stage 5: Automatic Updates
# agent.yaml
updates:
  automatic: true
  schedule: "weekly"  # Check every Sunday 2 AM
  window: "02:00-06:00"  # Only update during window
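How an agent might decide whether the current time falls inside the configured window can be sketched as below. The "HH:MM-HH:MM" format is taken from the config above; the InWindow function, its start-inclusive and end-exclusive semantics, and its handling of windows that wrap past midnight are assumptions made for this example.

```go
package main

import (
	"fmt"
	"strings"
	"time"
)

// InWindow reports whether the clock time of t falls inside a window written
// as "HH:MM-HH:MM" (start inclusive, end exclusive).
func InWindow(t time.Time, window string) (bool, error) {
	parts := strings.Split(window, "-")
	if len(parts) != 2 {
		return false, fmt.Errorf("invalid window %q", window)
	}
	start, err := time.Parse("15:04", parts[0])
	if err != nil {
		return false, err
	}
	end, err := time.Parse("15:04", parts[1])
	if err != nil {
		return false, err
	}
	now := t.Hour()*60 + t.Minute()
	s := start.Hour()*60 + start.Minute()
	e := end.Hour()*60 + end.Minute()
	if s <= e {
		return now >= s && now < e, nil
	}
	// The window wraps past midnight, e.g. "22:00-02:00".
	return now >= s || now < e, nil
}

// clock builds a timestamp at the given hour and minute on a fixed date.
func clock(hour, minute int) time.Time {
	return time.Date(2026, 2, 15, hour, minute, 0, 0, time.UTC)
}

func main() {
	for _, t := range []time.Time{clock(3, 30), clock(6, 0), clock(14, 0)} {
		ok, err := InWindow(t, "02:00-06:00")
		fmt.Printf("%s in window: %v (err=%v)\n", t.Format("15:04"), ok, err)
	}
}
```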
Stage 5: Manual Update + Rollback
# agent.yaml
updates:
  automatic: false

# Update manually:
$ sentienguard-agent update
$ systemctl restart sentienguard-agent

# Rollback if needed:
$ sentienguard-agent rollback
$ systemctl restart sentienguard-agent

Execution Isolation (Stage 3)

  • Concurrency: one playbook at a time (serialized)
  • Queuing: new requests queued if one is running
  • Timeout: 5 minutes max (configurable per playbook)
  • Failure: automatic rollback if health check fails
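Serialized execution with queuing and a per-playbook timeout can be sketched with a single worker goroutine draining a channel. This is an illustration rather than the agent's implementation: Job, Result, runAll, simulatedStep, and demoJobs are hypothetical names, and the timeouts are shortened from the 5-minute default so the example finishes quickly.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// Job is a hypothetical queued playbook run with its own timeout.
type Job struct {
	Name    string
	Timeout time.Duration
	Run     func(ctx context.Context) error
}

// Result records how a job finished.
type Result struct {
	Name string
	Err  error
}

// runAll queues every job and lets one worker goroutine execute them in
// order. Because a single goroutine drains the queue, runs never overlap.
func runAll(jobs []Job) []Result {
	queue := make(chan Job, len(jobs))
	done := make(chan Result, len(jobs))

	go func() {
		for job := range queue {
			ctx, cancel := context.WithTimeout(context.Background(), job.Timeout)
			err := job.Run(ctx) // the job is expected to stop when ctx expires
			cancel()
			done <- Result{Name: job.Name, Err: err}
		}
		close(done)
	}()

	for _, j := range jobs {
		queue <- j // requests wait here while an earlier job is running
	}
	close(queue)

	var results []Result
	for r := range done {
		results = append(results, r)
	}
	return results
}

// simulatedStep returns a Run function that takes d to finish unless the
// context expires first.
func simulatedStep(d time.Duration) func(context.Context) error {
	return func(ctx context.Context) error {
		select {
		case <-time.After(d):
			return nil
		case <-ctx.Done():
			return ctx.Err()
		}
	}
}

// demoJobs returns three jobs; the second exceeds its timeout.
func demoJobs() []Job {
	return []Job{
		{Name: "disk_cleanup", Timeout: time.Second, Run: simulatedStep(10 * time.Millisecond)},
		{Name: "slow_restart", Timeout: 20 * time.Millisecond, Run: simulatedStep(5 * time.Second)},
		{Name: "log_rotate", Timeout: time.Second, Run: simulatedStep(10 * time.Millisecond)},
	}
}

func main() {
	for _, r := range runAll(demoJobs()) {
		fmt.Printf("%-13s err=%v\n", r.Name, r.Err)
	}
}
```

A timed-out job returns an error instead of blocking the queue, so later playbooks still run; in the agent, that error is the point at which rollback would be triggered.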

Deploy Anywhere: AWS, GCP, Azure, On-Prem

Same agent binary, same capabilities, every environment. Cloud-native credential integration for each provider.

Supported Services

  • EC2 (Linux instances)
  • EKS (Kubernetes)
  • ECS (containerized workloads)
  • RDS (database metrics via queries)
  • ElastiCache (Redis/Memcached metrics)
IAM Role (Recommended)
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "ec2:DescribeInstances",
        "ec2:DescribeTags",
        "cloudwatch:PutMetricData",
        "secretsmanager:GetSecretValue"
      ],
      "Resource": "*"
    }
  ]
}
EC2 User Data
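#!/bin/bash
# Install agent on EC2 instance launch
# SENTIENGUARD_API_KEY must be defined before this point; user data does not set it
curl -sSL https://get.sentienguard.com/install | bash
echo "api_key: $SENTIENGUARD_API_KEY" >> /opt/sentienguard/config/agent.yaml
# The installer already started the agent, so restart it to load the API key
systemctl restart sentienguard-agent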

Monitoring the Monitors

Who monitors the monitoring agent? Agent health dashboard tracks status, version distribution, and resource usage across your entire fleet.

Agent Status (per host)

  • Status: Online / Offline
  • Last heartbeat: timestamp
  • Version: installed version
  • Uptime: duration
  • Resources: CPU, memory, disk

Example Agent Health Dashboard
┌─────────────────────────────────────────────────────────┐
│ Agent Health (500 nodes)                                │
├─────────────────────────────────────────────────────────┤
│ ✅ Online: 498                                          │
│ ⚠️  Offline: 2                                          │
│   - prod-db-12 (offline 15min, heartbeat timeout)      │
│   - staging-web-03 (offline 2h, host unreachable)      │
├─────────────────────────────────────────────────────────┤
│ Agent Version Distribution:                             │
│   v1.4.2: 487 nodes (97%)                              │
│   v1.4.1: 11 nodes (2%)   [Update available]          │
│   v1.3.9: 2 nodes (<1%)   [Critical update needed]    │
├─────────────────────────────────────────────────────────┤
│ Resource Usage (avg across all agents):                 │
│   CPU: 0.4%  Memory: 82 MB  Network: 48 KB/s          │
└─────────────────────────────────────────────────────────┘

Troubleshooting Commands

Agent Not Connecting
# Check service status
systemctl status sentienguard-agent

# Check network connectivity
curl -v https://control.sentienguard.com/health

# Check logs
tail -f /var/log/sentienguard/agent.log

# Test API key
sentienguard-agent test-connection
Agent High CPU
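# Check what agent is doing (-f matches the full command line, since the
# process name exceeds pgrep's 15-character limit; -o picks the oldest match)
sudo strace -p "$(pgrep -o -f sentienguard-agent)"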

# Check if playbook running
ps aux | grep sentienguard

# View recent playbook executions
sentienguard-agent playbook-history
Agent Offline
# Restart agent
systemctl restart sentienguard-agent

# If still offline, check control plane connectivity
getent hosts control.sentienguard.com     # DNS resolution
nc -vz control.sentienguard.com 443       # TCP reachability on port 443

# Check firewall rules
iptables -L -n | grep 443

Common Questions

Root access is not required. The agent runs as a dedicated sentienguard user (non-root). Some playbooks need root commands (service restarts, disk operations); for those, use sudo with an explicit allow-list covering only the required commands. See deployment docs for sudo configuration.

If the agent crashes, systemd restarts it automatically (RestartSec=10s). The agent reconnects to the control plane, fetches current config, and resumes metric collection. No data is lost, because metrics were already sent to the control plane before the crash, and operational state lives in the control plane, not the agent.

Air-gapped deployment is available with the Enterprise tier. Deploy an on-premises control plane in your data center; agents then communicate with the internal control plane (not cloud), and all data stays within your network. Contact sales for air-gapped deployment architecture.

Bandwidth stays under 100 KB/s outbound. Metrics are batched every 30 seconds (~50 KB per batch), which averages under 2 KB/s. Playbook downloads are negligible (10-50 KB per playbook). Total: 150-250 MB/day per agent. For 500 agents: 75-125 GB/day outbound from your infrastructure.
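The daily totals follow from the batch size and interval quoted above. The short calculation below reproduces them, assuming decimal units (1 MB = 1000 KB, 1 GB = 1000 MB); the function names are illustrative.

```go
package main

import "fmt"

const (
	secondsPerDay    = 86400
	batchIntervalSec = 30 // metrics batched every 30 seconds
	batchSizeKB      = 50 // ~50 KB per batch
)

// MetricsMBPerDay returns one agent's daily metric traffic in MB.
func MetricsMBPerDay() float64 {
	batches := secondsPerDay / batchIntervalSec // 2880 batches per day
	return float64(batches*batchSizeKB) / 1000
}

// AverageKBPerSecond returns the average metric rate for one agent.
func AverageKBPerSecond() float64 {
	return float64(batchSizeKB) / float64(batchIntervalSec)
}

// FleetGBPerDay scales a per-agent daily figure (MB) to a fleet total (GB).
func FleetGBPerDay(agents int, perAgentMB float64) float64 {
	return float64(agents) * perAgentMB / 1000
}

func main() {
	fmt.Printf("metrics per agent: %.0f MB/day\n", MetricsMBPerDay())
	fmt.Printf("average rate:      %.2f KB/s\n", AverageKBPerSecond())
	fmt.Printf("fleet of 500:      %.0f-%.0f GB/day\n",
		FleetGBPerDay(500, 150), FleetGBPerDay(500, 250))
}
```

Metric batches alone come to 144 MB/day, so the quoted 150-250 MB/day range leaves room for playbook downloads, heartbeats, and protocol overhead.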

Proxies are supported. Set the HTTP_PROXY and HTTPS_PROXY environment variables, then restart the agent; it respects the standard proxy environment variables.

Playbooks can be kept off specific hosts. Playbook metadata includes exclusions (e.g., host_pattern: "*.prod.*" to never run on production). Or disable via dashboard: Playbooks → disk_cleanup → Disable on prod-db-03.

If a playbook fails health verification, the agent automatically reverts its changes using the rollback steps that every playbook includes. Example: Playbook restarts wrong service → health check fails → rollback restarts original service. A complete audit trail shows what happened for post-incident review.

Deploy Your First Agent

Install agent on Linux server in 2 minutes. Watch metrics flow to dashboard. Import playbook library. Trigger test incident. See autonomous resolution.

Installation
curl -sSL https://get.sentienguard.com/install | bash

Free tier: 3 agents, unlimited playbooks, full audit logs, no credit card.