Infrastructure Layer

Daemon & Client Overview

Architecture

Client <--> Daemon --> Control Plane

The Gremlin daemon, gremlind, is a binary installed on the operating system or available inside the Gremlin container. It sends heartbeats to the Gremlin Control Plane to let Gremlin know that the host is active and able to receive attack orders. The daemon communicates outbound only, and only with the Gremlin Control Plane. All traffic is encrypted.

The Gremlin client, gremlin, is the command-line client responsible for creating the local impact within the host.

Together, the daemon and the command-line client form a single unit that the platform refers to as a targetable Client.

Client Lifecycle

Gremlin clients (infrastructure and application) that have authenticated to the Gremlin Control Plane appear in the infrastructure clients and application clients lists. You can only run attacks on "active" clients. A client goes into an "idle" state when there has been no activity for 5 minutes, and you cannot run or schedule attacks on idle clients. If Gremlin does not hear from an idle client for 24 hours, the client is removed from the list. However, if the client starts communicating with Gremlin again within the 24-hour idle window, it is reactivated and returned to the "active" state.
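The lifecycle rules above can be read as a small state function. The following is a minimal sketch, not Gremlin's implementation: the function names are hypothetical, and it assumes that both the 5-minute and the 24-hour thresholds are measured from the last time the client was heard from.

```python
from datetime import datetime, timedelta

# Thresholds taken from the text. Measuring both from the last contact
# is an assumption made for this sketch.
IDLE_AFTER = timedelta(minutes=5)
REMOVE_AFTER = timedelta(hours=24)


def client_state(last_seen: datetime, now: datetime) -> str:
    """Classify a client as 'active', 'idle', or 'removed' (hypothetical helper)."""
    silence = now - last_seen
    if silence >= REMOVE_AFTER:
        return "removed"
    if silence >= IDLE_AFTER:
        return "idle"
    return "active"


def can_attack(last_seen: datetime, now: datetime) -> bool:
    """Attacks can only be run or scheduled on active clients."""
    return client_state(last_seen, now) == "active"


now = datetime(2024, 1, 1, 12, 0, 0)
print(client_state(now - timedelta(minutes=1), now))   # active
print(client_state(now - timedelta(minutes=10), now))  # idle
print(client_state(now - timedelta(hours=25), now))    # removed
```

In this model, reactivation needs no special handling: when an idle client communicates again, its last-seen time is updated and the same function returns "active".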


Logs

Logs can be found under the /var/log/gremlin directory.

Daemon log entries can be found in the daemon.log file. Entries in this file may record events in which the daemon is unable to communicate with the Control Plane, a condition that eventually triggers the Dead Man Switch.

Each attack on the host is logged under /var/log/gremlin/executions using its unique attack execution ID.
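As an illustration of locating the log for a given attack, here is a minimal sketch. The helper name is hypothetical, and because the exact file naming inside the executions directory is not described here, the sketch assumes only that the execution ID appears somewhere in the entry name.

```python
from pathlib import Path
from typing import List

# Log directory named in the text.
GREMLIN_LOG_DIR = Path("/var/log/gremlin")


def find_execution_logs(execution_id: str,
                        log_dir: Path = GREMLIN_LOG_DIR) -> List[Path]:
    """Return entries under <log_dir>/executions whose name contains the ID.

    Matching on a substring of the name is an assumption of this sketch.
    """
    executions = log_dir / "executions"
    if not executions.is_dir():
        return []
    return sorted(p for p in executions.iterdir() if execution_id in p.name)
```

Calling find_execution_logs with an attack execution ID returns an empty list when the directory is missing or nothing matches, so it is safe to run on hosts where no attack has been executed yet.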

Bandwidth Usage

Idle State

The daemon uses very little bandwidth in its idle state. In testing over a 15-minute period, the daemon averaged only 1.15 KB/sec over any given 10-second window.

Idle state metric                                  Average (KB/sec)
Inbound bandwidth average, zeros dropped           1.98918829974
Outbound bandwidth average, zeros dropped          0.55621495
Aggregate bandwidth average, zeros dropped         2.51990083564
Aggregate bandwidth average over testing period    1.15347573462
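The two kinds of average reported here differ in which samples they include. The sketch below assumes that "zeros dropped" means samples with no traffic are excluded before averaging, while the testing-period average includes every sample. The sample values are made up for illustration and are not measurements.

```python
from statistics import mean
from typing import Sequence


def average_zeros_dropped(samples: Sequence[float]) -> float:
    """Average over samples that carried traffic (zero samples excluded)."""
    nonzero = [s for s in samples if s != 0]
    return mean(nonzero) if nonzero else 0.0


def average_over_period(samples: Sequence[float]) -> float:
    """Average over every sample in the testing period, zeros included."""
    return mean(samples) if samples else 0.0


# Hypothetical KB/sec samples, not real measurements.
samples = [0.0, 2.0, 0.0, 3.0, 0.0, 0.0, 1.0, 0.0]
print(average_zeros_dropped(samples))  # 2.0
print(average_over_period(samples))    # 0.75
```

Under this reading, a daemon that is silent for much of the period shows a testing-period average well below its zeros-dropped average, which is consistent with the idle-state figures.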

Attack State

In testing, overall bandwidth consumption increases slightly during attacks. While an attack is being executed, the daemon stays in constant communication with the Control Plane as it checks for the abort condition. Regardless of the attack being run, the attack-state behavior looks the same. Two data sets are provided: one from a CPU Resource attack and one from a Latency Network attack.

Attack state metric (KB/sec)                       Data set 1       Data set 2
Inbound bandwidth average, zeros dropped           2.71233697984    2.64944416154
Outbound bandwidth average, zeros dropped          0.87034858716    0.917963469407
Aggregate bandwidth average, zeros dropped         3.56685480642    3.55221005449
Aggregate bandwidth average over testing period    3.0556056175     3.11712392366

There is no statistically significant difference between the two attacks. Although aggregate average traffic rises to roughly 2.7 times the idle level, the resulting bandwidth utilization of about 3 KB/sec is still very low on a per-client basis.
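The size of the increase can be checked directly from the testing-period aggregates reported above. This short sketch divides each attack-state aggregate by the idle-state aggregate; the variable and function names are illustrative only.

```python
# Aggregate bandwidth averages over the testing period (KB/sec), from the text.
IDLE_AGGREGATE = 1.15347573462
ATTACK_AGGREGATES = {
    "first data set": 3.0556056175,
    "second data set": 3.11712392366,
}


def increase_factor(attack_kb_per_sec: float, idle_kb_per_sec: float) -> float:
    """How many times larger the attack-state average is than the idle average."""
    return attack_kb_per_sec / idle_kb_per_sec


for label, value in ATTACK_AGGREGATES.items():
    print(f"{label}: {increase_factor(value, IDLE_AGGREGATE):.2f}x")
# first data set: 2.65x
# second data set: 2.70x
```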