Container runtime security on AWS Fargate
As a managed service, AWS Fargate improves security by limiting access to the underlying operating system, but this also makes traditional security measures that rely on host-level control and visibility difficult to implement. Fargate platform version 1.4.0 introduced support for the SYS_PTRACE capability, which enables Falco to provide an open-source runtime security solution for containerised workloads.
Falco as a solution
Falco is an open-source, cloud-native runtime security tool that helps monitor and protect containerised applications. It is a Cloud Native Computing Foundation (CNCF) project designed to provide real-time visibility into the behaviour of applications running in containers.
System calls are streamed to Falco for analysis in several ways:
By loading a kernel module
By leveraging eBPF probes
Through user space instrumentation
User space instrumentation is the only way to stream system calls for containerised workloads on AWS Fargate, because kernel modules and eBPF probes cannot be loaded on the managed host. It relies on the SYS_PTRACE Linux capability. There are several user space tracing patterns, such as:
Embedding the tracing binary and the Falco binary into the workload container image.
Mounting the tracing binary and the Falco binary into the workload container at runtime via a "sleeping" sidecar.
Embedding the tracing binary into the workload container and running the Falco binary as a sidecar.
This post implements the first pattern, embedding both the tracing binary and the Falco binary in the workload container image, as the only solution for Amazon ECS with the Fargate launch type. A Debian-based image is used because it covers a broad range of widely used applications such as Nginx, Python, Golang, and many others.
Implementation
The solution relies on four main components:
Pdig binary: a standalone executable built on ptrace and the falcosecurity libraries that streams system calls.
Falco binary: taken from the official Falco container image falcosecurity/falco.
Supervisord: a client/server system that allows its users to monitor and control a number of processes on UNIX-like operating systems.
Falco rules file: the default set of rules from the official Falco image falcosecurity/falco. Additional custom rules can also be written.
The Dockerfile installs Supervisord and copies the two binaries (pdig and falco) together with the Falco configuration and the files for both the default and the custom rules.
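# Debian-based image of the workload
FROM nginx
# Install Supervisord to run the workload and Falco in the same container
RUN apt-get update && \
    apt-get install -y \
    supervisor
# Tracing binary that streams system calls via ptrace
COPY --from=ollypom/pdig:latest /pdig /vendor/falco/bin/pdig
# Falco binary plus default configuration and rules from the official image
COPY --from=falcosecurity/falco:0.32.2-slim /usr/bin/falco /vendor/falco/bin/falco
COPY --from=falcosecurity/falco:0.32.2-slim /etc/falco/ /vendor/falco/etc/falco/
# Supervisord configuration, Falco configuration, and custom rules
COPY ./supervisord.conf /vendor/falco/scripts/supervisord.conf
COPY ./falco.yaml /data/falco.yaml
COPY ./falco_rules.local.yaml /data/falco_rules.local.yaml
# Start Supervisord as the container's main process
CMD [ "/usr/bin/supervisord", "-c", "/vendor/falco/scripts/supervisord.conf" ]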
Supervisord is started as the container's main process with a configuration file. The configuration file defines how Supervisord manages and monitors its programs, here the workload process (wrapped by pdig) and the Falco daemon.
[supervisord]
user=root
loglevel=warn
nodaemon=true
[rpcinterface:supervisor]
supervisor.rpcinterface_factory = supervisor.rpcinterface:make_main_rpcinterface
[program:myapp]
command=/vendor/falco/bin/pdig /bin/bash <YOUR_APPLICATION>
<ADDITIONAL_SETTINGS>
…
[program:falco]
command=/vendor/falco/bin/falco --userspace -c /data/falco.yaml
autorestart=false
startsecs=0
redirect_stderr=true
stdout_logfile=/dev/fd/1
stdout_logfile_maxbytes=0
[eventlistener:exit_on_any_fatal]
<ADDITIONAL_SETTINGS>
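The settings of the exit_on_any_fatal listener are elided above and are left as they are. As a rough illustration only, a listener with that name could be a small script that speaks Supervisord's event listener protocol and asks Supervisord to shut down when a program enters the FATAL state, so that the task stops instead of running without its workload or without Falco. The sketch below assumes the listener is subscribed to PROCESS_STATE_FATAL events; the function names are hypothetical and not taken from the original solution.

```python
import os
import signal
import sys


def parse_tokens(line: str) -> dict:
    """Parse Supervisord's space-separated 'key:value' tokens into a dict."""
    return dict(token.split(":", 1) for token in line.split())


def is_fatal(headers: dict) -> bool:
    """True when the event reports a program entering the FATAL state."""
    return headers.get("eventname") == "PROCESS_STATE_FATAL"


def main() -> None:
    while True:
        # Tell Supervisord the listener is ready for the next event.
        sys.stdout.write("READY\n")
        sys.stdout.flush()

        header_line = sys.stdin.readline()
        if not header_line:  # stdin closed: Supervisord is going away
            break
        headers = parse_tokens(header_line)
        sys.stdin.read(int(headers["len"]))  # consume the event payload

        if is_fatal(headers):
            # Ask Supervisord (the parent process) to shut down,
            # which ends the container's main process.
            os.kill(os.getppid(), signal.SIGTERM)

        # Acknowledge the event so Supervisord can send the next one.
        sys.stdout.write("RESULT 2\nOK")
        sys.stdout.flush()
```

To use the sketch as a listener command, main() would be called at the end of the script. Diagnostics must go to stderr, because stdout is reserved for the protocol.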
Once the Dockerfile is ready, the image is built and an Amazon ECS task definition is created. The following JSON is added to the container definition to grant the SYS_PTRACE capability:
"linuxParameters": {
    "capabilities": {
        "add": [
            "SYS_PTRACE"
        ],
        "drop": null
    }
},
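For readers who register task definitions from code, the fragment above can be placed in context with a minimal sketch like the one below. It only builds the keyword arguments that boto3's ecs.register_task_definition() accepts. The family name, image URI, role ARN, log group, region, stream prefix, and CPU/memory sizes are all hypothetical placeholders, and the "drop" key is simply left out on the assumption that omitting it is equivalent to null.

```python
def build_task_definition(family: str, image: str, execution_role_arn: str,
                          log_group: str, region: str) -> dict:
    """Return keyword arguments for boto3's ecs.register_task_definition()."""
    container = {
        "name": family,
        "image": image,
        "essential": True,
        # Grant ptrace so pdig can trace the workload's system calls.
        "linuxParameters": {"capabilities": {"add": ["SYS_PTRACE"]}},
        # Ship stdout/stderr (including Falco alerts) to CloudWatch Logs.
        "logConfiguration": {
            "logDriver": "awslogs",
            "options": {
                "awslogs-group": log_group,
                "awslogs-region": region,
                "awslogs-stream-prefix": "falco",  # hypothetical prefix
            },
        },
    }
    return {
        "family": family,
        "requiresCompatibilities": ["FARGATE"],
        "networkMode": "awsvpc",  # required for Fargate tasks
        "cpu": "256",             # hypothetical sizing
        "memory": "512",
        "executionRoleArn": execution_role_arn,
        "containerDefinitions": [container],
    }


# Usage (requires AWS credentials, so it is left commented out):
# import boto3
# boto3.client("ecs").register_task_definition(**build_task_definition(
#     "nginx-falco", "<IMAGE_URI>", "<EXECUTION_ROLE_ARN>",
#     "/ecs/nginx-falco", "eu-west-1"))
```

The platform version (1.4.0 or later) is selected when the task or service is launched, not in the task definition itself.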
After the task has started successfully, Falco's detection can be tested by running a suspicious script or a single command inside the container.
Examples include commands that alter the file system, such as writing to a binary file, or that request a URL with curl, e.g. "curl -s https://google.co.uk".
The resulting Falco alerts can be observed in Amazon CloudWatch Logs.
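Because the workload and Falco write to the same log stream, it can help to separate Falco alerts from ordinary application output when inspecting the logs. The sketch below assumes that json_output is enabled in falco.yaml (the original post does not say whether it is), in which case each alert is a JSON object with fields such as "rule", "priority", and "output". The sample lines and the rule name are invented for illustration.

```python
import json
from typing import Optional


def parse_falco_alert(line: str) -> Optional[dict]:
    """Return the alert as a dict, or None if the line is not a Falco JSON alert."""
    try:
        event = json.loads(line)
    except json.JSONDecodeError:
        return None  # plain-text line, e.g. written by the workload itself
    if not isinstance(event, dict) or "rule" not in event or "priority" not in event:
        return None
    return event


# Hypothetical lines as they might appear in the task's log stream.
sample_lines = [
    '10.0.0.1 - - "GET / HTTP/1.1" 200 615',
    '{"output": "...", "priority": "Warning", "rule": "Example rule", "output_fields": {}}',
]

alerts = [a for a in map(parse_falco_alert, sample_lines) if a is not None]
for alert in alerts:
    print(alert["priority"], alert["rule"])
```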
Incident response management
Once a potential threat is detected, it’s important to respond promptly to minimize damage.
Within the provided solution, incident response is managed using CloudWatch Logs metric filters. A metric filter matches a pattern in incoming log events and turns the matches into a numeric CloudWatch metric, which makes it possible to track specific patterns or events in the logs.
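As a minimal sketch of such a filter, the function below builds the keyword arguments for boto3's logs.put_metric_filter(). It assumes Falco emits JSON alerts (json_output enabled), so a JSON filter pattern on the "priority" field can be used. The filter name, metric name, and "Falco" namespace are hypothetical.

```python
def build_metric_filter(log_group: str, priority: str) -> dict:
    """Return keyword arguments for boto3's logs.put_metric_filter()."""
    return {
        "logGroupName": log_group,
        "filterName": f"falco-{priority.lower()}",
        # JSON filter pattern: match events whose "priority" equals the value.
        "filterPattern": f'{{ $.priority = "{priority}" }}',
        "metricTransformations": [
            {
                "metricName": f"Falco{priority}Count",
                "metricNamespace": "Falco",  # hypothetical namespace
                "metricValue": "1",          # add 1 per matching log event
                "defaultValue": 0.0,         # report 0 when nothing matches
            }
        ],
    }


# Usage (requires AWS credentials, so it is left commented out):
# import boto3
# boto3.client("logs").put_metric_filter(
#     **build_metric_filter("/ecs/nginx-falco", "Warning"))
```

If Falco writes plain-text alerts instead, a simple term pattern such as "Warning" would be needed in place of the JSON pattern.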
The next step is to configure notifications that fire when specific thresholds or conditions are reached, typically through a CloudWatch alarm on the metric. Amazon SNS, a fully managed pub/sub messaging service, then delivers the notifications to endpoints such as email, SMS, and mobile push.
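One way to wire a threshold to a notification is a CloudWatch alarm whose action is an SNS topic. The sketch below builds the keyword arguments for boto3's cloudwatch.put_metric_alarm(). The metric name, namespace, topic ARN, one-minute period, and threshold of one alert are illustrative assumptions, not values from the original solution.

```python
def build_alarm(metric_name: str, namespace: str, sns_topic_arn: str,
                threshold: int = 1) -> dict:
    """Return keyword arguments for boto3's cloudwatch.put_metric_alarm()."""
    return {
        "AlarmName": f"{metric_name}-alarm",
        "Namespace": namespace,
        "MetricName": metric_name,
        "Statistic": "Sum",        # total matches within each period
        "Period": 60,              # seconds
        "EvaluationPeriods": 1,
        "Threshold": float(threshold),
        "ComparisonOperator": "GreaterThanOrEqualToThreshold",
        "TreatMissingData": "notBreaching",  # no alerts means no alarm
        "AlarmActions": [sns_topic_arn],     # publish to SNS when alarming
    }


# Usage (requires AWS credentials, so it is left commented out):
# import boto3
# boto3.client("cloudwatch").put_metric_alarm(**build_alarm(
#     "FalcoWarningCount", "Falco",
#     "arn:aws:sns:eu-west-1:123456789012:falco-alerts"))
```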
Other options include automatic incident mitigation through an AWS Lambda function, or forwarding alerts to another destination such as a file or a webhook.
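How the Lambda function is triggered is not specified in this post. As one possible sketch, the handler below assumes a CloudWatch Logs subscription filter delivers the log events to Lambda, that Falco emits JSON alerts, and that the awslogs stream name ends with the ECS task ID. The priority list and the stop_task callback are hypothetical. In a real deployment the callback would wrap boto3's ecs.stop_task(cluster=..., task=..., reason=...).

```python
import base64
import gzip
import json
from typing import Callable, Optional

HIGH_PRIORITIES = ("Emergency", "Alert", "Critical")  # assumed cut-off


def task_to_stop(event: dict) -> Optional[str]:
    """Return the ECS task ID if the batch holds a high-priority Falco alert."""
    # Subscription filters deliver a base64-encoded, gzip-compressed JSON document.
    raw = gzip.decompress(base64.b64decode(event["awslogs"]["data"]))
    payload = json.loads(raw)
    # awslogs stream names look like "<prefix>/<container-name>/<task-id>".
    task_id = payload["logStream"].rsplit("/", 1)[-1]
    for log_event in payload["logEvents"]:
        try:
            alert = json.loads(log_event["message"])
        except json.JSONDecodeError:
            continue  # not a Falco JSON alert
        if isinstance(alert, dict) and alert.get("priority") in HIGH_PRIORITIES:
            return task_id
    return None


def handler(event: dict, context: object,
            stop_task: Optional[Callable[[str], None]] = None) -> Optional[str]:
    """Lambda entry point: stop the offending task when a serious alert arrives."""
    task_id = task_to_stop(event)
    if task_id is not None and stop_task is not None:
        stop_task(task_id)
    return task_id
```

Stopping a task is a blunt response. Whether it is appropriate depends on the workload, so the decision logic is kept separate from the action to make it easy to swap in something gentler.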
Conclusion
In today's DevOps landscape, where deploying applications quickly is essential, containers are widely used. This has made container runtime security a crucial part of any security strategy, mainly because containers are dynamic and require real-time, automated protection.
Even though Amazon ECS on AWS Fargate is secure by design and compliant with multiple industry standards, it is wise to follow best practices and consider potential vulnerabilities. Under the shared responsibility model, applying a runtime security solution is up to the customer, and Falco is an excellent choice for this task.
The main strengths of the solution are its use of an open-source tool and its broad applicability. It can be applied to a significant portion of Docker Hub images, around 60% of which are based on Debian.