Blog: Having fun with seccomp profiles on the edge
Author: Sascha Grunert
The Security Profiles Operator (SPO) is a feature-rich operator for Kubernetes that makes managing seccomp, SELinux and AppArmor profiles easier than ever. Recording those profiles from scratch is one of the key features of this operator, and it usually involves integrating the operator into large CI/CD systems. Being able to test the recording capabilities of the operator in edge cases is one of the recent development efforts of the SPO, and it makes playing around with seccomp profiles excitingly easy.
Recording seccomp profiles with spoc record
The v0.8.0 release of the Security Profiles Operator shipped a new command line interface called spoc, a little helper tool for recording and replaying seccomp profiles among various other things that are out of scope of this blog post.

Recording a seccomp profile requires a binary to be executed, which can be a simple golang application which just calls uname(2):

    package main

    import (
        "syscall"
    )

    func main() {
        utsname := syscall.Utsname{}
        if err := syscall.Uname(&utsname); err != nil {
            panic(err)
        }
    }
Building a binary from that code can be done by:

    > go build -o main main.go
    > ldd ./main
    not a dynamic executable
Now it's possible to download the latest binary of spoc from GitHub and run the application on Linux with it:

    > sudo ./spoc record ./main
    10:08:25.591945 Loading bpf module
    10:08:25.591958 Using system btf file
    libbpf: loading object 'recorder.bpf.o' from buffer
    …
    libbpf: prog 'sys_enter': relo #3: patched insn #22 (ALU/ALU64) imm 16 -> 16
    10:08:25.610767 Getting bpf program sys_enter
    10:08:25.610778 Attaching bpf tracepoint
    10:08:25.611574 Getting syscalls map
    10:08:25.611582 Getting pid_mntns map
    10:08:25.613097 Module successfully loaded
    10:08:25.613311 Processing events
    10:08:25.613693 Running command with PID: 336007
    10:08:25.613835 Received event: pid: 336007, mntns: 4026531841
    10:08:25.613951 No container ID found for PID (pid=336007, mntns=4026531841, err=unable to find container ID in cgroup path)
    10:08:25.614856 Processing recorded data
    10:08:25.614975 Found process mntns 4026531841 in bpf map
    10:08:25.615110 Got syscalls: read, close, mmap, rt_sigaction, rt_sigprocmask, madvise, nanosleep, clone, uname, sigaltstack, arch_prctl, gettid, futex, sched_getaffinity, exit_group, openat
    10:08:25.615195 Adding base syscalls: access, brk, capget, capset, chdir, chmod, chown, close_range, dup2, dup3, epoll_create1, epoll_ctl, epoll_pwait, execve, faccessat2, fchdir, fchmodat, fchown, fchownat, fcntl, fstat, fstatfs, getdents64, getegid, geteuid, getgid, getpid, getppid, getuid, ioctl, keyctl, lseek, mkdirat, mknodat, mount, mprotect, munmap, newfstatat, openat2, pipe2, pivot_root, prctl, pread64, pselect6, readlink, readlinkat, rt_sigreturn, sched_yield, seccomp, set_robust_list, set_tid_address, setgid, setgroups, sethostname, setns, setresgid, setresuid, setsid, setuid, statfs, statx, symlinkat, tgkill, umask, umount2, unlinkat, unshare, write
    10:08:25.616293 Wrote seccomp profile to: /tmp/profile.yaml
    10:08:25.616298 Unloading bpf module
I have to execute spoc as root because it internally runs an eBPF program, reusing the same code parts from the Security Profiles Operator itself. I can see that the bpf module got loaded successfully and spoc attached the required tracepoint to it. Then it tracks the main application by using its mount namespace and processes the recorded syscall data. The nature of eBPF programs is that they see the whole context of the kernel, which means that spoc tracks all syscalls of the system, but does not interfere with their execution.

The logs indicate that spoc found the syscalls read, close, mmap and so on, including uname. All syscalls other than uname come from the golang runtime and its garbage collection, which already adds overhead to a basic application like the one in our demo. I can also see from the log line "Adding base syscalls: …" that spoc adds a bunch of base syscalls to the resulting profile. Those are used by the OCI runtime (like runc or crun) in order to be able to run a container. This means that spoc can be used to record seccomp profiles which can then be containerized directly. This behavior can be disabled in spoc by using the --no-base-syscalls / -n flag, or customized via the --base-syscalls / -b flag. This can be helpful in cases where OCI runtimes other than crun and runc are used, or if I just want to record the seccomp profile for the application and stack it with another base profile.

The resulting profile is now available in /tmp/profile.yaml, but the default location can be changed using the --output-file value / -o flag:

    > cat /tmp/profile.yaml

    apiVersion: security-profiles-operator.x-k8s.io/v1beta1
    kind: SeccompProfile
    metadata:
      creationTimestamp: null
      name: main
    spec:
      architectures:
        - SCMP_ARCH_X86_64
      defaultAction: SCMP_ACT_ERRNO
      syscalls:
        - action: SCMP_ACT_ALLOW
          names:
            - access
            - arch_prctl
            - brk
            - …
            - uname
            - …
    status: {}
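To make the "Adding base syscalls" step more tangible, here is a minimal Go sketch of how a recorded syscall list and a base list could be combined into one sorted allow list without duplicates. The function name mergeSyscalls is hypothetical and this is not spoc's actual implementation; the alphabetical ordering is only an assumption based on how the names appear in the profile above.

```go
package main

import (
	"fmt"
	"sort"
)

// mergeSyscalls returns the sorted, de-duplicated union of the recorded
// syscalls and the base syscalls.
// Hypothetical helper for illustration, not taken from spoc itself.
func mergeSyscalls(recorded, base []string) []string {
	seen := make(map[string]struct{}, len(recorded)+len(base))
	merged := make([]string, 0, len(recorded)+len(base))

	for _, list := range [][]string{recorded, base} {
		for _, name := range list {
			if _, ok := seen[name]; ok {
				continue // already part of the result
			}
			seen[name] = struct{}{}
			merged = append(merged, name)
		}
	}

	sort.Strings(merged)
	return merged
}

func main() {
	// Shortened lists, picked from the log output of spoc record.
	recorded := []string{"read", "close", "mmap", "uname", "openat"}
	base := []string{"access", "brk", "execve", "write"}

	// With base syscalls.
	fmt.Println(mergeSyscalls(recorded, base))
	// prints: [access brk close execve mmap openat read uname write]

	// Without base syscalls, comparable to recording with --no-base-syscalls.
	fmt.Println(mergeSyscalls(recorded, nil))
	// prints: [close mmap openat read uname]
}
```

Passing an empty base list leaves only what the application itself used, which is the variant to pick when the profile should be stacked with a separate base profile.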
The seccomp profile Custom Resource Definition (CRD) can be directly used together with the Security Profiles Operator for managing it within Kubernetes. spoc is also capable of producing raw seccomp profiles (as JSON), by using the --type / -t raw-seccomp flag:

    > sudo ./spoc record --type raw-seccomp ./main
    …
    52.628827 Wrote seccomp profile to: /tmp/profile.json

    > jq . /tmp/profile.json
    {
      "defaultAction": "SCMP_ACT_ERRNO",
      "architectures": ["SCMP_ARCH_X86_64"],
      "syscalls": [
        {
          "names": ["access", "…", "write"],
          "action": "SCMP_ACT_ALLOW"
        }
      ]
    }
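The raw JSON layout is simple enough to inspect programmatically. The following Go sketch parses such a profile and reports which action would apply to a given syscall name. The type and function names (Profile, Rule, parseProfile, actionFor) are made up for this example, the structs model only the fields visible in the output above, and the embedded sample is a shortened stand-in for /tmp/profile.json. It also ignores anything beyond plain name matching, such as argument filters.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Rule and Profile mirror only the fields shown in the JSON output above.
type Rule struct {
	Names  []string `json:"names"`
	Action string   `json:"action"`
}

type Profile struct {
	DefaultAction string   `json:"defaultAction"`
	Architectures []string `json:"architectures"`
	Syscalls      []Rule   `json:"syscalls"`
}

// sample is a shortened, hand-written stand-in for /tmp/profile.json.
const sample = `{
  "defaultAction": "SCMP_ACT_ERRNO",
  "architectures": ["SCMP_ARCH_X86_64"],
  "syscalls": [
    {"names": ["access", "uname", "write"], "action": "SCMP_ACT_ALLOW"}
  ]
}`

// parseProfile decodes a raw seccomp profile from JSON.
func parseProfile(data []byte) (*Profile, error) {
	var p Profile
	if err := json.Unmarshal(data, &p); err != nil {
		return nil, fmt.Errorf("parse profile: %w", err)
	}
	return &p, nil
}

// actionFor returns the action of the first rule that lists the syscall
// name, or the default action if no rule mentions it.
func (p *Profile) actionFor(name string) string {
	for _, rule := range p.Syscalls {
		for _, n := range rule.Names {
			if n == name {
				return rule.Action
			}
		}
	}
	return p.DefaultAction
}

func main() {
	p, err := parseProfile([]byte(sample))
	if err != nil {
		panic(err)
	}

	fmt.Println(p.actionFor("uname"))  // prints: SCMP_ACT_ALLOW
	fmt.Println(p.actionFor("reboot")) // prints: SCMP_ACT_ERRNO
}
```

A check like this can be handy in CI to assert that a recorded profile still contains (or no longer contains) a particular syscall.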
The utility spoc record allows us to record complex seccomp profiles directly from binary invocations on any Linux system which is capable of running the eBPF code within the kernel. But it can do more: how about modifying the seccomp profile and then testing it by using spoc run?

Running seccomp profiles with spoc run

spoc is also able to run binaries with applied seccomp profiles, making it easy to test any modification to them. To do that, just run:

    > sudo ./spoc run ./main
    10:29:58.153263 Reading file /tmp/profile.yaml
    10:29:58.153311 Assuming YAML profile
    10:29:58.154138 Setting up seccomp
    10:29:58.154178 Load seccomp profile
    10:29:58.154189 Starting audit log enricher
    10:29:58.154224 Enricher reading from file /var/log/audit/audit.log
    10:29:58.155356 Running command with PID: 437880
    >
It looks like the application exited successfully, which is expected because I have not modified the previously recorded profile yet. I can also specify a custom location for the profile by using the --profile / -p flag, but this was not necessary because I did not change the default output location of the recording. spoc automatically determines whether it's a raw (JSON) or CRD (YAML) based seccomp profile and then applies it to the process.

The Security Profiles Operator supports a log enricher feature, which provides additional seccomp related information by parsing the audit logs. spoc run uses the enricher in the same way to provide more data to end users when it comes to debugging seccomp profiles.

Now I have to modify the profile to see anything valuable in the output. For example, I could remove the allowed uname syscall:

    > jq 'del(.syscalls[0].names[] | select(. == "uname"))' /tmp/profile.json > /tmp/no-uname-profile.json
And then try to run it again with the new profile /tmp/no-uname-profile.json:

    > sudo ./spoc run -p /tmp/no-uname-profile.json ./main
    10:39:12.707798 Reading file /tmp/no-uname-profile.json
    10:39:12.707892 Setting up seccomp
    10:39:12.707920 Load seccomp profile
    10:39:12.707982 Starting audit log enricher
    10:39:12.707998 Enricher reading from file /var/log/audit/audit.log
    10:39:12.709164 Running command with PID: 480512
    panic: operation not permitted

    goroutine 1 [running]:
    main.main()
            /path/to/main.go:10 +0x85
    10:39:12.713035 Unable to run: launch runner: wait for command: exit status 2
Alright, that was expected! The applied seccomp profile blocks the uname syscall, which results in an "operation not permitted" error. This error is pretty generic and does not provide any hint about what got blocked by seccomp. It is generally extremely difficult to predict how applications behave if single syscalls are forbidden by seccomp. The application may terminate like in our simple demo, but it could also misbehave in strange ways and not stop at all.

If I now change the default seccomp action of the profile from SCMP_ACT_ERRNO to SCMP_ACT_LOG like this:

    > jq '.defaultAction = "SCMP_ACT_LOG"' /tmp/no-uname-profile.json > /tmp/no-uname-profile-log.json
Then the log enricher will give us a hint that the uname syscall got blocked when using spoc run:

    > sudo ./spoc run -p /tmp/no-uname-profile-log.json ./main
    10:48:07.470126 Reading file /tmp/no-uname-profile-log.json
    10:48:07.470234 Setting up seccomp
    10:48:07.470245 Load seccomp profile
    10:48:07.470302 Starting audit log enricher
    10:48:07.470339 Enricher reading from file /var/log/audit/audit.log
    10:48:07.470889 Running command with PID: 522268
    10:48:07.472007 Seccomp: uname (63)
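The final log line, "Seccomp: uname (63)", is derived from an audit record. As a rough illustration of that kind of correlation, the Go sketch below extracts the PID and syscall number from a SECCOMP audit line and prints a similar message. It is not the enricher's real code: the sample line is hand-written in the usual Linux audit style and its field values are illustrative, the names event, parseSeccompLine and describe are hypothetical, and the lookup table contains only the one syscall number confirmed by the output above.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
)

// sampleLine is a hand-written example of a SECCOMP audit record.
// The exact set of fields on a real system may differ.
const sampleLine = `type=SECCOMP msg=audit(1684400887.472:1234): auid=1000 uid=0 gid=0 ses=1 pid=522268 comm="main" exe="/path/to/main" sig=0 arch=c000003e syscall=63 compat=0 ip=0x46b1a0 code=0x7ffc0000`

// syscallNames holds only the x86_64 number seen in the spoc output.
// A real tool needs a complete table per architecture.
var syscallNames = map[int]string{63: "uname"}

var (
	typeRe    = regexp.MustCompile(`^type=SECCOMP\b`)
	pidRe     = regexp.MustCompile(`\bpid=(\d+)`)
	syscallRe = regexp.MustCompile(`\bsyscall=(\d+)`)
)

// event is the subset of an audit record needed for the correlation.
type event struct {
	PID     int
	Syscall int
}

// parseSeccompLine extracts PID and syscall number from a SECCOMP audit
// line. It reports false for other record types or incomplete lines.
func parseSeccompLine(line string) (event, bool) {
	if !typeRe.MatchString(line) {
		return event{}, false
	}
	pid := pidRe.FindStringSubmatch(line)
	sc := syscallRe.FindStringSubmatch(line)
	if pid == nil || sc == nil {
		return event{}, false
	}
	p, errPID := strconv.Atoi(pid[1])
	s, errSyscall := strconv.Atoi(sc[1])
	if errPID != nil || errSyscall != nil {
		return event{}, false
	}
	return event{PID: p, Syscall: s}, true
}

// describe renders an event in the style of the spoc run output.
func describe(ev event) string {
	name, ok := syscallNames[ev.Syscall]
	if !ok {
		name = "unknown"
	}
	return fmt.Sprintf("Seccomp: %s (%d)", name, ev.Syscall)
}

func main() {
	const tracedPID = 522268 // PID reported by "Running command with PID"

	// Only report records that belong to the traced process.
	if ev, ok := parseSeccompLine(sampleLine); ok && ev.PID == tracedPID {
		fmt.Println(describe(ev)) // prints: Seccomp: uname (63)
	}
}
```

Filtering on the PID is what ties an audit record back to the program that was started, since the audit log contains events from the whole system.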
The application will not terminate any more, but seccomp will log the behavior to /var/log/audit/audit.log and spoc will parse the data to correlate it directly to our program. Generating the log messages to the audit subsystem comes with a large performance overhead and should be handled with care in production systems. It also comes with a security risk when running untrusted apps in audit mode in production environments.

This demo should give you an impression of how to debug seccomp profile issues with applications, probably by using our shiny new helper tool powered by the features of the Security Profiles Operator. spoc is a flexible and portable binary suitable for edge cases where resources are limited and even Kubernetes itself may not be available with its full capabilities.

Thank you for reading this blog post! If you're interested in more, providing feedback or asking for help, then feel free to get in touch with us directly via Slack (#security-profiles-operator) or the mailing list.
https://kubernetes.io/blog/2023/05/18/seccomp-profiles-edge/