Commit Graph

293 Commits

Author SHA1 Message Date
Sascha Grunert 4a4d4f109b Add support for seccomp actions ActKillThread and ActKillProcess
Two new seccomp actions have been added to the libseccomp-golang
dependency, which can now be supported by runc, too.

ActKillThread kills the thread that violated the rule. It is the same as
ActKill. All other threads from the same thread group will continue to
execute.

ActKillProcess kills the process that violated the rule. All threads in
the thread group are also terminated. This action is only usable when
libseccomp API level 3 or higher is supported.

Signed-off-by: Sascha Grunert <sgrunert@redhat.com>
Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
2021-09-09 17:47:00 +10:00
Alban Crequy 2b025c0173 Implement Seccomp Notify
This commit implements support for the SCMP_ACT_NOTIFY action. It
requires libseccomp-2.5.0 to work but runc still works with older
libseccomp if the seccomp policy does not use the SCMP_ACT_NOTIFY
action.

A new synchronization step between runc[INIT] and runc run is introduced
to pass the seccomp fd. runc run fetches the seccomp fd with pidfd_getfd
from the runc[INIT] process and sends it to the seccomp agent using
SCM_RIGHTS.

As suggested by @kolyshkin, we also make writeSync() a wrapper of
writeSyncWithFd() and wrap the error there. To avoid pointless errors,
we made some existing code paths just return the error instead of
re-wrapping it. If we don't do that, the error will look like:

	writing syncT <act>: writing syncT: <err>

With the code paths adjusted, the errors now look like this:
	writing syncT <act>: <err>

Signed-off-by: Alban Crequy <alban@kinvolk.io>
Signed-off-by: Rodrigo Campos <rodrigo@kinvolk.io>
Co-authored-by: Rodrigo Campos <rodrigo@kinvolk.io>
2021-09-07 13:04:24 +02:00
Akihiro Suda bd75bc2dc6 Merge pull request #3176 from kolyshkin/rm-config-error-alt
libct/error.go: rm ConfigError (alt)
2021-09-02 14:34:32 +09:00
Kir Kolyshkin d8da00355e *: add go-1.17+ go:build tags
Go 1.17 introduces this new (and better) way to specify build tags.
For more info, see https://golang.org/design/draft-gobuild.

As a way to seamlessly switch from old to new build tags, gofmt (and
gopls) from go 1.17 adds the new tags along with the old ones.

Later, when go < 1.17 is no longer supported, the old build tags
can be removed.

Now, as I started to use latest gopls (v0.7.1), it adds these tags
while I edit. Rather than randomly adding new build tags, I guess
it is better to do it once for all files.

Mind that previous commits removed some tags that were useless,
so this one only touches packages that can at least be built
on non-linux.

Brought to you by

        go1.17 fmt ./...

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2021-08-30 20:58:22 -07:00
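For readers unfamiliar with the two syntaxes mentioned in commit d8da00355e, a file carrying both the new and the old constraint looks roughly like this. The file contents and the platformNote function are hypothetical; the constraint is negated (!windows) only so the example builds on most systems.

```go
//go:build !windows
// +build !windows

package main

import "fmt"

// platformNote is compiled only on non-Windows systems because of the
// build constraints at the top of this file. Go 1.17+ reads the
// //go:build line; older toolchains read the // +build line.
func platformNote() string {
	return "built with the !windows constraint"
}

func main() {
	fmt.Println(platformNote())
}
```

The blank line between the constraints and the package clause is required; without it the lines are treated as ordinary package documentation.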
Kir Kolyshkin 3c7db3827c Merge pull request #2883 from flouthoc/master
Add support for rdma cgroup introduced in Linux Kernel 4.11
2021-08-30 20:02:04 -07:00
Kir Kolyshkin 62ec6dc973 Merge pull request #2920 from marquiz/devel/rdt
libcontainer/intelrdt: support ClosID parameter
2021-08-30 19:36:03 -07:00
Qiang Huang b4b797200e Merge pull request #3136 from kolyshkin/cg-d-c
libct/cg: rm dead code to improve clarity
2021-08-25 14:46:27 +08:00
Kir Kolyshkin 6145628fff configs/validate: audit all returned errors
All the errors returned from Validate should tell about a configuration
error. Some were lacking a context, so add it.

While at it, fix the misuse of fmt.Errorf and logrus.Warnf where the
arguments do not contain %-style formatting.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2021-08-23 18:54:47 -07:00
flouthoc b3d14488b5 Add support for rdma cgroup introduced in Linux Kernel 4.11
Signed-off-by: Aditya Rajan <flouthoc.git@gmail.com>
2021-08-23 12:25:33 +05:30
Kir Kolyshkin 9a095e44db libct/cg/sd/v1: add SkipFreezeOnSet knob
This is helpful to kubernetes in cases it knows for sure that the freeze
is not required (since it created the systemd unit with no device
restrictions).

As the code is trivial, no tests are required.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2021-08-18 12:43:55 -07:00
Markus Lehtonen 17e3b41dd0 libcontainer/intelrdt: support ClosID parameter
Handle ClosID parameter of IntelRdt. Makes it possible to use
pre-configured classes/ClosIDs and avoid running out of available IDs
which easily happens with per-container classes.

Remove validator checks for empty L3CacheSchema and MemBwSchema fields
in order to be able to leave them empty, and only specify ClosID for
a pre-configured class.

Signed-off-by: Markus Lehtonen <markus.lehtonen@intel.com>
2021-08-09 15:58:03 +03:00
Kir Kolyshkin 1cbfe23464 libct/cg: rm dead code
This was initially added by commits 41d9d26513 and 4a8f0b4db4,
apparently to implement docker run --cgroup container:ID, which was
never merged. Therefore, this code is not and was never used.

It needs to be removed mainly because having it makes it much harder to
understand how cgroup manager works (because with this in place we have
not one or two but three sets of cgroup paths to think about).

Note that if the paths are known and a PID needs to be added to an existing
cgroup, cgroup manager is not needed at all -- something like
cgroups.WriteCgroupProc or cgroups.EnterPid is sufficient (and the
latter is what runc exec uses in (*setnsProcess).start).

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2021-08-08 13:03:51 -07:00
Kir Kolyshkin a91ce3062f libct/*_test.go: use t.TempDir
Replace ioutil.TempDir (mostly) with t.TempDir, which requires no
explicit cleanup.

While at it, fix incorrect usage of os.ModePerm in libcontainer/intelrdt
test. This is supposed to be a mask, not mode bits.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2021-07-27 01:41:47 -07:00
Kir Kolyshkin a7cfb23b88 *: stop using pkg/errors
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2021-06-22 16:09:47 -07:00
Kir Kolyshkin 7be93a66b9 *: fmt.Errorf: use %w when appropriate
This should result in no change when the error is printed, but make the
errors returned unwrappable, meaning errors.As and errors.Is will work.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2021-06-22 16:09:47 -07:00
Kir Kolyshkin 627a06ad92 Replace fmt.Errorf w/o %-style to errors.New
Using fmt.Errorf for errors that do not have %-style formatting
directives is overkill. Switch to errors.New.

Found by

	git grep fmt.Errorf | grep -v ^vendor | grep -v '%'

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2021-06-22 11:42:07 -07:00
Sebastiaan van Stijn b31a9340f9 libcontainer: relax validation for absolute paths
Commits 1f1e91b1a0 and 2192670a24
added validation for mountpoints to be an absolute path, to match the OCI
specs.

Unfortunately, the old behavior (accepting the path to be a relative path)
has been around for a long time, and although "not according to the spec",
various higher level runtimes rely on this behavior.

While higher level runtimes have been updated to address this requirement,
there will be a transition period before all runtimes are updated to carry
these fixes.

This patch relaxes the validation to generate a WARNING instead of failing,
giving runtimes time to update while still allowing them to update runc to
the current version, which includes security fixes.

We can remove this exception in a future patch release.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2021-06-09 13:20:28 +02:00
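The relaxed behaviour described in commit b31a9340f9 amounts to collecting warnings where an error used to be returned. A minimal sketch, assuming a hypothetical checkMountDestinations function and warning text (neither is taken from runc):

```go
package main

import (
	"fmt"
	"path/filepath"
)

// checkMountDestinations returns one warning per destination that is not
// an absolute path. It deliberately never fails, which mirrors the idea
// of warning instead of rejecting the configuration.
func checkMountDestinations(dests []string) []string {
	var warnings []string
	for _, d := range dests {
		if !filepath.IsAbs(d) {
			warnings = append(warnings,
				fmt.Sprintf("mount destination %q is not an absolute path", d))
		}
	}
	return warnings
}

func main() {
	for _, w := range checkMountDestinations([]string{"/proc", "dev/shm"}) {
		fmt.Println("WARNING:", w)
	}
}
```

A caller would log the returned strings and continue, so configurations with relative destinations keep working during the transition period.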
Sebastiaan van Stijn dbb35411f8 configs/validator: move cgroup validation to the list of checks
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2021-06-09 13:15:54 +02:00
Kir Kolyshkin 2f8e8e9d97 Merge pull request #2994 from kolyshkin/skip-devices-on-update
runc update: skip devices
2021-06-04 15:33:26 -07:00
Sebastiaan van Stijn 3f23a736cb libcontainer/configs: remove stubs for deprecated Devices funcs
These were deprecated and moved; the stubs were included in the
last two (rc94, rc95) releases, so external consumers would have
the chance to update their code.

Remove them so that they don't get into v1.0.0 GA.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2021-06-04 17:46:25 +02:00
Kir Kolyshkin bf7492ee5d runc update: skip devices
The runc update CLI is not able to modify devices, so let's set SkipDevices
(so that a cgroup controller won't try to update devices cgroup).

This helps use cases when some other device management (NVIDIA GPUs)
applies its configuration on top of what runc does.

Make sure we do not save SkipDevices into state.json.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2021-06-03 10:40:55 -07:00
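One common way to keep a field such as SkipDevices out of a serialized state file is the `json:"-"` struct tag. The sketch below assumes that approach; the resources struct is a hypothetical, heavily trimmed stand-in and not runc's actual type.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// resources is a hypothetical, trimmed-down resources struct.
type resources struct {
	Memory int64 `json:"memory"`
	// SkipDevices is a runtime-only knob; the "-" tag keeps it out of
	// any JSON produced from this struct.
	SkipDevices bool `json:"-"`
}

// marshalState serializes the resources the way a state file might be written.
func marshalState(r resources) (string, error) {
	b, err := json.Marshal(r)
	return string(b), err
}

func main() {
	s, err := marshalState(resources{Memory: 1024, SkipDevices: true})
	if err != nil {
		fmt.Println("marshal failed:", err)
		return
	}
	fmt.Println(s) // {"memory":1024}
}
```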
Sebastiaan van Stijn e204d6a9e7 libcontainer/configs: add / fix godoc (golint)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2021-06-02 17:44:11 +02:00
Kir Kolyshkin e6048715e4 Use gofumpt to format code
gofumpt (mvdan.cc/gofumpt) is a fork of gofmt with stricter rules.

Brought to you by

	git ls-files \*.go | grep -v ^vendor/ | xargs gofumpt -s -w

Looking at the diff, all these changes make sense.

Also, replace gofmt with gofumpt in golangci.yml.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2021-06-01 12:17:27 -07:00
Aleksa Sarai ed4781029f merge branch 'pr-2781'
Sebastiaan van Stijn (7):
  errcheck: utils
  errcheck: signals
  errcheck: tty
  errcheck: libcontainer
  errcheck: libcontainer/nsenter
  errcheck: libcontainer/configs
  errcheck: libcontainer/integration

LGTM: AkihiroSuda cyphar
Closes #2781
2021-05-25 12:31:52 +10:00
Aleksa Sarai c7c70ce810 *: clean t.Skip messages
Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
2021-05-23 17:53:01 +10:00
Sebastiaan van Stijn 7e7ff8722a errcheck: libcontainer/configs
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2021-05-20 14:17:45 +02:00
Giuseppe Scrivano c61f606254 libcontainer: honor seccomp defaultErrnoRet
https://github.com/opencontainers/runtime-spec/pull/1087 added support
for defaultErrnoRet to the OCI runtime specs.

If a defaultErrnoRet is specified, disable patching the generated
libseccomp cBPF.

Closes: https://github.com/opencontainers/runc/issues/2943

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2021-05-17 09:23:32 +02:00
Kir Kolyshkin 2192670a24 libct/configs/validate: validate mounts
Add a check that mount destination is absolute (as per OCI spec).

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2021-04-20 11:48:44 -07:00
Qiang Huang 2d38476c96 Merge pull request #2840 from kolyshkin/ignore-kmem
Ignore kernel memory settings
2021-04-13 09:44:14 +08:00
Kir Kolyshkin 52390d6804 Ignore kernel memory settings
This is a somewhat radical approach to deal with kernel memory.

Per-cgroup kernel memory limiting was always problematic. A few
examples:

 - older kernels had bugs and were even oopsing sometimes (best example
   is RHEL7 kernel);
 - kernel is unable to reclaim the kernel memory so once the limit is
   hit a cgroup is toasted;
 - some kernel memory allocations don't allow failing.

In addition to that,

 - users don't have a clue about how to set kernel memory limits
   (as the concept is much more complicated than e.g. [user] memory);
 - different kernels might have different kernel memory usage,
   which is sort of unexpected;
 - cgroup v2 does not have a [dedicated] kmem limit knob, and thus
   runc silently ignores kernel memory limits for v2;
 - kernel v5.4 made cgroup v1 kmem.limit obsolete (see
   https://github.com/torvalds/linux/commit/0158115f702b).

In view of all this, and as the runtime-spec lists memory.kernel
and memory.kernelTCP as OPTIONAL, let's ignore kernel memory
limits (for cgroup v1, same as we're already doing for v2).

This should result in fewer bugs and a better user experience.

The only bad side effect from it might be that stat can show kernel
memory usage as 0 (since the accounting is not enabled).

[v2: add a warning in specconv that limits are ignored]

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2021-04-12 12:18:11 -07:00
Akihiro Suda d8a5f6084a Merge pull request #2885 from thaJeztah/config_missing_type
libcontainer/configs: add missing type for hooknames
2021-04-06 13:24:28 +09:00
Akihiro Suda 913b9f14e8 Merge pull request #2886 from thaJeztah/check_cleanup
2021-04-06 03:28:45 +09:00
Kir Kolyshkin 365c6282c7 Merge pull request #2888 from thaJeztah/fixup_rm_win_carry
libcontainer: rm windows pieces (carry #2700)
2021-04-03 19:10:36 -07:00
Kir Kolyshkin b1deba8c5a libcontainer/configs/config_windows_test.go: rm
Nothing is in there, so removing.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2021-04-02 11:55:33 +02:00
Sebastiaan van Stijn f1586dbd7a libcontainer/configs/validate: make Validate() less DRY
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2021-04-02 11:41:19 +02:00
Sebastiaan van Stijn 4126b807cc libcontainer/configs: add missing type for hooknames
Commit ccdd75760c introduced the HookName type
for hooks, but only set this type on the Prestart const, but not for the
other hooks.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2021-04-02 11:08:24 +02:00
Kir Kolyshkin b118430231 libct/configs/validator: add some cgroup support
Add some minimal validation for cgroups. The following checks
are implemented:

 - cgroup name and/or prefix (or path) is set;
 - for cgroup v1, unified resources are not set;
 - for cgroup v2, if memorySwap is set, memory is also set,
   and memorySwap > memory.

This makes some invalid configurations fail earlier (before runc init
is started), which is better.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2021-03-31 14:36:52 -07:00
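The three checks listed in commit b118430231 can be sketched as a single function. Everything here is an assumption made for illustration: the cgroupConfig type, its field names, the error messages, and the simplification that a value greater than zero means "set".

```go
package main

import (
	"errors"
	"fmt"
)

// cgroupConfig is a hypothetical, minimal cgroup configuration.
type cgroupConfig struct {
	Name, ScopePrefix, Path string
	Memory, MemorySwap      int64
	Unified                 map[string]string
}

// validateCgroup applies the three checks from the commit message.
// cgroupV2 tells the function which cgroup version the host uses.
func validateCgroup(c cgroupConfig, cgroupV2 bool) error {
	// Check 1: some way of locating the cgroup must be given.
	if c.Name == "" && c.ScopePrefix == "" && c.Path == "" {
		return errors.New("cgroup: name, prefix, or path must be set")
	}
	if !cgroupV2 {
		// Check 2: unified resources make no sense on cgroup v1.
		if len(c.Unified) > 0 {
			return errors.New("cgroup: unified resources require cgroup v2")
		}
		return nil
	}
	// Check 3: on v2, swap requires memory and must exceed it.
	if c.MemorySwap > 0 {
		if c.Memory <= 0 {
			return errors.New("cgroup: memory swap is set without memory")
		}
		if c.MemorySwap <= c.Memory {
			return fmt.Errorf("cgroup: memory swap (%d) must be greater than memory (%d)",
				c.MemorySwap, c.Memory)
		}
	}
	return nil
}

func main() {
	bad := cgroupConfig{Name: "ctr", Memory: 200, MemorySwap: 100}
	fmt.Println(validateCgroup(bad, true))
}
```

Running such checks before the init process is started is what lets invalid configurations fail early, as the commit message notes.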
AdamKorcz 2ae5665351 Move fuzzers upstream
Signed-off-by: AdamKorcz <adam@adalogics.com>
2021-03-09 10:07:11 +00:00
Kir Kolyshkin c342872276 libct/config: fix a data race
As reported by go test -race ./libcontainer/configs:

=== RUN   TestCommandHookRunTimeout
==================
WARNING: DATA RACE
Read at 0x00c000202230 by goroutine 23:
  os/exec.(*Cmd).Wait()
      /usr/lib/golang/src/os/exec/exec.go:502 +0x91
  github.com/opencontainers/runc/libcontainer/configs.Command.Run()
      /home/kir/go/src/github.com/opencontainers/runc/libcontainer/configs/config.go:390 +0x58c
  github.com/opencontainers/runc/libcontainer/configs_test.TestCommandHookRunTimeout()
      /home/kir/go/src/github.com/opencontainers/runc/libcontainer/configs/config_test.go:223 +0x3ed
  testing.tRunner()
      /usr/lib/golang/src/testing/testing.go:1123 +0x202

Previous write at 0x00c000202230 by goroutine 27:
  os/exec.(*Cmd).Wait()
      /usr/lib/golang/src/os/exec/exec.go:505 +0xb4
  github.com/opencontainers/runc/libcontainer/configs.Command.Run.func1()
      /home/kir/go/src/github.com/opencontainers/runc/libcontainer/configs/config.go:373 +0x55

Goroutine 23 (running) created at:
  testing.(*T).Run()
      /usr/lib/golang/src/testing/testing.go:1168 +0x5bb
  testing.runTests.func1()
      /usr/lib/golang/src/testing/testing.go:1439 +0xa6
  testing.tRunner()
      /usr/lib/golang/src/testing/testing.go:1123 +0x202
  testing.runTests()
      /usr/lib/golang/src/testing/testing.go:1437 +0x612
  testing.(*M).Run()
      /usr/lib/golang/src/testing/testing.go:1345 +0x3b3
  main.main()
      _testmain.go:69 +0x236

Goroutine 27 (running) created at:
  github.com/opencontainers/runc/libcontainer/configs.Command.Run()
      /home/kir/go/src/github.com/opencontainers/runc/libcontainer/configs/config.go:372 +0x415
  github.com/opencontainers/runc/libcontainer/configs_test.TestCommandHookRunTimeout()
      /home/kir/go/src/github.com/opencontainers/runc/libcontainer/configs/config_test.go:223 +0x3ed
  testing.tRunner()
      /usr/lib/golang/src/testing/testing.go:1123 +0x202
==================
    testing.go:1038: race detected during execution of test
--- FAIL: TestCommandHookRunTimeout (0.10s)

Apparently, the issue is we call two Wait()s for the same command
which can race internally.

The fix is easy -- since we already have a waiting goroutine,
wait for it to return instead of calling a second Wait().

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2021-02-05 13:02:47 -08:00
Mauricio Vásquez 2be806d139 libcontainer/configs: improve CommandHook unit tests
Test that CommandHook actually executes a new process with the given env
variables, parameters and json state.

This commit also solves an issue with the previous approach that was calling
'os.Exit(0)' failing to signal test failures.

Signed-off-by: Mauricio Vásquez <mauricio@kinvolk.io>
2021-01-26 13:01:41 -05:00
Kir Kolyshkin cb3dd9d8c7 libct/configs/validate: test for bind-mounted netns
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2020-12-18 09:32:46 -08:00
Kir Kolyshkin 8e8661e124 libct/configs/validate/sysctl: fix repeated netns checks
In case many net.* sysctls are provided, and we're not running
in the host netns, the function keeps repeating the isNetNS check
for every such sysctl. This is a waste of resources.

Do the isNetNS check only once, and only if needed.

Note that using sync.Once() is not really needed here; we could
have used a boolean variable to skip the repeated check, but
it looks more idiomatic that way.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2020-12-18 09:32:39 -08:00
Kir Kolyshkin 2dce06995a libct/configs/validate: fix host netns check
In case an nsfs mount (such as /run/docker/netns/xxxx) is provided as
the netns path, the current way of determining whether the path refers to
the host netns does not work.

The proper way to check is to do stat(2) and compare dev_t and
inode fields, which is what this commit does.

This is a minimal fix which does not try to optimize repeated
check in case more than one net.* sysctl is given and there is
no error.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2020-12-18 09:29:13 -08:00
Sebastiaan van Stijn 4fc2de77e9 libcontainer/devices: remove "Device" prefix from types
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2020-12-01 11:11:23 +01:00
Sebastiaan van Stijn 677baf22d2 libcontainer: isolate libcontainer/devices
Move the Device-related types to libcontainer/devices, so that
the package can be used in isolation. Aliases have been created
in libcontainer/configs for backward compatibility.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2020-12-01 11:11:21 +01:00
Xiaochen Shen f62ad4a0de libcontainer/intelrdt: rename CAT and MBA enabled flags
Rename CAT and MBA enabled flags to be consistent with others.
No functional change.

Signed-off-by: Xiaochen Shen <xiaochen.shen@intel.com>
2020-11-10 15:32:01 +08:00
Amim Knabben 978fa6e906 Fixing some lint issues
Signed-off-by: Amim Knabben <amim.knabben@gmail.com>
2020-10-06 14:44:14 -04:00
Kenta Tada faaecac77d libcontainer: remove loadConfig which is the unused function
Signed-off-by: Kenta Tada <Kenta.Tada@sony.com>
2020-10-01 12:07:49 +09:00
Sebastiaan van Stijn 8bf216728c use string-concatenation instead of sprintf for simple cases
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2020-09-30 10:51:59 +02:00
Kir Kolyshkin b006f4a180 libct/cgroups: support Cgroups.Resources.Unified
Add support for unified resource map (as per [1]), and add some test
cases for the new functionality.

[1] https://github.com/opencontainers/runtime-spec/pull/1040

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2020-09-24 15:29:35 -07:00