Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: [sc-118372] /var/lib/embedded-cluster is not exec by other non-root user #1729

Merged
merged 28 commits into from
Feb 18, 2025

Conversation

nvanthao
Copy link
Member

@nvanthao nvanthao commented Jan 23, 2025

Story details: https://app.shortcut.com/replicated/story/118372

Problem

When creating directories using os.MkdirAll, the final permissions are affected by the system's umask. This behavior is undesirable because it can lead to permission issues when other processes attempt to access or modify files in our data directory (/var/lib/embedded-cluster). Specifically, directories may end up with restrictive permissions that prevent necessary access by other services or processes.

Solution

To ensure consistent and correct directory permissions, this PR sets the umask to 022 before beginning execution of the install, restore, or join commands.

Demo

Before: https://asciinema.org/a/6BIISglNrpFdvONP03d683q15

ls -lah /var/lib/embedded-cluster/                                                                                  
total 24K                                                                                                                                            
drwx------  6 root root 4.0K Jan 22 23:54 .                                                                                                          
drwxr-xr-x 39 root root 4.0K Jan 22 23:54 ..                                                                                                         
drwx------  2 root root 4.0K Jan 22 23:54 bin                                                                                                        
drwx------  3 root root 4.0K Jan 22 23:54 k0s                                                                                                        
drwx------  2 root root 4.0K Jan 22 23:54 support                                                                                                    
drwx------  2 root root 4.0K Jan 22 23:54 tmp

After: https://asciinema.org/a/GZCkeoOtPuW0uOPhCDP6EuXdc

ls -lah /var/lib/embedded-cluster/
total 36K
drwxr-xr-x 6 root root 4.0K Jan 23 02:46 .
drwxr-xr-x 1 root root 4.0K Jan 23 02:20 ..
drwxr-xr-x 2 root root 4.0K Jan 23 02:46 bin
drwxr-xr-x 3 root root 4.0K Jan 23 02:46 k0s
drwx------ 2 root root 4.0K Jan 23 02:46 support
drwxr-xr-x 2 root root 4.0K Jan 23 02:46 tmp

Copy link

github-actions bot commented Jan 23, 2025

This PR has been released (on staging) and is available for download with a embedded-cluster-smoke-test-staging-app license ID.

Online Installer:

curl "https://staging.replicated.app/embedded/embedded-cluster-smoke-test-staging-app/ci/appver-dev-15d4856" -H "Authorization: $EC_SMOKE_TEST_LICENSE_ID" -o embedded-cluster-smoke-test-staging-app-ci.tgz

Airgap Installer (may take a few minutes before the airgap bundle is built):

curl "https://staging.replicated.app/embedded/embedded-cluster-smoke-test-staging-app/ci-airgap/appver-dev-15d4856?airgap=true" -H "Authorization: $EC_SMOKE_TEST_LICENSE_ID" -o embedded-cluster-smoke-test-staging-app-ci.tgz

Happy debugging!

pkg/helpers/fs.go Outdated Show resolved Hide resolved
Copy link
Member

@sgalsaleh sgalsaleh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's many other places in the code that use os.MkdirAll that I think we should change as well

pkg/helpers/fs.go Outdated Show resolved Hide resolved
@nvanthao
Copy link
Member Author

I've added more dir but only for dirs under EmbeddedClusterHomeDirectory as it could be use by other users created by k0s.

@banjoh
Copy link
Member

banjoh commented Feb 4, 2025

I think we should modify these as well since they are code paths in the install workflow

@emosbaugh / @sgalsaleh is there any harm in applying this change to all os.MkdirAll in the codebase?

@sgalsaleh
Copy link
Member

@banjoh no, that should be fine I think.

Copy link
Member

@banjoh banjoh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I ran an install using a binary built from this PR but the install failed. Here is my install attempt

root@evans-vm1:/home/evans# umask 0077
root@evans-vm1:/home/evans# ./embedded-cluster install
? Set the Admin Console password (minimum 6 characters): ******
? Confirm the Admin Console password: ******
✔  Host files materialized!
✔  Host preflights succeeded!
✔  Waiting for embedded-cluster node to be ready
unable to wait for node: unable to get status: exit status 1: Error: status: can't get "status" via "/run/k0s/status.sock": Get "http://localhost/status": dial unix /run/k0s/status.sock: connect: connection refused

I picked a number of paths where the permissions were not right such as

  • drwx------ root /var/lib/embedded-cluster/
  • All files in /var/lib/embedded-cluster/k0s/bin have -r-xr-x--- root. They do not have global execute permission

Since the install failed very early on, there might have been some other paths not set right later on.

Changes needed

  • I addition to the changes here, I think we should consider setting 0022 umask before the install/join starts using unix.Umask(0022). This will be a good safeguard to ensure we do not miss created files as well. This PR just addresses directories.
  • We need to test joining a new node.
  • We need cover install and joining in e2e tests where the umask used is 0077. I choose this umask one since its what recommendations such as STIG/CIS document.

@sgalsaleh
Copy link
Member

@banjoh did you resolve conflicts first before you built the binary?

@banjoh
Copy link
Member

banjoh commented Feb 5, 2025

@banjoh did you resolve conflicts first before you built the binary?

Hmm. No. Good catch. I'll do that then run another test.

@banjoh
Copy link
Member

banjoh commented Feb 5, 2025

            umask := 0022
            oldUmask := unix.Umask(umask)
            if oldUmask != umask {
                logrus.Debugf("seting umask to %04o from %04o", umask, oldUmask)
            }

@nvanthao I added this change to func RootCmd(ctx context.Context, name string) *cobra.Command to test working how the install works. It succeeded with your last commit (before the merge done above - git checkout 1d31f87d1a35d7415fcfe913315fc91e8b23d7fa)

@laverya
Copy link
Member

laverya commented Feb 5, 2025

    install_test.go:1071: set umask stdout: 'umask 0077' >> /etc/profile
    install_test.go:1072: stderr: 
    install_test.go:1078: check umask stdout: 0022
    install_test.go:1079: stderr: 

evidently my test did not work

@banjoh banjoh force-pushed the gerard/sc-118372/chmod branch from d4d863b to c9fe8c3 Compare February 6, 2025 12:32
banjoh
banjoh previously approved these changes Feb 12, 2025
@@ -68,6 +69,9 @@ func RootCmd(ctx context.Context, name string) *cobra.Command {
metrics.DisableMetrics()
}

// set the umask to 022 so that we can create files/directories with 755 permissions
_ = syscall.Umask(0o022)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i assume having this in root isn't sufficient, can you please add a comment to the other ones explaining why?

also, are there no other commands that generate dirs and files other than install, join, and restore? what about materialize? isn't it used by LAM? should we set umask for materialize and LAM cli commands as well? operator cli?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll add it to the various 'main' functions, and try to come up with a better comment for it

@@ -171,6 +172,9 @@ func preRunInstall(cmd *cobra.Command, flags *InstallCmdFlags) error {
return fmt.Errorf("install command must be run as root")
}

// set the umask to 022 so that we can create files/directories with 755 permissions
_ = syscall.Umask(0o022)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you at the very least logrus.Debug this error?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it doesn't return an error, it returns the previous umask value

@@ -96,6 +97,9 @@ func preRunJoin(flags *JoinCmdFlags) error {

flags.isAirgap = flags.airgapBundle != ""

// set the umask to 022 so that we can create files/directories with 755 permissions
_ = syscall.Umask(0o022)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

logrus.Debug the error

@@ -68,6 +69,9 @@ func RootCmd(ctx context.Context, name string) *cobra.Command {
metrics.DisableMetrics()
}

// set the umask to 022 so that we can create files/directories with 755 permissions
_ = syscall.Umask(0o022)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

logrus.Debug the error

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The return value is not an error. Its the previous umask.

@laverya laverya enabled auto-merge (squash) February 18, 2025 16:26
@laverya laverya merged commit 0ac84ec into main Feb 18, 2025
65 checks passed
@laverya laverya deleted the gerard/sc-118372/chmod branch February 18, 2025 16:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants