General best-practices cleanup #51

charles-dyfis-net · 2022-12-28T17:39:39Z

Changes proposed in this pull request:

- Avoid calling jq more than once per operation. To do this, we have jq generate assignments with eval-safe escaping, setting all required shell variables from a single invocation.
- Avoid passing anything but options through eval (keeping options themselves from being eval'd would be a compatibility-breaking change, and is the subject of #50, "Use of eval introduces possibility of shell injection unnecessarily"). The prior logic, which ran eval aws s3 sync "s3://$bucket/$path" $dest $options, gained no benefit from the quotes around s3://$bucket/$path: eval concatenates its arguments into a single string (after quote removal has already discarded the syntactic quotes) and then starts the parsing process over from the beginning. The new logic no longer uses eval for the entire command; it uses eval only when processing $options into "$@".
- Avoid using echo to handle non-constant data (which introduces substantial portability problems across different implementations of sh; see https://unix.stackexchange.com/a/65819/3113 and https://pubs.opengroup.org/onlinepubs/9699919799/utilities/echo.html). Also avoid echo $variable where $variable is unquoted, which can result in values being glob-expanded before being passed to echo, as well as characters in IFS being transformed to spaces, with runs of IFS whitespace collapsed to a single space.
- Handle a "null" return value from aws s3api list-objects gracefully (fixes #49, "Check fails if bucket is empty").
- Quote "$0" inside "$(dirname "$0")"; the modern command substitution format used here introduces a new quoting context, so directory names with spaces (or an unexpected IFS value) could result in dirname being passed an unexpected number of arguments with the old code.
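A minimal sketch of the eval re-parsing problem described in the list above, assuming a hypothetical count_args helper in place of aws s3 sync, made-up bucket, path, and options values, and an assumed `eval "set -- $options"` form for loading options into "$@" (the PR's exact code may differ):

```shell
#!/bin/sh
# count_args is a hypothetical stand-in for the real command:
# it prints how many arguments it received.
count_args() { printf '%s\n' "$#"; }

bucket='my-bucket'          # illustrative value
path='dir with spaces'      # illustrative value

# Old pattern: eval joins its arguments with spaces and parses the result
# again, so the quotes around the URL no longer keep it in one word.
n_eval=$(eval count_args "s3://$bucket/$path")

# New pattern: the URL never passes through eval and stays a single argument.
n_direct=$(count_args "s3://$bucket/$path")

# Only the options string is eval'd, to populate "$@" (assumed form).
options="--exclude '*.tmp' --delete"
eval "set -- $options"
n_options=$#
second_opt=$2

printf 'eval: %s, direct: %s, options: %s\n' "$n_eval" "$n_direct" "$n_options"
# prints: eval: 3, direct: 1, options: 3
```

With this shape, a space or shell metacharacter in the bucket or path can no longer change how the command is parsed; only the contents of $options are still interpreted as shell syntax.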

Security considerations

Because we deliberately do not modify the behavior described in #50, the preexisting opportunity for shell injection via $options remains intact. However, we do narrow the eval so that it covers only $options, so it's no longer possible to perform shell injection via $bucket or $path.

In every other respect, this PR works to reduce runtime ambiguity, and thus to also reduce attack surface.

- Avoid calling jq more than once per operation. To do this, we have `jq` generate assignments with eval-safe escaping setting shell variables.
- Avoid passing anything but options through eval (keeping options themselves from being eval'd will be a compatibility-breaking change, and is the subject of cloud-gov#50).
- Avoid using `echo` to handle non-constant data (which introduces substantial portability problems across different implementations of `sh`; see https://unix.stackexchange.com/a/65819/3113 and https://pubs.opengroup.org/onlinepubs/9699919799/utilities/echo.html).
- Handle "null" return value from `aws s3api list-objects` gracefully (cloud-gov#49)

markdboyd · 2023-01-03T16:00:35Z

assets/check

-payload=`cat`
-bucket=$(echo "$payload" | jq -r '.source.bucket')
-prefix="$(echo "$payload" | jq -r '.source.path // ""')"
+eval "$(jq -r '.source | [


So the reason to prefer using eval over the existing code is this:

Avoid calling jq more than once per operation. To do this, we have jq generate assignments with eval-safe escaping setting all required shell variables from a single invocation.

The code is definitely more efficient, but it's the "eval-safe escaping" part that worries me, since there are numerous articles identifying the perils of eval.

I'm also a bit confused given #50. If that issue is saying that eval presents a shell-injection risk, which is my primary concern, then why refactor the code to use it?

The very short form of the argument is that jq's @sh is trustworthy because it doesn't try to be clever. :)

I've reported my share of shell injection bugs in code (mostly in Java) that tried to do eval-safe escaping and got it wrong (see, for example, MSHARED-297 and PLXUTILS-161). In all those cases, though, the tools in question were trying to do the minimum quoting necessary: leaving trivial content unquoted, using double quotes instead of single quotes if there were literal single quotes in the string to be escaped, or otherwise attempting to do something interesting to generate "better" output.

The POSIX sh rules for single-quoted strings, though, are dirt simple: Absolutely everything is literal, including backslashes, newlines, and every other character until the next single quote; no mechanism to escape them is allowed (so one cannot have a single quote inside a single-quoted string). To insert a literal single quote character, jq exits the single-quoted string, adds a backslash-escaped single quote in an unquoted context, and then reenters the single-quoted context.

Dirt simple, next to impossible to get wrong as long as its authors continue to reject any PRs that would try to make @sh clever.
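The rule described above can be sketched in plain POSIX sh. The sh_quote function below is a hypothetical helper written for illustration; it is not jq's implementation, only the same close-quote, escaped-quote, reopen-quote scheme, and the sample value is made up:

```shell
#!/bin/sh
# sh_quote (hypothetical helper): wrap $1 in single quotes, rewriting each
# embedded ' as '\'' (leave the quoted string, add an escaped quote, reenter).
# POSIX sh has no local variables, so q, rest, head and out are globals.
sh_quote() {
    q="'"
    rest=$1
    out=
    while :; do
        case $rest in
            *"$q"*)
                head=${rest%%"$q"*}     # text before the first single quote
                rest=${rest#*"$q"}      # text after the first single quote
                out="$out$head'\\''"    # append head followed by '\''
                ;;
            *)
                out="$out$rest"
                break
                ;;
        esac
    done
    printf "'%s'" "$out"
}

# A value full of characters that would be dangerous if left unquoted.
value="it's \$(id) \`id\`; \"quoted\" \\ and
a second line"

# The variable name is constant; only the escaped value varies.
assignment="restored=$(sh_quote "$value")"
eval "$assignment"

[ "$restored" = "$value" ] && echo "round trip ok"
```

Because everything except the single quote itself is left untouched inside the quotes, there is no per-character decision that could go wrong, which is the property the argument above relies on.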

To put it a bit differently: eval presents a shell injection risk whenever it's passed non-constant data that hasn't been through a correct implementation of eval-safe escaping.

In the PR here, the varname= parts are constant, and the values to the right of them go through jq's eval-safe escaping implementation, which (as discussed above) is implemented in a simple enough way to be easy to audit as trivially correct for all POSIX-compliant shells.

That said -- if y'all are willing to give up sh compatibility to switch to bash, there are options that become available to drop eval, avoid shell injection, and still run jq only once; something like:

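    declare -A source_vars=( )
    # Read NUL-delimited key=value pairs emitted by a single jq invocation.
    while IFS='' read -r -d '' element; do
      [[ $element = *=* ]] || continue
      source_vars[${element%%=*}]=${element#*=}
    done < <(jq -j '
      # Replace every literal NUL so it cannot be mistaken for a delimiter.
      def clean_nuls: gsub("\u0000"; "<NUL>");
      .source | to_entries[] | "\(.key|clean_nuls)=\(.value|clean_nuls)\u0000"
    ')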

after which we work with ${source_vars[bucket]} and the like. That way we import every defined variable, but unexpected values like PATH, LD_PRELOAD, &c can't be used to attack the system -- one needs to explicitly set AWS_ACCESS_KEY_ID=${source_vars[access_key_id]} to get variables out of the map into a context where they're meaningful to anything that doesn't know about the associative array and intentionally access it.

Thanks for the explanation @charles-dyfis-net ! Ultimately, given that the only benefit of these changes is to eliminate duplicate calls to jq, I'm inclined not to introduce the use of eval. While the changes may be safe in their current form as you say, if jq changes its @sh behavior, that could change. Or, perhaps more likely, other maintainers of this code (myself included) who don't understand the risks of eval as well could introduce a shell-injection risk.

So I'd like to remove the changes to use eval.

Updated per request. Let me know if you want the change and its revert squashed together.

I do plan on a separate PR to use xargs to remove the eval for options -- might come back with a concrete eval-free single-jq-invocation config-parsing implementation then.

charles-dyfis-net · 2023-01-05T04:26:44Z

BTW, one thing that may be notable in the recently added commit (returning to a separate jq call per variable) is the change from echo to printf. For background on that, see the excellent answer by Stéphane Chazelas on Why is printf better than echo?, or the APPLICATION USAGE and RATIONALE sections of the POSIX standard for echo.

If we were specifying a specific shell (like bash), making assumptions about echo behavior would be slightly justifiable (though only slightly: even with bash, configuration parameters like xpg_echo modifying behaviors in the cases the standard describes as ambiguous can be set at compile time, or via environment variables at runtime, or via explicit runtime commands); but if we're using /bin/sh, best to avoid cases the POSIX sh standard describes as ambiguous altogether.
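A short sketch of the difference, using made-up sample values. It shows printf with a constant format string emitting data literally, and also the word-splitting effect of an unquoted expansion mentioned in the PR description:

```shell
#!/bin/sh
# Values that echo may treat as an option or as escape sequences,
# depending on the shell and its configuration.
flag_like='-n'
with_backslash='C:\new\table'

# printf with a constant '%s\n' format prints each value exactly as given.
out_flag=$(printf '%s\n' "$flag_like")
out_backslash=$(printf '%s\n' "$with_backslash")

# Unquoted expansion is word-split (and would be glob-expanded) before the
# command ever sees it; quoting keeps the value as one argument.
spaced='a   b'
set -- $spaced
n_unquoted=$#
set -- "$spaced"
n_quoted=$#

printf '%s\n' "$out_flag" "$out_backslash"
printf 'unquoted words: %s, quoted words: %s\n' "$n_unquoted" "$n_quoted"
```

With echo, the first two values could print as nothing at all or with the backslash sequences expanded, depending on which sh is in use; with printf the result is the same under any POSIX shell.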

markdboyd · 2023-01-06T19:53:26Z

Approved. Thanks @charles-dyfis-net for the contribution!

charles-dyfis-net mentioned this pull request Dec 29, 2022

Use of eval introduces possibility of shell injection unnecessarily. #50

Open

markdboyd reviewed Jan 3, 2023

Revert optimization to reduce jq invocations, per review feedback

2c6d870

markdboyd approved these changes Jan 6, 2023
