Skip to content

Conversation

Qix-
Copy link

@Qix- Qix- commented Jun 16, 2017

ref vercel/vercel#682

One of our users reported that their LABEL command was being incorrectly parsed. It appears quoted values in name/value pairs don't have their quotes removed when being parsed (although it does appear the quoted are being honored).

This PR fixes that by stripping the quotes from the beginning and end if they exist.

@twhiteman
Copy link
Contributor

Thanks for the patch. Whilst this successfully removes the double-quotes, I have some dependent components that rely on the quoting being preserved (or rather the quoting is later parsed/removed in a separate component).

I'm wondering if you could make this an optional feature, by setting an option (e.g. stripQuotes = true) on the parse() options object. You would also then need to copy that stripQuotes onto the cmd object itself so it could be later checked in the parseNameVal function.

@Qix-
Copy link
Author

Qix- commented Jun 20, 2017

Just curious, why rely on the quotes being there? That's not how Dockerfiles are parsed officially as far as I can tell.

If you need quotes on the values, why not do '"' + value + '"'?

@twhiteman
Copy link
Contributor

The quotes are used by the docker server, to display build output back to the client:

Here is an example:

$ cat Dockerfile
FROM busybox
LABEL    this="cool beans"
LABEL    this="cool \" quoted beans"
LABEL  foo=bar   bar="baz"    "baz=dil"

$ docker build .
Sending build context to Docker daemon  632.3kB
Step 1/4 : FROM busybox
 ---> c30178c5239f
Step 2/4 : LABEL this "cool beans"
 ---> Running in f086ef9ff5e3
 ---> 543be5d49273
Removing intermediate container f086ef9ff5e3
Step 3/4 : LABEL this "cool \" quoted beans"
 ---> Running in d568e10c7b32
 ---> 58c974d6cbfb
Removing intermediate container d568e10c7b32
Step 4/4 : LABEL foo bar bar "baz" "baz dil"
 ---> Running in ebe6b5377930
 ---> 24da41c2a3e2
Removing intermediate container ebe6b5377930
Successfully built 24da41c2a3e2

As you can see from the Step output above, it's not exactly as simple as wrapping with quotes, as docker is preserving the original Dockerfile quotation.

@Qix-
Copy link
Author

Qix- commented Jun 20, 2017

Sure, but I believe they're parsing those files much differently than what this module is intended for, no? The goal of this module is to extract the relevant values, correct?

Quotes make no sense in the parsed data. That's like if your shell retained the quotes on the command line.

$ ./my-program "foo" "bar baz" \"hello\"

Which currently get parsed as:

  1. foo
  2. bar baz
  3. "hello" (with literal quotes)

It would be catastrophic if the following were true:

int main(int, const char **argv) {
    bool isFoo = string(argv[1]) == "foo"; // false! argv[1] is "\"foo\""
}

I strongly believe Docker is merely echoing out the lines it parses as it parses them. I know Docker parses/processes line by line since interrupting or killing a build makes Docker think it's completed up to a certain part in the Docker file.

I think the fact Docker naïvely parses its Dockerfiles allows it to simply echo out the raw text that was in the Dockerfile (though I'm speculating).

Regardless, quotes in Dockerfiles are merely for syntax, not for semantics. I think keeping them in the parsed input significantly reduces the parser's usefulness and causes a lot of guesswork for downstream dependents.

@twhiteman
Copy link
Contributor

twhiteman commented Jun 20, 2017

This code is a direct translation from the docker code:
https://github.com/moby/moby/blob/4cbc953a5d80c850df7107b28e743e933bbeb1d3/builder/dockerfile/parser/line_parsers.go#L141
which keeps the quotes. Quotes are later handled/removed in a separate component:
https://github.com/moby/moby/blob/master/builder/dockerfile/shell_parser.go

The same happens for the dependent components I mentioned (that rely on the quotes being there), in the same way as docker's shell parser above:
https://github.com/joyent/sdc-docker-build/blob/0ef99b8e6f9b320a6c7a18464a5a551b281b5950/lib/shellparser.js

The docker server is not just echoing a raw representation of the text, as shown in this example:

LABEL  foo=bar   bar="baz"    "baz=dil"

for which docker will echo (intermediate whitespace is removed, quotes are preserved, equal sign was replaced with a space):

Step 4/4 : LABEL foo bar bar "baz" "baz dil"

Now I need to keep the same behaviour to docker, as the components that depend on this act as a real docker server, so they must output the same format/information.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants