Skip to content

Conversation

@syranide
Copy link
Contributor

We discussed this briefly some time ago, because   is unescaped in the parser and JSX receives no information about it, it ends up being identical to just (a regular space) which is trimmed by JSX if appearing left/right-most on a line.

Here's a simple proposal that moves the unescaping from the parser to JSX, I also made the unescaping routine a bit more accurate while also making it lenient like browsers, so XJS & JSX is valid (XJS & JSX is quite cumbersome and annoying).

Alternative implementation could have the parser parse both XJSText and XJSEntity for instance, and introducing XJSStringLiteral which would then consist of one or more XJSText/XJSEntity. This is perhaps the neater solution, but also more invasive (I don't mind though).

Feel free to agree, disagree, reject, etc.

Depends on facebookarchive/esprima#19

@syranide
Copy link
Contributor Author

cc @jeffmo

@facebook-github-bot
Copy link
Contributor

Thank you for signing our Contributor License Agreement. We can now accept your code for this (and any) Facebook open source project. Thanks!

@RReverser
Copy link
Contributor

We discussed this briefly some time ago, because is unescaped in the parser and JSX receives no information about it ...

That's not completely true. "raw" property for string literals in Esprima, Acorn etc. is provided exactly for reason of advanced handling, while literal's value stays preprocessed.

Currently JSX literals have this property as well - so transformer can use it for advanced operations while JSX parser will produce general-purpose output (preprocessed value in value + raw value in raw).

I'm not sure if there is much sense to make JSX literals behave differently from simple string literals in sake of React's post-processing.

@syranide
Copy link
Contributor Author

syranide commented Aug 5, 2014

@RReverser Hmm? Not sure if I understand, parsing XHTMLEntities in esprima is an issue as information is lost for React/JSX. Sure, esprima could still parse XHTMLEntities, but we wouldn't use them anyway for React/JSX so it makes no sense for esprima to keep doing it. Or?

@RReverser
Copy link
Contributor

@syranide Feels like we don't understand each other.

Not sure if I understand, parsing XHTMLEntities in esprima is an issue as information is lost for React/JSX.

What information is lost? As I've told in previous comment, you have all the needed information returned from the parser - both pre-processed and raw strings. Check out https://github.com/facebook/esprima/blob/b697889d421178976dde9284c2dadc27a1828d26/test/fbtest.js#L422-L431 for example.

What I'm saying, if we want to keep JSX not hardly bound with React (and, AFAIK, we do want this), we shouldn't change current parser's behavior and instead we should teach React's transformer to use raw property whenever it needs (i.e., for distinguishing between   and (space)).

This way:

  1. We preserve structure compatibility between XJS literals and native ones (preprocessed string in value, raw string in raw).
  2. Any code that is fine with using pre-processed string literal and doesn't care about original entities, can simply use string from value, while any code that does make a difference between entities and produced values, also has everything needed in raw property.

Anyway, better to wait for @jeffmo's comments on this.

@syranide
Copy link
Contributor Author

syranide commented Aug 5, 2014

@RReverser Ah perhaps, I don't disagree, but AFAIK React/JSX would never want to use the decoded value (it also seems like a bad idea to mix use of "decode raw" and "value as-is"), so what use is it there in keeping it? Are there other projects that depend on it? That's where I'm confused :)

@RReverser
Copy link
Contributor

React/JSX would never want to use the decoded value (it also seems like a bad idea to mix use of "decode raw" and "value as-is")

Currently, React uses decoded value everywhere, and trimming feels to me like a very specific task where raw value might be needed (and, as far as I understood from diff, your PR relies on this assumption as well).

@syranide
Copy link
Contributor Author

syranide commented Aug 5, 2014

@RReverser Right, it currently uses the decoded value everywhere, but I see no reason to keep doing it if we have to do some of the decoding in JSX anyway, there will be no benefit to reading the decoded value instead of always decoding the raw value. So we would have to maintain two separate decoding implementations in separate projects for no actual benefit?

@RReverser
Copy link
Contributor

@syranide If it's just about trimming, you don't even need to decode values again - instead you can count spaces on the beginning and ending of raw string and cut value correspondingly.

Btw, do you know why the hell attribute values should be trimmed at all?

@syranide
Copy link
Contributor Author

syranide commented Aug 5, 2014

@RReverser Sure, we can count/cut/etc, but that seems super fragile and just decoding in JSX seems like the logical choice.

Quite a while since I last looked/touched that code, but IIRC there's quite a number of things about attributes that are kind of weird. But no, attributes values definitely shouldn't be trimmed, perhaps it's a multi-line thing? (JSX attribute strings can be multiline I think ... which is really weird)

@RReverser
Copy link
Contributor

just decoding in JSX seems like the logical choice

I'll still disagree since any string decoding feels like parser's task that should not be moved out from it. In the worst case, I'd rather expose method for entity decoding from parser and reuse it in React, but definitely not remove it from parser itself.

Quite a while since I last looked/touched that code, but IIRC there's quite a number of things about attributes that is kind of weird.

Yeah, I think we should better figure out reasons for existing of this feature before arguing about solutions for bug introduced by it.

@syranide
Copy link
Contributor Author

syranide commented Aug 5, 2014

I'll still disagree since any string decoding feels like parser's task that should not be moved out from it. In the worst case, I'd rather expose method for entity decoding from parser and reuse it in React, but definitely not remove it from parser itself.

I'm inclined to agree, but I think there's a slight subtlety to decoding XHTMLEntities vs strings in the parser. Perhaps what should be done is to move some of the JSX inline text trimming to esprima... then esprima becomes a "true" XHTML parser and it suddenly makes sense, currently it's only a JS parser IMO (it expects JSX to process the values) and as such decoding XHTMLEntities doesn't make sense to me (because information is lost).

Not sure if I'm making any sense, but I think decoding should move from esprima to JSX or trimming should move from JSX to esprima (so that JSX doesn't doesn't do any "interpretation/processing" of the value). The latter probably makes a lot more sense the more I think about it, as that's how attributes are handled.

Yeah, I think we should better figure out reasons for existing of this feature before arguing about solutions for bug introduced by it.

The trimming I'm referring to applies to inline text, not attribute values. Or did I misunderstand?

@RReverser
Copy link
Contributor

trimming should move from JSX to esprima (so that JSX doesn't doesn't do any "interpretation/processing" of the value)

Personally I like this idea.

The trimming I'm referring to applies to inline text, not attribute values. Or did I misunderstand?

Well, it's inside renderXJSString so it's applied to both. And while trimming inline text might make some sense (still not much IMO), trimming attribute values seems completely useless and probably even wrong to me:

> console.log(require('react-tools').transform('/** @jsx DOM */<x a="1" a="2\n3"> 1 2 </x>'))
/** @jsx DOM */x({a: "1", a: "2" + ' ' +
"3"}, " 1 2 ")

Maybe it's desired behavior, but anyway I don't see benefits from it.

@syranide
Copy link
Contributor Author

syranide commented Aug 5, 2014

Personally I like this idea.

Awesome, I think it makes a lot more sense than this PR the more I think about, and it seems like we both agree 👍

Maybe it's desired behavior, but anyway I don't see benefits from it.

Ah yes, that rings a bell. IIRC most of the attribute code uses non-attribute specific helpers, which kind of works, but for attributes has many small weird edge-cases. So unless @jeffmo thinks otherwise, it seems to me that attributes really should use their own logic where it makes sense.

@RReverser
Copy link
Contributor

Awesome, I think it makes a lot more sense than this PR the more I think about, and it seems like we both agree 👍

🍰

@RReverser
Copy link
Contributor

I see you've already closed facebookarchive/esprima#19, probably makes sense to close this one, too? (since it's linked)

Then new PR with trimming can be created in esprima. (with dependency on facebookarchive/esprima#32)

@syranide
Copy link
Contributor Author

syranide commented Aug 5, 2014

@RReverser Yep, 👍 for facebookarchive/esprima#32 and it seems there's a pretty straight-forward goal, closing :)

@syranide syranide closed this Aug 5, 2014
@syranide syranide deleted the jsxent branch August 5, 2014 15:02
@syranide
Copy link
Contributor Author

syranide commented Aug 6, 2014

Hmm, quickly thinking about this again, I'm not sure how one would go about doing this. It's straight-forward enough to move the trimming to the parser, but React/JSX should preserve line-numbers and necessary information is again lost in translation... or?

...or, perhaps we simply don't care about 1:1 mapping with inline text and just replicate the number of newlines. Practically everything is the same, but lose a bit of familiarity when looking at the JSX output.

@RReverser
Copy link
Contributor

React/JSX should preserve line-numbers and necessary information is again lost in translation... or?

Or it uses location info from loc properties (as it already does), which is a proper way for preserving line numbers, generating source maps etc.

@syranide
Copy link
Contributor Author

syranide commented Aug 6, 2014

@RReverser but loc only gives the starting line, not where each individual newline is in the "value". Unless we generate an XJSText for every line I suppose... hmm.

@RReverser
Copy link
Contributor

...or, perhaps we simply don't care about 1:1 mapping with inline text and just replicate the number of newlines

Ah, so you mean visual "similarity" between JSX and generated JS where each JSX line corresponds to compiled JS line?

@syranide
Copy link
Contributor Author

syranide commented Aug 6, 2014

@RReverser Yeah, that's an intended feature today AFAIK.

@RReverser
Copy link
Contributor

@syranide Well, current code in React's vendor/fbtransform/transforms/xjs anyway artificially looks for first/last empty lines and inserts corresponding count of those in JS, so we could simply teach it to use difference between loc.starts instead of counting them again using regexps.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants