Unescape XHTML entities in JSX literals (moved from parser) #1514
Conversation
cc @jeffmo
Thank you for signing our Contributor License Agreement. We can now accept your code for this (and any) Facebook open source project. Thanks!
That's not completely true. The "raw" property for string literals in Esprima, Acorn, etc. is provided exactly for this kind of advanced handling, while the literal's value stays preprocessed. Currently JSX literals have this property as well, so the transformer can use it for advanced operations while the JSX parser produces general-purpose output (preprocessed value in `value`, original source in `raw`). I'm not sure there is much sense in making JSX literals behave differently from simple string literals for the sake of React's post-processing.
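For illustration, a rough sketch of the node shape being discussed, assuming the parser exposes Esprima's usual `value`/`raw` pair on JSX literals; the node type name and sample strings are made up, not actual parser output:

```js
// Hypothetical JSX text node (illustrative only, not real esprima output):
// `value` holds the preprocessed (entity-decoded) string,
// `raw` holds the exact source text with entities intact.
var jsxLiteral = {
  type: 'XJSText',        // node type name assumed for the fb esprima fork
  value: 'XJS & JSX',     // decoded: "&amp;" has already become "&"
  raw: 'XJS &amp; JSX'    // untouched source, available for advanced handling
};
```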
@RReverser Hmm? Not sure I understand, parsing XHTMLEntities in esprima is an issue because information is lost for React/JSX. Sure, esprima could still parse XHTMLEntities, but we wouldn't use them anyway for React/JSX, so it makes no sense for esprima to keep doing it. Or?
@syranide Feels like we don't understand each other.
What information is lost? As I said in the previous comment, you have all the needed information returned from the parser: both the preprocessed and the raw strings. Check out https://github.com/facebook/esprima/blob/b697889d421178976dde9284c2dadc27a1828d26/test/fbtest.js#L422-L431 for an example. What I'm saying is that if we want to keep JSX not hard-bound to React (and, AFAIK, we do want this), we shouldn't change the current parser's behavior; instead we should teach React's transformer to use `raw`.
Anyway, better to wait for @jeffmo's comments on this.
@RReverser Ah perhaps, I don't disagree, but AFAIK React/JSX would never want to use the decoded value (it also seems like a bad idea to mix use of "decode raw" and "value as-is"), so what's the use in keeping it? Are there other projects that depend on it? That's where I'm confused :)
Currently, React uses the decoded value everywhere, and trimming feels to me like a very specific task where the raw value might be needed (and, as far as I understood from the diff, your PR relies on this assumption as well).
@RReverser Right, it currently uses the decoded value everywhere, but I see no reason to keep doing that if we have to do some of the decoding in JSX anyway; there would be no benefit to reading the decoded value instead of always decoding the raw value. So we would have to maintain two separate decoding implementations in separate projects for no actual benefit?
@syranide If it's just about trimming, you don't even need to decode values again; instead you can count the spaces at the beginning and end of `raw`.
Btw, do you know why the hell attribute values should be trimmed at all?
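A minimal sketch of the counting idea, assuming the node exposes both `raw` and `value`; the helper name is hypothetical and this is not React's actual code:

```js
// Count literal whitespace on the raw source, then slice the decoded value.
// "&nbsp;" in raw is not whitespace, so it is never counted, while literal
// leading/trailing spaces map 1:1 between raw and the decoded value.
function trimUsingRaw(node) {
  var leading  = node.raw.match(/^[ \t]*/)[0].length;
  var trailing = node.raw.match(/[ \t]*$/)[0].length;
  return node.value.slice(leading, node.value.length - trailing);
}
```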
@RReverser Sure, we can count/cut/etc., but that seems super fragile, and just decoding in JSX seems like the logical choice. It's been quite a while since I last looked at/touched that code, but IIRC there are quite a number of things about attributes that are kind of weird. But no, attribute values definitely shouldn't be trimmed; perhaps it's a multi-line thing? (JSX attribute strings can be multiline I think... which is really weird)
I'll still disagree, since any string decoding feels like a parser's task that should not be moved out of it. In the worst case, I'd rather expose a method for entity decoding from the parser and reuse it in React, but definitely not remove it from the parser itself.
Yeah, I think we should figure out the reasons this feature exists before arguing about solutions for the bug it introduced.
I'm inclined to agree, but I think there's a slight subtlety to decoding XHTMLEntities vs strings in the parser. Perhaps what should be done is to move some of the JSX inline text trimming to esprima... then esprima becomes a "true" XHTML parser and it suddenly makes sense; currently it's only a JS parser IMO (it expects JSX to process the values) and as such decoding XHTMLEntities doesn't make sense to me (because information is lost). Not sure if I'm making any sense, but I think decoding should move from esprima to JSX, or trimming should move from JSX to esprima (so that JSX doesn't do any "interpretation/processing" of the value). The latter probably makes a lot more sense the more I think about it, as that's how attributes are handled.
The trimming I'm referring to applies to inline text, not attribute values. Or did I misunderstand?
Personally I like this idea.
Well, it's inside:

```
> console.log(require('react-tools').transform('/** @jsx DOM */<x a="1" a="2\n3"> 1 2 </x>'))
/** @jsx DOM */x({a: "1", a: "2" + ' ' +
"3"}, " 1 2 ")
```

Maybe it's the desired behavior, but anyway I don't see the benefit of it.
Awesome, I think it makes a lot more sense than this PR the more I think about it, and it seems like we both agree 👍
Ah yes, that rings a bell. IIRC most of the attribute code uses non-attribute-specific helpers, which kind of works, but for attributes it has many small weird edge cases. So unless @jeffmo thinks otherwise, it seems to me that attributes really should use their own logic where it makes sense.
🍰
I see you've already closed facebookarchive/esprima#19, probably makes sense to close this one too, since it's linked? Then a new PR with trimming can be created in esprima (with a dependency on facebookarchive/esprima#32).
@RReverser Yep, 👍 for facebookarchive/esprima#32, and it seems there's a pretty straightforward goal. Closing :)
Hmm, quickly thinking about this again, I'm not sure how one would go about doing this. It's straightforward enough to move the trimming to the parser, but React/JSX should preserve line numbers and the necessary information is again lost in translation... or? ...or perhaps we simply don't care about a 1:1 mapping with inline text and just replicate the number of newlines. Practically everything stays the same, but we lose a bit of familiarity when looking at the JSX output.
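A quick sketch of the "just replicate the number of newlines" idea; the function name is made up and this is not how react-tools actually handles it:

```js
// Pad the generated code with as many newlines as trimming removed, so the
// output keeps the same line count as the JSX source.
function padNewlines(sourceText, generatedCode) {
  var sourceLines    = (sourceText.match(/\n/g) || []).length;
  var generatedLines = (generatedCode.match(/\n/g) || []).length;
  var missing = Math.max(sourceLines - generatedLines, 0);
  return generatedCode + new Array(missing + 1).join('\n');
}
```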
Or it uses the location info from the parser?
@RReverser but ...
Ah, so you mean visual "similarity" between JSX and the generated JS, where each JSX line corresponds to a compiled JS line?
@RReverser Yeah, that's an intended feature today AFAIK.
@syranide Well, the current code in React's ...
We discussed this briefly some time ago. Because `&nbsp;` is unescaped in the parser and JSX receives no information about it, it ends up being identical to just a regular space, which is trimmed by JSX if it appears left/right-most on a line.

Here's a simple proposal that moves the unescaping from the parser to JSX. I also made the unescaping routine a bit more accurate while also making it lenient like browsers, so `XJS & JSX` is valid (`XJS &amp; JSX` is quite cumbersome and annoying).

An alternative implementation could have the parser parse both `XJSText` and `XJSEntity`, for instance, introducing an `XJSStringLiteral` which would then consist of one or more `XJSText`/`XJSEntity` nodes. This is perhaps the neater solution, but also more invasive (I don't mind though).

Feel free to agree, disagree, reject, etc.
Depends on facebookarchive/esprima#19
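To illustrate the "lenient like browsers" point above, here is a minimal sketch of an entity decoder that leaves a bare "&" alone; the entity table is deliberately tiny and the function name is made up:

```js
var ENTITIES = { amp: '&', lt: '<', gt: '>', quot: '"', apos: "'", nbsp: '\u00A0' };

// Decode numeric references and known named entities; anything else,
// including a bare "&" that never forms an entity, is passed through as-is.
function decodeEntitiesLeniently(text) {
  return text.replace(/&(#\d+|#x[0-9a-fA-F]+|[a-zA-Z]+);/g, function (match, body) {
    if (body.charAt(0) === '#') {
      var hex = body.charAt(1) === 'x';
      return String.fromCharCode(parseInt(body.slice(hex ? 2 : 1), hex ? 16 : 10));
    }
    return ENTITIES.hasOwnProperty(body) ? ENTITIES[body] : match;
  });
}

// decodeEntitiesLeniently('XJS & JSX')     -> 'XJS & JSX'  (bare "&" untouched)
// decodeEntitiesLeniently('XJS &amp; JSX') -> 'XJS & JSX'  (entity decoded)
// decodeEntitiesLeniently('a&nbsp;b')      -> 'a\u00A0b'
```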