[ZEPPELIN-6162] Implement revisions comparator for New UI#5155
[ZEPPELIN-6162] Implement revisions comparator for New UI#5155prabhjyotsingh wants to merge 7 commits intoapache:masterfrom
Conversation
| } | ||
|
|
||
| private buildLineDiffHtml(text1: string, text2: string): { html: SafeHtml; identical: boolean } { | ||
| const a = this.dmp.diff_linesToChars_(text1, text2); |
There was a problem hiding this comment.
| const a = this.dmp.diff_linesToChars_(text1, text2); | |
| const { chars1, chars2, lineArray } = this.dmp.diff_linesToChars_(text1, text2); |
Destructure it instead of naming the intermediate object. a.chars1 reads like nothing, chars1 and lineArray are self-explanatory.
...ar/src/app/pages/workspace/notebook/revisions-comparator/revisions-comparator.component.less
Show resolved
Hide resolved
| merge.sort((a, b) => { | ||
| const order = { added: 0, deleted: 1, compared: 2 }; | ||
| return order[a.type] - order[b.type]; | ||
| }); |
There was a problem hiding this comment.
The merge result gets sorted by type (added -> deleted -> compared), which reorders paragraphs away from their position in the note. Is there a reason for this? Keeping the original order and just showing tags per paragraph would make it easier to see where things changed.
There was a problem hiding this comment.
Yes, I did try to imitate https://github.com/apache/zeppelin/blob/master/zeppelin-web/src/app/notebook/revisions-comparator/revisions-comparator.component.js#L115
But now on hind sight, let me change this to dateCreated
There was a problem hiding this comment.
Ah, I caught this from the link you shared. Looking at the original implementation (ZEPPELIN-2813, PR #2506 by tinkoff-dwh, 2017), the sort was there from day one, probably to push changed paragraphs (added/deleted) to the top, since compared includes identical ones that aren't as interesting.
That said, we lose the original paragraph order in the note, which makes it harder to trace where things changed. Not sure which matters more here.
@tbonelee any thoughts?
There was a problem hiding this comment.
I think this is a trade-off between preserving context and enabling quick skimming.
If we had to pick one default, I’d lean toward keeping the original paragraph order, because changes may be scattered (which costs some reading time), but reordering the note breaks the narrative flow and makes it harder to understand where a change happened.
The ideal UX might be: keep the original order by default, and offer a toggle/checkbox for “show changed only” or “changed first” for people who want a faster scan.
What do you think?
There was a problem hiding this comment.
I agree on keeping the original paragraph order. Reordering by type loses where in the note things changed, which is what you usually care about when comparing revisions.
A "changed first" toggle could go into this PR, or if that's too much we could just open a separate issue for it.
| } | ||
| } | ||
|
|
||
| return { html: this.sanitizer.bypassSecurityTrustHtml(html), identical }; |
There was a problem hiding this comment.
dididy@5dc87e6
bypassSecurityTrustHtml + [innerHTML] is an XSS vector here. If someone stores <script> or event handlers in notebook paragraph text, it runs in the browser. Since paragraph content comes from users (and potentially shared notebooks), this is a real risk.
Replace the HTML string approach with a DiffSegment[] array and *ngFor:
interface DiffSegment { type: 'insert' | 'delete' | 'equal'; text: string; }<pre class="code-panel"><span
*ngFor="let seg of currentParagraphDiffDisplay?.segments"
[class.color-green-row]="seg.type === 'insert'"
[class.color-red-row]="seg.type === 'delete'"
[class.color-black]="seg.type === 'equal'"
>{{ seg.text }}</span></pre>Angular's interpolation ({{ }}) auto-escapes everything. No sanitizer needed.
There was a problem hiding this comment.
Sure let me remove this.
There was a problem hiding this comment.
Thanks for working on this. I checked the commit and the Files changed tab, and the XSS fix hasn't landed in the PR yet.
The PR still has:
import { DomSanitizer, SafeHtml }in the componentbuildLineDiffHtmlassembling raw HTML strings[innerHTML]="currentParagraphDiffDisplay?.diff"in the template
Paragraph text is user input and gets rendered in other users' browsers through shared notebooks. Relying on a manual escapeHtml implementation is risky, if it's ever bypassed or removed, that's a direct XSS vector.
Could you push the DiffSegment[] + *ngFor + {{ }} change to this branch? I put together a reference commit here: dididy@5dc87e6 - Angular's interpolation handles escaping automatically, so the sanitizer becomes unnecessary.
Port the revision comparison feature from the legacy AngularJS UI to the new Angular 13 frontend. Users can now select two revisions and view paragraph-by-paragraph diffs with color-coded additions and deletions.
e1fbc65 to
7e46b78
Compare

What is this PR for?
Port the revision comparison feature from the legacy AngularJS UI to the new Angular 13 frontend. Users can now select two revisions and view paragraph-by-paragraph diffs with color-coded additions and deletions.
What type of PR is it?
Improvement
What is the Jira issue?
How should this be tested?
Screenshots (if appropriate)
Questions: