Currently org.spdx.utility.compare.LicenseCompareHelper.isTextStandardLicense().getDifferenceMessage()
will list each difference in a seemingly random order, but it would be beneficial if instead they were sorted by line number and then by column number, for readability.
Activity
bact commentedon May 6, 2025
We can sort
List<LineColumn> differences
in the classDifferenceDescription
.Spdx-Java-Library/src/main/java/org/spdx/utility/compare/CompareTemplateOutputHandler.java
Lines 645 to 649 in 718fd65
This can be done either at 1)
setDifferences()
+addDifference()
or 2)getDifferences()
.Performance hit will depend on how
DifferenceDescription
got used normally (set/add frequently vs get frequently).The main change will be around
differenceMessage
because currently the string is created cumulatively at eachaddDifference()
call and the sorting ofdifferences
list will not change the order of text order in thedifferenceMessage
string.A possible solution is to move the string generation outside of
addDifference()
and delay the generation until it is being used atgetDifferenceMessage()
call.pmonks commentedon May 6, 2025
Another thought: should the
LineColumn
class be enhanced to include all of the information that's displayed in the difference message? That would allowgetDifferenceMessage()
to become a "dumb" string generator around theList<LineColumn>
, while (perhaps more importantly) providing all of the difference information as structured data available via an API to callers who want to generate their own difference message (e.g. as HTML).Going from memory, I think what's missing from
LineColumn
are the "expected" and "actual" texts (but there might be more).goneall commentedon May 6, 2025
I'll have to double check, but I think this is an order to the messages. The first message is where the match logic actually failed after exhausting all possible variable and optional texts. The subsequent messages are why any option or variable text failed their evaluations. When I use the messages to debug, the order is helpful to understand how the matching logic failed.
If you want to represent this order in a UI, you could do something like A failed a <line,column> after trying to match option or variable text failing at b <line,column>, ...
Also, the difference message doesn't actually capture all the differences in the license text - once it gets to a point where there obviously isn't a match, it just stops.
If we want to write a good UI for describing the changes and failures, I would suggest doing a standard text diff first to capture a complete list of differences in a highlighted, human readable way then using the difference messages highlight which actual diff caused the match failure.