sim642
Member

@sim642 sim642 commented May 23, 2025

Currently for the g2html output Goblint produces:

  1. One massive XML file with all results.
  2. .dot files for all function CFGs.

Then g2html takes those and does the following:

  1. Splits the massive XML file into individual XML files for nodes, files, globals, warnings.
  2. Runs Graphviz on all the .dot files to get .svg files.
  3. Copies some static resources to the output directory: XML transformers, CSS styles, some JS and fonts.

This PR shows that we can easily cut out the middle man and get rid of the Java-based g2html altogether by doing the following directly in Goblint:

  1. Producing the individual XML files.
  2. Producing the .dot files and also running Graphviz on them.
  3. Copying the same static stuff.

The result is almost exactly the same HTML-like results viewing experience.

The only difference is that g2html includes a custom lexer for unpreprocessed C that's used for highlighting code in the g2html file view. This remake doesn't do highlighting, but I don't know if anyone even looks at the file view in g2html.
I think it's a minor loss for what otherwise allows us to get rid of an ancient Java-based component which just wastes time copying XML around and reconstructing liveness information.
Doing the same directly in OCaml means that this output is also available when Goblint itself is installed via opam, where g2html otherwise isn't present.

TODO

  • Add syntax highlighting. Maybe using Pygments?
  • Remove old single-XML output? Let's keep it for now as a possible fallback for when the new approach fails.
  • Clean up new implementation code.

kalmera and others added 30 commits March 2, 2014 21:24
This rolls back to commit 760e44a8f8abe89031109cc5ee8506cafad8fff6.
tried new version, but seems like there is no option to disable the animation
Index (position()) based lookup breaks if XML doesn't contain analyses in same order for each global variable.
Somehow broke now and filtered too much.
@sim642 sim642 marked this pull request as ready for review July 25, 2025 08:24
@sim642
Member Author

sim642 commented Jul 25, 2025

I've now added syntax highlighting with Pygments, so this should be able to completely replace g2html.
I'm trying to do some benchmarking on the HTML results generation part to give a better view of the effect.

@sim642
Member Author

sim642 commented Aug 15, 2025

I also ran before and after on sv-benchmarks with witness generation disabled and HTML generation enabled: https://goblint.cs.ut.ee/results/267-all-g2html-ocaml-2/table-generator-cmp.table.html.

[Plots: cputime, walltime, memory]

@sim642
Member Author

sim642 commented Sep 3, 2025

As seen in the extended benchmarks in #1810 (comment), the speedup is even greater when also using multiple cores for parallel Graphviz runs.

@sim642 sim642 added this to the v2.7.0 Bamboozled Buffalo milestone Sep 3, 2025
Member

@jerhard jerhard left a comment

Looks good to me! I also tested this, and everything seems to work.

One minor difference to the old HTML view of the code is that now a trailing empty line of code is not shown anymore. This seems to be an irrelevant detail or maybe an improvement.

@sim642
Member Author

sim642 commented Oct 3, 2025

> One minor difference to the old HTML view of the code is that now a trailing empty line of code is not shown anymore. This seems to be an irrelevant detail or maybe an improvement.

I think this is an annoying behavior of OCaml's standard input_line, which strips the newline terminators and so never yields the empty line after a final newline: https://discuss.ocaml.org/t/how-do-you-read-the-lines-of-a-text-file/8834/8. It would be nice to have the view reflect the file exactly, but hopefully this isn't an issue: there shouldn't be any warnings on the last empty line that could get missed.
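
The input_line behavior can be reproduced with a few lines of standard-library OCaml. This is only an illustration and not the code from this PR; the helper names `read_lines` and `lines_of_string` are made up for the example.

```ocaml
(* Illustration only; not the code from this PR. *)

(* Read all lines of a file with a plain input_line loop. *)
let read_lines (path : string) : string list =
  let ic = open_in path in
  let rec loop acc =
    match input_line ic with
    | line -> loop (line :: acc)
    | exception End_of_file -> List.rev acc
  in
  Fun.protect ~finally:(fun () -> close_in ic) (fun () -> loop [])

(* Write [contents] to a temporary file and read it back line by line. *)
let lines_of_string (contents : string) : string list =
  let path = Filename.temp_file "lines" ".txt" in
  let oc = open_out_bin path in
  output_string oc contents;
  close_out oc;
  Fun.protect ~finally:(fun () -> Sys.remove path) (fun () -> read_lines path)

let () =
  (* With and without a final newline, the result is the same list. *)
  assert (lines_of_string "a\nb\n" = [ "a"; "b" ]);
  assert (lines_of_string "a\nb" = [ "a"; "b" ])
```

Because both inputs produce the same list, a reader built on input_line cannot tell whether the file ended with a newline, which is why the file view no longer shows a trailing empty line.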

Labels
cleanup (Refactoring, clean-up), performance (Analysis time, memory usage), setup (Dependencies, CI, releasing)