jafingerhut/cljol

Introduction

cljol is specific to Clojure on Java. It uses a JVM library that knows deep internal details of the JVM, and those parts would need to be replaced with something else in order to work on a non-JVM platform.

cljol uses the Java Object Layout library to determine the precise size of a Java object, and all of the objects that it references, either directly, or by following a chain of references through multiple Java objects.

It can create images of these graphs in any of these ways:

  • popping up a window using the view function.
  • writing to a GraphViz dot file using the write-dot-file function.
  • writing to any of several other image file formats using the write-drawing-file function.

cljol has been tested mostly with Clojure 1.10.1 and 1.12.0 so far, but as far as I know it should work fine with Clojure 1.7.0 or later. It cannot work with earlier versions, since the Loom graph library requires Clojure 1.7.0 or later.

See the gallery for examples of figures created by this library that demonstrate aspects of the Java VM or Clojure's implementation that I find interesting.

Quick example

You must install GraphViz in order for the generation of figures to work -- see its home page for downloads and installation instructions if the following do not work:

  • Ubuntu Linux - sudo apt-get install graphviz
  • Mac OS X
    • If you use Homebrew: brew install graphviz
    • If you use MacPorts: sudo port install graphviz

There are not yet any packaged releases of cljol on Clojars. You can clone the repository yourself and create a JAR if you like, or use the clj / clojure commands provided by the Clojure installer.

No extra JVM options are necessary if you are running JDK 8 or 11:

clj -Sdeps '{:deps {cljol/cljol {:git/url "https://github.com/jafingerhut/cljol" :sha "787343793edb6df3a6769d99ebb87e9c60839f4f"}}}'

If you are running JDK 16 or later, then the additional JVM command line options below are strongly recommended. Not using these options often leads to missing edges between objects in drawn graphs, and/or fields with the values omitted, showing the message .setAccessible failed in place of the field value.

clj -Sdeps '{:deps {cljol/cljol {:git/url "https://github.com/jafingerhut/cljol" :sha "787343793edb6df3a6769d99ebb87e9c60839f4f"}}}' \
    -J-Djdk.attach.allowAttachSelf \
    -J-Djol.tryWithSudo=true \
    -J-XX:+EnableDynamicAgentLoading \
    -J--add-opens -Jjava.base/java.lang=ALL-UNNAMED
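If you would rather not retype these options, one possibility is to put them in a deps.edn alias. The fragment below is only a sketch: the alias name :cljol is my own choice and not something cljol defines, and the dependency coordinates and JVM options are copied from the command above.

```clojure
;; deps.edn (sketch; the alias name :cljol is hypothetical)
{:aliases
 {:cljol
  {:extra-deps {cljol/cljol {:git/url "https://github.com/jafingerhut/cljol"
                             :sha "787343793edb6df3a6769d99ebb87e9c60839f4f"}}
   ;; Same options as on the command line above, without the -J prefix.
   :jvm-opts ["-Djdk.attach.allowAttachSelf"
              "-Djol.tryWithSudo=true"
              "-XX:+EnableDynamicAgentLoading"
              "--add-opens" "java.base/java.lang=ALL-UNNAMED"]}}}
```

With that in place, clj -A:cljol should start a REPL with the same dependency and JVM options.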

In the REPL:

(require '[cljol.dig9 :as d])
(def my-map {"a" 1 "foobar" 3.5})

;; Open a new window containing the figure.  Takes a collection of
;; objects.
(d/view [my-map])

;; Write the figure to a Graphviz dot file, or one of many other
;; formats.  I believe you can get a complete list of formats
;; supported by Graphviz by looking at the "device" list in the output
;; of the command "dot -v < /dev/null".

(d/write-dot-file [my-map] "my-map.dot")
(d/write-drawing-file [my-map] "my-map.pdf" :pdf)

See the "Error and Warning messages" section below for messages that you are likely to see when using this code.

Graphviz dot files are a fairly simple text file format that you can read in any text editor and convert to many other graphic formats. As the PDF example above shows, cljol can write these other formats directly, but if you prefer to convert a dot file yourself, here are some sample shell commands:

$ dot -Tpdf my-map.dot -o my-map.pdf
$ dot -Tpng my-map.dot -o my-map.png

Below is the figure in the file my-map.png I get from the last command above.

Each rectangle is a Java object. The object (or objects) that you specified to start from are filled with a gray shading. By default each rectangle shows:

  • the object's size
  • the number of objects reachable from that object by following a path of references starting from it (the count includes the object itself), and the total size in bytes of all of those reachable objects.
  • either the message "this object in no reference cycles" or "<number> objects in same SCC with this one". The first message means that in the graph of objects reached from the ones you specified to start from, that object is not contained in any cycle of references. The second message means that the object does appear in at least one cycle of references, and there are <number> objects such that they are all in the same SCC, or "strongly connected component", of the graph of references. Two nodes "a" and "b" are in the same SCC exactly when there is a path from "a" to "b", and also a path from "b" to "a".
  • its type, usually a class name, with common prefixes like "clojure.lang." replaced with "c.l." and "java.lang." replaced with "j.l.". Java arrays are shown as "array of N class-name".
  • A sequence of lines, one per field stored in the object. This is all per-instance fields (i.e. not declared static in Java) defined for the object's class, and all of its superclasses. Each is listed with:
    • its byte offset from the beginning of the object in memory where the field is stored
    • the name of the field
    • its type, in parentheses
    • the value of the field
  • All references to other objects only show "ref" as the type. The value of a "ref" field is shown as "nil" if it is a Clojure nil value (i.e. Java null), or -> if the reference is to another object -- you may find the actual class of the referenced object by following the edge labeled with the field name that leaves the node.
  • a string representation of the object's value, or the message "val maybe realizes if str'ed". If you see that message, it means that cljol has determined that if it tried to convert the value to a string, it might cause lazy sequences to be realized more than they already have been before, and it is avoiding this possibility.
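The reachability and SCC notions used in the labels can be sketched in a few lines of plain Clojure. This is not cljol's implementation. The graph refs, the map sizes, and the functions reachable and same-scc? are made-up names for illustration, and objects are stood in for by keywords.

```clojure
;; Hypothetical reference graph: each key is an object, each value is the
;; set of objects it refers to directly.  Not cljol's data structure.
(def refs {:a #{:b}
           :b #{:c}
           :c #{:a :d}
           :d #{}})

;; Hypothetical object sizes in bytes.
(def sizes {:a 16, :b 24, :c 24, :d 40})

(defn reachable
  "Set of nodes reachable from start by following zero or more references,
  so the result always includes start itself."
  [graph start]
  (loop [seen #{} todo [start]]
    (if-let [n (peek todo)]
      (if (seen n)
        (recur seen (pop todo))
        (recur (conj seen n) (into (pop todo) (get graph n))))
      seen)))

(defn same-scc?
  "True when there is a path from a to b and also a path from b to a."
  [graph a b]
  (and (contains? (reachable graph a) b)
       (contains? (reachable graph b) a)))

(count (reachable refs :a))                  ;; => 4
(reduce + (map sizes (reachable refs :a)))   ;; => 104
(same-scc? refs :a :c)                       ;; => true, :a -> :b -> :c -> :a
(same-scc? refs :a :d)                       ;; => false, nothing leads back from :d
```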

The string representation is by default limited to 50 characters, with " ..." appearing at the end if it was truncated.

The arrows out of an array object are labeled with "[i]", where "i" is a number that is the array index. Other labels on edges are the name of the field in the Java object that the edge comes from.

Immediately below is the cljol drawing of the objects representing the Clojure map {"a" 1 "foobar" 3.5}. In a clojure.lang.PersistentArrayMap, map keys are in even array indices, and their associated values in the index 1 larger.

my-map.png
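The key/value layout visible in the figure can be imitated with an ordinary Java array. The sketch below is not the code in clojure.lang.PersistentArrayMap. The names kv-array and array-lookup are invented here, and the lookup is a simple linear scan that assumes keys sit at even indices with each value one slot later.

```clojure
;; Assumed flat layout for {"a" 1 "foobar" 3.5}: a key at even index i,
;; its value at index i+1.
(def kv-array (object-array ["a" 1 "foobar" 3.5]))

(defn array-lookup
  "Linear scan over the even indices of arr.  Returns the value stored one
  slot after the matching key, or not-found if no key matches."
  [^objects arr k not-found]
  (loop [i 0]
    (cond
      (>= i (alength arr)) not-found
      (= k (aget arr i))   (aget arr (inc i))
      :else                (recur (+ i 2)))))

(array-lookup kv-array "foobar" nil)   ;; => 3.5
(array-lookup kv-array "zzz" :none)    ;; => :none

;; A small map literal like this one is an array map.
(instance? clojure.lang.PersistentArrayMap {"a" 1 "foobar" 3.5})  ;; => true
```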

More examples

It does not take much code to create data structures with very large graphs. For example, this graph likely has more nodes than you want to look at:

(def v1 (vec (range 1000)))
(d/view [v1])

cljol includes some code to give you summary statistics about a graph, and some functions that can produce a subset of a graph, which you can then display.

The sum function takes a collection of objects, creates and returns a graph representing its objects without drawing it, and prints some statistics about this graph. The example below shows that v1's graph has 1067 objects. The info near the end shows that 1001 of those have out-degree 0, where out-degree is the number of edges that leave a node directed towards another node. Those 1001 are 'leaf nodes' of the graph.

user=> (def g (d/sum [v1] {:summary-options #{:all}}))
1067 objects
1097 references between them
29480 bytes total in all objects
no cycles
1 weakly connected components found in: 5.6 msec, 0 gc-count, 0 gc-time-msec
number of nodes in all weakly connected components,
from most to fewest nodes:
(1067)
The scc-graph has 1067 nodes and 1097 edges, took: 5.8 msec, 0 gc-count, 0 gc-time-msec
The largest size strongly connected components, at most 10:
(1 1 1 1 1 1 1 1 1 1)
number of objects of each size in bytes:
({:size-bytes 16, :num-objects 1, :total-size 16}
 {:size-bytes 24, :num-objects 1032, :total-size 24768}
 {:size-bytes 40, :num-objects 1, :total-size 40}
 {:size-bytes 48, :num-objects 1, :total-size 48}
 {:size-bytes 144, :num-objects 32, :total-size 4608})
number and size of objects of each class:
({:total-size 16,
  :num-objects 1,
  :class "j.u.c.atomic.AtomicReference"}
 {:total-size 40, :num-objects 1, :class "c.l.PersistentVector"}
 {:total-size 768, :num-objects 32, :class "c.l.PersistentVector$Node"}
 {:total-size 4656, :num-objects 33, :class "j.l.Object/1"}
 {:total-size 24000, :num-objects 1000, :class "j.l.Long"})

1001 leaf objects (no references to other objects)
1 root nodes (no reference to them from other objects _in this graph_)
number of objects of each in-degree (# of references to it):
({:in-degree 0, :num-objects 1}
 {:in-degree 1, :num-objects 1065}
 {:in-degree 32, :num-objects 1})
number of objects of each out-degree (# of references from it):
({:out-degree 0, :num-objects 1001}
 {:out-degree 2, :num-objects 33}
 {:out-degree 8, :num-objects 1}
 {:out-degree 31, :num-objects 1}
 {:out-degree 32, :num-objects 31})
Number and total size of objects at each distance from a starting object:
({:distance 0, :num-objects 1, :total-size 40}
 {:distance 1, :num-objects 2, :total-size 72}
 {:distance 2, :num-objects 10, :total-size 352}
 {:distance 3, :num-objects 31, :total-size 744}
 {:distance 4, :num-objects 31, :total-size 4464}
 {:distance 5, :num-objects 992, :total-size 23808})
#'user/g
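As a sanity check on how to read the degree tables in the output above: every reference leaves exactly one object and enters exactly one object, so weighting each degree by its number of objects should give the number of references either way. The sketch below copies the two tables from the output. The helper name total-degree is my own and not part of cljol.

```clojure
;; Copied from the in-degree and out-degree tables printed above.
(def in-degrees
  [{:in-degree 0, :num-objects 1}
   {:in-degree 1, :num-objects 1065}
   {:in-degree 32, :num-objects 1}])

(def out-degrees
  [{:out-degree 0, :num-objects 1001}
   {:out-degree 2, :num-objects 33}
   {:out-degree 8, :num-objects 1}
   {:out-degree 31, :num-objects 1}
   {:out-degree 32, :num-objects 31}])

(defn total-degree
  "Sum of degree * number-of-objects over all rows of a degree table."
  [degree-key rows]
  (reduce + (map (fn [row] (* (get row degree-key) (:num-objects row)))
                 rows)))

(total-degree :in-degree in-degrees)        ;; => 1097 references
(total-degree :out-degree out-degrees)      ;; => 1097 references
(reduce + (map :num-objects out-degrees))   ;; => 1067 objects
```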

Here is a way to create another graph g2 from g with all of g's leaves removed, and then draw g2:

(require '[ubergraph.core :as uber]
         '[cljol.graph :as gr]
         '[cljol.ubergraph-extras :as gre])

(def g2 (uber/remove-nodes* g (gr/leaf-nodes g)))

(d/view-graph g2)

Below we demonstrate keeping only those nodes that are within at most distance 3 of the starting objects given when creating the graph. That is, the Java object is reachable from one of the starting objects in 3 or fewer 'hops'.

(def g3 (gre/induced-subgraph g (filter #(<= (uber/attr g % :distance) 3)
                                         (uber/nodes g))))

(d/view-graph g3)
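To make the meaning of 'distance' concrete, here is a small breadth-first sketch in plain Clojure that does not use cljol or Ubergraph at all. The graph chain and the functions distances and within are hypothetical names, and I am assuming that distance means the fewest hops from the nearest starting object.

```clojure
;; Hypothetical reference graph: a simple chain :r -> :a -> :b -> :c -> :d.
(def chain {:r #{:a}, :a #{:b}, :b #{:c}, :c #{:d}, :d #{}})

(defn distances
  "Map from each node reachable from roots to its hop count from the
  nearest root, computed one breadth-first level at a time."
  [graph roots]
  (loop [dist (zipmap roots (repeat 0))
         frontier (set roots)
         d 1]
    (let [next-frontier (set (for [n frontier
                                   m (get graph n)
                                   :when (not (contains? dist m))]
                               m))]
      (if (empty? next-frontier)
        dist
        (recur (into dist (map (fn [m] [m d]) next-frontier))
               next-frontier
               (inc d))))))

(defn within
  "Set of nodes whose distance from the nearest root is at most max-dist."
  [graph roots max-dist]
  (set (for [[n d] (distances graph roots)
             :when (<= d max-dist)]
         n)))

(distances chain [:r])   ;; => {:r 0, :a 1, :b 2, :c 3, :d 4}
(within chain [:r] 3)    ;; => #{:r :a :b :c}
```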

These graphs g, g2, and g3 are all created using the Ubergraph library. All of its features are available for manipulating these graphs. The drawing functions use keys in the node and edge attribute maps to affect some aspects of the drawings, e.g. the :label key is used to generate the labels.

Another thing that can be interesting to see is the fraction of objects shared between a persistent collection, and the persistent collection created by making a small change to the first collection.

;; Create a graph of objects reachable from one or more
;; root objects.  Use a function add-attributes-by-reachability
;; to add color attributes to the graph nodes so that if
;; a node is reachable from one root object, but not any of
;; the others, it will be given a specified color.
;; If a node is reachable from more than one root node,
;; assign it the color 'multi-color'.  In this function,
;; all nodes of the graph should be reachable from one of
;; the given root objects, but in case there is some kind
;; of situation where a node cannot be reached from any
;; root object, assign it the color "gray".

(defn colored-graph [obj-color-pairs multi-color]
  (let [objs (map first obj-color-pairs)
        attrs (mapv (fn [[obj color]]
                      {:only-from obj :attrs {:color color}})
                    obj-color-pairs)
        attrs (conj attrs
                    {:from-multiple true :attrs {:color multi-color}}
                    {:from-none true :attrs {:color "gray"}})
        g (d/sum objs)]
    (d/add-attributes-by-reachability g attrs)))

;; Two similar Clojure vectors, with lots of sharing of objects.
(def v1 (vec (range 5)))
(def v2 (conj v1 5))
(def g (colored-graph [[v1 "red"] [v2 "green"]] "blue"))
(d/view-graph g)
(d/view-graph g {:save {:filename "g.pdf" :format :pdf}})

;; Two similar Clojure hash maps, with lots of sharing of objects.
(def m1 (hash-map 1 -1 2 -2 3 -3 5 -5 9 -9))
(def m2 (assoc m1 4 -4))
(def g (colored-graph [[m1 "red"] [m2 "green"]] "blue"))
(d/view-graph g)
(d/view-graph g {:save {:filename "g.pdf" :format :pdf}})
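The coloring rule described in the comments above can also be tried without cljol on a toy graph. The sketch below is not how add-attributes-by-reachability is implemented. The graph toy and the functions reachable and color-nodes are hypothetical, and every node is assumed to appear as a key of the graph map.

```clojure
;; Hypothetical reference graph: two roots :v1 and :v2 share :shared, :v2
;; also has its own :new-tail, and :orphan is reachable from neither root.
(def toy {:v1 #{:shared}
          :v2 #{:shared :new-tail}
          :shared #{}
          :new-tail #{}
          :orphan #{}})

(defn reachable
  "Set of nodes reachable from start, including start itself."
  [graph start]
  (loop [seen #{} todo [start]]
    (if-let [n (peek todo)]
      (if (seen n)
        (recur seen (pop todo))
        (recur (conj seen n) (into (pop todo) (get graph n))))
      seen)))

(defn color-nodes
  "Map from node to color: the root's color if the node is reachable from
  exactly one root, multi-color if reachable from several, \"gray\" if
  reachable from none."
  [graph root-color-pairs multi-color]
  (let [reach (for [[root color] root-color-pairs]
                [color (reachable graph root)])]
    (into {}
          (for [n (keys graph)]
            (let [colors (for [[color nodes] reach
                               :when (contains? nodes n)]
                           color)]
              [n (case (count colors)
                   0 "gray"
                   1 (first colors)
                   multi-color)])))))

(color-nodes toy [[:v1 "red"] [:v2 "green"]] "blue")
;; => {:v1 "red", :v2 "green", :shared "blue", :new-tail "green", :orphan "gray"}
```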

Error and Warning messages

When running with JDK 16 or later, you may see error messages like these when a package that cljol needs to inspect has not been opened, e.g. when the extra command line options recommended above are not used:

ERROR: Add these JVM command line options to avoid errors determining field values of objects: --add-opens java.base/java.util.concurrent.locks=ALL-UNNAMED
ERROR: Add these JVM command line options to avoid errors determining field values of objects: --add-opens java.base/java.lang=ALL-UNNAMED

These occur when cljol attempts to read the values of fields of JVM objects and an InaccessibleObjectException is thrown. When that happens, cljol produces incorrect graphs of objects, and shows the message .setAccessible failed instead of the true values of fields.

I typically see a warning like the one below the first time I call one of the functions view, write-dot-file, or write-drawing-file:

# WARNING: Unable to attach Serviceability Agent. sun.jvm.hotspot.memory.Universe.getNarrowOopBase()

The presence of this warning does not seem to harm the functionality of cljol in any way.

Tested with:

  • Ubuntu 24.04, Adoptium OpenJDK 8, 11, 16-24, Clojure 1.12.0
  • Ubuntu 18.04.2, OpenJDK 11, Clojure 1.10.1
  • Ubuntu 18.04.2, Oracle JDK 8, Clojure 1.10.1
  • Mac OS X 10.13 High Sierra, Oracle JDK 8, Clojure 1.10.1

It should work with older versions of Clojure, too, but I do not know how far back it can go. Probably as far back as Clojure 1.7, which is required by the version of the Loom library that cljol depends upon.

Possible future work

Perhaps some day this library might be enhanced to create nice figures and/or summary statistics showing how many of these objects are shared between two Clojure collections. There is some code in the cljol.dig namespace written with that in mind, but it is at best not well tested and thus probably contains many errors, if it even runs at all.

It would be nice if there were a way to cause the edges out of Java array objects to be drawn, at least usually, in increasing order of array index. Right now their order is fairly arbitrary.

License

Copyright © 2016-2019 Andy Fingerhut

This program and the accompanying materials are made available under the terms of the Eclipse Public License 2.0 which is available at http://www.eclipse.org/legal/epl-2.0

About

Experimental code using Java Object Layout (JOL) from Clojure
