venn.js compared to venneuler

The VennEuler approach is described in an excellent paper by Leland Wilkinson. As far as I'm aware this is the best published academic research on laying out area proportional Venn and Euler diagrams.

This page runs the same performance test on VennEuler that was used to test venn.js.

Circles are randomly positioned and then the intersection areas are calculated from these circles. Using only the intersection areas, VennEuler has to plot circles whose areas match the input.

Unfortunately, the VennEuler package doesn't perform all that well on this test. The problem seems to be that while VennEuler frequently gets a solution that is close to being correct, it rarely gets a solution that is close enough for this test to say it succeeded.

Examples of failure cases are shown first, followed by aggregate graphs showing performance relative to venn.js. Finally a couple of reasons for VennEuler's performance on this test are discussed.

A simple failing example

Calling VennEuler with two disjoint sets can produce a diagram where the circles overlap.

For example, calling VennEuler with two sets of size 98 and 48 that have 0 overlap between them produces this diagram:

The overlap is this diagram is small (about 2.2% of the size of the '1' set), but very noticeable. The performance test I'm using here labels this diagram as failing to adequately represent the input since the two sets shouldn't overlap at all.

However, VennEuler reports that it successfully laid out this diagram. The VennEuler paper defines a 'stress' metric that measures how well the diagram fits the input. Lower values are better, and the paper suggests that if the stress is less than 0.01 then the solution is good enough to use.

The stress for the above solution above is 0.0001 - 100x lower than the cutoff suggested. This is because of how that stress metric is calculated: its the sum of squared errors normalized by the sum of squared total areas. Since the circles themselves are included in the total areas, the error in this case is divided by (98² + 48²). Since the error is only ~ 1² and that is divided by the large squared total area - the stress value is very low.

Other Examples

More examples of individual trials in this test are shown below. While VennEuler occasionally works, it frequently produces solutions that differ by a significant amount from the original. These areas are shown in red if the difference in areas is more than 10%, at which stage we say it failed to reconstruct.

Original

Reconstructed

The results here are precomputed, with random circles being chosen uniformly in (0, 1) with radii uniformly in (.1, .5).

Aggregate Performance

Since VennEuler is a java package, precomputed results are shown here. To get results from live code, or to change the test parameters - type 'gradle run' from the same directory as this file. This will launch a simple java server that exposes the layouts from VennEuler to this page over a HTTP JSON api.

Success Rate

Average Time (ms)

Discussion

Wilkinson performs a similar test in his paper, and reports positive results with an average stress of .006. Since the performance here seems much worse than whats reported in the paper, I thought I'd quickly list out some of the reasons behind this discrepancy.

The test here is exceedingly strict and simply an average of a binary success/failure judgement. VennEuler frequently produces diagrams that are very close to being correct, but with a few failing areas.

The stress metric that VennEuler reports breaks down in the case of small regions in larger diagrams as seen in the first example. VennEuler uses this stress metric for an early exit performance optimization causing it to potentially stop the optimization process too early in some cases.

VennEuler does three separate steps to generate the layout. The first step is an initial layout based off of MDS. This is followed by steepest descent on a pseudo-gradient function and then finally coordinate descent directly on the loss function.

For an initial layout, MDS is better than random - but much worse than the greedy layout I wrote or the Constrained MDS variant I came up with. VennEuler also computes MDS on Jaccard distance to generate the initial layout, which seems to perform worse than the Euclidean distance that I used.

The pseudo-gradient descent step is interesting and actually performs fairly well. I ran some experiments with my own version of it when developing this code. I found it is the second best optimization technique I tried when starting from a random layout. However to get good results I had to run it for many more iterations than in the VennEuler package: VennEuler ran 50 iterations and I was seeing good results at 5000 iterations. This explains why VennEuler seems to get solutions that are mostly right, but frequently contains some small errors. Ultimately the Constrained MDS optimizer I wrote outperformed this method in terms of both speed and accuracy, so I decided not to use this.

Finally VennEuler uses coordinate descent to fine tune the results. I don't believe this will perform as well as a fine tuning step as the Nelder-Mead optimizer I'm using though I haven't done any experiments to verify.

While the running time seems worse in VennEuler, VennEuler scales up better than venn.js does. This is because the number of regions in the venn diagram grows exponentially with the number of sets. VennEuler uses an approximation method for calculating intersection area sizes that mitigates this, while venn.js calculates the sizes of each region directly. While each call is individually fast the sheer number of calls explodes as the number of sets increases, causing venn.js to perform sluggishly. Also worth noting is that timing results reported elsewhere in this repo are done on just the 2-way set intersections: I'm doing all circle intersections here to exactly match the input given to VennEuler. Using only 2-way set intersections leads to identical diagrams in this test, and is roughly a 100x faster on the 8-set case.

Most of the discrepancy in results is just because this test expects basically exact replicas of the input. VennEuler does a fine job of capturing the global structure, but it frequently lacks the fine details that are necessary to pass this test.

Comparison with VennEuler