For each benchmark, I plot the data at three granularities.

First, using the 'natural' sample size: since we used 2000 buckets for our histograms, the sample size is 2000. For each benchmark in the table of links below, the first line uses this natural sample size. The table below summarises the number of partitions and sites required to cover 95% of allocation, by space rental (sr) and by volume (vol); a sketch of how the delta column can be computed follows the table. Only the larger input sizes are shown (speed 100 for SPECjvm98, large for DaCapo, except hsqldb, which uses the default size).
benchmark | s/c | sr 95%: partitions | sr 95%: sites | vol 95%: partitions | vol 95%: sites | delta |
---|---|---|---|---|---|---|
compress | 6 | 7 | 19 | 14 | 26 | 0.05 |
jess | 7 | 20 | 62 | 2 | 118 | 0.05 |
raytrace | 7 | 10 | 68 | 8 | 159 | 0.05 |
db | 5 | 7 | 19 | 12 | 24 | 0.05 |
javac | 5 | 40 | 97 | 74 | 485 | 0.05 |
mpegaudio | 9 | 15 | 40 | 50 | 142 | 0.05 |
mtrt | 7 | 9 | 65 | 9 | 163 | 0.05 |
jack | 7 | 18 | 54 | 18 | 124 | 0.05 |
antlr | 20 | 13 | 47 | 18 | 291 | 0.09 |
bloat | 11 | 10 | 48 | 5 | 810 | 0.05 |
fop | 21 | 11 | 91 | 26 | 708 | 0.20 |
jython | 15 | 15 | 47 | 12 | 92 | 0.06 |
pmd | 9 | 24 | 82 | 22 | 191 | 0.06 |
ps | 23 | 7 | 36 | 5 | 105 | 0.06 |
hsqldb | 15 | 9 | 36 | 6 | 124 | 0.07 |
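The delta column is the maximum gap between the cumulative lifetime-distribution curves of the sites merged into a partition (see the discussion of delta below). Here is a minimal sketch of how such a delta can be computed from two 2000-bucket lifetime histograms; the `delta` function and the synthetic data are illustrative assumptions, not the actual analysis code:

```python
import numpy as np

def delta(hist_a, hist_b):
    """Maximum vertical gap between the cumulative lifetime-distribution
    curves of two allocation sites (the 'delta' column above)."""
    cdf_a = np.cumsum(hist_a) / np.sum(hist_a)
    cdf_b = np.cumsum(hist_b) / np.sum(hist_b)
    return float(np.max(np.abs(cdf_a - cdf_b)))

# Two hypothetical sites with similar, short-lived behaviour.
rng = np.random.default_rng(0)
a, _ = np.histogram(rng.exponential(50, 10_000), bins=2000, range=(0, 2000))
b, _ = np.histogram(rng.exponential(55, 10_000), bins=2000, range=(0, 2000))
print(delta(a, b))  # small gap: candidates for the same partition
```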
In most cases, the effect of using 2000 buckets is that the interval between buckets is very small (the s/c graphs above give this interval as "granularity"). No stop-the-world collector would operate at such a fine grain. So, for the second line of each benchmark in the table of links below, the sample instead assumes that the interval between GCs is at least 1MB. Unless the total allocation is very large (>2000MB), this gives a smaller sample size, and therefore makes our clustering analysis more tolerant of differences between lifetime distributions. The effect, we believe, is to cluster looking through rather coarser-grained glasses, as if we had re-bucketed the data (a sketch follows the table below). For small inputs (e.g. SPECjvm98 speeds less than 100) the tolerance is much too high, but for speed 100 it seems to make sense. Here we give s/c for the large input size only, and include all partition/site numbers.
benchmark | s/c | sr 95%: partitions | sr 95%: sites | vol 95%: partitions | vol 95%: sites | delta |
---|---|---|---|---|---|---|
compress | 13 | 6 | 19 | 11 | 26 | 0.22 |
jess | 9 | 12 | 62 | 2 | 118 | 0.14 |
raytrace | 13 | 8 | 68 | 3 | 159 | 0.19 |
db | 11 | 5 | 19 | 10 | 24 | 0.25 |
javac | 11 | 13 | 97 | 22 | 485 | 0.16 |
jack | 13 | 12 | 54 | 6 | 124 | 0.15 |
antlr | 20 | 13 | 47 | 18 | 291 | 0.09 |
bloat | N/A: default frequency > 1MB | |||||
fop | 21 | 11 | 91 | 26 | 708 | 0.20 |
hsqldb | 15 | 9 | 36 | 6 | 124 | 0.07 |
jython | 15 | 15 | 47 | 12 | 92 | 0.06 |
pmd | 9 | 24 | 82 | 22 | 191 | 0.06 |
ps | 23 | 7 | 36 | 5 | 105 | 0.06 |
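A minimal sketch of this re-bucketing, assuming the histogram is stored as a numpy array of per-bucket volumes; `rebucket` is a hypothetical helper, not the actual analysis code:

```python
import numpy as np

MB = 1 << 20

def rebucket(hist, total_alloc, gc_interval=1 * MB):
    """Coarsen a lifetime histogram so that each bucket covers at least
    `gc_interval` bytes of allocation. With 2000 natural buckets, this
    shrinks the sample size unless total allocation exceeds 2000MB."""
    n = len(hist)                                    # natural sample size
    target = min(n, max(1, int(total_alloc // gc_interval)))
    group = -(-n // target)                          # ceil(n / target)
    pad = (-n) % group                               # zero-pad to a multiple
    padded = np.concatenate([hist, np.zeros(pad, dtype=hist.dtype)])
    return padded.reshape(-1, group).sum(axis=1)

# e.g. a benchmark allocating 500MB: 2000 buckets collapse to 500.
coarse = rebucket(np.ones(2000, dtype=int), 500 * MB)
print(len(coarse))  # 500
```

The increased tolerance follows from the sample sizes: for a two-sample Kolmogorov-Smirnov-style comparison, the critical gap at a fixed significance level scales as sqrt((n+m)/(nm)), so halving both sample sizes raises the tolerated delta by roughly a factor of sqrt(2).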
For the third line of each benchmark in the table of links below, the sample assumes that the interval between GCs is at least 4MB (the re-bucketing sketch above applies unchanged, with gc_interval = 4 * MB). This is suitable only for the very largest benchmarks: if a benchmark is too small, this granularity aggregates all non-immortal sites into the same partition. Thus, the only runs worth examining at this granularity are those in the table below (chosen to have delta < 0.2).
benchmark | s/c | sr 95%: partitions | sr 95%: sites | vol 95%: partitions | vol 95%: sites | delta |
---|---|---|---|---|---|---|
antlr_large | 34 | 9 | 47 | 9 | 291 | 0.18 |
bloat_large | 14 | 7 | 48 | 2 | 810 | 0.08 |
hsqldb | 21 | 7 | 36 | 7 | 124 | 0.14 |
jython_large | 20 | 12 | 47 | 7 | 92 | 0.11 |
pmd_large | 14 | 15 | 82 | 14 | 191 | 0.12 |
ps | 31 | 5 | 44 | 3 | 101 | 0.19 |
ps_large | 27 | 6 | 36 | 5 | 105 | 0.11 |
An appropriate question to ask about partitioning is: to what extent does more aggressive partitioning (i.e. a larger GC frequency) aggregate into one partition sites whose behaviour is 'too' different? Judging by the graphs for the large inputs of antlr, bloat, hsqldb, jython, pmd and ps (and ps at the default size), the answer seems to be that they are all either much the same (modulo some reordering of the partitions by space rental) or at least broadly similar (hsqldb, pmd and ps at the default size). More aggressive clustering does not appear to conflate different behaviours. One conclusion that might be drawn is that a delta (the maximum gap between the cumulative distribution function curves) of less than 0.20 works well; a sketch of clustering under such a threshold follows.
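To make the threshold concrete, here is a minimal sketch of one way sites could be grouped under a delta < 0.20 rule. The greedy single-link scheme is an assumption for illustration, not necessarily the clustering the analysis actually performs:

```python
import numpy as np

def cluster_sites(hists, threshold=0.20):
    """Greedy single-link clustering of allocation sites: a site joins
    an existing partition if its delta against any member is below
    `threshold` (the <0.20 rule suggested above); otherwise it starts
    a new partition."""
    cdfs = [np.cumsum(h) / np.sum(h) for h in hists]
    partitions = []  # each partition is a list of site indices
    for i, cdf in enumerate(cdfs):
        for part in partitions:
            if any(np.max(np.abs(cdf - cdfs[j])) < threshold for j in part):
                part.append(i)
                break
        else:
            partitions.append([i])
    return partitions
```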
Here's how to read our 3D plots. Time of death is plotted horizontally, from 0% at the right to 100% at the left. Age is plotted from 0% at the back to 100% at the front. The volume that died is plotted vertically. Note that no point can fall south-east of the green line: its age would be greater than its time of death, i.e. the object would have been allocated before the program started. The plots have been annotated with coloured rectangles that group objects which seem to live and die together, i.e. with opposite corners at (phase_end - max_age, min_age) and (phase_end, max_age).
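A minimal sketch of the rectangle test implied by those corner coordinates; `in_cluster` and its parameter names are hypothetical, chosen for illustration:

```python
def in_cluster(time_of_death, age, phase_end, min_age, max_age):
    """Test whether a (time of death, age) point lies in the cluster
    rectangle with opposite corners (phase_end - max_age, min_age)
    and (phase_end, max_age). All values are percentages of run time.
    Points with age > time_of_death cannot occur: the object would
    have been allocated before the program started (the region
    south-east of the green line)."""
    if age > time_of_death:
        raise ValueError("impossible point: age exceeds time of death")
    return (phase_end - max_age <= time_of_death <= phase_end
            and min_age <= age <= max_age)

# An object dying at 60% of run time, aged 15%, in a phase ending at 62%:
print(in_cluster(60, 15, phase_end=62, min_age=5, max_age=20))  # True
```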
benchmark | clusters (0 = immortal) |
---|---|