Excessive heap space size
DataStax recommends using the default heap space size for most use
cases. Exceeding this size can impair the Java virtual machine's
(JVM) ability to perform fluid garbage collections (GC). The
following table shows a comparison of heap space performances
reported by a Cassandra user:
Heap     CPU utilization   Queries per second     Latency
40 GB    50%               750                    1 second
8 GB     5%                8500 (not maxed out)   10 ms
For information on heap sizing, see Tuning Java resources.
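To make "the default heap space size" concrete, here is a minimal Java sketch of the rule described in the comments of the cassandra-env.sh script shipped with Cassandra releases of that era: max(min(1/2 RAM, 1024 MB), min(1/4 RAM, 8192 MB)). The class and method names are hypothetical and exist only for illustration. Check the script in your own installation, since the rule may differ between versions.

```java
// Sketch of the default max-heap rule as described in cassandra-env.sh comments:
//   max(min(1/2 RAM, 1024 MB), min(1/4 RAM, 8192 MB))
// HeapSizingSketch and defaultMaxHeapMb are hypothetical names, not Cassandra APIs.
final class HeapSizingSketch {

    /** Returns the default max heap in MB for a machine with the given RAM in MB. */
    static long defaultMaxHeapMb(long systemMemoryMb) {
        long halfCapped = Math.min(systemMemoryMb / 2, 1024);     // half of RAM, capped at 1 GB
        long quarterCapped = Math.min(systemMemoryMb / 4, 8192);  // quarter of RAM, capped at 8 GB
        return Math.max(halfCapped, quarterCapped);               // pick the larger of the two
    }

    public static void main(String[] args) {
        long[] ramSizesMb = {2048, 16384, 65536};
        for (long ram : ramSizesMb) {
            System.out.println(ram + " MB RAM -> " + defaultMaxHeapMb(ram) + " MB heap");
        }
    }
}
```

Under this rule the default never exceeds 8 GB, however much RAM the machine has, which lines up with the 8 GB row in the table above.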
As the benchmark indicates, the more heap you allocate, the higher the CPU usage. The performance decrease is not proportional to the heap size either: with five times the heap, CPU utilization is ten times higher, throughput drops to less than a tenth, and latency is a hundred times worse. So it is wise to keep the heap at 8 GB, or at no more than 50% of that value. An oversized heap is not deadly, but it can degrade the performance of the cluster so dramatically that it becomes useless. If you encounter memory errors in the log in this situation, then, other factors aside, it is better to consider scaling your cluster horizontally, that is, adding more nodes to increase the capacity. But a quick workaround should you encounter memory error, the
So what really happens in the GC when a large heap is allocated? Well, here is an excerpt from the guru:
..the concurrent mark/sweep phase runs concurrently with your
application. CMS will cause a stop-the-world full pause if it fails to
complete a CMS sweep in time and you hit the maximum heap size, but
unless that happens, CMS will run concurrently (though there are
stop-the-world pauses involved, that are typically very short, the
mark/sweep phase is concurrent).
Hence, if you really do hit a stop-the-world situation, the node becomes useless: it is so busy doing GC that Cassandra cannot get any real work done.
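One rough way to see how busy a JVM is with garbage collection is to read the standard java.lang.management beans. The sketch below is only an illustration under stated assumptions: it inspects the JVM it runs in, not a Cassandra node (for a node you would read the same beans remotely over JMX), and the class and method names are hypothetical. The collection time reported by the beans is an approximate accumulated figure, not an exact measure of stop-the-world pauses.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// GcOverheadSketch is a hypothetical name; the MXBean calls are standard JDK APIs.
final class GcOverheadSketch {

    /** Sums the approximate accumulated GC time (ms) over all collectors of this JVM. */
    static long totalGcTimeMs() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long timeMs = gc.getCollectionTime(); // returns -1 when the value is undefined
            if (timeMs > 0) {
                total += timeMs;
            }
        }
        return total;
    }

    /** Fraction of uptime attributed to GC; returns 0 when uptime is not positive. */
    static double gcFraction(long gcTimeMs, long uptimeMs) {
        return uptimeMs <= 0 ? 0.0 : (double) gcTimeMs / uptimeMs;
    }

    public static void main(String[] args) {
        long maxHeapMb = Runtime.getRuntime().maxMemory() / (1024 * 1024);
        long uptimeMs = ManagementFactory.getRuntimeMXBean().getUptime();

        System.out.println("Max heap: " + maxHeapMb + " MB");
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println(gc.getName() + ": " + gc.getCollectionCount()
                    + " collections, " + gc.getCollectionTime() + " ms");
        }
        System.out.println("GC share of uptime: " + gcFraction(totalGcTimeMs(), uptimeMs));
    }
}
```

A GC share of uptime that keeps climbing would be one hint that the node is spending its time collecting rather than serving requests.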