Finish heap profiling by roots feature
This is (hopefully) the last step for #16788 (closed). It follows !1433 (closed), the PR currently under review. This PR is now also ready for merging, and reviewers who would rather review the whole thing at once are welcome to do so.
The highlights include:
- refactor the traversal-based profilers to integrate more tightly with the heap profiler, which simplifies the code
- remove the "flip" bit hack
- fix roots being kept alive forever
- add a user's guide section