With the non-moving GC, performGC does not promptly free objects in the nursery.
I am not sure whether this is a bug.
Our application (hasktorch) manages many C++ objects through ForeignPtr. (hasktorch is an FFI binding to torch.) The pointers are short-lived.
Each C++ object needs a large memory area on the GPU. The application calls performGC frequently, because I want that memory released quickly after each epoch.
When we do not use the non-moving GC, the memory is released soon after performGC is called. Memory usage does not increase, and the program finishes in a few minutes.
When we use the non-moving GC, the memory is not released soon after performGC is called. Memory usage keeps increasing until we can no longer allocate memory on the GPU.
Is there a correct way to force the release of short-lived objects?
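To make the pattern concrete, here is a minimal sketch that uses only base and no GPU. It is not the hasktorch code: `newFakeTensor`, `runEpochs`, and the `live` counter are hypothetical names, and a `mallocBytes` buffer stands in for a C++ object. The first mode relies on performGC alone, as our application does. The second mode additionally calls `finalizeForeignPtr`, which runs the finalizers immediately instead of waiting for a collection. I have not verified whether that is practical for hasktorch. To compare collectors, I assume the program is built with `-rtsopts` and run with `+RTS -xn` for the non-moving GC.

```haskell
import Control.Monad (forM_, when)
import Data.IORef (IORef, atomicModifyIORef', newIORef, readIORef)
import Data.Word (Word8)
import Foreign.Concurrent (newForeignPtr)
import Foreign.ForeignPtr (ForeignPtr, finalizeForeignPtr, withForeignPtr)
import Foreign.Marshal.Alloc (free, mallocBytes)
import System.Mem (performGC)

-- Hypothetical stand-in for a C++ object: a malloc'ed buffer whose
-- finalizer frees it and decrements a counter of live objects.
newFakeTensor :: IORef Int -> Int -> IO (ForeignPtr Word8)
newFakeTensor live size = do
  p <- mallocBytes size
  atomicModifyIORef' live (\n -> (n + 1, ()))
  newForeignPtr p $ do
    free p
    atomicModifyIORef' live (\n -> (n - 1, ()))

-- Allocate one short-lived object per "epoch" and return how many are
-- still alive at the end. With explicitRelease = False the release
-- depends entirely on when the GC decides to run the finalizers.
runEpochs :: Bool -> Int -> IO Int
runEpochs explicitRelease epochs = do
  live <- newIORef 0
  forM_ [1 .. epochs] $ \_ -> do
    t <- newFakeTensor live 1024
    withForeignPtr t $ \_ -> pure ()          -- pretend to use the object
    when explicitRelease (finalizeForeignPtr t) -- run finalizers right now
    performGC                                  -- what our application does
  readIORef live

main :: IO ()
main = do
  gcOnly <- runEpochs False 100
  putStrLn ("live objects, performGC only: " ++ show gcOnly)
  explicit <- runEpochs True 100
  putStrLn ("live objects, finalizeForeignPtr: " ++ show explicit)
```

With explicit finalization the counter is back at zero when the loop ends, regardless of the collector. With performGC only, the count depends on when the runtime schedules the finalizers, which is the behavior this report is about.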
Steps to reproduce
https://github.com/hasktorch/hasktorch/blob/master/examples/static-mnist-cnn/Main.hs This program runs on CUDA.
I'm sorry I haven't provided an example that can be easily reproduced.
Expected behavior: the program runs to completion without running out of GPU memory.
- GHC version used: 8.10.1
- Operating System: Ubuntu 18.04
- System Architecture: x86_64