Replace timing-based test `unpack_sums_6` with one based on instruction counts.

Currently, `unpack_sums_6` tries to test that unpacked sums are indeed faster by measuring the runtime of two different functions.

This works fine on a machine with low load. But on potentially oversaturated CI machines, with tests running in parallel, there is no good way to guarantee that we can measure these runtimes reliably.

This was actually pointed out in the original MR already, but I had some faith that the runtime difference would be big enough for the test not to fail spuriously on CI. I was wrong.

I think the proper way to do this is to measure instructions executed (as a proxy for runtime), something we have talked about for a long time. See #22278 and #19546.

While this is very unsatisfying, I will simply mark this test as flaky for now.

Ideally, we will fix #22278 and #19546 in the future and then add a test that tracks the runtime via instructions executed, which would be better all around.
