Opened Jul 02, 2018 by Sebastian Graf (@sgraf812)

Nursery size adulterates cachegrind metrics in nofib

I'm currently investigating an alleged regression in my branch of the late lambda lift and hit a confusing data point. Note that my findings rely heavily on the instruction and memory access counts reported by cachegrind.

Check out the most recent version of nofib, enter shootout/binary-trees and run the following script:

#! /bin/sh
sed -i 's/import Debug.Trace//g' Main.hs # Make the following line idempotent
echo "import Debug.Trace" | cat - Main.hs > Main.tmp && mv Main.tmp Main.hs # add the import for trace

# bt1: Vanilla
sed -i 's/trace "" \$ bit/bit/g' Main.hs # strip any `trace "" $ ` prefix from the call to `bit`
ghc -O2 -XBangPatterns Main.hs -o bt1

# bt2: Additional trace call
sed -i 's/bit/trace "" $ bit/g' Main.hs # prepend `trace "" $ ` to the call to `bit`
ghc -O2 -XBangPatterns Main.hs -o bt2

valgrind --tool=cachegrind ./bt1 12 2>&1 > /dev/null # without trace
valgrind --tool=cachegrind ./bt2 12 2>&1 > /dev/null # with trace

This will compile two versions of binary-trees: the original, unchanged version (bt1) and one with an extra trace "" $ call before the only call to the bit function (bt2). One would expect bt2 to allocate more than bt1. Indeed, the output of +RTS -s suggests that:

$ ./bt1 12 +RTS -s
...
43,107,560 bytes allocated in the heap
...
$ ./bt2 12 +RTS -s
...
43,116,888 bytes allocated in the heap
...
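To put the two allocation figures side by side, here is a minimal sketch that computes the difference. The numbers are copied from the +RTS -s output above; alloc_delta is a hypothetical helper name used only for illustration.

```shell
#!/bin/sh
# Byte counts copied from the "+RTS -s" output above.
bt1_alloc=43107560   # without trace
bt2_alloc=43116888   # with trace

# alloc_delta (hypothetical helper): bytes by which $2 exceeds $1.
alloc_delta() {
    echo $(( $2 - $1 ))
}

echo "extra bytes allocated by bt2: $(alloc_delta "$bt1_alloc" "$bt2_alloc")"
```

The extra allocation is tiny relative to the total (roughly 0.02%), so by this measure the two builds are nearly identical.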

That's fine. A few benchmark runs by hand also suggested the tracing version is a little slower (probably due to IO).

Compare that to the output of the above cachegrind calls:

$ valgrind --tool=cachegrind ./bt1 12 > /dev/null
...
I   refs:      118,697,583
...
D   refs:       43,475,212
...
$ valgrind --tool=cachegrind ./bt2 12 > /dev/null
...
I   refs:      116,340,710
...
D   refs:       42,523,369
...
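The size of the gap is easier to judge as a relative change. The following is a small sketch using the counts above; rel_change is a hypothetical helper, and awk is only used for the floating-point division.

```shell
#!/bin/sh
# rel_change (hypothetical helper): percentage change going from $1 to $2,
# printed with two decimal places.
rel_change() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.2f\n", (b - a) / a * 100 }'
}

# Counts copied from the cachegrind output above (bt1 first, then bt2).
echo "I refs change: $(rel_change 118697583 116340710)%"
echo "D refs change: $(rel_change 43475212 42523369)%"
```

Both counters come out roughly 2% lower for bt2, which is far larger than the allocation difference reported by +RTS -s and points in the opposite direction.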

It's the other way round here! How's that possible?

Even if this isn't strictly a bug in GHC or NoFib, it's relevant nonetheless, as our benchmark infrastructure currently relies on instruction counts. I couldn't reproduce this by writing my own no-op trace _ a = a; {-# NOINLINE trace #-}, btw.

I checked this on GHC 8.2.2, 8.4.3 and a semi-recent HEAD commit (bb539cfe).
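Since the title suspects the nursery size, one way to probe this would be to repeat the cachegrind runs with several explicit nursery sizes via the RTS -A flag and check whether the ordering of bt1 and bt2 changes. Below is a dry-run sketch that only prints the commands. It assumes both binaries were linked with -rtsopts so that -A is accepted; the sizes and the nursery_cmd helper are illustrative and not taken from the issue.

```shell
#!/bin/sh
# nursery_cmd (hypothetical helper): print the cachegrind command line for
# binary $1 with nursery size $2. Nothing is executed here.
nursery_cmd() {
    printf 'valgrind --tool=cachegrind ./%s 12 +RTS -A%s -RTS > /dev/null\n' "$1" "$2"
}

for size in 1m 2m 4m; do      # illustrative nursery sizes (assumption)
    for bin in bt1 bt2; do
        nursery_cmd "$bin" "$size"
    done
done
```

Piping the printed lines to sh would run them; comparing the I refs and D refs per nursery size would show whether the effect depends on -A.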

Edited Mar 10, 2019 by Sebastian Graf
Reference: ghc/ghc#15333