Skip to content

GitLab

  • Projects
  • Groups
  • Snippets
  • Help
    • Loading...
  • Help
    • Help
    • Support
    • Community forum
    • Submit feedback
  • Sign in / Register
GHC
GHC
  • Project overview
    • Project overview
    • Details
    • Activity
    • Releases
  • Repository
    • Repository
    • Files
    • Commits
    • Branches
    • Tags
    • Contributors
    • Graph
    • Compare
    • Locked Files
  • Issues 4,251
    • Issues 4,251
    • List
    • Boards
    • Labels
    • Service Desk
    • Milestones
    • Iterations
  • Merge Requests 394
    • Merge Requests 394
  • Requirements
    • Requirements
    • List
  • CI / CD
    • CI / CD
    • Pipelines
    • Jobs
    • Schedules
  • Security & Compliance
    • Security & Compliance
    • Dependency List
    • License Compliance
  • Operations
    • Operations
    • Incidents
    • Environments
  • Analytics
    • Analytics
    • CI / CD
    • Code Review
    • Insights
    • Issue
    • Repository
    • Value Stream
  • Wiki
    • Wiki
  • Snippets
    • Snippets
  • Members
    • Members
  • Collapse sidebar
  • Activity
  • Graph
  • Create a new issue
  • Jobs
  • Commits
  • Issue Boards
  • Glasgow Haskell Compiler
  • GHCGHC
  • Issues
  • #3181

Closed
Open
Opened Apr 21, 2009 by dolio@trac-dolio

Regression in unboxing

Greetings,

Finally having added some benchmarking to my uvector-algorithms package, I came across a regression in performance in the heap sort implementation. With 6.10.2, it did significantly more allocation than I'd remembered, and looking at the core, I spotted a boxed type that I have since determined to be the culprit. I'll attach the relevant portion of the algorithm as a test case (I apologize that it's rather large, but I haven't yet figured out how to make a smaller example that works).

The key lines are:

do (child' :*: ac) <- maximumChild cmp a off child len
   case cmp val ac of
     LT -> writeMU a (root + off) ac >> sift val child' len
     _  -> writeMU a (root + off) val

The monadic (ST) pattern match against the strict pair gets compiled to a function that accepts the arguments of the pair (as well as the ST state parameter) like so:

$w$j_s118 :: State# RealWorld
             -> Int
             -> Int#
             -> (# State# RealWorld, () #)
[Arity 3

$w$j_s118 =
  \ (w_s108 :: State# RealWorld)
    (ww_s10b :: Int)
    (ww1_s10e :: Int#) ->
    case <# sc3_s11t ww1_s10e of wild11_aY2 {
      False ->
        case writeIntArray#
               @ RealWorld
               marr#_aVT
               (+# sc2_s11s 40)
               sc3_s11t
               w_s108
        of s2#1_aX6 { __DEFAULT ->
        (# s2#1_aX6, () #)
        };
      True ->
        case writeIntArray#
               @ RealWorld
               marr#_aVT
               (+# sc2_s11s 40)
               ww1_s10e
               w_s108
        of s2#1_aX6 { __DEFAULT ->
        case ww_s10b of w1_X10F { I# ww2_X11k ->
        $s$wa_s11D s2#1_aX6 sc1_s11r ww2_X11k sc3_s11t
        }
        }
    }

As can be seen, all that is done with the boxed Int is that it is taken apart in one branch (this identifies the boxed argument as the child' variable, the Int being returned by maximumChild; I verified this by modifying the code to use a custom type:

data IP e = IP {-# UNPACK #-} !Int !e

this results in the desired unboxing behavior). Further, all calls to this function look like:

$w$j_s118 s2#3_XZ1 (I# x_aTU) r#_aXD

So this seems to be unnecessary boxing. Further, I have reports that 6.8.2 *did* spot this unboxing opportunity, which would explain why my algorithm wasn't performing as well as I had remembered, since the last time I'd seriously inspected the performance was in the 6.8 series.

One theory I had was that the fact that the value was only used in one branch of the case was causing it to look non-strict somehow, despite the pair being strict in its fields (perhaps the pair was eliminated before strictness analysis was done?). However, I've added bang patterns to the child' match, and changed the case statement to:

case child' `seq` cmp val ac of ...

without it making a difference. So I am a bit stumped as to why exactly GHC isn't eliminating this box.

As noted, I can get the desired unboxing with a custom unpacked type, but fixing the regression would be preferable. Thanks for your time and help!

Trac metadata
Trac field Value
Version 6.10.2
Type Bug
TypeOfFailure OtherFailure
Priority normal
Resolution Unresolved
Component Compiler
Test case
Differential revisions
BlockedBy
Related
Blocking
CC
Operating system
Architecture
Assignee
Assign to
6.12.3
Milestone
6.12.3
Assign milestone
Time tracking
None
Due date
None
Reference: ghc/ghc#3181