Closed
Opened Jul 27, 2013 by Edward Z. Yang (@ezyang)

Add fudge-factor for performance tests run on non-validate builds

Since I'm not going to get around to this immediately, Trac'ifying for posterity:

These tests have been doing better than expected in the nightlies for some while.

 Unexpected failures:
    perf/compiler  T3064 [stat too good] (normal)
    perf/compiler  T3294 [stat too good] (normal)
    perf/compiler  T5642 [stat too good] (normal)
    perf/haddock   haddock.Cabal [stat too good] (normal)
    perf/haddock   haddock.base [stat too good] (normal)

Unfortunately, fixing them is not a simple matter of shifting the ranges up: the tests only exceed expectations on a 'perf' build, so on a normal build such as 'quick' these tests all pass normally.

I could bump up the upper bounds so that the builder stops bleating about them; perhaps we could do something more complicated where the expected performance depends on what level of optimization GHC was built with (but I don't know how to implement this).
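To make that second idea concrete, here is a minimal sketch of what per-build-flavour bounds could look like. It is purely illustrative and not the real testsuite driver: the flavour names, the numbers, and the check_stat helper are all made up for the example.

    # A purely illustrative sketch (not the real testsuite driver): keep a
    # separate expected value per build flavour, so a 'perf' build and a
    # 'quick' build are each compared against their own baseline.
    # All names and numbers here are made up.
    EXPECTED_BYTES_ALLOCATED = {
        # flavour -> (expected value, allowed deviation in percent)
        'perf':  (280000000, 5),
        'quick': (340000000, 5),
    }

    def check_stat(flavour, actual):
        expected, deviation = EXPECTED_BYTES_ALLOCATED[flavour]
        lower = expected * (1 - deviation / 100.0)
        upper = expected * (1 + deviation / 100.0)
        if actual < lower:
            return 'stat too good'
        if actual > upper:
            return 'stat too bad'
        return 'ok'

    print(check_stat('perf', 260000000))   # 'stat too good' on a perf build
    print(check_stat('quick', 335000000))  # 'ok' on a quick build

The obvious cost, as noted below, is that every flavour adds another set of numbers that has to be kept up to date.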


The problem with just widening the bounds to cover two different types of build is that it increases the chance that performance changes won't actually be noticed by the person responsible.

Having different bounds for different build configurations is a pain, because (a) the testsuite has to work out which set of bounds to use, and (b) you now have even more wobbly values to keep up-to-date.

I think perhaps the best thing would be to add some sort of (per-test?) fudge factor for non-validate builds. That way validate will still find performance regressions, like it does today, but other builds are less likely to give false positives. (Igloo)
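Below is a rough sketch of how such a fudge factor could behave, assuming the usual "expected value plus allowed deviation" style of check; stat_in_bounds, its parameter names, and the numbers are hypothetical, not the testsuite's real API.

    # Hypothetical sketch of the fudge-factor idea (names and numbers are
    # illustrative, not the real testsuite implementation): keep the tight
    # validate bounds, but widen them by a per-test fudge factor when the
    # testsuite is run against a non-validate build.

    def stat_in_bounds(actual, expected, deviation_pct, is_validate, fudge=1.0):
        """Return True if 'actual' is within the allowed range.

        deviation_pct is the usual allowed deviation (e.g. 5 for +/-5%).
        For non-validate builds the deviation is multiplied by 'fudge'
        (e.g. 2.0 doubles the allowed range), so only validate builds keep
        the strict bounds used to catch regressions.
        """
        if not is_validate:
            deviation_pct *= fudge
        lower = expected * (1 - deviation_pct / 100.0)
        upper = expected * (1 + deviation_pct / 100.0)
        return lower <= actual <= upper

    # Example: a 5% bound becomes 10% on a nightly (non-validate) build.
    print(stat_in_bounds(actual=93, expected=100, deviation_pct=5,
                         is_validate=False, fudge=2.0))   # True
    print(stat_in_bounds(actual=93, expected=100, deviation_pct=5,
                         is_validate=True))               # False (stat too good)

Making the fudge factor per-test would let unusually noisy tests get extra slack without loosening the bounds everywhere.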

Trac metadata

    Trac field               Value
    Version                  7.7
    Type                     Task
    TypeOfFailure            OtherFailure
    Priority                 normal
    Resolution               Unresolved
    Component                Build System
    Test case                (none)
    Differential revisions   (none)
    BlockedBy                (none)
    Related                  (none)
    Blocking                 (none)
    CC                       (none)
    Operating system         (none)
    Architecture             (none)
Reference: ghc/ghc#8096