Skip to content

GitLab

  • Projects
  • Groups
  • Snippets
  • Help
    • Loading...
  • Help
    • Help
    • Support
    • Community forum
    • Submit feedback
  • Sign in / Register
GHC
GHC
  • Project overview
    • Project overview
    • Details
    • Activity
    • Releases
  • Repository
    • Repository
    • Files
    • Commits
    • Branches
    • Tags
    • Contributors
    • Graph
    • Compare
    • Locked Files
  • Issues 4,253
    • Issues 4,253
    • List
    • Boards
    • Labels
    • Service Desk
    • Milestones
    • Iterations
  • Merge Requests 394
    • Merge Requests 394
  • Requirements
    • Requirements
    • List
  • CI / CD
    • CI / CD
    • Pipelines
    • Jobs
    • Schedules
  • Security & Compliance
    • Security & Compliance
    • Dependency List
    • License Compliance
  • Operations
    • Operations
    • Incidents
    • Environments
  • Analytics
    • Analytics
    • CI / CD
    • Code Review
    • Insights
    • Issue
    • Repository
    • Value Stream
  • Wiki
    • Wiki
  • Snippets
    • Snippets
  • Members
    • Members
  • Collapse sidebar
  • Activity
  • Graph
  • Create a new issue
  • Jobs
  • Commits
  • Issue Boards
  • Glasgow Haskell Compiler
  • GHCGHC
  • Issues
  • #2253

Closed
Open
Opened Apr 30, 2008 by dons@trac-dons

Native code generator could do better

An example set of programs that came up in the ndp library, where the C backend outperforms the current native code generator. Logging them here so we don't forget to check again with the new backend.

Program 1

import Data.Array.Vector
import Data.Bits
main = print . sumU $ zipWith3U (\x y z -> x * y * z)
                        (enumFromToU 1 (100000000 :: Int))
                        (enumFromToU 2 (100000001 :: Int))
                        (enumFromToU 7 (100000008 :: Int))

Core:

Main.$s$wfold =
  \ (sc_sPH :: Int#)
    (sc1_sPI :: Int#)
    (sc2_sPJ :: Int#)
    (sc3_sPK :: Int#) ->
    case ># sc2_sPJ 100000000 of wild_aJo {
      False ->
        case ># sc1_sPI 100000001 of wild1_XK6 {
          False ->
            case ># sc_sPH 100000008 of wild2_XKd {
              False ->
                Main.$s$wfold
                  (+# sc_sPH 1)
                  (+# sc1_sPI 1)
                  (+# sc2_sPJ 1)
                  (+# sc3_sPK (*# (*# sc2_sPJ sc1_sPI) sc_sPH));
              True -> sc3_sPK
            };
          True -> sc3_sPK
        };
      True -> sc3_sPK
    }

}

Which is great.

C backend:

Main_zdszdwfold_info:
  .text
  .p2align 4,,15
.text
  .align 8
  .type     Main_zdszdwfold_info, @function
  cmpq        $100000000, %r8
  jg  .L9
  cmpq        $100000001, %rdi
  jg  .L9
  cmpq        $100000008, %rsi
  jg  .L9
  movq        %r8, %rdx
  incq        %r8
  imulq       %rdi, %rdx
  incq        %rdi
  imulq       %rsi, %rdx
  incq        %rsi
  addq        %rdx, %r9
  jmp Main_zdszdwfold_info
.L5:
.L7:
  .p2align 6,,7
.L9:
  movq        %r9, %rbx
  jmp *(%rbp)

Native code generator:

Main_zdszdwfold_info:
  cmpq $100000000,%r8
  jg .LcRP
  cmpq $100000001,%rdi
  jg .LcRR
  cmpq $100000008,%rsi
  jg .LcRU
  movq %rdi,%rax
  imulq %rsi,%rax
  movq %r8,%rcx
  imulq %rax,%rcx
  movq %r9,%rax
  addq %rcx,%rax
  leaq 1(%r8),%rcx
  leaq 1(%rdi),%rdx
  incq %rsi
  movq %rdx,%rdi
  movq %rcx,%r8
  movq %rax,%r9
  jmp Main_zdszdwfold_info
.LcRP:
  movq %r9,%rbx
  jmp *(%rbp)
.LcRR:
  movq %r9,%rbx
  jmp *(%rbp)
.LcRU:
  movq %r9,%rbx
  jmp *(%rbp)

Runtime performance:

C backend: 0.269 Asm backend: 0.410s

Program 2

Source:

import Data.Array.Vector
import Data.Bits
main = print . sumU . mapU (`shiftL` 2) $
            appendU (replicateU 1000000000 (1::Int))
                    (replicateU 1000000000 (7::Int))

Core:

$s$wfold_rPr =
  \ (sc_sOw :: Int#) (sc1_sOx :: Int#) ->
    case sc_sOw of wild_X1j {
      __DEFAULT -> $s$wfold_rPr (+# wild_X1j 1) (+# sc1_sOx 28);
      1000000000 -> sc1_sOx
    }

Runtime:

Native backend: 2.637

C backend: 2.365

Trac metadata
Trac field Value
Version 6.8.2
Type Bug
TypeOfFailure OtherFailure
Priority normal
Resolution Unresolved
Component Compiler (NCG)
Test case
Differential revisions
BlockedBy
Related
Blocking
CC dons@galois.com
Operating system Unknown
Architecture x86_64 (amd64)
Assignee
Assign to
7.6.1
Milestone
7.6.1
Assign milestone
Time tracking
None
Due date
None
Reference: ghc/ghc#2253