Cmm cannot parse negative float32 / float64 literals

Summary

While working on test-primops I wanted to generate calls from Cmm to C functions to check the calling convention. I noticed that negative float literals cannot be parsed correctly in Cmm.

I haven't found any usage of Cmm float literals in the GHC source code. This might be the reason why this bug hasn't been spotted before.

Steps to reproduce

Take this simple Cmm function and copy it into a test.cmm file:

 test() {
   float64 f1; f1 = (-1.0 :: float64);
   float64 ret;
   (ret) = foreign "C" test_c(f1);
   return (ret);
 }
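
For context: test_c is not shown in this report, and it is not needed to trigger the lint error, which fires before linking. A minimal sketch of such a callee, assuming it simply takes a double, prints it, and returns it unchanged (the body is an assumption for illustration, not the actual test-primops code):

```c
#include <stdio.h>

/* Hypothetical callee for the Cmm snippet above.
 * Assumption: a float64 argument in Cmm maps to a C double, so the
 * function receives one double and returns one double. */
double test_c(double x) {
    /* Print the received value so a calling-convention mismatch
     * would be visible in the output. */
    printf("test_c received %f\n", x);
    return x;
}
```

Returning the argument unchanged keeps the check simple: whatever Cmm passes in should come back as ret.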

Compile it with ghc test.cmm -dcmm-lint -ddump-cmm -no-hs-main -ddump-asm. Cmm lint rejects the assignment, because the type of the right-hand side is reported as I64 instead of F64:

Cmm lint error:
  in basic block c5
    in assignment:
      _c1::F64 = -1.0 :: W64;
      Reg ty: F64
      Rhs ty: I64
Program was:
  {offset
    c5: // global
        //tick src<test.cmm:(1,8)-(6,1)>
        //tick src<test.cmm:2:18-37>
        _c1::F64 = -1.0 :: W64;
        _c3::I64 = test_c;
        _c4::F64 = _c1::F64;
        (_c2::F64) = call "ccall" arg hints:  []  result hints:  [] (_c3::I64)(_c4::F64);
        D1 = _c2::F64;
        call (P64[(old + 8)])(D1) args: 8, res: 0, upd: 8;
  }

<no location info>: error:
Compilation had errors

Expected behavior

The expected behavior can be observed by changing the negative float literal to a positive one; e.g. by replacing -1.0 with 1.0:

 test() {
   float64 f1; f1 = (1.0 :: float64);
   float64 ret;
   (ret) = foreign "C" test_c(f1);
   return (ret);
 }

Compiling it with the same GHC command results in:

==================== Output Cmm ====================
[test() { //  []
         { info_tbls: []
           stack_info: arg_space: 8
         }
     {offset
       c5: // global
           //tick src<test.cmm:(1,8)-(6,1)>
           //tick src<test.cmm:2:18-36>
           _c1::F64 = 1.0 :: W64;
           _c3::I64 = test_c;
           _c4::F64 = _c1::F64;
           (_c2::F64) = call "ccall" arg hints:  []  result hints:  [] (_c3::I64)(_c4::F64);
           D1 = _c2::F64;
           call (P64[Sp])(D1) args: 8, res: 0, upd: 8;
     }
 }]



==================== Asm code ====================
.section .text,"ax",@progbits
.align 8
.globl test
.type test, @function
test:
.Lc5:
	movsd .Ln7(%rip),%xmm0
	leaq test_c(%rip),%rax
	subq $8,%rsp
	movq %rax,%rbx
	movl $1,%eax
	call *%rbx
	addq $8,%rsp
	movsd %xmm0,%xmm1
	jmp *(%rbp)
	.size test, .-test
.section .rodata
.align 8
.align 8
.Ln7:
	.double	1.0


/nix/store/qn3ggz5sf3hkjs2c797xf7nan3amdxmp-glibc-2.38-27/lib/crt1.o:function _start: error: undefined reference to 'main'
collect2: error: ld returned 1 exit status
`cc' failed in phase `Linker'. (Exit code: 1)

So, the float literal ends up in the read-only data section (.rodata) of the assembly, as expected. The linker error can be ignored: the test deliberately does not define a main() function.

I suspect that the @floating_point rule in the Cmm lexer needs to be changed to handle the - sign: https://gitlab.haskell.org/ghc/ghc/-/blob/471b267294bc5f17e4864ce9bb2f221c4d47eac8/compiler/GHC/Cmm/Lexer.x#L65 (This is only a guess, though. I don't know much about the Cmm lexer / parser.)

Environment

  • GHC version used: 9.6.3 (and 9.7.20230505)
Edited by Sven Tennie