Share top-level code for strings

A string constant in GHC turns into

foo :: String
foo = unpackCString# "the-string"#

This is a top-level thunk (a CAF), and it expands into rather a lot of code, like this:

.text
	.align 4,0x90
	.long	0
	.long	22
.globl _Foo_zdfTypeableTzuds1_info
_Foo_zdfTypeableTzuds1_info:
.LcvI:
	movl %esi,%eax
	leal -12(%ebp),%ecx
	cmpl 84(%ebx),%ecx
	jb .LcvQ
	addl $8,%edi
	cmpl 92(%ebx),%edi
	ja .LcvS
	movl $_stg_CAF_BLACKHOLE_info,-4(%edi)
	movl 100(%ebx),%ecx
	movl %ecx,0(%edi)
	leal -4(%edi),%ecx
	pushl %ecx
	pushl %eax
	pushl %ebx
	movl %eax,76(%esp)
	call _newCAF
	addl $12,%esp
	testl %eax,%eax
	je .LcvL
	movl $_stg_bh_upd_frame_info,-8(%ebp)
	leal -4(%edi),%eax
	movl %eax,-4(%ebp)
	movl $_cvJ_str,-12(%ebp)
	addl $-12,%ebp
	jmp _ghczmprim_GHCziCString_unpackCStringzh_info
.LcvL:
	movl 64(%esp),%eax
	jmp *(%eax)
.LcvS:
	movl $8,116(%ebx)
.LcvQ:
	movl %eax,%esi
	jmp *-12(%ebx)

That's rather a lot of goop for one thunk! Of course we can share this, by making a 2-word thunk like this:

------------------------------
| TopUnpack_info  |   -------|-----> "the-string"#
------------------------------

where TopUnpack_info is a shared RTS info-table and code that embodies the code fragment above.
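The sharing idea can be illustrated with a toy model in ordinary Haskell. This is only a sketch, not RTS code: the names `TopUnpack`, `topUnpackEntry`, `mkString` and `enter` are hypothetical, and the model ignores the CAF registration, blackholing and update that the real entry code performs. It shows just the layout: each string constant is a two-field object holding a reference to one shared piece of code plus the address of its literal.

```haskell
{-# LANGUAGE MagicHash #-}
module Main where

import GHC.CString (unpackCString#)  -- from ghc-prim
import GHC.Exts (Addr#)

-- Hypothetical model of the proposed 2-word thunk:
-- field 1 stands in for the shared info pointer,
-- field 2 is the address of the C string literal.
data TopUnpack = TopUnpack (Addr# -> String) Addr#

-- The single shared entry code. Every string constant reuses this
-- instead of carrying its own copy of the unpacking code.
topUnpackEntry :: Addr# -> String
topUnpackEntry addr = unpackCString# addr

-- Build the closure for one string constant: only the payload differs.
mkString :: Addr# -> TopUnpack
mkString = TopUnpack topUnpackEntry

-- "Entering" the closure runs the shared code on the payload.
-- (Assumption: no update is modelled, so the result is not cached.)
enter :: TopUnpack -> String
enter (TopUnpack code addr) = code addr

main :: IO ()
main = do
  let foo = mkString "the-string"#
      bar = mkString "another-string"#
  putStrLn (enter foo)
  putStrLn (enter bar)
```

Here `foo` and `bar` differ only in their second field, which is the point of the proposal: the per-string cost is two words rather than a full copy of the entry code.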

This would avoid useless code bloat for every string constant. (This came up when looking at the code generated by deriving(Typeable).)

Edited by Ben Gamari