nativeGen: Produce local symbols for module-internal functions (approach C)

Last G requested to merge last-g/ghc:cover_toplevel_symbols into master

This is yet another approach to resolving #17605 (closed); it supersedes !2380 (closed).

This PR aims to cover all the code in an output binary with meaningful symbols in a correct manner. The core idea is to improve the heuristic introduced in 64c54fff by adding an extra case for top-level code labels that are not exported.

Differences from !2380 (closed):

  • + less hacky
  • + valid size and type information for symbols (this is required for perf and other tools to work correctly)
  • + uses _info, which points to the code, instead of _entry, which points to the info table's data section (helps with disassembly)
  • - no module information

I tested this change locally with perf on nofib/nbody, comparing it against master and !2380 (closed).

This change:

    98.99%  bin      bin            [.] zdwadvance_r4W0_info
     0.28%  bin      bin            [.] stg_BLACKHOLE_info+0xffffffffffc00077
     0.18%  bin      bin            [.] Main_main1_info
     0.09%  bin      bin            [.] ghczmbignum_GHCziNumziInteger_integerQuotRemzh_info
     0.09%  bin      bin            [.] stg_IND_STATIC_info+0xffffffffffc00000
     0.09%  bin      bin            [.] stg_upd_frame_info+0xffffffffffc00023
     0.09%  bin      ld-2.17.so     [.] do_lookup_x
     0.09%  bin      ld-2.17.so     [.] match_symbol
     0.09%  bin      libc-2.17.so   [.] _dl_addr

!2380 (closed):

    98.60%  bin      bin            [.] _Main_zdwadvance_r4Xf_entry
     0.47%  bin      bin            [.] stg_BLACKHOLE_info+0xffffffffffc00077
     0.28%  bin      bin            [.] Main_main1_info
     0.19%  bin      bin            [.] stg_IND_STATIC_info+0xffffffffffc00000
     0.09%  bin      bin            [.] base_TextziParserCombinatorsziReadP_zdfAlternativePzuzdczlzbzg_info
     0.09%  bin      bin            [.] evacuate
     0.09%  bin      ld-2.17.so     [.] do_lookup_x
     0.09%  bin      ld-2.17.so     [.] strcmp
     0.09%  bin      libc-2.17.so   [.] init_cacheinfo

master branch:

    15.55%  bin      bin               [.] 0x0000000000009be9
     9.83%  bin      bin               [.] 0x0000000000009bfc
     8.67%  bin      bin               [.] 0x0000000000009a88
     8.22%  bin      bin               [.] 0x0000000000009a9b
     3.04%  bin      bin               [.] 0x0000000000009bee
     2.95%  bin      bin               [.] 0x0000000000009c2f
     2.77%  bin      bin               [.] 0x0000000000009c3a
     2.59%  bin      bin               [.] 0x0000000000009c0f
     2.23%  bin      bin               [.] 0x0000000000009ace
     1.97%  bin      bin               [.] 0x0000000000009aae
...

I also wrote a simple Python asm parser that checks whether all the code in the asm output is covered by some non-hidden, correctly sized symbol. I would appreciate suggestions on how to integrate it into GHC's testsuite: https://gist.github.com/last-g/8daa931e31ab2a38ba30f178e894b28b
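To illustrate the kind of check meant here, below is a minimal sketch. It is not the script from the gist, and it simplifies the problem: it assumes GNU assembler syntax, treats any non-`.L` label defined in a `.text` section as a code symbol, and only checks that each such label has `.type` and `.size` directives and is not marked `.hidden`. It does not verify that the declared sizes actually cover every instruction. The function name `check_code_symbols` is made up for this example.

```python
import re

SYMBOL = r"[A-Za-z_.$][\w.$]*"
LABEL_RE = re.compile(rf"^({SYMBOL}):")
SECTION_RE = re.compile(r"^\s*\.(text|data|bss|section)\b\s*(\S*)")
ATTR_RE = re.compile(rf"^\s*\.(type|size|hidden)\s+({SYMBOL})")


def check_code_symbols(asm_text):
    """Map each problematic code label to the list of problems found for it."""
    in_text = False
    code_labels = []
    attrs = {"type": set(), "size": set(), "hidden": set()}

    for line in asm_text.splitlines():
        section = SECTION_RE.match(line)
        if section:
            kind, arg = section.groups()
            # Only .text (or .section .text...) is treated as code here.
            in_text = kind == "text" or (kind == "section" and arg.startswith(".text"))
            continue
        attr = ATTR_RE.match(line)
        if attr:
            attrs[attr.group(1)].add(attr.group(2))
            continue
        label = LABEL_RE.match(line)
        # Assumption: .L labels are assembler-local and never reach the symbol table.
        if label and in_text and not label.group(1).startswith(".L"):
            code_labels.append(label.group(1))

    problems = {}
    for name in code_labels:
        found = []
        if name not in attrs["type"]:
            found.append("no .type")
        if name not in attrs["size"]:
            found.append("no .size")
        if name in attrs["hidden"]:
            found.append("hidden")
        if found:
            problems[name] = found
    return problems


if __name__ == "__main__":
    # Hand-written sample, not real GHC output.
    sample = "\n".join([
        "\t.text",
        "\t.type Main_main1_info, @function",
        "Main_main1_info:",
        ".Lc1:",
        "\tret",
        "\t.size Main_main1_info, .-Main_main1_info",
        "zdwadvance_r4W0_info:",
        "\tret",
        "\t.data",
        "Main_main1_closure:",
        "\t.quad 0",
    ])
    print(check_code_symbols(sample))
```

In the hand-written sample, only `zdwadvance_r4W0_info` is reported, because it has neither a `.type` nor a `.size` directive; the data label and the `.L` label are ignored.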

Additional changes:

  • added size declarations for data-related symbols; this might help some debugging tools, but I do not expect any real impact
  • extended debug output for labels when compiling with -dppr-debug
