Last edited by Tobias Dammers Mar 29, 2019

q prof

Using qprof

qprof is a tool for quick profiling of executables. It can give you an idea of where the program is spending most of its time without recompiling the program. It works by hooking a few functions using LD_PRELOAD, and installing a SIGVTALRM signal handler to sample the program counter.

Notes for using it with GHC

Installing qprof on most Linux distributions should be easy; e.g. on Debian-based systems use sudo apt-get install qprof.

Profiling a program is straightforward:

qprof ./prog <flags>

The profile is broken down by basic block, so you'll need some combination of -ddump-simpl, -ddump-stg and -ddump-cmm to map the results back to the Haskell code.
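As a rough sketch of how the dumps might be collected, the script below builds a program with all three dump flags and adds -ddump-to-file so the output goes to files rather than the terminal. Prog.hs, the SRC variable and the -O2 optimisation level are placeholders chosen for illustration, not something the build requires; if your GHC version rejects one of the dump flags, check the User's Guide for that version.

```shell
#!/bin/sh
# Sketch only: SRC and Prog.hs are placeholder names, -O2 is an assumption.
SRC=${SRC:-Prog.hs}

# The three dump flags mentioned above, plus -ddump-to-file so each dump
# is written to its own file instead of being printed to stdout.
DUMP_FLAGS="-ddump-simpl -ddump-stg -ddump-cmm -ddump-to-file"

if command -v ghc >/dev/null 2>&1 && [ -f "$SRC" ]; then
    # DUMP_FLAGS is deliberately unquoted so it splits into separate flags.
    ghc -O2 $DUMP_FLAGS "$SRC"
else
    echo "would run: ghc -O2 $DUMP_FLAGS $SRC"
fi
```

With the dumps on disk, a symbol or block label from the qprof output can be searched for in the Cmm dump first, then traced back through STG and Core to the source.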

I got some strange results when using the non-threaded RTS, such as the handle_tick function appearing high up in the profile, presumably because the RTS also uses SIGVTALRM. Using the threaded RTS or +RTS -V0 should help here.
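A minimal sketch of the -V0 variant follows. It assumes the executable accepts RTS options on its command line (for example because it was linked with -rtsopts); PROG and ./prog are placeholder names.

```shell
#!/bin/sh
# Sketch only: PROG is a placeholder for your own executable.
PROG=${PROG:-./prog}

# +RTS ... -RTS brackets options meant for the GHC runtime rather than
# the program itself; -V0 switches off the RTS's own timer tick.
RTS_FLAGS="+RTS -V0 -RTS"

if command -v qprof >/dev/null 2>&1 && [ -x "$PROG" ]; then
    # RTS_FLAGS is deliberately unquoted so it splits into separate arguments.
    qprof "$PROG" $RTS_FLAGS
else
    echo "would run: qprof $PROG $RTS_FLAGS"
fi
```

The other route is to relink the program with -threaded and run it under qprof without any extra RTS flags.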

I found the resolution of the samples to be far too low. You can increase the sampling frequency with qprof's -i option, but I wasn't able to get the interval below about 1ms (the default is 10ms). Nevertheless, this is a good way to identify the inner loop quickly. It's not a good way to evaluate small optimisations (for that, PAPI or oprofile would be better).
