Commit 20fc2f0c authored by hwloidl

[project @ 2001-03-22 03:51:08 by hwloidl]

-*- outline -*-
Time-stamp: <Thu Mar 22 2001 03:50:16 Stardate: [-30]6365.79 hwloidl>

This commit covers changes in GHC to get GUM (way=mp) and GUM/GdH (way=md)
working. It is a merge of my working version of GUM, based on GHC 4.06,
with GHC 4.11. Almost all changes are in the RTS (see below).

GUM is reasonably stable; we used the 4.06 version in large-ish programs for
recent papers. There are a couple of things I want to change, but nothing urgent.
GUM/GdH has just been merged and needs more testing; I hope to do that in the
next few weeks. It works in our working build but needs tweaking to run.
GranSim doesn't work yet (*sigh*). Most of the code should be in, but needs
more debugging.

ToDo: I still want to make the following minor modifications before the release:
- Better wrapper script for parallel execution [ghc/compiler/main]
- Update parallel docs: started on it, but it's minimal [ghc/docs/users_guide]
- Clean up [nofib/parallel]: it's a real mess right now (*sigh*)
- Update visualisation tools (minor things only IIRC) [ghc/utils/parallel]
- Add a Klingon-English glossary

* RTS:

Almost all changes are restricted to ghc/rts/parallel and should not
interfere with the rest. I only comment on changes outside the parallel
dir:

- Several changes in Schedule.c (scheduling loop; createThreads etc);
  should only affect parallel code
- Added ghc/rts/hooks/ShutdownEachPEHook.c
- ghc/rts/Linker.[ch]: GUM doesn't know about Stable Names (ifdefs)!!
- StgMiscClosures.h: END_TSO_QUEUE etc now defined here (from StgMiscClosures.hc)
                     END_ECAF_LIST was missing a leading stg_
- SchedAPI.h: taskStart now defined in here; it's only a wrapper around
              scheduleThread now, but might use some init, shutdown later
- RtsAPI.h: I have nuked the def of rts_evalNothing

* Compiler:

- ghc/compiler/main/DriverState.hs
  added PVM-ish flags to the parallel way
  added new ways for parallel ticky profiling and distributed exec

- ghc/compiler/main/DriverPipeline.hs
  added a function run_phase_MoveBinary, which is called with way=mp after linking;
  it moves the binary into a PVM dir and produces a wrapper script for
  parallel execution
  it may be cleaner to add a MoveBinary phase in DriverPhases.hs, but this way
  it's less intrusive, and MoveBinary probably only makes sense for mp anyway
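
  As a rough illustration of the MoveBinary step described above (a sketch,
  not the code from this commit): the binary is copied to
  $PVM_ROOT/bin/$PVM_ARCH/ under its name prefixed with '=', and the generated
  wrapper picks -d, -qN<n> and -qp<n> out of the RTS arguments before calling
  SysMan. The names pvmExecutablePath, WrapperOpts and splitWrapperArgs are
  made up for this example; the real wrapper is a generated Perl script.

```haskell
import Data.Char (isDigit)
import Data.List (stripPrefix)

-- | Where the linked binary ends up: $PVM_ROOT/bin/$PVM_ARCH/=<name>.
-- The leading '=' mirrors pvm_executable_base in run_phase_MoveBinary.
pvmExecutablePath :: FilePath -> String -> FilePath -> FilePath
pvmExecutablePath pvmRoot pvmArch binary =
  pvmRoot ++ "/bin/" ++ pvmArch ++ "/" ++ ('=' : binary)

-- | What the wrapper extracts before handing over to SysMan (hypothetical type).
data WrapperOpts = WrapperOpts
  { debugFlag   :: String    -- "" or "-"
  , nProcessors :: Int       -- 0 = as many PEs as machines in the PVM config
  , otherArgs   :: [String]  -- everything else, in original order
  } deriving (Eq, Show)

-- | Haskell rendering of the Perl "args:" loop. Note that +RTS and -RTS
-- themselves are passed through. Unlike the Perl loop, this does not stop
-- early at an empty or "0" argument.
splitWrapperArgs :: [String] -> WrapperOpts
splitWrapperArgs = go False (WrapperOpts "" 0 [])
  where
    go _ acc [] = acc { otherArgs = reverse (otherArgs acc) }
    go inRts acc (a:as)
      | inRts', a == "-d"           = go inRts' acc { debugFlag = "-" } as
      | inRts', Just n <- peCount a = go inRts' acc { nProcessors = n } as
      | otherwise                   = go inRts' acc { otherArgs = a : otherArgs acc } as
      where
        -- the RTS state is updated before the argument itself is classified
        inRts' | a == "+RTS" = True
               | a == "-RTS" = False
               | otherwise   = inRts

-- | Matches -qN<digits> or -qp<digits>, like /^-qN(\d+)/ and /^-qp(\d+)/.
peCount :: String -> Maybe Int
peCount a =
  case [rest | p <- ["-qN", "-qp"], Just rest <- [stripPrefix p a]] of
    (rest:_) | ds@(_:_) <- takeWhile isDigit rest -> Just (read ds)
    _                                             -> Nothing

main :: IO ()
main = do
  putStrLn (pvmExecutablePath "/opt/pvm3" "LINUX" "a.out")
  print (splitWrapperArgs ["+RTS", "-qp8", "-d", "-RTS", "input.txt"])
```

  The paths and arguments in main are example values only. Keeping the
  argument split as a pure function makes it easy to check against the Perl
  wrapper by hand.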

* Nofib:

- nofib/spectral/Makefile, nofib/real/Makefile, ghc/tests/programs/Makefile:
  modified to skip some tests if HWL_NOFIB_HACK is set; this is only temporary, to
  record which test programs cause problems in my working build right now
parent 982fe3c7
% %
% (c) The GRASP/AQUA Project, Glasgow University, 1992-1998 % (c) The GRASP/AQUA Project, Glasgow University, 1992-1998
% %
% $Id: CgClosure.lhs,v 1.45 2001/03/06 10:13:35 simonmar Exp $ % $Id: CgClosure.lhs,v 1.46 2001/03/22 03:51:08 hwloidl Exp $
% %
\section[CgClosure]{Code generation for closures} \section[CgClosure]{Code generation for closures}
...@@ -320,12 +320,7 @@ closureCodeBody binder_info closure_info cc all_args body ...@@ -320,12 +320,7 @@ closureCodeBody binder_info closure_info cc all_args body
-- --
arg_regs = case entry_conv of arg_regs = case entry_conv of
DirectEntry lbl arity regs -> regs DirectEntry lbl arity regs -> regs
other -> trace ("*** closureCodeBody:arg_regs " ++ (pprHWL entry_conv) ++ "(HWL ignored; no args passed in regs)") [] other -> [] -- "(HWL ignored; no args passed in regs)"
pprHWL :: EntryConvention -> String
pprHWL (ViaNode) = "ViaNode"
pprHWL (StdEntry cl) = "StdEntry"
pprHWL (DirectEntry cl i l) = "DirectEntry"
num_arg_regs = length arg_regs num_arg_regs = length arg_regs
......
----------------------------------------------------------------------------- -----------------------------------------------------------------------------
-- $Id: DriverPipeline.hs,v 1.55 2001/03/15 11:26:27 simonmar Exp $ -- $Id: DriverPipeline.hs,v 1.56 2001/03/22 03:51:08 hwloidl Exp $
-- --
-- GHC Driver -- GHC Driver
-- --
...@@ -195,6 +195,7 @@ genPipeline todo stop_flag persistent_output lang filename ...@@ -195,6 +195,7 @@ genPipeline todo stop_flag persistent_output lang filename
| otherwise = [ ] -- just pass this file through to the linker | otherwise = [ ] -- just pass this file through to the linker
-- ToDo: this is somewhat cryptic -- ToDo: this is somewhat cryptic
not_valid = throwDyn (OtherError ("invalid option combination")) not_valid = throwDyn (OtherError ("invalid option combination"))
----------- ----- ---- --- -- -- - - - ----------- ----- ---- --- -- -- - - -
...@@ -240,7 +241,8 @@ genPipeline todo stop_flag persistent_output lang filename ...@@ -240,7 +241,8 @@ genPipeline todo stop_flag persistent_output lang filename
StopBefore phase -> phase StopBefore phase -> phase
DoMkDependHS -> Ln DoMkDependHS -> Ln
DoLink -> Ln DoLink -> Ln
annotated_pipeline = annotatePipeline (pipeline ++ [ Ln ]) stop_phase
annotated_pipeline = annotatePipeline (pipeline ++ [Ln]) stop_phase
phase_ne p (p1,_,_) = (p1 /= p) phase_ne p (p1,_,_) = (p1 /= p)
----------- ----- ---- --- -- -- - - - ----------- ----- ---- --- -- -- - - -
...@@ -677,6 +679,91 @@ run_phase SplitAs basename _suff _input_fn _output_fn ...@@ -677,6 +679,91 @@ run_phase SplitAs basename _suff _input_fn _output_fn
mapM_ assemble_file [1..n] mapM_ assemble_file [1..n]
return True return True
-----------------------------------------------------------------------------
-- MoveBinary sort-of-phase
-- After having produced a binary, move it somewhere else and generate a
-- wrapper script calling the binary. Currently, we need this only in
-- a parallel way (i.e. in GUM), because PVM expects the binary in a
-- central directory.
-- This is called from doLink below, after linking. I haven't made it
-- a separate phase to minimise interfering with other modules, and
-- we don't need the generality of a phase (MoveBinary is always
-- done after linking and makes only sense in a parallel setup) -- HWL
run_phase_MoveBinary input_fn
= do
top_dir <- readIORef v_TopDir
pvm_root <- getEnv "PVM_ROOT"
pvm_arch <- getEnv "PVM_ARCH"
let
pvm_executable_base = "=" ++ input_fn
pvm_executable = pvm_root ++ "/bin/" ++ pvm_arch ++ "/" ++ pvm_executable_base
sysMan = top_dir ++ "/ghc/rts/parallel/SysMan";
-- nuke old binary; maybe use configur'ed names for cp and rm?
system ("rm -f " ++ pvm_executable)
-- move the newly created binary into PVM land
system ("cp -p " ++ input_fn ++ " " ++ pvm_executable)
-- generate a wrapper script for running a parallel prg under PVM
writeFile input_fn (mk_pvm_wrapper_script pvm_executable pvm_executable_base sysMan)
return True
-- generates a Perl script starting a parallel prg under PVM
mk_pvm_wrapper_script :: String -> String -> String -> String
mk_pvm_wrapper_script pvm_executable pvm_executable_base sysMan = unlines $
[
"eval 'exec perl -S $0 ${1+\"$@\"}'",
" if $running_under_some_shell;",
"# =!=!=!=!=!=!=!=!=!=!=!",
"# This script is automatically generated: DO NOT EDIT!!!",
"# Generated by Glasgow Haskell Compiler",
"# ngoqvam choHbogh vaj' vIHoHnISbej !!!!",
"#",
"$pvm_executable = '" ++ pvm_executable ++ "';",
"$pvm_executable_base = '" ++ pvm_executable_base ++ "';",
"$SysMan = '" ++ sysMan ++ "';",
"",
{- ToDo: add the magical shortcuts again iff we actually use them -- HWL
"# first, some magical shortcuts to run "commands" on the binary",
"# (which is hidden)",
"if ($#ARGV == 1 && $ARGV[0] eq '+RTS' && $ARGV[1] =~ /^--((size|file|strip|rm|nm).*)/ ) {",
" local($cmd) = $1;",
" system("$cmd $pvm_executable");",
" exit(0); # all done",
"}", -}
"",
"# Now, run the real binary; process the args first",
"$ENV{'PE'} = $pvm_executable_base;", -- ++ pvm_executable_base,
"$debug = '';",
"$nprocessors = 0; # the default: as many PEs as machines in PVM config",
"@nonPVM_args = ();",
"$in_RTS_args = 0;",
"",
"args: while ($a = shift(@ARGV)) {",
" if ( $a eq '+RTS' ) {",
" $in_RTS_args = 1;",
" } elsif ( $a eq '-RTS' ) {",
" $in_RTS_args = 0;",
" }",
" if ( $a eq '-d' && $in_RTS_args ) {",
" $debug = '-';",
" } elsif ( $a =~ /^-qN(\\d+)/ && $in_RTS_args ) {",
" $nprocessors = $1;",
" } elsif ( $a =~ /^-qp(\\d+)/ && $in_RTS_args ) {",
" $nprocessors = $1;",
" } else {",
" push(@nonPVM_args, $a);",
" }",
"}",
"",
"local($return_val) = 0;",
"# Start the parallel execution by calling SysMan",
"system(\"$SysMan $debug $pvm_executable $nprocessors @nonPVM_args\");",
"$return_val = $?;",
"# ToDo: fix race condition moving files and flushing them!!",
"system(\"cp $ENV{'HOME'}/$pvm_executable_base.???.gr .\") if -f \"$ENV{'HOME'}/$pvm_executable_base.002.gr\";",
"exit($return_val);"
]
----------------------------------------------------------------------------- -----------------------------------------------------------------------------
-- Linking -- Linking
...@@ -743,6 +830,12 @@ doLink o_files = do ...@@ -743,6 +830,12 @@ doLink o_files = do
#endif #endif
) )
) )
-- parallel only: move binary to another dir -- HWL
ways_ <- readIORef v_Ways
when (WayPar `elem` ways_) (do
success <- run_phase_MoveBinary output_fn
if success then return ()
else throwDyn (OtherError ("cannot move binary to PVM dir")))
----------------------------------------------------------------------------- -----------------------------------------------------------------------------
-- Making a DLL -- Making a DLL
......
----------------------------------------------------------------------------- -----------------------------------------------------------------------------
-- $Id: DriverState.hs,v 1.33 2001/03/12 14:06:47 simonpj Exp $ -- $Id: DriverState.hs,v 1.34 2001/03/22 03:51:08 hwloidl Exp $
-- --
-- Settings for the driver -- Settings for the driver
-- --
...@@ -507,14 +507,45 @@ way_details = ...@@ -507,14 +507,45 @@ way_details =
(WayUnreg, Way "u" "Unregisterised" (WayUnreg, Way "u" "Unregisterised"
unregFlags ), unregFlags ),
-- optl's below to tell linker where to find the PVM library -- HWL
(WayPar, Way "mp" "Parallel" (WayPar, Way "mp" "Parallel"
[ "-fparallel" [ "-fparallel"
, "-D__PARALLEL_HASKELL__" , "-D__PARALLEL_HASKELL__"
, "-optc-DPAR" , "-optc-DPAR"
, "-package concurrent" , "-package concurrent"
, "-optc-w"
, "-optl-L${PVM_ROOT}/lib/${PVM_ARCH}"
, "-optl-lpvm3"
, "-optl-lgpvm3"
, "-fvia-C" ]), , "-fvia-C" ]),
(WayGran, Way "mg" "Gransim" -- at the moment we only change the RTS and could share compiler and libs!
(WayPar, Way "mt" "Parallel ticky profiling"
[ "-fparallel"
, "-D__PARALLEL_HASKELL__"
, "-optc-DPAR"
, "-optc-DPAR_TICKY"
, "-package concurrent"
, "-optc-w"
, "-optl-L${PVM_ROOT}/lib/${PVM_ARCH}"
, "-optl-lpvm3"
, "-optl-lgpvm3"
, "-fvia-C" ]),
(WayPar, Way "md" "Distributed"
[ "-fparallel"
, "-D__PARALLEL_HASKELL__"
, "-D__DISTRIBUTED_HASKELL__"
, "-optc-DPAR"
, "-optc-DDIST"
, "-package concurrent"
, "-optc-w"
, "-optl-L${PVM_ROOT}/lib/${PVM_ARCH}"
, "-optl-lpvm3"
, "-optl-lgpvm3"
, "-fvia-C" ]),
(WayGran, Way "mg" "GranSim"
[ "-fgransim" [ "-fgransim"
, "-D__GRANSIM__" , "-D__GRANSIM__"
, "-optc-DGRAN" , "-optc-DGRAN"
......
----------------------------------------------------------------------- -----------------------------------------------------------------------
-- $Id: primops.txt,v 1.18 2001/02/28 00:01:02 qrczak Exp $ -- $Id: primops.txt,v 1.19 2001/03/22 03:51:08 hwloidl Exp $
-- --
-- Primitive Operations -- Primitive Operations
-- --
...@@ -787,8 +787,6 @@ primop IndexOffForeignObjOp_Word32 "indexWord32OffForeignObj#" GenPrimOp ...@@ -787,8 +787,6 @@ primop IndexOffForeignObjOp_Word32 "indexWord32OffForeignObj#" GenPrimOp
primop IndexOffForeignObjOp_Word64 "indexWord64OffForeignObj#" GenPrimOp primop IndexOffForeignObjOp_Word64 "indexWord64OffForeignObj#" GenPrimOp
ForeignObj# -> Int# -> Word64# ForeignObj# -> Int# -> Word64#
primop ReadOffAddrOp_Char "readCharOffAddr#" GenPrimOp primop ReadOffAddrOp_Char "readCharOffAddr#" GenPrimOp
Addr# -> Int# -> State# s -> (# State# s, Char# #) Addr# -> Int# -> State# s -> (# State# s, Char# #)
...@@ -1152,7 +1150,6 @@ primop TouchOp "touch#" GenPrimOp ...@@ -1152,7 +1150,6 @@ primop TouchOp "touch#" GenPrimOp
with with
strictness = { \ arity -> StrictnessInfo [wwLazy, wwPrim] False } strictness = { \ arity -> StrictnessInfo [wwLazy, wwPrim] False }
------------------------------------------------------------------------ ------------------------------------------------------------------------
--- Weak pointers --- --- Weak pointers ---
------------------------------------------------------------------------ ------------------------------------------------------------------------
...@@ -1183,7 +1180,6 @@ primop FinalizeWeakOp "finalizeWeak#" GenPrimOp ...@@ -1183,7 +1180,6 @@ primop FinalizeWeakOp "finalizeWeak#" GenPrimOp
has_side_effects = True has_side_effects = True
out_of_line = True out_of_line = True
------------------------------------------------------------------------ ------------------------------------------------------------------------
--- Stable pointers and names --- --- Stable pointers and names ---
------------------------------------------------------------------------ ------------------------------------------------------------------------
...@@ -1302,6 +1298,7 @@ primop ParAtForNowOp "parAtForNow#" GenPrimOp ...@@ -1302,6 +1298,7 @@ primop ParAtForNowOp "parAtForNow#" GenPrimOp
-- copyable# and noFollow# have no corresponding entry in -- copyable# and noFollow# have no corresponding entry in
-- PrelGHC.hi-boot, so I don't know whether they should still -- PrelGHC.hi-boot, so I don't know whether they should still
-- be here or not. JRS, 15 Jan 01 -- be here or not. JRS, 15 Jan 01
-- not implemented; please, keep the comment as reminder -- HWL 12/3/01
-- --
--primop CopyableOp "copyable#" GenPrimOp --primop CopyableOp "copyable#" GenPrimOp
-- a -> Int# -- a -> Int#
......
...@@ -43,6 +43,7 @@ sequential execution, then fine. ...@@ -43,6 +43,7 @@ sequential execution, then fine.
<Para> <Para>
A Parallel Haskell program implies multiple processes running on A Parallel Haskell program implies multiple processes running on
multiple processors, under a PVM (Parallel Virtual Machine) framework. multiple processors, under a PVM (Parallel Virtual Machine) framework.
An MPI interface is under development but is not yet fully functional.
</Para> </Para>
<Para> <Para>
...@@ -51,8 +52,12 @@ fun&rdquo; than about &ldquo;speed.&rdquo; That will change. ...@@ -51,8 +52,12 @@ fun&rdquo; than about &ldquo;speed.&rdquo; That will change.
</Para> </Para>
<Para> <Para>
Again, check Simon's Web page for publications about Parallel Haskell Check the <ULink URL="http://www.cee.hw.ac.uk/~dsg/gph/">GPH Page</Ulink>
(including &ldquo;GUM&rdquo;, the key bits of the runtime system). for more information on &ldquo;GPH&rdquo; (Haskell98 with extensions for
parallel execution), the latest version of &ldquo;GUM&rdquo; (the runtime
system to enable parallel executions) and papers on research issues. A
list of publications about GPH and about GUM is also available from Simon's
Web Page.
</Para> </Para>
<Para> <Para>
...@@ -151,10 +156,10 @@ you'd like to see this with your very own eyes, just run GHC with the ...@@ -151,10 +156,10 @@ you'd like to see this with your very own eyes, just run GHC with the
</Sect3> </Sect3>
<Sect3 id="sec-scheduling-policy"> <Sect3>
<Title>Scheduling policy for concurrent/parallel threads <Title>Scheduling policy for concurrent threads
<IndexTerm><Primary>Scheduling&mdash;concurrent/parallel</Primary></IndexTerm> <IndexTerm><Primary>Scheduling&mdash;concurrent</Primary></IndexTerm>
<IndexTerm><Primary>Concurrent/parallel scheduling</Primary></IndexTerm></Title> <IndexTerm><Primary>Concurrent scheduling</Primary></IndexTerm></Title>
<Para> <Para>
Runnable threads are scheduled in round-robin fashion. Context Runnable threads are scheduled in round-robin fashion. Context
...@@ -179,6 +184,19 @@ of the currently active threads are completed. ...@@ -179,6 +184,19 @@ of the currently active threads are completed.
</Sect3> </Sect3>
<Sect3>
<Title>Scheduling policy for parallel threads
<IndexTerm><Primary>Scheduling&mdash;parallel</Primary></IndexTerm>
<IndexTerm><Primary>Parallel scheduling</Primary></IndexTerm></Title>
<Para>
In GUM we use an unfair scheduler, which means that a thread continues to
perform graph reduction until it blocks on a closure under evaluation, on a
remote closure or until the thread finishes.
</Para>
</Sect3>
</Sect2> </Sect2>
</Sect1> </Sect1>
......
...@@ -1392,21 +1392,21 @@ LinkEnd="sec-Concurrent">. ...@@ -1392,21 +1392,21 @@ LinkEnd="sec-Concurrent">.
<para> <para>
&lsqb;You won't be able to execute parallel Haskell programs unless PVM3 &lsqb;You won't be able to execute parallel Haskell programs unless PVM3
(Parallel Virtual Machine, version 3) is installed at your site.] (Parallel Virtual Machine, version 3) is installed at your site.&rsqb;
</para> </Para>
<para> <para>
To compile a Haskell program for parallel execution under PVM, use the To compile a Haskell program for parallel execution under PVM, use the
<option>-parallel</option> option,<indexterm><primary>-parallel <Option>-parallel</Option> option,<IndexTerm><Primary>-parallel
option</primary></indexterm> both when compiling <emphasis>and option</Primary></IndexTerm> both when compiling <Emphasis>and
linking</emphasis>. You will probably want to <literal>import linking</Emphasis>. You will probably want to <Literal>import
Parallel</literal> into your Haskell modules. Parallel</Literal> into your Haskell modules.
</para> </Para>
<para> <para>
To run your parallel program, once PVM is going, just invoke it To run your parallel program, once PVM is going, just invoke it
&ldquo;as normal&rdquo;. The main extra RTS option is &ldquo;as normal&rdquo;. The main extra RTS option is
<option>-N&lt;n&gt;</option>, to say how many PVM <Option>-qp&lt;n&gt;</Option>, to say how many PVM
&ldquo;processors&rdquo; your program to run on. (For more details of &ldquo;processors&rdquo; your program to run on. (For more details of
all relevant RTS options, please see <XRef all relevant RTS options, please see <XRef
LinkEnd="parallel-rts-opts">.) LinkEnd="parallel-rts-opts">.)
...@@ -1418,8 +1418,8 @@ out of them (e.g., parallelism profiles) is a battle with the vagaries of ...@@ -1418,8 +1418,8 @@ out of them (e.g., parallelism profiles) is a battle with the vagaries of
PVM, detailed in the following sections. PVM, detailed in the following sections.
</para> </para>
<sect2> <Sect2 id="pvm-dummies">
<title>Dummy's guide to using PVM</title> <Title>Dummy's guide to using PVM</Title>
<para> <para>
<indexterm><primary>PVM, how to use</primary></indexterm> <indexterm><primary>PVM, how to use</primary></indexterm>
...@@ -1438,11 +1438,23 @@ setenv PVM_DPATH $PVM_ROOT/lib/pvmd ...@@ -1438,11 +1438,23 @@ setenv PVM_DPATH $PVM_ROOT/lib/pvmd
<para> <para>
Creating and/or controlling your &ldquo;parallel machine&rdquo; is a purely-PVM Creating and/or controlling your &ldquo;parallel machine&rdquo; is a purely-PVM
business; nothing specific to Parallel Haskell. business; nothing specific to Parallel Haskell. The following paragraphs
</para> describe how to configure your parallel machine interactively.
</Para>
<para> <Para>
You use the <command>pvm</command><indexterm><primary>pvm command</primary></indexterm> command to start PVM on your If you use parallel Haskell regularly on the same machine configuration it
is a good idea to maintain a file with all machine names and to make the
environment variable PVM_HOST_FILE point to this file. Then you can avoid
the interactive operations described below by just saying
</Para>
<ProgramListing>
pvm $PVM_HOST_FILE
</ProgramListing>
<Para>
You use the <Command>pvm</Command><IndexTerm><Primary>pvm command</Primary></IndexTerm> command to start PVM on your
machine. You can then do various things to control/monitor your machine. You can then do various things to control/monitor your
&ldquo;parallel machine;&rdquo; the most useful being: &ldquo;parallel machine;&rdquo; the most useful being:
</para> </para>
...@@ -1504,8 +1516,8 @@ The PVM documentation can tell you much, much more about <command>pvm</command>! ...@@ -1504,8 +1516,8 @@ The PVM documentation can tell you much, much more about <command>pvm</command>!
</sect2> </sect2>
<sect2> <Sect2 id="par-profiles">
<title>Parallelism profiles</title> <Title>Parallelism profiles</Title>
<para> <para>
<indexterm><primary>parallelism profiles</primary></indexterm> <indexterm><primary>parallelism profiles</primary></indexterm>
...@@ -1518,25 +1530,25 @@ With Parallel Haskell programs, we usually don't care about the ...@@ -1518,25 +1530,25 @@ With Parallel Haskell programs, we usually don't care about the
results&mdash;only with &ldquo;how parallel&rdquo; it was! We want pretty pictures. results&mdash;only with &ldquo;how parallel&rdquo; it was! We want pretty pictures.
</para> </para>
<para> <Para>
Parallelism profiles (&agrave; la <command>hbcpp</command>) can be generated with the Parallelism profiles (&agrave; la <Command>hbcpp</Command>) can be generated with the
<option>-q</option><indexterm><primary>-q RTS option (concurrent, parallel)</primary></indexterm> RTS option. The <Option>-qP</Option><IndexTerm><Primary>-qP RTS option (concurrent, parallel)</Primary></IndexTerm> RTS option. The
per-processor profiling info is dumped into files named per-processor profiling info is dumped into files named
<filename>&lt;full-path&gt;&lt;program&gt;.gr</filename>. These are then munged into a PostScript picture, <Filename>&lt;full-path&gt;&lt;program&gt;.gr</Filename>. These are then munged into a PostScript picture,
which you can then display. For example, to run your program which you can then display. For example, to run your program
<filename>a.out</filename> on 8 processors, then view the parallelism profile, do: <Filename>a.out</Filename> on 8 processors, then view the parallelism profile, do:
</para> </Para>
<para> <Para>
<Screen> <Screen>
% ./a.out +RTS -N8 -q <prompt> ./a.out +RTS -qP -qp8
% grs2gr *.???.gr &#62; temp.gr # combine the 8 .gr files into one <prompt> grs2gr *.???.gr &#62; temp.gr # combine the 8 .gr files into one
% gr2ps -O temp.gr # cvt to .ps; output in temp.ps <prompt> gr2ps -O temp.gr # cvt to .ps; output in temp.ps
% ghostview -seascape temp.ps # look at it! <prompt> ghostview -seascape temp.ps # look at it!
</Screen> </Screen>
</para> </Para>
<para> <para>
The scripts for processing the parallelism profiles are distributed The scripts for processing the parallelism profiles are distributed
...@@ -1545,13 +1557,13 @@ in <filename>ghc/utils/parallel/</filename>. ...@@ -1545,13 +1557,13 @@ in <filename>ghc/utils/parallel/</filename>.
</sect2> </sect2>
<sect2> <Sect2>
<title>Other useful info about running parallel programs</title> <Title>Other useful info about running parallel programs</Title>
<para> <Para>
The &ldquo;garbage-collection statistics&rdquo; RTS options can be useful for The &ldquo;garbage-collection statistics&rdquo; RTS options can be useful for
seeing what parallel programs are doing. If you do either seeing what parallel programs are doing. If you do either
<option>+RTS -Sstderr</option><indexterm><primary>-Sstderr RTS option</primary></indexterm> or <option>+RTS -sstderr</option>, then <Option>+RTS -Sstderr</Option><IndexTerm><Primary>-Sstderr RTS option</Primary></IndexTerm> or <Option>+RTS -sstderr</Option>, then
you'll get mutator, garbage-collection, etc., times on standard you'll get mutator, garbage-collection, etc., times on standard
error. The standard error of all PE's other than the `main thread' error. The standard error of all PE's other than the `main thread'
appears in <filename>/tmp/pvml.nnn</filename>, courtesy of PVM. appears in <filename>/tmp/pvml.nnn</filename>, courtesy of PVM.
...@@ -1584,12 +1596,12 @@ for concurrent/parallel execution. ...@@ -1584,12 +1596,12 @@ for concurrent/parallel execution.
<para> <para>
<VariableList> <VariableList>
<varlistentry> <VarListEntry>
<term><option>-N&lt;N&gt;</option>:</term> <Term><Option>-qp&lt;N&gt;</Option>:</Term>
<listitem> <ListItem>
<para> <Para>
<indexterm><primary>-N&lt;N&gt; RTS option (parallel)</primary></indexterm> <IndexTerm><Primary>-qp&lt;N&gt; RTS option</Primary></IndexTerm>
(PARALLEL ONLY) Use <literal>&lt;N&gt;</literal> PVM processors to run this program; (PARALLEL ONLY) Use <Literal>&lt;N&gt;</Literal> PVM processors to run this program;
the default is 2. the default is 2.
</para> </para>
</listitem> </listitem>
...@@ -1623,60 +1635,98 @@ records the movement of threads between the green (runnable) and red ...@@ -1623,60 +1635,98 @@ records the movement of threads between the green (runnable) and red
green queue is split into green (for the currently running thread green queue is split into green (for the currently running thread
only) and amber (for other runnable threads). We do not recommend only) and amber (for other runnable threads). We do not recommend
that you use the verbose suboption if you are planning to use the that you use the verbose suboption if you are planning to use the
<command>hbcpp</command> profiling tools or if you are context switching at every heap <Command>hbcpp</Command> profiling tools or if you are context switching at every heap
check (with <option>-C</option>). check (with <Option>-C</Option>).
</para> -->
</listitem> </Para>
</varlistentry> </ListItem>
<varlistentry> </VarListEntry>
<term><option>-t&lt;num&gt;</option>:</term> <VarListEntry>
<listitem> <Term><Option>-qt&lt;num&gt;</Option>:</Term>
<para> <ListItem>
<indexterm><primary>-t&lt;num&gt; RTS option</primary></indexterm> <Para>
(PARALLEL ONLY) Limit the number of concurrent threads per processor <IndexTerm><Primary>-qt&lt;num&gt; RTS option</Primary></IndexTerm>
to <literal>&lt;num&gt;</literal>. The default is 32. Each thread requires slightly over 1K (PARALLEL ONLY) Limit the thread pool size, i.e. the number of concurrent
<emphasis>words</emphasis> in the heap for thread state and stack objects. (For