Skip to content
Snippets Groups Projects
Commit b8d23d80 authored by Julian Seward's avatar Julian Seward
Browse files

[project @ 2000-06-29 15:08:02 by sewardj]

Add my wizardly assembly-code basic-block matching program, which is
very useful for debugging the native code generator.  This is not
built by default, because it's totally useless to anyone except the
GHC developers.  The README file describes how to use and maintain it.
parent 868297c3
No related merge requests found
module Main where
import List
import System
import Char
import Array
type Label = String
type Code = [String]
pzipWith f [] [] = []
pzipWith f (a:as) (b:bs) = (f a b) : pzipWith f as bs
pzipWith f _ _ = error "pzipWith: unbalanced list"
main
= getArgs >>= \args ->
if length args /= 1
then putStr ("\ndiff_gcc_nat:\n" ++
" usage: create File.s-gcc and File.s-nat\n" ++
" then do: diff_gcc_nat File.s > synth.S\n" ++
" and compile synth.S into your program.\n" ++
"diff_gcc_nat is to help debug GHC's native code generator;\n" ++
"it is quite useless for any other purpose. For details, see\n" ++
" fptools/ghc/utils/debugNCG/README.\n"++
"\n"
)
else
do
let [f_root] = args
f_gcc <- readFile (f_root ++ "-gcc")
f_nat <- readFile (f_root ++ "-nat")
let split_nat0 = breakOn is_split_line (lines f_nat)
split_nat = filter (not.null.getLabels) split_nat0
labels_nat = map getLabels split_nat
labels_cls = map (map breakLabel) labels_nat
labels_merged :: [(Label, [LabelKind])]
labels_merged = map mergeBroken labels_cls
classified :: [(Label, [LabelKind], [String])]
classified
= pzipWith (\ merged text -> (fst merged, snd merged, text))
labels_merged split_nat
lines_gcc = lines f_gcc
(syncd, gcc_unused)
= find_correspondings classified lines_gcc
(ok_syncs, nat_unused)
= check_syncs syncd
num_ok = length ok_syncs
preamble
= map (\i -> "#define NATIVE_" ++ show i ++ " 0") [1 .. num_ok]
++ ["",
"#define UNMATCHED_NAT 0",
"#define UNMATCHED_GCC 1",
""]
final
= preamble
++ concat (pzipWith pp_ok_sync ok_syncs [1 .. num_ok])
++ ["",
"//============== unmatched NAT =================",
"#if UNMATCHED_NAT",
""]
++ nat_unused
++ ["",
"#endif",
"",
"//============== unmatched GCC =================",
"#if UNMATCHED_GCC"]
++ gcc_unused
++ ["#endif"
]
putStr (unlines final)
pp_ok_sync :: (Label, [LabelKind], [String], [String])
-> Int
-> [String]
pp_ok_sync (lbl, kinds, nat_code, gcc_code) number
= reconstruct number nat_code gcc_code
check_syncs :: [(Label, [LabelKind], [String], Maybe [String])] -- raw syncd
-> ( [(Label, [LabelKind], [String], [String])], -- ok syncs
[String] ) -- nat unsyncd
check_syncs [] = ([],[])
check_syncs (sync:syncs)
= let (syncs_ok, syncs_uu) = check_syncs syncs
in case sync of
(lbl, kinds, nat, Nothing)
-> (syncs_ok, nat ++ syncs_uu)
(lbl, kinds, nat, Just gcc_code)
-> ((lbl,kinds,nat,gcc_code):syncs_ok, syncs_uu)
find_correspondings :: [(Label, [LabelKind], [String])] -- native info
-> [String] -- gcc initial
-> ( [(Label, [LabelKind], [String], Maybe [String])],
[String] )
-- ( native info + found gcc stuff,
-- unused gcc stuff )
find_correspondings native gcc_init
= f native gcc_init
where
wurble x (xs, gcc_final) = (x:xs, gcc_final)
f [] gcc_uu = ( [], gcc_uu )
f (nat:nats) gcc_uu
= case nat of { (lbl, kinds, nat_code) ->
case find_corresponding lbl kinds gcc_uu of
Just (gcc_code, gcc_uu2)
| gcc_code == gcc_code
-> --gcc_code `seq` gcc_uu2 `seq`
wurble (lbl, kinds, nat_code, Just gcc_code) (f nats gcc_uu2)
Nothing
-> gcc_uu `seq`
wurble (lbl, kinds, nat_code, Nothing) (f nats gcc_uu)
}
find_corresponding :: Label -- root
-> [LabelKind] -- kinds
-> [String] -- gcc text
-> Maybe ([String],[String]) -- (found text, gcc leftovers)
find_corresponding root kinds gcc_lines
= case kinds of
[Vtbl]
-> let lbl_i = find_label arr (reconstruct_label root Vtbl)
fst_i = search_back arr lbl_i (pfxMatch [".text"])
in
splice arr fst_i lbl_i
[Closure]
-> let lbl_i = find_label arr (reconstruct_label root Closure)
fst_i = search_back arr lbl_i (pfxMatch [".data"])
lst_i = search_fwds arr (lbl_i+1) (not . pfxMatch [".long"])
in
splice arr fst_i (lst_i-1)
[Alt]
-> let lbl_i = find_label arr (reconstruct_label root Alt)
fst_i = search_back arr lbl_i (pfxMatch ["."])
lst_i = search_fwds arr lbl_i (pfxMatch [".d", ".t", ".r", ".g"])
in
splice arr fst_i (lst_i-1)
[Dflt]
-> let lbl_i = find_label arr (reconstruct_label root Dflt)
fst_i = search_back arr lbl_i (pfxMatch ["."])
lst_i = search_fwds arr lbl_i (pfxMatch [".d", ".t", ".r", ".g"])
in
splice arr fst_i (lst_i-1)
[Info,Entry]
-> let info_i = find_label arr (reconstruct_label root Info)
fst_i = search_back arr info_i (pfxMatch [".text"])
entry_i = find_label arr (reconstruct_label root Entry)
lst_i = search_fwds arr entry_i (pfxMatch [".d", ".t", ".r", ".g"])
in
splice arr fst_i (lst_i-1)
[Info,Entry,Fast k]
-> let info_i = find_label arr (reconstruct_label root Info)
fst_i = search_back arr info_i (pfxMatch [".text"])
fast_i = find_label arr (reconstruct_label root (Fast k))
lst_i = search_fwds arr fast_i (pfxMatch [".d", ".t", ".r", ".g"])
in
splice arr fst_i (lst_i-1)
[Info,Ret]
-> let info_i = find_label arr (reconstruct_label root Info)
fst_i = search_back arr info_i (pfxMatch [".text"])
ret_i = find_label arr (reconstruct_label root Ret)
lst_i = search_fwds arr ret_i (pfxMatch [".d", ".t", ".r", ".g"])
in
splice arr fst_i (lst_i-1)
[Srt]
-> let lbl_i = find_label arr (reconstruct_label root Srt)
fst_i = search_back arr lbl_i (pfxMatch [".text"])
lst_i = search_fwds arr (lbl_i+1) (not . pfxMatch [".long"])
in
splice arr fst_i (lst_i-1)
[CTbl]
-> let lbl_i = find_label arr (reconstruct_label root CTbl)
fst_i = search_back arr lbl_i (pfxMatch [".text"])
lst_i = search_fwds arr (lbl_i+1) (not . pfxMatch [".long"])
in
splice arr fst_i (lst_i-1)
[Init]
-> let lbl_i = find_label arr (reconstruct_label root Init)
fst_i = search_back arr lbl_i (pfxMatch [".data"])
lst_i = search_fwds arr lbl_i (pfxMatch [".d", ".t", ".r", ".g"])
in
splice arr fst_i (lst_i-1)
other
-> error ("find_corresponding: " ++ show kinds)
where
arr = listArray (1, length gcc_lines) gcc_lines
pfxMatch ss t
= let clean_t = filter (not.isSpace) t
in any (`isPrefixOf` clean_t) ss
search_back :: Array Int String -> Int -> (String -> Bool) -> Int
search_back code start_ix pred
= let test_ixs = [start_ix, start_ix-1 .. fst (bounds code)]
in case dropWhile (not . pred . (code !)) test_ixs of
(ok:_) -> ok
[] -> fst (bounds code) - 1
search_fwds :: Array Int String -> Int -> (String -> Bool) -> Int
search_fwds code start_ix pred
= let test_ixs = [start_ix .. snd (bounds code)]
in case dropWhile (not . pred . (code !)) test_ixs of
(ok:_) -> ok
[] -> snd (bounds code) + 1
find_label :: Array Int String -> Label -> Int
find_label code lbl
= case [idx | (idx, lbl2) <- assocs code, lbl == lbl2] of
[idx] -> idx
other -> error ("find_label: " ++ lbl)
reconstruct_label :: Label -> LabelKind -> Label
reconstruct_label root Init
= "__init_" ++ root ++ ":"
reconstruct_label root kind
= root ++ "_" ++ pp kind ++ ":"
where
pp Info = "info"
pp Entry = "entry"
pp Closure = "closure"
pp Alt = "alt"
pp Vtbl = "vtbl"
pp Default = "dflt"
pp (Fast i) = "fast" ++ show i
pp Dflt = "dflt"
pp Srt = "srt"
pp Ret = "ret"
pp CTbl = "tbl"
splice :: Array Int String -> Int -> Int -> Maybe ([String],[String])
splice gcc_code lo hi
| lo <= hi && clo <= lo && hi <= chi
= Just (map (gcc_code !) ix_used,
map (gcc_code !) (low_ix_uu ++ high_ix_uu))
| otherwise
= error "splice"
where
(clo,chi) = bounds gcc_code
low_ix_uu = [clo .. lo-1]
high_ix_uu = [hi+1 .. chi]
ix_used = [lo .. hi]
------------------------------------
getLabels :: [Label] -> [Label]
getLabels = sort . nub . filter is_interesting_label
data LabelKind
= Info | Entry | Fast Int | Closure | Alt | Vtbl | Default
| Dflt | Srt | Ret | CTbl | Init
deriving (Eq, Ord, Show)
breakLabel :: Label -> (Label,LabelKind)
breakLabel s
= let sr = reverse s
kr = takeWhile (/= '_') sr
mr = drop (1 + length kr) sr
m = reverse mr
k = reverse kr
kind
| take 4 k == "fast"
= Fast (read (takeWhile isDigit (drop 4 k)))
| otherwise
= case k of
"info:" -> Info
"entry:" -> Entry
"closure:" -> Closure
"alt:" -> Alt
"vtbl:" -> Vtbl
"dflt:" -> Dflt
"srt:" -> Srt
"ret:" -> Ret
"tbl:" -> CTbl
_ -> error ("breakLabel: " ++ show (s,k,m))
in
if head m == '_' && dropWhile (== '_') m == "init"
then (init k, Init)
else (m, kind)
mergeBroken :: [(Label,LabelKind)] -> (Label, [LabelKind])
mergeBroken pairs
= let (roots, kinds) = unzip pairs
ok = all (== (head roots)) (tail roots)
&& length kinds == length (nub kinds)
in
if ok
then (head roots, sort kinds)
else error ("mergeBroken: " ++ show pairs)
reconstruct :: Int -> Code -> Code -> Code
reconstruct number nat_code gcc_code
= ["",
"//------------------------------------------"]
++ map (comment ("//-- ")) (getLabels gcc_code)
++ ["", "#if NATIVE_" ++ show number, "//nat version", ""]
++ nat_code
++ ["", "#else", "//gcc version", ""]
++ gcc_code
++ ["", "#endif"]
comment str x = str ++ x
-----------------------------------------------------
split_marker = "___stg_split_marker"
is_split_line s
= let m = split_marker
in take 19 s == m || take 19 (drop 2 s) == m
is_interesting_label s
= not (null s)
&& not (any isSpace s)
&& last s == ':'
&& '_' `elem` s
breakOn :: (a -> Bool) -> [a] -> [[a]]
breakOn p [] = []
breakOn p xs
= let ys = takeWhile (not . p) xs
rest = drop (1 + length ys) xs
in
if null ys then breakOn p rest else ys : breakOn p rest
TOP=../..
include $(TOP)/mk/boilerplate.mk
INSTALL_PROGS += diff_gcc_nat
SRC_HC_OPTS += -O
OBJS = Diff_Gcc_Nat.o
CLEAN_FILES += diff_gcc_nat
all :: diff_gcc_nat
diff_gcc_nat: Diff_Gcc_Nat.o
$(HC) -o $@ $(HC_OPTS) $(LD_OPTS) $(OBJS)
CLEAN_FILES += diff_gcc_nat
CLEAN_FILES += $(OBJS)
include $(TOP)/mk/target.mk
This program is to assist in debugging GHC's native code generator.
Finding out which particular code block the native code block has
mis-compiled is like finding a needle in a haystack. This program
solves that problem. Given an assembly file created by the NCG (call
it Foo.s-nat) and one created by gcc (Foo.s-gcc), then
diff_gcc_nat Foo.s
will pair up corresponding code blocks, wrap each one in an #if and
spew the entire result out to stdout, along with a load of #defines at
the top, which you can use to switch between the gcc and ncg versions
of each code block. Pipe this into a .S file (I use the name
synth.S). Then you can used the #defines to do a binary search to
quickly arrive at the code block(s) which have been mis-compiled.
Note that the .S suffix tells ghc that this assembly file needs to be
cpp'd; so you should be sure to use .S and not .s.
The pattern matching can cope with the fact that the code blocks are
in different orders in the two files. The result synth.S is ordered
by in the order of the -nat input; the -gcc input is searched for the
corresponding stuff. The search relies on spotting artefacts like
section changes, so is fragile and susceptible to minor changes in the
gcc's assembly output. If that happens, it's well worth the effort
fixing this program, rather than trying to infer what's wrong with the
NCG directly from the -nat input.
This is only known to work on x86 linux (and cygwin). No idea if the
same matching heuristics will work on other archs -- if not, we need
to have multiple versions of this program, on a per-arch basis.
One other IMPORTANT thing: you *must* enable stg-split-markers in the
native code generator output, otherwise this won't work at all --
since it won't be able to find out where the code blocks start and
end. Enable these markers by compiling ghc (or at least
ghc/compiler/nativeGen/AsmCodeGen.lhs, function nativeCodeGen) with
-DDEBUG enabled.
Matching is simple but inefficient; diff-ing a large module could take
a minute or two.
JRS, 29 June 2000
0% Loading or .
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment