Skip to content
Snippets Groups Projects

Compare revisions

Changes are shown as if the source revision was being merged into the target revision. Learn more about comparing revisions.

Source

Select target project
No results found

Target

Select target project
  • ghc/ghc
  • bgamari/ghc
  • syd/ghc
  • ggreif/ghc
  • watashi/ghc
  • RolandSenn/ghc
  • mpickering/ghc
  • DavidEichmann/ghc
  • carter/ghc
  • harpocrates/ghc
  • ethercrow/ghc
  • mijicd/ghc
  • adamse/ghc
  • alexbiehl/ghc
  • gridaphobe/ghc
  • trofi/ghc
  • supersven/ghc
  • ppk/ghc
  • ulysses4ever/ghc
  • AndreasK/ghc
  • ghuntley/ghc
  • shayne-fletcher-da/ghc
  • fgaz/ghc
  • yav/ghc
  • osa1/ghc
  • mbbx6spp/ghc
  • JulianLeviston/ghc
  • reactormonk/ghc
  • rae/ghc
  • takenobu-hs/ghc
  • michalt/ghc
  • andrewthad/ghc
  • hsyl20/ghc
  • scottgw/ghc
  • sjakobi/ghc
  • angerman/ghc
  • RyanGlScott/ghc
  • hvr/ghc
  • howtonotwin/ghc
  • chessai/ghc
  • m-renaud/ghc
  • brprice/ghc
  • stevehartdata/ghc
  • sighingnow/ghc
  • kgardas/ghc
  • ckoparkar/ghc
  • alp/ghc
  • smaeul/ghc
  • kakkun61/ghc
  • sykloid/ghc
  • newhoggy/ghc
  • toonn/ghc
  • nineonine/ghc
  • Phyx/ghc
  • ezyang/ghc
  • tweag/ghc
  • langston/ghc
  • ndmitchell/ghc
  • rockbmb/ghc
  • artempyanykh/ghc
  • mniip/ghc
  • mynguyenbmc/ghc
  • alexfmpe/ghc
  • crockeea/ghc
  • nh2/ghc
  • vaibhavsagar/ghc
  • phadej/ghc
  • Haskell-mouse/ghc
  • lolotp/ghc
  • spacekitteh/ghc
  • michaelpj/ghc
  • mgsloan/ghc
  • HPCohen/ghc
  • tmobile/ghc
  • radrow/ghc
  • simonmar/ghc
  • _deepfire/ghc
  • Ericson2314/ghc
  • leitao/ghc
  • fumieval/ghc
  • trac-isovector/ghc
  • cblp/ghc
  • xich/ghc
  • ciil/ghc
  • erthalion/ghc
  • xldenis/ghc
  • autotaker/ghc
  • haskell-wasm/ghc
  • kcsongor/ghc
  • agander/ghc
  • Baranowski/ghc
  • trac-dredozubov/ghc
  • 23Skidoo/ghc
  • iustin/ghc
  • ningning/ghc
  • josefs/ghc
  • kabuhr/ghc
  • gallais/ghc
  • dten/ghc
  • expipiplus1/ghc
  • Pluralia/ghc
  • rohanjr/ghc
  • intricate/ghc
  • kirelagin/ghc
  • Javran/ghc
  • DanielG/ghc
  • trac-mizunashi_mana/ghc
  • pparkkin/ghc
  • bollu/ghc
  • ntc2/ghc
  • jaspervdj/ghc
  • JoshMeredith/ghc
  • wz1000/ghc
  • zkourouma/ghc
  • code5hot/ghc
  • jdprice/ghc
  • tdammers/ghc
  • J-mie6/ghc
  • trac-lantti/ghc
  • ch1bo/ghc
  • cgohla/ghc
  • lucamolteni/ghc
  • acairncross/ghc
  • amerocu/ghc
  • chreekat/ghc
  • txsmith/ghc
  • trupill/ghc
  • typetetris/ghc
  • sergv/ghc
  • fryguybob/ghc
  • erikd/ghc
  • trac-roland/ghc
  • setupminimal/ghc
  • Friede80/ghc
  • SkyWriter/ghc
  • xplorld/ghc
  • abrar/ghc
  • obsidiansystems/ghc
  • Icelandjack/ghc
  • adinapoli/ghc
  • trac-matthewbauer/ghc
  • heatsink/ghc
  • dwijnand/ghc
  • Cmdv/ghc
  • alinab/ghc
  • pepeiborra/ghc
  • fommil/ghc
  • luochen1990/ghc
  • rlupton20/ghc
  • applePrincess/ghc
  • lehins/ghc
  • ronmrdechai/ghc
  • leeadam/ghc
  • harendra/ghc
  • mightymosquito1991/ghc
  • trac-gershomb/ghc
  • lucajulian/ghc
  • Rizary/ghc
  • VictorCMiraldo/ghc
  • jamesbrock/ghc
  • andrewdmeier/ghc
  • luke/ghc
  • pranaysashank/ghc
  • cocreature/ghc
  • hithroc/ghc
  • obreitwi/ghc
  • slrtbtfs/ghc
  • kaol/ghc
  • yairchu/ghc
  • Mathemagician98/ghc
  • trac-taylorfausak/ghc
  • leungbk/ghc
  • MichaWiedenmann/ghc
  • chris-martin/ghc
  • TDecki/ghc
  • adithyaov/ghc
  • trac-gelisam/ghc
  • Lysxia/ghc
  • complyue/ghc
  • bwignall/ghc
  • sternmull/ghc
  • sonika/ghc
  • leif/ghc
  • broadwaylamb/ghc
  • myszon/ghc
  • danbroooks/ghc
  • Mechachleopteryx/ghc
  • zardyh/ghc
  • trac-vdukhovni/ghc
  • OmarKhaledAbdo/ghc
  • arrowd/ghc
  • Bodigrim/ghc
  • matheus23/ghc
  • cardenaso11/ghc
  • trac-Athas/ghc
  • mb720/ghc
  • DylanZA/ghc
  • liff/ghc
  • typedrat/ghc
  • trac-claude/ghc
  • jbm/ghc
  • Gertjan423/ghc
  • PHO/ghc
  • JKTKops/ghc
  • kockahonza/ghc
  • msakai/ghc
  • Sir4ur0n/ghc
  • barambani/ghc
  • vishnu.c/ghc
  • dcoutts/ghc
  • trac-runeks/ghc
  • trac-MaxGabriel/ghc
  • lexi.lambda/ghc
  • strake/ghc
  • spavikevik/ghc
  • JakobBruenker/ghc
  • rmanne/ghc
  • gdziadkiewicz/ghc
  • ani/ghc
  • iliastsi/ghc
  • smunix/ghc
  • judah/ghc
  • blackgnezdo/ghc
  • emilypi/ghc
  • trac-bpfoley/ghc
  • muesli4/ghc
  • trac-gkaracha/ghc
  • Kleidukos/ghc
  • nek0/ghc
  • TristanCacqueray/ghc
  • dwulive/ghc
  • mbakke/ghc
  • arybczak/ghc
  • Yang123321/ghc
  • maksbotan/ghc
  • QuietMisdreavus/ghc
  • trac-olshanskydr/ghc
  • emekoi/ghc
  • samuela/ghc
  • josephcsible/ghc
  • dramforever/ghc
  • lpsmith/ghc
  • DenisFrezzato/ghc
  • michivi/ghc
  • jneira/ghc
  • jeffhappily/ghc
  • Ivan-Yudin/ghc
  • nakaji-dayo/ghc
  • gdevanla/ghc
  • galen/ghc
  • fendor/ghc
  • yaitskov/ghc
  • rcythr/ghc
  • awpr/ghc
  • jeremyschlatter/ghc
  • Aver1y/ghc
  • mitchellvitez/ghc
  • merijn/ghc
  • tomjaguarpaw1/ghc
  • trac-NoidedSuper/ghc
  • erewok/ghc
  • trac-junji.hashimoto/ghc
  • adamwespiser/ghc
  • bjaress/ghc
  • jhrcek/ghc
  • leonschoorl/ghc
  • lukasz-golebiewski/ghc
  • sheaf/ghc
  • last-g/ghc
  • carassius1014/ghc
  • eschwartz/ghc
  • dwincort/ghc
  • felixwiemuth/ghc
  • TimWSpence/ghc
  • marcusmonteirodesouza/ghc
  • WJWH/ghc
  • vtols/ghc
  • theobat/ghc
  • BinderDavid/ghc
  • ckoparkar0/ghc
  • alexander-kjeldaas/ghc
  • dme2/ghc
  • philderbeast/ghc
  • aaronallen8455/ghc
  • rayshih/ghc
  • benkard/ghc
  • mpardalos/ghc
  • saidelman/ghc
  • leiftw/ghc
  • ca333/ghc
  • bwroga/ghc
  • nmichael44/ghc
  • trac-crobbins/ghc
  • felixonmars/ghc
  • adityagupta1089/ghc
  • hgsipiere/ghc
  • treeowl/ghc
  • alexpeits/ghc
  • CraigFe/ghc
  • dnlkrgr/ghc
  • kerckhove_ts/ghc
  • cptwunderlich/ghc
  • eiais/ghc
  • hahohihu/ghc
  • sanchayan/ghc
  • lemmih/ghc
  • sehqlr/ghc
  • trac-dbeacham/ghc
  • luite/ghc
  • trac-f-a/ghc
  • vados/ghc
  • luntain/ghc
  • fatho/ghc
  • alexbiehl-gc/ghc
  • dcbdan/ghc
  • tvh/ghc
  • liam-ly/ghc
  • timbobbarnes/ghc
  • GovanifY/ghc
  • shanth2600/ghc
  • gliboc/ghc
  • duog/ghc
  • moxonsghost/ghc
  • zander/ghc
  • masaeedu/ghc
  • georgefst/ghc
  • guibou/ghc
  • nicuveo/ghc
  • mdebruijne/ghc
  • stjordanis/ghc
  • emiflake/ghc
  • wygulmage/ghc
  • frasertweedale/ghc
  • coot/ghc
  • aratamizuki/ghc
  • tsandstr/ghc
  • mrBliss/ghc
  • Anton-Latukha/ghc
  • tadfisher/ghc
  • vapourismo/ghc
  • Sorokin-Anton/ghc
  • basile-henry/ghc
  • trac-mightybyte/ghc
  • AbsoluteNikola/ghc
  • cobrien99/ghc
  • songzh/ghc
  • blamario/ghc
  • aj4ayushjain/ghc
  • trac-utdemir/ghc
  • tangcl/ghc
  • hdgarrood/ghc
  • maerwald/ghc
  • arjun/ghc
  • ratherforky/ghc
  • haskieLambda/ghc
  • EmilGedda/ghc
  • Bogicevic/ghc
  • eddiejessup/ghc
  • kozross/ghc
  • AlistairB/ghc
  • 3Rafal/ghc
  • christiaanb/ghc
  • trac-bit/ghc
  • matsumonkie/ghc
  • trac-parsonsmatt/ghc
  • chisui/ghc
  • jaro/ghc
  • trac-kmiyazato/ghc
  • davidsd/ghc
  • Tritlo/ghc
  • I-B-3/ghc
  • lykahb/ghc
  • AriFordsham/ghc
  • turion1/ghc
  • berberman/ghc
  • christiantakle/ghc
  • zyklotomic/ghc
  • trac-ocramz/ghc
  • CSEdd/ghc
  • doyougnu/ghc
  • mmhat/ghc
  • why-not-try-calmer/ghc
  • plutotulp/ghc
  • kjekac/ghc
  • Manvi07/ghc
  • teo/ghc
  • cactus/ghc
  • CarrieMY/ghc
  • abel/ghc
  • yihming/ghc
  • tsakki/ghc
  • jessicah/ghc
  • oliverbunting/ghc
  • meld/ghc
  • friedbrice/ghc
  • Joald/ghc
  • abarbu/ghc
  • DigitalBrains1/ghc
  • sterni/ghc
  • alexDarcy/ghc
  • hexchain/ghc
  • minimario/ghc
  • zliu41/ghc
  • tommd/ghc
  • jazcarate/ghc
  • peterbecich/ghc
  • alirezaghey/ghc
  • solomon/ghc
  • mikael.urankar/ghc
  • davjam/ghc
  • int-index/ghc
  • MorrowM/ghc
  • nrnrnr/ghc
  • Sonfamm/ghc-test-only
  • afzt1/ghc
  • nguyenhaibinh-tpc/ghc
  • trac-lierdakil/ghc
  • MichaWiedenmann1/ghc
  • jmorag/ghc
  • Ziharrk/ghc
  • trac-MitchellSalad/ghc
  • juampe/ghc
  • jwaldmann/ghc
  • snowleopard/ghc
  • juhp/ghc
  • normalcoder/ghc
  • ksqsf/ghc
  • trac-jberryman/ghc
  • roberth/ghc
  • 1ntEgr8/ghc
  • epworth/ghc
  • MrAdityaAlok/ghc
  • JunmingZhao42/ghc
  • jappeace/ghc
  • trac-Gabriel439/ghc
  • alt-romes/ghc
  • HugoPeters1024/ghc
  • 10ne1/ghc-fork
  • agentultra/ghc
  • Garfield1002/ghc
  • ChickenProp/ghc
  • clyring/ghc
  • MaxHearnden/ghc
  • jumper149/ghc
  • vem/ghc
  • ketzacoatl/ghc
  • Rosuavio/ghc
  • jackohughes/ghc
  • p4l1ly/ghc
  • konsumlamm/ghc
  • shlevy/ghc
  • torsten.schmits/ghc
  • andremarianiello/ghc
  • amesgen/ghc
  • googleson78/ghc
  • InfiniteVerma/ghc
  • uhbif19/ghc
  • yiyunliu/ghc
  • raehik/ghc
  • mrkun/ghc
  • telser/ghc
  • 1Jajen1/ghc
  • slotThe/ghc
  • WinstonHartnett/ghc
  • mpilgrem/ghc
  • dreamsmasher/ghc
  • schuelermine/ghc
  • trac-Viwor/ghc
  • undergroundquizscene/ghc
  • evertedsphere/ghc
  • coltenwebb/ghc
  • oberblastmeister/ghc
  • agrue/ghc
  • lf-/ghc
  • zacwood9/ghc
  • steshaw/ghc
  • high-cloud/ghc
  • SkamDart/ghc
  • PiDelport/ghc
  • maoif/ghc
  • RossPaterson/ghc
  • CharlesTaylor7/ghc
  • ribosomerocker/ghc
  • trac-ramirez7/ghc
  • daig/ghc
  • NicolasT/ghc
  • FinleyMcIlwaine/ghc
  • lawtonnichols/ghc
  • jmtd/ghc
  • ozkutuk/ghc
  • wildsebastian/ghc
  • nikshalark/ghc
  • lrzlin/ghc
  • tobias/ghc
  • fw/ghc
  • hawkinsw/ghc
  • type-dance/ghc
  • rui314/ghc
  • ocharles/ghc
  • wavewave/ghc
  • TheKK/ghc
  • nomeata/ghc
  • trac-csabahruska/ghc
  • jonathanjameswatson/ghc
  • L-as/ghc
  • Axman6/ghc
  • barracuda156/ghc
  • trac-jship/ghc
  • jake-87/ghc
  • meooow/ghc
  • rebeccat/ghc
  • hamana55/ghc
  • Enigmage/ghc
  • kokobd/ghc
  • agevelt/ghc
  • gshen42/ghc
  • chrismwendt/ghc
  • MangoIV/ghc
  • teto/ghc
  • Sookr1/ghc
  • trac-thomasjm/ghc
  • barci2/ghc-dev
  • trac-m4dc4p/ghc
  • dixonary/ghc
  • breakerzirconia/ghc
  • alexsio27444/ghc
  • glocq/ghc
  • sourabhxyz/ghc
  • ryantrinkle/ghc
  • Jade/ghc
  • scedfaliako/ghc
  • martijnbastiaan/ghc
  • trac-george.colpitts/ghc
  • ammarbinfaisal/ghc
  • mimi.vx/ghc
  • lortabac/ghc
  • trac-zyla/ghc
  • benbellick/ghc
  • aadaa-fgtaa/ghc
  • jvanbruegge/ghc
  • archbung/ghc
  • gilmi/ghc
  • mfonism/ghc
  • alex-mckenna/ghc
  • Ei30metry/ghc
  • DiegoDiverio/ghc
  • jorgecunhamendes/ghc
  • liesnikov/ghc
  • akrmn/ghc
  • trac-simplifierticks/ghc
  • jacco/ghc
  • rhendric/ghc
  • damhiya/ghc
  • ryndubei/ghc
  • DaveBarton/ghc
  • trac-Profpatsch/ghc
  • GZGavinZhao/ghc
  • ncfavier/ghc
  • jameshaydon/ghc
  • ajccosta/ghc
  • dschrempf/ghc
  • cydparser/ghc
  • LinuxUserGD/ghc
  • elodielander/ghc
  • facundominguez/ghc
  • psilospore/ghc
  • lachrimae/ghc
  • dylan-thinnes/ghc-type-errors-plugin
  • hamishmack/ghc
  • Leary/ghc
  • lzszt/ghc
  • lyokha/ghc
  • trac-glaubitz/ghc
  • Rewbert/ghc
  • andreabedini/ghc
  • Jasagredo/ghc
  • sol/ghc
  • OlegAlexander/ghc
  • trac-sthibaul/ghc
  • avdv/ghc
  • Wendaolee/ghc
  • ur4t/ghc
  • daylily/ghc
  • boltzmannrain/ghc
  • mmzk1526/ghc
  • trac-fizzixnerd/ghc
  • soulomoon/ghc
  • rwmjones/ghc
  • j14i/ghc
  • tracsis/ghc
  • gesh/ghc
  • flip101/ghc
  • eldritch-cookie/ghc
  • LemonjamesD/ghc
  • pgujjula/ghc
  • skeuchel/ghc
  • noteed/ghc
  • gulin.serge/ghc
  • Torrekie/ghc
  • jlwoodwa/ghc
  • ayanamists/ghc
  • husong998/ghc
  • trac-edmundnoble/ghc
  • josephf/ghc
  • contrun/ghc
  • baulig/ghc
  • edsko/ghc
  • mzschr/ghc-issue-24732
  • ulidtko/ghc
  • Arsen/ghc
  • trac-sjoerd_visscher/ghc
  • crumbtoo/ghc
  • L0neGamer/ghc
  • DrewFenwick/ghc
  • benz0li/ghc
  • MaciejWas/ghc
  • jordanrule/ghc
  • trac-qqwy/ghc
  • LiamGoodacre/ghc
  • isomorpheme/ghc
  • trac-danidiaz/ghc
  • Kariim/ghc
  • MTaimoorZaeem/ghc
  • hololeap/ghc
  • ticat-fp/ghc
  • meritamen/ghc
  • criskell/ghc
  • trac-kraai/ghc
  • aergus/ghc
  • jdral/ghc
  • SamB/ghc
  • Tristian/ghc
  • ywgrit/ghc
  • KatsuPatrick/ghc
  • OsePedro/ghc
  • mpscholten/ghc
  • fp/ghc
  • zaquest/ghc
  • fangyi-zhou/ghc
  • augyg/ghc
640 results
Show changes
Commits on Source (128)
Showing
with 702 additions and 748 deletions
......@@ -444,7 +444,7 @@ hadrian-multi:
# workaround for docker permissions
- sudo chown ghc:ghc -R .
variables:
GHC_FLAGS: "-Werror=-Wno-error=incomplete-record-selectors -Wwarn=deprecations"
GHC_FLAGS: "-Werror=-Wno-error=incomplete-record-selectors -Wwarn=deprecations -Wwarn=unused-imports"
# -Wno-error=incomplete-record-selectors is present because -Wall now
# includes -Wincomplete-record-selectors, and hadrian-multi has many, many
# warnings about incomplete record selectors. A better fix would be to
......@@ -1235,6 +1235,7 @@ ghcup-metadata-release:
# No explicit needs for release pipeline as we assume we need everything and everything will pass.
extends: .ghcup-metadata
script:
- nix shell -f .gitlab/rel_eng -c ghcup-metadata --release-mode --metadata ghcup-0.0.7.yaml --date="$(date -d $CI_PIPELINE_CREATED_AT +%Y-%m-%d)" --pipeline-id="$CI_PIPELINE_ID" --version="$ProjectVersion" --fragment
- nix shell -f .gitlab/rel_eng -c ghcup-metadata --release-mode --metadata ghcup-0.0.7.yaml --date="$(date -d $CI_PIPELINE_CREATED_AT +%Y-%m-%d)" --pipeline-id="$CI_PIPELINE_ID" --version="$ProjectVersion" > "metadata_test.yaml"
rules:
- if: '$RELEASE_JOB == "yes"'
......
......@@ -326,13 +326,19 @@ function fetch_cabal() {
local base_url="https://downloads.haskell.org/~cabal/cabal-install-$v/"
case "$(uname)" in
Darwin) cabal_url="$base_url/cabal-install-$v-x86_64-apple-darwin17.7.0.tar.xz" ;;
FreeBSD) cabal_url="$base_url/cabal-install-$v-x86_64-freebsd13.tar.xz" ;;
FreeBSD) cabal_url="$base_url/cabal-install-$v-x86_64-freebsd14.tar.xz" ;;
*) fail "don't know where to fetch cabal-install for $(uname)"
esac
echo "Fetching cabal-install from $cabal_url"
curl "$cabal_url" > cabal.tar.xz
tmp="$(tar -tJf cabal.tar.xz | head -n1)"
$TAR -xJf cabal.tar.xz
mv cabal "$toolchain/bin"
# Check if the bindist has directory structure
if [[ "$tmp" = "cabal" ]]; then
mv cabal "$toolchain/bin"
else
mv "$tmp/cabal" "$toolchain/bin"
fi
;;
esac
end_section "fetch cabal"
......@@ -954,7 +960,7 @@ if [ "${CI_COMMIT_BRANCH:-}" == "master" ] && [ "${CI_PROJECT_PATH:-}" == "ghc/
fi
fi
if [ -n "${IGNORE_PERF_FAILURES:-}" ]; then
RUNTEST_ARGS="--ignore-perf-failures=$IGNORE_PERF_FAILURES"
RUNTEST_ARGS=( "${RUNTEST_ARGS[@]:-}" "--ignore-perf-failures=$IGNORE_PERF_FAILURES" )
fi
if [[ -z ${BIGNUM_BACKEND:-} ]]; then BIGNUM_BACKEND=gmp; fi
......
......@@ -101,7 +101,7 @@ There are two different modes this script can operate in:
data Opsys
= Linux LinuxDistro
| Darwin
| FreeBSD13
| FreeBSD14
| Windows deriving (Eq)
data LinuxDistro
......@@ -293,7 +293,7 @@ runnerTag arch (Linux _) =
runnerTag AArch64 Darwin = "aarch64-darwin"
runnerTag Amd64 Darwin = "x86_64-darwin-m1"
runnerTag Amd64 Windows = "new-x86_64-windows"
runnerTag Amd64 FreeBSD13 = "x86_64-freebsd13"
runnerTag Amd64 FreeBSD14 = "x86_64-freebsd14"
runnerTag _ _ = error "Invalid arch/opsys"
tags :: Arch -> Opsys -> BuildConfig -> [String]
......@@ -326,7 +326,7 @@ distroName Rocky8 = "rocky8"
opsysName :: Opsys -> String
opsysName (Linux distro) = "linux-" ++ distroName distro
opsysName Darwin = "darwin"
opsysName FreeBSD13 = "freebsd13"
opsysName FreeBSD14 = "freebsd14"
opsysName Windows = "windows"
archName :: Arch -> String
......@@ -423,15 +423,19 @@ brokenTest :: TestName -- ^ test name
brokenTest test _why = "BROKEN_TESTS" =: test
opsysVariables :: Arch -> Opsys -> Variables
opsysVariables _ FreeBSD13 = mconcat
opsysVariables _ FreeBSD14 = mconcat
[ -- N.B. we use iconv from ports as I see linker errors when we attempt
-- to use the "native" iconv embedded in libc as suggested by the
-- porting guide [1].
-- [1] https://www.freebsd.org/doc/en/books/porters-handbook/using-iconv.html)
"CONFIGURE_ARGS" =: "--with-gmp-includes=/usr/local/include --with-gmp-libraries=/usr/local/lib --with-iconv-includes=/usr/local/include --with-iconv-libraries=/usr/local/lib"
, "HADRIAN_ARGS" =: "--docs=no-sphinx"
"CONFIGURE_ARGS" =: "--with-iconv-includes=/usr/local/include --with-iconv-libraries=/usr/local/lib"
, "CONFIGURE_ARGS" =: "--with-system-libffi --with-ffi-includes=/usr/local/include --with-ffi-libraries=/usr/local/lib"
, "CONFIGURE_ARGS" =: "--with-gmp-includes=/usr/local/include --with-gmp-libraries=/usr/local/lib"
-- Prefer to use the system's clang-based toolchain and not gcc
, "CC" =: "cc"
, "CXX" =: "c++"
, "GHC_VERSION" =: "9.6.4"
, "CABAL_INSTALL_VERSION" =: "3.10.2.0"
, "CABAL_INSTALL_VERSION" =: "3.10.3.0"
]
opsysVariables arch (Linux distro) = distroVariables arch distro
opsysVariables AArch64 (Darwin {}) =
......@@ -1140,6 +1144,11 @@ darwin =
, fastCI (standardBuilds AArch64 Darwin)
]
freebsd_jobs :: [JobGroup Job]
freebsd_jobs =
[ addValidateRule FreeBSDLabel (standardBuilds Amd64 FreeBSD14)
]
alpine_x86 :: [JobGroup Job]
alpine_x86 =
[ -- Fully static build, in theory usable on any linux distribution.
......@@ -1213,6 +1222,7 @@ job_groups =
++ alpine_x86
++ alpine_aarch64
++ cross_jobs
++ freebsd_jobs
mkPlatform :: Arch -> Opsys -> String
......
......@@ -1082,6 +1082,72 @@
"ac_cv_func_utimensat": "no"
}
},
"nightly-x86_64-freebsd14-validate": {
"after_script": [
".gitlab/ci.sh save_cache",
".gitlab/ci.sh save_test_output",
".gitlab/ci.sh clean",
"cat ci_timings"
],
"allow_failure": false,
"artifacts": {
"expire_in": "8 weeks",
"paths": [
"ghc-x86_64-freebsd14-validate.tar.xz",
"junit.xml",
"unexpected-test-output.tar.gz"
],
"reports": {
"junit": "junit.xml"
},
"when": "always"
},
"cache": {
"key": "x86_64-freebsd14-$CACHE_REV",
"paths": [
"cabal-cache",
"toolchain"
]
},
"dependencies": [],
"image": null,
"needs": [
{
"artifacts": false,
"job": "hadrian-ghc-in-ghci"
}
],
"rules": [
{
"if": "(\"true\" == \"true\") && ($RELEASE_JOB != \"yes\") && ($NIGHTLY)",
"when": "on_success"
}
],
"script": [
".gitlab/ci.sh setup",
".gitlab/ci.sh configure",
".gitlab/ci.sh build_hadrian",
".gitlab/ci.sh test_hadrian"
],
"stage": "full-build",
"tags": [
"x86_64-freebsd14"
],
"variables": {
"BIGNUM_BACKEND": "gmp",
"BIN_DIST_NAME": "ghc-x86_64-freebsd14-validate",
"BUILD_FLAVOUR": "validate",
"CABAL_INSTALL_VERSION": "3.10.3.0",
"CC": "cc",
"CONFIGURE_ARGS": "--with-iconv-includes=/usr/local/include --with-iconv-libraries=/usr/local/lib --with-system-libffi --with-ffi-includes=/usr/local/include --with-ffi-libraries=/usr/local/lib --with-gmp-includes=/usr/local/include --with-gmp-libraries=/usr/local/lib --enable-strict-ghc-toolchain-check",
"CXX": "c++",
"GHC_VERSION": "9.6.4",
"INSTALL_CONFIGURE_ARGS": "--enable-strict-ghc-toolchain-check",
"RUNTEST_ARGS": "",
"TEST_ENV": "x86_64-freebsd14-validate",
"XZ_OPT": "-9"
}
},
"nightly-x86_64-linux-alpine3_12-int_native-validate+fully_static": {
"after_script": [
".gitlab/ci.sh save_cache",
......@@ -3844,6 +3910,74 @@
"ac_cv_func_utimensat": "no"
}
},
"release-x86_64-freebsd14-release": {
"after_script": [
".gitlab/ci.sh save_cache",
".gitlab/ci.sh save_test_output",
".gitlab/ci.sh clean",
"cat ci_timings"
],
"allow_failure": false,
"artifacts": {
"expire_in": "1 year",
"paths": [
"ghc-x86_64-freebsd14-release.tar.xz",
"junit.xml",
"unexpected-test-output.tar.gz"
],
"reports": {
"junit": "junit.xml"
},
"when": "always"
},
"cache": {
"key": "x86_64-freebsd14-$CACHE_REV",
"paths": [
"cabal-cache",
"toolchain"
]
},
"dependencies": [],
"image": null,
"needs": [
{
"artifacts": false,
"job": "hadrian-ghc-in-ghci"
}
],
"rules": [
{
"if": "(\"true\" == \"true\") && ($RELEASE_JOB == \"yes\") && ($NIGHTLY == null)",
"when": "on_success"
}
],
"script": [
".gitlab/ci.sh setup",
".gitlab/ci.sh configure",
".gitlab/ci.sh build_hadrian",
".gitlab/ci.sh test_hadrian"
],
"stage": "full-build",
"tags": [
"x86_64-freebsd14"
],
"variables": {
"BIGNUM_BACKEND": "gmp",
"BIN_DIST_NAME": "ghc-x86_64-freebsd14-release",
"BUILD_FLAVOUR": "release",
"CABAL_INSTALL_VERSION": "3.10.3.0",
"CC": "cc",
"CONFIGURE_ARGS": "--with-iconv-includes=/usr/local/include --with-iconv-libraries=/usr/local/lib --with-system-libffi --with-ffi-includes=/usr/local/include --with-ffi-libraries=/usr/local/lib --with-gmp-includes=/usr/local/include --with-gmp-libraries=/usr/local/lib --enable-strict-ghc-toolchain-check",
"CXX": "c++",
"GHC_VERSION": "9.6.4",
"HADRIAN_ARGS": "--hash-unit-ids",
"IGNORE_PERF_FAILURES": "all",
"INSTALL_CONFIGURE_ARGS": "--enable-strict-ghc-toolchain-check",
"RUNTEST_ARGS": "",
"TEST_ENV": "x86_64-freebsd14-release",
"XZ_OPT": "-9"
}
},
"release-x86_64-linux-alpine3_12-int_native-release+fully_static": {
"after_script": [
".gitlab/ci.sh save_cache",
......@@ -5222,6 +5356,71 @@
"ac_cv_func_utimensat": "no"
}
},
"x86_64-freebsd14-validate": {
"after_script": [
".gitlab/ci.sh save_cache",
".gitlab/ci.sh save_test_output",
".gitlab/ci.sh clean",
"cat ci_timings"
],
"allow_failure": false,
"artifacts": {
"expire_in": "2 weeks",
"paths": [
"ghc-x86_64-freebsd14-validate.tar.xz",
"junit.xml",
"unexpected-test-output.tar.gz"
],
"reports": {
"junit": "junit.xml"
},
"when": "always"
},
"cache": {
"key": "x86_64-freebsd14-$CACHE_REV",
"paths": [
"cabal-cache",
"toolchain"
]
},
"dependencies": [],
"image": null,
"needs": [
{
"artifacts": false,
"job": "hadrian-ghc-in-ghci"
}
],
"rules": [
{
"if": "((($ONLY_JOBS) && ($ONLY_JOBS =~ /.*\\bx86_64-freebsd14-validate(\\s|$).*/)) || (($ONLY_JOBS == null) && ((($CI_MERGE_REQUEST_LABELS =~ /.*full-ci.*/) || ($CI_MERGE_REQUEST_LABELS =~ /.*marge_bot_batch_merge_job.*/) || ($CI_COMMIT_BRANCH == \"master\") || ($CI_COMMIT_BRANCH =~ /ghc-[0-9]+\\.[0-9]+/)) || ($CI_MERGE_REQUEST_LABELS =~ /.*FreeBSD.*/)))) && ($RELEASE_JOB != \"yes\") && ($NIGHTLY == null)",
"when": "on_success"
}
],
"script": [
".gitlab/ci.sh setup",
".gitlab/ci.sh configure",
".gitlab/ci.sh build_hadrian",
".gitlab/ci.sh test_hadrian"
],
"stage": "full-build",
"tags": [
"x86_64-freebsd14"
],
"variables": {
"BIGNUM_BACKEND": "gmp",
"BIN_DIST_NAME": "ghc-x86_64-freebsd14-validate",
"BUILD_FLAVOUR": "validate",
"CABAL_INSTALL_VERSION": "3.10.3.0",
"CC": "cc",
"CONFIGURE_ARGS": "--with-iconv-includes=/usr/local/include --with-iconv-libraries=/usr/local/lib --with-system-libffi --with-ffi-includes=/usr/local/include --with-ffi-libraries=/usr/local/lib --with-gmp-includes=/usr/local/include --with-gmp-libraries=/usr/local/lib --enable-strict-ghc-toolchain-check",
"CXX": "c++",
"GHC_VERSION": "9.6.4",
"INSTALL_CONFIGURE_ARGS": "--enable-strict-ghc-toolchain-check",
"RUNTEST_ARGS": "",
"TEST_ENV": "x86_64-freebsd14-validate"
}
},
"x86_64-linux-alpine3_12-int_native-validate+fully_static": {
"after_script": [
".gitlab/ci.sh save_cache",
......
......@@ -132,7 +132,7 @@ def fetch_artifacts(release: str, pipeline_id: int,
for f in doc_files:
subprocess.run(['tar', '-xf', f, '-C', dest])
logging.info(f'extracted docs {f} to {dest}')
index_path = destdir / 'index.html'
index_path = destdir / 'docs' / 'index.html'
index_path.replace(dest / 'index.html')
elif job.name == 'hackage-doc-tarball':
dest = dest_dir / 'hackage_docs'
......
......@@ -243,7 +243,9 @@ def mk_new_yaml(release_mode, version, date, pipeline_type, job_map):
, "unknown_versioning" : centos7 }
, "Linux_Fedora" : { ">= 33": fedora33
, "unknown_versioning": centos7 }
, "Linux_RedHat" : { "unknown_versioning": centos7 }
, "Linux_RedHat" : { "< 9": centos7
, ">= 9": fedora33
, "unknown_versioning": fedora33 }
, "Linux_UnknownLinux" : { "unknown_versioning": rocky8 }
, "Darwin" : { "unknown_versioning" : darwin_x86 }
, "Windows" : { "unknown_versioning" : windows }
......
......@@ -59,8 +59,10 @@ usage() {
echo " prepare_docs prepare the documentation directory"
echo " upload_docs upload documentation downloads.haskell.org"
echo " upload upload the tarballs and documentation to downloads.haskell.org"
echo " set_symlink <symlink>"
echo " set the given symlink (e.g. latest) to the current version"
echo " purge_all purge entire release from the CDN"
echo " purge_file file purge a given file from the CDN"
echo " purge_file <file> purge a given file from the CDN"
echo " verify verify the signatures in this directory"
echo
}
......@@ -200,6 +202,14 @@ function upload_docs() {
"$GHC_TREE/.gitlab/rel_eng/upload_ghc_libs.py" upload --docs=hackage_docs ${args[@]}
}
function set_symlink() {
local SYMLINK="$1"
# Check to make sure that the indicated version actually exists.
curl "https://downloads.haskell.org/ghc/$ver" > /dev/null || (echo "$ver doesn't exist"; exit 1)
echo -e "rm ghc/$SYMLINK\nln -s $ver ghc/$SYMLINK" | sftp ghc@downloads-origin.haskell.org
curl -X PURGE "http://downloads.haskell.org/~ghc/$SYMLINK"
}
if [ "x$1" == "x" ]; then
recompress
gen_hashes
......
......@@ -104,7 +104,7 @@
url = https://gitlab.haskell.org/ghc/libffi-tarballs.git
ignore = untracked
[submodule "gmp-tarballs"]
path = libraries/ghc-bignum/gmp/gmp-tarballs
path = libraries/ghc-internal/gmp/gmp-tarballs
url = https://gitlab.haskell.org/ghc/gmp-tarballs.git
[submodule "libraries/exceptions"]
path = libraries/exceptions
......
......@@ -4,6 +4,7 @@
{-# LANGUAGE TupleSections, NamedFieldPuns #-}
{-# LANGUAGE TypeFamilies #-}
{-# LANGUAGE PatternSynonyms #-}
{-# LANGUAGE LambdaCase #-}
-- -----------------------------------------------------------------------------
--
......@@ -422,7 +423,7 @@ import GHC.Types.PkgQual
import GHC.Types.Unique.FM
import GHC.Unit
import GHC.Unit.Env
import GHC.Unit.Env as UnitEnv
import GHC.Unit.External
import GHC.Unit.Finder
import GHC.Unit.Module.ModIface
......@@ -431,6 +432,7 @@ import GHC.Unit.Module.ModDetails
import GHC.Unit.Module.ModSummary
import GHC.Unit.Module.Graph
import GHC.Unit.Home.ModInfo
import qualified GHC.Unit.Home.Graph as HUG
import GHC.Settings
import Control.Applicative ((<|>))
......@@ -453,6 +455,7 @@ import System.Environment ( getEnv, getProgName )
import System.Exit ( exitWith, ExitCode(..) )
import System.FilePath
import System.IO.Error ( isDoesNotExistError )
import GHC.Unit.Home.PackageTable
-- %************************************************************************
-- %* *
......@@ -666,7 +669,7 @@ setUnitDynFlagsNoCheck uid dflags1 = do
, homeUnitEnv_home_unit = Just home_unit
}
let unit_env = ue_updateHomeUnitEnv upd uid (hsc_unit_env hsc_env)
let unit_env = UnitEnv.ue_updateHomeUnitEnv upd uid (hsc_unit_env hsc_env)
let dflags = updated_dflags
......@@ -684,7 +687,7 @@ setUnitDynFlagsNoCheck uid dflags1 = do
let !unit_env1 =
if homeUnitId_ dflags /= uid
then
ue_renameUnitId
UnitEnv.renameUnitId
uid
(homeUnitId_ dflags)
unit_env0
......@@ -731,7 +734,7 @@ setTopSessionDynFlags dflags = do
wasmInterpTargetPlatform = targetPlatform dflags,
wasmInterpProfiled = profiled,
wasmInterpHsSoSuffix = way_tag ++ dynLibSuffix (ghcNameVersion dflags),
wasmInterpUnitState = ue_units $ hsc_unit_env hsc_env
wasmInterpUnitState = ue_homeUnitState $ hsc_unit_env hsc_env
}
pure $ Just $ Interp (ExternalInterp $ ExtWasm $ ExtInterpState cfg s) loader lookup_cache
......@@ -825,7 +828,7 @@ setProgramDynFlags_ invalidate_needed dflags = do
let cached_unit_dbs = homeUnitEnv_unit_dbs homeUnitEnv
dflags = homeUnitEnv_dflags homeUnitEnv
old_hpt = homeUnitEnv_hpt homeUnitEnv
home_units = unitEnv_keys (ue_home_unit_graph old_unit_env)
home_units = HUG.allUnits (ue_home_unit_graph old_unit_env)
(dbs,unit_state,home_unit,mconstants) <- liftIO $ initUnits logger dflags cached_unit_dbs home_units
......@@ -838,7 +841,7 @@ setProgramDynFlags_ invalidate_needed dflags = do
, homeUnitEnv_home_unit = Just home_unit
}
let dflags1 = homeUnitEnv_dflags $ unitEnv_lookup (ue_currentUnit old_unit_env) home_unit_graph
let dflags1 = homeUnitEnv_dflags $ HUG.unitEnv_lookup (ue_currentUnit old_unit_env) home_unit_graph
let unit_env = UnitEnv
{ ue_platform = targetPlatform dflags1
, ue_namever = ghcNameVersion dflags1
......@@ -1419,12 +1422,14 @@ getModuleGraph = liftM hsc_mod_graph getSession
-- | Return @True@ \<==> module is loaded.
isLoaded :: GhcMonad m => ModuleName -> m Bool
isLoaded m = withSession $ \hsc_env ->
return $! isJust (lookupHpt (hsc_HPT hsc_env) m)
isLoaded m = withSession $ \hsc_env -> liftIO $ do
hmi <- lookupHpt (hsc_HPT hsc_env) m
return $! isJust hmi
isLoadedModule :: GhcMonad m => UnitId -> ModuleName -> m Bool
isLoadedModule uid m = withSession $ \hsc_env ->
return $! isJust (lookupHug (hsc_HUG hsc_env) uid m)
isLoadedModule uid m = withSession $ \hsc_env -> liftIO $ do
hmi <- HUG.lookupHug (hsc_HUG hsc_env) uid m
return $! isJust hmi
-- | Return the bindings for the current interactive session.
getBindings :: GhcMonad m => m [TyThing]
......@@ -1495,7 +1500,7 @@ availsToGlobalRdrEnv hsc_env mod avails
getHomeModuleInfo :: HscEnv -> Module -> IO (Maybe ModuleInfo)
getHomeModuleInfo hsc_env mdl =
case lookupHugByModule mdl (hsc_HUG hsc_env) of
HUG.lookupHugByModule mdl (hsc_HUG hsc_env) >>= \case
Nothing -> return Nothing
Just hmi -> do
let details = hm_details hmi
......@@ -1806,9 +1811,9 @@ lookupQualifiedModule NoPkgQual mod_name = withSession $ \hsc_env -> do
lookupQualifiedModule pkgqual mod_name = findQualifiedModule pkgqual mod_name
lookupLoadedHomeModule :: GhcMonad m => UnitId -> ModuleName -> m (Maybe Module)
lookupLoadedHomeModule uid mod_name = withSession $ \hsc_env -> do
liftIO $ trace_if (hsc_logger hsc_env) (text "lookupLoadedHomeModule" <+> ppr mod_name <+> ppr uid)
case lookupHug (hsc_HUG hsc_env) uid mod_name of
lookupLoadedHomeModule uid mod_name = withSession $ \hsc_env -> liftIO $ do
trace_if (hsc_logger hsc_env) (text "lookupLoadedHomeModule" <+> ppr mod_name <+> ppr uid)
HUG.lookupHug (hsc_HUG hsc_env) uid mod_name >>= \case
Just mod_info -> return (Just (mi_module (hm_iface mod_info)))
_not_a_home_module -> return Nothing
......@@ -1842,8 +1847,7 @@ getGHCiMonad :: GhcMonad m => m Name
getGHCiMonad = fmap (ic_monad . hsc_IC) getSession
getHistorySpan :: GhcMonad m => History -> m SrcSpan
getHistorySpan h = withSession $ \hsc_env ->
return $ GHC.Runtime.Eval.getHistorySpan hsc_env h
getHistorySpan h = withSession $ \hsc_env -> liftIO $ GHC.Runtime.Eval.getHistorySpan hsc_env h
obtainTermFromVal :: GhcMonad m => Int -> Bool -> Type -> a -> m Term
obtainTermFromVal bound force ty a = withSession $ \hsc_env ->
......
......@@ -574,9 +574,9 @@ gHC_INTERNAL_CONTROL_MONAD_ZIP :: Module
gHC_INTERNAL_CONTROL_MONAD_ZIP = mkGhcInternalModule (fsLit "GHC.Internal.Control.Monad.Zip")
gHC_INTERNAL_NUM_INTEGER, gHC_INTERNAL_NUM_NATURAL, gHC_INTERNAL_NUM_BIGNAT :: Module
gHC_INTERNAL_NUM_INTEGER = mkBignumModule (fsLit "GHC.Num.Integer")
gHC_INTERNAL_NUM_NATURAL = mkBignumModule (fsLit "GHC.Num.Natural")
gHC_INTERNAL_NUM_BIGNAT = mkBignumModule (fsLit "GHC.Num.BigNat")
gHC_INTERNAL_NUM_INTEGER = mkGhcInternalModule (fsLit "GHC.Internal.Bignum.Integer")
gHC_INTERNAL_NUM_NATURAL = mkGhcInternalModule (fsLit "GHC.Internal.Bignum.Natural")
gHC_INTERNAL_NUM_BIGNAT = mkGhcInternalModule (fsLit "GHC.Internal.Bignum.BigNat")
gHC_INTERNAL_BASE, gHC_INTERNAL_ENUM,
gHC_INTERNAL_GHCI, gHC_INTERNAL_GHCI_HELPERS, gHC_CSTRING, gHC_INTERNAL_DATA_STRING,
......@@ -686,9 +686,6 @@ mAIN_NAME = mkModuleNameFS (fsLit "Main")
mkPrimModule :: FastString -> Module
mkPrimModule m = mkModule primUnit (mkModuleNameFS m)
mkBignumModule :: FastString -> Module
mkBignumModule m = mkModule bignumUnit (mkModuleNameFS m)
mkGhcInternalModule :: FastString -> Module
mkGhcInternalModule m = mkGhcInternalModule_ (mkModuleNameFS m)
......
......@@ -106,7 +106,7 @@ assembleBCOs interp profile proto_bcos tycons top_strs modbreaks spt_entries = d
bcos' <- mallocStrings interp bcos
return CompiledByteCode
{ bc_bcos = bcos'
, bc_itbls = itblenv
, bc_itbls = itblenv
, bc_ffis = concatMap protoBCOFFIs proto_bcos
, bc_strs = top_strs
, bc_breaks = modbreaks
......@@ -178,11 +178,12 @@ assembleOneBCO interp profile pbco = do
return ubco'
assembleBCO :: Platform -> ProtoBCO Name -> IO UnlinkedBCO
assembleBCO platform (ProtoBCO { protoBCOName = nm
, protoBCOInstrs = instrs
, protoBCOBitmap = bitmap
, protoBCOBitmapSize = bsize
, protoBCOArity = arity }) = do
assembleBCO platform
(ProtoBCO { protoBCOName = nm
, protoBCOInstrs = instrs
, protoBCOBitmap = bitmap
, protoBCOBitmapSize = bsize
, protoBCOArity = arity }) = do
-- pass 1: collect up the offsets of the local labels.
let asm = mapM_ (assembleI platform) instrs
......@@ -527,6 +528,10 @@ assembleI platform i = case i of
, SmallOp tickx, SmallOp infox
, Op np
]
#if MIN_VERSION_rts(1,0,3)
BCO_NAME name -> do np <- lit [BCONPtrStr name]
emit bci_BCO_NAME [Op np]
#endif
where
literal (LitLabel fs _) = litlabel fs
......
{-# LANGUAGE CPP #-}
{-# LANGUAGE FlexibleContexts #-}
{-# LANGUAGE LambdaCase #-}
{-# OPTIONS_GHC -funbox-strict-fields #-}
......@@ -27,6 +27,10 @@ import GHC.Runtime.Heap.Layout ( StgWord )
import Data.Int
import Data.Word
#if MIN_VERSION_rts(1,0,3)
import Data.ByteString (ByteString)
#endif
import GHC.Stack.CCS (CostCentre)
import GHC.Stg.Syntax
......@@ -229,6 +233,22 @@ data BCInstr
!Word16 -- breakpoint info index
(RemotePtr CostCentre)
#if MIN_VERSION_rts(1,0,3)
-- | A "meta"-instruction for recording the name of a BCO for debugging purposes.
-- These are ignored by the interpreter but helpfully printed by the disassmbler.
| BCO_NAME !ByteString
#endif
{- Note [BCO_NAME]
~~~~~~~~~~~~~~~
The BCO_NAME instruction is a debugging-aid enabled with the -fadd-bco-name flag.
When enabled the bytecode assembler will prepend a BCO_NAME instruction to every
generated bytecode object capturing the STG name of the binding the BCO implements.
This is then printed by the bytecode disassembler, allowing bytecode objects to be
readily correlated with their STG and Core source.
-}
-- -----------------------------------------------------------------------------
-- Printing bytecode instructions
......@@ -383,6 +403,9 @@ instance Outputable BCInstr where
<+> text "<tick_module>" <+> ppr tickx
<+> text "<info_module>" <+> ppr infox
<+> text "<cc>"
#if MIN_VERSION_rts(1,0,3)
ppr (BCO_NAME nm) = text "BCO_NAME" <+> text (show nm)
#endif
......@@ -487,3 +510,6 @@ bciStackUse SLIDE{} = 0
bciStackUse MKAP{} = 0
bciStackUse MKPAP{} = 0
bciStackUse PACK{} = 1 -- worst case is PACK 0 words
#if MIN_VERSION_rts(1,0,3)
bciStackUse BCO_NAME{} = 0
#endif
......@@ -24,8 +24,6 @@ data CmmConfig = CmmConfig
, cmmExternalDynamicRefs :: !Bool -- ^ Generate code to link against dynamic libraries
, cmmDoCmmSwitchPlans :: !Bool -- ^ Should the Cmm pass replace Stg switch statements
, cmmSplitProcPoints :: !Bool -- ^ Should Cmm split proc points or not
, cmmAllowMul2 :: !Bool -- ^ Does this platform support mul2
, cmmOptConstDivision :: !Bool -- ^ Should we optimize constant divisors
}
-- | retrieve the target Cmm platform
......
......@@ -7,7 +7,6 @@ module GHC.Cmm.MachOp
, pprMachOp, isCommutableMachOp, isAssociativeMachOp
, isComparisonMachOp, maybeIntComparison, machOpResultType
, machOpArgReps, maybeInvertComparison, isFloatComparison
, isCommutableCallishMachOp
-- MachOp builders
, mo_wordAdd, mo_wordSub, mo_wordEq, mo_wordNe,mo_wordMul, mo_wordSQuot
......@@ -846,17 +845,3 @@ machOpMemcpyishAlign op = case op of
MO_Memmove align -> Just align
MO_Memcmp align -> Just align
_ -> Nothing
isCommutableCallishMachOp :: CallishMachOp -> Bool
isCommutableCallishMachOp op =
case op of
MO_x64_Add -> True
MO_x64_Mul -> True
MO_x64_Eq -> True
MO_x64_Ne -> True
MO_x64_And -> True
MO_x64_Or -> True
MO_x64_Xor -> True
MO_S_Mul2 _ -> True
MO_U_Mul2 _ -> True
_ -> False
......@@ -5,52 +5,27 @@
-- (c) The University of Glasgow 2006
--
-----------------------------------------------------------------------------
{-# LANGUAGE TupleSections #-}
{-# LANGUAGE TypeApplications #-}
{-# LANGUAGE PatternSynonyms #-}
module GHC.Cmm.Opt (
constantFoldNode,
constantFoldExpr,
cmmMachOpFold,
cmmMachOpFoldM,
Opt, runOpt
cmmMachOpFoldM
) where
import GHC.Prelude
import GHC.Cmm.Dataflow.Block
import GHC.Cmm.Utils
import GHC.Cmm
import GHC.Cmm.Config
import GHC.Types.Unique.DSM
import GHC.Utils.Misc
import GHC.Utils.Panic
import GHC.Platform
import Data.Maybe
import Data.Word
import GHC.Exts (oneShot)
import Control.Monad
constantFoldNode :: CmmNode e x -> Opt (CmmNode e x)
constantFoldNode (CmmUnsafeForeignCall (PrimTarget op) res args)
= traverse constantFoldExprOpt args >>= cmmCallishMachOpFold op res
constantFoldNode node
= mapExpOpt constantFoldExprOpt node
constantFoldExprOpt :: CmmExpr -> Opt CmmExpr
constantFoldExprOpt e = wrapRecExpOpt f e
where
f (CmmMachOp op args)
= do
cfg <- getConfig
case cmmMachOpFold (cmmPlatform cfg) op args of
CmmMachOp op' args' -> fromMaybe (CmmMachOp op' args') <$> cmmMachOpFoldOptM cfg op' args'
e -> pure e
f (CmmRegOff r 0) = pure (CmmReg r)
f (CmmLit (CmmInt x rep)) = pure (CmmLit $ CmmInt (narrowU rep x) rep)
f e = pure e
constantFoldNode :: Platform -> CmmNode e x -> CmmNode e x
constantFoldNode platform = mapExp (constantFoldExpr platform)
constantFoldExpr :: Platform -> CmmExpr -> CmmExpr
constantFoldExpr platform = wrapRecExp f
......@@ -321,7 +296,7 @@ cmmMachOpFoldM platform cmp [CmmMachOp conv [x], CmmLit (CmmInt i _)]
maybe_comparison (MO_S_Le _) rep False = Just (MO_U_Le rep)
maybe_comparison _ _ _ = Nothing
-- We can often do something with constants of 0, 1 and (-1) ...
-- We can often do something with constants of 0 and 1 ...
-- See Note [Comparison operators]
cmmMachOpFoldM platform mop [x, y@(CmmLit (CmmInt 0 _))]
......@@ -392,8 +367,6 @@ cmmMachOpFoldM platform mop [x, (CmmLit (CmmInt n _))]
MO_Mul rep
| Just p <- exactLog2 n ->
Just $! (cmmMachOpFold platform (MO_Shl rep) [x, CmmLit (CmmInt p $ wordWidth platform)])
-- The optimization for division by power of 2 is technically duplicated, but since at least one other part of ghc uses
-- the pure `constantFoldExpr` this remains
MO_U_Quot rep
| Just p <- exactLog2 n ->
Just $! (cmmMachOpFold platform (MO_U_Shr rep) [x, CmmLit (CmmInt p $ wordWidth platform)])
......@@ -402,19 +375,46 @@ cmmMachOpFoldM platform mop [x, (CmmLit (CmmInt n _))]
Just $! (cmmMachOpFold platform (MO_And rep) [x, CmmLit (CmmInt (n - 1) rep)])
MO_S_Quot rep
| Just p <- exactLog2 n,
CmmReg _ <- x ->
CmmReg _ <- x -> -- We duplicate x in signedQuotRemHelper, hence require
-- it is a reg. FIXME: remove this restriction.
Just $! (cmmMachOpFold platform (MO_S_Shr rep)
[signedQuotRemHelper platform n x rep p, CmmLit (CmmInt p $ wordWidth platform)])
[signedQuotRemHelper rep p, CmmLit (CmmInt p $ wordWidth platform)])
MO_S_Rem rep
| Just p <- exactLog2 n,
CmmReg _ <- x ->
CmmReg _ <- x -> -- We duplicate x in signedQuotRemHelper, hence require
-- it is a reg. FIXME: remove this restriction.
-- We replace (x `rem` 2^p) by (x - (x `quot` 2^p) * 2^p).
-- Moreover, we fuse MO_S_Shr (last operation of MO_S_Quot)
-- and MO_S_Shl (multiplication by 2^p) into a single MO_And operation.
Just $! (cmmMachOpFold platform (MO_Sub rep)
[x, cmmMachOpFold platform (MO_And rep)
[signedQuotRemHelper platform n x rep p, CmmLit (CmmInt (- n) rep)]])
[signedQuotRemHelper rep p, CmmLit (CmmInt (- n) rep)]])
_ -> Nothing
where
-- In contrast with unsigned integers, for signed ones
-- shift right is not the same as quot, because it rounds
-- to minus infinity, whereas quot rounds toward zero.
-- To fix this up, we add one less than the divisor to the
-- dividend if it is a negative number.
--
-- to avoid a test/jump, we use the following sequence:
-- x1 = x >> word_size-1 (all 1s if -ve, all 0s if +ve)
-- x2 = y & (divisor-1)
-- result = x + x2
-- this could be done a bit more simply using conditional moves,
-- but we're processor independent here.
--
-- we optimise the divide by 2 case slightly, generating
-- x1 = x >> word_size-1 (unsigned)
-- return = x + x1
signedQuotRemHelper :: Width -> Integer -> CmmExpr
signedQuotRemHelper rep p = CmmMachOp (MO_Add rep) [x, x2]
where
bits = fromIntegral (widthInBits rep) - 1
shr = if p == 1 then MO_U_Shr rep else MO_S_Shr rep
x1 = CmmMachOp shr [x, CmmLit (CmmInt bits $ wordWidth platform)]
x2 = if p == 1 then x1 else
CmmMachOp (MO_And rep) [x1, CmmLit (CmmInt (n-1) rep)]
-- ToDo (#7116): optimise floating-point multiplication, e.g. x*2.0 -> x+x
-- Unfortunately this needs a unique supply because x might not be a
......@@ -448,533 +448,3 @@ That's what the constant-folding operations on comparison operators do above.
isPicReg :: CmmExpr -> Bool
isPicReg (CmmReg (CmmGlobal (GlobalRegUse PicBaseReg _))) = True
isPicReg _ = False
canOptimizeDivision :: CmmConfig -> Width -> Bool
canOptimizeDivision cfg rep = cmmOptConstDivision cfg &&
-- we can either widen the arguments to simulate mul2 or use mul2 directly for the platform word size
(rep < wordWidth platform || (rep == wordWidth platform && cmmAllowMul2 cfg))
where platform = cmmPlatform cfg
-- -----------------------------------------------------------------------------
-- Folding callish machops
cmmCallishMachOpFold :: CallishMachOp -> [CmmFormal] -> [CmmActual] -> Opt (CmmNode O O)
cmmCallishMachOpFold op res args =
fromMaybe (CmmUnsafeForeignCall (PrimTarget op) res args) <$> (getConfig >>= \cfg -> cmmCallishMachOpFoldM cfg op res args)
cmmCallishMachOpFoldM :: CmmConfig -> CallishMachOp -> [CmmFormal] -> [CmmActual] -> Opt (Maybe (CmmNode O O))
-- If possible move the literals to the right, the following cases assume that to be the case
cmmCallishMachOpFoldM cfg op res [x@(CmmLit _),y]
| isCommutableCallishMachOp op && not (isLit y) = cmmCallishMachOpFoldM cfg op res [y,x]
-- Both arguments are literals, replace with the result
cmmCallishMachOpFoldM _ op res [CmmLit (CmmInt x _), CmmLit (CmmInt y _)]
= case op of
MO_S_Mul2 rep
| [rHiNeeded,rHi,rLo] <- res -> do
let resSz = widthInBits rep
resVal = (narrowS rep x) * (narrowS rep y)
high = resVal `shiftR` resSz
low = narrowS rep resVal
isHiNeeded = high /= low `shiftR` resSz
isHiNeededVal = if isHiNeeded then 1 else 0
prependNode $! CmmAssign (CmmLocal rHiNeeded) (CmmLit $ CmmInt isHiNeededVal rep)
prependNode $! CmmAssign (CmmLocal rHi) (CmmLit $ CmmInt high rep)
pure . Just $! CmmAssign (CmmLocal rLo) (CmmLit $ CmmInt low rep)
MO_U_Mul2 rep
| [rHi,rLo] <- res -> do
let resSz = widthInBits rep
resVal = (narrowU rep x) * (narrowU rep y)
high = resVal `shiftR` resSz
low = narrowU rep resVal
prependNode $! CmmAssign (CmmLocal rHi) (CmmLit $ CmmInt high rep)
pure . Just $! CmmAssign (CmmLocal rLo) (CmmLit $ CmmInt low rep)
MO_S_QuotRem rep
| [rQuot, rRem] <- res,
y /= 0 -> do
let (q,r) = quotRem (narrowS rep x) (narrowS rep y)
prependNode $! CmmAssign (CmmLocal rQuot) (CmmLit $ CmmInt q rep)
pure . Just $! CmmAssign (CmmLocal rRem) (CmmLit $ CmmInt r rep)
MO_U_QuotRem rep
| [rQuot, rRem] <- res,
y /= 0 -> do
let (q,r) = quotRem (narrowU rep x) (narrowU rep y)
prependNode $! CmmAssign (CmmLocal rQuot) (CmmLit $ CmmInt q rep)
pure . Just $! CmmAssign (CmmLocal rRem) (CmmLit $ CmmInt r rep)
_ -> pure Nothing
-- 0, 1 or -1 as one of the constants
cmmCallishMachOpFoldM _ op res [_, CmmLit (CmmInt 0 _)]
= case op of
-- x * 0 == 0
MO_S_Mul2 rep
| [rHiNeeded, rHi, rLo] <- res -> do
prependNode $! CmmAssign (CmmLocal rHiNeeded) (CmmLit $ CmmInt 0 rep)
prependNode $! CmmAssign (CmmLocal rHi) (CmmLit $ CmmInt 0 rep)
pure . Just $! CmmAssign (CmmLocal rLo) (CmmLit $ CmmInt 0 rep)
-- x * 0 == 0
MO_U_Mul2 rep
| [rHi, rLo] <- res -> do
prependNode $! CmmAssign (CmmLocal rHi) (CmmLit $ CmmInt 0 rep)
pure . Just $! CmmAssign (CmmLocal rLo) (CmmLit $ CmmInt 0 rep)
_ -> pure Nothing
cmmCallishMachOpFoldM _ op res [CmmLit (CmmInt 0 _), _]
= case op of
-- 0 quotRem d == (0,0)
MO_S_QuotRem rep
| [rQuot, rRem] <- res -> do
prependNode $! CmmAssign (CmmLocal rQuot) (CmmLit $ CmmInt 0 rep)
pure . Just $! CmmAssign (CmmLocal rRem) (CmmLit $ CmmInt 0 rep)
-- 0 quotRem d == (0,0)
MO_U_QuotRem rep
| [rQuot,rRem] <- res -> do
prependNode $! CmmAssign (CmmLocal rQuot) (CmmLit $ CmmInt 0 rep)
pure . Just $! CmmAssign (CmmLocal rRem) (CmmLit $ CmmInt 0 rep)
_ -> pure Nothing
cmmCallishMachOpFoldM cfg op res [x, CmmLit (CmmInt 1 _)]
= case op of
-- x * 1 == x -- Note: The high word needs to be a sign extension of the low word, so we use a sign extending shift
MO_S_Mul2 rep
| [rHiNeeded, rHi, rLo] <- res -> do
let platform = cmmPlatform cfg
wordRep = wordWidth platform
repInBits = toInteger $ widthInBits rep
prependNode $! CmmAssign (CmmLocal rHiNeeded) (CmmLit $ CmmInt 0 rep)
prependNode $! CmmAssign (CmmLocal rHi) (cmmMachOpFold platform (MO_S_Shr rep) [x, CmmLit $ CmmInt (repInBits - 1) wordRep])
pure . Just $! CmmAssign (CmmLocal rLo) x
-- x * 1 == x
MO_U_Mul2 rep
| [rHi, rLo] <- res -> do
prependNode $! CmmAssign (CmmLocal rHi) (CmmLit $ CmmInt 0 rep)
pure . Just $! CmmAssign (CmmLocal rLo) x
-- x quotRem 1 == (x, 0)
MO_S_QuotRem rep
| [rQuot, rRem] <- res -> do
prependNode $! CmmAssign (CmmLocal rQuot) x
pure . Just $! CmmAssign (CmmLocal rRem) (CmmLit $ CmmInt 0 rep)
-- x quotRem 1 == (x, 0)
MO_U_QuotRem rep
| [rQuot, rRem] <- res -> do
prependNode $! CmmAssign (CmmLocal rQuot) x
pure . Just $! CmmAssign (CmmLocal rRem) (CmmLit $ CmmInt 0 rep)
_ -> pure Nothing
-- handle quotRem with a constant divisor
cmmCallishMachOpFoldM cfg op res [n, CmmLit (CmmInt d' _)]
= case op of
MO_S_QuotRem rep
| Just p <- exactLog2 d,
[rQuot,rRem] <- res -> do
n' <- intoRegister n (cmmBits rep)
-- first prepend the optimized division by a power 2
prependNode $! CmmAssign (CmmLocal rQuot)
(cmmMachOpFold platform (MO_S_Shr rep)
[signedQuotRemHelper platform d n' rep p, CmmLit (CmmInt p $ wordWidth platform)])
-- then output an optimized remainder by a power of 2
pure . Just $! CmmAssign (CmmLocal rRem)
(cmmMachOpFold platform (MO_Sub rep)
[n', cmmMachOpFold platform (MO_And rep)
[signedQuotRemHelper platform d n' rep p, CmmLit (CmmInt (- d) rep)]])
| canOptimizeDivision cfg rep,
d /= (-1), d /= 0, d /= 1,
[rQuot,rRem] <- res -> do
-- we are definitely going to use n multiple times, so put it into a register
n' <- intoRegister n (cmmBits rep)
-- generate an optimized (signed) division of n by d
q <- generateDivisionBySigned platform cfg rep n' d
-- we also need the result multiple times to calculate the remainder
q' <- intoRegister q (cmmBits rep)
prependNode $! CmmAssign (CmmLocal rQuot) q'
-- The remainder now becomes n - q * d
pure . Just $! CmmAssign (CmmLocal rRem) $ CmmMachOp (MO_Sub rep) [n', CmmMachOp (MO_Mul rep) [q', CmmLit $ CmmInt d rep]]
where
platform = cmmPlatform cfg
d = narrowS rep d'
MO_U_QuotRem rep
| Just p <- exactLog2 d,
[rQuot,rRem] <- res -> do
-- first prepend the optimized division by a power 2
prependNode $! CmmAssign (CmmLocal rQuot) $ CmmMachOp (MO_U_Shr rep) [n, CmmLit (CmmInt p $ wordWidth platform)]
-- then output an optimized remainder by a power of 2
pure . Just $! CmmAssign (CmmLocal rRem) $ CmmMachOp (MO_And rep) [n, CmmLit (CmmInt (d - 1) rep)]
| canOptimizeDivision cfg rep,
d /= 0, d /= 1,
[rQuot,rRem] <- res -> do
-- we are definitely going to use n multiple times, so put it into a register
n' <- intoRegister n (cmmBits rep)
-- generate an optimized (unsigned) division of n by d
q <- generateDivisionByUnsigned platform cfg rep n' d
-- we also need the result multiple times to calculate the remainder
q' <- intoRegister q (cmmBits rep)
prependNode $! CmmAssign (CmmLocal rQuot) q'
-- The remainder now becomes n - q * d
pure . Just $! CmmAssign (CmmLocal rRem) $ CmmMachOp (MO_Sub rep) [n', CmmMachOp (MO_Mul rep) [q', CmmLit $ CmmInt d rep]]
where
platform = cmmPlatform cfg
d = narrowU rep d'
_ -> pure Nothing
cmmCallishMachOpFoldM _ _ _ _ = pure Nothing
-- -----------------------------------------------------------------------------
-- Specialized constant folding for MachOps which sometimes need to expand into multiple nodes
cmmMachOpFoldOptM :: CmmConfig -> MachOp -> [CmmExpr] -> Opt (Maybe CmmExpr)
cmmMachOpFoldOptM cfg op [n, CmmLit (CmmInt d' _)] =
case op of
MO_S_Quot rep
-- recheck for power of 2 division. This may not be handled by cmmMachOpFoldM if n is not in a register
| Just p <- exactLog2 d -> do
n' <- intoRegister n (cmmBits rep)
pure . Just $! cmmMachOpFold platform (MO_S_Shr rep)
[ signedQuotRemHelper platform d n' rep p
, CmmLit (CmmInt p $ wordWidth platform)
]
| canOptimizeDivision cfg rep,
d /= (-1), d /= 0, d /= 1 -> Just <$!> generateDivisionBySigned platform cfg rep n d
where d = narrowS rep d'
MO_S_Rem rep
-- recheck for power of 2 remainder. This may not be handled by cmmMachOpFoldM if n is not in a register
| Just p <- exactLog2 d -> do
n' <- intoRegister n (cmmBits rep)
pure . Just $! cmmMachOpFold platform (MO_Sub rep)
[ n'
, cmmMachOpFold platform (MO_And rep)
[ signedQuotRemHelper platform d n' rep p
, CmmLit (CmmInt (- d) rep)
]
]
| canOptimizeDivision cfg rep,
d /= (-1), d /= 0, d /= 1 -> do
n' <- intoRegister n (cmmBits rep)
-- first generate the division
q <- generateDivisionBySigned platform cfg rep n' d
-- then calculate the remainder by n - q * d
pure . Just $! CmmMachOp (MO_Sub rep) [n', CmmMachOp (MO_Mul rep) [q, CmmLit $ CmmInt d rep]]
where d = narrowS rep d'
MO_U_Quot rep
-- No need to recheck power of 2 division because cmmMachOpFoldM always handles that case
| canOptimizeDivision cfg rep,
d /= 0, d /= 1, Nothing <- exactLog2 d -> Just <$!> generateDivisionByUnsigned platform cfg rep n d
where d = narrowU rep d'
MO_U_Rem rep
-- No need to recheck power of 2 remainder because cmmMachOpFoldM always handles that case
| canOptimizeDivision cfg rep,
d /= 0, d /= 1, Nothing <- exactLog2 d -> do
n' <- intoRegister n (cmmBits rep)
-- first generate the division
q <- generateDivisionByUnsigned platform cfg rep n d
-- then calculate the remainder by n - q * d
pure . Just $! CmmMachOp (MO_Sub rep) [n', CmmMachOp (MO_Mul rep) [q, CmmLit $ CmmInt d rep]]
where d = narrowU rep d'
_ -> pure Nothing
where platform = cmmPlatform cfg
cmmMachOpFoldOptM _ _ _ = pure Nothing
-- -----------------------------------------------------------------------------
-- Utils for prepending new nodes
-- Move an expression into a register to possibly use it multiple times
intoRegister :: CmmExpr -> CmmType -> Opt CmmExpr
intoRegister e@(CmmReg _) _ = pure e
intoRegister expr ty = do
u <- getUniqueM
let reg = LocalReg u ty
CmmReg (CmmLocal reg) <$ prependNode (CmmAssign (CmmLocal reg) expr)
prependNode :: CmmNode O O -> Opt ()
prependNode n = Opt $ \_ xs -> pure (xs ++ [n], ())
-- -----------------------------------------------------------------------------
-- Division by constants utils
-- Helper for division by a power of 2
-- In contrast with unsigned integers, for signed ones
-- shift right is not the same as quot, because it rounds
-- to minus infinity, whereas quot rounds toward zero.
-- To fix this up, we add one less than the divisor to the
-- dividend if it is a negative number.
--
-- to avoid a test/jump, we use the following sequence:
-- x1 = x >> word_size-1 (all 1s if -ve, all 0s if +ve)
-- x2 = y & (divisor-1)
-- result = x + x2
-- this could be done a bit more simply using conditional moves,
-- but we're processor independent here.
--
-- we optimize the divide by 2 case slightly, generating
-- x1 = x >> word_size-1 (unsigned)
-- return = x + x1
signedQuotRemHelper :: Platform -> Integer -> CmmExpr -> Width -> Integer -> CmmExpr
signedQuotRemHelper platform n x rep p = CmmMachOp (MO_Add rep) [x, x2]
where
bits = fromIntegral (widthInBits rep) - 1
shr = if p == 1 then MO_U_Shr rep else MO_S_Shr rep
x1 = CmmMachOp shr [x, CmmLit (CmmInt bits $ wordWidth platform)]
x2 = if p == 1 then x1 else
CmmMachOp (MO_And rep) [x1, CmmLit (CmmInt (n-1) rep)]
{- Note: [Division by constants]
Integer division is floor(n / d), the goal is to find m,p
such that floor((m * n) / 2^p) = floor(n / d).
The idea being: n/d = n * (1/d). But we cannot store 1/d in an integer without
some error, so we choose some 2^p / d such that the error ends up small and
thus vanishes when we divide by 2^p again.
The algorithm below to generate these numbers is taken from Hacker's Delight
Second Edition Chapter 10 "Integer division by constants". The chapter also
contains proof that this method does indeed produce correct results.
However this is a much more literal interpretation of the algorithm,
which we can use because of the unbounded Integer type. Hacker's Delight
also provides a much more complex algorithm which computes these numbers
without the need to exceed the word size, but that is not necessary here.
-}
generateDivisionBySigned :: Platform -> CmmConfig -> Width -> CmmExpr -> Integer -> Opt CmmExpr
-- Sanity checks, division will generate incorrect results or undesirable code for these cases
-- cmmMachOpFoldM and cmmMachOpFoldOptM should have already handled these cases!
generateDivisionBySigned _ _ _ _ 0 = panic "generate signed division with 0"
generateDivisionBySigned _ _ _ _ 1 = panic "generate signed division with 1"
generateDivisionBySigned _ _ _ _ (-1) = panic "generate signed division with -1"
generateDivisionBySigned _ _ _ _ d | Just _ <- exactLog2 d = panic $ "generate signed division with " ++ show d
generateDivisionBySigned platform _cfg rep n divisor = do
-- We only duplicate n' if we actually need to add/subtract it, so we may not need it in a register
n' <- if sign == 0 then pure n else intoRegister n resRep
-- Set up mul2
(shift', qExpr) <- mul2 n'
-- add/subtract n if necessary
let qExpr' = case sign of
1 -> CmmMachOp (MO_Add rep) [qExpr, n']
-1 -> CmmMachOp (MO_Sub rep) [qExpr, n']
_ -> qExpr
qExpr'' <- intoRegister (cmmMachOpFold platform (MO_S_Shr rep) [qExpr', CmmLit $ CmmInt shift' wordRep]) resRep
-- Lastly add the sign of the quotient to correct for negative results
pure $! cmmMachOpFold platform
(MO_Add rep) [qExpr'', cmmMachOpFold platform (MO_U_Shr rep) [qExpr'', CmmLit $ CmmInt (toInteger $ widthInBits rep - 1) wordRep]]
where
resRep = cmmBits rep
wordRep = wordWidth platform
(magic, sign, shift) = divisionMagicS rep divisor
-- generate the multiply with the magic number
mul2 n
-- Using mul2 for sub-word sizes regresses for signed integers only
| rep == wordWidth platform = do
(r1, r2, r3) <- (,,) <$> getUniqueM <*> getUniqueM <*> getUniqueM
let rg1 = LocalReg r1 resRep
resReg = LocalReg r2 resRep
rg3 = LocalReg r3 resRep
res <- CmmReg (CmmLocal resReg) <$ prependNode (CmmUnsafeForeignCall (PrimTarget (MO_S_Mul2 rep)) [rg1, resReg, rg3] [n, CmmLit $ CmmInt magic rep])
pure (shift, res)
-- widen the register and multiply without the MUL2 instruction
-- if we don't need an additional add after this we can combine the shifts
| otherwise = pure (if sign == 0 then 0 else shift, res)
where
wordRep = wordWidth platform
-- (n * magic) >> widthInBits + (if sign == 0 then shift else 0) -- With conversion in between to not overflow
res = cmmMachOpFold platform (MO_SS_Conv wordRep rep)
[ cmmMachOpFold platform (MO_S_Shr wordRep)
[ cmmMachOpFold platform (MO_Mul wordRep)
[ cmmMachOpFold platform (MO_SS_Conv rep wordRep) [n]
, CmmLit $ CmmInt magic wordRep
]
-- Check if we need to generate an add/subtract later. If not we can combine this with the postshift
, CmmLit $ CmmInt ((if sign == 0 then toInteger shift else 0) + (toInteger $ widthInBits rep)) wordRep
]
]
-- See hackers delight for how and why this works (chapter in note [Division by constants])
divisionMagicS :: Width -> Integer -> (Integer, Integer, Integer)
divisionMagicS rep divisor = (magic, sign, toInteger $ p - wSz)
where
sign = if divisor > 0
then if magic < 0 then 1 else 0
else if magic < 0 then 0 else -1
wSz = widthInBits rep
ad = abs divisor
t = (1 `shiftL` (wSz - 1)) + if divisor > 0 then 0 else 1
anc = t - 1 - rem t ad
go p'
| twoP > anc * (ad - rem twoP ad) = p'
| otherwise = go (p' + 1)
where twoP = 1 `shiftL` p'
p = go wSz
am = (twoP + ad - rem twoP ad) `quot` ad
where twoP = 1 `shiftL` p
magic = narrowS rep $ if divisor > 0 then am else -am
generateDivisionByUnsigned :: Platform -> CmmConfig -> Width -> CmmExpr -> Integer -> Opt CmmExpr
-- Sanity checks, division will generate incorrect results or undesirable code for these cases
-- cmmMachOpFoldM and cmmMachOpFoldOptM should have already handled these cases!
generateDivisionByUnsigned _ _ _ _ 0 = panic "generate signed division with 0"
generateDivisionByUnsigned _ _ _ _ 1 = panic "generate signed division with 1"
generateDivisionByUnsigned _ _ _ _ d | Just _ <- exactLog2 d = panic $ "generate signed division with " ++ show d
generateDivisionByUnsigned platform cfg rep n divisor = do
-- We only duplicate n' if we actually need to add/subtract it, so we may not need it in a register
n' <- if not needsAdd -- Invariant: We also never preshift if we need an add, thus we don't need n in a register
then pure $! cmmMachOpFold platform (MO_U_Shr rep) [n, CmmLit $ CmmInt preShift wordRep]
else intoRegister n resRep
-- Set up mul2
(postShift', qExpr) <- mul2 n'
-- add/subtract n if necessary
let qExpr' = if needsAdd
-- This is qExpr + (n - qExpr) / 2 = (qExpr + n) / 2 but with a guarantee that it'll not overflow
then cmmMachOpFold platform (MO_Add rep)
[ cmmMachOpFold platform (MO_U_Shr rep)
[ cmmMachOpFold platform (MO_Sub rep) [n', qExpr]
, CmmLit $ CmmInt 1 wordRep
]
, qExpr
]
else qExpr
-- If we already divided by 2 in the add, remember to shift one bit less
-- Hacker's Delight, Edition 2 Page 234: postShift > 0 if we needed an add, except if the divisor
-- is 1, which we checked for above
finalShift = if needsAdd then postShift' - 1 else postShift'
-- apply the final postShift
pure $! cmmMachOpFold platform (MO_U_Shr rep) [qExpr', CmmLit $ CmmInt finalShift wordRep]
where
resRep = cmmBits rep
wordRep = wordWidth platform
(preShift, magic, needsAdd, postShift) =
let withPre = divisionMagicU rep True divisor
noPre = divisionMagicU rep False divisor
in case (withPre, noPre) of
-- Use whatever does not cause us to take the expensive case
((_, _, False, _), (_, _, True, _)) -> withPre
-- If we cannot avoid the expensive case, don't bother with the pre shift
_ -> noPre
-- generate the multiply with the magic number
mul2 n
| rep == wordWidth platform || (cmmAllowMul2 cfg && needsAdd) = do
(r1, r2) <- (,) <$> getUniqueM <*> getUniqueM
let rg1 = LocalReg r1 resRep
resReg = LocalReg r2 resRep
res <- CmmReg (CmmLocal resReg) <$ prependNode (CmmUnsafeForeignCall (PrimTarget (MO_U_Mul2 rep)) [resReg, rg1] [n, CmmLit $ CmmInt magic rep])
pure (postShift, res)
| otherwise = do
pure (if needsAdd then postShift else 0, res)
where
wordRep = wordWidth platform
-- (n * magic) >> widthInBits + (if sign == 0 then shift else 0) -- With conversion in between to not overflow
res = cmmMachOpFold platform (MO_UU_Conv wordRep rep)
[ cmmMachOpFold platform (MO_U_Shr wordRep)
[ cmmMachOpFold platform (MO_Mul wordRep)
[ cmmMachOpFold platform (MO_UU_Conv rep wordRep) [n]
, CmmLit $ CmmInt magic wordRep
]
-- Check if we need to generate an add later. If not we can combine this with the postshift
, CmmLit $ CmmInt ((if needsAdd then 0 else postShift) + (toInteger $ widthInBits rep)) wordRep
]
]
-- See hackers delight for how and why this works (chapter in note [Division by constants])
-- The preshift isn't described there, but the idea is:
-- If a divisor d has n trailing zeros, then d is a multiple of 2^n. Since we want to divide x by d
-- we can also calculate (x / 2^n) / (d / 2^n) which may then not require an extra addition.
--
-- The addition performs: quotient + dividend, but we need to avoid overflows, so we actually need to
-- calculate: quotient + (dividend - quotient) / 2 = (quotient + dividend) / 2
-- Thus if the preshift can avoid all of this, we have 1 operation in place of 3.
--
-- The decision to use the preshift is made somewhere else, here we only report if the addition is needed
divisionMagicU :: Width -> Bool -> Integer -> (Integer, Integer, Bool, Integer)
divisionMagicU rep doPreShift divisor = (toInteger zeros, magic, needsAdd, toInteger $ p - wSz)
where
wSz = widthInBits rep
zeros = if doPreShift then countTrailingZeros $ fromInteger @Word64 divisor else 0
d = divisor `shiftR` zeros
ones = ((1 `shiftL` wSz) - 1) `shiftR` zeros
nc = ones - rem (ones - d) d
go p'
| twoP > nc * (d - 1 - rem (twoP - 1) d) = p'
| otherwise = go (p' + 1)
where twoP = 1 `shiftL` p'
p = go wSz
m = (twoP + d - 1 - rem (twoP - 1) d) `quot` d
where twoP = 1 `shiftL` p
needsAdd = d < 1 `shiftL` (p - wSz)
magic = if needsAdd then m - (ones + 1) else m
-- -----------------------------------------------------------------------------
-- Opt monad
newtype Opt a = OptI { runOptI :: CmmConfig -> [CmmNode O O] -> UniqDSM ([CmmNode O O], a) }
-- | Pattern synonym for 'Opt', as described in Note [The one-shot state
-- monad trick].
pattern Opt :: (CmmConfig -> [CmmNode O O] -> UniqDSM ([CmmNode O O], a)) -> Opt a
pattern Opt f <- OptI f
where Opt f = OptI . oneShot $ \cfg -> oneShot $ \out -> f cfg out
{-# COMPLETE Opt #-}
runOpt :: CmmConfig -> Opt a -> UniqDSM ([CmmNode O O], a)
runOpt cf (Opt g) = g cf []
getConfig :: Opt CmmConfig
getConfig = Opt $ \cf xs -> pure (xs, cf)
instance Functor Opt where
fmap f (Opt g) = Opt $ \cf xs -> fmap (fmap f) (g cf xs)
instance Applicative Opt where
pure a = Opt $ \_ xs -> pure (xs, a)
ff <*> fa = do
f <- ff
f <$> fa
instance Monad Opt where
Opt g >>= f = Opt $ \cf xs -> do
(ys, a) <- g cf xs
runOptI (f a) cf ys
instance MonadGetUnique Opt where
getUniqueM = Opt $ \_ xs -> (xs,) <$> getUniqueDSM
mapForeignTargetOpt :: (CmmExpr -> Opt CmmExpr) -> ForeignTarget -> Opt ForeignTarget
mapForeignTargetOpt exp (ForeignTarget e c) = flip ForeignTarget c <$> exp e
mapForeignTargetOpt _ m@(PrimTarget _) = pure m
wrapRecExpOpt :: (CmmExpr -> Opt CmmExpr) -> CmmExpr -> Opt CmmExpr
wrapRecExpOpt f (CmmMachOp op es) = traverse (wrapRecExpOpt f) es >>= f . CmmMachOp op
wrapRecExpOpt f (CmmLoad addr ty align) = wrapRecExpOpt f addr >>= \newAddr -> f (CmmLoad newAddr ty align)
wrapRecExpOpt f e = f e
mapExpOpt :: (CmmExpr -> Opt CmmExpr) -> CmmNode e x -> Opt (CmmNode e x)
mapExpOpt _ f@(CmmEntry{}) = pure f
mapExpOpt _ m@(CmmComment _) = pure m
mapExpOpt _ m@(CmmTick _) = pure m
mapExpOpt f (CmmUnwind regs) = CmmUnwind <$> traverse (traverse (traverse f)) regs
mapExpOpt f (CmmAssign r e) = CmmAssign r <$> f e
mapExpOpt f (CmmStore addr e align) = CmmStore <$> f addr <*> f e <*> pure align
mapExpOpt f (CmmUnsafeForeignCall tgt fs as) = CmmUnsafeForeignCall <$> mapForeignTargetOpt f tgt <*> pure fs <*> traverse f as
mapExpOpt _ l@(CmmBranch _) = pure l
mapExpOpt f (CmmCondBranch e ti fi l) = f e >>= \newE -> pure (CmmCondBranch newE ti fi l)
mapExpOpt f (CmmSwitch e ids) = flip CmmSwitch ids <$> f e
mapExpOpt f n@CmmCall {cml_target=tgt} = f tgt >>= \newTgt -> pure n{cml_target = newTgt}
mapExpOpt f (CmmForeignCall tgt fs as succ ret_args updfr intrbl)
= do
newTgt <- mapForeignTargetOpt f tgt
newAs <- traverse f as
pure $ CmmForeignCall newTgt fs newAs succ ret_args updfr intrbl
......@@ -137,12 +137,9 @@ cpsTop logger platform cfg dus proc =
dump Opt_D_dump_cmm_sp "Layout Stack" g
----------- Sink and inline assignments --------------------------------
(g, dus) <- {-# SCC "sink" #-} -- See Note [Sinking after stack layout]
if cmmOptSink cfg
then pure $ runUniqueDSM dus $ cmmSink cfg g
else return (g, dus)
dump Opt_D_dump_cmm_sink "Sink assignments" g
g <- {-# SCC "sink" #-} -- See Note [Sinking after stack layout]
condPass (cmmOptSink cfg) (cmmSink platform) g
Opt_D_dump_cmm_sink "Sink assignments"
------------- CAF analysis ----------------------------------------------
let cafEnv = {-# SCC "cafAnal" #-} cafAnal platform call_pps l g
......
......@@ -20,8 +20,6 @@ import GHC.Platform.Regs
import GHC.Platform
import GHC.Types.Unique.FM
import GHC.Types.Unique.DSM
import GHC.Cmm.Config
import Data.List (partition)
import Data.Maybe
......@@ -152,10 +150,9 @@ type Assignments = [Assignment]
-- y = e2
-- x = e1
cmmSink :: CmmConfig -> CmmGraph -> UniqDSM CmmGraph
cmmSink cfg graph = ofBlockList (g_entry graph) <$> sink mapEmpty blocks
cmmSink :: Platform -> CmmGraph -> CmmGraph
cmmSink platform graph = ofBlockList (g_entry graph) $ sink mapEmpty $ blocks
where
platform = cmmPlatform cfg
liveness = cmmLocalLivenessL platform graph
getLive l = mapFindWithDefault emptyLRegSet l liveness
......@@ -163,41 +160,11 @@ cmmSink cfg graph = ofBlockList (g_entry graph) <$> sink mapEmpty blocks
join_pts = findJoinPoints blocks
sink :: LabelMap Assignments -> [CmmBlock] -> UniqDSM [CmmBlock]
sink _ [] = pure []
sink sunk (b:bs) = do
-- Now sink and inline in this block
(prepend, last_fold) <- runOpt cfg $ constantFoldNode last
(middle', assigs) <- walk cfg (ann_middles ++ annotate platform live_middle prepend) (mapFindWithDefault [] lbl sunk)
let (final_last, assigs') = tryToInline platform live last_fold assigs
-- Now, drop any assignments that we will not sink any further.
(dropped_last, assigs'') = dropAssignments platform drop_if init_live_sets assigs'
drop_if :: (LocalReg, CmmExpr, AbsMem)
-> [LRegSet] -> (Bool, [LRegSet])
drop_if a@(r,rhs,_) live_sets = (should_drop, live_sets')
where
should_drop = conflicts platform a final_last
|| not (isTrivial platform rhs) && live_in_multi live_sets r
|| r `elemLRegSet` live_in_joins
live_sets' | should_drop = live_sets
| otherwise = map upd live_sets
upd set | r `elemLRegSet` set = set `unionLRegSet` live_rhs
| otherwise = set
live_rhs = foldRegsUsed platform (flip insertLRegSet) emptyLRegSet rhs
final_middle = foldl' blockSnoc middle' dropped_last
sunk' = mapUnion sunk $
mapFromList [ (l, filterAssignments platform (getLive l) assigs'')
| l <- succs ]
(blockJoin first final_middle final_last :) <$> sink sunk' bs
sink :: LabelMap Assignments -> [CmmBlock] -> [CmmBlock]
sink _ [] = []
sink sunk (b:bs) =
-- pprTrace "sink" (ppr lbl) $
blockJoin first final_middle final_last : sink sunk' bs
where
lbl = entryLabel b
(first, middle, last) = blockSplit b
......@@ -211,6 +178,11 @@ cmmSink cfg graph = ofBlockList (g_entry graph) <$> sink mapEmpty blocks
live_middle = gen_killL platform last live
ann_middles = annotate platform live_middle (blockToList middle)
-- Now sink and inline in this block
(middle', assigs) = walk platform ann_middles (mapFindWithDefault [] lbl sunk)
fold_last = constantFoldNode platform last
(final_last, assigs') = tryToInline platform live fold_last assigs
-- We cannot sink into join points (successors with more than
-- one predecessor), so identify the join points and the set
-- of registers live in them.
......@@ -228,6 +200,31 @@ cmmSink cfg graph = ofBlockList (g_entry graph) <$> sink mapEmpty blocks
(_one:_two:_) -> True
_ -> False
-- Now, drop any assignments that we will not sink any further.
(dropped_last, assigs'') = dropAssignments platform drop_if init_live_sets assigs'
drop_if :: (LocalReg, CmmExpr, AbsMem)
-> [LRegSet] -> (Bool, [LRegSet])
drop_if a@(r,rhs,_) live_sets = (should_drop, live_sets')
where
should_drop = conflicts platform a final_last
|| not (isTrivial platform rhs) && live_in_multi live_sets r
|| r `elemLRegSet` live_in_joins
live_sets' | should_drop = live_sets
| otherwise = map upd live_sets
upd set | r `elemLRegSet` set = set `unionLRegSet` live_rhs
| otherwise = set
live_rhs = foldRegsUsed platform (flip insertLRegSet) emptyLRegSet rhs
final_middle = foldl' blockSnoc middle' dropped_last
sunk' = mapUnion sunk $
mapFromList [ (l, filterAssignments platform (getLive l) assigs'')
| l <- succs ]
{- TODO: enable this later, when we have some good tests in place to
measure the effect and tune it.
......@@ -302,7 +299,7 @@ filterAssignments platform live assigs = reverse (go assigs [])
-- * a list of assignments that will be placed *after* that block.
--
walk :: CmmConfig
walk :: Platform
-> [(LRegSet, CmmNode O O)] -- nodes of the block, annotated with
-- the set of registers live *after*
-- this node.
......@@ -312,39 +309,36 @@ walk :: CmmConfig
-- Earlier assignments may refer
-- to later ones.
-> UniqDSM ( Block CmmNode O O -- The new block
, Assignments -- Assignments to sink further
)
-> ( Block CmmNode O O -- The new block
, Assignments -- Assignments to sink further
)
walk cfg nodes assigs = go nodes emptyBlock assigs
walk platform nodes assigs = go nodes emptyBlock assigs
where
platform = cmmPlatform cfg
go [] block as = pure (block, as)
go [] block as = (block, as)
go ((live,node):ns) block as
-- discard nodes representing dead assignment
| shouldDiscard node live = go ns block as
| otherwise = do
(prepend, node1) <- runOpt cfg $ constantFoldNode node
if not (null prepend)
then go (annotate platform live (prepend ++ [node1]) ++ ns) block as
else do
let -- Inline assignments
(node2, as1) = tryToInline platform live node1 as
-- Drop any earlier assignments conflicting with node2
(dropped, as') = dropAssignmentsSimple platform
(\a -> conflicts platform a node2) as1
-- Walk over the rest of the block. Includes dropped assignments
block' = foldl' blockSnoc block dropped `blockSnoc` node2
(prepend2, node3) <- runOpt cfg $ constantFoldNode node2
if | not (null prepend2) -> go (annotate platform live (prepend2 ++ [node3]) ++ ns) block as
-- sometimes only after simplification we can tell we can discard the node.
-- See Note [Discard simplified nodes]
| noOpAssignment node3 -> go ns block as
-- Pick up interesting assignments
| Just a <- shouldSink platform node3 -> go ns block (a : as1)
-- Try inlining, drop assignments and move on
| otherwise -> go ns block' as'
-- sometimes only after simplification we can tell we can discard the node.
-- See Note [Discard simplified nodes]
| noOpAssignment node2 = go ns block as
-- Pick up interesting assignments
| Just a <- shouldSink platform node2 = go ns block (a : as1)
-- Try inlining, drop assignments and move on
| otherwise = go ns block' as'
where
-- Simplify node
node1 = constantFoldNode platform node
-- Inline assignments
(node2, as1) = tryToInline platform live node1 as
-- Drop any earlier assignments conflicting with node2
(dropped, as') = dropAssignmentsSimple platform
(\a -> conflicts platform a node2) as1
-- Walk over the rest of the block. Includes dropped assignments
block' = foldl' blockSnoc block dropped `blockSnoc` node2
{- Note [Discard simplified nodes]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
......
......@@ -209,8 +209,9 @@ tsanTarget fn formals args =
tsanStore :: Env
-> CmmType -> CmmExpr
-> Block CmmNode O O
tsanStore env ty addr =
mkUnsafeCall env ftarget [] [addr]
tsanStore env ty addr
| typeWidth ty < W128 = mkUnsafeCall env ftarget [] [addr]
| otherwise = emptyBlock
where
ftarget = tsanTarget fn [] [AddrHint]
w = widthInBytes (typeWidth ty)
......@@ -219,8 +220,9 @@ tsanStore env ty addr =
tsanLoad :: Env
-> AlignmentSpec -> CmmType -> CmmExpr
-> Block CmmNode O O
tsanLoad env align ty addr =
mkUnsafeCall env ftarget [] [addr]
tsanLoad env align ty addr
| typeWidth ty < W128 = mkUnsafeCall env ftarget [] [addr]
| otherwise = emptyBlock
where
ftarget = tsanTarget fn [] [AddrHint]
w = widthInBytes (typeWidth ty)
......
......@@ -120,6 +120,11 @@ avxEnabled = do
config <- getConfig
return (ncgAvxEnabled config)
avx2Enabled :: NatM Bool
avx2Enabled = do
config <- getConfig
return (ncgAvx2Enabled config)
cmmTopCodeGen
:: RawCmmDecl
-> NatM [NatCmmDecl (Alignment, RawCmmStatics) Instr]
......@@ -972,6 +977,7 @@ getRegister' _ _ (CmmMachOp mop []) =
getRegister' platform is32Bit (CmmMachOp mop [x]) = do -- unary MachOps
avx <- avxEnabled
avx2 <- avx2Enabled
case mop of
MO_F_Neg w -> sse2NegCode w x
......@@ -1069,7 +1075,14 @@ getRegister' platform is32Bit (CmmMachOp mop [x]) = do -- unary MachOps
| otherwise
-> vector_float_broadcast_sse l w x
MO_V_Broadcast l w
-> vector_int_broadcast l w x
| avx2, l * widthInBits w `elem` [128, 256] -- AVX-512 is not supported for now
-> vector_int_broadcast_avx2 l w x
MO_V_Broadcast 16 W8 -> vector_int8x16_broadcast x
MO_V_Broadcast 8 W16 -> vector_int16x8_broadcast x
MO_V_Broadcast 4 W32 -> vector_int32x4_broadcast x
MO_V_Broadcast 2 W64 -> vector_int64x2_broadcast x
MO_V_Broadcast {}
-> pprPanic "Unsupported integer vector broadcast operation for: " (pdoc platform x)
-- Binary MachOps
MO_Add {} -> incorrectOperands
......@@ -1234,29 +1247,70 @@ getRegister' platform is32Bit (CmmMachOp mop [x]) = do -- unary MachOps
code = SHUF fmt (ImmInt 0) (OpReg dst) dst
return $ Fixed fmt dst (exp `snocOL` code)
vector_int_broadcast :: Length
-> Width
-> CmmExpr
-> NatM Register
vector_int_broadcast len W64 expr = do
vector_int_broadcast_avx2 :: Length
-> Width
-> CmmExpr
-> NatM Register
vector_int_broadcast_avx2 len w expr = do
(reg, exp) <- getNonClobberedReg expr
let (movFormat, fmt) = case w of
W8 -> (II32, VecFormat len FmtInt8)
W16 -> (II32, VecFormat len FmtInt16)
W32 -> (II32, VecFormat len FmtInt32)
W64 -> (II64, VecFormat len FmtInt64)
_ -> pprPanic "Broadcast not supported for: " (pdoc platform expr)
code dst = exp `snocOL`
-- VPBROADCAST from GPR requires AVX-512,
-- so we use an additional MOVD.
(MOVD movFormat (OpReg reg) (OpReg dst)) `snocOL`
(VPBROADCAST fmt fmt (OpReg dst) dst)
return $ Any fmt code
vector_int8x16_broadcast :: CmmExpr
-> NatM Register
vector_int8x16_broadcast expr = do
(reg, exp) <- getNonClobberedReg expr
let fmt = VecFormat len FmtInt64
let fmt = VecFormat 16 FmtInt8
return $ Any fmt (\dst -> exp `snocOL`
(MOVD II64 (OpReg reg) (OpReg dst)) `snocOL`
(PUNPCKLQDQ fmt (OpReg dst) dst)
(MOVD II32 (OpReg reg) (OpReg dst)) `snocOL`
(PUNPCKLBW fmt (OpReg dst) dst) `snocOL`
(PUNPCKLWD (VecFormat 8 FmtInt16) (OpReg dst) dst) `snocOL`
(PSHUFD fmt (ImmInt 0x00) (OpReg dst) dst)
)
vector_int_broadcast len W32 expr = do
vector_int16x8_broadcast :: CmmExpr
-> NatM Register
vector_int16x8_broadcast expr = do
(reg, exp) <- getNonClobberedReg expr
let fmt = VecFormat len FmtInt32
let fmt = VecFormat 8 FmtInt16
return $ Any fmt (\dst -> exp `snocOL`
(MOVD II32 (OpReg reg) (OpReg dst)) `snocOL`
(PUNPCKLWD fmt (OpReg dst) dst) `snocOL`
(PSHUFD fmt (ImmInt 0x00) (OpReg dst) dst)
)
vector_int_broadcast _ _ _ =
sorry "Unsupported Integer vector broadcast operation; please use -fllvm."
vector_int32x4_broadcast :: CmmExpr
-> NatM Register
vector_int32x4_broadcast expr = do
(reg, exp) <- getNonClobberedReg expr
let fmt = VecFormat 4 FmtInt32
return $ Any fmt (\dst -> exp `snocOL`
(MOVD II32 (OpReg reg) (OpReg dst)) `snocOL`
(PSHUFD fmt (ImmInt 0x00) (OpReg dst) dst)
)
vector_int64x2_broadcast :: CmmExpr
-> NatM Register
vector_int64x2_broadcast expr = do
(reg, exp) <- getNonClobberedReg expr
let fmt = VecFormat 2 FmtInt64
return $ Any fmt (\dst -> exp `snocOL`
(MOVD II64 (OpReg reg) (OpReg dst)) `snocOL`
(PUNPCKLQDQ fmt (OpReg dst) dst)
)
getRegister' platform is32Bit (CmmMachOp mop [x, y]) = do -- dyadic MachOps
sse4_1 <- sse4_1Enabled
avx <- avxEnabled
case mop of
MO_F_Eq _ -> condFltReg is32Bit EQQ x y
......@@ -1327,8 +1381,14 @@ getRegister' platform is32Bit (CmmMachOp mop [x, y]) = do -- dyadic MachOps
MO_VF_Extract l W64 -> vector_float_extract l W64 x y
MO_VF_Extract {} -> incorrectOperands
MO_V_Extract l W64 -> vector_int_extract_sse l W64 x y
-- SIMD NCG TODO: W32, W16, W8
MO_V_Extract 16 W8 | sse4_1 -> vector_int_extract_pextr 16 W8 x y
| otherwise -> vector_int8x16_extract_sse2 x y
MO_V_Extract 8 W16 -> vector_int_extract_pextr 8 W16 x y -- PEXTRW (SSE2)
MO_V_Extract 4 W32 | sse4_1 -> vector_int_extract_pextr 4 W32 x y
| otherwise -> vector_int32x4_extract_sse2 x y
MO_V_Extract 2 W64 | sse4_1 -> vector_int_extract_pextr 2 W64 x y
| otherwise -> vector_int64x2_extract_sse2 x y
-- SIMD NCG TODO: 256/512-bit vector
MO_V_Extract {} -> needLlvm mop
MO_VF_Add l w | avx -> vector_float_op_avx VA_Add l w x y
......@@ -1680,27 +1740,92 @@ getRegister' platform is32Bit (CmmMachOp mop [x, y]) = do -- dyadic MachOps
= pprPanic "Unsupported SSE floating-point vector extract" (pdoc platform c $$ pdoc platform e $$ ppr w)
-----------------------
vector_int_extract_sse :: Length
-> Width
-> CmmExpr
-> CmmExpr
-> NatM Register
vector_int_extract_sse l@2 W64 expr (CmmLit lit)
-- PEXTRW ("to GPR" variant) is an SSE2 instruction,
-- whereas PEXTR{B,D,Q} and PEXTRW ("to memory" variant) require SSE4.1.
vector_int_extract_pextr :: Length
-> Width
-> CmmExpr
-> CmmExpr
-> NatM Register
vector_int_extract_pextr l w expr (CmmLit (CmmInt i _))
| 0 <= i, i < toInteger l
= do
(r, exp) <- getSomeReg expr -- vector registers are never clobbered by an instruction
let (scalarFormat, vectorFormat) = case w of
W8 -> (II32, VecFormat l FmtInt8)
W16 -> (II32, VecFormat l FmtInt16)
W32 -> (II32, VecFormat l FmtInt32)
W64 -> (II64, VecFormat l FmtInt64)
_ -> sorry "Unsupported vector format"
code dst = exp `snocOL`
(PEXTR scalarFormat vectorFormat (ImmInteger i) r (OpReg dst))
return (Any scalarFormat code)
vector_int_extract_pextr _ _ _ i
= pprPanic "Unsupported offset" (pdoc platform i)
vector_int8x16_extract_sse2 :: CmmExpr
-> CmmExpr
-> NatM Register
vector_int8x16_extract_sse2 expr (CmmLit (CmmInt i _))
| 0 <= i, i < 16
= do
(r, exp) <- getSomeReg expr
let code dst =
case i `quotRem` 2 of
(j, 0) -> exp `snocOL`
(PEXTR II32 (VecFormat 8 FmtInt16) (ImmInteger j) r (OpReg dst)) -- PEXTRW
(j, _) -> exp `snocOL`
(PEXTR II32 (VecFormat 8 FmtInt16) (ImmInteger j) r (OpReg dst)) `snocOL` -- PEXTRW
(SHR II32 (OpImm (ImmInt 8)) (OpReg dst))
return (Any II8 code)
vector_int8x16_extract_sse2 _ offset
= pprPanic "Unsupported offset" (pdoc platform offset)
vector_int32x4_extract_sse2 :: CmmExpr
-> CmmExpr
-> NatM Register
vector_int32x4_extract_sse2 expr (CmmLit (CmmInt i _))
| 0 <= i, i < 4
= do
(r, exp) <- getSomeReg expr
let fmt = VecFormat 4 FmtInt32
tmp <- getNewRegNat fmt
let code dst =
case i of
0 -> exp `snocOL`
(MOVD FF32 (OpReg r) (OpReg dst))
1 -> exp `snocOL`
(PSHUFD fmt (ImmInt 0b01_01_01_01) (OpReg r) tmp) `snocOL` -- tmp <- (r[1],r[1],r[1],r[1])
(MOVD FF32 (OpReg tmp) (OpReg dst))
2 -> exp `snocOL`
(PSHUFD fmt (ImmInt 0b11_10_11_10) (OpReg r) tmp) `snocOL` -- tmp <- (r[2],r[3],r[2],r[3])
(MOVD FF32 (OpReg tmp) (OpReg dst))
_ -> exp `snocOL`
(PSHUFD fmt (ImmInt 0b11_11_11_11) (OpReg r) tmp) `snocOL` -- tmp <- (r[3],r[3],r[3],r[3])
(MOVD FF32 (OpReg tmp) (OpReg dst))
return (Any II32 code)
vector_int32x4_extract_sse2 _ offset
= pprPanic "Unsupported offset" (pdoc platform offset)
vector_int64x2_extract_sse2 :: CmmExpr
-> CmmExpr
-> NatM Register
vector_int64x2_extract_sse2 expr (CmmLit lit)
= do
(r, exp) <- getSomeReg expr
let fmt = VecFormat l FmtInt64
let fmt = VecFormat 2 FmtInt64
tmp <- getNewRegNat fmt
let code dst =
case lit of
CmmInt 0 _ -> exp `snocOL`
(MOVD II64 (OpReg r) (OpReg dst))
(MOVD FF64 (OpReg r) (OpReg dst))
CmmInt 1 _ -> exp `snocOL`
(MOVHLPS fmt r tmp) `snocOL`
(MOVD II64 (OpReg tmp) (OpReg dst))
(MOVD FF64 (OpReg tmp) (OpReg dst))
_ -> panic "Error in offset while unpacking"
return (Any II64 code)
vector_int_extract_sse _ w c e
= pprPanic "Unsupported SSE floating-point vector extract" (pdoc platform c $$ pdoc platform e $$ ppr w)
vector_int64x2_extract_sse2 _ offset
= pprPanic "Unsupported offset" (pdoc platform offset)
vector_shuffle_float :: Length -> Width -> CmmExpr -> CmmExpr -> [Int] -> NatM Register
vector_shuffle_float l w v1 v2 is = do
......@@ -1799,7 +1924,14 @@ getRegister' platform _is32Bit (CmmMachOp mop [x, y, z]) = do -- ternary MachOps
--
-- - add support for FloatX8, FloatX16.
MO_VF_Insert l W64 -> vector_double_insert avx l x y z
MO_V_Insert l W64 -> vector_int_insert_sse l W64 x y z
MO_V_Insert 16 W8 | sse4_1 -> vector_int_insert_pinsr 16 W8 x y z
| otherwise -> vector_int8x16_insert_sse2 x y z
MO_V_Insert 8 W16 -> vector_int_insert_pinsr 8 W16 x y z -- PINSRW (SSE2)
MO_V_Insert 4 W32 | sse4_1 -> vector_int_insert_pinsr 4 W32 x y z
| otherwise -> vector_int32x4_insert_sse2 x y z
MO_V_Insert 2 W64 | sse4_1 -> vector_int_insert_pinsr 2 W64 x y z
| otherwise -> vector_int64x2_insert_sse2 x y z
MO_V_Insert _ _ -> sorry "Unsupported integer vector insert operation; please use -fllvm"
_other -> pprPanic "getRegister(x86) - ternary CmmMachOp (1)"
(pprMachOp mop)
......@@ -1894,20 +2026,113 @@ getRegister' platform _is32Bit (CmmMachOp mop [x, y, z]) = do -- ternary MachOps
-- SIMD NCG TODO:
--
-- - only supports Int64X2, add support for everything else:
-- (Int32X{4,2}, Int16X{8,4,2}, Int8X{16,8,4,2})
vector_int_insert_sse :: HasCallStack => Length
-> Width
-> CmmExpr
-> CmmExpr
-> CmmExpr
-> NatM Register
-- Int64X2
vector_int_insert_sse len@2 W64 vecExpr valExpr (CmmLit offset)
-- - only supports 128-bit vector types (Int64X2, Int32X4, Int16X8, Int8X16),
-- add support for 256-bit and 512-bit vector types.
-- PINSRW is an SSE2 instruction, whereas PINSR{B,D,Q} require SSE4.1.
vector_int_insert_pinsr :: HasCallStack => Length
-> Width
-> CmmExpr
-> CmmExpr
-> CmmExpr
-> NatM Register
vector_int_insert_pinsr len w vecExpr valExpr (CmmLit (CmmInt offset _))
| 0 <= offset, offset < toInteger len
= do
(valReg, valExp) <- getNonClobberedReg valExpr
vecCode <- getAnyReg vecExpr
let (scalarFormat, vectorFormat) = case w of
W8 -> (II32, VecFormat len FmtInt8)
W16 -> (II32, VecFormat len FmtInt16)
W32 -> (II32, VecFormat len FmtInt32)
W64 -> (II64, VecFormat len FmtInt64)
_ -> sorry "Unsupported vector format"
code dst = valExp `appOL`
(vecCode dst) `snocOL`
(PINSR scalarFormat vectorFormat (ImmInteger offset) (OpReg valReg) dst)
return $ Any vectorFormat code
vector_int_insert_pinsr _ _ _ _ offset = pprPanic "MO_V_Insert: unsupported offset" (pdoc platform offset)
vector_int8x16_insert_sse2 :: CmmExpr
-> CmmExpr
-> CmmExpr
-> NatM Register
vector_int8x16_insert_sse2 vecExpr valExpr (CmmLit (CmmInt offset _))
| 0 <= offset, offset < 16
= do
(valReg, valExp) <- getNonClobberedReg valExpr
vecCode <- getAnyReg vecExpr
tmp <- getNewRegNat II32
let vectorFormat = VecFormat 16 FmtInt8
code dst
= case offset `quotRem` 2 of
(j, 0) -> valExp `appOL`
(vecCode dst) `snocOL`
(PEXTR II32 (VecFormat 8 FmtInt16) (ImmInteger j) dst (OpReg tmp)) `snocOL` -- PEXTRW
(AND II32 (OpImm (ImmInt 0xff00)) (OpReg tmp)) `snocOL`
(MOVZxL II8 (OpReg valReg) (OpReg valReg)) `snocOL`
(OR II32 (OpReg valReg) (OpReg tmp)) `snocOL`
(PINSR II32 (VecFormat 8 FmtInt16) (ImmInteger j) (OpReg tmp) dst) -- PINSRW
(j, _) -> valExp `appOL`
(vecCode dst) `snocOL`
(PEXTR II32 (VecFormat 8 FmtInt16) (ImmInteger j) dst (OpReg tmp)) `snocOL` -- PEXTRW
(MOVZxL II8 (OpReg tmp) (OpReg tmp)) `snocOL`
(SHL II32 (OpImm (ImmInt 8)) (OpReg valReg)) `snocOL`
(OR II32 (OpReg valReg) (OpReg tmp)) `snocOL`
(PINSR II32 (VecFormat 8 FmtInt16) (ImmInteger j) (OpReg tmp) dst) -- PINSRW
return $ Any vectorFormat code
vector_int8x16_insert_sse2 _ _ offset = pprPanic "MO_V_Insert: unsupported offset" (pdoc platform offset)
vector_int32x4_insert_sse2 :: CmmExpr
-> CmmExpr
-> CmmExpr
-> NatM Register
vector_int32x4_insert_sse2 vecExpr valExpr (CmmLit (CmmInt offset _))
| 0 <= offset, offset < 4
= do
(valReg, valExp) <- getNonClobberedReg valExpr
vecCode <- getAnyReg vecExpr
-- Since SSE2 does not have an integer vector instruction to achieve this,
-- we are forced to either use floating-point vector instructions
-- or lots of integer vector instructions. (sigh)
let floatVectorFormat = VecFormat 4 FmtFloat
tmp1 <- getNewRegNat floatVectorFormat
tmp2 <- getNewRegNat floatVectorFormat
let vectorFormat = VecFormat 4 FmtInt32
code dst
= case offset of
0 -> valExp `appOL`
(vecCode dst) `snocOL`
(MOVD II32 (OpReg valReg) (OpReg tmp1)) `snocOL`
(MOV floatVectorFormat (OpReg tmp1) (OpReg dst)) -- MOVSS; dst <- (tmp1[0],dst[1],dst[2],dst[3])
1 -> valExp `appOL`
(vecCode tmp1) `snocOL`
(MOVD II32 (OpReg valReg) (OpReg dst)) `snocOL` -- dst <- (val,0,0,0)
(PUNPCKLQDQ vectorFormat (OpReg tmp1) dst) `snocOL` -- dst <- (dst[0],dst[1],tmp1[0],tmp1[1])
(SHUF floatVectorFormat (ImmInt 0b11_10_00_10) (OpReg tmp1) dst) -- SHUFPS; dst <- (dst[2],dst[0],tmp1[2],tmp1[3])
2 -> valExp `appOL`
(vecCode dst) `snocOL`
(MOVD II32 (OpReg valReg) (OpReg tmp1)) `snocOL` -- tmp1 <- (val,0,0,0)
(MOVU floatVectorFormat (OpReg dst) (OpReg tmp2)) `snocOL` -- MOVUPS; tmp2 <- dst
(SHUF floatVectorFormat (ImmInt 0b01_00_01_11) (OpReg tmp1) tmp2) `snocOL` -- SHUFPS; tmp2 <- (tmp2[3],tmp2[1],tmp1[0],tmp1[1])
(SHUF floatVectorFormat (ImmInt 0b00_10_01_00) (OpReg tmp2) dst) -- SHUFPS; dst <- (dst[0],dst[1],tmp2[2],tmp2[0])
_ -> valExp `appOL`
(vecCode dst) `snocOL`
(MOVD II32 (OpReg valReg) (OpReg tmp1)) `snocOL` -- tmp1 <- (val,0,0,0)
(SHUF floatVectorFormat (ImmInt 0b11_10_01_00) (OpReg dst) tmp1) `snocOL` -- SHUFPS; tmp1 <- (tmp1[0],tmp1[1],dst[2],dst[3])
(SHUF floatVectorFormat (ImmInt 0b00_10_01_00) (OpReg tmp1) dst) -- SHUFPS; dst <- (dst[0],dst[1],tmp1[2],tmp1[0])
return $ Any vectorFormat code
vector_int32x4_insert_sse2 _ _ offset = pprPanic "MO_V_Insert: unsupported offset" (pdoc platform offset)
vector_int64x2_insert_sse2 :: CmmExpr
-> CmmExpr
-> CmmExpr
-> NatM Register
vector_int64x2_insert_sse2 vecExpr valExpr (CmmLit offset)
= do
(valReg, valExp) <- getNonClobberedReg valExpr
(vecReg, vecExp) <- getSomeReg vecExpr -- NB: vector regs never clobbered by instruction
let fmt = VecFormat len FmtInt64
let fmt = VecFormat 2 FmtInt64
tmp <- getNewRegNat fmt
let code dst
= case offset of
......@@ -1923,8 +2148,7 @@ getRegister' platform _is32Bit (CmmMachOp mop [x, y, z]) = do -- ternary MachOps
(PUNPCKLQDQ fmt (OpReg tmp) dst)
_ -> pprPanic "MO_V_Insert Int64X2: unsupported offset" (ppr offset)
in return $ Any fmt code
vector_int_insert_sse _ _ _ _ _ =
sorry "Unsupported integer vector insert operation; please use -fllvm"
vector_int64x2_insert_sse2 _ _ offset = pprPanic "MO_V_Insert Int64X2: unsupported offset" (pdoc platform offset)
getRegister' _ _ (CmmMachOp mop (_:_:_:_:_)) =
pprPanic "getRegister(x86): MachOp with >= 4 arguments" (text $ show mop)
......
......@@ -288,8 +288,11 @@ data Instr
-- NOTE: Instructions follow the AT&T syntax
-- Constructors and deconstructors
| VBROADCAST Format Operand Reg
| VPBROADCAST Format Format Operand Reg -- scalar format, vector format, source, destination
| VEXTRACT Format Imm Reg Operand
| INSERTPS Format Imm Operand Reg
| PINSR Format Format Imm Operand Reg -- scalar format, vector format, offset, scalar src, vector
| PEXTR Format Format Imm Reg Operand -- scalar format, vector format, offset, vector src, scalar dst
-- move operations
......@@ -327,6 +330,8 @@ data Instr
| MOVHLPS Format Reg Reg
| UNPCKL Format Operand Reg
| PUNPCKLQDQ Format Operand Reg
| PUNPCKLWD Format Operand Reg
| PUNPCKLBW Format Operand Reg
-- Shift
| PSLLDQ Format Operand Reg
......@@ -476,6 +481,7 @@ regUsageOfInstr platform instr
-- vector instructions
VBROADCAST fmt src dst -> mkRU (use_R fmt src []) [mk fmt dst]
VPBROADCAST sFmt vFmt src dst -> mkRU (use_R sFmt src []) [mk vFmt dst]
VEXTRACT fmt _off src dst -> usageRW fmt (OpReg src) dst
INSERTPS fmt (ImmInt off) src dst
-> mkRU ((use_R fmt src []) ++ [mk fmt dst | not doesNotReadDst]) [mk fmt dst]
......@@ -488,6 +494,10 @@ regUsageOfInstr platform instr
where pos = ( off `shiftR` 4 ) .&. 0b11
INSERTPS fmt _off src dst
-> mkRU ((use_R fmt src []) ++ [mk fmt dst]) [mk fmt dst]
PINSR sFmt vFmt _off src dst
-> mkRU (use_R sFmt src [mk vFmt dst]) [mk vFmt dst]
PEXTR sFmt vFmt _off src dst
-> usageRW' vFmt sFmt (OpReg src) dst
VMOVU fmt src dst -> usageRW fmt src dst
MOVU fmt src dst -> usageRW fmt src dst
......@@ -530,6 +540,10 @@ regUsageOfInstr platform instr
-> mkRU (use_R fmt src [mk fmt dst]) [mk fmt dst]
PUNPCKLQDQ fmt src dst
-> mkRU (use_R fmt src [mk fmt dst]) [mk fmt dst]
PUNPCKLWD fmt src dst
-> mkRU (use_R fmt src [mk fmt dst]) [mk fmt dst]
PUNPCKLBW fmt src dst
-> mkRU (use_R fmt src [mk fmt dst]) [mk fmt dst]
MINMAX _ _ fmt src dst
-> usageRM fmt src dst
......@@ -739,10 +753,16 @@ patchRegsOfInstr platform instr env
-- vector instructions
VBROADCAST fmt src dst -> VBROADCAST fmt (patchOp src) (env dst)
VPBROADCAST fmt1 fmt2 src dst
-> VPBROADCAST fmt1 fmt2 (patchOp src) (env dst)
VEXTRACT fmt off src dst
-> VEXTRACT fmt off (env src) (patchOp dst)
INSERTPS fmt off src dst
-> INSERTPS fmt off (patchOp src) (env dst)
PINSR fmt1 fmt2 off src dst
-> PINSR fmt1 fmt2 off (patchOp src) (env dst)
PEXTR fmt1 fmt2 off src dst
-> PEXTR fmt1 fmt2 off (env src) (patchOp dst)
VMOVU fmt src dst -> VMOVU fmt (patchOp src) (patchOp dst)
MOVU fmt src dst -> MOVU fmt (patchOp src) (patchOp dst)
......@@ -779,6 +799,10 @@ patchRegsOfInstr platform instr env
-> UNPCKL fmt (patchOp src) (env dst)
PUNPCKLQDQ fmt src dst
-> PUNPCKLQDQ fmt (patchOp src) (env dst)
PUNPCKLWD fmt src dst
-> PUNPCKLWD fmt (patchOp src) (env dst)
PUNPCKLBW fmt src dst
-> PUNPCKLBW fmt (patchOp src) (env dst)
MINMAX minMax ty fmt src dst
-> MINMAX minMax ty fmt (patchOp src) (patchOp dst)
......