DPH Matrix product memory usage

This report follows up on the Haskell-cafe post "DPH matrix product". I'm filing it here so the developers can decide whether it is a bug and what priority it should have.

With what I believe is a standard implementation of matrix product in DPH, I see excessive use of system memory. At run time, with matrices of size 300*300 the program does finish (although it is very slow), but with 600*600 it consumes gigabytes of RAM until the process is aborted.

This is the system information:

Ubuntu 12.04 32-bit
Intel® Core™2 Duo CPU T5270 @ 1.40GHz × 2
2.9 GiB RAM

GHC version:

GHC 7.4.1

DPH libraries:

dph-base-0.6.1.1
(dph-lifted-base-0.6.1.1)
(dph-lifted-vseg-0.6.1.2)
(dph-prim-interface-0.6.1.1)
(dph-prim-par-0.6.1.1)
(dph-prim-seq-0.6.1.1)

Compilation flags:

I'm using two combinations of flags, taken from different sources. In both cases results are identical:

From https://github.com/ghc/packages-dph: -rtsopts -threaded -fllvm -optlo-O3 -Odph -fcpr-off -fno-liberate-case -package dph-lifted-vseg
From dph-examples: -rtsopts -threaded -fllvm -Odph -package dph-lifted-vseg -fcpr-off -fno-liberate-case -fsimpl-tick-factor=1000

Execution flags:

+RTS -N

Tests:

Computing the product of two 400*400 matrices takes 6.037993 seconds.
Computing the product of two 600*600 matrices yields "out of memory (requested 1728053248 bytes)".
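A quick sanity check on the failed allocation, as a sketch in plain Haskell (no DPH needed). The requested size is within 52 KiB of 600^3 * 8 bytes, which is the space for one Double per scalar multiplication in a 600*600 product. This suggests, but does not prove, that some intermediate array grows as n^3 rather than n^2. The names cubeBytes and requestedBytes are made up for this illustration.

```haskell
-- Bytes needed to hold n*n*n unboxed Doubles, at 8 bytes per Double.
-- Assumption: this models one intermediate array with an element per
-- scalar multiplication; it is not taken from the DPH sources.
cubeBytes :: Integer -> Integer
cubeBytes n = n * n * n * 8

-- The request size quoted in the out-of-memory message above.
requestedBytes :: Integer
requestedBytes = 1728053248

-- How far the failed request is from the n^3 estimate for n = 600.
slack :: Integer
slack = requestedBytes - cubeBytes 600
```

By the same formula, a 400*400 product would need about 512 MB for such an array, which fits in the 2.9 GiB of RAM on this machine and would be consistent with that case finishing.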

DPH code:

{-# LANGUAGE ParallelArrays, ParallelListComp #-}
{-# OPTIONS -fvectorise #-}

module DPH_mmult_wrapper (matMult_wrapper, Matrix_wrapper) where

import qualified Prelude
import Data.Array.Parallel
import Data.Array.Parallel.Prelude.Double as D
import Data.Array.Parallel.Prelude.Int as I

type MMultType = Double
type Matrix = [:[:MMultType:]:]
type MVector = [:MMultType:]
type Matrix_wrapper = PArray (PArray MMultType)

-- matMult_wrapper assumes mB is already transposed
{-# NOINLINE matMult_wrapper #-}
matMult_wrapper :: Matrix_wrapper -> Matrix_wrapper -> Matrix_wrapper
matMult_wrapper mA mB = toPArrayP (mapP toPArrayP (matMult (fromNestedPArrayP mA) (fromNestedPArrayP mB)))

matMult :: Matrix -> Matrix -> Matrix
matMult mA mB = mapP (\row -> mapP (\col -> dotp row col) mB) mA

dotp :: MVector -> MVector -> MMultType
dotp row col = D.sumP (zipWithP (D.*) row col)
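For cross-checking results on small inputs, here is a sequential reference with the same structure as matMult above, written with ordinary lists instead of parallel arrays. It is only a sketch for comparison, not part of the DPH program. The names matMultT and matMultRef are hypothetical.

```haskell
import Data.List (transpose)

type Matrix = [[Double]]

-- Dot product of two equal-length vectors.
dotp :: [Double] -> [Double] -> Double
dotp row col = sum (zipWith (*) row col)

-- Same shape as the DPH matMult: the second argument must already be
-- transposed, so each of its rows is a column of the original matrix.
matMultT :: Matrix -> Matrix -> Matrix
matMultT mA mBT = map (\row -> map (\col -> dotp row col) mBT) mA

-- Convenience wrapper that transposes the second matrix first.
matMultRef :: Matrix -> Matrix -> Matrix
matMultRef mA mB = matMultT mA (transpose mB)
```

In GHCi, matMultRef [[1,2],[3,4]] [[5,6],[7,8]] can be compared against the DPH result for the same input (with the second matrix transposed before it is passed to matMult_wrapper).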

I'm reporting this because I think it is the kind of problem that the latest design of the internal DPH structure (the one from the "Work Efficient Higher-Order Vectorisation" paper) is intended to solve.
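To make the suspected cost concrete, here is a toy model in plain lists of a flattened evaluation that materialises every scalar product before summing. This is an assumption about the kind of behaviour involved, not a description of what dph-lifted-vseg actually does. The names flatProducts and segSum are invented for the sketch.

```haskell
-- One flat list holding every scalar product, laid out as consecutive
-- segments, one segment per (row, column) pair. For two n*n matrices
-- this list has n*n*n elements. The second matrix is taken as already
-- transposed, as in matMult above.
flatProducts :: [[Double]] -> [[Double]] -> [Double]
flatProducts mA mBT =
  [ x * y | row <- mA, col <- mBT, (x, y) <- zip row col ]

-- Segmented sum: add up consecutive segments of length k.
segSum :: Int -> [Double] -> [Double]
segSum k xs
  | k <= 0 || null xs = []
  | otherwise         = let (seg, rest) = splitAt k xs
                        in  sum seg : segSum k rest
```

For n*n inputs, segSum n (flatProducts mA mBT) gives the n*n result entries in row-major order, while the intermediate list is n times larger than the result. A strict, unboxed version of that intermediate would need n^3 * 8 bytes.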

If any information is missing, please comment and I will update the report.

Thanks.

Edited Mar 09, 2019 by Ian Lynagh
