
System.IO.openTempFile does not scale

While hunting a bug in darcs (http://bugs.darcs.net/issue2364) I noticed a very bad property of openTempFile: its naming pattern is very predictable, and the number of attempts needed to create temp files grows as O(n^2) in the number of already created temp files.

Predictability lets very fun bugs survive in buggy programs, such as:

  thread1:
    (fn, fh) <- openTempFile "." "hello"
    renameFile fn "something"
    -- some time after
    when (some_rare_buggy_condition) $
        -- oops, reused the temp name, but too late: the other thread owns it now
        writeFile fn newContents
  thread2:
    (fn, fh) <- openTempFile "." "hello"
    workWithFn fn -- nobody should touch it, right?

It's very hard to debug data corruption when all temp files are named "foo{pid}" and sometimes "foo{pid+1}".

A more serious bug: the more threads try to create temp files from the same template, the more performance drops.

The attached program shows the following numbers:

$ time ./bench-temps same 2000

real    0m2.795s
user    0m1.516s
sys     0m1.190s

$ time ./bench-temps diff 2000

real    0m0.161s
user    0m0.043s
sys     0m0.115s

It's an open() storm that grows as O(N^2).
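The quadratic growth can be illustrated with a small model of the retry loop. This is only a sketch under stated assumptions, not the library code: every call starts probing at the same suffix (the hypothetical `start`, standing in for the pid), none of the earlier files has been removed, and `probe` and `totalAttempts` are names invented for this example. Under those assumptions the k-th file needs k attempts, so n files need n(n+1)/2 attempts in total.

```haskell
import qualified Data.IntSet as IntSet
import Data.List (foldl')

-- Model of the retry loop: start at a fixed suffix and try x, x+1, ...
-- until a free name is found. Returns the suffix taken and the number
-- of attempts it took (each attempt stands for one open() call).
probe :: Int -> IntSet.IntSet -> (Int, Int)
probe start taken = go start 1
  where
    go x attempts
      | x `IntSet.member` taken = go (x + 1) (attempts + 1)
      | otherwise               = (x, attempts)

-- Total attempts needed to create n files that all stay on disk.
totalAttempts :: Int -> Int
totalAttempts n = snd (foldl' step (IntSet.empty, 0) [1 .. n])
  where
    start = 1000  -- assumption: stands in for the pid; the value is arbitrary
    step (taken, total) _ =
      let (x, attempts) = probe start taken
      in (IntSet.insert x taken, total + attempts)

main :: IO ()
main = mapM_ report [10, 100, 2000]
  where
    report n =
      putStrLn (show n ++ " files: " ++ show (totalAttempts n) ++ " attempts")
```

In this model 2000 files cost 2000 * 2001 / 2 = 2001000 attempts, against 2000 attempts if every name were free on the first try.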

https://github.com/ghc/ghc/blob/master/libraries/base/System/IO.hs#L465

    FileExists -> findTempName (x + 1)

This is the source of the problem. I'd suggest always using a random name instead. For portability reasons I suggest adding at least an insecure random value from the C library's rand().

That way opening the temp file will almost always succeed on the first attempt.
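Until the library changes, a caller can get a similar effect by putting a random component into the template. The following is a minimal sketch of that workaround, not a proposed patch: `openRandomTempFile` is a hypothetical helper, and it assumes a C library whose rand() can be called through the FFI. Since rand() is never seeded here, each process repeats the same sequence; openTempFile still appends its own suffix and retries on collisions, so the random part only spreads names out and adds no security.

```haskell
{-# LANGUAGE ForeignFunctionInterface #-}
import Foreign.C.Types (CInt (..))
import System.Directory (getTemporaryDirectory, removeFile)
import System.IO (Handle, hClose, hPutStrLn, openTempFile)

-- Insecure pseudo-random number from the C library, as suggested above.
foreign import ccall unsafe "stdlib.h rand" c_rand :: IO CInt

-- Hypothetical helper: mix a random number into the template so that
-- concurrent callers rarely start probing from the same name.
openRandomTempFile :: FilePath -> String -> IO (FilePath, Handle)
openRandomTempFile dir prefix = do
  r <- c_rand
  openTempFile dir (prefix ++ show r ++ "-")

main :: IO ()
main = do
  tmp <- getTemporaryDirectory
  (fn, h) <- openRandomTempFile tmp "hello"
  hPutStrLn h "scratch data"
  hClose h
  putStrLn ("created " ++ fn)
  removeFile fn  -- clean up the example file
```

A real fix inside openTempFile would want a seeded or otherwise better source of randomness; rand() is used here only because the report proposes it as the portable minimum.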

Trac metadata
Trac field Value
Version 7.8.2
Type Bug
TypeOfFailure OtherFailure
Priority normal
Resolution Unresolved
Component Compiler
Test case
Differential revisions
BlockedBy
Related
Blocking
CC
Operating system
Architecture