System.IO.openTempFile does not scale
While hunting a bug in darcs (http://bugs.darcs.net/issue2364) I noticed a very bad property of openTempFile: its naming pattern is very predictable, and the number of open() attempts grows as O(n^2) in the number of already created temp files.
Predictability lets very fun bugs survive in buggy programs, for example:
thread1:
(fn, fh) <- openTempFile "." "hello"
renameFile fn "something"
-- some time after
when (some_rare_buggy_condition) $
-- oops, reused temp name, but too late, other thread killed it
writeFile fn contents
thread2:
(fn, fh) <- openTempFile "." "hello"
workWithFn fn -- nobody should touch it, right?
It's very hard to debug data corruption when
all temp files are named "foo{pid}" and sometimes "foo{pid+1}".
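To make the name reuse concrete, here is a minimal in-memory sketch. It does not touch the file system: `nextName` is a hypothetical model of the "count up from the pid" naming described above, and the pid 4242 is made up.

```haskell
import qualified Data.Set as Set

-- | Hypothetical model of the naming scheme: the first free
-- "<prefix><n>", counting up from the pid. Not the real base code.
nextName :: Set.Set FilePath -> String -> Int -> FilePath
nextName existing prefix = go
  where
    go x
      | name `Set.member` existing = go (x + 1)
      | otherwise                  = name
      where
        name = prefix ++ show x

main :: IO ()
main = do
  let pid = 4242 :: Int                     -- made-up pid
      -- thread 1 creates its temp file in an empty directory
      fn1 = nextName Set.empty "hello" pid
      -- thread 1 renames it to "something", so fn1 is free again
      dirAfterRename = Set.singleton "something"
      -- thread 2 asks for a temp file with the same prefix
      fn2 = nextName dirAfterRename "hello" pid
  print (fn1, fn2, fn1 == fn2)              -- ("hello4242","hello4242",True)
```

Both threads end up holding the same path, so a stale write from thread 1 lands in thread 2's file.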
A more serious bug: the more threads try to create temp files with the same prefix, the more performance drops.
The attached program shows the following numbers:
$ time ./bench-temps same 2000
real 0m2.795s
user 0m1.516s
sys 0m1.190s
$ time ./bench-temps diff 2000
real 0m0.161s
user 0m0.043s
sys 0m0.115s
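This is a storm of open() calls that grows as O(N^2): with N files sharing a prefix, the k-th call has to step past the k-1 names that are already taken.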
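A rough way to see the quadratic growth is to count candidate names instead of doing real I/O. The sketch below is only a model: `findFree` and `totalProbes` are hypothetical names, the pid 4242 is made up, and I'm assuming the `same 2000` run creates 2000 files with one shared prefix and removes none of them in between.

```haskell
import qualified Data.Set as Set

-- | Model of the counter-based search: start at 'start' (the pid in the
-- real code) and step by one until a number is free. Returns the chosen
-- number and how many candidates were tried.
findFree :: Set.Set Int -> Int -> (Int, Int)
findFree taken start = go start 1
  where
    go x tries
      | x `Set.member` taken = go (x + 1) (tries + 1)
      | otherwise            = (x, tries)

-- | Total candidates tried when n files are created with the same
-- prefix and none are removed in between.
totalProbes :: Int -> Int -> Int
totalProbes start = go Set.empty 0
  where
    go _ acc 0 = acc
    go taken acc k =
      let (x, tries) = findFree taken start
          acc'       = acc + tries
      in acc' `seq` go (Set.insert x taken) acc' (k - 1)

main :: IO ()
main = print (totalProbes 4242 2000)  -- prints 2001000
```

That is N(N+1)/2 attempts for N files, where each attempt stands for one open() call in the real code.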
https://github.com/ghc/ghc/blob/master/libraries/base/System/IO.hs#L465
FileExists -> findTempName (x + 1)
This is the source of the problem. I'd suggest always using a random name. For portability, I suggest adding at least an insecure random value from the C library's rand().
That way opening the temp file will almost always succeed on the first attempt.
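A minimal sketch of the random-name idea, under some loud assumptions: `openWithRandomName` and `mockCreate` are hypothetical names, exclusive file creation is replaced by an in-memory set so the example stays self-contained, and rand() is left unseeded (a real fix would have to seed it, otherwise every process draws the same sequence). This is not a patch for base.

```haskell
{-# LANGUAGE ForeignFunctionInterface #-}
import Control.Monad (replicateM)
import Data.IORef (IORef, atomicModifyIORef', newIORef, readIORef)
import qualified Data.Set as Set
import Foreign.C.Types (CInt (..))

-- C library rand(): insecure, and not guaranteed to be thread-safe.
foreign import ccall unsafe "stdlib.h rand" c_rand :: IO CInt

-- | Try random candidate names until 'tryCreate' reports success.
-- Returns the chosen name and the number of attempts it took.
openWithRandomName :: (FilePath -> IO Bool) -> String -> IO (FilePath, Int)
openWithRandomName tryCreate prefix = go 1
  where
    go attempts = do
      r <- c_rand
      let name = prefix ++ show (fromIntegral r :: Int)
      ok <- tryCreate name
      if ok then return (name, attempts) else go (attempts + 1)

-- | In-memory stand-in for exclusive creation (O_CREAT | O_EXCL):
-- succeeds only if the name is not taken yet.
mockCreate :: IORef (Set.Set FilePath) -> FilePath -> IO Bool
mockCreate ref name = atomicModifyIORef' ref $ \s ->
  if name `Set.member` s then (s, False) else (Set.insert name s, True)

main :: IO ()
main = do
  ref <- newIORef Set.empty
  results <- replicateM 2000 (openWithRandomName (mockCreate ref) "hello")
  created <- readIORef ref
  putStrLn ("files created:  " ++ show (Set.size created))
  putStrLn ("total attempts: " ++ show (sum (map snd results)))
```

The retry loop is still needed because random names can collide, but the total number of attempts should stay close to the number of files instead of growing quadratically.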
Trac metadata
| Trac field | Value |
|---|---|
| Version | 7.8.2 |
| Type | Bug |
| TypeOfFailure | OtherFailure |
| Priority | normal |
| Resolution | Unresolved |
| Component | Compiler |
| Test case | |
| Differential revisions | |
| BlockedBy | |
| Related | |
| Blocking | |
| CC | |
| Operating system | |
| Architecture | |