GHC's file locking mechanism not prepared for close() returning -1
There is a bug in GHC's file locking mechanism that manifests itself when close() returns -1. The reasons for the problem, as I see it, are:
- asymmetry in the locking mechanism - the lock is created by device/inode, but released by fd
- exceptions are not handled properly in hClose
- trying to do the job of the operating system and/or trying to correct the behavior of the operating system - but this one is a criticism of the Haskell Report, not of GHC
I will demonstrate the scenario with an example program:

    main = do
        try $ do
            h <- openFile "A" AppendMode
            -- As part of opening A, GHC adds a lock to the writeLock table
            -- (libraries/base/cbits/lockFile.c).
            hClose h
            -- Assume that close() returned -1 when performing hClose, but released
            -- the file descriptor anyway (this happened for us).
            -- In hClose_help (libraries/base/GHC/Handle.hs) an exception is
            -- raised by (throwErrnoIfMinus1Retry_ "hClose" ...).
            -- As a result, (unlockFile fd) is not performed, which leaves
            -- a lock on A's device and inode. Further attempts to open A
            -- will fail, of course, but let's look at a more interesting scenario.
            -- Remember that h's underlying file descriptor was released and is
            -- available.
        do
            h1 <- openFile "B" AppendMode
            -- Assume that "B" is opened under the same file descriptor that was
            -- earlier assigned to "A".
            -- openFile locks "B", and the lock for it is put at the end of the
            -- writeLock table.
            hClose h1
            -- What happens here is that unlockFile removes the first lock matching
            -- h1's file descriptor. But this is the lock for A!
            -- The lock for B is left intact.
        do
            h1 <- openFile "B" AppendMode
            -- openFile fails because of the stray lock for B.
            hClose h1
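
To make the A/B interaction easier to follow, here is a small pure model of the write-lock table. It is only a sketch of the behaviour described in the comments above, not the code from lockFile.c: the `Lock` type, the list representation, the device/inode numbers and the fd value 3 are all made up for illustration, and `lockFile`/`unlockFile` here merely borrow the names of the real functions.

```haskell
-- Simplified stand-in for one entry of the writeLock table.
-- (Illustrative type; the real table lives in C.)
data Lock = Lock { lockDev :: Int, lockIno :: Int, lockFd :: Int }
  deriving (Eq, Show)

-- Acquire: refused if the (device, inode) pair is already in the table;
-- otherwise the new entry is appended at the end.
lockFile :: Int -> (Int, Int) -> [Lock] -> Maybe [Lock]
lockFile fd (dev, ino) table
  | any (\l -> lockDev l == dev && lockIno l == ino) table = Nothing
  | otherwise = Just (table ++ [Lock dev ino fd])

-- Release: removes only the FIRST entry whose fd matches.
unlockFile :: Int -> [Lock] -> [Lock]
unlockFile fd table =
  case break ((== fd) . lockFd) table of
    (before, _ : after) -> before ++ after
    (_, [])             -> table

-- Made-up (device, inode) pairs for the two files.
fileA, fileB :: (Int, Int)
fileA = (1, 100)
fileB = (1, 200)

-- Returns the table after the first hClose of B, together with the
-- result of trying to lock B a second time.
scenario :: Maybe ([Lock], Maybe [Lock])
scenario = do
  t1 <- lockFile 3 fileA []         -- open A; its failed hClose never unlocks
  t2 <- lockFile 3 fileB t1         -- open B, which reuses fd 3
  let t3 = unlockFile 3 t2          -- hClose B removes A's entry instead
  return (t3, lockFile 3 fileB t3)  -- second open of B is refused
```

In this model, the table after closing B still contains B's entry, and the second attempt to lock B yields `Nothing`, which corresponds to the failing third openFile.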
I can see the following solutions to this problem:
- handle the exception properly. I think this should be straightforward, but there may be some hidden dangers - after all, when close() fails you already have a strange situation. Perhaps we should check if the file descriptor is still in use with fstat()? On the other hand, it shouldn't make much difference as long as the Handle is marked as closed.
- remove all locks with matching fd - this would only fix the A/B problem, and it would be a bit awkward - after all, we probably don't want to have duplicated fds in the lock tables.
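
The first option could look roughly like the sketch below. It does not touch GHC's actual hClose_help; `closeThenUnlock` is a hypothetical helper that takes the close and unlock steps as plain `IO ()` parameters, and `demo` uses stand-in actions (a "close" that always throws and an "unlock" that only sets a flag) to show that the unlock still runs while the error still reaches the caller.

```haskell
import Control.Exception (IOException, finally, try)
import Data.IORef (newIORef, readIORef, writeIORef)

-- Hypothetical helper: run the close action and release the lock
-- afterwards, whether or not the close action throws.
closeThenUnlock :: IO () -> IO () -> IO ()
closeThenUnlock closeAction unlockAction = closeAction `finally` unlockAction

-- Demo with stand-in actions. Returns
-- (did the close error propagate?, did the unlock still run?).
demo :: IO (Bool, Bool)
demo = do
  unlocked <- newIORef False
  result <- try (closeThenUnlock (ioError (userError "close failed"))
                                 (writeIORef unlocked True))
              :: IO (Either IOException ())
  ran <- readIORef unlocked
  return (either (const True) (const False) result, ran)
```

Whether unconditionally unlocking is safe when close() fails without releasing the descriptor is exactly the open question raised in the first bullet; the sketch assumes the descriptor is gone.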
If we can agree on an acceptable solution, I could try fixing it myself. This would be a good exercise in working with GHC's darcs repository.
BTW, the functions lockFile and unlockFile seem to be quite expensive:
- they use linear search on the lock tables
- when deleting an entry, unlockFile moves all subsequent entries up

Perhaps it would be a good idea to drop the locking requirements from haskell-prime?
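
If the locking stays, the two costs listed above could be avoided with keyed lookups. The following is only a sketch of that idea for write locks, using Data.Map from the containers package; `LockTable`, `lock` and `unlock` are hypothetical names, and the choice to discard a stale entry when an fd is handed out again is an assumption of this sketch, not current GHC behaviour.

```haskell
import qualified Data.Map as Map

type FileId = (Int, Int)  -- (device, inode); stand-in types
type Fd     = Int

-- Two indexes kept in step: file identity -> fd, and fd -> file identity.
data LockTable = LockTable
  { byFile :: Map.Map FileId Fd
  , byFd   :: Map.Map Fd FileId
  } deriving (Eq, Show)

emptyTable :: LockTable
emptyTable = LockTable Map.empty Map.empty

-- Release by fd: logarithmic lookup, no shifting of later entries.
unlock :: Fd -> LockTable -> LockTable
unlock fd table =
  case Map.lookup fd (byFd table) of
    Nothing   -> table
    Just file -> LockTable (Map.delete file (byFile table))
                           (Map.delete fd (byFd table))

-- Acquire: refused if the file is already locked. An existing entry for
-- the same fd is treated as stale and dropped first (assumption: the OS
-- would not hand out an fd that is still open).
lock :: Fd -> FileId -> LockTable -> Maybe LockTable
lock fd file table
  | Map.member file (byFile cleaned) = Nothing
  | otherwise = Just (LockTable (Map.insert file fd (byFile cleaned))
                                (Map.insert fd file (byFd cleaned)))
  where
    cleaned = unlock fd table
```

Because each fd maps to at most one file here, the table cannot hold duplicated fds, so the stray-lock situation from the A/B example does not arise in this model.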
Trac metadata
Trac field | Value |
---|---|
Version | 6.6.1 |
Type | Bug |
TypeOfFailure | OtherFailure |
Priority | normal |
Resolution | Unresolved |
Component | libraries/base |
Test case | |
Differential revisions | |
BlockedBy | |
Related | |
Blocking | |
CC | tomasz.zielonka@gmail.com |
Operating system | Multiple |
Architecture | Unknown |