1. 28 Dec, 2016 1 commit
    • Fix various issues with testsuite code on Windows · a3704409
      Tamar Christina authored
      Summary:
      Previously we would make direct calls to `diff` using `os.system`.
      On Windows `os.system` is implemented using the standard
      idiom `CreateProcess .. WaitForSingleObject ..`.
      
      This again runs afoul of the `_exec` behaviour on Windows, so we ran
      into trouble where `diff` would sometimes return before it was done.
      
      On tests which run multiple ways, such as `8086`, what happens is that
      we think the diff is done and continue. The next way tries to set things
      up again by removing any previous directory. This would then fail with
      an error saying the directory can't be removed, which is true, because
      the previous diff child process is still running.
      
      We shouldn't make external calls to anything using `os.system`.
      Instead just use `runCmd`, which uses `timeout`. This also ensures that if
      we hit the cygwin bug where diff or any other utility hangs, we kill it and
      continue, rather than hanging the entire test run and leaving stray
      processes behind.
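
As a rough illustration of the idea (this is not the testsuite's actual `runCmd`, which goes through the `timeout` program), a Python sketch that replaces `os.system` with a call bounded by a timeout might look like the following. The names `run_cmd` and `TIMED_OUT` are hypothetical, and `subprocess.run` only kills the direct child when the timeout expires.

```python
import subprocess
import sys

# Hypothetical sentinel returned when the command had to be killed.
TIMED_OUT = 99


def run_cmd(argv, timeout_s):
    """Run argv and return its exit code, or TIMED_OUT if it ran too long.

    Unlike os.system, a hung utility cannot block the caller forever:
    subprocess.run kills the child and reaps it once timeout_s has elapsed.
    """
    try:
        return subprocess.run(argv, timeout=timeout_s).returncode
    except subprocess.TimeoutExpired:
        return TIMED_OUT
```

A caller would then treat `TIMED_OUT` as a test failure and move on to the next test instead of blocking.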
      
      Furthermore, we also:
      I...
  2. 27 Dec, 2016 1 commit
  3. 23 Dec, 2016 1 commit
    • Allow timeout to kill entire process tree. · efc4a166
      Tamar Christina authored
      Summary:
      We spawn the child processes with handle inheritance on, so they inherit the std handles.
      The problem is that the job handle gets inherited too,
      so `JOB_OBJECT_LIMIT_KILL_ON_JOB_CLOSE` never takes effect, since the children
      still hold open handles to the job.

      We then terminate the top-level process, which is `sh`, but leave the children around.

      This patch explicitly disallows inheritance of the job and event handles.
      
      Test Plan: ./validate
      
      Reviewers: austin, bgamari
      
      Reviewed By: bgamari
      
      Subscribers: thomie, #ghc_windows_task_force
      
      Differential Revision: https://phabricator.haskell.org/D2895
      
      GHC Trac Issues: #13004
  4. 30 Nov, 2016 1 commit
    • Fix testsuite threading, timeout, encoding and performance issues on Windows · 0ce59be3
      Tamar Christina authored
      In a land far far away, a project called Cygwin was born.
      Cygwin used newlib as its standard C library implementation.
      
      But Cygwin wanted to emulate POSIX systems as closely as possible.
      So it implemented `execv` using the Windows function `spawnve`.
      
      Specifically
      
      ```
      spawnve (_P_OVERLAY, path, argv, cur_environ ())
      ```
      
      `_P_OVERLAY` is crucial, as it makes the function behave *sort of*
      like execv on Linux: the child process replaces the original process.

      There is one major difference, caused by the different process model
      on Windows: the original process signals the caller that it's done.

      This is why the file is still locked: control was returned because the
      parent process was destroyed, but the child is still running.
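
The effect can be reproduced in miniature without Cygwin. The sketch below is only an analogy, not Cygwin's `spawnve` path: a parent process starts a grandchild and exits immediately, so waiting on the parent returns while the real work is still in progress. All names in it are made up for illustration.

```python
import os
import subprocess
import sys
import time

# Grandchild: does the "real work" (creates the marker file after a delay).
GRANDCHILD = "import sys, time; time.sleep(1.0); open(sys.argv[1], 'w').close()"

# Parent: launches the grandchild and exits right away without waiting.
PARENT = (
    "import subprocess, sys; "
    "subprocess.Popen([sys.executable, '-c', sys.argv[1], sys.argv[2]])"
)


def wait_returns_before_work_is_done(marker, poll_limit_s=10.0):
    """Return (done_when_wait_returned, done_eventually) for the marker file."""
    # Waiting on the parent only tells us that the parent has exited.
    subprocess.run([sys.executable, "-c", PARENT, GRANDCHILD, marker], check=True)
    done_when_wait_returned = os.path.exists(marker)

    # The grandchild is still running; poll until its work shows up.
    deadline = time.monotonic() + poll_limit_s
    while not os.path.exists(marker) and time.monotonic() < deadline:
        time.sleep(0.05)
    return done_when_wait_returned, os.path.exists(marker)
```

Anything that cleans up files right after the wait returns is racing against the process that is still alive, which is the same shape of race as the locked-file failures.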
      
      I think it's just pure dumb luck that the older runtimes are slow
      enough to give the process time to terminate before we try deleting
      the file. That would explain why you do see sporadic failures of a
      test or two (like T7307) even on older runtimes like 2.5.0.
      
      So this patch fixes a couple of things. I leverage the existing
      `timeout.exe` to implement a workaround for this issue.
      
      a) The old timeout used to start the process and then assign it to the job.
         This is slightly faulty, since child processes are only assigned to a
         job if their parent was assigned at the time they started. So this
         was a race condition. I now create the process suspended, assign it
         to the job and then resume it, which means all child processes are
         now running under the same job.

      b) First, to prevent dangling child processes, I mark the job
         with `JOB_OBJECT_LIMIT_KILL_ON_JOB_CLOSE`, so when the last handle to
         the job is closed, it ensures all processes under the job are killed.
      
      c) Secondly, I change the way we wait for results. Instead of waiting
         for the parent process to terminate, I wait for the job itself to
         terminate.
      
         There's a slight subtlety here: we can't wait on the job itself.
         Instead we have to create an I/O completion port and wait for signals
         on it.  See
         https://blogs.msdn.microsoft.com/oldnewthing/20130405-00/?p=4743
      
      This fixes the issues on all runtimes for me and makes T7307 pass
      consistently.
      
      The threading was also simplified by hiding all the locking in a single
      semaphore and a completion class. Furthermore, some additional error
      reporting was added.
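
The driver code itself is not shown in this message, so the following is only a sketch of what "a single semaphore and a completion class" can look like in Python. `Completion` and `run_all` are hypothetical names, not the testsuite's own.

```python
import threading


class Completion:
    """Counts outstanding tasks and lets a caller block until all are done."""

    def __init__(self):
        self._cond = threading.Condition()
        self._pending = 0

    def started(self):
        with self._cond:
            self._pending += 1

    def finished(self):
        with self._cond:
            self._pending -= 1
            if self._pending == 0:
                self._cond.notify_all()

    def wait(self):
        with self._cond:
            self._cond.wait_for(lambda: self._pending == 0)


def run_all(tasks, max_threads):
    """Run zero-argument callables on at most max_threads threads at once."""
    slots = threading.BoundedSemaphore(max_threads)
    done = Completion()
    results = [None] * len(tasks)

    def worker(index, task):
        try:
            results[index] = task()
        finally:
            slots.release()   # free the slot for the next task
            done.finished()   # tell the waiter this task is over

    for index, task in enumerate(tasks):
        slots.acquire()       # blocks while max_threads tasks are active
        done.started()
        threading.Thread(target=worker, args=(index, task)).start()

    done.wait()
    return results
```

The semaphore is the only thing limiting concurrency and the completion object is the only thing the main thread waits on, so no other lock has to be visible to the rest of the driver.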
      
      For encoding, the testsuite no longer passes a file handle to the
      subprocess, since on Windows sh.exe seems to acquire a lock on the file
      that is not released in a timely fashion.
      
      I suspect this is because cygwin seems to emulate console handles by
      creating file handles and using those for std handles. So when we give
      it an existing file handle it just locks the file. I think what's happening
      is that it's not releasing the handle until all shared cygwin processes are
      dead, which explains why it worked in single threaded mode.
      
      So now instead we pass a pipe and do not interpret the resulting data.
      
      Any bytes written to stdin or read out of stdout/stderr are handled in
      binary mode and we do not interpret the data. The reason for this is
      that we have encoding tests in GHC which pass invalid UTF-8. If we try
      to handle the data as text then Python will throw an exception instead
      of a test comparison failing.
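
A minimal sketch of this approach, assuming a hypothetical helper `run_binary` (the real driver's function names are not given here): the child's streams are pipes, and everything is kept as `bytes` so that invalid UTF-8 shows up as a comparison mismatch rather than an exception.

```python
import subprocess
import sys


def run_binary(argv, stdin_bytes=b""):
    """Run argv with piped std handles and return (exit code, stdout, stderr).

    No text mode and no encoding are requested, so stdout and stderr come
    back as raw bytes, exactly as the child wrote them.
    """
    proc = subprocess.run(
        argv,
        input=stdin_bytes,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
    )
    return proc.returncode, proc.stdout, proc.stderr


def outputs_match(actual, expected):
    """Compare raw bytes; never decode, so bad encodings cannot raise."""
    return actual == expected
```

Decoding the same bytes with `.decode("utf-8")` would raise `UnicodeDecodeError` for input such as `b"\xff\xfe"`, which is the failure mode the binary comparison avoids.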
      
      Also I have fixed the ability to override `PYTHON` when calling `make
      tests`. This now works the same as with `.\validate`.
      
      Finally, after cleaning up the locks I was able to make the abort
      behavior work correctly as I believe it was intended: when you press
      Ctrl+C and send an interrupt signal, the testsuite finishes the active
      tests and then gracefully exits, showing you a report of the progress it
      did make. So pressing Ctrl+C no longer makes it just *die* as it did before.
      
      These changes lift the restrictions on which Python you use
      (msys/mingw), which runtime, and whether it is Python 2 or Python 3.
      All combinations should now be supported.
      
      Test Plan:
      PATH=/usr/local/bin:/mingw64/bin:$APPDATA/cabal/bin:$PATH &&
      PYTHON=/usr/bin/python THREADS=9 make test
      THREADS=9 make test
      PATH=/usr/local/bin:/mingw64/bin:$APPDATA/cabal/bin:$PATH &&
      PYTHON=/usr/bin/python ./validate --quiet --testsuite-only
      
      Reviewers: erikd, RyanGlScott, bgamari, austin
      
      Subscribers: jrtc27, mpickering, thomie, #ghc_windows_task_force
      
      Differential Revision: https://phabricator.haskell.org/D2684
      
      GHC Trac Issues: #12725, #12554, #12661, #12004
  5. 17 Oct, 2016 1 commit
  6. 28 Jun, 2016 1 commit
  7. 21 Jul, 2015 1 commit
  8. 28 Apr, 2014 1 commit
  9. 20 Jun, 2012 1 commit
  10. 15 Nov, 2011 1 commit
  11. 18 Oct, 2011 1 commit
  12. 13 Oct, 2009 1 commit
  13. 22 Dec, 2008 1 commit
  14. 03 Aug, 2008 1 commit
  15. 23 Jun, 2008 1 commit
  16. 20 Jan, 2008 1 commit
  17. 04 Mar, 2007 1 commit
  18. 17 Sep, 2006 1 commit
    • Allow testsuite to run under MSYS/MinGW using native Python (not Cygwin Python). · e1bdf63f
      brianlsmith authored
      This patch is based on a similar one, "Enable timeout in Windows
      and don't require cygwin python" by Esa Ilari Vuokko. It seems
      like timeout is always built on Windows, so I rearranged the logic
      there to make the code clearer. Esa's patch required the user to
      uncomment the MinGW-specific logic in order for it to work; this
      patch does not have the MinGW-specific logic commented out.
      
      I tested this on the trunk in Ubuntu 6.06 on i686 (VMware).
      I tested this on the trunk and ghc-6.6 branch on Windows i686.
  19. 23 Mar, 2006 1 commit
    • attempt to work around restrictions with fork() & pthreads · 4f4f12e5
      Simon Marlow authored
      In the child process, call exec() directly instead of using
      System.Cmd.system, which involves another fork()/exec() and a
      non-blocking wait.  The problem is that in a forked child of a
      threaded process, it isn't safe to do much except exec() according to
      POSIX.  In fact calling pthread_create() in the child causes the
      pthread library to fail with an error on FreeBSD.
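
The commit itself changes Haskell code that is not reproduced here. Purely as an illustration of the pattern, a POSIX-only Python sketch of "fork, then exec directly in the child" could look like this; `spawn_and_wait` is a hypothetical name.

```python
import os
import sys


def spawn_and_wait(argv):
    """Fork, exec argv directly in the child, and return its exit code.

    The child does nothing between fork() and exec(): no second fork, no
    shell, no new threads. That is the only thing POSIX treats as safe in
    the forked child of a threaded process.
    """
    pid = os.fork()
    if pid == 0:
        try:
            os.execvp(argv[0], argv)  # replaces the child; never returns on success
        finally:
            os._exit(127)             # only reached if exec itself failed
    _, status = os.waitpid(pid, 0)
    if os.WIFEXITED(status):
        return os.WEXITSTATUS(status)
    return -os.WTERMSIG(status)       # killed by a signal
```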
  20. 23 Nov, 2005 1 commit
  21. 11 Nov, 2005 1 commit
  22. 04 Aug, 2005 1 commit
    • [project @ 2005-08-04 12:22:17 by simonmar] · 01a7c0fe
      simonmar authored
      A better timeout.  This one starts a new session for the child
      process, and attempts to kill the entire group when the time expires
      (previously we only killed the direct child; if the child itself had
      spawned more processes, these would continue to run).

      The new scheme is only for Unix; presumably we have to do something
      different on Windows.
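
The original timeout program is not written in Python, so the following is only a sketch of the same Unix scheme using Python's standard library. `run_with_group_timeout` is a hypothetical name, and the code is POSIX-only.

```python
import os
import signal
import subprocess


def run_with_group_timeout(argv, timeout_s):
    """Run argv in its own session; kill the whole group if it takes too long.

    Returns the exit code, or None if the process group had to be killed.
    """
    # start_new_session=True makes the child call setsid(), so it becomes the
    # leader of a new session and process group whose id equals its pid.
    proc = subprocess.Popen(argv, start_new_session=True)
    try:
        return proc.wait(timeout=timeout_s)
    except subprocess.TimeoutExpired:
        try:
            # Signal every process in the group, not just the direct child.
            os.killpg(proc.pid, signal.SIGKILL)
        except ProcessLookupError:
            pass  # the group exited on its own in the meantime
        proc.wait()
        return None
```

For example, `run_with_group_timeout(["sh", "-c", "sleep 30 & sleep 30"], 0.5)` kills both `sleep` processes along with the shell, where killing only the direct child would leave the background `sleep` running.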
      
      Code partly from Ian Lynagh.
  23. 04 Feb, 2005 1 commit