1. 01 Dec, 2016 1 commit
  2. 30 Nov, 2016 1 commit
    • Fix testsuite threading, timeout, encoding and performance issues on Windows · 0ce59be3
      Tamar Christina authored
      In a land far far away, a project called Cygwin was born.
      Cygwin used newlib as its standard C library implementation.
      
      But Cygwin wanted to emulate POSIX systems as closely as possible.
      So it implemented `execv` using the Windows function `spawnve`.
      
      Specifically
      
      ```
      spawnve (_P_OVERLAY, path, argv, cur_environ ())
      ```
      
      `_P_OVERLAY` is crucial, as it makes the function behave *sort of*
      like execv on Linux: the child process replaces the original process.
      
      With one major difference because of the difference in process models
      on Windows: the original process signals the caller that it's done.
      
      This is why the file is still locked: control was returned because
      the parent process was destroyed, but the child is still running.
      
      I think it's pure dumb luck that the older runtimes are slow enough
      to give the process time to terminate before we try deleting the
      file. That would explain why you do see sporadic failures of a test
      or two (like T7307) even on older runtimes like 2.5.0.
      
      So this patch fixes a couple of things. I leverage the existing
      `timeout.exe` to implement a workaround for this issue.
      
      a) The old timeout used to start the process and then assign it to the
         job. This is slightly faulty, since child processes are only assigned
         to a job if their parent was assigned at the time they started. So
         this was a race condition. I now create the process suspended, assign
         it to the job and then resume it, which means all child processes are
         now running under the same job.
      
      b) First, to prevent dangling child processes, I mark the job with
         `JOB_OBJECT_LIMIT_KILL_ON_JOB_CLOSE`, so that when the last handle to
         the job is closed, all processes still in the job are killed.
      
      c) Secondly, I change the way we wait for results. Instead of waiting
         for the parent process to terminate, I wait for the job itself to
         terminate.
      
         There's a slight subtlety here: we can't wait on the job itself.
         Instead we have to create an I/O completion port and wait for signals
         on it.  See
         https://blogs.msdn.microsoft.com/oldnewthing/20130405-00/?p=4743
      
      This fixes the issues on all runtimes for me and makes T7307 pass
      consistently.
      
      The threading was also simplified by hiding all the locking in a single
      semaphore and a completion class. Furthermore, some additional error
      reporting was added.
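
A rough sketch of what "a single semaphore and a completion class" could look like in Python. This is not the code from the patch: the class name `Completion` and its methods `submit` and `wait` are hypothetical, chosen only to illustrate keeping all locking in one place.

```python
import threading

class Completion:
    """Hypothetical helper: caps concurrency and waits for all jobs."""

    def __init__(self, max_threads):
        # The semaphore is the only synchronisation primitive callers rely on.
        self._slots = threading.BoundedSemaphore(max_threads)
        self._threads = []

    def submit(self, fn, *args):
        self._slots.acquire()          # blocks while max_threads jobs run

        def worker():
            try:
                fn(*args)
            finally:
                self._slots.release()  # free the slot even if fn raised

        thread = threading.Thread(target=worker)
        self._threads.append(thread)
        thread.start()

    def wait(self):
        # Completion: return only once every submitted job has finished.
        for thread in self._threads:
            thread.join()

results = []
pool = Completion(max_threads=3)
for n in range(10):
    pool.submit(lambda n=n: results.append(n * n))
pool.wait()
```

Callers never touch a lock directly; they submit work and wait for completion.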
      
      For encoding, the testsuite no longer passes a file handle to the
      subprocess, since on Windows sh.exe seems to acquire a lock on the file
      that is not released in a timely fashion.
      
      I suspect this is because Cygwin seems to emulate console handles by
      creating file handles and using those for std handles. So when we give
      it an existing file handle it just locks the file. I think what's
      happening is that it's not releasing the handle until all shared Cygwin
      processes are dead, which explains why it worked in single-threaded mode.
      
      So now instead we pass a pipe and do not interpret the resulting data.
      
      Bytes written to stdin or read from stdout/stderr are handled in
      binary mode and we do not interpret the data. The reason for this is
      that we have encoding tests in GHC which pass invalid UTF-8. If we try
      to handle the data as text then Python will throw an exception instead
      of a test comparison failing.
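
A minimal Python 3 sketch of the binary-mode approach described above, not the driver's actual code. `run_binary` is a hypothetical helper and the child command is made up for illustration. It talks to the child over pipes and compares raw bytes, so invalid UTF-8 shows up as an ordinary mismatch rather than a decoding exception.

```python
import subprocess
import sys

def run_binary(cmd, stdin_bytes=b""):
    """Hypothetical helper: run cmd over pipes and return raw bytes."""
    proc = subprocess.Popen(cmd,
                            stdin=subprocess.PIPE,
                            stdout=subprocess.PIPE,
                            stderr=subprocess.PIPE)
    # communicate() feeds stdin and drains both pipes without decoding.
    out, err = proc.communicate(stdin_bytes)
    return proc.returncode, out, err

# Illustrative child that writes bytes which are not valid UTF-8.
child = [sys.executable, "-c",
         "import sys; sys.stdout.buffer.write(b'caf\\xe9\\xff')"]
returncode, actual, _ = run_binary(child)

expected = b"caf\xe9\xff"
# A plain byte comparison: a mismatch is a test failure, never an exception.
outputs_match = (actual == expected)
```

Calling `actual.decode("utf-8")` on this output would raise `UnicodeDecodeError`, which is the failure mode the binary comparison avoids.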
      
      Also I have fixed the ability to override `PYTHON` when calling `make
      tests`. This now works the same as with `.\validate`.
      
      Finally, after cleaning up the locks I was able to make the abort
      behavior work correctly as I believe it was intended: when you press
      Ctrl+C and send an interrupt signal, the testsuite finishes the active
      tests and then gracefully exits showing you a report of the progress it
      did make. So using Ctrl+C will not just *die* as it did before.
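
The abort behaviour can be sketched roughly as follows. This is a single-threaded simplification under my own assumptions, not the testsuite's code: `Runner`, `request_stop` and `install_handler` are hypothetical names. The idea is that the interrupt only sets a flag, the running test is allowed to finish, and the results collected so far are still returned for the report.

```python
import signal

class Runner:
    """Hypothetical runner: stops between tests once a stop is requested."""

    def __init__(self):
        self.stop_requested = False

    def request_stop(self, signum=None, frame=None):
        # Usable as a signal handler: only record the request.
        self.stop_requested = True

    def run(self, tests, run_test):
        results = []
        for name in tests:
            if self.stop_requested:
                break                       # do not start any new test
            results.append((name, run_test(name)))
        return results                      # partial results feed the report

def install_handler(runner):
    # Must be called from the main thread; routes Ctrl+C to the flag.
    signal.signal(signal.SIGINT, runner.request_stop)

runner = Runner()

def fake_test(name):
    if name == "t2":
        runner.request_stop()               # simulate Ctrl+C during t2
    return "pass"

report = runner.run(["t1", "t2", "t3"], fake_test)
```

Here `t2` completes even though the stop arrived while it was running, and `t3` is never started.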
      
      These changes lift the restrictions on which Python build (msys/mingw),
      which runtime, and which Python version (2 or 3) you use.  All
      combinations should now be supported.
      
      Test Plan:
      PATH=/usr/local/bin:/mingw64/bin:$APPDATA/cabal/bin:$PATH &&
      PYTHON=/usr/bin/python THREADS=9 make test
      THREADS=9 make test
      PATH=/usr/local/bin:/mingw64/bin:$APPDATA/cabal/bin:$PATH &&
      PYTHON=/usr/bin/python ./validate --quiet --testsuite-only
      
      Reviewers: erikd, RyanGlScott, bgamari, austin
      
      Subscribers: jrtc27, mpickering, thomie, #ghc_windows_task_force
      
      Differential Revision: https://phabricator.haskell.org/D2684
      
      GHC Trac Issues: #12725, #12554, #12661, #12004
  3. 25 Nov, 2016 1 commit
  4. 17 Nov, 2016 1 commit
    • testsuite: Rip out hack for #12554 · 4d4f3533
      Ben Gamari authored
      @Phyx is working on correctly fixing (pun intended) the underlying issue
      that prompted this hack. It turns out that `timeout` is the culprit.
      Moreover, this hack breaks on msys python builds, which don't export
      `WindowsError`.
      
      Test Plan: Validate on Windows with `msys` python.
      
      Reviewers: Phyx, austin
      
      Subscribers: thomie, Phyx
      
      Differential Revision: https://phabricator.haskell.org/D2724
      
      GHC Trac Issues: #12554
  5. 17 Oct, 2016 2 commits
    • testsuite/driver: More Unicode awareness · 7d2df320
      Ben Gamari authored
      Explicitly specify utf8 encoding in a few spots which were failing on
      Windows with Python 3.
      
      Test Plan: Validate
      
      Reviewers: austin, thomie
      
      Differential Revision: https://phabricator.haskell.org/D2602
      
      GHC Trac Issues: #9184
    • testsuite: Work around #12554 · 9cb44598
      Ben Gamari authored
      It seems that Python 2.7.11 and "recent" msys2 releases are broken,
      holding open file locks unexpectedly. This causes rmtree to intermittently
      fail. Even worse, it would fail silently (since we pass
      ignore_errors=True), causing makedirs to fail later.
      
      We now explicitly check for the existence of the test directory before
      attempting to delete it and disable ignore_errors. Moreover, on Windows
      we now try multiple times to rmtree the testdir, working around the
      apparent msys bug.
      
      This is all just terrible, but Phyx and I spent several hours trying to
      track down the issue to no avail. The workaround is better than
      nothing.
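
For illustration, the workaround described above might look roughly like this. It is a sketch under assumptions, not the committed code: the helper name `remove_testdir`, the attempt count and the delay are all hypothetical.

```python
import os
import shutil
import time

def remove_testdir(path, attempts=5, delay=0.1):
    """Hypothetical helper: delete path, retrying on Windows only."""
    if not os.path.exists(path):
        return                      # nothing to delete
    tries = attempts if os.name == "nt" else 1
    for attempt in range(tries):
        try:
            # ignore_errors=False so a failed delete is not silent.
            shutil.rmtree(path, ignore_errors=False)
            return
        except OSError:
            if attempt == tries - 1:
                raise               # report the failure instead of hiding it
            time.sleep(delay)       # give the lock holder time to let go
```

Raising on the final attempt keeps the failure visible at the point of deletion, instead of surfacing later as a confusing makedirs error.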
  6. 08 Oct, 2016 1 commit
  7. 30 Jun, 2016 1 commit
    • Testsuite: do not depend on sys.stdout.encoding · e8d62711
      thomie authored
      The cause of #12213 is in dump_stdout and dump_stderr:
      
            print(read_no_crs(<filename>))
      
      Commit 6f6f5154 changed read_no_crs to return a unicode string.
      Printing a unicode string works fine as long as sys.stdout.encoding is
      'UTF-8'.
      
      There are two reasons why sys.stdout.encoding might not be 'UTF-8'.
      
      * When output is going to a file, sys.stdout and sys.stderr do not respect
        the locale:
      
        $ LC_ALL=en_US.utf8 python -c 'import sys; print(sys.stderr.encoding)'
        UTF-8
        $ LC_ALL=en_US.utf8 python -c 'import sys; print(sys.stderr.encoding)' 2>/dev/null
        None
      
      * When output is going to the terminal, explicitly reopening sys.stdout has
        the side-effect of changing sys.stdout.encoding from 'UTF-8' to 'None'.
      
            sys.stdout = os.fdopen(sys.__stdout__.fileno(), "w", 0)
      
        We currently do this to set a buffersize of 0 (the actual
        buffersize used is irrelevant for the sys.stdout.encoding problem).
      
      Solution: fix dump_stdout and dump_stderr to not use read_no_crs.
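
One way to avoid depending on sys.stdout.encoding is to never decode the file at all. The sketch below is a Python 3 illustration under that assumption, not the actual fix: `dump_raw` is a hypothetical helper that copies the file's bytes to a binary stream and drops carriage returns along the way.

```python
import sys

def dump_raw(path, stream=None):
    """Hypothetical helper: copy a file to a byte stream without decoding."""
    if stream is None:
        stream = sys.stdout.buffer   # Python 3: the underlying binary stream
    with open(path, "rb") as f:
        data = f.read()
    # Normalise Windows line endings on the raw bytes.
    stream.write(data.replace(b"\r\n", b"\n"))
    stream.flush()
```

Because only bytes are written, the result does not change when output is redirected to a file or when the stream has been reopened.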
  8. 29 Jun, 2016 2 commits
  9. 28 Jun, 2016 4 commits
    • Testsuite: framework failure improvements (#11165) · 782cacf5
      thomie authored
      * add framework failures to unexpected results list
      * report errors in .T files as framework failures (show in summary)
      * don't report missing tests when framework failures in .T files
    • Testsuite: cleanup printing of summary · d8e9b876
      thomie authored
      Just use a simple list of tuples, instead of a nested map.
      
      -90 lines of code.
    • Testsuite: open/close stdin/stdout/stderr explicitly · 58f0086b
      thomie authored
      This allows `run_command`s to contain `|`, and `no_stdin` is no longer
      necessary.
      
      Unfortunately it doesn't fix T7037 on Windows which I had hoped it would
      (testsuite driver tries to read a file that it just created itself, but
      the OS says it doesn't exist).
      
      The only drawback of this commit is that the command that the testsuite
      prints to the terminal (for debugging purposes) doesn't mention the
      files that stdout and stderr are redirected to anymore. This is probably
      ok.
      
      Update submodule unix.
      
      Differential Revision: https://phabricator.haskell.org/D1234
    • Testsuite: simplify extra_file handling · 206b4a1d
      thomie authored
      Before, `extra_files(['.hpc/Main.mix'])` meant copy `Main.mix` to
      `<testdir>/.hpc/Main.mix`. This feature wasn't really necessary, so now
      it just means copy `Main.mix` to `<testdir>/Main.mix`. This simplifies
      the implementation.
      
      Some small other cleanups as well. -40 lines of code.
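
The simplified behaviour can be sketched like this. It is an illustration only: `copy_extra_files` is a hypothetical helper, and it assumes each entry names a regular file relative to the test's source directory.

```python
import os
import shutil

def copy_extra_files(srcdir, testdir, extra_files):
    """Hypothetical helper: copy each extra file flat into testdir."""
    for rel in extra_files:
        # Only the base name is kept, so '.hpc/Main.mix' lands at
        # '<testdir>/Main.mix' rather than '<testdir>/.hpc/Main.mix'.
        dst = os.path.join(testdir, os.path.basename(rel))
        shutil.copy(os.path.join(srcdir, rel), dst)
```

Dropping the subdirectory means the copy step never has to create intermediate directories in the test directory.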
  10. 27 Jun, 2016 3 commits
  11. 24 Jun, 2016 1 commit
  12. 20 Jun, 2016 6 commits
  13. 18 Jun, 2016 3 commits
  14. 09 Jun, 2016 1 commit
  15. 08 Jun, 2016 1 commit
    • Show sources of cost centers in .prof · d7933cbc
      Ömer Sinan Ağacan authored
      This fixes the problem with duplicate cost-centre names that was
      reported a couple of times before. When a module implements a typeclass
      multiple times for different types, methods of different implementations
      get the same cost-centre names and are reported like this:
      
          COST CENTRE MODULE            %time %alloc
      
          CAF         GHC.IO.Handle.FD    0.0   32.8
          CAF         GHC.Read            0.0    1.0
          CAF         GHC.IO.Encoding     0.0    1.8
          showsPrec   Main                0.0    1.2
          readPrec    Main                0.0   19.4
          readPrec    Main                0.0   20.5
          main        Main                0.0   20.2
      
                                                  individual      inherited
          COST CENTRE  MODULE  no.     entries  %time %alloc   %time %alloc
      
          MAIN         MAIN     53          0    0.0    0.2     0.0  100.0
           CAF         Main    105          0    0.0    0.3     0.0   62.5
            readPrec   Main    109          1    0.0    0.6     0.0    0.6
            readPrec   Main    107          1    0.0    0.6     0.0    0.6
            main       Main    106          1    0.0   20.2     0.0   61.0
             ==        Main    114          1    0.0    0.0     0.0    0.0
             ==        Main    113          1    0.0    0.0     0.0    0.0
             showsPrec Main    112          2    0.0    1.2     0.0    1.2
             showsPrec Main    111          2    0.0    0.9     0.0    0.9
             readPrec  Main    110          0    0.0   18.8     0.0   18.8
             readPrec  Main    108          0    0.0   19.9     0.0   19.9
      
      It's not possible to tell from the report which `==` took how long. This
      patch adds one more column at the cost of making outputs wider. The
      report now looks like this:
      
          COST CENTRE MODULE           SRC                       %time %alloc
      
          CAF         GHC.IO.Handle.FD <entire-module>             0.0   32.9
          CAF         GHC.IO.Encoding  <entire-module>             0.0    1.8
          CAF         GHC.Read         <entire-module>             0.0    1.0
          showsPrec   Main             Main_1.hs:7:19-22           0.0    1.2
          readPrec    Main             Main_1.hs:7:13-16           0.0   19.5
          readPrec    Main             Main_1.hs:4:13-16           0.0   20.5
          main        Main             Main_1.hs:(10,1)-(20,20)    0.0   20.2
      
                                                                             individual      inherited
          COST CENTRE  MODULE        SRC                      no. entries  %time %alloc   %time %alloc
      
          MAIN         MAIN          <built-in>                53      0    0.0    0.2     0.0  100.0
           CAF         Main          <entire-module>          105      0    0.0    0.3     0.0   62.5
            readPrec   Main          Main_1.hs:7:13-16        109      1    0.0    0.6     0.0    0.6
            readPrec   Main          Main_1.hs:4:13-16        107      1    0.0    0.6     0.0    0.6
            main       Main          Main_1.hs:(10,1)-(20,20) 106      1    0.0   20.2     0.0   61.0
             ==        Main          Main_1.hs:7:25-26        114      1    0.0    0.0     0.0    0.0
             ==        Main          Main_1.hs:4:25-26        113      1    0.0    0.0     0.0    0.0
             showsPrec Main          Main_1.hs:7:19-22        112      2    0.0    1.2     0.0    1.2
             showsPrec Main          Main_1.hs:4:19-22        111      2    0.0    0.9     0.0    0.9
             readPrec  Main          Main_1.hs:7:13-16        110      0    0.0   18.8     0.0   18.8
             readPrec  Main          Main_1.hs:4:13-16        108      0    0.0   19.9     0.0   19.9
           CAF         Text.Read.Lex <entire-module>          102      0    0.0    0.5     0.0    0.5
      
      To fix failing test cases because of different orderings of cost centres
      (e.g. optimized and non-optimized build printing in different order),
      with this patch we also start sorting cost centres before printing. The
      order depends on 1) entries (more entered cost centres come first) 2)
      names (using strcmp() on cost centre names).
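
The ordering rule can be expressed compactly. The following is a Python sketch of the comparison only, not the RTS code (which is written in C); the dictionary layout and the sample data are made up for illustration.

```python
def sort_cost_centres(cost_centres):
    """Order by entries (descending), then by name (ascending, bytewise)."""
    # Encoding the name makes the tie-break a byte comparison, in the
    # spirit of strcmp(); negating entries puts larger counts first.
    return sorted(cost_centres,
                  key=lambda cc: (-cc["entries"], cc["name"].encode("utf-8")))

# Made-up data, not taken from a real .prof file.
sample = [
    {"name": "readPrec", "entries": 0},
    {"name": "showsPrec", "entries": 2},
    {"name": "==", "entries": 1},
    {"name": "main", "entries": 1},
]
ordered = [cc["name"] for cc in sort_cost_centres(sample)]
```

Because the key depends only on the entry count and the name, the result does not depend on the order in which the cost centres were produced.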
      
      Reviewers: simonmar, austin, erikd, bgamari
      
      Reviewed By: simonmar, bgamari
      
      Subscribers: thomie
      
      Differential Revision: https://phabricator.haskell.org/D2282
      
      GHC Trac Issues: #11543, #8473, #7105
  16. 07 Jun, 2016 1 commit
  17. 25 May, 2016 1 commit
  18. 24 May, 2016 1 commit
  19. 17 May, 2016 2 commits
  20. 30 Apr, 2016 1 commit
  21. 28 Apr, 2016 1 commit
  22. 25 Mar, 2016 1 commit
  23. 29 Feb, 2016 1 commit
    • Testsuite: check actual_prof_file only when needed · e3b9dbf4
      thomie authored
      Might be a little faster. Avoids testing for #6113 (.prof file not
      written when process is killed with any signal but SIGINT) for tests
      that don't have a .prof.sample file (which is almost all of them) when
      running the profiling ways.
      Tests that were failing because of #6113: T8089, overflow1, overflow2 and
      overflow3.
  24. 25 Feb, 2016 1 commit
  25. 23 Feb, 2016 1 commit