Optimise writing out the .s file
I noticed while working on the new IO library that GHC was writing out the .s file in lots of little chunks. This turns out to be a result of using multiple printDocs to avoid space leaks in the NCG, where each printDoc finishes with an hFlush. What's worse, this makes poor use of the optimisation inside printDoc that uses its own buffering to avoid hitting the Handle all the time. So I hacked around this by making the buffering optimisation inside Pretty visible from the outside, for use in the NCG. The changes are quite small.
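
For illustration only, here is a minimal sketch of the general idea using plain System.IO. It is not the actual Pretty or NCG code: writeChunkFlushing and writeChunksBuffered are hypothetical names, and the real change exposes printDoc's internal buffering rather than relying on Handle buffering as this sketch does.

```haskell
import System.IO

-- Hypothetical stand-in for the old behaviour: every chunk is
-- written and then flushed, so the file is written in many small pieces.
writeChunkFlushing :: Handle -> String -> IO ()
writeChunkFlushing h s = hPutStr h s >> hFlush h

-- Hypothetical stand-in for the new behaviour: chunks are still
-- produced one at a time (so nothing forces them all to be held in
-- memory at once), but they share one buffer and are flushed once.
writeChunksBuffered :: Handle -> [String] -> IO ()
writeChunksBuffered h chunks = do
  hSetBuffering h (BlockBuffering Nothing)  -- implementation-chosen block size
  mapM_ (hPutStr h) chunks
  hFlush h

main :: IO ()
main =
  withFile "out.s" WriteMode $ \h ->
    writeChunksBuffered h ["\t.text\n", "\t.globl main\n", "main:\n", "\tret\n"]
```

Both functions produce the same file contents; the difference is only in how often the data is flushed to the underlying file.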