Another point for Haskell
In Round 1 I described the task: find the number of “Titles” in an HTML file. I started with the Python implementation, and wrote this test:
def test_finds_the_correct_number_of_titles():
    assert title_count() == 59
Very simple: I had already downloaded the web page, so this function had to do just two things: (1) read in the file, and then (2) parse the HTML and count the Title sections.
Not so simple in Haskell
Anyone who’s done Haskell might be able to guess at the problem I was about to have: The very same test, written in Haskell, wouldn’t compile!
it "finds the correct number of titles" $ titleCount `shouldBe` 59
There was no simple way to write one function that both reads the file and parses it, and still returns a plain integer. I realized that the blocker was Haskell’s famous separation between pure and impure code. Reading a file is I/O, which is impure, and anything combined with it becomes impure too. My function’s return value would be locked in an I/O wrapper: its type would be IO Int rather than Int, so the test could not compare it directly with 59.
I got frustrated and thought about dumping Haskell. “Just another example of how it’s too difficult for practical work,” I thought. But then I wondered how hard it would be to read in the file as a fixture in the test, and then call the function. I’d just need to pass the HTML as a parameter. And yep, this worked:
it "finds the correct number of titles" $ do
  html <- readFile "nrs.html"
  titleCount html `shouldBe` 59
As I refactored the code to pass this test, I realized that this is much better: I/O and algorithmic work should always be kept separate. I had been a little sloppy or lazy in setting up my first task. The app with the Haskell-inspired change will be more reliable and easier to test, regardless of which language it ends up being written in.
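For comparison, here is a rough sketch of what the same separation could look like on the Python side. It is not the code from this project: the helper names `_TitleCounter` and `title_count_in_file` are made up for illustration, and I am assuming a Title is marked up as `<h2 class="title">`, which may not match the real structure of nrs.html.

```python
from html.parser import HTMLParser


class _TitleCounter(HTMLParser):
    """Counts start tags that look like a Title heading.

    Assumption: a Title is an <h2> whose class list contains "title".
    The real markup of nrs.html may be different.
    """

    def __init__(self):
        super().__init__()
        self.count = 0

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; value may be None.
        classes = (dict(attrs).get("class") or "").split()
        if tag == "h2" and "title" in classes:
            self.count += 1


def title_count(html: str) -> int:
    """Pure part: takes HTML text and returns the number of Titles."""
    counter = _TitleCounter()
    counter.feed(html)
    counter.close()
    return counter.count


def title_count_in_file(path: str) -> int:
    """Impure part: a thin wrapper that only does the file I/O."""
    with open(path, encoding="utf-8") as f:
        return title_count(f.read())
```

With this split, most tests can pass a small literal string to `title_count` and never touch the disk. Only the one-line wrapper needs a real file.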
- Why learning Haskell makes you a better programmer, blog post and Hacker News discussion.
- r/Haskell discussion and disagreement about this post.
- Python vs. Haskell round 1: Test Output.
If that already improves your design, QuickCheck will blow you away.
(The closest Python equivalent is called Hypothesis. Also highly recommended.)
I’m really enjoying this series. At the end of it, of course, one of the two languages may have more ‘points’, but not all points are equal.