Memory Usage Profiling With eventlog2html and ghc-debug

Abstract

Understanding and analysing the memory usage of Haskell programs is a notoriously difficult yet important problem. Recent improvements to GHC's profiling capabilities, along with better tooling, have made it much easier to deeply and precisely analyse the memory usage characteristics of even large Haskell programs.

This workshop presents two such tools, which together allow high-level and low-level memory usage analysis of Haskell programs: eventlog2html and ghc-debug. We will learn how to set up and use eventlog2html to generate high-level visuals and statistics of our program's execution. We will also learn how to set up and use ghc-debug to precisely and programmatically explore our program's low-level memory usage profile.

We will examine these tools by using them on several pre-prepared Haskell programs. The workshop aims to be beneficial to Haskell programmers of all levels. Beginner Haskell programmers can expect to gain a deeper understanding of lazy evaluation and the impacts it can have on program performance. Experienced Haskell programmers can expect to gain an understanding of exactly what these tools have to offer and the skills necessary to use these tools on their own Haskell programs.

Before The Workshop

Make sure you have the tools installed and built. You need to use GHC 9.2.4 or greater. After that, in the root of this repository, run:

cabal build all
cabal install eventlog2html
cabal install ghc-debug-brick

And everything should be ready to go.

Workshop Outline

Goal

The primary goal of this workshop is for participants to gain experience and familiarity with the eventlog2html and ghc-debug memory profiling tools.

Prerequisites: Lazy Evaluation, Normal Forms, etc.

A crucial step in the pursuit of understanding the memory usage of Haskell programs is understanding Haskell's semantics as a lazy programming language. While thorough coverage of such semantics is outside the scope of this workshop, I do hope that much of what we cover will be approachable and enlightening to Haskell beginners and experts alike.
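As a small illustration of why evaluation order matters for memory, the sketch below contrasts a lazy and a strict left fold and shows what weak head normal form (WHNF) does and does not evaluate. The names sumLazy and sumStrict are hypothetical and are not part of the workshop code. The thunk build-up described in the comments is what lazy semantics allow; GHC's optimiser may remove it when compiling with -O.

```haskell
import Data.List (foldl')

-- | Lazy left fold: the accumulator is not forced while folding, so it can
-- grow into a chain of (+) thunks as long as the input list.
sumLazy :: [Int] -> Int
sumLazy = foldl (+) 0

-- | Strict left fold: the accumulator is forced to weak head normal form
-- at every step, so no chain of thunks builds up.
sumStrict :: [Int] -> Int
sumStrict = foldl' (+) 0

main :: IO ()
main = do
  -- WHNF only evaluates as far as the outermost constructor. The pair below
  -- is already in WHNF even though its first component is undefined.
  let pair = (undefined :: Int, 2 :: Int)
  pair `seq` putStrLn "pair is in WHNF; its components are still unevaluated"
  print (snd pair)
  -- Both folds return the same result; they differ only in how much
  -- unevaluated work may sit on the heap along the way.
  print (sumLazy [1 .. 100000])
  print (sumStrict [1 .. 100000])
```

Both functions compute the same sum, which is the point: space problems caused by laziness usually do not change a program's results, only its heap profile.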

A First Look at ghc-debug

The ghc-debug style of debugging is, like Haskell, somewhat unique. In this style, we have a debuggee and a debugger. The debuggee is the application whose heap profile we would like to analyse. The debugger is the application which will actually execute the analysis.

Communication between the debuggee and debugger happens over a socket, where the debuggee simply responds to requests sent by the debugger. Crucially, it is incredibly simple to turn a Haskell application into a debuggee for analysis using ghc-debug debuggers, as we will see later.

With the above in mind, we can introduce ghc-debug as a set of libraries and tools:

  • ghc-debug-stub: A library containing the functions you should include in your program to perform analysis with ghc-debug debuggers.
  • ghc-debug-client: A library containing useful functions for writing your own heap analysis scripts.
  • ghc-debug-brick: An executable terminal user interface application that can connect to any debuggee.

These aren't all of the packages involved, but they are the big three that we care about as users of ghc-debug.

To get started in the workshop, we will be examining the example heap-shapes application as a debuggee using ghc-debug-brick as our debugger. This will serve as an introduction to the ghc-debug style of debugging, and it will cover some examples of evaluation scenarios that will be important later in the workshop.

The Haskell Is Obviously Better at Everything (HIOBE) Index

The HIOBE Index server (in hiobe-index/server) is the application we would like to profile with the eventlog2html and ghc-debug tools. It is a simple scotty web server application that serves data from a sqlite database on various endpoints. The database is already populated with over 70000 rows. We will generate fake traffic for the application which will cause interesting objects to build up on the heap.

For a full description of the HIOBE Index, see its README.

We will spend the rest of the workshop analysing, understanding, and tuning the memory profile of the HIOBE Index server.

If you want to give it a try, run the server with:

cabal run hiobe-server

You should see the classic scotty "Setting phasers to stun..." output if everything is okay.

Then run the traffic with:

cabal run hiobe-traffic

Some output should start scrolling by reporting various requests to the server.

Trying the -s flag

We know this program has bad space behavior, because I wrote it that way on purpose. However, we don't know how bad it is. We'll try to get a very high-level view of its profile by using the -s RTS flag, which prints memory usage statistics on program termination. This is usually a great place to start when profiling a Haskell program's space usage.

In our case, we will find that there's nothing really obviously wrong with the reported memory usage, other than it perhaps being a little high. If we run the application for longer or shorter periods of time, however, the reported memory usage remains the same!

Using eventlog2html

But, as I said above, we know this program has space issues. And if we actually didn't know yet, we would find out when we tried to run the HIOBE server in production.

We can dig deeper by having our program emit an eventlog using the -l RTS option. However, to make the eventlog useful for eventlog2html, we need to supply another flag that enables heap profiling! To start, we'll use the -hT flag to tell the RTS to break down the heap profile by closure type.

In the resulting profile, we see big spikes of allocations happening with ARR_WORDS, THUNK, and : closures.

[Figure: area chart]
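To give a feel for where two of these closure types come from, here is a minimal sketch that is not taken from the HIOBE server. The function lazyCells is a hypothetical name. Under lazy semantics, a retained list built with map holds one : closure per cell and, until each element is demanded, one THUNK per element. ARR_WORDS closures are raw byte arrays, which typically back values such as Text and ByteString; they are not demonstrated here.

```haskell
import Data.List (foldl')

-- | A list whose elements are left unevaluated until something demands them.
lazyCells :: Int -> [Int]
lazyCells n = map (* 2) [1 .. n]

main :: IO ()
main = do
  let xs = lazyCells 100000
  -- 'length' walks only the spine: every (:) cell is now allocated, while
  -- each element is still an unevaluated application of (* 2).
  print (length xs)
  -- Summing forces every element, so the thunks are replaced by evaluated
  -- Ints. The list stays live because it is used again below.
  print (foldl' (+) 0 xs)
  print (take 3 xs)
```

In a -hT profile of a program shaped like this, one would expect the : band to appear when the spine is built and the THUNK band to shrink once the elements are forced, though the exact picture depends on optimisation level and garbage collection timing.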

For the rest of the workshop, we will use eventlog2html and ghc-debug to answer some very precise questions about the HIOBE server's memory profile.