FP design with language-ext #515

Closed
andyigreg opened this issue Oct 29, 2018 · 15 comments
@andyigreg

This is more a general design question than an issue with language-ext (which is fantastic), but I think my mental block over this issue is stopping me getting the most out of the library. So apologies if this is in the wrong place.

Basically I have been using the library for a couple of years, adding in more and more functional techniques as I learn more, and I have read the book by Enrico. But I frequently run into this clash: FP design principles say push the impure code to the extremities. And I try to do this. But I often encounter processes where the business logic will calculate something and then I need to conditionally retrieve data based on that something before passing that result to another set of pure functions.

This means I end up with more of a sandwich structure: impure-pure-impure-pure-impure which just feels wrong.

One solution is to try to load everything up front, anticipating what the dependencies might be. Even should this be possible it doesn't seem very efficient from a data access perspective.

Another is to split into multiple processes, but I'm struggling to see how to structure this given that one end of the process is UI and the other is data access. The middle bit is an internal decision. Maybe some event based structure?

Sorry if the question is too vague.

@TysonMN
Contributor

TysonMN commented Oct 29, 2018

What logic is so special that it gets decorated with the adjective "business"? I sometimes use the phrase "business logic" (correctly I think), but I don't know a good definition for it.

@andyigreg
Author

I use it to refer to any logic that encodes business rules, as opposed to logic that is used to construct a framework, or logic that is used for integrations, for example. My main use of FP is for encoding the rules of a system as imposed by business requirements. At the moment the 'superstructure' of the application is still based around DI, and integrations are achieved by following the API you are provided with, and those APIs are imperative and often OO.

I don't know if that is how other people use the term.

@andyigreg
Author

andyigreg commented Oct 29, 2018

Maybe I should rephrase the question to avoid any terminology issues.

How do we push impure functions to the extremities of an application when parts of the code in the application require intermediate data, such that you end up with a sandwich structure such as impure-pure-impure-pure-impure? How do we avoid this structure?

@TysonMN
Contributor

TysonMN commented Oct 29, 2018

...any logic that encodes business rules.

I don't know a good definition for "business rule" either.

@TysonMN
Contributor

TysonMN commented Oct 29, 2018

This means I end up with more of a sandwich structure: impure-pure-impure-pure-impure which just feels wrong.

By definition, a pure function cannot have compile-time dependencies on any impure function. So by this "sandwich structure", you must mean that the code being executed at runtime alternates between pure and impure, right?

@andyigreg
Author

...any logic that encodes business rules.

I don't know a good definition for "business rule" either.

I don't know a good definition for lots of things. It doesn't stop me recognising them when I see them!! But I think the terminology is entirely beside the point.

@andyigreg
Author

I mean a situation such as a web request. The request entry point must be impure because it encodes all subsequent behaviours both pure and impure. Let's say within that entry point we take data input (impure), and then process that data in some way (pure), use the result of the process to look up some intermediate data (impure), then calculate the final values and construct a model say (pure), then write some values from that model back to the db and output them to the web page (impure). How do we push all the impure parts of that chain to the extremities?

@louthy
Owner

louthy commented Oct 29, 2018

@andyigreg I have talked about this before with Free Monads and you can see the working in the AccountingDSL sample and the BankingApp sample.

That is the ultimate in total separation as you're creating a DSL to describe the behaviour and then building an interpreter to do the messy stuff.

It isn't particularly easy to build that stuff in C# though. The alternative is to build a domain specific monad. If you spend any time using Haskell then you'll see the use of bespoke monads for a lot of stuff. Monads are built to provide structure and to hide all the messy real world stuff.

For example, I'm working on a new language for an internal project which has a tokeniser, parser, type-inference, and code-gen. Each stage has its own monad (well, it will do when I'm finished).

  • Parser monad (using the one built into LanguageExt.Parsec)
  • Compiler monad (which carries state and is the 'master' monad for all the others)
  • Infer monad which does the type-inference, but also keeps tabs on constraints and named generic arguments
  • CodeGen monad which deals with generating the compiled result

Each one does a specific job and hides the complexity. But also manages stuff like IO and carrying of state through the process. The result is that I have functions like this:

static Compiler<Unit> compileProject(Seq<string> paths) =>
    from _1 in parseFiles(paths)
    from _2 in parseIncludedFiles
    from _3 in rename
    from _4 in envInit
    from _5 in typeCheck
    from _6 in codeGen
    select unit;

Obviously each step is doing a significant amount of work, but what's happening here is that the source files are being loaded from disk, tokenised and parsed, and then any included files are parsed, a renamer runs, then the core types environment is initialised before running the type-checker, and then the code-generator runs. The monad is managing IO, state, error handling, etc. And the end result is pure, declarative, and abstracted so you can see the important stuff. It's pretty damn readable for something so complex.

When you think of monads you should think of them as having two distinct concepts:

  1. The bound value - this is the actual value held inside the monad. So the int in Option<int>
  2. The container - this is the monad itself, the rules that make Option different from Parser, or Lst different from CodeGen.

If we take a look at a regular function:

    a -> b 

The arrow represents a function (or morphism) from type a to b. Then we can compose it with a function b -> c:

    a -> b -> c

Which gives us a function:

    a -> c

This can be thought of as:

    B f(A a);
    C g(B b);
    
    C h(A a) => g(f(a));

Now that's all great and everything. But we might want to do some logging, or pass through some external state (which means adding lots of additional state arguments to our functions), or do some IO. None of which plays particularly nicely with our lovely simple function composition.

This is where monads come in. They can be seen as an embellishment to the composed operation on the bound values. The rules of the monad and the implementation of the Bind function are what allow monads to compose.

But, and importantly: because the monad container part of it should be seen as separate from the bound value operations, the container bit can do work which is considered impure, without the bound value operation losing its purity.

Now that might sound a bit too convenient a get-out-of-jail-free card, and in some ways it is. But really, it doesn't matter. What you want to do is bury your IO in the monad and then get over it. A Haskell programmer doesn't think of the getChar function as pure; they don't think they will get the same Char back every time they call it. They absolutely think of it as getting a value from the world each time.

We can do that, but we have to take into account the consequences of that:

  • If getChar could return a different value each time, then how can we test it?
  • How are we going to handle IO errors?
  • Will we get race conditions?

There are three built-in lang-ext monads that you can use for inspiration here.

  • Reader which takes an environment (think of it as a snapshot of the world)
  • Writer which as well as the bound value builds a log of values
  • State which manages a state value along with the bound value

They are all pure (in every sense), but you can expand on them to add some IO.

So, let's start with the Reader, as that's a good way to get information into the operation:

    public delegate A Reader<Env, A>(Env env);

So, that defines the reader as a delegate. It takes an Env and returns an A. To make it a monad we need to define Return and Bind.

public static class Reader
{
    public static Reader<Env, A> Return<Env, A>(A value) => 
        _ => value;

    public static Reader<Env, B> Bind<Env, A, B>(this Reader<Env, A> ma, Func<A, Reader<Env, B>> f) =>
        env =>
            f(ma(env))(env);
}

Once you have defined the Bind and Return functions you can very easily make it work with LINQ:

public static class Reader
{
    public static Reader<Env, B> Select<Env, A, B>(this Reader<Env, A> ma, Func<A, B> f) =>
        ma.Bind(a => Return<Env, B>(f(a)));

    public static Reader<Env, C> SelectMany<Env, A, B, C>(
        this Reader<Env, A> ma,
        Func<A, Reader<Env, B>> bind,
        Func<A, B, C> project) =>
            ma.Bind(a => bind(a).Select(b => project(a, b)));
}

If you look carefully, Select and SelectMany are derived from Return and Bind, so you can almost copy n paste them wherever you need them; just change the names from Reader to your bespoke monad type.

So, all the magic is in Bind, you can see it runs the Reader by passing an Env through the delegate. At no point does the Env value change, it's just a static piece of environment that's passed through.
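To see the environment being threaded through without changing, here is a minimal stand-alone sketch of the Reader above that doesn't reference LanguageExt. The Config class, the Demo class, and the namespace are made up for illustration; only Return, Bind, Select and SelectMany mirror the definitions above.

```csharp
using System;

namespace ReaderSketch
{
    // Same shape as the delegate above: a function from an environment to a value.
    public delegate A Reader<Env, A>(Env env);

    public static class Reader
    {
        public static Reader<Env, A> Return<Env, A>(A value) =>
            _ => value;

        // Runs ma with the environment, then runs the next reader with the SAME environment.
        public static Reader<Env, B> Bind<Env, A, B>(this Reader<Env, A> ma, Func<A, Reader<Env, B>> f) =>
            env => f(ma(env))(env);

        public static Reader<Env, B> Select<Env, A, B>(this Reader<Env, A> ma, Func<A, B> f) =>
            ma.Bind(a => Return<Env, B>(f(a)));

        public static Reader<Env, C> SelectMany<Env, A, B, C>(
            this Reader<Env, A> ma,
            Func<A, Reader<Env, B>> bind,
            Func<A, B, C> project) =>
                ma.Bind(a => bind(a).Select(b => project(a, b)));
    }

    // Hypothetical environment, used only for this illustration.
    public sealed class Config
    {
        public readonly string Greeting;
        public readonly string Name;

        public Config(string greeting, string name)
        {
            Greeting = greeting;
            Name = name;
        }
    }

    public static class Demo
    {
        // Two small readers that each pick one field out of the environment.
        static Reader<Config, string> Greeting => cfg => cfg.Greeting;
        static Reader<Config, string> Name => cfg => cfg.Name;

        // Composed with LINQ; neither step mentions Config explicitly.
        public static Reader<Config, string> Message =>
            from g in Greeting
            from n in Name
            select $"{g}, {n}!";
    }
}
```

Nothing happens until the composed reader is invoked with an environment, for example Demo.Message(new Config("Hello", "World")), and both steps see that same Config instance.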

Let's take a look at a concrete example.

    public static class Test
    {
        public static Unit AddLineNumbers(string fileName)
        {
            var lines = File.ReadAllLines(fileName);
            var nlines = AddLineNumbers(lines);
            File.WriteAllLines(fileName, nlines);
            return unit;
        }

        static string[] AddLineNumbers(string[] lines) =>
            lines.Zip(Naturals)
                 .Select(pair => $"{pair.Item2}: {pair.Item1}")
                 .ToArray();

        static IEnumerable<int> Naturals =>
            Enumerable.Range(1, Int32.MaxValue);
    }

I realise this is just an example of IO at the edges, but it doesn't really matter for this example.

We want to try and make that pure, so that the IO is abstracted.

So, let's create an environment for the Reader:

public class World
{
    public readonly Func<string, string[]> ReadAllLines;
    public readonly Func<string, string[], Unit> WriteAllLines;

    public World(Func<string, string[]> readAllLines, Func<string, string[], Unit> writeAllLines)
    {
        ReadAllLines = readAllLines;
        WriteAllLines = writeAllLines;
    }
}

Notice how it captures the two IO functions in the original.

The monad needs to be able to get at its environment, so let's add that:

public static class Reader
{
    public static Reader<Env, Env> Ask<Env>() => 
        env => env;
}

This is so simple, it takes the environment that was in the structure of the monad and makes it into the bound value.

So, now we can add a couple of bespoke functions for ReadAllLines and WriteAllLines:

public static class Reader
{
    public static Reader<World, string[]> ReadAllLines(string fileName) =>
        from env in Ask<World>()
        select env.ReadAllLines(fileName);

    public static Reader<World, Unit> WriteAllLines(string fileName, string[] lines) =>
        from env in Ask<World>()
        select env.WriteAllLines(fileName, lines);
}

So, now rewrite the Test class:

public static class Test
{
    public static Reader<World, Unit> AddLineNumbers(string fileName) =>
        from lines in Reader.ReadAllLines(fileName)
        from _     in Reader.WriteAllLines(fileName, AddLineNumbers(lines))
        select _;

    static string[] AddLineNumbers(string[] lines) =>
        lines.Zip(Naturals)
             .Select(pair => $"{pair.Item2}: {pair.Item1}")
             .ToArray();

    static IEnumerable<int> Naturals =>
        Enumerable.Range(1, Int32.MaxValue);
}

And now that will call the injected functions without you having to pass them through explicitly. The reader is called like so:

var world = new World(
    File.ReadAllLines,
    fun<string, string[]>(File.WriteAllLines));

var result = Test.AddLineNumbers("c:\\temp\\test1.txt")(world);

The use of fun is to deal with the fact that File.WriteAllLines returns void. This makes it return a Unit.
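Because the IO now lives in the World, the same computation can be run against a fake one, which goes some way towards the "how can we test it?" question. The following is a rough stand-alone sketch under a few assumptions: Unit is a hand-rolled stand-in for the LanguageExt type, the Files and MockDemo classes and the dictionary-backed world are invented for the example, and the numbering helper uses the indexed LINQ Select rather than Zip.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

namespace ReaderMockSketch
{
    // Minimal stand-in for LanguageExt's Unit so the sketch has no dependencies.
    public struct Unit { public static readonly Unit Default = new Unit(); }

    public delegate A Reader<Env, A>(Env env);

    public static class Reader
    {
        public static Reader<Env, A> Return<Env, A>(A value) => _ => value;
        public static Reader<Env, Env> Ask<Env>() => env => env;

        public static Reader<Env, B> Bind<Env, A, B>(this Reader<Env, A> ma, Func<A, Reader<Env, B>> f) =>
            env => f(ma(env))(env);

        public static Reader<Env, B> Select<Env, A, B>(this Reader<Env, A> ma, Func<A, B> f) =>
            ma.Bind(a => Return<Env, B>(f(a)));

        public static Reader<Env, C> SelectMany<Env, A, B, C>(
            this Reader<Env, A> ma, Func<A, Reader<Env, B>> bind, Func<A, B, C> project) =>
                ma.Bind(a => bind(a).Select(b => project(a, b)));
    }

    public class World
    {
        public readonly Func<string, string[]> ReadAllLines;
        public readonly Func<string, string[], Unit> WriteAllLines;

        public World(Func<string, string[]> readAllLines, Func<string, string[], Unit> writeAllLines)
        {
            ReadAllLines = readAllLines;
            WriteAllLines = writeAllLines;
        }
    }

    public static class Files
    {
        public static Reader<World, string[]> ReadAllLines(string fileName) =>
            from env in Reader.Ask<World>()
            select env.ReadAllLines(fileName);

        public static Reader<World, Unit> WriteAllLines(string fileName, string[] lines) =>
            from env in Reader.Ask<World>()
            select env.WriteAllLines(fileName, lines);

        public static Reader<World, Unit> AddLineNumbers(string fileName) =>
            from lines in ReadAllLines(fileName)
            from _ in WriteAllLines(fileName, Number(lines))
            select _;

        // Pure helper: prefixes each line with its 1-based index.
        static string[] Number(string[] lines) =>
            lines.Select((line, i) => $"{i + 1}: {line}").ToArray();
    }

    public static class MockDemo
    {
        // Builds a World backed by a dictionary instead of the file system.
        public static World InMemoryWorld(Dictionary<string, string[]> files) =>
            new World(
                name => files[name],
                (name, lines) => { files[name] = lines; return Unit.Default; });
    }
}
```

Running Files.AddLineNumbers("a.txt")(MockDemo.InMemoryWorld(files)) rewrites the dictionary entry and never touches the disk, so the pure and impure parts can be exercised together in a unit test.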

State

But what about if we want to abstract away from files? And we want to specify a context of some sort to read from? That requires us being able to set some state as well as just read some. So, we'll need to update the monad delegate:

public delegate (S, A) State<S, A>(S state);

It looks similar, but instead of returning an A it returns an (S, A).

Let's also update our World to hold a string container value:

public class World
{
    public readonly Func<string, string[]> ReadAllLines;
    public readonly Func<string, string[], Unit> WriteAllLines;
    public readonly string Container;

    public World(Func<string, string[]> readAllLines, Func<string, string[], Unit> writeAllLines, string container)
    {
        ReadAllLines = readAllLines;
        WriteAllLines = writeAllLines;
        Container = container;
    }

    public World SetContainer(string container) =>
        With(Container: container);

    public World With(
        Func<string, string[]> ReadAllLines = null,
        Func<string, string[], Unit> WriteAllLines = null,
        string Container = null) =>
        new World(
            ReadAllLines ?? this.ReadAllLines,
            WriteAllLines ?? this.WriteAllLines,
            Container ?? this.Container);
}

And create a new monad to work with the new State delegate:

public static class State
{
    public static State<S, A> Return<S, A>(A value) =>
        state => (state, value);

    public static State<S, B> Bind<S, A, B>(this State<S, A> ma, Func<A, State<S, B>> f) =>
        state =>
        {
            var (sa, a) = ma(state);
            return f(a)(sa);
        };
}

Notice how Return now returns a pair of state and value, and how Bind now extracts the state and bound value from calling ma and passes the updated state value sa on to the result of calling the bind function f. This propagates the state value throughout the computation.

Then we copy n paste in our boilerplate LINQ stuff:

public static State<S, B> Select<S, A, B>(this State<S, A> ma, Func<A, B> f) =>
    ma.Bind(a => Return<S, B>(f(a)));

public static State<S, C> SelectMany<S, A, B, C>(
    this State<S, A> ma,
    Func<A, State<S, B>> bind,
    Func<A, B, C> project) =>
        ma.Bind(a => bind(a).Select(b => project(a, b)));

Instead of Ask we will have Get as well as a new function called Put that will put any state value back into the monad structure:

public static class State
{
    public static State<S, S> Get<S>() =>
        state => (state, state);

    public static State<S, Unit> Put<S>(S state) =>
        _ => (state, unit);
}
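As a small illustration of how Get and Put thread state through Bind, here is a stand-alone sketch that uses an int counter as the state instead of the World. The Counter class and the Unit stand-in are assumptions made for the example; the monad functions follow the definitions above.

```csharp
using System;

namespace StateSketch
{
    // Minimal stand-in for LanguageExt's Unit so the sketch has no dependencies.
    public struct Unit { public static readonly Unit Default = new Unit(); }

    public delegate (S, A) State<S, A>(S state);

    public static class State
    {
        public static State<S, A> Return<S, A>(A value) =>
            state => (state, value);

        // Runs ma, then feeds its resulting state into the next computation.
        public static State<S, B> Bind<S, A, B>(this State<S, A> ma, Func<A, State<S, B>> f) =>
            state =>
            {
                var (sa, a) = ma(state);
                return f(a)(sa);
            };

        public static State<S, B> Select<S, A, B>(this State<S, A> ma, Func<A, B> f) =>
            ma.Bind(a => Return<S, B>(f(a)));

        public static State<S, C> SelectMany<S, A, B, C>(
            this State<S, A> ma, Func<A, State<S, B>> bind, Func<A, B, C> project) =>
                ma.Bind(a => bind(a).Select(b => project(a, b)));

        public static State<S, S> Get<S>() =>
            state => (state, state);

        public static State<S, Unit> Put<S>(S state) =>
            _ => (state, Unit.Default);
    }

    public static class Counter
    {
        // Yields the current counter value and stores the incremented one.
        public static State<int, int> Next =>
            from n in State.Get<int>()
            from _ in State.Put(n + 1)
            select n;

        // Each use of Next sees the state left behind by the previous one.
        public static State<int, string> ThreeLabels =>
            from a in Next
            from b in Next
            from c in Next
            select $"{a},{b},{c}";
    }
}
```

Calling Counter.ThreeLabels(10) returns the pair (13, "10,11,12"): the final state alongside the bound value, with no mutable variable anywhere in the code.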

Now we'll add some functions to make it easier to get and set the World and the container:

public static State<World, World> World =>
    Get<World>();

public static State<World, string> Container =>
    from w in World
    select w.Container;

public static State<World, Unit> SetContainer(string container) =>
    from w in World
    from _ in Put(w.SetContainer(container))
    select _;

Then let's update ReadAllLines and WriteAllLines to be file unaware.

public static State<World, string[]> ReadAllLines =>
    from w in World
    from c in Container
    select w.ReadAllLines(c);

public static State<World, Unit> WriteAllLines(string[] lines) =>
    from w in World
    from c in Container
    select w.WriteAllLines(c, lines);

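Now the AddLineNumbers function can look like this (the pure helper is called DoAddLineNumbers here, presumably renamed so that it doesn't clash with the AddLineNumbers property):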

public static State<World, Unit> AddLineNumbers =>
    from lines in State.ReadAllLines
    from _     in State.WriteAllLines(DoAddLineNumbers(lines))
    select _;

And so all the messiness of files and the outside world is now encapsulated within the monad itself.

You can call this computation with this:

var world = new World(
    File.ReadAllLines,
    fun<string, string[]>(File.WriteAllLines),
    "");

var comp = from _1 in State.SetContainer("c:\\temp\\test1.txt")
           from _2 in Test2.AddLineNumbers
           select unit;

var result = comp(world);

But equally you can call it with mocked IO and container details. And so that allows you to build something that does apparently interleaved IO without having to go crazy by building a Free Monad.

Error handling

But, we can take it further. What about error reporting? It will be difficult to make this work with Option, etc. (well, not difficult, just slightly awkward). And we also might have IO exceptions. So, let's deal with that.

First, let's create an Error type:

public class Error : NewType<Error, string>
{
    public Error(string value) : base(value)
    {
    }
}

Next, let's update the delegate.

public delegate Either<Error, (S, A)> State<S, A>(S state);

And so now it returns either an Error or a (S, A) pair. We could have used Try here, but I just wanted to show some bespoke error behaviour to really highlight the idea that you are building a bespoke monad for your own domain.

So, we'll need to update Bind to understand this new return type:

public static State<S, B> Bind<S, A, B>(this State<S, A> ma, Func<A, State<S, B>> f) =>
    state =>
    {
        try
        {
            return ma(state).Bind(pairA => f(pairA.Item2)(pairA.Item1));
        }
        catch(Exception e)
        {
            return Error.New(e.Message);
        }
    };

Notice how it catches exceptions; this is built into every call of the computation bar one: the initial invocation. So, let's have a Run function to capture that:

public static Either<Error, (S, A)> Run<S, A>(this State<S, A> ma, S state)
{
    try
    {
        return ma(state);
    }
    catch (Exception e)
    {
        return Left(Error.New(e.Message));
    }
}

You could make this a Match function so you can call ma.Match(state, ...) instead of ma.Run(state).Match(...).
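One possible shape for such a Match is sketched below. It is a stand-alone approximation rather than the LanguageExt API: Result is a hypothetical stand-in for Either<Error, (S, A)> that carries a plain string error, Quit is a generic short-circuiting helper, and SafeDivide, Describe and the succ/fail parameter names are invented for the example.

```csharp
using System;

namespace StateErrorSketch
{
    // Hypothetical stand-in for Either<Error, (S, A)>: an error message or a (state, value) pair.
    public sealed class Result<S, A>
    {
        public readonly string Error;   // null when the computation succeeded
        public readonly S State;
        public readonly A Value;

        Result(string error, S state, A value) { Error = error; State = state; Value = value; }

        public bool IsError => Error != null;
        public static Result<S, A> Ok(S state, A value) => new Result<S, A>(null, state, value);
        public static Result<S, A> Fail(string error) => new Result<S, A>(error, default(S), default(A));
    }

    public delegate Result<S, A> State<S, A>(S state);

    public static class State
    {
        public static State<S, A> Return<S, A>(A value) => state => Result<S, A>.Ok(state, value);

        // Short-circuits: later steps never run because Bind stops at the first error.
        public static State<S, A> Quit<S, A>(string message) => _ => Result<S, A>.Fail(message);

        public static State<S, B> Bind<S, A, B>(this State<S, A> ma, Func<A, State<S, B>> f) =>
            state =>
            {
                try
                {
                    var ra = ma(state);
                    return ra.IsError ? Result<S, B>.Fail(ra.Error) : f(ra.Value)(ra.State);
                }
                catch (Exception e)
                {
                    return Result<S, B>.Fail(e.Message);
                }
            };

        public static State<S, B> Select<S, A, B>(this State<S, A> ma, Func<A, B> f) =>
            ma.Bind(a => Return<S, B>(f(a)));

        public static State<S, C> SelectMany<S, A, B, C>(
            this State<S, A> ma, Func<A, State<S, B>> bind, Func<A, B, C> project) =>
                ma.Bind(a => bind(a).Select(b => project(a, b)));

        // Runs the computation (guarding the initial invocation) and folds both outcomes into one value.
        public static R Match<S, A, R>(this State<S, A> ma, S state, Func<S, A, R> succ, Func<string, R> fail)
        {
            Result<S, A> result;
            try { result = ma(state); }
            catch (Exception e) { result = Result<S, A>.Fail(e.Message); }
            return result.IsError ? fail(result.Error) : succ(result.State, result.Value);
        }
    }

    public static class Demo
    {
        public static State<int, int> SafeDivide(int x, int y) =>
            y == 0
                ? State.Quit<int, int>("Divide by zero")
                : State.Return<int, int>(x / y);

        public static string Describe(int x, int y) =>
            (from q in SafeDivide(x, y)
             from r in State.Return<int, int>(q + 1)
             select r)
            .Match(0, (s, a) => $"ok: {a}", err => $"error: {err}");
    }
}
```

With this in place the caller handles both outcomes in one expression: Demo.Describe(10, 2) takes the success branch, while Demo.Describe(1, 0) takes the failure branch without the second step ever running.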

Short-cutting

If we also add a Quit function, we can use that to short-cut our whole computation:

public static State<World, A> Quit<A>(string message) => _ =>
    Left(Error.New(message));

Then we can update our Read and Write functions:

public static State<World, string[]> ReadAllLines =>
    from w in World
    from c in Container
    from r in c == ""
        ? State.Quit<string[]>("Container not set")
        : State.ReturnW<string[]>(w.ReadAllLines(c))
    select r;

public static State<World, Unit> WriteAllLines(string[] lines) =>
    from w in World
    from c in Container
    from r in c == ""
        ? State.Quit<Unit>("Container not set")
        : State.ReturnW<Unit>(w.WriteAllLines(c, lines))
    select r;

I also added ReturnW, which is the same as Return with the state type S fixed to World.

Logging

But why stop there? We could do some logging too. Let's add a log to the World:

public class World
{
    public readonly Func<string, string[]> ReadAllLines;
    public readonly Func<string, string[], Unit> WriteAllLines;
    public readonly string Container;
    public readonly Seq<string> Output;

    public World(Func<string, string[]> readAllLines, Func<string, string[], Unit> writeAllLines, string container, Seq<string> output)
    {
        ReadAllLines = readAllLines;
        WriteAllLines = writeAllLines;
        Container = container;
        Output = output;
    }

    public World Log(string message) =>
        With(Output: Output.Add(message));

    public World SetContainer(string container) =>
        With(Container: container);

    public World With(
        Func<string, string[]> ReadAllLines = null,
        Func<string, string[], Unit> WriteAllLines = null,
        string Container = null,
        Seq<string> Output = null) =>
        new World(
            ReadAllLines ?? this.ReadAllLines,
            WriteAllLines ?? this.WriteAllLines,
            Container ?? this.Container,
            Output ?? this.Output);
}

Add a Log function to State:

public static State<World, Unit> Log(string message) =>
    from w in World
    from _ in Put(w.Log(message))
    select _;

And then update the ReadAllLines and WriteAllLines functions:

public static State<World, string[]> ReadAllLines =>
    from w in World
    from c in Container
    from r in c == ""
        ? State.Quit<string[]>("Container not set")
        : State.ReturnW<string[]>(w.ReadAllLines(c))
    from _ in Log($"Read {r.Length} lines from container: {c}")
    select r;

public static State<World, Unit> WriteAllLines(string[] lines) =>
    from w in World
    from c in Container
    from r in c == ""
        ? State.Quit<Unit>("Container not set")
        : State.ReturnW<Unit>(w.WriteAllLines(c, lines))
    from _ in Log($"Wrote {lines.Length} lines to container: {c}")
    select r;
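To see the log accumulate without any of the file handling, here is a cut-down stand-alone sketch. It uses the simpler (S, A) version of the State delegate (no error handling), and a hypothetical World that holds nothing but the log as a string array; Unit, AddLogged and the namespace are likewise assumptions for the example.

```csharp
using System;
using System.Linq;

namespace StateLogSketch
{
    // Minimal stand-in for LanguageExt's Unit so the sketch has no dependencies.
    public struct Unit { public static readonly Unit Default = new Unit(); }

    public delegate (S, A) State<S, A>(S state);

    public static class State
    {
        public static State<S, A> Return<S, A>(A value) => state => (state, value);

        public static State<S, B> Bind<S, A, B>(this State<S, A> ma, Func<A, State<S, B>> f) =>
            state =>
            {
                var (sa, a) = ma(state);
                return f(a)(sa);
            };

        public static State<S, B> Select<S, A, B>(this State<S, A> ma, Func<A, B> f) =>
            ma.Bind(a => Return<S, B>(f(a)));

        public static State<S, C> SelectMany<S, A, B, C>(
            this State<S, A> ma, Func<A, State<S, B>> bind, Func<A, B, C> project) =>
                ma.Bind(a => bind(a).Select(b => project(a, b)));

        public static State<S, S> Get<S>() => state => (state, state);
        public static State<S, Unit> Put<S>(S state) => _ => (state, Unit.Default);
    }

    // Hypothetical cut-down World: only the log, copied on every write.
    public sealed class World
    {
        public readonly string[] Output;
        public World(string[] output) { Output = output; }

        public World Log(string message) =>
            new World(Output.Concat(new[] { message }).ToArray());
    }

    public static class Logging
    {
        public static State<World, Unit> Log(string message) =>
            from w in State.Get<World>()
            from _ in State.Put(w.Log(message))
            select _;

        // The log calls sit alongside the calculation; the caller never passes a logger in.
        public static State<World, int> AddLogged(int x, int y) =>
            from _1 in Log($"Adding {x} and {y}")
            from sum in State.Return<World, int>(x + y)
            from _2 in Log($"Result is {sum}")
            select sum;
    }
}
```

Running Logging.AddLogged(2, 3) against an empty World yields the sum together with a new World whose Output holds the two messages in order, and the World passed in is left untouched.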

Conclusion

And so that's it really, that's how to carry state, abstract away from IO, implement telemetry, but also write pure functions. If the fact that IO is still really happening mid-flight bothers you, then the best option is to preload, because even the Free Monad approach requires mid-flight IO.

@andyigreg
Author

Wow! Thank you so much. I’ll have to read that a few more times before it fully sinks in. Mid-flight IO doesn’t bother me, I just always had a niggle that it didn’t feel right. Right now I don’t think I’m doing it correctly, but once I’ve digested your response I think I’ll be back on the right track.

@louthy
Owner

louthy commented Oct 29, 2018

I think the main points are:

  • Functions and higher order functions are our building blocks
    • They should be pure
  • Functions should take data-types: product and sum-types as immutable values - i.e. records, tuples, and unions - as arguments, and they should also return immutable data-types.
    • Just to make this point clear. Functionality shouldn't be connected to data.
    • This can seem unnecessarily strict, so I think if you're going to use methods on data-types, they should only depend on data from within the type (that essentially makes them like a pure function that takes an argument of a data-type).
  • Pure functions compose better than all other mechanisms of composition - this is where functional programmers win.
  • Functors, monads, foldables, and monoids are the core tools in our composition toolbox. They capture complexity, allow us to build abstractions, and smooth over our primary job of function composition. You can imagine the code buried inside the Bind function of a monad or the Map function of a functor to be analogous to code buried in a base-class of an inheritance hierarchy. They're all there to reduce the amount of repeated effort and to capture common functionality, but where objects don't compose, functions, functors, and monads do.

It's restrictive compared to all the silliness you can do with C#. But it protects us from stupid mistakes and in the long run makes it easier for us to trust the code we've written, making it easier to write more complex applications.

@MrYossu
Contributor

MrYossu commented Nov 1, 2018

@louthy I am constantly amazed at how much time and effort you put into answering people's questions. I'm struggling with the same issues as @andyigreg and this has been really helpful. I'm still a long way off understanding this properly, but each of these posts gets me closer.

Any chance you could pull the code above into a complete sample? Would make it easier to play with.

Thanks again for all your efforts.

@andyigreg
Author

andyigreg commented Nov 3, 2018

@louthy So I’ve read this several times and played around with some code and I want to see if I’ve got the gist of what you were explaining with your examples.

It seems like your primary focus with FP is composability. Pure, honest functions are the gold standard of composition and so even when our functions are not pure we should attempt to construct them in such a way so that they compose as if they were. Functors, monoids and monads are a way to do this.

By having some class that represents state (world in your example), and a monad to work with that state, we can build code that can pretend that it’s pure even though impure operations are going on in the background.

Whereas I took the heuristic “move all IO to the edges of a program” to be interpreted as meaning IO should be like ‘bookends’, performed at the start and end of some process, it seems that you may be suggesting that another alternative is to use the ‘margins’ of the code. Push all the IO (or any other impurity) sideways to the edge of the code, hidden away in the guts of a monad.

This way we can focus on what the code is doing rather than the messy details and noisy obfuscation created when such concerns are explicit in the code.

So sacrificing some level of purity for a higher level of composition is what you are recommending?

@TonyHernandezAtMS

@louthy Can you tag this as documentation? So good!

@MrYossu
Contributor

MrYossu commented Dec 12, 2018

@louthy Fully agree with @TonyHernandezAtMS and politely repeat my request that you make a full sample of this so we can see how it all fits together. Would be really useful.

Thanks again for your amazing efforts at educating us!

@MrYossu
Contributor

MrYossu commented Feb 12, 2019

@louthy Just been trying out the code here, and I get a compiler error on the following...

public static class Reader {
  public static Reader<Env, Env> Ask<Env>() =>
    env => env;
}

The env at the end of the line is underlined in red, and I get a compiler error Cannot implicitly convert type 'Env' to '(Env Value, bool IsFaulted)'.

What did I do wrong? Thanks


5 participants