Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Separate the implementation of RPC service and compilation session #254

Closed
ChrisCummins opened this issue May 10, 2021 · 10 comments
Closed
Assignees
Labels
Enhancement New feature or request
Milestone

Comments

@ChrisCummins
Copy link
Contributor

ChrisCummins commented May 10, 2021

🚀 Feature

Make it easier for users to add support for new compilers by providing base classes that allow users to "fill in the blanks" for interacting with their compiler, without having to implement all of the other RPC interface boilerplate.

Motivation

Top add support for a new compiler, a user must write a compiler service that performs two key functions:

  1. RPC Server: Implements a server that listens on a port for RPC requests that are then dispatched to compilation sessions.
  2. Compilation session: Implements the compilation session interface to do the actual task of applying actions and computing observations.

This division of roles is obvious in the example compiler services. In the python example, the RPC server is implemented here and the compilation session here. The compilation session is where the actual interesting work gets done. The RPC server is boilerplate, and as can be seen in the example, is not an insubstantial amount of code (and must be implemented carefully to avoid concurrency bugs / performance issues). Given that the behavior of the RPC Server is the same across all compiler services, we should provide a builtin implementation so that the user can focus only on implementing the compilation session.

Pitch

Provide helper code for C++ and Python that makes it easier to add support for new compilers.

Python

Add a base class that represents a single interactive session with a compiler:

class CompilationSession:
    """Base class that represents an interactive compilation session."""

    compiler_version: str = ""  # compiler version
    action_spaces: List[ActionSpace] = []  # what your compiler can do
    observation_spaces: List[ObservationSpace] = []  # what features you provide

    def __init__(self, working_dir: Path, action_space_index: int, benchmark: Benchmark):
        """start a new compilation session with the given action space & benchmark"""

    def apply_action(self, action_index: int) -> Tuple[bool, Optional[ActionSpace], bool]:
        """apply an action. Returns a tuple: (end_of_session, new_action_space, action_had_no_effect).
        """

    def get_observation(self, observation_space_index: int) -> Observation:
        """compute an observation"""

    def fork(self) -> CompilationSession:
        """optional. Create a copy of current session state"""

To add support for a new compiler, a user subclasses from this, provides implementations for the abstract methods, and then calls a helper function that will create and launch an RPC service using this class:

from compiler_gym.service import CompilationSession
from compiler_gym.service.runtime import create_and_run_compiler_gym_service

class MyDerivedCompilationSession(CompilationSession):
    ...

if __name__ == "__main__":
    create_and_run_compiler_gym_service(CompilationSession)

The create_and_run_compiler_gym_service() function here performs all of the boilerplate of command line argument parsing, writing to the correct temporary files, handling RPC requests etc.

C++

Very similar to the python interface, only using out-params:

class CompilationSession {
 public:
  virtual string getCompilerVersion() const;
  virtual vector<ActionSpace> getActionSpaces() const = 0;  // what your compiler can do
  virtual vector<ObservationSpace> getObservationSpaces() const = 0;  // features you provide

  [[nodiscard]] virtual Status init(size_t actionSpaceIndex, const Benchmark& benchmark) = 0;

  // apply an action
  [[nodiscard]] virtual Status applyAction(size_t actionIndex, bool* endOfEpisode,
                                                 bool* actionSpaceChanged,
                                                 bool* actionHadNoEffect) = 0;

  // compute an observation
  [[nodiscard]] virtual Status setObservation(size_t observationSpaceIndex,
                                                    Observation* observation) = 0;

  // Optional. Called after all actions / observations in a single step.
  [[nodiscard]] virtual Status endOfActions(bool* endOfEpisode, bool* actionSpaceChanged);

  // Optional. Initialize state from another session.
  [[nodiscard]] virtual Status init(CompilationSession* other);
};

and to use it:

#include "compiler_gym/service/CompilationSession.h"
#include "compiler_gym/service/runtime/Runtime.h"

using namespace compiler_gym;

namespace {

class MyCompilationSession final : public CompilationSession {
 public:
  using CompilationSession::CompilationSession;

  ...
};

}  // namespace

int main(int argc, char** argv) {
  createAndRunCompilerGymService<MyCompilationSession>(argc, argv, "My service");
}
@ChrisCummins ChrisCummins added the Enhancement New feature or request label May 10, 2021
@bwasti
Copy link
Contributor

bwasti commented May 10, 2021

looks good, definitely cleans up a lot of boilerplate

I'm not sure I understand the need for List[ActionSpaces] + actionSpaceIndex on init as opposed to just creating multiple CompilationSessions that have unique ActionSpaces.

@ChrisCummins
Copy link
Contributor Author

ChrisCummins commented May 10, 2021

Ah, that's a good point. The idea was to allow you to cheaply define different subsets of action spaces, like:

action_spaces = [LlvmAllPasses(), LlvmOnlyO3Passes(), ...]

but tbh I'm not sure if that is a common use case, or even if it wouldn't just be better suited by creating different CompilationSessions as you suggested.

So V2 could be:

class CompilationSession:

    ...
    action_space: ActionSpace = ActionSpace()  # what the compiler can do
    ...

    def __init__(self, working_dir: Path, benchmark: Benchmark):
        """start a new compilation session with the given action space & benchmark"""

    ...

@hughleat
Copy link
Contributor

I like it a lot!

With the C++ session, can the out params be references instead of pointers?

In the C++ API, if the action space did change, how do you know to which one it changed?

Is taking action_index future proof? What happens if/when there are non categorical types in the action spaces (I think that can already happen in the AI Gym interface, right?)?

setObservation in C++ API -> getObservation?

Why take actionSpaceIndex when you could just take an action space? Saves users doing the indirection themselves. Same with observation space.

virtual Status init(CompilationSession* other); better is other is const?

I slightly wonder about the format and whether the fixed params shouldn't be passed in to the createAndRun... fn, like the name is already.

@ChrisCummins
Copy link
Contributor Author

With the C++ session, can the out params be references instead of pointers?

Yep, good idea.

In the C++ API, if the action space did change, how do you know to which one it changed?

Ah, good point. In my little demo implementation I added a getActionSpace() method that I forgot to include above. I think its possible to

virtual Status applyAction(size_t actionIndex, bool &endOfEpisode,
                           std::optional<ActionSpace> &newActionSpacce,
                           bool &actionHadNoEffect)

Is taking action_index future proof? What happens if/when there are non categorical types in the action spaces (I think that can already happen in the AI Gym interface, right?)?

Hm, not really. I have an open issue, #52 to add support for non-categorical action spaces, but no concrete plans to work on it. I suppose I could wrap the current integer-only action in an Action protobuf so that future changes to the action space won't affect this interface. That would make it:

    def apply_action(self, action: Action) -> Tuple[bool, Optional[ActionSpace], bool]:
        """apply an action. Returns a tuple: (end_of_session, new_action_space, action_had_no_effect).
        """
        print(action.action_index)  # The index into the current categorical space

setObservation in C++ API -> getObservation?

How about computeObservation? I don't think getX() is a clear name for a method that doesn't return X.

Why take actionSpaceIndex when you could just take an action space? Saves users doing the indirection themselves. Same with observation space.

Consider the computeObservation(...) dispatcher. If we pass in a reference to the observation space message:

// dispatch requires a bunch of string comparisons,
// harder to maintain + refactor
if (observationSpace.name() == "Ir") {
  ...
} else if (observationSpace.name() == "Autophase") {
  ...
}

with indices, you can define an enum to perform that switch, which means the type system will help you:

// fast int switch, + compiler warning if I've missed a case
switch (static_cast<ObservationSpacesEnum>(observationSpaceIndex)) {
  case ObservationSpacesEnum::IR:
    ...
  case ObservationSpacesEnum::AUTOPHASE:
    ...
}

So to me it seems that indirection from observationSpaceIndex -> observationSpace index isn't a priority simply because ObservationSpace / ActionSpace messages aren't really helpful for computing the values, just for describing the spaces.

I'm open to changing this though. :-)

virtual Status init(CompilationSession* other); better is other is const?

I would prefer this, but it would require that whatever APIs the CompilationSession interacts with uses const properly everywhere. A couple of LLVM methods take mutable references for seemingly const behaviors.

I slightly wonder about the format and whether the fixed params shouldn't be passed in to the createAndRun... fn, like the name is already.

Not sure I follow?

@ChrisCummins ChrisCummins added this to the v0.1.9 milestone May 11, 2021
@ChrisCummins
Copy link
Contributor Author

Filed #258 to future-proof the action argument to apply_action()

@hughleat
Copy link
Contributor

hughleat commented May 11, 2021

Consider the computeObservation(...) dispatcher. If we pass in a reference to the observation space message:

Ah, I see. Fair enough.
It would be good if the user didn't have to care about the communication types that the Gym uses. If I were a user, I'd prefer to have a type hierarchy I can extend and let you package things up into transmittable classes. At least as much as possible. So:

struct ObservationSpace<T> {
   virtual string name() const;
   virtual T operator() () = 0;
};

// User code:
struct IRCount: ObservationSpace<int> {
  string name() { return "IRCount"; }
  int operator() () { return ...; }
};

That way I don't have to manually dispatch, you can do it for me after you've found the right one of my observation spaces for me. You can construct what ever you need to transmit by querying a user friendly class, rather than the protobuf (protobufs aren't exactly user friendly).
BTW, I'm aware my suggestion is a lot of work for little gain - I think it should be ignored :-)

ChrisCummins added a commit to ChrisCummins/CompilerGym that referenced this issue May 11, 2021
The CompilationSession class encapsulates an incremental compilation
session.

This is the first in a series of patches that separates the RPC server
from the compilation session.

Issue facebookresearch#254.
ChrisCummins added a commit to ChrisCummins/CompilerGym that referenced this issue May 11, 2021
The CompilationSession class encapsulates an incremental compilation
session.

This is the first in a series of patches that separates the RPC server
from the compilation session.

Issue facebookresearch#254.
ChrisCummins added a commit to ChrisCummins/CompilerGym that referenced this issue May 11, 2021
The CompilationSession class encapsulates an incremental compilation
session.

This is the first in a series of patches that separates the RPC server
from the compilation session.

Issue facebookresearch#254.
@ChrisCummins
Copy link
Contributor Author

ChrisCummins commented May 12, 2021

It would be good if the user didn't have to care about the communication types that the Gym uses. If I were a user, I'd prefer to have a type hierarchy I can extend and let you package things up into transmittable classes.

I agree, though I'm not sure what that type hierarchy would look like. The ObservationSpace example you posted neatly solves the dispatching problem, but I think it might introduce a new problem of figuring out where to keep the compiler "state". What would IRCount::operator() use to figure out the IR count? Does the parent CompilationSession object have the instructions, and the observation space has a back pointer to its owner? That seems a bit circular. On the other hand, if we push the mutable compiler state to every ObservationSpace instance then we come into an issue with duplicating the state across multiple observation spaces.

What I do like about the class hiearchy idea is that it halves the number of methods needed for CompilationSession, as now its enough to just provide a list of observation spaces / action spaces:

template<typename T> // T must be one of: {int64, double, std::string, std::vector<int64>, std::vector<double>}
class ObservationSpace {
 public:
  virtual std::string name() const = 0;
  virtual T operator() () = 0;  // compute an observation
}

template<typename T>  // Future proofing for different action types
class ActionSpace {
 public:
  virtual std::string name() const = 0;
  virtual void operator() (T action) = 0;
}

class CategoricalActionSpace : public ActionSpace<int> {
 public:
  virtual std::vector<std::string> actionNames() const = 0;  // e.g. LLVM pass names
  virtual void operator() (int actionIndex) = 0;
}

class CompilationSession {
 public:
  virtual std::string getCompilerVersion() const;
  // not sure if there's a prettier way to get an iterator over a bunch of polymorphic types
  virtual std::vector<ActionSpace const*> getActionSpaces() const = 0;
  virtual std::vector<ObservationSpace const*> getObservationSpaces() const = 0;

  [[nodiscard]] virtual grpc::Status init(const Benchmark& benchmark) = 0;
}

Happy to keep iterating on this design until we find something that works

ChrisCummins added a commit to ChrisCummins/CompilerGym that referenced this issue May 12, 2021
This will be used by the CompilationSession runtime to keep track of
the Benchmark protobufs that have been sent by the user to the
service, so that CompilationSession::init() can be passed a benchmark
proto.

This is a generalization of the BenchmarkFactory class that is used by
the LLVM service to keep a bunch of llvm::Modules loaded in
memory. The same class is implemented twice in C++ and Python using
the same semantics and with the same tests.

The cache has a target maximum size based on the number of bytes of
its elements. When this size is reached, benchamrks are evicted using
a random policy. The idea behind random cache eviction is that this
cache will be large enough by default to store a good number of
benchmarks, so exceeding the max cache size implies a training loop in
which random programs are selected from a very large pool, rather than
smaller pool where an LRU policy would be better.

Issue facebookresearch#254.
ChrisCummins added a commit to ChrisCummins/CompilerGym that referenced this issue May 12, 2021
This will be used by the CompilationSession runtime to keep track of
the Benchmark protobufs that have been sent by the user to the
service, so that CompilationSession::init() can be passed a benchmark
proto.

This is a generalization of the BenchmarkFactory class that is used by
the LLVM service to keep a bunch of llvm::Modules loaded in
memory. The same class is implemented twice in C++ and Python using
the same semantics and with the same tests.

The cache has a target maximum size based on the number of bytes of
its elements. When this size is reached, benchamrks are evicted using
a random policy. The idea behind random cache eviction is that this
cache will be large enough by default to store a good number of
benchmarks, so exceeding the max cache size implies a training loop in
which random programs are selected from a very large pool, rather than
smaller pool where an LRU policy would be better.

Issue facebookresearch#254.
ChrisCummins added a commit to ChrisCummins/CompilerGym that referenced this issue May 12, 2021
This will be used by the CompilationSession runtime to keep track of
the Benchmark protobufs that have been sent by the user to the
service, so that CompilationSession::init() can be passed a benchmark
proto.

This is a generalization of the BenchmarkFactory class that is used by
the LLVM service to keep a bunch of llvm::Modules loaded in
memory. The same class is implemented twice in C++ and Python using
the same semantics and with the same tests.

The cache has a target maximum size based on the number of bytes of
its elements. When this size is reached, benchamrks are evicted using
a random policy. The idea behind random cache eviction is that this
cache will be large enough by default to store a good number of
benchmarks, so exceeding the max cache size implies a training loop in
which random programs are selected from a very large pool, rather than
smaller pool where an LRU policy would be better.

Issue facebookresearch#254.
ChrisCummins added a commit to ChrisCummins/CompilerGym that referenced this issue May 12, 2021
The CompilationSession class encapsulates an incremental compilation
session.

This is the first in a series of patches that separates the RPC server
from the compilation session.

Issue facebookresearch#254.
ChrisCummins added a commit to ChrisCummins/CompilerGym that referenced this issue May 12, 2021
The CompilationSession class encapsulates an incremental compilation
session.

This is the first in a series of patches that separates the RPC server
from the compilation session.

Issue facebookresearch#254.
ChrisCummins added a commit to ChrisCummins/CompilerGym that referenced this issue May 12, 2021
This adds a new CompilerGymService class in both C++ and Python that
takes a concrete CompilationSession subclass and provides all of the
runtime logic to start and manage and RPC server that responds to
requests and dispatches to CompilationSession instances.

Issue facebookresearch#254.
ChrisCummins added a commit to ChrisCummins/CompilerGym that referenced this issue May 12, 2021
ChrisCummins added a commit to ChrisCummins/CompilerGym that referenced this issue May 12, 2021
The CompilationSession class encapsulates an incremental compilation
session.

This is the first in a series of patches that separates the RPC server
from the compilation session.

Issue facebookresearch#254.
@hughleat
Copy link
Contributor

Does the parent CompilationSession object have the instructions, and the observation space has a back pointer to its owner?

Aye, I think so. Although, you don't have to provide that. Users of the class can do whatever they need to do. Like maybe their compiler isn't thread safe and so they can only support one session per process and everything is global. They can make a back pointer as they see fit when they create their ObservationSpace.

template // Future proofing for different action types
class ActionSpace {
public:
virtual std::string name() const = 0;
virtual T operator() (T action) = 0;

Why return T?

}
class CategoricalActionSpace : public ActionSpace {
public:
virtual std::vectorstd::string actionNames() const = 0; // e.g. LLVM pass names
virtual T operator() (int actionIndex) = 0;

T not defined here.

}```

I'm less sure about this. I have a nagging feeling there's a semantic difference between an observation space and an action space. The former represents a single function, the latter a collection of functions.
The actionIndex gives me the same feeling as the other indices.
Should actions be first class objects and ActionSpaces provide actions?

struct Action {
    virtual string name() const = 0;
    virtual string help() const { return "Perform action " + name()); }
    virtual void operation() () = 0;
};

struct ActionSpace {
    virtual string name() const = 0;
};
struct CategoricalActionSpace: ActionSpace {
    // No virtual methods here
    CategoricalActionSpace(string name, vector<Action> actions) ...
    string name() const { return name; }
    vector<Action> actions() const { return actions; }
    ...
    Action& operator(int key) const { return actions[key]; }
    Action& operator(string key) const { return find_by_name(actions, key); }
};

struct LLVMAction: Action {
    LLVMAction(LLVMCompilerSession& session, string pass) ...
    ...
    static vector<string> passNames = { "dce", "loop-unroll", ... };
    static vector<LLVMAction> actions(LLVMCompilerSession& session) {
        return passNames.map(LLVMAction(session, _)); // Or however C++ does it
    }   ​
};
CategoricalActionSpace llvmActions( "llvm-actions", LLVMAction::actions(session));

// Use an action
llvmActions("dce")();
llvmActions(23)();

I'm not sure if ActionSpace should be templated. For ObservationSpace it works better. You can provide implementations for each for the types you can serialise. For ActionSpace there's more to it, I think, due to the domain specification. I'm not sure though.

BTW, your original design was good - I should be ignored at all costs.

@ChrisCummins
Copy link
Contributor Author

Aye, I think so. Although, you don't have to provide that. Users of the class can do whatever they need to do. Like maybe their compiler isn't thread safe and so they can only support one session per process and everything is global. They can make a back pointer as they see fit when they create their ObservationSpace.

Ah I see, that makes sense. I wonder how much of a boilerplate tradeoff that is - now users will need to make public / add accessors to the internal state of CompilationSession so that their backlinked observation spaces can get at what they need to.

Why return T?

Woops, that's a mistake. Fixed.

I'm less sure about this. I have a nagging feeling there's a semantic difference between an observation space and an action space. The former represents a single function, the latter a collection of functions.
The actionIndex gives me the same feeling as the other indices.
Should actions be first class objects and ActionSpaces provide actions?
...

I like this!

BTW, your original design was good - I should be ignored at all costs.

I'll merge the original design into development but I won't cut a public release until we're happy with it. If were going to introduce a new class hierarchy, now might be a good time for it. Thanks for the suggestions so far!

ChrisCummins added a commit to ChrisCummins/CompilerGym that referenced this issue May 13, 2021
This will be used by the CompilationSession runtime to keep track of
the Benchmark protobufs that have been sent by the user to the
service, so that CompilationSession::init() can be passed a benchmark
proto.

This is a generalization of the BenchmarkFactory class that is used by
the LLVM service to keep a bunch of llvm::Modules loaded in
memory. The same class is implemented twice in C++ and Python using
the same semantics and with the same tests.

The cache has a target maximum size based on the number of bytes of
its elements. When this size is reached, benchamrks are evicted using
a random policy. The idea behind random cache eviction is that this
cache will be large enough by default to store a good number of
benchmarks, so exceeding the max cache size implies a training loop in
which random programs are selected from a very large pool, rather than
smaller pool where an LRU policy would be better.

Issue facebookresearch#254.
ChrisCummins added a commit to ChrisCummins/CompilerGym that referenced this issue May 13, 2021
This adds a new CompilerGymService class in both C++ and Python that
takes a concrete CompilationSession subclass and provides all of the
runtime logic to start and manage and RPC server that responds to
requests and dispatches to CompilationSession instances.

Issue facebookresearch#254.
ChrisCummins added a commit to ChrisCummins/CompilerGym that referenced this issue May 13, 2021
ChrisCummins added a commit to ChrisCummins/CompilerGym that referenced this issue May 13, 2021
ChrisCummins added a commit to ChrisCummins/CompilerGym that referenced this issue May 14, 2021
ChrisCummins added a commit to ChrisCummins/CompilerGym that referenced this issue May 14, 2021
This adds a new CompilerGymService class in both C++ and Python that
takes a concrete CompilationSession subclass and provides all of the
runtime logic to start and manage and RPC server that responds to
requests and dispatches to CompilationSession instances.

Issue facebookresearch#254.
ChrisCummins added a commit to ChrisCummins/CompilerGym that referenced this issue May 17, 2021
ChrisCummins added a commit to ChrisCummins/CompilerGym that referenced this issue May 17, 2021
ChrisCummins added a commit to ChrisCummins/CompilerGym that referenced this issue May 18, 2021
ChrisCummins added a commit to ChrisCummins/CompilerGym that referenced this issue May 18, 2021
ChrisCummins added a commit to ChrisCummins/CompilerGym that referenced this issue May 27, 2021
ChrisCummins added a commit to ChrisCummins/CompilerGym that referenced this issue May 27, 2021
@ChrisCummins
Copy link
Contributor Author

Closing this as all of the PRs are now merged, but will keep the discussion going about better ways to structure the action space.

@ChrisCummins ChrisCummins self-assigned this Jul 13, 2021
bwasti pushed a commit to bwasti/CompilerGym that referenced this issue Aug 3, 2021
The CompilationSession class encapsulates an incremental compilation
session.

This is the first in a series of patches that separates the RPC server
from the compilation session.

Issue facebookresearch#254.
bwasti pushed a commit to bwasti/CompilerGym that referenced this issue Aug 3, 2021
This will be used by the CompilationSession runtime to keep track of
the Benchmark protobufs that have been sent by the user to the
service, so that CompilationSession::init() can be passed a benchmark
proto.

This is a generalization of the BenchmarkFactory class that is used by
the LLVM service to keep a bunch of llvm::Modules loaded in
memory. The same class is implemented twice in C++ and Python using
the same semantics and with the same tests.

The cache has a target maximum size based on the number of bytes of
its elements. When this size is reached, benchamrks are evicted using
a random policy. The idea behind random cache eviction is that this
cache will be large enough by default to store a good number of
benchmarks, so exceeding the max cache size implies a training loop in
which random programs are selected from a very large pool, rather than
smaller pool where an LRU policy would be better.

Issue facebookresearch#254.
bwasti pushed a commit to bwasti/CompilerGym that referenced this issue Aug 3, 2021
This adds a new CompilerGymService class in both C++ and Python that
takes a concrete CompilationSession subclass and provides all of the
runtime logic to start and manage and RPC server that responds to
requests and dispatches to CompilationSession instances.

Issue facebookresearch#254.
bwasti pushed a commit to bwasti/CompilerGym that referenced this issue Aug 3, 2021
bwasti pushed a commit to bwasti/CompilerGym that referenced this issue Aug 3, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

3 participants