In this chapter we’ll introduce the final piece of the puzzle that ties together the Repository and Service Layer: the Unit of Work pattern.
If the Repository is our abstraction over the idea of persistent storage, the Unit of Work is our abstraction over the idea of atomic operations. It will allow us to finally, fully, decouple our Service Layer from the data layer.
And we’ll do it using a lovely piece of Python syntax, a context manager.
Here’s how it’ll look in use, when it’s finished:
def allocate(
    orderid: str, sku: str, qty: int,
    uow: unit_of_work.AbstractUnitOfWork
) -> str:
    line = OrderLine(orderid, sku, qty)
    with uow:  #(1)
        batches = uow.batches.list()  #(2)
        ...
        batchref = model.allocate(line, batches)
        uow.commit()  #(3)
(1) We'll start a unit of work as a context manager.
(2) uow.batches is the batches repo, so the unit of work provides us access to our permanent storage.
(3) When we're done, we commit or roll back our work, using the UoW.
The unit of work acts as a single entry point to our persistent storage, and keeps track of what objects were loaded and what the latest state is.[1] This gives us three useful things:
1) It gives us a stable snapshot of the database to work with, so that the objects we use aren’t changing halfway through an operation.
2) It gives us a way to persist all of our changes at once so that if something goes wrong, we don’t end up in an inconsistent state.
3) It offers a simple API to our persistence concerns and gives us a handy place to get a repository.
Here's a test for a new UnitOfWork (or UoW, which we pronounce "you-wow"). It's a context manager that allows us to start a transaction, retrieve things from repos, and commit:
def insert_batch(session, ref, sku, qty, eta):
    session.execute(
        'INSERT INTO batches (reference, sku, _purchased_quantity, eta)'
        ' VALUES (:ref, :sku, :qty, :eta)',
        dict(ref=ref, sku=sku, qty=qty, eta=eta)
    )


def get_allocated_batch_ref(session, orderid, sku):
    [[orderlineid]] = session.execute(
        'SELECT id FROM order_lines WHERE orderid=:orderid AND sku=:sku',
        dict(orderid=orderid, sku=sku)
    )
    [[batchref]] = session.execute(
        'SELECT b.reference FROM allocations JOIN batches AS b ON batch_id = b.id'
        ' WHERE orderline_id=:orderlineid',
        dict(orderlineid=orderlineid)
    )
    return batchref
def test_uow_can_retrieve_a_batch_and_allocate_to_it(session_factory):
    session = session_factory()
    insert_batch(session, 'batch1', 'HIPSTER-WORKBENCH', 100, None)
    session.commit()

    uow = unit_of_work.SqlAlchemyUnitOfWork(session_factory)  #(1)
    with uow:
        batch = uow.batches.get(reference='batch1')  #(2)
        line = model.OrderLine('o1', 'HIPSTER-WORKBENCH', 10)
        batch.allocate(line)
        uow.commit()  #(3)

    batchref = get_allocated_batch_ref(session, 'o1', 'HIPSTER-WORKBENCH')
    assert batchref == 'batch1'
(1) We initialise the Unit of Work using our custom session factory, and get back a uow object to use in our with block.
(2) The UoW gives us access to the batches repository via uow.batches.
(3) And we call commit() on it when we're done.
In our tests we've implicitly defined an interface for what a unit of work needs to do. Let's make that explicit by using an abstract base class:
class AbstractUnitOfWork(abc.ABC):

    def __enter__(self):  #(1)
        return self  #(2)

    def __exit__(self, *args):  #(2)
        self.rollback()

    @abc.abstractmethod
    def commit(self):  #(3)
        raise NotImplementedError

    @abc.abstractmethod
    def rollback(self):  #(4)
        raise NotImplementedError

    def init_repositories(self, batches: repository.AbstractRepository):  #(5)
        self._batches = batches

    @property
    def batches(self) -> repository.AbstractRepository:  #(5)
        return self._batches
(1) If you've never seen a context manager, __enter__ and __exit__ are the two magic methods that execute when we enter the with block and when we exit it. They're our setup and teardown phases.
(2) __enter__ returns self, because we want access to the uow instance, and its attributes and methods, inside the with block.
(3) It provides a way to explicitly commit our work.
(4) If we don't commit, or if we exit the context manager by raising an error, we do a rollback. (The rollback has no effect if commit() has been called. Read on for more discussion of this.)
(5) The other thing we provide is an attribute called .batches, which will give us access to the batches repository. The init_repositories() method is needed because different subclasses will want to initialise repositories in slightly different ways; this just gives us a single place to do that.
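If the context manager protocol is still abstract at this point, here is a minimal, self-contained sketch (nothing to do with our domain) just to show the enter/exit lifecycle and what happens when the block raises:

class DemoContextManager:
    def __enter__(self):
        print('setup')
        return self  # whatever we return here is what `as` binds to

    def __exit__(self, exc_type, exc_value, traceback):
        print('teardown')  # runs on normal exit *and* when the block raises
        return False  # a falsy return means any exception is re-raised


with DemoContextManager() as demo:
    print('doing work inside the with block')

With that in mind, here's the concrete SQLAlchemy implementation of our abstract Unit of Work: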
DEFAULT_SESSION_FACTORY = sessionmaker(bind=create_engine(  #(1)
    config.get_postgres_uri(),
))


class SqlAlchemyUnitOfWork(AbstractUnitOfWork):

    def __init__(self, session_factory=DEFAULT_SESSION_FACTORY):
        self.session_factory = session_factory  #(1)

    def __enter__(self):
        self.session = self.session_factory()  # type: Session  #(2)
        self.init_repositories(repository.SqlAlchemyRepository(self.session))  #(2)
        return super().__enter__()

    def __exit__(self, *args):
        super().__exit__(*args)
        self.session.close()  #(3)

    def commit(self):  #(4)
        self.session.commit()

    def rollback(self):  #(4)
        self.session.rollback()
(1) The module defines a default session factory that will connect to Postgres, but we allow that to be overridden in our integration tests so that we can use SQLite instead.
(2) The dunder-enter is responsible for starting a database session, and for instantiating a real repository that can use that session.
(3) We close the session on exit.
(4) Finally, we provide concrete commit() and rollback() methods that use our database session.
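The session_factory fixture used by the integration tests isn't shown in this chapter. As a rough sketch, assuming our ORM module exposes a metadata object and a start_mappers() function (names not shown here), an in-memory SQLite factory might look something like this:

# conftest.py sketch -- assumes orm.metadata and orm.start_mappers exist
import pytest
from sqlalchemy import create_engine
from sqlalchemy.orm import sessionmaker, clear_mappers

import orm  # hypothetical module holding our table definitions and mappers


@pytest.fixture
def session_factory():
    engine = create_engine('sqlite:///:memory:')
    orm.metadata.create_all(engine)   # build the schema in a throwaway database
    orm.start_mappers()               # wire the domain classes to the tables
    yield sessionmaker(bind=engine)   # tests call this to create sessions
    clear_mappers()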
Here's how we use a fake Unit of Work in our service layer tests:
class FakeUnitOfWork(unit_of_work.AbstractUnitOfWork):

    def __init__(self):
        self.init_repositories(FakeRepository([]))  #(1)
        self.committed = False  #(2)

    def commit(self):
        self.committed = True  #(2)

    def rollback(self):
        pass


def test_add_batch():
    uow = FakeUnitOfWork()  #(3)
    services.add_batch("b1", "CRUNCHY-ARMCHAIR", 100, None, uow)  #(3)
    assert uow.batches.get("b1") is not None
    assert uow.committed


def test_allocate_returns_allocation():
    uow = FakeUnitOfWork()  #(3)
    services.add_batch("batch1", "COMPLICATED-LAMP", 100, None, uow)  #(3)
    result = services.allocate("o1", "COMPLICATED-LAMP", 10, uow)  #(3)
    assert result == "batch1"

...
(1) FakeUnitOfWork and FakeRepository are tightly coupled, just like the real Unit of Work and Repository classes. That's fine, because we recognise that the objects are collaborators.
(2) Notice the similarity with the fake commit() function from FakeSession (which we can now get rid of). But it's a substantial improvement, because we're now faking out code that we wrote, rather than third-party code. Some people say, "Don't mock what you don't own."
(3) And in our tests, we can instantiate a UoW and pass it to our service layer, instead of a repository and a session, which is considerably less cumbersome.
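For comparison, before the UoW the equivalent test had to build and pass a fake repository and a fake session separately. We haven't reproduced the old test here, so treat this as a rough reconstruction rather than a verbatim quote:

# roughly the old shape (reconstructed): two collaborators to set up and assert on
def test_add_batch():
    repo, session = FakeRepository([]), FakeSession()
    services.add_batch("b1", "CRUNCHY-ARMCHAIR", 100, None, repo, session)
    assert repo.get("b1") is not None
    assert session.committed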
And here’s what our new service layer looks like:
def add_batch(
    ref: str, sku: str, qty: int, eta: Optional[date],
    uow: unit_of_work.AbstractUnitOfWork  #(1)
):
    with uow:
        uow.batches.add(model.Batch(ref, sku, qty, eta))
        uow.commit()


def allocate(
    orderid: str, sku: str, qty: int,
    uow: unit_of_work.AbstractUnitOfWork  #(1)
) -> str:
    line = OrderLine(orderid, sku, qty)
    with uow:
        batches = uow.batches.list()
        if not is_valid_sku(line.sku, batches):
            raise InvalidSku(f'Invalid sku {line.sku}')
        batchref = model.allocate(line, batches)
        uow.commit()
    return batchref
(1) Our service layer now only has the one dependency, once again on an abstract Unit of Work.
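At the edge of the system, the API layer (not shown in this chapter) is now the only place that needs to know about the concrete implementation. A sketch of a Flask endpoint, assuming the module names used in the listings so far, might look something like this:

# hypothetical Flask entrypoint: the concrete UoW is attached out here, at the edge
from flask import Flask, jsonify, request

import services, unit_of_work  # assumed module names

app = Flask(__name__)


@app.route("/allocate", methods=["POST"])
def allocate_endpoint():
    try:
        batchref = services.allocate(
            request.json["orderid"],
            request.json["sku"],
            request.json["qty"],
            unit_of_work.SqlAlchemyUnitOfWork(),  # the service only sees the abstraction
        )
    except services.InvalidSku as e:
        return jsonify({"message": str(e)}), 400
    return jsonify({"batchref": batchref}), 201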
To convince ourselves that the commit/rollback behavior works, we wrote a couple of tests:
def test_rolls_back_uncommitted_work_by_default(session_factory):
    uow = unit_of_work.SqlAlchemyUnitOfWork(session_factory)
    with uow:
        insert_batch(uow.session, 'batch1', 'MEDIUM-PLINTH', 100, None)

    new_session = session_factory()
    rows = list(new_session.execute('SELECT * FROM "batches"'))
    assert rows == []


def test_rolls_back_on_error(session_factory):
    class MyException(Exception):
        pass

    uow = unit_of_work.SqlAlchemyUnitOfWork(session_factory)
    with pytest.raises(MyException):
        with uow:
            insert_batch(uow.session, 'batch1', 'LARGE-FORK', 100, None)
            raise MyException()

    new_session = session_factory()
    rows = list(new_session.execute('SELECT * FROM "batches"'))
    assert rows == []
Tip: We haven't shown it here, but it can be worth testing some of the more "obscure" database behavior, like transactions, against the "real" database, i.e. the same engine. For now we're getting away with using SQLite instead of Postgres, but in [chapter_06_aggregate] we'll switch some of the tests to using the real DB. It's convenient that our UoW class makes that easy!
A brief digression on different ways of implementing the UoW pattern.
We could imagine a slightly different version of the UoW, which commits by default, and only rolls back if it spots an exception:
class AbstractUnitOfWork(abc.ABC):

    def __enter__(self):
        return self

    def __exit__(self, exn_type, exn_value, traceback):
        if exn_type is None:
            self.commit()  #(1)
        else:
            self.rollback()  #(2)
(1) Should we have an implicit commit in the happy path?
(2) And roll back only on exception?
It would allow us to save a line of code, and remove the explicit commit from our client code:
def add_batch(ref: str, sku: str, qty: int, eta: Optional[date], start_uow):
    with start_uow() as uow:
        uow.batches.add(model.Batch(ref, sku, qty, eta))
        # uow.commit()
This is a judgement call, but we tend to prefer requiring the explicit commit so that we have to choose when to flush state.
Although it's an extra line of code, this makes the software safe by default. The default behavior is to not change anything. In turn, that makes our code easier to reason about, because there's only one code path that leads to changes in the system: total success and an explicit commit. Any other code path, any exception, any early exit from the UoW's scope leads to a safe state.
Similarly, we prefer "always-rollback" to "only-rollback-on-error," because the former feels easier to understand; rollback rolls back to the last commit, so either the user did one, or we blow their changes away. Harsh but simple.
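To make the "rollback rolls back to the last commit" point concrete, here's a sketch of a test in the same style as the ones above (not one we actually show in this chapter): work committed before a failure survives, and only the uncommitted remainder is discarded.

def test_keeps_work_committed_before_a_failure(session_factory):
    uow = unit_of_work.SqlAlchemyUnitOfWork(session_factory)
    with pytest.raises(ValueError):
        with uow:
            insert_batch(uow.session, 'batch1', 'SMALL-TABLE', 100, None)
            uow.commit()  # flushed to the database: a later rollback can't touch this
            insert_batch(uow.session, 'batch2', 'SMALL-TABLE', 100, None)
            raise ValueError('boom')  # only the second insert gets rolled back

    new_session = session_factory()
    rows = list(new_session.execute('SELECT reference FROM batches'))
    assert rows == [('batch1',)]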
Here are a few examples showing the Unit of Work pattern in use. You can see how it leads to simple reasoning about which blocks of code happen together.
Suppose we want to be able to deallocate and then reallocate orders:
def reallocate(line: OrderLine, uow: AbstractUnitOfWork) -> str:
    with uow:
        batch = uow.batches.get(sku=line.sku)
        if batch is None:
            raise InvalidSku(f'Invalid sku {line.sku}')
        batch.deallocate(line)  #(1)
        allocate(line)  #(2)
        uow.commit()
(1) If deallocate() fails, we don't want to do allocate(), obviously.
(2) But if allocate() fails, we probably don't want to actually commit the deallocate() either.
Our shipping company gives us a call to say that one of the container doors opened and half our sofas have fallen into the Indian Ocean. Oops!
def change_batch_quantity(batchref: str, new_qty: int, uow: AbstractUnitOfWork):
    with uow:
        batch = uow.batches.get(reference=batchref)
        batch.change_purchased_quantity(new_qty)
        while batch.available_quantity < 0:
            line = batch.deallocate_one()  #(1)
            model.allocate(line)
        uow.commit()
(1) Here we may need to deallocate any number of lines. If we get a failure at any stage, we probably want to commit none of the changes.
We now have three sets of tests, all essentially pointing at the database: test_orm.py, test_repository.py, and test_uow.py. Should we throw any away?
└── tests
├── conftest.py
├── e2e
│ └── test_api.py
├── integration
│ ├── test_orm.py
│ ├── test_repository.py
│ └── test_uow.py
├── pytest.ini
└── unit
├── test_allocate.py
├── test_batches.py
└── test_services.py
You should always feel free to throw away tests if you feel they're not going to add value longer term. We'd say that test_orm.py was primarily a tool to help us learn SQLAlchemy, so we won't need that in the long run, especially if the main things it's doing are covered in test_repository.py. That last one you might keep around, but we could certainly see an argument for just keeping everything at the highest possible level of abstraction (just as we did for the unit tests).
For this chapter, probably the best thing to do is try to implement a UoW from scratch. You could either follow the model we have quite closely, or perhaps experiment with separating the UoW (whose responsibilities are commit(), rollback(), and providing the .batches repository) from the context manager, whose job is to initialise things and then do the commit or rollback on exit. If you feel like going all-functional rather than messing about with all these classes, you could use @contextmanager from contextlib.
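To give a flavour of that functional option (this is only a sketch of one possible shape, not a model answer), a generator-based UoW could hand back a lightweight object holding the repository and a commit callable:

# sketch only: assumes the SqlAlchemyRepository and a session_factory as above
from collections import namedtuple
from contextlib import contextmanager

FunctionalUoW = namedtuple('FunctionalUoW', ['batches', 'commit'])  # hypothetical shape


@contextmanager
def start_uow(session_factory):
    session = session_factory()
    try:
        yield FunctionalUoW(
            batches=repository.SqlAlchemyRepository(session),
            commit=session.commit,
        )
    finally:
        session.rollback()  # harmless if commit() has already been called
        session.close()

Client code would then read much like the start_uow() version in the digression above: with start_uow(session_factory) as uow: ...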
We’ve stripped out both the actual UoW and the fakes, as well as paring back the abstract UoW. Why not send us a link to your repo if you come up with something you’re particularly proud of?
Hopefully we’ve convinced you that the Unit of Work is a useful pattern, and hopefully you’ll agree that the context manager is a really nice Pythonic way of visually grouping code into blocks that we want to happen atomically.
This pattern is so useful, in fact, that SQLAlchemy already uses a unit of work in the shape of the Session object. The Session object in SQLAlchemy is the way that your application loads data from the database.
Every time you load a new entity from the database, the Session begins to track changes to the entity, and when the Session is flushed, all your changes are persisted together.
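In fact you can use the Session in a unit-of-work style with no abstractions of ours at all. A bare-SQLAlchemy sketch of the allocation above might read:

# plain SQLAlchemy: the Session itself plays the unit-of-work role
session = session_factory()
try:
    batch = session.query(model.Batch).filter_by(reference='batch1').one()
    batch.allocate(model.OrderLine('o1', 'HIPSTER-WORKBENCH', 10))
    session.commit()  # every change the Session has tracked is persisted together
except Exception:
    session.rollback()  # or we throw the whole lot away
    raise
finally:
    session.close()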
Why do we go to the effort of abstracting away the SQLAlchemy session if it already implements the pattern we want?
For one thing, the Session API is rich and supports operations that we don’t
want or need in our domain. Our UnitOfWork
simplifies the Session to its
essential core: it can be started, committed, or thrown away.
For another, we’re using the UnitOfWork
to access our Repository
objects.
This is a neat bit of developer usability that we couldn’t do with a plain
SQLAlchemy Session.
Lastly, we’re motivated again by the dependency inversion principle: our service layer depends on a thin abstraction, and we attach a concrete implementation at the outside edge of the system. This lines up nicely with SQLAlchemy’s own recommendations:
Keep the lifecycle of the session (and usually the transaction) separate and external. The most comprehensive approach, recommended for more substantial applications, will try to keep the details of session, transaction and exception management as far as possible from the details of the program doing its work.
- Unit of Work is an abstraction around data integrity: It helps to enforce the consistency of our domain model, and improves performance, by letting us perform a single flush operation at the end of an operation.
- It works closely with the Repository and Service Layer: The Unit of Work pattern completes our abstractions over data access by representing atomic updates. Each of our service-layer use cases runs in a single unit of work that succeeds or fails as a block.
- This is a lovely case for a context manager: Context managers are an idiomatic way of defining scope in Python. We can use a context manager to automatically roll back our work at the end of a request, which means the system is safe by default.
- SQLAlchemy already implements this pattern: We introduce an even simpler abstraction over the SQLAlchemy Session object in order to "narrow" the interface between the ORM and our code. This helps to keep us loosely coupled.
| Pros | Cons |
|---|---|
| We get a nice abstraction over the concept of atomic operations, and the context manager makes it easy to see, visually, which blocks of code are grouped together atomically. | Your ORM probably already has some perfectly good abstractions around atomicity; the SQLAlchemy Session is itself a unit of work, so we're adding another layer on top of it. |
| We get explicit control over when a transaction starts and finishes, and our application is then safe by default: the only code path that changes state is total success plus an explicit commit. | We've deliberately narrowed the Session's rich API down to commit, rollback, and repository access, which means giving up some of its power. |