Jens Schauder edited this page Sep 22, 2017 · 4 revisions

Why Spring Data JDBC?

The idea behind Spring Data JDBC is to provide an alternative for users who, for one reason or another, don’t want to use JPA but still want to use a relational database and the Spring Data abstraction.

Spring Data JDBC does not intend to be a complete ORM. Instead, it provides a structure for the persistence layer of your application in which you can define the details using plain SQL and Java code or third-party libraries like MyBatis or jOOQ.

Possible reasons to prefer Spring Data JDBC over JPA

Note that all of the problems stated here can be solved using JPA. But at some point, one puts more work into working around the framework than into working with it, at which point one might want to consider an alternative.

Enforces non-obvious design choices on the domain model

This article describes a couple of things one should do to one’s domain model in order to get the desired behavior out of Hibernate.

Behaviour of the EntityManager

The huge number of Stack Overflow questions regarding NonUniqueObjectException or LazyInitializationException indicates that many developers have a hard time understanding the basics of how Hibernate works. Many probably have similar problems with SQL or JDBC concepts such as joins, but there is an important difference: the chances are slim that a join looks like it is working for some time and then breaks when some other change is introduced elsewhere in the application. Also, when using JPA one still has to understand SQL anyway.

Dirty Tracking: Blessing or Curse?

The automatic detection of changed entities is nice in many cases. In the cases where it isn’t, however, it is hard to control what gets persisted and what does not.

Event handling is seriously limited

To avoid conflicts with the original database operation that fires the entity lifecycle event (which is still in progress), callback methods should not call EntityManager or Query methods and should not access any other entity objects.

No graceful customization

While one can always fall back on using SQL, it is not feasible to customize just one part of the package while still using the parts orthogonal to that feature. For example, one cannot easily and precisely control the SQL statement used to update an entity without also losing caching, dirty tracking and cascading.

Architecture

Goals

Good support for a Domain-Driven Design approach to software design. Relevant abstractions like Repository, Aggregate and AggregateRoot should be easy to support in a natural way.

Early first version. We need all the input we can get to arrive at a design that supports many use cases. Therefore we should strive to get something usable into the hands of users ASAP.

Encourage good design by making the right choices the easy ones.

Be incrementally customizable, i.e. avoid situations where one has to either do something one doesn’t like or not use Spring Data JDBC at all.

Out of Scope

Spring Data JDBC is not intended as a drop-in replacement for JPA or similar ORMs. In particular, it won’t provide the following features, in order to limit complexity.

Dirty Tracking: Dirty tracking requires that entities are somehow controlled by Spring Data JDBC, which introduces a complexity we want to avoid. This means that by default Spring Data JDBC has to assume it doesn’t know how the current state of the database compares to the current state of an aggregate in memory.

Caching: Caching should not be implemented as part of Spring Data JDBC. Instead, repositories should allow integration with existing caching solutions.

Transparent Lazy Loading: While we might eventually support a special data type to materialize references on request, it should always be very obvious in the source code where the database might get accessed.
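What such a "special data type" could look like is open. The following is a minimal sketch only, not part of Spring Data JDBC: `LazyRef` and its `load()` method are hypothetical names, and the loader is an arbitrary `Supplier` standing in for a repository call or SQL query. The point it illustrates is that the database access is an explicit method call in the source code rather than a hidden side effect of a getter.

```java
import java.util.function.Supplier;

/**
 * Hypothetical reference type: the target is only loaded when load() is called.
 * Not thread-safe; kept deliberately small for illustration.
 */
final class LazyRef<T> {
    private final Supplier<T> loader; // stands in for a repository call or SQL query
    private T value;
    private boolean loaded;

    LazyRef(Supplier<T> loader) {
        this.loader = loader;
    }

    /** The only place where the database might get accessed. */
    T load() {
        if (!loaded) {
            value = loader.get();
            loaded = true;
        }
        return value;
    }

    boolean isLoaded() {
        return loaded;
    }
}
```

A caller would write something like `person.employer.load()`, which makes the potential database round trip visible at the call site.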

How we go from an Object Graph to a list of SQL statements

  1. The class graph gets transformed into a table structure. This table structure is itself an object in memory. On a basic level each entity class becomes a table. In some cases an additional table might get created from a reference. For example, when a List or Map gets stored, a special table containing the key/index is required. One may consider the table structure a directed graph, which can be converted to a list using a topological sort.

  2. Based on the table structure in list form, an object graph gets converted into a list of DbActions. Each DbAction represents a single SQL statement to be issued against the database. Each tuple of (table, object) might result in multiple DbActions; for example, for an update of a reference we might decide to first delete the relationship and then recreate it with a second DbAction.

  3. For each DbAction a SQL statement gets generated and executed.

While currently not possible in all cases, eventually a user may customize or replace each step in order to exactly create the statement she wants to use.
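The three steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not the actual implementation: `TableGraph`, `SketchDbAction` and `DbActionSketch` are hypothetical names, the table graph is built by hand instead of being derived from entity classes, only inserts are covered, and Kahn's algorithm is used as one possible topological sort.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical table structure: a directed graph from a table to the tables referencing it. */
final class TableGraph {
    private final Map<String, List<String>> dependents = new LinkedHashMap<>();

    /** Declares that 'child' holds a foreign key to 'parent'. */
    void addReference(String parent, String child) {
        dependents.computeIfAbsent(parent, k -> new ArrayList<>());
        dependents.computeIfAbsent(child, k -> new ArrayList<>());
        dependents.get(parent).add(child);
    }

    /** Step 1: topological sort (Kahn's algorithm), parents before the tables referencing them. */
    List<String> insertOrder() {
        Map<String, Integer> inDegree = new LinkedHashMap<>();
        for (String table : dependents.keySet()) {
            inDegree.put(table, 0);
        }
        for (List<String> children : dependents.values()) {
            for (String child : children) {
                inDegree.merge(child, 1, Integer::sum);
            }
        }
        Deque<String> ready = new ArrayDeque<>();
        for (Map.Entry<String, Integer> entry : inDegree.entrySet()) {
            if (entry.getValue() == 0) {
                ready.add(entry.getKey());
            }
        }
        List<String> order = new ArrayList<>();
        while (!ready.isEmpty()) {
            String table = ready.remove();
            order.add(table);
            for (String child : dependents.get(table)) {
                if (inDegree.merge(child, -1, Integer::sum) == 0) {
                    ready.add(child);
                }
            }
        }
        if (order.size() != dependents.size()) {
            // Circular references cannot be ordered this way.
            throw new IllegalStateException("cycle in table graph");
        }
        return order;
    }
}

/** Hypothetical stand-in for a DbAction: exactly one SQL statement. */
final class SketchDbAction {
    final String sql;

    SketchDbAction(String sql) {
        this.sql = sql;
    }
}

final class DbActionSketch {
    /** Steps 2 and 3 (without execution): one INSERT per table, in dependency order. */
    static List<SketchDbAction> insertActions(TableGraph graph, Map<String, List<String>> columns) {
        List<SketchDbAction> actions = new ArrayList<>();
        for (String table : graph.insertOrder()) {
            List<String> cols = columns.get(table);
            String placeholders = String.join(", ", Collections.nCopies(cols.size(), "?"));
            actions.add(new SketchDbAction("INSERT INTO " + table
                    + " (" + String.join(", ", cols) + ") VALUES (" + placeholders + ")"));
        }
        return actions;
    }
}
```

Keeping the ordering, the action list and the SQL generation as separate steps is what would allow a user to replace just one of them.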

Decisions

How to store one-to-one relationships

It is assumed that the table for the referenced entity contains the foreign key column.
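As a rough illustration of that assumption: if a `Customer` references a `CustomerAddress`, the `customer_address` table carries the column pointing back to `customer`. All class, table and column names below are made up for the example, and using the foreign key as the primary key of the referenced table is a choice of this sketch, not something the text above prescribes.

```java
/** Referenced entity: it has no Id of its own in this sketch. */
final class CustomerAddress {
    final String city;

    CustomerAddress(String city) {
        this.city = city;
    }
}

/** Referencing entity with a one-to-one reference. */
final class Customer {
    final long id;
    final CustomerAddress address;

    Customer(long id, CustomerAddress address) {
        this.id = id;
        this.address = address;
    }
}

final class OneToOneSketch {
    // The referencing table knows nothing about the address.
    static final String CREATE_CUSTOMER =
            "CREATE TABLE customer (id BIGINT PRIMARY KEY)";

    // The referenced entity's table holds the foreign key column (hypothetical name: customer).
    static final String CREATE_ADDRESS =
            "CREATE TABLE customer_address ("
            + "customer BIGINT PRIMARY KEY REFERENCES customer (id), "
            + "city VARCHAR(100))";

    static final String INSERT_ADDRESS =
            "INSERT INTO customer_address (customer, city) VALUES (?, ?)";

    /** Parameters for INSERT_ADDRESS: the owner's Id becomes the foreign key value. */
    static Object[] addressInsertParameters(Customer customer) {
        return new Object[] { customer.id, customer.address.city };
    }
}
```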

How to identify the boundaries of an aggregate

A repository needs to identify what should get persisted together. Based on DDD this is an aggregate. So we need a way to identify aggregate bounds. While providing some separation it should still be easy to use.

Options are:

  1. Everything referenced that is an entity is considered part of the current aggregate. At the boundary to a different aggregate, one would simply store an Id of the target entity. Conversion from Id to entity and back is left to client code. This should work for simple client code. Spring Data REST would be challenged by this because it normally allows providing embedded previews of referenced aggregates, which it would not be able to obtain using this approach.

  2. Similar to 1. but use special Id types that allow automatic conversion to and from entities. From a design point of view this is a very nice solution, but it forces the user to implement an Id type for every entity type.

  3. Similar to 1. but allow injecting a repository into the aggregate root so it can implement additional getters providing access to the referenced aggregates. Matching between Id properties and entity getters could be done by naming convention. The required injected repositories might make testing of the domain model hard. Also, it might easily cause package cycles.

Current choice: Everything reachable is part of the aggregate

Reason: easy to implement
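A minimal sketch of what the chosen option means for client code, using made-up classes: `LineItem` is reachable from `PurchaseOrder` and therefore belongs to its aggregate, while the `Buyer` aggregate is referenced only through its Id. `BuyerRepository` is a hypothetical in-memory stand-in for a real repository, and the conversion from Id to entity happens in client code.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

/** Entity reachable from the root, so it is persisted as part of the PurchaseOrder aggregate. */
final class LineItem {
    final String product;
    final int quantity;

    LineItem(String product, int quantity) {
        this.product = product;
        this.quantity = quantity;
    }
}

/** Aggregate root. */
final class PurchaseOrder {
    final long id;
    final long buyerId; // boundary to another aggregate: only the Id is stored
    final List<LineItem> items = new ArrayList<>();

    PurchaseOrder(long id, long buyerId) {
        this.id = id;
        this.buyerId = buyerId;
    }
}

/** Root of a different aggregate. */
final class Buyer {
    final long id;
    final String name;

    Buyer(long id, String name) {
        this.id = id;
        this.name = name;
    }
}

/** Hypothetical in-memory stand-in for the repository of the Buyer aggregate. */
final class BuyerRepository {
    private final Map<Long, Buyer> store = new HashMap<>();

    void save(Buyer buyer) {
        store.put(buyer.id, buyer);
    }

    Optional<Buyer> findById(long id) {
        return Optional.ofNullable(store.get(id));
    }
}
```

Client code that needs the buyer of an order would call `buyers.findById(order.buyerId)` itself; nothing in `PurchaseOrder` triggers that lookup.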

How to handle circular dependencies

Currently not at all. This should probably be changed.