Skip to content
This repository has been archived by the owner on May 1, 2020. It is now read-only.

Implementing a ContentProvider

Sam Bosley edited this page Sep 22, 2015 · 13 revisions

ContentProviders are tricky. While SquiDB can't generate a ContentProvider for you like it can generate models, it does provide some simple yet powerful tools to help you think about and build classes that support exposing your data through a ContentProvider, while reconciling differences between your public-facing API ("contract") and the actual schema you use internally.

Motivation

Let's start with the problem statement:

  1. You have a database containing some tables. These tables are represented by model specs.
  2. You'd like to expose some subset of your data via a ContentProvider. This is represented by your public contract.
  3. You want callers of your ContentProvider to be able to filter query results using selection and selectionArgs based on public contract, but those values/column names may differ from your database schema. Most of these differences revolve around one of three things:
  4. Columns in your database that are not exposed in your contract. Even if you don't declare hidden columns in your contract, callers can guess them, so you need to prevent their usage in projection and selection arguments.
  5. Columns in your contract that don't exist in any table. This could be a value derived as part of a query.
  6. Columns in your contract may have different names than the actual columns in your database. You expect third parties to use these aliased names in projection, selection, and sortOrder arguments, and you don't want to try to parse these to transform back to something that fits your schema.

Defining a Contract

A common practice is to publish a ContentProvider API as a "contract" class, similar to ContactsContract or CalendarContract. The contract contains two key types of information: URIs that can be queried, and the column names that are valid for any given URI. We recommend you think of a URI as giving a name to a specific "perspective" into your data. This could be a filtered subset of some table, a join across multiple tables, or even just some table as-is. The column names you provide for any given URI should match the conceptual perspective of that URI.

Let's create a hypothetical schema and a hypothetical API, and see how we can build a ContentProvider to support our creation. Here are some tables that model (in a very sophisticated way) a university:

TABLE students
    _id, name, enroll_date
TABLE faculty
    _id, name, department, social_security_number
TABLE courses
    _id, code, start_date, end_date, credits, instructor
TABLE course_enrollments
    _id, student_id, course_id, grade

ModelSpec files for each of these can be seen here: Model Specs

Here are the types of URIs a consumer might expect to be able to query (sans the content://authority):

Students Faculty Courses
/students /faculty /courses
/students/# /faculty/# /courses/#
/students/#/courses /faculty/#/courses /courses/#/faculty
/courses/#/students

We have four tables, but there is no sense exposing the course_enrollments table directly as it mainly serves to map a student's participation in various courses over time, so our API only contains 3 entities with top-level paths. Now let's consider the data (or columns) pertaining to each entity that a consumer might expect us to provide:

Students (/students) { _id, name, enrollment_date, credits_earned, gpa }
Faculty (/faculty) { _id, name, department }
Courses (/courses) { _id, code, start_date, end_date, credits, instructor_id }

For Students, we're aliasing "enroll_date" as "enrollment_date"--this isn't a particularly necessary thing to do, but will help illustrate how we deal with this discrepancy. Notice also that for Students and Courses there are some additional columns not defined in the schema (credits_earned, and gpa), but these are data that would be sensible to provide to a consumer. Finally, Faculty does not return some data that should remain private (social_security_number).

Some URIs may need to provide columns from multiple tables, or from tables not exposed directly by the contract. For instance, /students/#/courses corresponds to the list of courses a particular student is taking, so a reasonable collection of columns might be

Courses By Student (/students/#/courses) { _id, code, start_date, end_date, credits, instructor_id, grade }

...where _id comes from the courses table and grade comes from the course_enrollments table. Similarly, if /courses/#/students corresponds to a list of students in a particular course, this might be reasonable

Students By Course (/courses/#/students) { _id, name, enroll_date, credits_earned, gpa, grade }

...where _id comes from the students table, grade comes from the course_enrollments table, and both gpa and credits_earned are calculated columns.

We can now build a contract that looks like this: UniversityContract

Handling Queries

We have our models and our contract, but we still need to deal with the three problems mentioned before: hidden columns, fabricated columns, and aliased columns. All three of these can be solved with a simple idea we call...

The Subquery Pattern

A normal query looks like this:

SELECT [projection] FROM [source] WHERE [selection] ORDER BY [sortOrder]

Where source is the perspective of the data you want to expose. The Subquery Pattern looks like this:

SELECT [projection] FROM (
    SELECT [modified-columns] FROM [source] ...
) AS subquery WHERE [selection] ORDER BY [sortOrder]

Essentially, what was the original source is now itself an inner SELECT, and this inner select can include, exclude, or rename any columns as necessary, as well as derive new columns. (You don't have to use an inner SELECT, it could just as easily be a SQLite table or view that has only the desired columns.) This modified set of columns can now be referenced in the caller's projection, selection, and sortOrder as per the contract.

What we need then is a way to express this inner SELECT as its own model.

ViewModels and "Subquery Models"

Let's look at /students, where we need to alias one column and calculate some other columns by joining on course_enrollments. Here's a model spec for this: ContractStudent spec

This model is based on a view rather than a table. The basics of this are covered in Model specs for SQL views. What's important here are the use of @Alias on the ENROLL_DATE property and the presence of the CREDITS_EARNED and GPA properties, which are derived from functions. The resulting View in the generated file will select all of these with their properly aliased names.

A view is really a pre-defined query, so it is interchangeable with a subquery in the Subquery Pattern. The main difference is that views lead to prettier SQL (and potentially a decrease in SQL parsing overhead), but subqueries don't need to be maintained in the DB if your schema/desired perspective ever changes. Once we add this view to our database, queries for /students simply look like this:

SELECT [projection] FROM view_students WHERE [selection] ORDER BY [sortOrder]

For /students/#/courses, we want to join courses with course_completions. Here's a model spec that accomplishes that: CoursesByStudent spec

This is a "Subquery Model"; it's declared the same way as a ViewModel, but because of the isSubquery = true in the annotation, it creates a named SubqueryTable called SUBQUERY instead of a View in the generated model. You don't need to add this to your database. Queries for /students/#/courses look like this:

SELECT [projection] FROM (
    SELECT courses._id AS _id, courses.code AS code, ..., course_enrollments.grade AS grade
    FROM courses
    INNER JOIN course_enrollments ON (courses._id = course_enrollments.course_id)
) AS subquery WHERE [selection] ORDER BY [sortOrder]

Here you can see the inner SELECT that lends the Subquery Pattern its name. Note that when creating the @ViewQuery for this model, we don't yet know the student's ID, so that's not part of the query. This information is provided by the caller in the URI and will need to be applied at runtime.

ContentProviderQueryBuilder

SquiDB provides a class called ContentProviderQueryBuilder to help form queries using the Subquery Pattern described above. If you have a model that exactly matches the API you want to expose (like ContractStudent), it's dead simple to set up:

public Cursor queryStudents(String[] projection, String selection, String[] selectionArgs, String order) {
    ContentProviderQueryBuilder builder =
            new ContentProviderQueryBuilder(ContractStudent.PROPERTIES, ContractContact.VIEW);
    builder.setStrict(true);
    Query query = builder.build(projection, selection, selectionArgs, sortOrder);
    return database.query(ContractStudent.class, query);
}

Recall that for /students/#/courses we needed to apply a student ID at runtime. Here's what that would look like:

public Cursor queryCoursesByStudent(long studentId, String[] projection, String selection, String[] selectionArgs, String order) {
    ContentProviderQueryBuilder builder =
            new ContentProviderQueryBuilder(CoursesByStudent.PROPERTIES, CoursesByStudent.SUBQUERY);
    builder.setStrict(true);
    Query query = builder.build(projection, selection, selectionArgs, sortOrder);
    query.where(CoursesByStudent.STUDENT_ID.eq(studentId));
    return database.query(CoursesByStudent.class, query);
}

If you need more manual control of the ContentProviderQueryBuilder, you can use the empty constructor and configure it yourself. One good reason for this might be to handle queries on /faculty, where we want to use the Faculty model but not expose a private column. For that we will introduce...

ProjectionMap

Suppose we don't want to create a new model to match our contract's API for /faculty. If we simply call new ContentProviderQueryBuilder(FacultyMember.class), the private column social_security_number is exposed. Instead we can do this:

public Cursor queryFaculty(String[] projection, String selection, String[] selectionArgs, String order) {
    ProjectionMap map = new ProjectionMap();
    map.put(FacultyMember.ID);
    map.put(FacultyMember.NAME);
    map.put(FacultyMember.DEPARTMENT);

    ContentProviderQueryBuilder builder = new ContentProviderQueryBuilder()
            .setStrict(true)
            .setDataSource(FacultyMember.TABLE)
            .setProjectionMap(map);
    Query query = builder.build(projection, selection, selectionArgs, sortOrder);
    return database.query(FacultyMember.class, query);
}

ProjectionMap simply maps strings to Fields/Properties. If the Field/Property already has the appropriate name, as is the case with our FacultyMember properties, you can put it directly into the map and it will be mapped to its current name.

In previous examples, ContentProviderQueryBuilder built this on our behalf with all the properties of the model class, as well as determined the data source directly from the model class. Here we are manually configuring the builder to provide a more restrictive ProjectionMap. When build() is called, the caller's projection is compared to items in this map, and anything not found in the map is either ignored or causes an exception to be thrown, depending on whether the builder has strict mode enabled. (For information on why strict mode is important, see Protecting against SQL Injection.)

Other CRUD Operations

Inserts

Using a ViewModel or "Subquery Model" for different URIs also helps you support write operations. ViewModel provides a mapToModel() method that you can use to split apart a ViewModel instance into its constituent models, even if the ViewModel has aliased some or all of those columns. Here's an example of inserting a new Student, keeping in mind that the values to insert are mapped to contract column names, i.e. the ones in ContractStudent.

public Uri insertStudent(ContentValues values) {
    // The content values are constructed using the Students contract columns as keys
    ContractStudent contractStudent = new ContractStudent(values);

    Student student = contractStudent.mapToModel(new Student());
    validateStudent(student); // Make sure the provided values are sane
    database.createNew(student); // insert the new row

    return Students.getContentUri(student.getId()); // student now has an ID
}

Notice the use of database.createNew(); we could use database.persist() instead, but in that case we should clear the ID of the student before attempting the insert (in case the ContentValues to insert contained a value for _id).

For URIs where inserts need to add data to multiple tables, you can call mapToModel() once for each constituent table model and perform the insert.

Updates and Deletes

As with inserts, ViewModels and "Subquery Models" are useful for update operations. The only wrinkle is that you cannot use the caller's selection directly since it is using contract column names from your ViewModel. To reconcile the rows that would be updated, you will likely construct an IN clause like so:

public int updateStudents(ContentValues values, String selection, String[] selectionArgs) {
    // The content values are constructed using the Students contract columns as keys
    ContractStudent contractStudent = new ContractStudent(values);

    Student template = contractStudent.mapToModel(new Student());
    template.setId(TableModel.NO_ID); // don't allow updating _id
    validateStudent(student); // Make sure the provided values are sane

    Query query = Query.select(ContractStudent.ID)
			.from(ContractStudent.VIEW)
			.where(Criterion.fromRawSelection(selection, selectionArgs));
    int count = database.update(Student.ID.in(query), template);
    return count;
}

Alternatively, you could run the query separately and iterate over the cursor, performing individual updates. This might be preferable if individual write operations can have side effects that need to be resolved. If the selection is empty, you could skip the query and simply use database.updateAll() for the update.

Delete operations also may necessitate these strategies when using a ViewModel or "Subquery Model".

Sample Code

UniversityProvider.java

Other Considerations

For demonstrative purposes, the UniversityProvider sample code implements all operations for as many URIs as possible. In reality, you may not want to support all operations for all the entities. In particular, you may not want to allow deletes at all, or you may want to support them by way of "soft" deletes, where a row is marked as deleted somehow but is not actually removed.

One area that hasn't been discussed is how to modify the course_completions table. We've already decided that this table should not be visible to the outside world, so the public contract should never contain information about it, but we may still want to perform read or write operations on that table through the ContentProvider. In that case, we should use permissions to protect URIs that map to that table. Additionally, there are numerous other operations that would alter this table's data as a side effect, such as updating or removing a course. For now we will leave those implementations as an exercise to the reader.