
Version 0.11.0

@jtnelson jtnelson released this 04 Mar 15:11
· 52 commits to master since this release
BREAKING CHANGES

There is only PARTIAL COMPATIBILITY between

  • a 0.11.0+ client and an older server, and
  • a 0.11.0+ server and an older client.

Due to changes in Concourse's internal APIs,

  • An older client will receive an error when trying to invoke any audit methods on a 0.11.0+ server.
  • An older server will throw an error when any audit or review methods are invoked from a 0.11.0+ client.
Storage Format Version 3
  • This version introduces a new, more concise storage format where Database files are now stored as Segments instead of Blocks. In a segment file (.seg), all views of indexed data (primary, secondary, and search) are stored in the same file whereas a separate block file (.blk) was used to store each view of data in the v2 storage format. The process of transporting writes from the Buffer to the Database remains unchanged. When a Buffer page is fully transported, its data is durably synced in a new Segment file on disk.
  • The v3 storage format should reduce the number of data file corruptions because there are fewer moving parts.
  • An upgrade task has been added to automatically copy data to the v3 storage format.
    • The upgrade task will not delete v2 data files, so be mindful that you will need twice the amount of disk space that your data currently occupies in order to upgrade. You can safely delete the v2 files manually after the upgrade. If the v2 files remain, a future version of Concourse may automatically delete them for you.
  • In addition to improved data integrity, the v3 storage format brings performance improvements to all operations because of more efficient memory management and smarter usage of asynchronous work queues.
Atomic Commit Timestamps

All the writes in a committed atomic operation (e.g. anything from primitive atomics to user-defined transactions) will now have the same version/timestamp. Previously, when an atomic operation was committed, each write was assigned a distinct version. Because each atomic write was applied as a distinct state change, it was possible to violate ACID semantics after the fact by performing a partial undo or partial historical read. Now, the version associated with each write is known as the commit version. For non-atomic operations, autocommit is in effect, so each write continues to have a distinct commit version. For atomic operations, the commit version is generated when the operation is committed and is assigned to every write in that operation. As a result, all historical reads will either see all or see none of the committed atomic state, and undo operations (e.g. clear, revert) will either affect all or affect none of the committed atomic state.
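
The all-or-nothing behaviour of historical reads can be illustrated with a toy model. The sketch below is not Concourse's implementation or API: the CommitVersionSketch class, its Write type, and the logical counter used as a version are all hypothetical stand-ins. It only shows that when every write in an atomic commit shares one commit version, no version exists at which a read could observe part of the commit.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

/** Toy write log (hypothetical, not Concourse code) that tags writes with a commit version. */
final class CommitVersionSketch {

    enum Action { ADD, REMOVE }

    /** A single write; its version is unknown until it is committed. */
    static final class Write {
        final Action action;
        final String key;
        final String value;
        final long record;
        long version; // assigned at commit time

        Write(Action action, String key, String value, long record) {
            this.action = action;
            this.key = key;
            this.value = value;
            this.record = record;
        }
    }

    private final List<Write> log = new ArrayList<>();
    private long clock = 0; // logical counter standing in for a timestamp

    /** Autocommit: a lone write gets its own distinct commit version. */
    long autocommit(Write write) {
        return commit(Collections.singletonList(write));
    }

    /** Atomic commit: every write in the batch receives the same commit version. */
    long commit(List<Write> writes) {
        long version = ++clock;
        for (Write write : writes) {
            write.version = version;
            log.add(write);
        }
        return version;
    }

    /** Historical read: the values stored for key in record as of the given version. */
    Set<String> select(String key, long record, long asOf) {
        Set<String> values = new TreeSet<>();
        for (Write write : log) {
            if (write.version > asOf) {
                break; // the log is ordered by commit version
            }
            if (write.record == record && write.key.equals(key)) {
                if (write.action == Action.ADD) {
                    values.add(write.value);
                } else {
                    values.remove(write.value);
                }
            }
        }
        return values;
    }

    public static void main(String[] args) {
        CommitVersionSketch db = new CommitVersionSketch();
        long before = db.autocommit(new Write(Action.ADD, "balance", "100", 1));
        long after = db.commit(Arrays.asList(
                new Write(Action.REMOVE, "balance", "100", 1),
                new Write(Action.ADD, "balance", "70", 1),
                new Write(Action.ADD, "balance", "30", 2)));
        // Reads at "before" see none of the transfer; reads at "after" see all of it.
        System.out.println(db.select("balance", 1, before) + " " + db.select("balance", 2, before));
        System.out.println(db.select("balance", 1, after) + " " + db.select("balance", 2, after));
    }
}
```

Because the three writes of the transfer share one version, the only versions a reader can ask for are the one before the commit and the one after it.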

Optimizations
  • The storage engine has been optimized to use less memory when indexing by de-duplicating and reusing equal data components. This drastically reduces the amount of time that the JVM must dedicate to Garbage Collection. Previously, when indexing, the storage engine would allocate new objects to represent data even if equal objects were already buffered in memory.
  • We switched to a more compact in-memory representation of the Inventory, resulting in a reduction of its heap space usage by up to 97.9%. This has an indirect benefit to overall performance and throughput by reducing memory contention that could lead to frequent JVM garbage collection cycles.
  • Improved user-defined transactions by detecting when an attempt is made to atomically commit multiple Writes that toggle the state of a field (e.g. ADD name as jeff in 1, REMOVE name as jeff in 1, ADD name as jeff in 1) and committing at most one equal Write, which is all that is required to obtain the intended state. In the example above, only one Write for ADD name as jeff in 1 would be committed.
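
The toggle detection described in the last item can be sketched as a parity check. This is a minimal illustration, not the storage engine's code: ToggleCollapseSketch and its Write type are hypothetical, and the sketch assumes that the staged writes for a given key/value/record strictly alternate between ADD and REMOVE, so an even number of them cancels out and an odd number nets out to a single write.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of collapsing staged writes that toggle the same field. */
final class ToggleCollapseSketch {

    enum Action { ADD, REMOVE }

    static final class Write {
        final Action action;
        final String key;
        final String value;
        final long record;

        Write(Action action, String key, String value, long record) {
            this.action = action;
            this.key = key;
            this.value = value;
            this.record = record;
        }

        /** Identifies the key/value/record that this write toggles. */
        String target() {
            return key + "\u0000" + value + "\u0000" + record;
        }

        @Override
        public String toString() {
            return action + " " + key + " as " + value + " in " + record;
        }
    }

    /** Keep at most one write per toggled key/value/record. */
    static List<Write> collapse(List<Write> staged) {
        Map<String, Write> net = new LinkedHashMap<>();
        for (Write write : staged) {
            // A second write to the same target undoes the pending one;
            // otherwise this write becomes the pending one.
            if (net.remove(write.target()) == null) {
                net.put(write.target(), write);
            }
        }
        return new ArrayList<>(net.values());
    }

    public static void main(String[] args) {
        List<Write> staged = Arrays.asList(
                new Write(Action.ADD, "name", "jeff", 1),
                new Write(Action.REMOVE, "name", "jeff", 1),
                new Write(Action.ADD, "name", "jeff", 1));
        System.out.println(collapse(staged));
    }
}
```

With the three staged writes from the example, only the final ADD survives; an ADD followed by a REMOVE would leave nothing to commit.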
Performance
  • We improved the performance of commands that sort data by an average of 38.7%. These improvements are the result of a new Strategy framework that allows Concourse Server to dynamically choose the optimal path for data lookups depending on the entire context of the command and the state of the storage engine. For example, when sorting a result set on key1, Concourse Server will now intelligently decide to look up the values across key1 using the relevant secondary index if key1 is also a condition key. Alternatively, Concourse Server will decide to look up the values across key1 using the primary key for each impacted record if key1 is also being explicitly selected as part of the operation.
  • Search is drastically faster as a result of the improved memory management that comes with the v3 storage format, as well as other changes to the way that search indexes are read from disk and represented in memory. On real-world data, search is up to 95.52% faster.
New Functionality
  • Added trace functionality to atomically locate and return all the incoming links to one or more records. The incoming links are represented as a mapping from key to a set of records where the key is stored as a Link to the record being traced.
  • Added consolidate functionality to atomically combine data from one or more records into another record. The records from which data is merged are cleared and all references to those cleared records are replaced with the consolidated record on the document-graph.
  • Added the concourse-export framework which provides the Exporter construct for building tools that print data to an OutputStream in accordance with Concourse's multi-valued data format (e.g. a key mapped to multiple values will have those values printed as a delimited list). The Exporters utility class contains built-in exporters for exporting within CSV and Microsoft Excel formats.
  • Added an export CLI that uses the concourse-export framework to export data from Concourse in CSV format to STDOUT or a file.
  • For CrossVersionTests, as an alternative to using the Versions annotation, added the ability to define test versions in a no-arg static method called versions that returns a String[]. Using the static method makes it possible to centrally define the desired test versions in a static variable that is shared across test classes.
  • The server variable in a ClientServerTest (from the concourse-ete-test-core framework) now exposes the server configuration from the prefs() method to facilitate programmatic configuration management within tests.
  • Added the ability to configure the location of the access credentials file using the new access_credentials_file preference in concourse.prefs. This makes it possible to store credentials in a more secure directory that is also protected against instances when the concourse-server installation directory is deleted. Please note that changing the value of access_credentials_file does not migrate existing credentials. By default, credentials are still stored in the .access file within the root of the concourse-server installation directory.
  • Added a separate log file for upgrade tasks (log/upgrade.log).
  • Added a mechanism for failed upgrade tasks to automatically perform a rollback that'll reset the system state to be consistent with the state before the task was attempted.
  • Added PrettyLinkedHashMap.of and PrettyLinkedTableMap.of factory methods that accept an analogous Map as a parameter. The input Map is lazily converted into one with a pretty toString format on-demand. In cases where a Map is not expected to be rendered as a String, but should be pretty if it is, these factories return a Map that defers the overhead of prettification until it is necessary.
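
The deferred prettification described in the last item can be sketched with a thin wrapper. This is an assumption-laden illustration rather than the PrettyLinkedHashMap source: LazyPrettyMapSketch, its renders counter, and the two-column layout are all hypothetical. The point is only that reads delegate straight to the input Map and the formatting cost is paid when toString is actually called.

```java
import java.util.AbstractMap;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

/** Hypothetical read-through Map view that formats itself only when rendered. */
final class LazyPrettyMapSketch<K, V> extends AbstractMap<K, V> {

    private final Map<K, V> source;
    int renders = 0; // counts toString() calls, for demonstration only

    private LazyPrettyMapSketch(Map<K, V> source) {
        this.source = source;
    }

    /** Wrap the input Map without copying or formatting anything up front. */
    static <K, V> LazyPrettyMapSketch<K, V> of(Map<K, V> source) {
        return new LazyPrettyMapSketch<>(source);
    }

    @Override
    public Set<Entry<K, V>> entrySet() {
        return source.entrySet();
    }

    @Override
    public V get(Object key) {
        return source.get(key);
    }

    @Override
    public boolean containsKey(Object key) {
        return source.containsKey(key);
    }

    /** Build a simple two-column table; this is the only place formatting happens. */
    @Override
    public String toString() {
        renders++;
        int width = "Key".length();
        for (K key : source.keySet()) {
            width = Math.max(width, String.valueOf(key).length());
        }
        String row = "%-" + width + "s | %s\n";
        StringBuilder sb = new StringBuilder(String.format(row, "Key", "Value"));
        for (Entry<K, V> entry : source.entrySet()) {
            sb.append(String.format(row, entry.getKey(), entry.getValue()));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Map<String, Object> data = new LinkedHashMap<>();
        data.put("name", "jeff");
        data.put("age", 30);
        LazyPrettyMapSketch<String, Object> pretty = LazyPrettyMapSketch.of(data);
        System.out.println(pretty.get("name")); // no formatting work yet
        System.out.print(pretty);               // formatting happens here
    }
}
```

A caller that never prints the Map pays only for the delegation, which is the trade-off the factories are described as making.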
CCL Support
  • Added support for specifying a CCL Function Statement as a selection/operation key, an evaluation key (within a Condition), or an evaluation value (within a Condition). A function statement can be provided as either the appropriate string form (e.g. function(key), function(key, ccl), key | function, etc.) or the appropriate Java Object (e.g. IndexFunction, KeyConditionFunction, ImplicitKeyRecordFunction, etc.). The default behaviour when reading is to interpret any string that looks like a function statement as a function statement. To perform a literal read of a string that appears to be a function statement, simply wrap the string in quotes. Finally, a function statement can never be written as a value.
Experimental Features
Compaction
  • Concourse Server can now be configured to compact data files in an effort to optimize storage and improve read performance. When enabled, compaction automatically runs continuously in the background without disrupting data consistency or normal operations (although the impact on operational throughput has yet to be determined). The initial rollout of compaction is intentionally conservative (e.g. the built-in strategy will likely only make changes to a few data files). While this feature is experimental, there is no ability to tune it, but we plan to offer additional preferences to tailor the behaviour in future releases.
  • Additionally, if compaction is enabled, it can be suggested to Concourse Server on an ad hoc basis using the new concourse data compact CLI.
    • Compaction can be enabled by setting the enable_compaction preference to true. If this setting is false, Concourse Server will not perform compaction automatically or when suggested to do so.
Search Caching
  • Concourse Server can now be configured to cache search indexes. This feature is currently experimental and turned off by default. Enabling the search cache will further improve the performance of repeated searches by up to 200%, but there is additional overhead that can slightly decrease the throughput of overall data indexing. Decreased indexing throughput may also indirectly affect write performance.
    • The search cache can be enabled by setting the enable_search_cache preference to true.
Verify by Lookup
  • Concourse Server can now be configured to use special lookup records when performing a verify within the Database. In theory, the Database can respond to verifies faster when generating a lookup record because fewer irrelevant revisions are read from disk and processed in memory. However, lookup records are not cached, so repeated attempts to verify data in the same field (e.g. a counter whose value is stored against a single locator/key) or record may be slower.
    • Verify by Lookup can be enabled by setting the enable_verify_by_lookup preference to true.
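
For reference, the three experimental preferences named above could be set together in concourse.prefs. The fragment below is a sketch that assumes the file uses a key = value syntax; the preference names come from these notes, and each one can be enabled independently of the others.

```
# concourse.prefs (assumed key = value syntax)
enable_compaction = true
enable_search_cache = true
enable_verify_by_lookup = true
```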
API Breaks and Deprecations
  • Upgraded to CCL version 3.1.1. Internally, the database engine has switched to using a Compiler instead of a Parser. As a result, the Concourse-specific Parser has been deprecated.
  • It is only possible to upgrade to this version from Concourse 0.10.6+. Previously, it was possible to upgrade to a new version of Concourse from any prior version.
  • Deprecated the ByteBuffers utility class in favor of the same in the accent4j library.
  • Deprecated PrettyLinkedHashMap.newPrettyLinkedHashMap factory methods in favor of PrettyLinkedHashMap.create.
  • Deprecated PrettyLinkedHashMap.setKeyName in favor of PrettyLinkedHashMap.setKeyColumnHeader
  • Deprecated PrettyLinkedHashMap.setValueName in favor of PrettyLinkedHashMap.setValueColumnHeader
  • Deprecated PrettyLinkedTableMap.setRowName in favor of PrettyLinkedHashMap.setIdentifierColumnHeader
  • Deprecated PrettyLinkedTableMap.newPrettyLinkedTableMap factory methods in favor of PrettyLinkedTableMap.create
  • Deprecated the Concourse#audit methods in favor of Concourse#review ones that take similar parameters. A review returns a Map<Timestamp, List<String>> instead of a Map<Timestamp, String> (as is the case with an audit) to account for the fact that a single commit timestamp/version may contain multiple changes.
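
The reason for the change in return type can be seen in a small self-contained sketch. It does not call the Concourse client: ReviewShapeSketch, its Change type, the use of a Long in place of Timestamp, and the change descriptions are hypothetical. It shows that once several changes share a commit version, a map with one String per version can hold only one of them, whereas a map of lists keeps them all.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical comparison of audit-shaped and review-shaped results. */
final class ReviewShapeSketch {

    static final class Change {
        final long commitVersion; // stand-in for a Timestamp
        final String description;

        Change(long commitVersion, String description) {
            this.commitVersion = commitVersion;
            this.description = description;
        }
    }

    /** Audit shape: one description per version, so later entries overwrite earlier ones. */
    static Map<Long, String> auditShape(List<Change> changes) {
        Map<Long, String> result = new LinkedHashMap<>();
        for (Change change : changes) {
            result.put(change.commitVersion, change.description);
        }
        return result;
    }

    /** Review shape: every description recorded under its commit version. */
    static Map<Long, List<String>> reviewShape(List<Change> changes) {
        Map<Long, List<String>> result = new LinkedHashMap<>();
        for (Change change : changes) {
            result.computeIfAbsent(change.commitVersion, v -> new ArrayList<>())
                    .add(change.description);
        }
        return result;
    }

    public static void main(String[] args) {
        List<Change> changes = Arrays.asList(
                new Change(1L, "first change"),
                new Change(2L, "second change, same commit as the third"),
                new Change(2L, "third change, same commit as the second"));
        System.out.println(auditShape(changes));
        System.out.println(reviewShape(changes));
    }
}
```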
Bug Fixes
  • Fixed a bug that caused the system version to be set incorrectly when a newly installed instance of Concourse Server (i.e. not upgraded) utilized data directories containing data from an older system version. This bug caused some upgrade tasks to be skipped, placing the system in an unstable state.
  • Fixed a bug that made it possible for Database operations to unexpectedly fail in rare cases due to a locator mismatch resulting from faulty indexing logic.
  • Fixed a bug in the serialization/deserialization logic for datasets passed between Concourse Server and plugins. This bug caused plugins to fail when performing operations that included non-trivial datasets.
  • Fixed a bug that caused datasets returned from Concourse Server to a plugin to be incorrectly missing data when inverted.