Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RUM-4098: Use the datastore for Session Replay resources #2041

Merged
merged 8 commits into from
Jun 27, 2024

Conversation

jonathanmos
Copy link
Member

What does this PR do?

Use the datastore from Session Replay, so that resource hashes are persisted between sessions.

Motivation

The second part of the effort to persist bitmap hashes between sessions.

Additional Notes

Anything else we should know when reviewing?

Review checklist (to be filled by reviewers)

  • Feature or bugfix MUST have appropriate tests (unit, integration, e2e)
  • Make sure you discussed the feature or bugfix with the maintaining team in an Issue
  • Make sure each commit and the PR mention the Issue number (cf the CONTRIBUTING doc)

@jonathanmos jonathanmos force-pushed the jmoskovich/rum-4098/sessionreplay-datastore branch 2 times, most recently from 56e08c5 to 542d2b1 Compare May 21, 2024 07:49
@codecov-commenter
Copy link

codecov-commenter commented May 21, 2024

Codecov Report

Attention: Patch coverage is 87.50000% with 13 lines in your changes missing coverage. Please review.

Project coverage is 83.22%. Comparing base (d8d0838) to head (fa9f3b1).
Report is 1 commits behind head on develop.

Current head fa9f3b1 differs from pull request most recent head 4641fbf

Please upload reports for the commit 4641fbf to get more accurate results.

Additional details and impacted files
@@             Coverage Diff              @@
##           develop    #2041       +/-   ##
============================================
+ Coverage    69.56%   83.22%   +13.67%     
============================================
  Files          715      511      -204     
  Lines        26553    18119     -8434     
  Branches      4452     2719     -1733     
============================================
- Hits         18469    15079     -3390     
+ Misses        6871     2293     -4578     
+ Partials      1213      747      -466     
Files Coverage Δ
...rnal/persistence/datastore/DataStoreFileHandler.kt 100.00% <100.00%> (ø)
...ernal/persistence/datastore/DatastoreFileReader.kt 95.56% <ø> (ø)
...rnal/persistence/datastore/NoOpDataStoreHandler.kt 100.00% <ø> (ø)
.../sessionreplay/internal/DefaultRecorderProvider.kt 93.10% <100.00%> (+0.06%) ⬆️
...oid/sessionreplay/internal/SessionReplayFeature.kt 100.00% <100.00%> (ø)
...replay/internal/processor/RecordedDataProcessor.kt 97.67% <100.00%> (+0.17%) ⬆️
...nreplay/internal/recorder/SessionReplayRecorder.kt 96.69% <100.00%> (-0.11%) ⬇️
...ernal/resources/ResourceHashesEntryDeserializer.kt 100.00% <100.00%> (ø)
...nternal/resources/ResourceHashesEntrySerializer.kt 50.00% <50.00%> (ø)
.../android/api/storage/datastore/DataStoreHandler.kt 20.00% <0.00%> (-13.33%) ⬇️
... and 2 more

... and 265 files with indirect coverage changes

@jonathanmos jonathanmos marked this pull request as ready for review May 21, 2024 10:02
@jonathanmos jonathanmos requested review from a team as code owners May 21, 2024 10:02
@jonathanmos jonathanmos force-pushed the jmoskovich/rum-4098/add-persistence branch from 6279225 to 564eb64 Compare May 22, 2024 08:16
@jonathanmos jonathanmos force-pushed the jmoskovich/rum-4098/sessionreplay-datastore branch from 542d2b1 to 7986738 Compare May 22, 2024 09:31
@jonathanmos jonathanmos marked this pull request as draft May 26, 2024 18:54
@jonathanmos jonathanmos force-pushed the jmoskovich/rum-4098/add-persistence branch 4 times, most recently from 5e26358 to 921ee19 Compare May 26, 2024 22:11
@jonathanmos jonathanmos force-pushed the jmoskovich/rum-4098/sessionreplay-datastore branch 3 times, most recently from 405c260 to 085ed01 Compare May 29, 2024 09:51
@jonathanmos jonathanmos force-pushed the jmoskovich/rum-4098/add-persistence branch 5 times, most recently from c181326 to f0ceab6 Compare May 30, 2024 15:02
@jonathanmos jonathanmos force-pushed the jmoskovich/rum-4098/sessionreplay-datastore branch 3 times, most recently from 555637d to 8891749 Compare June 5, 2024 11:59
@jonathanmos jonathanmos force-pushed the jmoskovich/rum-4098/add-persistence branch from 05dcf33 to 1c3f646 Compare June 5, 2024 12:16
@jonathanmos jonathanmos force-pushed the jmoskovich/rum-4098/sessionreplay-datastore branch 5 times, most recently from 307e5b3 to 41e1c72 Compare June 7, 2024 09:45
Base automatically changed from jmoskovich/rum-4098/add-persistence to develop June 24, 2024 14:03
@jonathanmos jonathanmos force-pushed the jmoskovich/rum-4098/sessionreplay-datastore branch 2 times, most recently from bc635ab to 2e2c6f1 Compare June 25, 2024 13:06
@jonathanmos jonathanmos requested review from 0xnm and mariusc83 June 25, 2024 14:00
*/
fun removeValue(
key: String
key: String,
callback: DataStoreWriteCallback
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what us the use-case when we are interested in the result?

Copy link
Member Author

@jonathanmos jonathanmos Jun 26, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For when we remove the expired datastore entry, because this is an async operation and therefore we need to know when it's finished in order to be sure that the initialization of the datastoremanager is complete.

* @param serializer to use to serialize the data.
*/
fun <T : Any> setValue(
key: String,
data: T,
version: Int = 0,
callback: DataStoreWriteCallback,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what us the use-case when we are interested in the result?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At the moment for setValue there is no specific use-case, but I felt that it makes sense to also have a callback here, since we have one for removeValue

Comment on lines +90 to +91
val data = ResourceHashesEntry(
lastUpdateDateNs = storedLastUpdateDateNs,
resourceHashes = knownResources.toList()
)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

probably not the most efficient design. Imagine you have 1000 hashes, each is 36 characters (UUID length). All chars are ASCII, so each char is 1 byte. It means that once you add 1001th hash, you need to write (and also serialize) all 1000 hashes you wrote before, which is 36KB. So with each write operation you have data overhead of 36KB, instead of storing only the hash you need to.

Copy link
Member Author

@jonathanmos jonathanmos Jun 26, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

after discussion, we'll leave this for now and perhaps improve in a future pr

@jonathanmos jonathanmos requested a review from 0xnm June 26, 2024 14:06

private fun fetchStoredResourceHashes(
onFetchSuccessful: (dataStoreContent: DataStoreContent<ResourceHashesEntry>?) -> Unit,
onFetchFailure: () -> Unit
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

minor: there is no need to have this argument, because the only call site of this method gives empty lambda.

Copy link
Member Author

@jonathanmos jonathanmos Jun 27, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This lambda should actually not be empty but should have a call to finishedInitializingManager. I'll add some tests to catch missing calls

private val resourceHashesSerializer: Serializer<ResourceHashesEntry>,
private val resourceHashesDeserializer: Deserializer<String, ResourceHashesEntry>
) {
private val knownResources = Collections.synchronizedSet(mutableSetOf<String>())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

note: if performance is critical here, it is better to use Collections.newSetFromMap(ConcurrentHashMap<String, Boolean>). Overall the performance of ConcurrentHashMap is better that synchronised map (but there are some differences in behavior).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point - I've modified this

@jonathanmos jonathanmos force-pushed the jmoskovich/rum-4098/sessionreplay-datastore branch 6 times, most recently from fa0553f to fa9f3b1 Compare June 27, 2024 11:42
@jonathanmos jonathanmos requested a review from 0xnm June 27, 2024 12:53
Copy link
Member

@0xnm 0xnm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm. but my concern about writing all the hashes each time we just need to add one more to storage stays, not sure how much it can become a bottleneck in the app with many resources.

@jonathanmos jonathanmos force-pushed the jmoskovich/rum-4098/sessionreplay-datastore branch from fa9f3b1 to 4641fbf Compare June 27, 2024 14:07
@jonathanmos jonathanmos merged commit 018ef71 into develop Jun 27, 2024
19 checks passed
@jonathanmos jonathanmos deleted the jmoskovich/rum-4098/sessionreplay-datastore branch June 27, 2024 15:02
@xgouchet xgouchet added this to the 2.12.x milestone Jul 31, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants