RUMM-2651 SR - skip new lines and spaces when obfuscating texts #1078

mariusc83 · 2022-10-11T12:34:34Z

What does this PR do?

We need to skip the new lines and spaces characters when applying the obfuscation on the text values to keep the aspect ratio of the replicated text view element in the Web Player.

Motivation

What inspired you to submit this pull request?

Additional Notes

Anything else we should know when reviewing?

Review checklist (to be filled by reviewers)

Feature or bugfix MUST have appropriate tests (unit, integration, e2e)
Make sure you discussed the feature or bugfix with the maintaining team in an Issue
Make sure each commit and the PR mention the Issue number (cf the CONTRIBUTING doc)

xgouchet · 2022-10-11T13:02:27Z

.../main/kotlin/com/datadog/android/sessionreplay/recorder/mapper/MaskAllTextWireframeMapper.kt

+        return String(
+            CharArray(text.length) {
+                val character = text[it]
+                if (character == CHARACTER_SPACE || character == CHARACTER_NEW_LINE) {


you can use character.isWhitespace() to check this rather than comparing to our own hardcoded values

xgouchet · 2022-10-11T13:03:14Z

...tlin/com/datadog/android/sessionreplay/recorder/mapper/MaskAllTextViewWireframeMapperTest.kt

+            CharArray(text.length) {
+                val character = text[it]
+                if (character == ' ' || character == '\n') {
+                    character
+                } else {
+                    'x'
+                }
+            }


Here you're duplicating the production algorithm, meaning there's no way your test can fail.

yeah I know, this is for all the other tests that are not exactly testing the obfuscation, is just that we are asserting everything in those. For all the obfuscation cases I added the 3 tests below.

xgouchet · 2022-10-11T13:11:06Z

...tlin/com/datadog/android/sessionreplay/recorder/mapper/MaskAllTextViewWireframeMapperTest.kt


        // Then
-        val expectedWireframe = mockButton.toTextWireframe().copy(text = fakeText)
+        val expectedWireframe = mockTextView.toTextWireframe().copy(text = fakeExpectedText)


You should also have a test in the form

val chunk1 = randomText val chunk2 = randomText val spacer = randomWhitespace // forge.forge.aWhitespaceString() val input = chunk1 + spacer + chunk2 val expectedOutput = ("x" * chunk1.length) + spacer + ("x" * chunk2.length)

with this (I don't know if it exist in our repo

operator fun String.times(i: Int) { val builder = StringBuilder(i * length) repeat(i) { builder.append(this) } }

codecov-commenter · 2022-10-11T13:32:01Z

Codecov Report

Merging #1078 (94f9404) into feature/sdkv2 (946cc7e) will increase coverage by 0.07%.
The diff coverage is 100.00%.

@@                Coverage Diff                @@
##           feature/sdkv2    #1078      +/-   ##
=================================================
+ Coverage          82.58%   82.65%   +0.07%     
=================================================
  Files                346      347       +1     
  Lines              11157    11165       +8     
  Branches            1831     1832       +1     
=================================================
+ Hits                9214     9228      +14     
+ Misses              1381     1378       -3     
+ Partials             562      559       -3

Impacted Files	Coverage Δ
...play/recorder/mapper/MaskAllTextWireframeMapper.kt	`100.00% <100.00%> (+20.00%)`	⬆️
.../sessionreplay/recorder/mapper/StringObfuscator.kt	`100.00% <100.00%> (ø)`
...android/v2/core/internal/DatadogContextProvider.kt	`81.25% <0.00%> (-1.56%)`	⬇️
...src/main/kotlin/com/datadog/android/DatadogSite.kt	`86.67% <0.00%> (ø)`
...android/rum/internal/ndk/DatadogNdkCrashHandler.kt	`87.30% <0.00%> (ø)`
...rc/main/java/com/datadog/opentracing/DDTracer.java	`56.49% <0.00%> (+0.42%)`	⬆️
...ain/java/com/datadog/opentracing/PendingTrace.java	`60.34% <0.00%> (+1.72%)`	⬆️
...rsistence/file/batch/PlainBatchFileReaderWriter.kt	`91.18% <0.00%> (+2.94%)`	⬆️

0xnm · 2022-10-12T09:05:58Z

...replay/src/main/kotlin/com/datadog/android/sessionreplay/recorder/mapper/StringObfuscator.kt

+
+internal class StringObfuscator {
+    fun obfuscate(stringValue: String): String {
+        return String(


let's add TODO maybe here, saying it works only for ASCII. Also character can be non-printable Control character, but I guess we can ignore it.

I just realized that I need to use another method to support all unicode characters the way they are specifying in their comments.

this won't work properly for Unicode still. For example, let's take the following emoji. It has a code \u1f600, so it already exceeds char size (which values go only up to \uffff, because it can fit only 16 bits), so this single symbol consists of the 2 char -> "😀".length will give 2.

But in reality we cannot threat these chars independently, because first is just a high surrogate and second is the low surrogate. This is how UTF-16 encoding is actually working - it will read the stream of char of length 16 bits and decide if next char is a lower surrogate or independent symbol.

Yes I imagine this is the way it should work but I think this will not be a problem. The only issue would be if one of this emojis will contain one of the non printable characters in their composition. In that case we will brake the text but I am pretty sure Character.isWhitespace() handles that somehow or maybe these whitespace characters are not used in emojis.

My point is that in case of emoji we will have xx instead of single x, not sure if this is what we want to achieve (I guess Character.isWhitespace() will return false for any surrogate).

Yeah...I don't think this should be an issue as long as we obfuscate the texts. It will be a corner case anyway. I am going to add this explanation there though to have this in mind.

I won't say it is a corner case, because there are plenty of symbols with above \uffff, and if we think of emojis we may find a lot of them in messenger apps, for example, and obfuscation may break formatting. Not a blocker for me, but let's acknowledge this.

0xnm · 2022-10-12T09:14:06Z

...ay/src/test/kotlin/com/datadog/android/sessionreplay/recorder/mapper/StringObfuscatorTest.kt

+        assertThat(obfuscatedText).isEqualTo(fakeExpectedText)
+    }
+
+    private fun String.times(n: Int): String {


it seems the default repeat should do the job?

0xnm · 2022-10-12T09:46:08Z

...ay/src/test/kotlin/com/datadog/android/sessionreplay/recorder/mapper/StringObfuscatorTest.kt

+                fakeExpectedChunk3 +
+                fakeExpectedChunk4 +
+                fakeExpectedChunk5
+            ).times(n)


why do we need times call if it just copies the same?

to have a bigger text. I was following @xgouchet suggestion, not sure that this is what he wanted but I think is a bit overkill also. @xgouchet do you think this is still needed ?

the times was more to generate a string with just 'x' of a given size, here it doesn't make sense

0xnm · 2022-10-13T08:18:52Z

...replay/src/main/kotlin/com/datadog/android/sessionreplay/recorder/mapper/StringObfuscator.kt

+                val character = stringValue[it]
+                // Using the Character.isWhitespace() function we will support also the UNICODE
+                // characters according with the docs. Also have in mind that in JAVA the
+                // `char` primitive is a 16bits long number which covers all the UNICODE characters:


char doesn't cover all the Unicode characters. It covers BMP plain Unicode characters, but it doesn't cover Supplementary plain characters (above 0xFFFF). When we call character.code we just cast from char to int, but to get code of the symbol from supplementary plain we need to merge 2 chars into single int.

0xnm

lgtm!

0xnm · 2022-10-13T12:10:31Z

...ay/src/test/kotlin/com/datadog/android/sessionreplay/recorder/mapper/StringObfuscatorTest.kt

+
+    @Test
+    @TestTargetApi(Build.VERSION_CODES.N)
+    fun `M mask String W maskString(){string with newline, Android N}`(


It is not maskString, but obfuscate. Same for other tests.

Suggested change

fun `M mask String W maskString(){string with newline, Android N}`(

fun `M mask String W obfuscate(){string with newline, Android N}`(

mariusc83 self-assigned this Oct 11, 2022

mariusc83 requested a review from a team as a code owner October 11, 2022 12:34

xgouchet requested changes Oct 11, 2022

View reviewed changes

mariusc83 force-pushed the mconstantin/rumm-2651/do-not-replace-spaces-when-applying-obfuscation branch 3 times, most recently from 7bb701a to 52fb574 Compare October 12, 2022 08:15

mariusc83 requested a review from xgouchet October 12, 2022 08:16

mariusc83 force-pushed the mconstantin/rumm-2651/do-not-replace-spaces-when-applying-obfuscation branch from 52fb574 to a19d58e Compare October 12, 2022 08:25

xgouchet added the size-small This PR is small sized label Oct 12, 2022

mariusc83 force-pushed the mconstantin/rumm-2651/do-not-replace-spaces-when-applying-obfuscation branch from a19d58e to 99bb53a Compare October 12, 2022 09:26

0xnm reviewed Oct 12, 2022

View reviewed changes

mariusc83 force-pushed the mconstantin/rumm-2651/do-not-replace-spaces-when-applying-obfuscation branch 2 times, most recently from 754eefd to 94f9404 Compare October 12, 2022 11:59

mariusc83 requested a review from 0xnm October 12, 2022 12:01

mariusc83 force-pushed the mconstantin/rumm-2651/do-not-replace-spaces-when-applying-obfuscation branch from 94f9404 to c82b100 Compare October 13, 2022 08:11

0xnm approved these changes Oct 13, 2022

View reviewed changes

mariusc83 force-pushed the mconstantin/rumm-2651/do-not-replace-spaces-when-applying-obfuscation branch from c82b100 to dfee181 Compare October 13, 2022 10:39

xgouchet added the size-medium This PR is medium sized label Oct 13, 2022

0xnm approved these changes Oct 13, 2022

View reviewed changes

mariusc83 force-pushed the mconstantin/rumm-2651/do-not-replace-spaces-when-applying-obfuscation branch 2 times, most recently from 09bc0c4 to 483486f Compare October 13, 2022 14:57

RUMM-2651 SR - skip new lines and spaces when obfuscating texts

0d1e683

mariusc83 force-pushed the mconstantin/rumm-2651/do-not-replace-spaces-when-applying-obfuscation branch from 483486f to 0d1e683 Compare October 13, 2022 15:10

xgouchet approved these changes Oct 14, 2022

View reviewed changes

mariusc83 merged commit ef329c2 into feature/sdkv2 Oct 14, 2022

mariusc83 deleted the mconstantin/rumm-2651/do-not-replace-spaces-when-applying-obfuscation branch October 14, 2022 06:56

xgouchet added this to the 1.16.0 milestone Dec 13, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

RUMM-2651 SR - skip new lines and spaces when obfuscating texts #1078

RUMM-2651 SR - skip new lines and spaces when obfuscating texts #1078

mariusc83 commented Oct 11, 2022

xgouchet Oct 11, 2022

xgouchet Oct 11, 2022

mariusc83 Oct 11, 2022

xgouchet Oct 11, 2022

mariusc83 Oct 11, 2022

codecov-commenter commented Oct 11, 2022 •

edited

Loading

0xnm Oct 12, 2022

mariusc83 Oct 12, 2022

0xnm Oct 12, 2022 •

edited

Loading

mariusc83 Oct 12, 2022

0xnm Oct 12, 2022

mariusc83 Oct 13, 2022 •

edited

Loading

0xnm Oct 13, 2022

0xnm Oct 12, 2022

0xnm Oct 12, 2022

mariusc83 Oct 12, 2022

xgouchet Oct 12, 2022

0xnm Oct 13, 2022

0xnm left a comment

0xnm Oct 13, 2022

	fun `M mask String W maskString(){string with newline, Android N}`(
	fun `M mask String W obfuscate(){string with newline, Android N}`(

RUMM-2651 SR - skip new lines and spaces when obfuscating texts #1078

RUMM-2651 SR - skip new lines and spaces when obfuscating texts #1078

Conversation

mariusc83 commented Oct 11, 2022

What does this PR do?

Motivation

Additional Notes

Review checklist (to be filled by reviewers)

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

codecov-commenter commented Oct 11, 2022 • edited Loading

Codecov Report

Choose a reason for hiding this comment

Choose a reason for hiding this comment

0xnm Oct 12, 2022 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

mariusc83 Oct 13, 2022 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

0xnm left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

codecov-commenter commented Oct 11, 2022 •

edited

Loading

0xnm Oct 12, 2022 •

edited

Loading

mariusc83 Oct 13, 2022 •

edited

Loading