
W-17037056 feat: add data update bulk/resume commands #1098

Merged · 29 commits merged into main from cd/bulk-update on Oct 30, 2024
Conversation

@cristiand391 (Member) commented Oct 22, 2024

What does this PR do?

  • Adds data update bulk/resume commands
  • Adds --column-delimiter flag to data import bulk
  • Moves data import bulk logic into a generic bulkIngest function that will be used by all bulk-ingest-related commands (data import/update bulk in this PR; data delete/upsert bulk will be migrated in a separate PR). A rough sketch of the shared function's shape follows.
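
A rough sketch of what that shared function's shape could look like (every name and field below is illustrative, not the PR's actual code):

import { Connection } from '@salesforce/core';
import { Duration } from '@salesforce/kit';

// Illustrative only: one options bag for a shared bulk-ingest entry point,
// e.g. async function bulkIngest(opts: BulkIngestOpts): Promise<{ jobId: string }>
type BulkIngestOpts = {
  conn: Connection; // authenticated org connection
  operation: 'insert' | 'update' | 'delete' | 'upsert';
  sobject: string; // target sObject, e.g. Account
  file: string; // path to the CSV payload
  columnDelimiter?: 'BACKQUOTE' | 'CARET' | 'COMMA' | 'PIPE' | 'SEMICOLON' | 'TAB';
  async?: boolean; // create the job and return without polling
  wait?: Duration; // how long to poll before timing out
};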

What issues does this PR fix or reference?

@W-17037056@

@cristiand391 marked this pull request as ready for review October 25, 2024 12:51
@cristiand391 requested a review from a team as a code owner October 25, 2024 12:51
@mdonnalley (Contributor) left a comment:

just a few thoughts/suggestions - otherwise looks good 👍

// a zero wait (explicit --async, or no --wait given) means async mode:
// create the ingest job and return without polling it
const timeout = opts.async ? Duration.minutes(0) : opts.wait ?? Duration.minutes(0);
const async = timeout.milliseconds === 0;

const baseUrl = opts.conn.getAuthInfoFields().instanceUrl as string;
@mdonnalley (Contributor):

Should we handle the possibility that .instanceUrl doesn't exist or is undefined, instead of using `as string`?

@cristiand391 (Member, author):

I can't find the Slack thread, but IIRC we said stuff like instanceUrl should always be in auth files; changing the types in sfdx-core now would be a big change. I'll wrap it in ensureString 👍🏼
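
A minimal sketch of that change, assuming ensureString from @salesforce/ts-types (it throws if the value is missing or not a string):

import { ensureString } from '@salesforce/ts-types';

// fail fast instead of silently treating a missing instanceUrl as a string
const baseUrl = ensureString(opts.conn.getAuthInfoFields().instanceUrl);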

src/bulkIngest.ts (outdated review thread, resolved)
src/commands/data/update/bulk.ts (outdated review thread, resolved)
const result = execCmd<DataImportBulkResult>(
`data import bulk --file ${csvFile} --sobject Account --wait 10 --column-delimiter PIPE --json`,
{ ensureExitCode: 0 }
).jsonOutput?.result as DataImportBulkResult;
@mdonnalley (Contributor):

Is `as DataImportBulkResult` just there to remove the `| undefined`? If so, why not use optional chaining instead of an assertion?

@cristiand391 (Member, author):

I think I started doing the assertion in other NUTs and just carried it over; I can remove it.
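
A minimal sketch of that cleanup, keeping everything else from the NUT above as-is (the expectation in the trailing comment is illustrative, including the jobId field):

const result = execCmd<DataImportBulkResult>(
  `data import bulk --file ${csvFile} --sobject Account --wait 10 --column-delimiter PIPE --json`,
  { ensureExitCode: 0 }
).jsonOutput?.result;

// downstream assertions then guard against undefined, e.g.:
// expect(result?.jobId).to.be.a('string');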

@mdonnalley (Contributor) commented:

QA

🟢 data import bulk still works as expected

❯ sf data import bulk --file ~/repos/salesforcecli/plugin-data/test/test-files/data-project/data/bulkUpsertLarge.csv --sobject account --wait 10
 ›   Warning: @salesforce/plugin-data is a linked ESM module and cannot be auto-transpiled. Existing compiled source will be
 ›    used instead.

 ───────────────── Importing data ─────────────────

 ✔ Creating ingest job 11.01s
 ✔ Processing the job 1m 22.73s
   ▸ Processed records: 76380
   ▸ Successful records: 76380
   ▸ Failed records: 0

 Status: JobComplete
 Job Id: 750Ov00000IpmhlIAB
 Elapsed Time: 1m 34.38s

🟡 sf data import bulk without setting --column-delimiter

It fails as expected, but I'm wondering if there's a way to auto-detect the delimiter? (A rough sketch follows the transcript below.)

❯ sf data import bulk --file ~/repos/salesforcecli/plugin-data/test/test-files/data-project/data/bulkUpsertPipes.csv --sobject account --wait 10
 ›   Warning: @salesforce/plugin-data is a linked ESM module and cannot be auto-transpiled. Existing compiled source will be
 ›    used instead.

 ───────────────── Importing data ─────────────────

 ✔ Creating ingest job 1.57s
 ✘ Processing the job 11.74s
   ▸ Processed records: ✘
   ▸ Successful records: ✘
   ▸ Failed records: ✘

 Status: Failed
 Job Id: 750Ov00000IpeIoIAJ
 Elapsed Time: 13.41s

Error (JobFailedError): Job failed to be processed due to:

InvalidBatch : Field name not found : NAME|TYPE|PHONE|WEBSITE|ANNUALREVENUE

To review the details of this job, run this command:

sf org open --target-org test-p4bh29a7jcvc@example.com --path "/lightning/setup/AsyncApiJobStatus/page?address=%2F750Ov00000IpeIoIAJ"
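
Auto-detection isn't part of this PR, but here is one rough idea, sketched under assumptions (detectDelimiter and its candidate list are hypothetical, not the plugin's API): pick the candidate character that appears most often in the CSV header line.

// Hypothetical helper: guess the column delimiter of a CSV file from its header line.
const CANDIDATES = [',', ';', '|', '\t'];

function detectDelimiter(headerLine: string): string {
  let best = ',';
  let bestCount = 0;
  for (const candidate of CANDIDATES) {
    const count = headerLine.split(candidate).length - 1;
    if (count > bestCount) {
      bestCount = count;
      best = candidate;
    }
  }
  return best;
}

// detectDelimiter('NAME|TYPE|PHONE|WEBSITE|ANNUALREVENUE') => '|'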

🟢 data import bulk --column-delimiter works as expected

❯ sf data import bulk --file ~/repos/salesforcecli/plugin-data/test/test-files/data-project/data/bulkUpsertPipes.csv --sobject account --wait 10 --column-delimiter PIPE
 ›   Warning: @salesforce/plugin-data is a linked ESM module and cannot be auto-transpiled. Existing compiled source will be
 ›    used instead.

 ───────────────── Importing data ─────────────────

 ✔ Creating ingest job 1.39s
 ✔ Processing the job 11.66s
   ▸ Processed records: 10
   ▸ Successful records: 10
   ▸ Failed records: 0

 Status: JobComplete
 Job Id: 750Ov00000IpTdaIAF
 Elapsed Time: 13.11s

🟡 sf data update bulk with a CSV with no ID column

As I was trying to figure out how to use the new command, I encountered this:

❯ sf data update bulk --file ~/repos/salesforcecli/plugin-data/test/test-files/data-project/data/bulkUpsert.csv --sobject account --wait 10
 ›   Warning: @salesforce/plugin-data is a linked ESM module and cannot be auto-transpiled. Existing compiled source will be
 ›    used instead.

 ───────────────── Updating data ─────────────────

 ✔ Creating ingest job 1.48s
 ✘ Processing the job 5.73s
   ▸ Processed records: 10
   ▸ Successful records: 0
   ▸ Failed records: 10

 Status: JobComplete
 Job Id: 750Ov00000IpoIAIAZ
 Elapsed Time: 7.42s

Error (FailedRecordDetailsError): Job finished being processed but failed to process 10 records.

To review the details of this job, run this command:

sf org open --target-org test-p4bh29a7jcvc@example.com --path "/lightning/setup/AsyncApiJobStatus/page?address=%2F750Ov00000IpoIAIAZ"

The sf org open command works as expected, but I'm not able to find any useful information on that page about why the job failed. Are we able to surface a more descriptive error message to the user?

Looks like I needed to have an ID column in my CSV. Maybe that's something we can detect on our end, if it's not something the API tells us.
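
One possible client-side pre-flight check, sketched under assumptions (the helper name, the default delimiter, and the error wording are all hypothetical, not the plugin's actual behavior):

import { readFile } from 'node:fs/promises';

// Hypothetical check: bulk updates require an Id column, so fail fast with a
// descriptive message instead of letting every record fail with an opaque error.
async function ensureIdColumn(csvFile: string, delimiter = ','): Promise<void> {
  const header = (await readFile(csvFile, 'utf8')).split(/\r?\n/, 1)[0] ?? '';
  const columns = header.split(delimiter).map((c) => c.trim().toLowerCase());
  if (!columns.includes('id')) {
    throw new Error(`${csvFile} has no "Id" column; bulk updates require one.`);
  }
}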

🟢 sf data update bulk

❯ sf data update bulk --file ~/repos/salesforcecli/plugin-data/test/test-files/data-project/data/bulkUpsert.csv --sobject account --wait 10
 ›   Warning: @salesforce/plugin-data is a linked ESM module and cannot be auto-transpiled. Existing compiled source will be
 ›    used instead.

 ───────────────── Updating data ─────────────────

 ✔ Creating ingest job 1.40s
 ✔ Processing the job 5.79s
   ▸ Processed records: 10
   ▸ Successful records: 10
   ▸ Failed records: 0

 Status: JobComplete
 Job Id: 750Ec00000FTwWFIA1
 Elapsed Time: 7.23s

🟢 handles aborted job

~/repos/trailheadapps/dreamhouse-lwc on  main via ⬢ v20.15.0 took 8.4s
❯ sf data update bulk --file ~/repos/salesforcecli/plugin-data/test/test-files/data-project/data/bulkUpsertPipes.csv --sobject account --wait 10
 ›   Warning: @salesforce/plugin-data is a linked ESM module and cannot be auto-transpiled. Existing compiled source will be
 ›    used instead.

 ───────────────── Updating data ─────────────────

 ✔ Creating ingest job 1.77s
 ✘ Processing the job 2m 2.77s
   ▸ Processed records: ✘
   ▸ Successful records: ✘
   ▸ Failed records: ✘

 Status: Aborted
 Job Id: 750Ov00000IppR7IAJ
 Elapsed Time: 2m 5.28s

Error (JobAbortedError): Job has been aborted.

To review the details of this job, run this command:

sf org open --target-org test-p4bh29a7jcvc@example.com --path "/lightning/setup/AsyncApiJobStatus/page?address=%2F750Ov00000IppR7IAJ"

🟢 sf data update bulk --async and sf data update resume

~/repos/trailheadapps/dreamhouse-lwc on  main via ⬢ v20.15.0 took 9.6s
❯ sf data update bulk --file ~/repos/salesforcecli/plugin-data/test/test-files/data-project/data/bulkUpsert.csv --sobject account --async
 ›   Warning: @salesforce/plugin-data is a linked ESM module and cannot be auto-transpiled. Existing compiled source will be
 ›    used instead.

 ───────────── Updating data (async) ─────────────

 ✔ Creating ingest job 1.57s
 ◼ Processing the job

 Status: UploadComplete
 Job Id: 750Ec00000FU0MvIAL
 Elapsed Time: 1.61s

Run "sf data update resume --job-id 750Ec00000FU0MvIAL" to resume the operation.

~/repos/trailheadapps/dreamhouse-lwc on  main via ⬢ v20.15.0 took 3.5s
❯ sf data update resume --job-id 750Ec00000FU0MvIAL
 ›   Warning: @salesforce/plugin-data is a linked ESM module and cannot be auto-transpiled. Existing compiled source will be
 ›    used instead.

 ───────────────── Updating data ─────────────────

 ◯ Creating ingest job - Skipped
 ✔ Processing the job 931ms
   ▸ Processed records: 10
   ▸ Successful records: 10
   ▸ Failed records: 0

 Status: JobComplete
 Job Id: 750Ec00000FU0MvIAL
 Elapsed Time: 966ms

@mdonnalley merged commit 5ef1b55 into main Oct 30, 2024 (27 checks passed)
@mdonnalley deleted the cd/bulk-update branch October 30, 2024 14:58
@iowillhoit changed the title from "feat: add data update bulk/resume commands" to "W-17037056 feat: add data update bulk/resume commands" Jan 27, 2025