
W-17037056 feat: add data update bulk/resume commands #1098

Merged · 29 commits merged into main from cd/bulk-update on Oct 30, 2024
Conversation

@cristiand391 (Member) commented Oct 22, 2024

What does this PR do?

  • Adds data update bulk/resume commands
  • Adds --column-delimiter flag to data import bulk
  • Moves data import bulk logic into a generic bulkIngest function that will be used by all bulk-ingest-related commands (data import/update bulk in this PR; data delete/upsert bulk will be migrated in a separate PR). A rough sketch of the shared function's shape follows.
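
A rough sketch of what that shared function's shape could look like (every name and field below is illustrative, not the PR's actual code):

import { Connection } from '@salesforce/core';
import { Duration } from '@salesforce/kit';

// Illustrative only: one options bag for a shared bulk-ingest entry point,
// e.g. async function bulkIngest(opts: BulkIngestOpts): Promise<{ jobId: string }>
type BulkIngestOpts = {
  conn: Connection; // authenticated org connection
  operation: 'insert' | 'update' | 'delete' | 'upsert';
  sobject: string; // target sObject, e.g. Account
  file: string; // path to the CSV payload
  columnDelimiter?: 'BACKQUOTE' | 'CARET' | 'COMMA' | 'PIPE' | 'SEMICOLON' | 'TAB';
  async?: boolean; // create the job and return without polling
  wait?: Duration; // how long to poll before timing out
};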

What issues does this PR fix or reference?

@W-17037056@

@cristiand391 marked this pull request as ready for review October 25, 2024 12:51
@cristiand391 requested a review from a team as a code owner October 25, 2024 12:51
@mdonnalley (Contributor) left a comment:

just a few thoughts/suggestions - otherwise looks good 👍

// a zero wait (explicit --async, or no --wait given) means async mode:
// create the ingest job and return without polling it
const timeout = opts.async ? Duration.minutes(0) : opts.wait ?? Duration.minutes(0);
const async = timeout.milliseconds === 0;

const baseUrl = opts.conn.getAuthInfoFields().instanceUrl as string;
@mdonnalley (Contributor):

Should we handle the possibility that .instanceUrl doesn't exist or is undefined, instead of using `as string`?

@cristiand391 (Member, author):

I can't find the Slack thread, but IIRC we said stuff like instanceUrl should always be in auth files; changing the types in sfdx-core now would be a big change. I'll wrap it in ensureString 👍🏼
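
A minimal sketch of that change, assuming ensureString from @salesforce/ts-types (it throws if the value is missing or not a string):

import { ensureString } from '@salesforce/ts-types';

// fail fast instead of silently treating a missing instanceUrl as a string
const baseUrl = ensureString(opts.conn.getAuthInfoFields().instanceUrl);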

src/bulkIngest.ts (outdated review thread, resolved)
src/commands/data/update/bulk.ts (outdated review thread, resolved)
const result = execCmd<DataImportBulkResult>(
`data import bulk --file ${csvFile} --sobject Account --wait 10 --column-delimiter PIPE --json`,
{ ensureExitCode: 0 }
).jsonOutput?.result as DataImportBulkResult;
@mdonnalley (Contributor):

Is `as DataImportBulkResult` just there to remove the `| undefined`? If so, why not use optional chaining instead of an assertion?

@cristiand391 (Member, author):

I think I started doing the assertion in other NUTs and just carried it over; I can remove it.
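
A minimal sketch of that cleanup, keeping everything else from the NUT above as-is (the expectation in the trailing comment is illustrative, including the jobId field):

const result = execCmd<DataImportBulkResult>(
  `data import bulk --file ${csvFile} --sobject Account --wait 10 --column-delimiter PIPE --json`,
  { ensureExitCode: 0 }
).jsonOutput?.result;

// downstream assertions then guard against undefined, e.g.:
// expect(result?.jobId).to.be.a('string');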

@mdonnalley (Contributor) commented:

QA

🟢 data import bulk still works as expected

❯ sf data import bulk --file ~/repos/salesforcecli/plugin-data/test/test-files/data-project/data/bulkUpsertLarge.csv --sobject account --wait 10
 ›   Warning: @salesforce/plugin-data is a linked ESM module and cannot be auto-transpiled. Existing compiled source will be
 ›    used instead.

 ───────────────── Importing data ─────────────────

 ✔ Creating ingest job 11.01s
 ✔ Processing the job 1m 22.73s
   ▸ Processed records: 76380
   ▸ Successful records: 76380
   ▸ Failed records: 0

 Status: JobComplete
 Job Id: 750Ov00000IpmhlIAB
 Elapsed Time: 1m 34.38s

🟡 sf data import bulk without setting --column-delimiter

It fails as expected, but I'm wondering if there's a way to auto-detect the delimiter? (A rough sketch follows the transcript below.)

❯ sf data import bulk --file ~/repos/salesforcecli/plugin-data/test/test-files/data-project/data/bulkUpsertPipes.csv --sobject account --wait 10
 ›   Warning: @salesforce/plugin-data is a linked ESM module and cannot be auto-transpiled. Existing compiled source will be
 ›    used instead.

 ───────────────── Importing data ─────────────────

 ✔ Creating ingest job 1.57s
 ✘ Processing the job 11.74s
   ▸ Processed records: ✘
   ▸ Successful records: ✘
   ▸ Failed records: ✘

 Status: Failed
 Job Id: 750Ov00000IpeIoIAJ
 Elapsed Time: 13.41s

Error (JobFailedError): Job failed to be processed due to:

InvalidBatch : Field name not found : NAME|TYPE|PHONE|WEBSITE|ANNUALREVENUE

To review the details of this job, run this command:

sf org open --target-org test-p4bh29a7jcvc@example.com --path "/lightning/setup/AsyncApiJobStatus/page?address=%2F750Ov00000IpeIoIAJ"
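
Auto-detection isn't part of this PR, but here is one rough idea, sketched under assumptions (detectDelimiter and its candidate list are hypothetical, not the plugin's API): pick the candidate character that appears most often in the CSV header line.

// Hypothetical helper: guess the column delimiter of a CSV file from its header line.
const CANDIDATES = [',', ';', '|', '\t'];

function detectDelimiter(headerLine: string): string {
  let best = ',';
  let bestCount = 0;
  for (const candidate of CANDIDATES) {
    const count = headerLine.split(candidate).length - 1;
    if (count > bestCount) {
      bestCount = count;
      best = candidate;
    }
  }
  return best;
}

// detectDelimiter('NAME|TYPE|PHONE|WEBSITE|ANNUALREVENUE') => '|'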

🟢 data import bulk --column-delimiter works as expected

❯ sf data import bulk --file ~/repos/salesforcecli/plugin-data/test/test-files/data-project/data/bulkUpsertPipes.csv --sobject account --wait 10 --column-delimiter PIPE
 ›   Warning: @salesforce/plugin-data is a linked ESM module and cannot be auto-transpiled. Existing compiled source will be
 ›    used instead.

 ───────────────── Importing data ─────────────────

 ✔ Creating ingest job 1.39s
 ✔ Processing the job 11.66s
   ▸ Processed records: 10
   ▸ Successful records: 10
   ▸ Failed records: 0

 Status: JobComplete
 Job Id: 750Ov00000IpTdaIAF
 Elapsed Time: 13.11s

🟡 sf data update bulk with a CSV with no ID column

As I was trying to figure out how to use the new command, I encountered this:

❯ sf data update bulk --file ~/repos/salesforcecli/plugin-data/test/test-files/data-project/data/bulkUpsert.csv --sobject account --wait 10
 ›   Warning: @salesforce/plugin-data is a linked ESM module and cannot be auto-transpiled. Existing compiled source will be
 ›    used instead.

 ───────────────── Updating data ─────────────────

 ✔ Creating ingest job 1.48s
 ✘ Processing the job 5.73s
   ▸ Processed records: 10
   ▸ Successful records: 0
   ▸ Failed records: 10

 Status: JobComplete
 Job Id: 750Ov00000IpoIAIAZ
 Elapsed Time: 7.42s

Error (FailedRecordDetailsError): Job finished being processed but failed to process 10 records.

To review the details of this job, run this command:

sf org open --target-org test-p4bh29a7jcvc@example.com --path "/lightning/setup/AsyncApiJobStatus/page?address=%2F750Ov00000IpoIAIAZ"

The sf org open command works as expected, but I'm not able to find any useful information on that page about why the job failed. Are we able to surface a more descriptive error message to the user?

Looks like I needed to have an ID column in my CSV. Maybe that's something we can detect on our end, if it's not something the API tells us.
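
One possible client-side pre-flight check, sketched under assumptions (the helper name, the default delimiter, and the error wording are all hypothetical, not the plugin's actual behavior):

import { readFile } from 'node:fs/promises';

// Hypothetical check: bulk updates require an Id column, so fail fast with a
// descriptive message instead of letting every record fail with an opaque error.
async function ensureIdColumn(csvFile: string, delimiter = ','): Promise<void> {
  const header = (await readFile(csvFile, 'utf8')).split(/\r?\n/, 1)[0] ?? '';
  const columns = header.split(delimiter).map((c) => c.trim().toLowerCase());
  if (!columns.includes('id')) {
    throw new Error(`${csvFile} has no "Id" column; bulk updates require one.`);
  }
}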

🟢 sf data update bulk

❯ sf data update bulk --file ~/repos/salesforcecli/plugin-data/test/test-files/data-project/data/bulkUpsert.csv --sobject account --wait 10
 ›   Warning: @salesforce/plugin-data is a linked ESM module and cannot be auto-transpiled. Existing compiled source will be
 ›    used instead.

 ───────────────── Updating data ─────────────────

 ✔ Creating ingest job 1.40s
 ✔ Processing the job 5.79s
   ▸ Processed records: 10
   ▸ Successful records: 10
   ▸ Failed records: 0

 Status: JobComplete
 Job Id: 750Ec00000FTwWFIA1
 Elapsed Time: 7.23s

🟢 handles aborted job

~/repos/trailheadapps/dreamhouse-lwc on  main via ⬢ v20.15.0 took 8.4s
❯ sf data update bulk --file ~/repos/salesforcecli/plugin-data/test/test-files/data-project/data/bulkUpsertPipes.csv --sobject account --wait 10
 ›   Warning: @salesforce/plugin-data is a linked ESM module and cannot be auto-transpiled. Existing compiled source will be
 ›    used instead.

 ───────────────── Updating data ─────────────────

 ✔ Creating ingest job 1.77s
 ✘ Processing the job 2m 2.77s
   ▸ Processed records: ✘
   ▸ Successful records: ✘
   ▸ Failed records: ✘

 Status: Aborted
 Job Id: 750Ov00000IppR7IAJ
 Elapsed Time: 2m 5.28s

Error (JobAbortedError): Job has been aborted.

To review the details of this job, run this command:

sf org open --target-org test-p4bh29a7jcvc@example.com --path "/lightning/setup/AsyncApiJobStatus/page?address=%2F750Ov00000IppR7IAJ"

🟢 sf data update bulk --async and sf data update resume

~/repos/trailheadapps/dreamhouse-lwc on  main via ⬢ v20.15.0 took 9.6s
❯ sf data update bulk --file ~/repos/salesforcecli/plugin-data/test/test-files/data-project/data/bulkUpsert.csv --sobject account --async
 ›   Warning: @salesforce/plugin-data is a linked ESM module and cannot be auto-transpiled. Existing compiled source will be
 ›    used instead.

 ───────────── Updating data (async) ─────────────

 ✔ Creating ingest job 1.57s
 ◼ Processing the job

 Status: UploadComplete
 Job Id: 750Ec00000FU0MvIAL
 Elapsed Time: 1.61s

Run "sf data update resume --job-id 750Ec00000FU0MvIAL" to resume the operation.

~/repos/trailheadapps/dreamhouse-lwc on  main via ⬢ v20.15.0 took 3.5s
❯ sf data update resume --job-id 750Ec00000FU0MvIAL
 ›   Warning: @salesforce/plugin-data is a linked ESM module and cannot be auto-transpiled. Existing compiled source will be
 ›    used instead.

 ───────────────── Updating data ─────────────────

 ◯ Creating ingest job - Skipped
 ✔ Processing the job 931ms
   ▸ Processed records: 10
   ▸ Successful records: 10
   ▸ Failed records: 0

 Status: JobComplete
 Job Id: 750Ec00000FU0MvIAL
 Elapsed Time: 966ms

@mdonnalley merged commit 5ef1b55 into main Oct 30, 2024 (27 checks passed)
@mdonnalley deleted the cd/bulk-update branch October 30, 2024 14:58
@iowillhoit changed the title from "feat: add data update bulk/resume commands" to "W-17037056 feat: add data update bulk/resume commands" Jan 27, 2025