Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

8158 bug UI not using presign for uploading objects #8365

Merged

Conversation

ItamarYuran
Copy link
Contributor

Closes #8158

Change Description

With the added changes, it is now possible to upload a file from the UI, with a minimal IAM permissions installation.

Background

See #8158

Bug Fix

Root cause - the GET request for the presigned url did not include a checksum of the desired file.
Solution - Adding the checksum parameter to the api for getPhysicalAddress operation

Additional info

In order to upload a file to a presigned url, there is a need to configure the bucket's CORS.

@ItamarYuran ItamarYuran added the exclude-changelog PR description should not be included in next release changelog label Nov 13, 2024
@ItamarYuran ItamarYuran linked an issue Nov 13, 2024 that may be closed by this pull request
Copy link

github-actions bot commented Nov 13, 2024

🎊 PR Preview 4f7f211 has been successfully built and deployed to https://treeverse-lakeFS-preview-pr-8365.surge.sh

🕐 Build time: 0.013s

🤖 By surge-preview

Copy link

E2E Test Results - DynamoDB Local - Local Block Adapter

13 passed

Copy link

E2E Test Results - Quickstart

11 passed

Copy link
Contributor

@Jonathan-Rosenberg Jonathan-Rosenberg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Really great catch, thanks!
A few change requests and a few questions and we're done 😃

Comment on lines 233 to 239
const parsedHeaders = {};

headerLines.forEach((line) => {
const [key, value] = line.split(':', 2).map((part) => part.trim());
if (key && value) {
parsedHeaders[key.toLowerCase()] = value;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

More of a styling suggestion to initialize the parsedHeaders variable with data in a more functional way:

Suggested change
const parsedHeaders = {};
headerLines.forEach((line) => {
const [key, value] = line.split(':', 2).map((part) => part.trim());
if (key && value) {
parsedHeaders[key.toLowerCase()] = value;
}
const parsedHeaders = headerLines.reduce((acc, line) => {
const [key, value] = line.split(':', 2).map((part) => part.trim());
if (key && value) {
acc[key.toLowerCase()] = value;
}
return acc;
}, {});

Comment on lines 262 to 265
const checksum = extractChecksumFromResponse(uploadResponse.rawHeaders);
if (uploadResponse.status >= 400) {
throw new Error(`Error uploading file: HTTP ${uploadResponse.status}`);
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
const checksum = extractChecksumFromResponse(uploadResponse.rawHeaders);
if (uploadResponse.status >= 400) {
throw new Error(`Error uploading file: HTTP ${uploadResponse.status}`);
}
if (uploadResponse.status >= 400) {
throw new Error(`Error uploading file: HTTP ${uploadResponse.status}`);
}
const checksum = extractChecksumFromResponse(uploadResponse.rawHeaders);

webui/src/pages/repositories/repository/objects.jsx Outdated Show resolved Hide resolved
const parsedHeaders = {};

headerLines.forEach((line) => {
const [key, value] = line.split(':', 2).map((part) => part.trim());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Splitting this way means that if there's a : in the value of the line (the value of the header) it won't appear in the array. It's possible that it will never happen but I don't really know. If you want to make sure that the array will be split into a header name and a value (which might include a :) you can use something like:

let [key, ...value] = line.split(':')
value = value.join(':')

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In addition, it would be great if you could extract the headers map creation out (it could be used on its own), and call it from this function, or even better, call it from the uploadFile function, and pass it to this function.

@@ -671,9 +671,8 @@ export const uploadWithProgress = (url, file, method = 'POST', onProgress = null
status: xhr.status,
body: xhr.responseText,
contentType: xhr.getResponseHeader('Content-Type'),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this left?
Doesn't rawHeaders include this one as well?

webui/src/lib/api/index.js Show resolved Hide resolved
@ItamarYuran
Copy link
Contributor Author

Thank you for your review @Jonathan-Rosenberg, I corrected everything you mentioned.

Copy link
Contributor

@Jonathan-Rosenberg Jonathan-Rosenberg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are a few more comments (and a one that you didn't address), but the biggest question of them all is:

How was this tested?

When I tried to run it it failed with:

Screenshot 2024-11-28 at 14 46 49

if (uploadResponse.status >= 400) {
throw new Error(`Error uploading file: HTTP ${uploadResponse.status}`);
}
const checksum = extractChecksumFromResponse(uploadResponse.rawHeaders);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

indentation looks off

@@ -1125,7 +1121,7 @@ class Statistics {

class Staging {
async get(repoId, branchId, path, presign = false) {
const query = qs({path, presign});
const query = qs({path, presign});
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

indentation looks off

Comment on lines 671 to 672
status: xhr.status,
body: xhr.responseText,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These should not get deleted

@Jonathan-Rosenberg
Copy link
Contributor

Defining CORS on the S3 bucket is crucial for the upload to succeed. If the user didn't configure the exposedHeaders field in the CORS policy it won't work.
Please return a proper error message if the checksum couldn't get extracted from the response.

Copy link
Contributor

@Jonathan-Rosenberg Jonathan-Rosenberg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few more comments but we're almost there!

}
// fallback to ETag
if (parsedHeaders['etag']) {
// drop any quote and space
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indentation again 😆

});
xhr.addEventListener('error', () => reject(new Error('Upload Failed')));
xhr.addEventListener('error', () => resolve({
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you explain why is the error resolved instead of getting rejected?
If this is because the calling function doesn't catch it, then it should...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reason for this is that I wanted to throw the correct error. This way if there is an error due to missing CORS setting the calling func will catch it and throw suitable error. In case other piece of code uses this func and upload fails for some reason - the error will just state 'upload failed'.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The proper way of handling errors in Promises is using the try/catch mechanism at the calling function.
This should still reject, and the error (the object returned to the reject function) can be analyzed further by the calling function.
I would like to mention in this context that this is javascript and it's s**t.

webui/src/lib/api/index.js Show resolved Hide resolved
Comment on lines 230 to 241
const headersString = typeof rawHeaders === 'string' ? rawHeaders : rawHeaders.toString();
const cleanedHeadersString = headersString.trim();
const headerLines = cleanedHeadersString.split('\n');
const parsedHeaders = headerLines.reduce((acc, line) => {
let [key, ...value] = line.split(':'); // split into key and the rest of the value
key = key.trim();
value = value.join(':').trim();
if (key && value) {
acc[key.toLowerCase()] = value;
}
return acc;
}, {});
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please extract these lines to a function called parseRawHeaders (or something like that), and instead of those line run:

const parsedHeaders = parseRawHeaders(rawHeaders);
if (parsedHeaders['content-md5']) {
.....

I would actually pass the extractChecksumFromResponse function the parsed headers as an argument and let it fetch the checksum from it.

Comment on lines +58 to +73
export const parseRawHeaders = (rawHeaders) => {
const headersString = typeof rawHeaders === 'string' ? rawHeaders : rawHeaders.toString();
const cleanedHeadersString = headersString.trim();
const headerLines = cleanedHeadersString.split('\n');
const parsedHeaders = headerLines.reduce((acc, line) => {
let [key, ...value] = line.split(':'); // split into key and the rest of the value
key = key.trim();
value = value.join(':').trim();
if (key && value) {
acc[key.toLowerCase()] = value;
}
return acc;
}, {});
return parsedHeaders;
};

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

moved this func here, as ive seen that no func was exported from the other file to this, and it is now also being used in this file.

Comment on lines 768 to 773
async upload(repoId, branchId, path, fileObject, onProgressFn = null) {
const query = qs({path});
const uploadUrl = `${API_ENDPOINT}/repositories/${encodeURIComponent(repoId)}/branches/${encodeURIComponent(branchId)}/objects?` + query;
const {status, body, contentType} = await uploadWithProgress(uploadUrl, fileObject, 'POST', onProgressFn)
const {status, body, rawHeaders} = await uploadWithProgress(uploadUrl, fileObject, 'POST', onProgressFn)
if (status !== 201) {
const contentType = rawHeaders ? parseRawHeaders(rawHeaders)['content-type'] : undefined;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This func is being used when lakefs is configured not to use presigned url. The change I introduced is due to the change in uploadWithProgress that only returns raw headers. Tested this manually, and in case of an error the func does extract the header and the error message.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great.
Only the indentation again 😅

Comment on lines 243 to 239
return response.etag.replace(/[" ]+/g, "");
return parsedHeaders['etag'].replace(/"/g, '');
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is the regex different?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

weird, can't tell, switching back

Copy link
Contributor

@Jonathan-Rosenberg Jonathan-Rosenberg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The regex explanation (which I think will not suffice 😅 ) and the indentation, and we're off!

Comment on lines 252 to 265
try {
const uploadResponse = await uploadWithProgress(getResp.presigned_url, file, 'PUT', onProgress, additionalHeaders);
const parsedHeaders = parseRawHeaders(uploadResponse.rawHeaders);
const checksum = extractChecksumFromResponse(parsedHeaders);
await staging.link(repo.id, reference.id, fpath, getResp, checksum, file.size, file.type);
} catch(error) {
if (error.status >= 400) {
throw new Error(`Error uploading file: HTTP ${error.status}`);
}
if (error.status === 0) {
throw new Error(`CORS settings error. Check documentation for more info.`);
}
throw error;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I-N-D-E-N-T-A-T-I-O-N

@ItamarYuran ItamarYuran force-pushed the 8158-bug-ui-not-using-presign-for-uploading-objects branch from 31bff69 to e767f86 Compare December 10, 2024 12:21
Copy link
Contributor

@Jonathan-Rosenberg Jonathan-Rosenberg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for this and answering all of my comments!

@ItamarYuran ItamarYuran merged commit deb40d4 into master Dec 10, 2024
39 checks passed
@ItamarYuran ItamarYuran deleted the 8158-bug-ui-not-using-presign-for-uploading-objects branch December 10, 2024 12:53
@Jonathan-Rosenberg Jonathan-Rosenberg added include-changelog PR description should be included in next release changelog and removed exclude-changelog PR description should not be included in next release changelog labels Dec 10, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
include-changelog PR description should be included in next release changelog
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Bug]: UI not using presign for uploading objects
2 participants