Feature #3
base: main
Conversation
@lomick2090 Left a couple of comments here. Please fix them and let me know. If you need clarification on any of my comments, drop me a message and I will explain. Thanks.

const data = JSON.parse(await response.text());

//Checks for full data pull
startAt = startAt + 50;
What is the use of 50 here?
Please don't use a magic number.
Please extract 50 into a variable and name it meaningfully.
Will your code pull all issues, no matter how many issues or projects there are?
Ok, I made a resultsPerQuery variable that now equals 100, and added a "maxResults" property to the bodyData of the API call, so that 100 issues are pulled per request. This number can be increased if needed; 50 is the default for an API call. You can see this in the new update.
Right now the code does pull all issues for a given project, as it should. The implementation requires you to specify which project you want to pull from each time you run the program: node app.js (databaseId) **(projectId)**. This could be reworked to pull from all available projects if you want.
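The pagination described above can be sketched as a loop that advances startAt by resultsPerQuery until the API reports no more issues. This is only a sketch: fetchPage is a hypothetical stand-in for the actual Jira API call, which is assumed to return the page of issues plus the total count.

```javascript
// Mirrors the resultsPerQuery variable described above; Jira's default is 50.
const resultsPerQuery = 100;

// Pulls every issue by paging through the API in resultsPerQuery-sized chunks.
// fetchPage(startAt, maxResults) is a hypothetical stand-in for the real call
// and is expected to resolve to { issues: [...], total: n }.
async function fetchAllIssues(fetchPage) {
  const issues = [];
  let startAt = 0;
  while (true) {
    const page = await fetchPage(startAt, resultsPerQuery);
    issues.push(...page.issues);
    startAt += resultsPerQuery;
    // Stop once we have requested everything the API says exists.
    if (startAt >= page.total) break;
  }
  return issues;
}
```

This removes the magic number and also answers the "will it pull all issues" question, since the loop is driven by the total reported by the API rather than a fixed number of requests.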
setTimeout(async () => {
    await BQ.insertRows(datasetId, projectId, issues)
}, 10000)
Is this timeout necessary? Will it not work without the wait?
So the problem faced here is that a newly created table may not appear to exist in the database until some time has passed. I found some discussion on the subject here:
googleapis/google-cloud-go#975
Sometimes the rows would insert into the newly created table with no problem, and sometimes the insert would return a 404 error. It definitely appears to be a server-side issue.
There were a few ways I thought of fixing it:
- Instead of deleting the table and creating a new one, delete the contents of the table and upload the new data. At first glance this seems to be the best solution, but it runs into the streaming buffer: newly streamed records in BQ storage cannot be deleted or edited until a 90-minute period has ended. This would work if you only ran the program once every 2 hours or longer, but it's hard to test, since you have to wait 90 minutes between tests, and it still doesn't cover the case where a new table is created anyway.
- Since we know we just created a new table, if insertRows returns a 404 error we could run the function again until the rows are inserted or a different error is returned. The downside is that this could cause a lot of BQ API calls. I have not tried this solution yet, but it seems feasible.
- The solution provided here with the timeout, which is maybe a little messy, but it was recommended in the above thread.

If you would like me to implement either of the other solutions, let me know and I can do that.
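The retry option above could be sketched roughly like this. insertFn is a hypothetical stand-in for the actual BQ.insertRows call; the attempt and delay limits are assumptions, not values from the PR.

```javascript
// Retry the insert while the newly created table still returns 404,
// waiting between attempts; rethrow any other error immediately.
async function insertWithRetry(insertFn, { maxAttempts = 10, delayMs = 2000 } = {}) {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await insertFn();
    } catch (err) {
      // Only retry while the table is not yet visible (404); give up
      // after maxAttempts so a genuinely missing table still surfaces.
      if (err.code !== 404 || attempt === maxAttempts) throw err;
      await new Promise((resolve) => setTimeout(resolve, delayMs));
    }
  }
}
```

Compared to the fixed 10-second setTimeout, this returns as soon as the table becomes visible and fails loudly if it never does, at the cost of extra BQ API calls while polling.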
Good investigation. For the time being, let's keep it as it is.
…ltsPerQuery variable
static bigquery = new BigQuery()

//Creates a new table in the dataset
static async createTable(
Static makes sense. Good work.
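The pattern being praised here is one shared client created once at class load rather than a new client per method call. A minimal self-contained sketch, using FakeClient as a stand-in for new BigQuery() so it runs without the @google-cloud/bigquery dependency:

```javascript
// Stand-in for the BigQuery client, counting how many times it is built.
class FakeClient {
  constructor() {
    FakeClient.instances = (FakeClient.instances || 0) + 1;
  }
}

class BQ {
  // One shared client for every static method on the class; constructed
  // exactly once when the class is defined.
  static client = new FakeClient();

  static async createTable(datasetId, tableId) {
    // Real code would call something like
    // this.client.dataset(datasetId).createTable(tableId) here.
    return `${datasetId}.${tableId}`;
  }
}
```

However many static methods run, only one client is ever constructed, which avoids repeated auth setup on every call.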