New batch migration causes error on sqlite #7348

Closed
mrsdizzie opened this issue Jul 3, 2019 · 8 comments · Fixed by #7353

mrsdizzie (Member) commented Jul 3, 2019

(cc @lunny)

With the changes in #7050, I now get errors like this when testing a migration:

2019/07/03 11:52:30 routers/repo/repo.go:315:MigratePost() [E] MigratePost: too many SQL variables

My example happened for comments, based on this INSERT statement (I think):

gitea/models/migrate.go, lines 121 to 127 at b5aa7f7:

if _, err := sess.NoAutoTime().Insert(comments); err != nil {
	return err
}
for issueID := range issueIDs {
	if _, err := sess.Exec("UPDATE issue set num_comments = (SELECT count(*) FROM comment WHERE issue_id = ?) WHERE id = ?", issueID, issueID); err != nil {
		return err
	}

Here is the full gist:

https://gist.github.com/mrsdizzie/681ea0295c11350fea4244a4289665ef

This is just testing migrating the "tea" repo here: https://github.com/go-gitea/tea

But it could maybe happen in other places too for similar reasons. For each comment there would be a few dozen variables in that SQL statement, so it breaks depending on how many comments there are. Each comment adds about 22 variables to the statement now (one for each column), so even something like 50 total comments (say 10 issues with 5 comments each) is enough to trigger this error, since I believe the default SQLITE_MAX_VARIABLE_NUMBER is 999 (22 * 50 = 1,100).
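
For illustration only (this is not the exact statement xorm builds, and the column list is abbreviated), the batched insert ends up with one group of ? placeholders per row, so the total placeholder count is rows * columns:

	package main

	import (
		"fmt"
		"strings"
	)

	func main() {
		const columns = 22 // roughly one placeholder per comment column
		const rows = 50    // comments inserted in one batch

		// Each row contributes one "(?, ?, ..., ?)" group to the VALUES clause.
		group := "(" + strings.Repeat("?, ", columns-1) + "?)"
		stmt := "INSERT INTO comment (...) VALUES " + strings.Repeat(group+", ", rows-1) + group

		fmt.Println(strings.Count(stmt, "?")) // 1100, over sqlite's default limit of 999
	}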

I know this is sort of a limit of sqlite and probably wouldn't happen with other databases -- but it's still an issue (and it makes testing new migration features difficult, since it is nice to use sqlite locally for development).

Maybe we can detect if the database is sqlite and fall back to the old method (or limit it to a known good number like 25 comments at a time)?

lunny added the type/bug label Jul 4, 2019
lunny (Member) commented Jul 4, 2019

The batch size for inserting comments is 100. Maybe I should lower it to 50.

lunny (Member) commented Jul 4, 2019

I tested locally, and it seems OK to migrate github.com/go-gitea/tea with a sqlite database. I'm on macOS.

lunny (Member) commented Jul 4, 2019

@mrsdizzie could you confirm whether #7353 fixes your issue? You will need to change the default SAVE_BATCH_SIZE.

zeripath (Contributor) commented Jul 4, 2019

We should set the batch size to something that works for sqlite out of the box, as that's the default database.

mrsdizzie (Member, Author) commented

I'm not the most knowledgeable on all of this, but from some reading of the error I think the gist is that for each column in a row, there will be a ? variable in the SQL statement. The more columns, the more ?s, and the count multiplies as each row inserted adds another set of them (see the gist I posted above). Once there are more than 999 ?s, sqlite will throw an error.

@lunny the test for tea might work for you and fail for me because I was testing a PR that adds two more columns to the comment table, so there are more variables for each comment inserted and it hits the sqlite limit faster. In my example above there are 1035 ? variables from 45 imported comments. If you remove the two columns I added for my PR, there would only be 945 (1035 - 90) and it would not hit the error.

I think the real issue here is that it isn't about the number of rows you are trying to insert but how many columns each of those rows has. Inserting 100 rows at a time will work fine for a table that has a few columns, but will have trouble for a larger table like comments, which as of my PR now has 23 columns.

While setting a global limit could help in my situation, as I could lower the number, it would involve guessing for most users, since the problem isn't based on the number of rows inserted but on the number of columns in those rows * the number of rows. I think it would be better if the code, knowing that the comments table has x columns, didn't insert more rows than that can handle, if possible.

If not, the global limit would probably need to be based on the table with the largest number of columns that can have a lot of rows imported. In the case from my example, the limit would have to be 43 (23 * 43 = 989). And then it would need to be lowered if another column were added to the comments table. I'm not aware of larger tables that also take lots of imported rows, but if there are, it would need to be lower still.

Alternatively, setting the default for sqlite to something lower like 25 would probably avoid getting close to those limits without worrying about it breaking on an update when somebody adds a column.
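
To make that arithmetic concrete, here is a minimal sketch of the calculation being described (the 999 constant is sqlite's documented default limit, and the function name is hypothetical, not code from #7353):

	// maxRowsPerInsert is a hypothetical helper: the largest number of rows
	// that can go into one multi-row INSERT without exceeding sqlite's
	// default variable limit, given the table's column count.
	func maxRowsPerInsert(columns int) int {
		const sqliteMaxVariables = 999 // default SQLITE_MAX_VARIABLE_NUMBER
		if columns <= 0 {
			return 1
		}
		return sqliteMaxVariables / columns // e.g. 999 / 23 = 43 rows for the comment table
	}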

Sorry for the long response; I was just trying to work all of that out in my head.

lunny (Member) commented Jul 6, 2019

In fact, we know the number of columns in the issue and comment tables.

mrsdizzie (Member, Author) commented

@lunny then I think we should add code where the insert happens so that it never uses more rows than would cause an error (i.e. keep rows * columns < 999). It could be conditional on sqlite too. If there are no other places that try to do large inserts at once, then maybe just adding that code to the migrations will be enough to fix this. Something roughly like the sketch below.
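
This is only an illustration of the idea, not the actual change in #7353; the column count, the 999 constant, and the helper name are assumptions, and it assumes the same xorm *Session and []*Comment slice as the code quoted from models/migrate.go above:

	// Hypothetical sketch: split the comments into chunks so that each
	// multi-row INSERT stays under sqlite's default variable limit.
	func insertCommentsInBatches(sess *xorm.Session, comments []*Comment) error {
		const sqliteMaxVariables = 999 // default SQLITE_MAX_VARIABLE_NUMBER
		const commentColumns = 23      // would be derived from the table mapping in practice

		batch := sqliteMaxVariables / commentColumns // 43 rows per INSERT
		for i := 0; i < len(comments); i += batch {
			end := i + batch
			if end > len(comments) {
				end = len(comments)
			}
			if _, err := sess.NoAutoTime().Insert(comments[i:end]); err != nil {
				return err
			}
		}
		return nil
	}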

lunny (Member) commented Jul 6, 2019

@mrsdizzie I updated #7353

go-gitea locked and limited conversation to collaborators Nov 24, 2020