
Large SQL files skipped without error on supabase db reset #470

Closed
FelixZY opened this issue Sep 23, 2022 · 3 comments
Labels
bug Something isn't working

Comments


FelixZY commented Sep 23, 2022

Bug report

Describe the bug

My database should contain geographical data for Sweden. This data is inserted using 327 `INSERT` statements and the SQL file takes up ~4MB. When I attempt to run this file (using `supabase db reset`) as part of either my migrations (`supabase/migrations/*.sql`) or my initial seed (`supabase/seed.sql`), some data is skipped over and the reset operation is reported as a success.

To Reproduce

Steps to reproduce the behavior, please provide code snippets or a repository:

  1. Place the provided SQL file in supabase/migrations/
  2. Run supabase db reset
  3. Go to studio -> locations table

Expected behavior

  • locations table should contain 327 records - the full SQL file should be executed.

Screenshots

In this case, the actual count was 206 records:

[Screenshot from 2022-09-23 21-50-00: the locations table in Studio showing 206 records]

System information

  • Host OS: Ubuntu 20.04.4 LTS (focal) (64-bit)
  • Devcontainer OS: Debian 11 (bullseye)
  • Docker (Running on host): 20.10.17, build 100c701
  • Version of supabase cli (running in devcontainer): 1.4.8
```
CONTAINER ID   IMAGE                                                   COMMAND                  CREATED             STATUS                       PORTS                                                                                                                                   NAMES
0c91c366d071   public.ecr.aws/t3w2s2c9/studio:v0.1.0                   "docker-entrypoint.s…"   About an hour ago   Up About an hour             0.0.0.0:54323->3000/tcp, :::54323->3000/tcp                                                                                             supabase_studio_portal
ed3f8ddddef6   public.ecr.aws/t3w2s2c9/postgres-meta:v0.45.0           "docker-entrypoint.s…"   About an hour ago   Up About an hour             8080/tcp                                                                                                                                supabase_pg_meta_portal
0c37d35fc865   public.ecr.aws/t3w2s2c9/pgadmin-schema-diff:cli-0.0.5   "sleep infinity"         About an hour ago   Up About an hour                                                                                                                                                     supabase_differ_portal
122778e9a4e2   public.ecr.aws/t3w2s2c9/storage-api:v0.20.1             "docker-entrypoint.s…"   About an hour ago   Up 12 minutes                5000/tcp                                                                                                                                supabase_storage_portal
ac4cede65018   public.ecr.aws/t3w2s2c9/postgrest:v9.0.1.20220717       "/bin/postgrest"         About an hour ago   Up About an hour             3000/tcp                                                                                                                                supabase_rest_portal
5aa101a262b9   public.ecr.aws/t3w2s2c9/realtime:v0.22.7                "./prod/rel/realtime…"   About an hour ago   Up About an hour                                                                                                                                                     supabase_realtime_portal
f39c4983b1a6   public.ecr.aws/t3w2s2c9/inbucket:3.0.3                  "/start-inbucket.sh …"   About an hour ago   Up About an hour (healthy)   0.0.0.0:54326->1100/tcp, :::54326->1100/tcp, 0.0.0.0:54325->2500/tcp, :::54325->2500/tcp, 0.0.0.0:54324->9000/tcp, :::54324->9000/tcp   supabase_inbucket_portal
eb7abd449c05   public.ecr.aws/t3w2s2c9/gotrue:v2.15.3                  "gotrue"                 About an hour ago   Up About an hour                                                                                                                                                     supabase_auth_portal
b89c961806e0   public.ecr.aws/t3w2s2c9/kong:2.8.1                      "sh -c 'cat <<'EOF' …"   About an hour ago   Up About an hour (healthy)   8001/tcp, 8443-8444/tcp, 0.0.0.0:54321->8000/tcp, :::54321->8000/tcp                                                                    supabase_kong_portal
3dd717f7c160   public.ecr.aws/t3w2s2c9/postgres:14.1.0.66              "docker-entrypoint.s…"   About an hour ago   Up 12 minutes (healthy)      0.0.0.0:54322->5432/tcp, :::54322->5432/tcp                                                                                             supabase_db_portal
a29675cb4712   vsc-portal-6c69f167f61ae88009bdb1b8b95be08c-uid         "/bin/sh -c 'echo Co…"   About an hour ago   Up About an hour                                                                                                                                                     optimistic_matsumoto
```

Additional info

  • When using a single file for all INSERTs, I'm consistently getting 206 records inserted.
  • When I split the INSERT statements into multiple files, the number of inserted records changes.
  • By using one file per INSERT, I am able to have all 327 records inserted (though this is a very messy workaround).
@FelixZY FelixZY added the bug Something isn't working label Sep 23, 2022

FelixZY commented Sep 24, 2022

Debugging this right now. First find of significance:

`parser.Split(sql)` returns 210 results here (4 create/etc. + 327 inserts expected):

```go
for _, line := range parser.Split(sql) {
```


FelixZY commented Sep 24, 2022

And the issue appears to be that `scanner.Err()` is not checked here:

```go
func Split(sql io.Reader) (stats []string) {
	t := tokenizer{state: &ReadyState{}}
	scanner := bufio.NewScanner(sql)
	scanner.Split(t.ScanToken)
	for scanner.Scan() {
		token := scanner.Text()
		stats = append(stats, token)
	}
	return stats
}
```

By adding a print to the end:

```go
for scanner.Scan() {
	token := scanner.Text()
	stats = append(stats, token)
}
fmt.Println(scanner.Err())
return stats
```

I get the following output:

```
$ cd ./cli && go build && cd ../ && ./cli/cli db reset
Resetting database...
Initialising schema...
Applying migration 20220923174143_locations.sql...
bufio.Scanner: token too long
210
bufio.Scanner: token too long
Seeding data...
<nil>
Activating branch...
Finished supabase db reset on branch main.
```


FelixZY commented Sep 24, 2022

Following up from the documentation:

> Programs that need more control over error handling or large tokens, or must run sequential scans on a reader, should use bufio.Reader instead.

https://pkg.go.dev/bufio#Scanner


EDIT

I opted for increasing the buffer size of the scanner instead: https://stackoverflow.com/a/39864391/1137077

sweatybridge pushed a commit that referenced this issue Sep 24, 2022
For really long lines, it is possible for the scanner to run out of
buffer space, resulting in

```go
scanner.Err() // bufio.Scanner: token too long
```

This error was not checked, however, causing a silent failure.

This commit adds a check for `scanner.Err()` and properly propagates any
non-nil values.
sweatybridge pushed a commit that referenced this issue Sep 24, 2022
For really long lines, it is possible for the scanner to run out of
buffer space, resulting in

```go
scanner.Err() // bufio.Scanner: token too long
```

Following [this](https://stackoverflow.com/a/39864391/1137077)
suggestion from Stack Overflow, this commit implements a workaround
where the maximum buffer capacity is increased from 64K to 256K. Testing
against the SQL file with geodata from #470 showed that 128K was already
enough to resolve that issue. 256K is double that and - as a wise man
once said - "ought to be enough for anybody".
@FelixZY FelixZY closed this as completed Sep 24, 2022