Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Bug]: Container hangs waiting for startup due to logs #1035

Closed
mausch opened this issue Oct 29, 2023 · 5 comments · Fixed by #1038
Closed

[Bug]: Container hangs waiting for startup due to logs #1035

mausch opened this issue Oct 29, 2023 · 5 comments · Fixed by #1038
Labels
bug Something isn't working

Comments

@mausch
Copy link
Contributor

mausch commented Oct 29, 2023

Testcontainers version

3.5.0

Using the latest Testcontainers version?

Yes

Host OS

Linux

Host arch

x86

.NET version

6.0.414

Docker version

Client:
 Version:           20.10.25
 API version:       1.41
 Go version:        go1.20.8
 Git commit:        v20.10.25
 Built:             Thu Jan  1 00:00:00 1970
 OS/Arch:           linux/amd64
 Context:           default
 Experimental:      true

Server:
 Engine:
  Version:          20.10.25
  API version:      1.41 (minimum version 1.12)
  Go version:       go1.20.8
  Git commit:       v20.10.25
  Built:            Tue Jan  1 00:00:00 1980
  OS/Arch:          linux/amd64
  Experimental:     false
 containerd:
  Version:          v1.7.1
  GitCommit:        v1.7.1
 runc:
  Version:          1.1.5
  GitCommit:        
 docker-init:
  Version:          0.19.0
  GitCommit:

Docker info

Client:
 Context:    default
 Debug Mode: false
 Plugins:
  buildx: Docker Buildx (Docker Inc., v0.10.4)
  compose: Docker Compose (Docker Inc., 2.18.1)

Server:
 Containers: 1
  Running: 1
  Paused: 0
  Stopped: 0
 Images: 26
 Server Version: 20.10.25
 Storage Driver: overlay2
  Backing Filesystem: extfs
  Supports d_type: true
  Native Overlay Diff: true
  userxattr: false
 Logging Driver: journald
 Cgroup Driver: systemd
 Cgroup Version: 2
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
 Swarm: inactive
 Runtimes: io.containerd.runc.v2 io.containerd.runtime.v1.linux runc
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: v1.7.1
 runc version: 
 init version: 
 Security Options:
  seccomp
   Profile: default
  cgroupns
 Kernel Version: 5.10.199
 Operating System: NixOS 23.05 (Stoat)
 OSType: linux
 Architecture: x86_64
 CPUs: 8
 Total Memory: 31.03GiB
 Name: RYOGA
 ID: 3B5S:ZIAN:TKIC:7JOU:XYLM:FTRC:3IUW:TDVE:7TDQ:2UBI:BRNQ:HKJQ
 Docker Root Dir: /var/lib/docker
 Debug Mode: false
 Username: mausch
 Registry: https://index.docker.io/v1/
 Labels:
 Experimental: false
 Insecure Registries:
  127.0.0.0/8
 Live Restore Enabled: true

What happened?

Sample code from https://dotnet.testcontainers.org/modules/postgres/ hangs. I've stepped through it and it turns out that somehow the Testcontainers lib can't get the container logs, they all come back as an empty string even though I can view them with docker logs.

Here's a trivial example. If you override the default wait strategy that checks for logs it doesn't hang (I'm aware that checking only for the port is incorrect).

using Testcontainers.PostgreSql;

var testContainer = new PostgreSqlBuilder()
    .WithWaitStrategy(Wait.ForUnixContainer().UntilPortIsAvailable(5432)) // comment this out and it hangs
    .Build();
await testContainer.StartAsync();

Seems to be some breaking change in docker?

I tried starting the container directly with Docker.DotNet and I was able to get the logs just fine though...

Relevant log output

No response

Additional information

No response

@mausch mausch added the bug Something isn't working label Oct 29, 2023
@HofmeisterAn
Copy link
Collaborator

I cannot reproduce it with the latest version, nor can I find any release notes (changes) that may be causing an issue here. It appears that you are using an older Docker Engine version (20.10). Could you please try the latest version, 24.0?

I tried starting the container directly with Docker.DotNet and I was able to get the logs just fine though...

Collecting the logs is implemented in a straightforward manner. There isn't anything particularly special about what we do here. We set the log filter to min-max (by default) and then get the logs from the container.

@mausch
Copy link
Contributor Author

mausch commented Oct 31, 2023

I think I've found the cause of the issue.
When getting the logs the timestamps are undefined at the callsite.
This leads to

public Task<(string Stdout, string Stderr)> GetContainerLogsAsync(string id, DateTime since = default, DateTime until = default, bool timestampsEnabled = true, CancellationToken ct = default)
{
#if NETSTANDARD2_1_OR_GREATER
var unixEpoch = DateTime.UnixEpoch;
#else
var unixEpoch = new DateTime(1970, 1, 1, 0, 0, 0, DateTimeKind.Utc);
#endif
if (default(DateTime).Equals(since))
{
since = DateTime.MinValue;
}
if (default(DateTime).Equals(until))
{
until = DateTime.MaxValue;
}
return Container.GetLogsAsync(id, since.ToUniversalTime().Subtract(unixEpoch), until.ToUniversalTime().Subtract(unixEpoch), timestampsEnabled, ct);
}
setting Since = DateTime.MinValue - epoch which is -62135596740 .
For some reason any negative value in this parameter leads to getting no logs at all.
With Since = 0 all logs are fetched just fine.

Here's a code snippet that replaces the default WaitUntil and shows what works and what doesn't:

public sealed class PostgreSqlContainerTest : IAsyncLifetime
{
    private sealed class WaitUntil : IWaitUntil
    {
        private static readonly string[] LineEndings = { "\r\n", "\n" };

        /// <inheritdoc />
        public async Task<bool> UntilAsync(IContainer container)
        {
             // this works
             var client = new DockerClientConfiguration().CreateClient();
             var logsParameters = new ContainerLogsParameters
             {
                 ShowStdout = true,
                 ShowStderr = true,
                 Since = "0",
                 Until = "253402300800",
             };
            using var logs = await client.Containers.GetContainerLogsAsync(container.Id, false, logsParameters);
            var (stdout, stderr) = await logs.ReadOutputToEndAsync(default);

            // this can't get logs
            // var client = new DockerClientConfiguration().CreateClient();
            // var logsParameters = new ContainerLogsParameters
            // {
            //     ShowStdout = true,
            //     ShowStderr = true,
            //     Since = "-1",
            //     Until = "253402300800",
            // };
            // using var logs = await client.Containers.GetContainerLogsAsync(container.Id, false, logsParameters);
            // var (stdout, stderr) = await logs.ReadOutputToEndAsync(default);

            //this hangs because it always gets empty logs (default WaitUntil used in PostgreSqlBuilder)
            //var (stdout, stderr) = await container.GetLogsAsync(timestampsEnabled: false);
            
            // this hangs, it uses the same arguments as the default WaitUntil
            // var (stdout, stderr) = await container.GetLogsAsync(
            //     timestampsEnabled: false, 
            //     since: DateTime.MinValue, 
            //     until: DateTime.MaxValue
            // );

            // works
            // var (stdout, stderr) = await container.GetLogsAsync(
            //     timestampsEnabled: false, 
            //     since: DateTime.UnixEpoch, 
            //     until: DateTime.MaxValue
            // );

            // works
            // var (stdout, stderr) = await container.GetLogsAsync(
            //     timestampsEnabled: false, 
            //     since: DateTime.UtcNow.Subtract(TimeSpan.FromHours(1)), 
            //     until: DateTime.UtcNow.Add(TimeSpan.FromHours(1))
            // );

            return 2.Equals(Array.Empty<string>()
                .Concat(stdout.Split(LineEndings, StringSplitOptions.RemoveEmptyEntries))
                .Concat(stderr.Split(LineEndings, StringSplitOptions.RemoveEmptyEntries))
                .Count(line => line.Contains("database system is ready to accept connections")));
        }
    }
    
    private readonly PostgreSqlContainer _postgreSqlContainer = new PostgreSqlBuilder()
        .WithWaitStrategy(Wait.ForUnixContainer().AddCustomWaitStrategy(new WaitUntil()))
        .Build();

    public Task InitializeAsync()
    {
        return _postgreSqlContainer.StartAsync();
    }

    public Task DisposeAsync()
    {
        return _postgreSqlContainer.DisposeAsync().AsTask();
    }

    [Fact]
    public void ExecuteCommand()
    {
        using DbConnection connection = new NpgsqlConnection(_postgreSqlContainer.GetConnectionString());
        using DbCommand command = new NpgsqlCommand();
        connection.Open();
        command.Connection = connection;
        command.CommandText = "SELECT 1";
    }
}

@mausch
Copy link
Contributor Author

mausch commented Oct 31, 2023

Could you please try the latest version, 24.0

Docker 24 didn't change this behaviour:

Client:
 Version:    24.0.5
 Context:    default
 Debug Mode: false
 Plugins:
  buildx: Docker Buildx (Docker Inc.)
    Version:  v0.11.2
    Path:     /nix/store/c9dxg39wxk8g77bzanmxxyag6gvml24r-docker-plugins/libexec/docker/cli-plugins/docker-buildx
  compose: Docker Compose (Docker Inc.)
    Version:  2.21.0
    Path:     /nix/store/c9dxg39wxk8g77bzanmxxyag6gvml24r-docker-plugins/libexec/docker/cli-plugins/docker-compose

Server:
 Containers: 1
  Running: 0
  Paused: 0
  Stopped: 1
 Images: 26
 Server Version: 24.0.5
 Storage Driver: overlay2
  Backing Filesystem: extfs
  Supports d_type: true
  Using metacopy: false
  Native Overlay Diff: true
  userxattr: false
 Logging Driver: journald
 Cgroup Driver: systemd
 Cgroup Version: 2
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
 Swarm: inactive
 Runtimes: io.containerd.runc.v2 runc
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: v1.7.5
 runc version: 
 init version: 
 Security Options:
  seccomp
   Profile: builtin
  cgroupns
 Kernel Version: 5.10.199
 Operating System: NixOS 23.05 (Stoat)
 OSType: linux
 Architecture: x86_64
 CPUs: 8
 Total Memory: 31.03GiB
 Name: RYOGA
 ID: 41ad67c0-e890-4f4a-b020-06f2b6238ff2
 Docker Root Dir: /var/lib/docker
 Debug Mode: false
 Username: mausch
 Experimental: false
 Insecure Registries:
  127.0.0.0/8
 Live Restore Enabled: true

@mausch
Copy link
Contributor Author

mausch commented Oct 31, 2023

For some reason any negative value in this parameter leads to getting no logs at all.

BTW I'm not 100% sure if this is mapped as-is but the docker CLI also complains about negative values:

$ docker logs my-container --since "-62135596740"
invalid value for "since": parsing time "-62135596740" as "2006-01-02": cannot parse "-62135596740" as "2006"

$ docker logs my-container --since "-1"
invalid value for "since": parsing time "-1" as "2006-01-02": cannot parse "-1" as "2006"

$ docker logs my-container --since "0"
<all_logs>

@HofmeisterAn
Copy link
Collaborator

Indeed 🤔 a negative value looks wrong. I do not remember why I choose DateTime.MinValue and DateTime.MaxValue anymore. Without taking a closer look, I would expect unixEpoch to be the default value. Then since and until will correspond to 0 seconds (which is the default value according to the docs). Good catch 👀 thanks. Would you mind creating a PR?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants