
DPS binary doesn't work on non 4k page size linux arm64 kernels #473

Closed
pschiffe opened this issue Jun 15, 2024 · 8 comments
Labels
enhancement: a behavior change so small that it can't be considered a feature


What is Happening

The DPS binary, distributed either as a docker image or in the dns-proxy-server-linux-aarch64-3.19.3-snapshot.tgz archive, doesn't work on Linux arm64 kernels with a non-4k page size.

RHEL 8 variants for arm64 have only a 64k page size kernel:

$ getconf PAGESIZE
65536

RHEL 9 variants for arm64 have two kernel variants (4k and 64k page size) and you can choose between them. DPS works on the 4k variant, but not on the 64k one.
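
For anyone who wants to reproduce the 64k case on a RHEL 9 derivative, the 64k variant ships as a separate kernel package (package name from memory, so double-check your distro's docs):

# install the 64k page size kernel variant on aarch64, then reboot into it
dnf install -y kernel-64k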

I haven't tested it, but this will probably also be an issue when running Linux on Apple's ARM hardware, as that kernel has a 16k page size.

When you try to run the binary from the release archive or from docker, it won't start; you just get the error:
Fatal error: Failed to create the main Isolate. (code 8)

When I run DPS as a jar, it works:

$ java -version
openjdk version "22.0.1" 2024-04-16
OpenJDK Runtime Environment (Red_Hat-22.0.1.0.8-1) (build 22.0.1+8)
OpenJDK 64-Bit Server VM (Red_Hat-22.0.1.0.8-1) (build 22.0.1+8, mixed mode)

$ java -jar ./dns-proxy-server.jar 
20:42:31.229 [main           ] INF c.m.dnsproxyserver.config.dataprovider.JsonConfigs l=75   m=createDefault                   status=createdDefaultConfigFile, path=/root/conf/config.json
20:42:31.308 [main           ] DEB c.m.d.config.dataprovider.ConfigDAOJson           l=39   m=find                            configPath=/root/conf/config.json
20:42:31.657 [main           ] INF c.m.d.s.d.a.DpsDockerEnvironmentSetupService      l=32   m=setup                           status=binding-docker-events, connectedToDocker=true
20:42:31.657 [main           ] INF c.m.d.s.d.a.DpsDockerEnvironmentSetupService      l=44   m=setupNetwork                    status=dpsNetwork, active=false
20:42:31.657 [main           ] INF c.m.d.s.docker.application.DpsContainerService    l=102  m=tRunningContainersToDpsNetwork  status=autoConnectDpsNetworkDisabled, dpsNetwork=false, dpsNetworkAutoConnect=false
20:42:31.657 [main           ] INF c.m.d.solver.docker.entrypoint.EventListener      l=32   m=onStart                         status=containerAutoConnectToDpsNetworkDisabled
20:42:31.660 [main           ] INF com.mageddo.dnsserver.UDPServerPool               l=31   m=start                           Starting UDP server, addresses=/0.0.0.0:53
....

I think this is some Java issue, but I'm not sure how to fix it. Google shows similar reports for other Java software.

Specs

  • Docker Version:
Client: Docker Engine - Community
 Version:           26.1.3
 API version:       1.45
 Go version:        go1.21.10
 Git commit:        b72abbb
 Built:             Thu May 16 08:34:00 2024
 OS/Arch:           linux/arm64
 Context:           default

Server: Docker Engine - Community
 Engine:
  Version:          26.1.3
  API version:      1.45 (minimum version 1.24)
  Go version:       go1.21.10
  Git commit:       8e96db1
  Built:            Thu May 16 08:33:12 2024
  OS/Arch:          linux/arm64
  Experimental:     true
 containerd:
  Version:          1.6.32
  GitCommit:        8b3b7ca2e5ce38e8f31a34f35b2b68ceb8470d89
 runc:
  Version:          1.1.12
  GitCommit:        v1.1.12-0-g51d5e94
 docker-init:
  Version:          0.19.0
  GitCommit:        de40ad0
  • DPS Version: defreitas/dns-proxy-server:3.19.3-snapshot-aarch64
  • Attach DPS Log file
Fatal error: Failed to create the main Isolate. (code 8)
  • OS: Rocky Linux 8.10 (Green Obsidian)
mageddo added the enhancement label Jun 17, 2024
mageddo (Owner) commented Jun 17, 2024

Hey @pschiffe, thanks for your report. I have no idea how to fix it yet; I will look into how to improve this, and any help is welcome.

It is probably related to the Dockerfile

FROM defreitas/tools_graalvm-22.3_java-19_debian-9_aarch64:0.1.3

Or the deploy specs

mageddo (Owner) commented Jun 17, 2024

Maybe it's related to oracle/graal#7513. Looks like I will have to upgrade the qemu version, and maybe the GraalVM version too.
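
If that GraalVM issue is the same root cause, the discussion there suggests the native image bakes the build host's page size into the binary, and that newer GraalVM versions let you pick a larger one at build time. A rough sketch of what that would look like (flag name taken from the GraalVM discussion, not verified against this project's build):

# build for a 64k page size so the image also runs on 4k and 16k kernels
# (the run-time page size must not be larger than the build-time one)
native-image -H:PageSize=65536 -jar dns-proxy-server.jar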

mageddo added the triage label Jun 22, 2024
mageddo (Owner) commented Jun 27, 2024

Hey @pschiffe, I've upgraded the GraalVM and qemu versions; can you check if DPS 3.22.0-snapshot fixes your use case?
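
For reference, when the aarch64 image is built on an x86_64 host, the emulation comes from the qemu/binfmt handlers registered on that host, so the build machine usually needs those refreshed as well as the base image. The common docker way to do that is something like this (standard buildx setup, not taken from this repo's CI):

# re-register the arm64 binfmt handler with a recent statically linked qemu
docker run --privileged --rm tonistiigi/binfmt --install arm64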

mageddo added a commit that referenced this issue Jun 28, 2024
* upgrading qemu version to fix the issue

* qemu upgrading

* [Gradle Release Plugin] - new version commit:  '3.22.0-snapshot'.

* machine upgrade not necessary anymore

* adjusting image

* setup new qemu emulator

* deleting unused code

* adjusting release notes

* Upgrading debian image to fix gcc error

failed to solve: process "/bin/sh -c apt-get update -y &&  apt-get install --force-yes -y build-essential curl &&  apt-get install --force-yes -y libz-dev zlib1g-dev zlib1g" did not complete successfully: exit code: 100

* Revert "deleting unused code"

This reverts commit bc2415d.

* Revert "setup new qemu emulator"

This reverts commit 14b65cb.

* adjusting release notes
mageddo added the waiting-feedback and stale labels and removed the stale label Jun 28, 2024
pschiffe (Author) commented Jul 1, 2024

Hi @mageddo, so far no joy, though the error code is a little bit different (and the .jar stopped working):

# cat /etc/os-release | grep PRETTY
PRETTY_NAME="Rocky Linux 8.10 (Green Obsidian)"

# getconf PAGESIZE
65536

# uname -r
4.18.0-553.5.1.el8_10.aarch64

# docker version | grep -A 2 Server
Server: Docker Engine - Community
 Engine:
  Version:          26.1.3

# docker run -d defreitas/dns-proxy-server:3.22.0-snapshot-aarch64
# docker logs d6928c2b6735
Fatal error: Failed to create the main Isolate. (code 24)

When trying the binary from dns-proxy-server-linux-aarch64-3.22.0-snapshot.tgz:

# ./dns-proxy-server 
./dns-proxy-server: /lib64/libc.so.6: version `GLIBC_2.32' not found (required by ./dns-proxy-server)
./dns-proxy-server: /lib64/libc.so.6: version `GLIBC_2.34' not found (required by ./dns-proxy-server)

# rpm -q glibc
glibc-2.28-251.el8_10.2.aarch64

Trying .jar file:

# java -version
openjdk version "21.0.3" 2024-04-16 LTS
OpenJDK Runtime Environment (Red_Hat-21.0.3.0.9-1) (build 21.0.3+9-LTS)
OpenJDK 64-Bit Server VM (Red_Hat-21.0.3.0.9-1) (build 21.0.3+9-LTS, mixed mode, sharing)

# java -jar ./dns-proxy-server.jar 
Exception in thread "main" java.lang.NoClassDefFoundError: org/graalvm/nativeimage/ImageInfo
	at com.mageddo.utils.Runtime.getRunningDir(Runtime.java:34)
	at com.mageddo.dnsproxyserver.config.dataprovider.ConfigDAOJson.buildConfigPath(ConfigDAOJson.java:59)
	at com.mageddo.dnsproxyserver.config.dataprovider.ConfigDAOJson.find(ConfigDAOJson.java:33)
	at java.base/java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:197)
	at java.base/java.util.stream.SortedOps$SizedRefSortingSink.end(SortedOps.java:357)
	at java.base/java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:510)
	at java.base/java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:499)
	at java.base/java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:575)
	at java.base/java.util.stream.AbstractPipeline.evaluateToArrayNode(AbstractPipeline.java:260)
	at java.base/java.util.stream.ReferencePipeline.toArray(ReferencePipeline.java:616)
	at java.base/java.util.stream.ReferencePipeline.toArray(ReferencePipeline.java:622)
	at java.base/java.util.stream.ReferencePipeline.toList(ReferencePipeline.java:627)
	at com.mageddo.dnsproxyserver.config.application.ConfigService.findConfigs(ConfigService.java:35)
	at com.mageddo.dnsproxyserver.config.application.ConfigService.findCurrentConfig(ConfigService.java:29)
	at com.mageddo.dnsproxyserver.config.application.Configs.lambda$getInstance$0(Configs.java:19)
	at com.mageddo.commons.lang.Singletons.lambda$createOrGet$0(Singletons.java:19)
	at java.base/java.util.concurrent.ConcurrentHashMap.computeIfAbsent(ConcurrentHashMap.java:1708)
	at com.mageddo.commons.lang.Singletons.createOrGet(Singletons.java:19)
	at com.mageddo.commons.lang.Singletons.createOrGet(Singletons.java:15)
	at com.mageddo.dnsproxyserver.config.application.Configs.getInstance(Configs.java:18)
	at com.mageddo.dnsproxyserver.App.findConfig(App.java:53)
	at com.mageddo.dnsproxyserver.App.start(App.java:36)
	at com.mageddo.dnsproxyserver.App.main(App.java:25)
Caused by: java.lang.ClassNotFoundException: org.graalvm.nativeimage.ImageInfo
	at java.base/jdk.internal.loader.BuiltinClassLoader.loadClass(BuiltinClassLoader.java:641)
	at java.base/jdk.internal.loader.ClassLoaders$AppClassLoader.loadClass(ClassLoaders.java:188)
	at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:526)
	... 23 more

mageddo (Owner) commented Jul 1, 2024

Alright, I will have to try it out.

mageddo removed the waiting-feedback label Jul 3, 2024
mageddo (Owner) commented Jul 5, 2024

Hey @pschiffe, can you check if DPS 3.24.0-snapshot fixes your use case? The jar should also be working again.

  • Fixed Exception in thread "main" java.lang.NoClassDefFoundError: org/graalvm/nativeimage/ImageInfo
  • Probably fixed Fatal error: Failed to create the main Isolate. (code 24); I tried to emulate Rocky Linux on aarch64, but as I don't have an aarch64 machine I haven't had success so far
  • ./dns-proxy-server: /lib64/libc.so.6: version GLIBC_2.32 not found (required by ./dns-proxy-server) means either downgrading the glibc version on the builder machine or upgrading glibc on your machine. I figured out how to downgrade the required libc version to 2.28 in Downgrade necessary libc version to run aarch binary #507 (see the quick check after this list)
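
For anyone who wants to see exactly which glibc symbol versions a binary pulls in (and therefore the oldest glibc it can run on), a quick check with plain binutils:

# list every versioned glibc symbol the binary requires; the highest version shown
# is the minimum glibc needed on the target system
objdump -T ./dns-proxy-server | grep -o 'GLIBC_[0-9.]*' | sort -uV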

mageddo added the waiting-feedback label and removed the triage label Jul 5, 2024
pschiffe (Author) commented Jul 6, 2024

Hi @mageddo, awesome job! This seems to be fixed in all my use cases. Here's what I've tested (with 3.24.0-snapshot):

  • Rocky 8 on ARM64 with 64k page size: docker image, binary & jar all working
  • Rocky 9 on ARM64 with 4k page size: docker image, binary & jar all working
  • Rocky 9 on x86_64 with 4k page size: docker image, binary, binary static & jar all working

It's perfect, thank you!

Regarding the glibc version, it's usually not possible to upgrade it within a distribution. RHEL 8 and its derivatives currently ship glibc-2.28, and these will be supported until 2029. Anything else that's still supported is probably running a newer version, so you don't need to go older than that. For now, supporting glibc-2.28 and newer is perfect.

This issue can be closed from my side, thanks again.

mageddo (Owner) commented Jul 6, 2024

Cheers, thanks for your help.

mageddo closed this as completed Jul 6, 2024
mageddo removed the waiting-feedback label Jul 6, 2024