Process stuck #40

Closed
lucasnorman opened this issue Apr 16, 2021 · 11 comments

@lucasnorman

I am trying to run the jar file and not getting any logs:

java -dsa -da -XX:+UseG1GC -Xmx2048m -Dlog.level=DEBUG -jar fsimage-exporter-1.4.jar localhost 9709 fsimage-conf.yml

Nothing is being logged, and no process is listening on port 9709. Any pointers?
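
One way to confirm whether anything is listening on the port is a plain TCP connect. The following is only a minimal sketch and not part of the exporter: the helper name `is_port_open` is hypothetical, and the host and port are the ones from the command above.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to (host, port) succeeds."""
    try:
        # create_connection resolves the host and attempts a TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or the host could not be resolved.
        return False


if __name__ == "__main__":
    # Host and port taken from the exporter command line in this issue.
    print(is_port_open("localhost", 9709))
```

A `False` result only says that no TCP listener accepted the connection; it does not explain why the process failed to bind.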

@marcelmay
Owner

Can you provide the output of
java -version
and
sha1sum fsimage-exporter-1.4.jar ?
and
java -jar fsimage-exporter-1.4.jar ?

The last command should print something like
Usage: WebServer [-Dlog.level=[WARN|INFO|DEBUG]] <hostname> <port> <yml configuration file>

@lucasnorman
Author

$java -version
java version "1.8.0_271"
Java(TM) SE Runtime Environment (build 8.0.6.20 - pxa6480sr6fp20-20201120_02(SR6 FP20))
IBM J9 VM (build 2.9, JRE 1.8.0 Linux amd64-64-Bit Compressed References 20201102_458768 (JIT enabled, AOT enabled)
OpenJ9   - 5b31a42
OMR      - 6ad3a34
IBM      - b7e48f4)
JCL - 20201119_01 based on Oracle jdk8u271-b09
$sha1sum fsimage-exporter-1.4.jar
15d331ea5c6b3fece4839d8370e0aa3a5a11374a  fsimage-exporter-1.4.jar
$java -jar fsimage-exporter-1.4.jar
Usage: WebServer [-Dlog.level=[WARN|INFO|DEBUG]] <hostname> <port> <yml configuration file>

@marcelmay
Owner

Just tried to reproduce using your JDK version, but it seems to work:

Dockerfile

FROM ibmjava:jre
RUN java -version
ADD https://raw.githubusercontent.com/marcelmay/hadoop-hdfs-fsimage-exporter/master/example.yml .
ADD https://repo1.maven.org/maven2/de/m3y/prometheus/exporter/fsimage/fsimage-exporter/1.4/fsimage-exporter-1.4.jar .
ADD https://github.com/marcelmay/hadoop-hdfs-fsimage-exporter/raw/master/src/test/resources/fsimage_0001 /src/test/resources/fsimage_0001
#CMD ["java", "-version"]
CMD ["java", "-dsa", "-da", "-XX:+UseG1GC", "-Xmx2048m", "-Dlog.level=DEBUG", "-jar", "/fsimage-exporter-1.4.jar", "localhost", "9709", "example.yml"]

docker build -t fsimage-test .
docker run -it --rm fsimage-test

2021-04-16 19:01:09,363 [pool-1-thread-1] DEBUG de.m3y.prometheus.exporter.fsimage.FsImageWatcher  - Detected changes (old=, new=/src/test/resources/fsimage_0001)
2021-04-16 19:01:09,409 [pool-1-thread-1] DEBUG de.m3y.hadoop.hdfs.hfsa.core.FsImageLoader  - Loading fsimage section STRING_TABLE of 53 bytes
2021-04-16 19:01:09,468 [pool-1-thread-1] INFO  org.apache.hadoop.hdfs.server.namenode.FSDirectory  - GLOBAL serial map: bits=29 maxEntries=536870911
2021-04-16 19:01:09,468 [pool-1-thread-1] INFO  org.apache.hadoop.hdfs.server.namenode.FSDirectory  - USER serial map: bits=24 maxEntries=16777215
2021-04-16 19:01:09,468 [pool-1-thread-1] INFO  org.apache.hadoop.hdfs.server.namenode.FSDirectory  - GROUP serial map: bits=24 maxEntries=16777215
2021-04-16 19:01:09,468 [pool-1-thread-1] INFO  org.apache.hadoop.hdfs.server.namenode.FSDirectory  - XATTR serial map: bits=24 maxEntries=16777215
2021-04-16 19:01:09,470 [pool-1-thread-1] DEBUG de.m3y.hadoop.hdfs.hfsa.core.FsImageLoader  - Loaded 5 strings into string table of length 53 bytes
2021-04-16 19:01:09,470 [pool-1-thread-1] DEBUG de.m3y.hadoop.hdfs.hfsa.core.FsImageLoader  - Loaded fsimage section STRING_TABLE in 61ms
...

My initial thought was that your IBM J9 JDK was the cause, but as the Dockerfile run above shows, the exporter works with it.

Could you share your fsimage-conf.yml, and do an ls -al on your fsimage?

@lucasnorman
Author

conf:

fsImagePath : '/data/hadoop/hdfs/nn'

skipFileDistributionForGroupStats : true

skipFileDistributionForUserStats : false


#paths:
#  - '/data'
#  - '/home'
#  - '/rr'
#  - '/raw'
#  - '/system'
#  - '/tmp'
#  - '/user'
#  - '/usr'

skipFileDistributionForPathStats : true

skipFileDistributionForPathSetStats : true

$ls -la /data/hadoop/hdfs/nn/
total 472
drwxrwxr-x 3 hdfs-user hdfs-user      40 Apr 12 21:24 .
drwxrwxr-x 3 hdfs-user hdfs-user      16 Aug 27  2020 ..
drwx------ 2 hdfs-user hdfs-user 1601536 Apr 16 19:10 current
-rw-rw-r-- 1 hdfs-user hdfs-user      30 Apr 12 21:42 in_use.lock

@marcelmay
Owner

Ok, it's the missing fsimage: there is no fsimage_00* file in /data/hadoop/hdfs/nn/.
As soon as there is an fsimage file dumped by the Hadoop NameNode, the exporter should work.

Thanks a lot, I will try to fix this bug (to at least show the expected startup info).

@marcelmay
Owner

You could try to trigger the fsimage dump via hdfs dfsadmin and its -saveNamespace option.
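
As a rough sketch of that suggestion: `hdfs dfsadmin -saveNamespace` requires the NameNode to be in safe mode, so the usual sequence is to enter safe mode, save the namespace, and leave safe mode again. The wrapper below is hypothetical (its function names are not part of Hadoop or the exporter) and assumes that the `hdfs` CLI is on the PATH and is run by a user with HDFS superuser privileges.

```python
import subprocess
from typing import List


def save_namespace_commands() -> List[List[str]]:
    """Return the dfsadmin invocations used to force a new fsimage."""
    return [
        ["hdfs", "dfsadmin", "-safemode", "enter"],  # saveNamespace requires safe mode
        ["hdfs", "dfsadmin", "-saveNamespace"],      # write the namespace to the storage dirs
        ["hdfs", "dfsadmin", "-safemode", "leave"],  # resume normal operation
    ]


def _run(cmd: List[str], dry_run: bool) -> None:
    # Always echo the command; execute it only when dry_run is False.
    print(" ".join(cmd))
    if not dry_run:
        subprocess.run(cmd, check=True)


def trigger_fsimage_dump(dry_run: bool = True) -> None:
    """Enter safe mode, save the namespace, and always try to leave safe mode."""
    enter, save, leave = save_namespace_commands()
    _run(enter, dry_run)
    try:
        _run(save, dry_run)
    finally:
        # Leave safe mode even if saveNamespace failed.
        _run(leave, dry_run)
```

Calling `trigger_fsimage_dump()` with the default `dry_run=True` only prints the three commands. Safe mode makes the namespace read-only while it is active, so running this for real on a busy cluster is a decision for whoever operates it.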

@marcelmay marcelmay added the bug label Apr 16, 2021
@marcelmay marcelmay added this to the 1.4.1 milestone Apr 16, 2021
@marcelmay marcelmay self-assigned this Apr 16, 2021
@lucasnorman
Author

I dumped the fsimage file to the path but it is still not working. Does it need to have a specific name? Does fsImagePath need to be set to the file or the dir containing it?

@marcelmay
Owner

You probably have to change the config and set the fsImagePath to the current subdirectory:
fsImagePath : '/data/hadoop/hdfs/nn/current'
At this location, there should be fsimage_00* files.
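
To check a candidate `fsImagePath` before starting the exporter, the directory can be scanned for files named `fsimage_` followed by digits. This is a minimal sketch, not the exporter's own logic: the helper names are hypothetical, and requiring the whole file name to match (so that companion files such as checksums are skipped) is an assumption.

```python
import re
from pathlib import Path
from typing import List, Optional

# Assumed naming scheme: "fsimage_" followed only by digits.
FSIMAGE_NAME = re.compile(r"fsimage_\d+")


def find_fsimage_files(directory: str) -> List[Path]:
    """Return the fsimage files directly inside `directory`, sorted by name."""
    d = Path(directory)
    if not d.is_dir():
        return []
    # iterdir() raises PermissionError if the directory is not readable
    # by the current user; that is deliberately not swallowed here.
    return sorted(
        p for p in d.iterdir()
        if p.is_file() and FSIMAGE_NAME.fullmatch(p.name)
    )


def suggest_fsimage_path(configured: str) -> Optional[str]:
    """Return the configured path or its 'current' subdirectory,
    whichever contains fsimage files first; None if neither does."""
    for candidate in (Path(configured), Path(configured) / "current"):
        if find_fsimage_files(str(candidate)):
            return str(candidate)
    return None
```

For the layout in this issue, `suggest_fsimage_path('/data/hadoop/hdfs/nn')` would be expected to return the `current` subdirectory once an fsimage file exists there, provided the calling user can read that directory.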

@lucasnorman
Author

@marcelmay Okay thank you for your help, that worked! Would you like to close this or leave it open?

@marcelmay
Owner

Thanks for the feedback.

Please leave this issue open for the missing output: the exporter should log something and complain when no fsimage is found at the specified location.

@marcelmay
Owner

Output is now more verbose:

2021-04-19 21:38:18,867 [main] INFO  de.m3y.prometheus.exporter.fsimage.WebServer  - FSImage exporter started and listening on http://localhost:9709
2021-04-19 21:38:18,909 [pool-3-thread-1] WARN  de.m3y.prometheus.exporter.fsimage.FsImageWatcher  - Can not detect fsimage file : No fsimage file(s) found in /Users/mm/projects/hadoop-fsimage-exporter/src/test matching pattern fsimage_\d+
...
2021-04-19 21:39:18,916 [pool-3-thread-1] DEBUG de.m3y.prometheus.exporter.fsimage.FsImageWatcher  - Detected changes (old=, new=/Users/mm/projects/hadoop-fsimage-exporter/src/test/fsimage_0001)
...
