[Java] Consolidate JNI compilation #2 #20367

Closed
12 tasks done
asfimport opened this issue Aug 12, 2022 · 13 comments

Comments

@asfimport

asfimport commented Aug 12, 2022

Umbrella ticket for the second part of the initiative to consolidate Java JNI compilation.

The initial part of the initiative covered: consolidating the ORC/Dataset code and separating the JNI CMakeLists.txt compilation.

This second part consists of:

1.- Make the Java library able to compile with a single mvn command
2.- Make the Java library able to compile from an installed libarrow
3.- Migrate the remaining Java-specific C++ CMakeLists.txt files into the Java project: ORC / Dataset / Gandiva
4.- Add a Windows build script that produces DLLs
5.- Incorporate the Windows DLLs into the Maven packages
6.- Migrate ORC JNI to use the C Data Interface

Reporter: David Dali Susanibar Arce / @davisusanibar

Subtasks:

Related issues:

Note: This issue was originally created as ARROW-17404. Please see the migration documentation for further details.

@asfimport

David Dali Susanibar Arce / @davisusanibar:
Hi @kou, could you please help me with these questions:

  1. Does ARROW-17081 cover point (2) for the Dataset migration?

  2. Regarding point (3): do you have any ideas about how we could implement that?

  3. Regarding the Windows DLL work: do you have any initial advice or recommendations about how we could produce DLLs for the ORC / Dataset / Gandiva modules?

     

    @lwhite1  

@asfimport

Kouhei Sutou / @kou:

  1. Yes.
  2. Yes. We can use find_package(Arrow REQUIRED) in CMakeLists.txt to find installed libarrow.so/libarrow.a/...
  3. Yes. I think that we can produce JNI DLLs easily on Windows too because CMake handles DLL related stuff. But we may need to spend time to prepare Windows CI. As the first step, we need to add a CI job on Windows to .github/workflows/java.yml without JNI related stuff. Then we can add JNI related stuff to the added CI job. We also need to add build-cpp-windows to dev/tasks/java-jars/github.yml to provide jars that support Linux/macOS/Windows.

@asfimport

David Dali Susanibar Arce / @davisusanibar:
Thanks.

@kou Would it be possible to include the Windows DLL work in your scope as well? If so, I could move on to reviewing the migration of ORC JNI to the C Data Interface.

 

@asfimport

Kouhei Sutou / @kou:
OK. I'll work on it. BTW, can you add a CI job on Windows to .github/workflows/java.yml without JNI related stuff? If you can, could you work on it? I'm not familiar with Java's (Maven) build system...

@asfimport

David Dali Susanibar Arce / @davisusanibar:
Ok, let me work on that: add a CI job on Windows to .github/workflows/java.yml

@asfimport

David Dali Susanibar Arce / @davisusanibar:
Hi @kou, this PR adds changes to the Java project so that it can compile in a Windows environment.

Please let me know whether there are other pending tasks where you could help us with "[Java] Add windows build script that produces DLLs for JNI modules".

@asfimport

Kouhei Sutou / @kou:
How about moving cpp/ to java/ tasks for ORC and Gandiva too?
I'm working on ARROW-17081 and found that it requires some CMake skills, because we need to improve our CMake configurations to complete it. For example, ARROW-17081 depends on ARROW-12175 and ARROW-17451. ARROW-17451 is merged, but ARROW-12175 isn't merged yet because it depends on ARROW-17511.

@asfimport

David Dali Susanibar Arce / @davisusanibar:
Moving the cpp/ to java/ tasks for ORC and Gandiva sounds great, thank you!

@asfimport

Kouhei Sutou / @kou:
OK. I've created ARROW-17560 and ARROW-17561 for them.

@asfimport

David Dali Susanibar Arce / @davisusanibar:
Hi @kou, I am trying to build JNI locally. I am able to build Arrow JNI Gandiva/ORC, but when I try to run:

cmake \
-S java \
-B java-jni \
-DCMAKE_BUILD_TYPE=Release \
-DCMAKE_INSTALL_PREFIX=java-dist/lib \
-DCMAKE_PREFIX_PATH=java-dist 

I am seeing messages like this:

CMake Error at CMakeLists.txt:60 (find_package):
  By not providing "FindArrowTesting.cmake" in CMAKE_MODULE_PATH this project
  has asked CMake to find a package configuration file provided by
  "ArrowTesting", but CMake did not find one.
  Could not find a package configuration file provided by "ArrowTesting" with
  any of the following names:
    ArrowTestingConfig.cmake
    arrowtesting-config.cmake
  Add the installation prefix of "ArrowTesting" to CMAKE_PREFIX_PATH or set
  "ArrowTesting_DIR" to a directory containing one of the above files.  If
  "ArrowTesting" provides a separate development package or SDK, be sure it
  has been installed. 

Are there some steps missing in the building.rst file?

 

Then, trying with:

cmake \
-S java \
-B java-jni \
-DCMAKE_BUILD_TYPE=Release \
-DCMAKE_INSTALL_PREFIX=java-dist/lib \
-DCMAKE_PREFIX_PATH=java-dist \
-DBUILD_TESTING=OFF 

I see an error like this:

-- Building using CMake version: 3.24.1
CMake Error at dataset/CMakeLists.txt:18 (find_package):
  By not providing "FindArrowDataset.cmake" in CMAKE_MODULE_PATH this project
  has asked CMake to find a package configuration file provided by
  "ArrowDataset", but CMake did not find one.
  Could not find a package configuration file provided by "ArrowDataset" with
  any of the following names:
    ArrowDatasetConfig.cmake
    arrowdataset-config.cmake
  Add the installation prefix of "ArrowDataset" to CMAKE_PREFIX_PATH or set
  "ArrowDataset_DIR" to a directory containing one of the above files.  If
  "ArrowDataset" provides a separate development package or SDK, be sure it
  has been installed.
 

@asfimport

Kouhei Sutou / @kou:
You need to specify -DCMAKE_PREFIX_PATH=${CMAKE_INSTALL_PREFIX_VALUE_FOR_YOUR_APACHE_ARROW_CPP} to find ArrowTestingConfig.cmake and ArrowDatasetConfig.cmake
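
One way to check whether a given prefix is the right value for CMAKE_PREFIX_PATH is to look for the package configuration files named in the errors above. The following is a minimal sketch, not part of the Arrow build: the function name `find_cmake_config` is hypothetical, and it only assumes that the config files live somewhere under the install prefix.

```shell
#!/bin/sh
# Sketch (hypothetical helper, not part of Arrow): locate the CMake package
# configuration file that find_package(<Package>) would need under a prefix.

# Usage: find_cmake_config <prefix> <Package>
# Prints the first matching <Package>Config.cmake or <package>-config.cmake
# found under <prefix>; returns non-zero if neither file exists.
find_cmake_config() {
  prefix="$1"
  package="$2"
  # CMake also accepts the lower-case "<package>-config.cmake" spelling.
  lower=$(printf '%s' "$package" | tr '[:upper:]' '[:lower:]')
  match=$(find "$prefix" -type f \
    \( -name "${package}Config.cmake" -o -name "${lower}-config.cmake" \) \
    2>/dev/null | head -n 1)
  [ -n "$match" ] && printf '%s\n' "$match"
}
```

For example, if `find_cmake_config java-dist ArrowDataset` prints nothing, then java-dist is probably not the prefix where Arrow C++ (with Dataset enabled) was installed, and -DCMAKE_PREFIX_PATH should point at the actual Arrow C++ install prefix instead.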

@asfimport

Apache Arrow JIRA Bot:
This issue was last updated over 90 days ago, which may be an indication it is no longer being actively worked. To better reflect the current state, the issue is being unassigned per project policy. Please feel free to re-take assignment of the issue if it is being actively worked, or if you plan to start that work soon.

@asfimport

Kouhei Sutou / @kou:
@davisusanibar Can we close this? Are there any more tasks to be resolved?
