SemanticDB support #1467

aishfenton · 2023-01-29T23:32:38Z

Description

Adds support for SemanticDB

Extends the work from the stale PR by @olafurpg and @ricochet1k.

Fixed to work with current HEAD
Added Scala3 support
Adds a enableSemanticDb option, that is set on the scala_toolchain method. Defaults to false. Needed, as semanticdb adds 20-50% overhead to compilation times (which not everyone will want/need).
Adds tests for semanticdb in Scala 2 & 3.

Motivation

SemanticDB is needed for many Scala tools, including Metals.

google-cla · 2023-01-29T23:32:44Z

Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

View this failed invocation of the CLA check for more information.

For the most up to date status, view the checks section at the bottom of the pull request.

aishfenton · 2023-01-30T00:58:08Z

@ricochet1k are you able to sign the Google CLA? Seems that's needed for them to accept your part of the PR.

aishfenton · 2023-01-31T02:15:25Z

Updated to add Scala3 test (and fixed Scala3 bug that exposed).

eed3si9n

Thanks @aishfenton for resurrecting @olafurpg's #1329!

We are in SemanticDB starvation with Scala and Bazel (Metals doesn't work etc), and letting scala_library generate SemanticDB monorepo-wide, I think will get the whole thing moving in the right direction.

FTR I added SemanticDB support in sbt 1.3.0 with the following settings https://www.scala-sbt.org/1.x/docs/sbt-1.3-Release-Notes.html#SemanticDB+support:

ThisBuild / semanticdbEnabled := true
ThisBuild / semanticdbVersion := "4.1.9"
ThisBuild / semanticdbIncludeInJar := false

Regarding rule vs aspect discussion from the previous PR, I think this is a better approach than using aspect since SemanticDB is basically a compiler plugin that generates some binary on the side, so making some tool build with aspect would effectively be a duplicated compilation effort on CI (or more likely on someone's laptop trying to use Metals).

ricochet1k · 2023-01-31T20:06:38Z

@ricochet1k are you able to sign the Google CLA? Seems that's needed for them to accept your part of the PR.

Strange, I thought I did already. Tried again.

aishfenton · 2023-01-31T20:18:58Z

That did the trick @ricochet1k 👍

eed3si9n · 2023-01-31T20:36:00Z

third_party/repositories/scala_2_13.bzl

@@ -81,6 +81,10 @@ artifacts = {
            "@io_bazel_rules_scala_scala_library",
        ],
    },
+    "org_scalameta_semanticdb_scalac": {
+        "artifact": "org.scalameta:semanticdb-scalac_2.13.6:4.7.3",


Given that ScalaMeta version is repeated a few times in this file:

"org_scalameta_scalameta": { "artifact": "org.scalameta:scalameta_2.13:4.3.24", .... "org_scalameta_trees": { "artifact": "org.scalameta:trees_2.13:4.3.24",

I suggest that it should be factored out as a variable similar to scala_version at line 1:

scala_version = "2.13.6"

and change the above to

"artifact": "org.scalameta:semanticdb-scalac_{}:{}".format(scala_version, scalameta_version),

Will do in follow up PR. Seems to have a few gotchas, as it also requires ScalaFmt rules to be updated.

From my own experience - keeping versions duplicated allows simple grep for artifact coords

liucijus · 2023-02-08T12:47:33Z

@aishfenton I will review this next week, sorry for the delay

liucijus · 2023-02-17T11:08:51Z

@aishfenton thanks for taking this!

Do you already use SemanticDB with Rules Scala? If yes, I would like to understand how tools consume such output.

Regarding impl, have you consider implementing it as a phase? Eg. scalafmt phase. I'd like to understand if we can avoid adding additional logic to ScalacWorker and keep it within starlark files. At Wix we have macro which adds semantic db generation (authored by @martynasp-wix), which is used for refactoring automation. Here's the gist of it (most likely can be trimmed to much shorter impl by dropping configuration checks):

def _append_parameters_to_list(kwargs, name, values_to_add):
    values = kwargs.pop(name, [])
    kwargs[name] = values + values_to_add

def _add_semanticdb_support(**kwargs):
    semanticdb_plugins = ["@org_scalameta_semanticdb_scalac_2_12_15"]
    semanticdb_scalacopts = [
        "-P:semanticdb:synthetics:on",
    ]

    is_main_repo = native.repository_name() == "@"
    build_semanticdb = kwargs.pop("build_semanticdb", None)

    if build_semanticdb:
        scalacopts = semanticdb_scalacopts
        plugins = semanticdb_plugins
    elif build_semanticdb == False or not is_main_repo:
        scalacopts = []
        plugins = []
    else:
        plugins = select({
            "@wix_tools//dependencies/rules_scala:with_semanticdb": semanticdb_plugins,
            "//conditions:default": [],
        })

        scalacopts = select({
            "@wix_tools//dependencies/rules_scala:with_semanticdb": semanticdb_scalacopts,
            "//conditions:default": [],
        })

    _append_parameters_to_list(kwargs, "plugins", plugins)
    _append_parameters_to_list(kwargs, "scalacopts", scalacopts)

    return kwargs

def with_meta(scala_rule, *args, **kwargs):
    scala_rule(*args, **_add_semanticdb_support(**kwargs))

aishfenton · 2023-02-20T16:21:09Z

Thanks @liucijus ! I went with modifying the ScalaWorker class, as Scala3 has SemanticDB built into the compiler (it now gets enabled as a standard scalac option, not a plugin). So I was thinking it'd make the most sense to treat that similar to other first-class scalac options. But that leaves the question for what to do for Scala2... And there I was somewhat hesitant to create a divergence between how 2.x and 3.x work.

Another consideration is that often users will need to switch SemanticDB on/off in a single build (IDE/tools need it, but it adds a 30-50% compile overhead, so generally users will want it to default to off). At the moment I'm supporting that by

Making the enableSemanticDb flag a toolchain param
Setting up two toolchains (one with it enabled, one with it disabled)
Making use of Bazel's --extra_toolchains to switch on/off as required.

Is there a similar mechanism we could exploit for phases? (without having to create two targets for everything).

aishfenton · 2023-02-21T00:38:02Z

Regarding how generally they're consumed by the tools. I'm only familiar with Metals. But my guess is that most tools would be similar in how they're using it. Metals does the following.

It asks the build system (via different adaptors, for different build systems) to compile a specific target
It looks in a user specified "class folder" to find the semanticdb files, alongside each compiled class file.

The last point is going to require a couple of changes on Metal's side to be compatible with Bazel. As we'll need to pass it a back collection of Jars, rather than a single exploded directory (working on that separately with them).

I think the most important consideration is that enabling semanticdb can slow down the compile task quite a lot (the scalameta team mentioned 30-50% to me, but obviously it depends on complexity of the code). So it's going be to be important to be able to toggle it on/off depending on the caller.

liucijus · 2023-02-21T08:29:03Z

Thanks @liucijus ! I went with modifying the ScalaWorker class, as Scala3 has SemanticDB built into the compiler (it now gets enabled as a standard scalac option, not a plugin). So I was thinking it'd make the most sense to treat that similar to other first-class scalac options. But that leaves the question for what to do for Scala2... And there I was somewhat hesitant to create a divergence between how 2.x and 3.x work.

I think phase can make a decision which parameters to pass based on configured scala version. The fact that we had to split compiler code for 2 and 3 already complicates maintenance, I'd like to remove code from there where it is possible.

Another consideration is that often users will need to switch SemanticDB on/off in a single build (IDE/tools need it, but it adds a 30-50% compile overhead, so generally users will want it to default to off). At the moment I'm supporting that by
* Making the `enableSemanticDb` flag a toolchain param

* Setting up two toolchains (one with it enabled, one with it disabled)

* Making use of Bazel's `--extra_toolchains` to switch on/off as required.
Is there a similar mechanism we could exploit for phases? (without having to create two targets for everything).

What works for us with dependency tracking is filters on toolchain. It does not have to be on the toolchain, but you need to have a way to enable it per target.

aishfenton · 2023-02-21T17:49:27Z

@liucijus , can you explain a bit more on how filters could solve this?

From what I can grok, filters seem to be about enabling something for a subset of targets, whereas the situation here is that we want it to enable/disable it for all targets (although doing it for a subset of targets could be nice too, but that hasn't come up as a use-case yet). And we need to be able to do that from the Bazel command line (without changing any BUILD files).

aishfenton · 2023-02-21T17:56:22Z

FWIW. I think conceptually semanticdb is quite similar to diagnosticsproto. It's an additional diagnostic output from the compiler, for IDE/tool consumers. Generally you want it on/off for all scalac targets. But you need to be able to make that switch at the command line (which you can do with diagnosticsproto via Bazel's --extra_toolchains option).

(i.e. so that IDEs can include --extra_toolchains option, to get the additional diagnostic outputs. Whereas normal runs of bazel won't have it enabled).

aishfenton · 2023-02-21T18:13:56Z

Just to run the design change past you, for phases (to make sure I've got it right).

Create a new phase, phase_semanticdb.bzl
The phase would check the toolchain to see if semanticdb is enabled, or not.
It'd then set the appropriate p.scalacopt options depending on that (and setting different ones for Scala2/3).

We'd include that phase as a standard phase (unlike phase_scalafmt)? Since it's enabled/disabled by the toolchain, rather than by specifying a custom set of phases.

Also want check an alternative design with you .. just in case.

Modify, phase_compile.bzl to see if the toolchain hassemanticdb enabled, or not.
Set the appropriate scalacopt options, before calling the scalac worker (and setting different ones for Scala2/3).

Wondering if that'd be clearer, since the semanticdb phase above isn't really doing much more than a pass through. (and this'd be more consistent with diagnosticproto).

liucijus · 2023-02-23T08:02:15Z

@liucijus , can you explain a bit more on how filters could solve this?

From what I can grok, filters seem to be about enabling something for a subset of targets, whereas the situation here is that we want it to enable/disable it for all targets (although doing it for a subset of targets could be nice too, but that hasn't come up as a use-case yet). And we need to be able to do that from the Bazel command line (without changing any BUILD files).

Sorry for not being clear. What I had in mind is that rebuilding all the graph might be expensive, so filtering is a nice way to limit the impact. I'm not saying that it's needed in case of semantic db, but dropping some food for thought.

liucijus · 2023-02-23T09:06:21Z

Just to run the design change past you, for phases (to make sure I've got it right).
* Create a new phase, `phase_semanticdb.bzl`

* The phase would check the toolchain to see if `semanticdb` is enabled, or not.

* It'd then set the appropriate `p.scalacopt` options depending on that (and setting different ones for Scala2/3).
We'd include that phase as a standard phase (unlike phase_scalafmt)? Since it's enabled/disabled by the toolchain, rather than by specifying a custom set of phases.

Also want check an alternative design with you .. just in case.
* Modify, `phase_compile.bzl` to see if the toolchain has`semanticdb` enabled, or not.

* Set the appropriate `scalacopt` options, before calling the scalac worker (and setting different ones for Scala2/3).
Wondering if that'd be clearer, since the semanticdb phase above isn't really doing much more than a pass through. (and this'd be more consistent with diagnosticproto).

I may be wrong but I think current phase implementation forces to have the following design:

Have a separate phase (phase_semanticdb.bzl) to check if semantic db is enabled. Can be simple toolchain attribute check. Separate phase allows it to be replaced with different logic if Rules Scala users want it.
(Unfortunately) at some other phase (maybe phase_compile.bzl) you need to translate output of sematicdb phase decision to scalacopts

Btw, pass through phases may look like they do almost nothing, but they are extension points which can be changed to a different logic by the users. For example filtering idea discussed above, can be implemented as a custom phase.

aishfenton · 2023-03-12T22:43:11Z

@liucijus had a go at those changes. Is that what you had in mind ? (one test environment is failing for bazel-green-head ... but not sure if that's related to my changes, as seems to be about scala_proto? 🤔 ).

liucijus

@aishfenton my expectations are to have all the sematic db logic implemented in starlark without anything added to worker classes. If you think this is not doable for some reason - let me know.

liucijus · 2023-03-15T11:15:58Z

src/java/io/bazel/rulesscala/scalac/ScalacWorker.java

@@ -243,6 +243,17 @@ private static String[] getPluginParamsFrom(CompileOptions ops) {
            "-P:dependency-analyzer:unused-deps-ignored-targets:"
                + String.join(":", ignoredTargets));
      }
+    } 
+
+    if (ops.enableSemanticDb) {


@aishfenton can this decision be made in one of the phases? My goal is to not have additional logic in worker code.

liucijus · 2023-03-15T11:38:44Z

src/java/io/bazel/rulesscala/scalac/ScalacWorker.java

+      String[] params = {
+        "-Yrangepos",
+        "-P:semanticdb:failures:error",
+        "-Xplugin-require:semanticdb"


these ops could be part of sematicdb phase. Eg.

def phase_semanticdb(ctx, p): ... return struct( enabled = enable_semanticdb, scalacopts = [ "-Yrangepos", "-P:semanticdb:failures:error", "-Xplugin-require:semanticdb" ] ) ...

then this phase can be consumed by compile phase to augment other scalacopts

crt-31 · 2023-03-21T02:04:29Z

I tried out the changes in aishfenton’s pull request. I was able to get it to work with a Metals BSP prototype I’m experimenting with. Here’s some of my feedback:

All the generated *.semanticdb should be declared as action outputs so that they work with bazel’s caching.
setup_scala_toolchain() should take a parameter for the semanticdb_scalac classpath (just like scala_compile_classpath and scala_library_classpath). This would allow to customize the semanticdb lib we want to use in this function.
The semanticdb version is always being set to 2.13.6 when major version is 2.13, but it should use the version that the compiler is using. (Looks like @luicijus already made a review comment related to this).
Might suggest there be an error if semanticdb_scalac depprovider has more than one jar dependency.
Might wanna support user ability to add additional semanticdb options if they want. (Maybe this is something that is already available using the existing scala_toolchain.scalaopts parameter?)

On Metals integration: I think that as long as an aspect can get the semanticdb jar path (which in the PR can be done using the semanticdb_scalac depprovider), a metals bsp adapter has mostly what it needs. Metals will ask for the scalacopts for a target, and then parses them to determine if semanticdb is being used (if option starts with “-Xplugin:” and contians “semanticdb-scalac”) as well as using the targetroot and sourceroot options. I would imagine, that a metals bsp adapter would be able to give Metals a groomed set of options that sets the -Xplugin, targetroot, and sourceroot to the correct values for Metals, even if bazel uses different options… which would be the case with relative path differences. Long way of saying I’m thinking the PR gives what is needed for a Metals integration as it is.

crt-31 · 2023-11-30T22:38:02Z

This ticket can probably be closed now. SemanitcDB support was added in #1508

aishfenton requested review from liucijus and simuons as code owners January 29, 2023 23:32

eed3si9n reviewed Jan 31, 2023

View reviewed changes

aishfenton mentioned this pull request Jan 31, 2023

Add support for SemanticDB bazeltools/bsp4bazel#31

Closed

eed3si9n reviewed Jan 31, 2023

View reviewed changes

liucijus requested changes Mar 15, 2023

View reviewed changes

tanishiking mentioned this pull request Apr 12, 2023

Bazel Support scalameta/metals#5138

Closed

7 tasks

crt-31 mentioned this pull request Aug 9, 2023

Add SemanticDB support - cont. #1508

Merged

aishfenton closed this Dec 2, 2023

aishfenton force-pushed the master branch from 6de925d to f938141 Compare December 2, 2023 18:02

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

SemanticDB support #1467

SemanticDB support #1467

aishfenton commented Jan 29, 2023 •

edited

Loading

google-cla bot commented Jan 29, 2023

aishfenton commented Jan 30, 2023

aishfenton commented Jan 31, 2023 •

edited

Loading

eed3si9n left a comment

ricochet1k commented Jan 31, 2023

aishfenton commented Jan 31, 2023

eed3si9n Jan 31, 2023

aishfenton Feb 4, 2023

liucijus Feb 17, 2023

liucijus commented Feb 8, 2023

liucijus commented Feb 17, 2023

aishfenton commented Feb 20, 2023 •

edited

Loading

aishfenton commented Feb 21, 2023 •

edited

Loading

liucijus commented Feb 21, 2023 •

edited

Loading

aishfenton commented Feb 21, 2023 •

edited

Loading

aishfenton commented Feb 21, 2023 •

edited

Loading

aishfenton commented Feb 21, 2023 •

edited

Loading

liucijus commented Feb 23, 2023

liucijus commented Feb 23, 2023

aishfenton commented Mar 12, 2023

liucijus left a comment

liucijus Mar 15, 2023

liucijus Mar 15, 2023

crt-31 commented Mar 21, 2023

crt-31 commented Nov 30, 2023

SemanticDB support #1467

SemanticDB support #1467

Conversation

aishfenton commented Jan 29, 2023 • edited Loading

Description

Motivation

google-cla bot commented Jan 29, 2023

aishfenton commented Jan 30, 2023

aishfenton commented Jan 31, 2023 • edited Loading

eed3si9n left a comment

Choose a reason for hiding this comment

ricochet1k commented Jan 31, 2023

aishfenton commented Jan 31, 2023

eed3si9n Jan 31, 2023

Choose a reason for hiding this comment

aishfenton Feb 4, 2023

Choose a reason for hiding this comment

liucijus Feb 17, 2023

Choose a reason for hiding this comment

liucijus commented Feb 8, 2023

liucijus commented Feb 17, 2023

aishfenton commented Feb 20, 2023 • edited Loading

aishfenton commented Feb 21, 2023 • edited Loading

liucijus commented Feb 21, 2023 • edited Loading

aishfenton commented Feb 21, 2023 • edited Loading

aishfenton commented Feb 21, 2023 • edited Loading

aishfenton commented Feb 21, 2023 • edited Loading

liucijus commented Feb 23, 2023

liucijus commented Feb 23, 2023

aishfenton commented Mar 12, 2023

liucijus left a comment

Choose a reason for hiding this comment

liucijus Mar 15, 2023

Choose a reason for hiding this comment

liucijus Mar 15, 2023

Choose a reason for hiding this comment

crt-31 commented Mar 21, 2023

crt-31 commented Nov 30, 2023

aishfenton commented Jan 29, 2023 •

edited

Loading

aishfenton commented Jan 31, 2023 •

edited

Loading

aishfenton commented Feb 20, 2023 •

edited

Loading

aishfenton commented Feb 21, 2023 •

edited

Loading

liucijus commented Feb 21, 2023 •

edited

Loading

aishfenton commented Feb 21, 2023 •

edited

Loading

aishfenton commented Feb 21, 2023 •

edited

Loading

aishfenton commented Feb 21, 2023 •

edited

Loading