Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improved layer grouping #1326

Merged
merged 2 commits into from
Apr 27, 2020
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
51 changes: 33 additions & 18 deletions src/main/scala/com/typesafe/sbt/packager/docker/DockerPlugin.scala
Original file line number Diff line number Diff line change
Expand Up @@ -98,19 +98,25 @@ object DockerPlugin extends AutoPlugin {
),
dockerUpdateLatest := false,
dockerAutoremoveMultiStageIntermediateImages := true,
dockerLayerGrouping := {
dockerLayerGrouping := { _: String =>
None
},
dockerGroupLayers := {
val dockerBaseDirectory = (defaultLinuxInstallLocation in Docker).value
(path: String) =>
{
val pathInWorkdir = path.stripPrefix(dockerBaseDirectory)
if (pathInWorkdir.startsWith(s"/lib/${organization.value}"))
Some(2)
else if (pathInWorkdir.startsWith("/lib/"))
Some(1)
else if (pathInWorkdir.startsWith("/bin/"))
Some(1)
else None
}
// Ensure this doesn't break even if the JvmPlugin isn't enabled.
val artifacts = projectDependencyArtifacts.?.value.getOrElse(Nil).map(_.data).toSet
val oldFunction = (dockerLayerGrouping in Docker).value

// By default we set this to a function that always returns None.
val oldPartialFunction = Function.unlift((tuple: (File, String)) => oldFunction(tuple._2))

val libDir = dockerBaseDirectory + "/lib/"
val binDir = dockerBaseDirectory + "/bin/"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could you also add "/conf/" to the same layer? I noticed that /conf/application.ini gets created when javaOptions are used.

      javaOptions in Universal := Seq(
      "-Djava.net.preferIPv4Stack=true",
      "-Dlogback.configurationFile=logback.packaged.xml"
      ),


oldPartialFunction.orElse {
case (file, _) if artifacts(file) => 2
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for me artifacts.contains would be more explicit.

case (_, path) if path.startsWith(libDir) || path.startsWith(binDir) => 1
}
},
dockerAliases := {
val alias = dockerAlias.value
Expand Down Expand Up @@ -159,7 +165,8 @@ object DockerPlugin extends AutoPlugin {
val multiStageId = UUID.randomUUID().toString
val generalCommands = makeFromAs(base, "mainstage") +: makeMaintainer((maintainer in Docker).value).toSeq
val stage0name = "stage0"
val layerIdsAscending = (dockerLayerMappings in Docker).value.map(_.layerId).distinct.sorted
val layerMappings = (dockerLayerMappings in Docker).value
val layerIdsAscending = layerMappings.map(_.layerId).distinct.sorted
val stage0: Seq[CmdLike] = strategy match {
case DockerPermissionStrategy.MultiStage =>
Seq(
Expand All @@ -172,10 +179,18 @@ object DockerPlugin extends AutoPlugin {
Seq(makeUser("root")) ++ layerIdsAscending.map(
l => makeChmodRecursive(dockerChmodType.value, Seq(pathInLayer(dockerBaseDirectory, l)))
) ++ {
val layerToPath = (dockerLayerGrouping in Docker).value
val layerToPath = (dockerGroupLayers in Docker).value
addPerms map {
case (tpe, v) =>
val layerId = layerToPath(v)
// Try and find the source file for the path from the mappings
val layerId = layerMappings
.find(_.path == v)
.map(_.layerId)
.getOrElse {
// We couldn't find a source file for the mapping, so try with a dummy source file,
// in case there is an explicitly configured path based layer mapping, eg for a directory.
layerToPath.lift((new File("/dev/null"), v))
}
makeChmod(tpe, Seq(pathInLayer(v, layerId)))
}
} ++
Expand Down Expand Up @@ -263,11 +278,11 @@ object DockerPlugin extends AutoPlugin {
stage := (stage dependsOn dockerGenerateConfig).value,
stagingDirectory := (target in Docker).value / "stage",
dockerLayerMappings := {
val dockerGroups = dockerLayerGrouping.value
val dockerGroups = dockerGroupLayers.value
val dockerFinalFiles = (mappings in Docker).value
for {
(file, path) <- dockerFinalFiles
layerIdx = dockerGroups(path)
mapping @ (file, path) <- dockerFinalFiles
layerIdx = dockerGroups.lift(mapping)
} yield LayeredMapping(layerIdx, file, path)
},
target := target.value / "docker",
Expand Down
5 changes: 5 additions & 0 deletions src/main/scala/com/typesafe/sbt/packager/docker/Keys.scala
Original file line number Diff line number Diff line change
Expand Up @@ -57,10 +57,15 @@ private[packager] trait DockerKeysEx extends DockerKeys {
lazy val dockerAdditionalPermissions =
taskKey[Seq[(DockerChmodType, String)]]("Explicit chmod calls to some of the paths.")
val dockerApiVersion = TaskKey[Option[DockerApiVersion]]("dockerApiVersion", "The docker server api version")
@deprecated("Use dockerGroupLayers instead", "1.7.1")
val dockerLayerGrouping = settingKey[String => Option[Int]](
"Group files by path into in layers to increase docker cache hits. " +
"Lower index means the file would be a part of an earlier layer."
)
val dockerGroupLayers = taskKey[PartialFunction[(File, String), Int]](
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not very convinced to expose File as this function argument.

  1. Not sure how plugin users will do matching on file, but most likely they end up comparing location/path
  2. It will simplify chmod related code that will give up find.getOrElse and /dev/null and become simply layerToPath.lift(...)
  3. To make it work, just simple change is necessary: sth like .map(_.data.path) in body of dockerGroupLayers

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure how plugin users will do matching on file, but most likely they end up comparing location/path

I think there are use cases for either. With the path, the user has to know exactly the structure of the Docker image, including what that base directory is. The file I think makes more sense because it's the file that's changing, not the path. I know that this source file changes frequently, so every mapping that includes this source file, I want to put it in a higher layer. That's more direct and robust then having to work out what the destination path for that file is, and then matching based on that.

It will simplify chmod related code that will give up find.getOrElse and /dev/null and become simply layerToPath.lift(...)

Yes, but that code isn't that complex, and I think the advantages outweigh the minuses here.

To make it work, just simple change is necessary: sth like .map(_.data.path) in body of dockerGroupLayers

No, that's the point, that doesn't work, and that's why the File is being passed. If I want to match the slf4j api jar, the path for that is /home/jroper/.ivy2/cache/org.slf4j/slf4j-api/jars/slf4j-api-1.7.30.jar, meanwhile, the path that gets passed into the function is opt/docker/lib/org.slf4j.slf4j-api-1.7.30.jar. They are completely different paths, even the file name doesn't match. So, in order to match based on the path, I need to know the completely undocumented method that sbt native packager uses to construct that file name, I also need to know the completely undocumented method that sbt native packager uses to construct the the path to it. What if sbt native packager ever changes that logic? Then my logic for matching will break. In contrast, this logic will never break, no matter what sbt native packager does:

dockerGroupLayers := {
  val slf4jFile = for {
    artifact <- externalDependencyClasspath.value
    moduleId <- artifact.metadata.get(AttributeKey[ModuleID]("moduleID"))
    if moduleId.organization == "org.slf4j" && moduleId.name == "slf4j-api"
  } yield artifact.data
  dockerGroupLayers.value.orElse {
    case (`slf4jFile`, _) => Some(2)
  }
}

Perhaps a bit wordy, but if we find it's popular, we could always add a helper function to sbt and then they could use:

dockerGroupLayers := {
  val slf4jFile = getClasspathFileFor(externalDependencyClasspath.value, "org.slf4j", "slf4j-api")
  dockerGroupLayers.value.orElse {
    case (`slf4jFile`, _) => Some(2)
  }
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

dockerGroupLayers := {
  val slf4jFile = for {
    artifact <- externalDependencyClasspath.value
    moduleId <- artifact.metadata.get(AttributeKey[ModuleID]("moduleID"))
    if moduleId.organization == "org.slf4j" && moduleId.name == "slf4j-api"
  } yield artifact.data
  dockerGroupLayers.value.orElse {
    case (`slf4jFile`, _) => Some(2)
  }
}

maybe we can add this to the docker.rst docs as well?

"Group files by mapping into layers to increase docker cache hits. " +
"Lower index means the file would be a part of an earlier layer."
)
val dockerLayerMappings =
taskKey[Seq[LayeredMapping]]("List of layer, source file and destination in Docker image.")
}
4 changes: 3 additions & 1 deletion src/sbt-test/docker/test-layer-groups/build.sbt
Original file line number Diff line number Diff line change
Expand Up @@ -6,5 +6,7 @@ version := "0.1.0"

dockerPackageMappings in Docker ++= Seq(
(baseDirectory.value / "docker" / "spark-env.sh") -> "/opt/docker/spark/spark-env.sh",
(baseDirectory.value / "docker" / "log4j.properties") -> "/opt/docker/spark/log4j.properties"
(baseDirectory.value / "docker" / "log4j.properties") -> "/opt/docker/other/log4j.properties"
)

libraryDependencies += "org.slf4j" % "slf4j-api" % "1.7.30"
2 changes: 1 addition & 1 deletion src/sbt-test/docker/test-layer-groups/changes/nolayers.sbt
Original file line number Diff line number Diff line change
@@ -1 +1 @@
dockerLayerGrouping in Docker := (_ => None)
dockerGroupLayers in Docker := PartialFunction.empty
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

16 changes: 4 additions & 12 deletions src/sbt-test/docker/test-layer-groups/layers.sbt
Original file line number Diff line number Diff line change
@@ -1,14 +1,6 @@
dockerLayerGrouping in Docker := {
dockerGroupLayers in Docker := {
val dockerBaseDirectory = (defaultLinuxInstallLocation in Docker).value
(path: String) =>
{
val pathInWorkdir = path.stripPrefix(dockerBaseDirectory)
if (pathInWorkdir.startsWith(s"/lib/${organization.value}"))
Some(2)
else if (pathInWorkdir.startsWith("/bin/"))
Some(123)
else if (pathInWorkdir.startsWith("/spark/"))
Some(54)
else None
}
(dockerGroupLayers in Docker).value.orElse {
case (_, path) if path.startsWith(dockerBaseDirectory + "/spark/") => 54
}
}
18 changes: 14 additions & 4 deletions src/sbt-test/docker/test-layer-groups/test
Original file line number Diff line number Diff line change
@@ -1,17 +1,27 @@
# Generate the Docker image locally
> docker:publishLocal
$ exists target/docker/stage/Dockerfile
$ exists target/docker/stage/123/opt/docker/bin/docker-groups
$ exists target/docker/stage/opt
$ exists target/docker/stage/1/opt/docker/bin/docker-groups
$ exists target/docker/stage/1/opt/docker/lib/org.slf4j.slf4j-api-1.7.30.jar
-$ exists target/docker/stage/1/opt/docker/lib/com.example.docker-groups-0.1.0.jar
$ exists target/docker/stage/opt/docker/other
$ exists target/docker/stage/2
-$ exists target/docker/stage/2/opt/docker/lib/org.slf4j.slf4j-api-1.7.30.jar
$ exists target/docker/stage/2/opt/docker/lib/com.example.docker-groups-0.1.0.jar
$ exists target/docker/stage/54/opt/docker/spark

$ exec bash -c 'docker run --rm --entrypoint=ls docker-groups:0.1.0 |tr "\n" "," | grep -q "bin,lib,spark"'
$ exec bash -c 'docker run --rm --entrypoint=ls docker-groups:0.1.0 |tr "\n" "," | grep -q "bin,lib,other,spark"'
$ exec bash -c 'docker rmi docker-groups:0.1.0'

$ copy-file changes/nolayers.sbt layers.sbt
> reload
> clean
> docker:publishLocal
$ exists target/docker/stage/opt/docker/bin
$ exists target/docker/stage/opt/docker/spark
$ exec bash -c 'docker run --rm --entrypoint=ls docker-groups:0.1.0 |tr "\n" "," | grep -q "bin,lib,spark"'
-$ exists target/docker/stage/1
-$ exists target/docker/stage/2
-$ exists target/docker/stage/54
$ exists target/docker/stage/opt/docker/lib/org.slf4j.slf4j-api-1.7.30.jar
$ exists target/docker/stage/opt/docker/lib/com.example.docker-groups-0.1.0.jar
$ exec bash -c 'docker run --rm --entrypoint=ls docker-groups:0.1.0 |tr "\n" "," | grep -q "bin,lib,other,spark"'