Skip to content

Commit

Permalink
Use base Python for cached environments (#11208)
Browse files Browse the repository at this point in the history
## Summary

It turns out that we were returning slightly different interpreter paths
on repeated `uv run --with` commands. This likely didn't affect many (or
any?) users, but it does affect our test suite, since in the test suite,
we use a symlinked interpreter.

The issue is that on first invocation, we create the virtual
environment, and that returns the path to the `python` executable in the
environment. On second invocation, we return the `python3` executable,
since that gets priority during discovery. This on its own is
potentially ok. The issue is that these resolve to different
`sys._base_executable` values in these flows... The latter gets the
correct value (since it's read from the `home` key), but the former gets
the incorrect value (since it's just the `base_executable` of the
executable that created the virtualenv, which is the symlink).

We now use the same logic to determine the "cached interpreter" as in
virtual environment creation, to ensure consistency between those paths.
  • Loading branch information
charliermarsh authored Feb 4, 2025
1 parent ec480bd commit 34552e2
Show file tree
Hide file tree
Showing 6 changed files with 201 additions and 146 deletions.
138 changes: 122 additions & 16 deletions crates/uv-python/src/interpreter.rs
Original file line number Diff line number Diff line change
Expand Up @@ -12,7 +12,7 @@ use owo_colors::OwoColorize;
use same_file::is_same_file;
use serde::{Deserialize, Serialize};
use thiserror::Error;
use tracing::{trace, warn};
use tracing::{debug, trace, warn};

use uv_cache::{Cache, CacheBucket, CachedByTimestamp, Freshness};
use uv_cache_info::Timestamp;
Expand Down Expand Up @@ -120,23 +120,39 @@ impl Interpreter {
})
}

/// Return the [`Interpreter`] for the base executable, if it's available.
///
/// If no such base executable is available, or if the base executable is the same as the
/// current executable, this method returns `None`.
pub fn to_base_interpreter(&self, cache: &Cache) -> Result<Option<Self>, Error> {
if let Some(base_executable) = self
.sys_base_executable()
.filter(|base_executable| *base_executable != self.sys_executable())
{
match Self::query(base_executable, cache) {
Ok(base_interpreter) => Ok(Some(base_interpreter)),
Err(Error::NotFound(_)) => Ok(None),
Err(err) => Err(err),
/// Determine the base Python executable; that is, the Python executable that should be
/// considered the "base" for the virtual environment. This is typically the Python executable
/// from the [`Interpreter`]; however, if the interpreter is a virtual environment itself, then
/// the base Python executable is the Python executable of the interpreter's base interpreter.
pub fn to_base_python(&self) -> Result<PathBuf, io::Error> {
let base_executable = self.sys_base_executable().unwrap_or(self.sys_executable());
let base_python = if cfg!(unix) && self.is_standalone() {
// In `python-build-standalone`, a symlinked interpreter will return its own executable path
// as `sys._base_executable`. Using the symlinked path as the base Python executable can be
// incorrect, since it could cause `home` to point to something that is _not_ a Python
// installation. Specifically, if the interpreter _itself_ is symlinked to an arbitrary
// location, we need to fully resolve it to the actual Python executable; however, if the
// entire standalone interpreter is symlinked, then we can use the symlinked path.
//
// We emulate CPython's `getpath.py` to ensure that the base executable results in a valid
// Python prefix when converted into the `home` key for `pyvenv.cfg`.
match find_base_python(
base_executable,
self.python_major(),
self.python_minor(),
self.variant().suffix(),
) {
Ok(path) => path,
Err(err) => {
warn!("Failed to find base Python executable: {err}");
uv_fs::canonicalize_executable(base_executable)?
}
}
} else {
Ok(None)
}
std::path::absolute(base_executable)?
};

Ok(base_python)
}

/// Returns the path to the Python virtual environment.
Expand Down Expand Up @@ -890,6 +906,96 @@ impl InterpreterInfo {
}
}

/// Find the Python executable that should be considered the "base" for a virtual environment.
///
/// Assumes that the provided executable is that of a standalone Python interpreter.
///
/// The strategy here mimics that of `getpath.py`: we search up the ancestor path to determine
/// whether a given executable will convert into a valid Python prefix; if not, we resolve the
/// symlink and try again.
///
/// This ensures that:
///
/// 1. We avoid using symlinks to arbitrary locations as the base Python executable. For example,
/// if a user symlinks a Python _executable_ to `/Users/user/foo`, we want to avoid using
/// `/Users/user` as `home`, since it's not a Python installation, and so the relevant libraries
/// and headers won't be found when it's used as the executable directory.
/// See: <https://github.com/python/cpython/blob/a03efb533a58fd13fb0cc7f4a5c02c8406a407bd/Modules/getpath.py#L367-L400>
///
/// 2. We use the "first" resolved symlink that _is_ a valid Python prefix, and thereby preserve
/// symlinks. For example, if a user symlinks a Python _installation_ to `/Users/user/foo`, such
/// that `/Users/user/foo/bin/python` is the resulting executable, we want to use `/Users/user/foo`
/// as `home`, rather than resolving to the symlink target. Concretely, this allows users to
/// symlink patch versions (like `cpython-3.12.6-macos-aarch64-none`) to minor version aliases
/// (like `cpython-3.12-macos-aarch64-none`) and preserve those aliases in the resulting virtual
/// environments.
///
/// See: <https://github.com/python/cpython/blob/a03efb533a58fd13fb0cc7f4a5c02c8406a407bd/Modules/getpath.py#L591-L594>
fn find_base_python(
executable: &Path,
major: u8,
minor: u8,
suffix: &str,
) -> Result<PathBuf, io::Error> {
/// Returns `true` if `path` is the root directory.
fn is_root(path: &Path) -> bool {
let mut components = path.components();
components.next() == Some(std::path::Component::RootDir) && components.next().is_none()
}

/// Determining whether `dir` is a valid Python prefix by searching for a "landmark".
///
/// See: <https://github.com/python/cpython/blob/a03efb533a58fd13fb0cc7f4a5c02c8406a407bd/Modules/getpath.py#L183>
fn is_prefix(dir: &Path, major: u8, minor: u8, suffix: &str) -> bool {
if cfg!(windows) {
dir.join("Lib").join("os.py").is_file()
} else {
dir.join("lib")
.join(format!("python{major}.{minor}{suffix}"))
.join("os.py")
.is_file()
}
}

let mut executable = Cow::Borrowed(executable);

loop {
debug!(
"Assessing Python executable as base candidate: {}",
executable.display()
);

// Determine whether this executable will produce a valid `home` for a virtual environment.
for prefix in executable.ancestors().take_while(|path| !is_root(path)) {
if is_prefix(prefix, major, minor, suffix) {
return Ok(executable.into_owned());
}
}

// If not, resolve the symlink.
let resolved = fs_err::read_link(&executable)?;

// If the symlink is relative, resolve it relative to the executable.
let resolved = if resolved.is_relative() {
if let Some(parent) = executable.parent() {
parent.join(resolved)
} else {
return Err(io::Error::new(
io::ErrorKind::Other,
"Symlink has no parent directory",
));
}
} else {
resolved
};

// Normalize the resolved path.
let resolved = uv_fs::normalize_absolute_path(&resolved)?;

executable = Cow::Owned(resolved);
}
}

#[cfg(unix)]
#[cfg(test)]
mod tests {
Expand Down
3 changes: 3 additions & 0 deletions crates/uv-python/src/lib.rs
Original file line number Diff line number Diff line change
Expand Up @@ -65,6 +65,9 @@ pub(crate) fn current_dir() -> Result<std::path::PathBuf, std::io::Error> {

#[derive(Debug, Error)]
pub enum Error {
#[error(transparent)]
Io(#[from] std::io::Error),

#[error(transparent)]
VirtualEnv(#[from] virtualenv::Error),

Expand Down
2 changes: 1 addition & 1 deletion crates/uv-virtualenv/Cargo.toml
Original file line number Diff line number Diff line change
Expand Up @@ -24,8 +24,8 @@ uv-fs = { workspace = true }
uv-platform-tags = { workspace = true }
uv-pypi-types = { workspace = true }
uv-python = { workspace = true }
uv-version = { workspace = true }
uv-shell = { workspace = true }
uv-version = { workspace = true }

fs-err = { workspace = true }
itertools = { workspace = true }
Expand Down
128 changes: 4 additions & 124 deletions crates/uv-virtualenv/src/virtualenv.rs
Original file line number Diff line number Diff line change
@@ -1,15 +1,14 @@
//! Create a virtual environment.
use std::borrow::Cow;
use std::env::consts::EXE_SUFFIX;
use std::io;
use std::io::{BufWriter, Write};
use std::path::{Path, PathBuf};
use std::path::Path;

use fs_err as fs;
use fs_err::File;
use itertools::Itertools;
use tracing::{debug, warn};
use tracing::debug;

use uv_fs::{cachedir, Simplified, CWD};
use uv_pypi_types::Scheme;
Expand Down Expand Up @@ -56,37 +55,8 @@ pub(crate) fn create(
seed: bool,
) -> Result<VirtualEnvironment, Error> {
// Determine the base Python executable; that is, the Python executable that should be
// considered the "base" for the virtual environment. This is typically the Python executable
// from the [`Interpreter`]; however, if the interpreter is a virtual environment itself, then
// the base Python executable is the Python executable of the interpreter's base interpreter.
let base_executable = interpreter
.sys_base_executable()
.unwrap_or(interpreter.sys_executable());
let base_python = if cfg!(unix) && interpreter.is_standalone() {
// In `python-build-standalone`, a symlinked interpreter will return its own executable path
// as `sys._base_executable`. Using the symlinked path as the base Python executable can be
// incorrect, since it could cause `home` to point to something that is _not_ a Python
// installation. Specifically, if the interpreter _itself_ is symlinked to an arbitrary
// location, we need to fully resolve it to the actual Python executable; however, if the
// entire standalone interpreter is symlinked, then we can use the symlinked path.
//
// We emulate CPython's `getpath.py` to ensure that the base executable results in a valid
// Python prefix when converted into the `home` key for `pyvenv.cfg`.
match find_base_python(
base_executable,
interpreter.python_major(),
interpreter.python_minor(),
interpreter.variant().suffix(),
) {
Ok(path) => path,
Err(err) => {
warn!("Failed to find base Python executable: {err}");
uv_fs::canonicalize_executable(base_executable)?
}
}
} else {
std::path::absolute(base_executable)?
};
// considered the "base" for the virtual environment.
let base_python = interpreter.to_base_python()?;

debug!(
"Using base executable for virtual environment: {}",
Expand Down Expand Up @@ -639,93 +609,3 @@ fn copy_launcher_windows(

Err(Error::NotFound(base_python.user_display().to_string()))
}

/// Find the Python executable that should be considered the "base" for a virtual environment.
///
/// Assumes that the provided executable is that of a standalone Python interpreter.
///
/// The strategy here mimics that of `getpath.py`: we search up the ancestor path to determine
/// whether a given executable will convert into a valid Python prefix; if not, we resolve the
/// symlink and try again.
///
/// This ensures that:
///
/// 1. We avoid using symlinks to arbitrary locations as the base Python executable. For example,
/// if a user symlinks a Python _executable_ to `/Users/user/foo`, we want to avoid using
/// `/Users/user` as `home`, since it's not a Python installation, and so the relevant libraries
/// and headers won't be found when it's used as the executable directory.
/// See: <https://github.com/python/cpython/blob/a03efb533a58fd13fb0cc7f4a5c02c8406a407bd/Modules/getpath.py#L367-L400>
///
/// 2. We use the "first" resolved symlink that _is_ a valid Python prefix, and thereby preserve
/// symlinks. For example, if a user symlinks a Python _installation_ to `/Users/user/foo`, such
/// that `/Users/user/foo/bin/python` is the resulting executable, we want to use `/Users/user/foo`
/// as `home`, rather than resolving to the symlink target. Concretely, this allows users to
/// symlink patch versions (like `cpython-3.12.6-macos-aarch64-none`) to minor version aliases
/// (like `cpython-3.12-macos-aarch64-none`) and preserve those aliases in the resulting virtual
/// environments.
///
/// See: <https://github.com/python/cpython/blob/a03efb533a58fd13fb0cc7f4a5c02c8406a407bd/Modules/getpath.py#L591-L594>
fn find_base_python(
executable: &Path,
major: u8,
minor: u8,
suffix: &str,
) -> Result<PathBuf, io::Error> {
/// Returns `true` if `path` is the root directory.
fn is_root(path: &Path) -> bool {
let mut components = path.components();
components.next() == Some(std::path::Component::RootDir) && components.next().is_none()
}

/// Determining whether `dir` is a valid Python prefix by searching for a "landmark".
///
/// See: <https://github.com/python/cpython/blob/a03efb533a58fd13fb0cc7f4a5c02c8406a407bd/Modules/getpath.py#L183>
fn is_prefix(dir: &Path, major: u8, minor: u8, suffix: &str) -> bool {
if cfg!(windows) {
dir.join("Lib").join("os.py").is_file()
} else {
dir.join("lib")
.join(format!("python{major}.{minor}{suffix}"))
.join("os.py")
.is_file()
}
}

let mut executable = Cow::Borrowed(executable);

loop {
debug!(
"Assessing Python executable as base candidate: {}",
executable.display()
);

// Determine whether this executable will produce a valid `home` for a virtual environment.
for prefix in executable.ancestors().take_while(|path| !is_root(path)) {
if is_prefix(prefix, major, minor, suffix) {
return Ok(executable.into_owned());
}
}

// If not, resolve the symlink.
let resolved = fs_err::read_link(&executable)?;

// If the symlink is relative, resolve it relative to the executable.
let resolved = if resolved.is_relative() {
if let Some(parent) = executable.parent() {
parent.join(resolved)
} else {
return Err(io::Error::new(
io::ErrorKind::Other,
"Symlink has no parent directory",
));
}
} else {
resolved
};

// Normalize the resolved path.
let resolved = uv_fs::normalize_absolute_path(&resolved)?;

executable = Cow::Owned(resolved);
}
}
12 changes: 7 additions & 5 deletions crates/uv/src/commands/project/environment.rs
Original file line number Diff line number Diff line change
Expand Up @@ -229,18 +229,20 @@ impl CachedEnvironment {
interpreter: &Interpreter,
cache: &Cache,
) -> Result<Interpreter, uv_python::Error> {
if let Some(interpreter) = interpreter.to_base_interpreter(cache)? {
let base_python = interpreter.to_base_python()?;
if base_python == interpreter.sys_executable() {
debug!(
"Caching via base interpreter: `{}`",
interpreter.sys_executable().display()
);
Ok(interpreter)
Ok(interpreter.clone())
} else {
let base_interpreter = Interpreter::query(base_python, cache)?;
debug!(
"Caching via interpreter: `{}`",
interpreter.sys_executable().display()
"Caching via base interpreter: `{}`",
base_interpreter.sys_executable().display()
);
Ok(interpreter.clone())
Ok(base_interpreter)
}
}
}
Loading

0 comments on commit 34552e2

Please sign in to comment.