Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Derived Clone implementations don't utilize clone_from implementations from field types #98374

Closed
AngelicosPhosphoros opened this issue Jun 22, 2022 · 8 comments
Labels
C-bug Category: This is a bug. T-libs Relevant to the library team, which will review and decide on the PR/issue.

Comments

@AngelicosPhosphoros
Copy link
Contributor

I tried this code:

struct A;

impl Clone for A {
    fn clone(&self) -> Self {
        println!("Basic clone");
        Self
    }

    fn clone_from(&mut self, other: &Self) {
        println!("Clone from");
        *self = Self;
    }
}

#[derive(Clone)]
struct B(A);

fn main() {
    let b = B(A);
    let mut bb = B(A);
    bb.clone_from(&b);
}

I expected to see this happen: prints Clone from

Instead, this happened: prints Basic clone

I think, this can lead to little losses for types which have multiple vector and string fields.

Meta

rustc --version --verbose:

rustc 1.61.0 (fe5b13d68 2022-05-18)
binary: rustc
commit-hash: fe5b13d681f25ee6474be29d748c65adcd91f69e
commit-date: 2022-05-18
host: x86_64-unknown-linux-gnu
release: 1.61.0
LLVM version: 14.0.0

rustc 1.63.0-nightly (5750a6aa2 2022-06-20)
binary: rustc
commit-hash: 5750a6aa2777382bf421b726f234da23f990a953
commit-date: 2022-06-20
host: x86_64-unknown-linux-gnu
release: 1.63.0-nightly
LLVM version: 14.0.5
@AngelicosPhosphoros AngelicosPhosphoros added the C-bug Category: This is a bug. label Jun 22, 2022
@ChayimFriedman2
Copy link
Contributor

Dup #13281.

@AngelicosPhosphoros
Copy link
Contributor Author

Well, it was 5 years ago, this reasoning can be not true anymore.

@LoganDark
Copy link

Well, it was 5 years ago, this reasoning can be not true anymore.

The reasoning was that it increased compile times too much. That reason will be just as good till the end of time because compiler speed is an important metric especially with how many derive(Clone)s are used in each crate.

I'm not opposed to the idea of running another experiment, perhaps derives are faster now; but if it's still too slow then it's still too slow.

@AngelicosPhosphoros
Copy link
Contributor Author

Yup, I would run tests maybe today or tomorrow.

@AngelicosPhosphoros
Copy link
Contributor Author

AngelicosPhosphoros commented Jun 22, 2022

Well, I wrote script and run tests.

Python script for generation and running tests
import csv
import dataclasses
import os
import subprocess
import time
import typing

def _gen_struct_definition(name: str, num_fields: int)->str:
    fields = (
        f"\ta{i}: String,"
        for i in range(num_fields)
    )
    fields = f"\n".join(fields)
    return f"""
    #[derive(Debug)]
    struct {name} {{
        {fields}
    }}"""

def _gen_clone(num_fields: int)->str:
    fields = (
        f"\t\t\t\ta{i}: self.a{i}.clone(),"
        for i in range(num_fields)
    )
    fields = f"\n".join(fields)
    return f"""
        #[inline]
        fn clone(&self)->Self{{
            Self{{
                {fields}
            }}
        }}
    """

def _gen_clone_from(num_fields: int)->str:
    fields = (
        f"\t\t\tself.a{i}.clone_from(&other.a{i});"
        for i in range(num_fields)
    )
    fields = f"\n".join(fields)
    return f"""
        #[inline]
        fn clone_from(&mut self, other: &Self){{
            {fields}
        }}
    """

def _generate_trait_impl(name: str, num_fields: int, make_clone_from: bool)->str:
    bodies = _gen_clone(num_fields)
    if make_clone_from:
        bodies += f"\n {_gen_clone_from(num_fields)}"
    return f"""
    impl Clone for {name} {{
        {bodies}
    }}
    """

def _generate_file(fname: str, num_structs: int, num_fields: int, make_clone_from: bool):
    with open(fname, "w") as f:
        for i in range(num_structs):
            struct_name = f"Struct{i}"
            f.write(_gen_struct_definition(struct_name, num_fields))
            f.write("\n")
            f.write(_generate_trait_impl(struct_name, num_fields, make_clone_from))
            f.write("\n")

def _generate_and_measure(num_structs: int, num_fields: int, make_clone_from: bool, do_optimize: bool)->float:
    if do_optimize:
        c_opt_level = "-Copt-level=3"
    else:
        c_opt_level = "-Copt-level=0"

    rustc_command = f"rustc measure_clone.rs --crate-type=rlib -Ccodegen-units=1 {c_opt_level}"

    _generate_file("measure_clone.rs", num_structs, num_fields, make_clone_from)

    def clean_prev_compilation():
        for item in os.listdir():
            if item.endswith(".rlib"):
                os.remove(item)

    # Run 3 compilations and get median
    timings = []
    for _ in range(3):
        clean_prev_compilation()
        begin = time.time()
        subprocess.run(rustc_command)
        end = time.time()
        timings.append(end - begin)
    clean_prev_compilation()

    timings.sort()
    return timings[ len(timings)//2 ]


@dataclasses.dataclass(frozen=True)
class Measurement:
    num_structs: int
    secs_with_old_dbg: float
    secs_with_clone_from_dbg: float
    secs_with_old_release: float
    secs_with_clone_from_release: float


def _generate_data()->tuple[Measurement, ...]:
    min_structs = 100
    max_structs = 1000
    step = 100
    fields = 20
    rows = []
    for num_structs in range(min_structs, max_structs+step, step):
        row = Measurement(
            num_structs=num_structs,
            secs_with_old_dbg=_generate_and_measure(
                num_structs, fields,
                make_clone_from=False, do_optimize=False
            ),
            secs_with_clone_from_dbg=_generate_and_measure(
                num_structs, fields,
                make_clone_from=True, do_optimize=False
            ),
            secs_with_old_release=_generate_and_measure(
                num_structs, fields,
                make_clone_from=False, do_optimize=True
            ),
            secs_with_clone_from_release=_generate_and_measure(
                num_structs, fields,
                make_clone_from=True, do_optimize=True
            ),
        )
        rows.append(row)
    return tuple(rows)

def _save_to_csv(rows: typing.Iterable[Measurement]):
    fields = [x.name for x in dataclasses.fields(Measurement)]
    with open("results.csv", "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=fields)
        writer.writeheader()
        for row in rows:
            writer.writerow(dataclasses.asdict(row))

if __name__ == "__main__":
    data = _generate_data()
    _save_to_csv(data)
    print("Finished!")

I had 2 times increase for compilation time if I have only Clone implemented but when I added derive(Debug) difference became less significant.

See table:

num_structs secs_with_old_dbg secs_with_clone_from_dbg secs_with_old_release secs_with_clone_from_release
100 0.907 1.114 1.354 1.564
200 1.659 2.079 2.591 3.078
300 2.431 3.065 3.818 4.518
400 3.386 4.181 5.149 6.049
500 4.163 5.229 6.574 7.558
600 4.854 6.036 7.630 9.120
700 5.541 7.039 8.747 10.486
800 6.255 8.049 10.116 11.712
900 7.001 9.011 11.546 13.125
1000 7.843 9.989 12.516 15.088

Also, in original PR there weren't any perf runs and more throughput tests so I would make PR with changed Clone derive to run perf.

@AngelicosPhosphoros
Copy link
Contributor Author

I made some progress with implementation but it happened to be quite hard actually.

Current built-in derives implemented in the way that they support only immutable function parameters.

@RalfJung
Copy link
Member

Quoting from #98445 (comment)

We discussed this in today's https://github.com/orgs/rust-lang/teams/libs-api meeting, and we decided not to make this change. People can hand-implement clone_from if they need the performance, but we shouldn't do so by default.

So... should this issue be closed as "wontfix"?

@fmease fmease added T-libs Relevant to the library team, which will review and decide on the PR/issue. and removed needs-triage-legacy labels Jan 26, 2024
@fmease
Copy link
Member

fmease commented Jan 26, 2024

Closing as wontfix.

@fmease fmease closed this as not planned Won't fix, can't repro, duplicate, stale Jan 26, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C-bug Category: This is a bug. T-libs Relevant to the library team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants