floatrange causes "unknown" binary operator #274

Closed
vchuravy opened this issue Apr 7, 2022 · 12 comments · Fixed by #1656

Comments

@vchuravy
Member

vchuravy commented Apr 7, 2022

```julia
using Enzyme
f(x) = first(1.0:0.1:x)
autodiff(f, Active(20.0))
```

cc: @boriskaus

@wsmoses
Member

wsmoses commented Apr 7, 2022

Current scope:
; Function Attrs: willreturn mustprogress
define internal fastcc void @preprocess_julia_floatrange_1236({ [2 x double], [2 x double], i64, i64 }* noalias nocapture noundef nonnull writeonly sret({ [2 x double], [2 x double], i64, i64 }) align 8 dereferenceable(48) %0, i64 noundef signext %1, i64 noundef signext %2, i64 signext %3, i64 signext %4) unnamed_addr #13 !dbg !1121 {
top:
  %malloccall3 = tail call noalias nonnull dereferenceable(16) dereferenceable_or_null(16) i8* @malloc(i64 16), !enzyme_fromstack !811
  %5 = bitcast i8* %malloccall3 to [2 x double]*
  %malloccall4 = tail call noalias nonnull dereferenceable(16) dereferenceable_or_null(16) i8* @malloc(i64 16), !enzyme_fromstack !811
  %6 = bitcast i8* %malloccall4 to [2 x i64]*
  %malloccall2 = tail call noalias nonnull dereferenceable(16) dereferenceable_or_null(16) i8* @malloc(i64 16), !enzyme_fromstack !811
  %7 = bitcast i8* %malloccall2 to [2 x double]*
  %malloccall5 = tail call noalias nonnull dereferenceable(16) dereferenceable_or_null(16) i8* @malloc(i64 16), !enzyme_fromstack !811
  %8 = bitcast i8* %malloccall5 to [2 x i64]*
  %malloccall1 = tail call noalias nonnull dereferenceable(16) dereferenceable_or_null(16) i8* @malloc(i64 16), !enzyme_fromstack !811
  %9 = bitcast i8* %malloccall1 to [2 x i64]*
  %malloccall6 = tail call noalias nonnull dereferenceable(16) dereferenceable_or_null(16) i8* @malloc(i64 16), !enzyme_fromstack !811
  %10 = bitcast i8* %malloccall6 to [2 x double]*
  %malloccall = tail call noalias nonnull dereferenceable(16) dereferenceable_or_null(16) i8* @malloc(i64 16), !enzyme_fromstack !811
  %11 = bitcast i8* %malloccall to [2 x i64]*
  %malloccall7 = tail call noalias nonnull dereferenceable(16) dereferenceable_or_null(16) i8* @malloc(i64 16), !enzyme_fromstack !811
  %12 = bitcast i8* %malloccall7 to [2 x double]*
  %13 = call {}*** @julia.get_pgcstack() #17
  %14 = icmp slt i64 %3, 2, !dbg !1122
  %.not = icmp eq i64 %2, 0
  %or.cond = or i1 %.not, %14, !dbg !1123
  br i1 %or.cond, label %L108, label %L8, !dbg !1123

L8:                                               ; preds = %top
  %15 = sub i64 0, %1, !dbg !1124
  %16 = sitofp i64 %15 to double, !dbg !1126
  %17 = sitofp i64 %2 to double, !dbg !1126
  %18 = fdiv double %16, %17, !dbg !1130
  %19 = fadd double %18, 1.000000e+00, !dbg !1131
  %20 = call double @llvm.rint.f64(double %19) #17, !dbg !1133
  %21 = fcmp ult double %20, 0xC3E0000000000000, !dbg !1135
  %22 = fcmp uge double %20, 0x43E0000000000000, !dbg !1136
  %23 = or i1 %21, %22, !dbg !1136
  br i1 %23, label %L23, label %L37, !dbg !1136

L23:                                              ; preds = %L8
  %ptls_field2678 = getelementptr inbounds {}**, {}*** %13, i64 2305843009213693954, !dbg !1137
  %24 = bitcast {}*** %ptls_field2678 to i8**, !dbg !1137
  %ptls_load277980 = load i8*, i8** %24, align 8, !dbg !1137, !tbaa !89
  %25 = call noalias nonnull {} addrspace(10)* @julia.gc_alloc_obj(i8* %ptls_load277980, i64 noundef 8, {} addrspace(10)* noundef addrspacecast ({}* inttoptr (i64 139965210088864 to {}*) to {} addrspace(10)*)) #18, !dbg !1137
  %26 = bitcast {} addrspace(10)* %25 to double addrspace(10)*, !dbg !1137
  store double %20, double addrspace(10)* %26, align 8, !dbg !1137, !tbaa !91
  %27 = call cc38 nonnull {} addrspace(10)* bitcast ({} addrspace(10)* ({} addrspace(10)*, {} addrspace(10)**, i32, {} addrspace(10)*)* @jl_invoke to {} addrspace(10)* ({} addrspace(10)*, {} addrspace(10)*, {} addrspace(10)*, {} addrspace(10)*, {} addrspace(10)*)*)({} addrspace(10)* addrspacecast ({}* inttoptr (i64 139963675749408 to {}*) to {} addrspace(10)*), {} addrspace(10)* addrspacecast ({}* inttoptr (i64 139965262278592 to {}*) to {} addrspace(10)*), {} addrspace(10)* addrspacecast ({}* inttoptr (i64 139965431103064 to {}*) to {} addrspace(10)*), {} addrspace(10)* addrspacecast ({}* inttoptr (i64 139965208227648 to {}*) to {} addrspace(10)*), {} addrspace(10)* nonnull %25) #17, !dbg !1137
  %28 = addrspacecast {} addrspace(10)* %27 to {} addrspace(12)*, !dbg !1137
  call void @jl_throw({} addrspace(12)* %28) #19, !dbg !1137
  unreachable, !dbg !1137

L37:                                              ; preds = %L8
  %29 = fptosi double %20 to i64, !dbg !1138
  %30 = freeze i64 %29, !dbg !1138
  %.not62 = icmp sgt i64 %30, %3, !dbg !1140
  %31 = icmp sgt i64 %30, 1, !dbg !1142
  %32 = select i1 %31, i64 %30, i64 1, !dbg !1142
  %33 = select i1 %.not62, i64 %3, i64 %32, !dbg !1142
  %34 = add i64 %33, -1, !dbg !1143
  %35 = sub i64 %3, %33, !dbg !1143
  %.not72 = icmp slt i64 %35, %34, !dbg !1147
  %36 = select i1 %.not72, i64 %34, i64 %35, !dbg !1148
  %37 = sitofp i64 %36 to double, !dbg !1149
  %38 = call fastcc double @julia__log_1245(double %37) #13, !dbg !1153
  %39 = call double @llvm.ceil.f64(double %38) #17, !dbg !1155
  %40 = fcmp ult double %39, 0xC3E0000000000000, !dbg !1157
  %41 = fcmp uge double %39, 0x43E0000000000000, !dbg !1158
  %42 = or i1 %40, %41, !dbg !1158
  br i1 %42, label %L53, label %L85, !dbg !1158

L53:                                              ; preds = %L37
  %ptls_field2174 = getelementptr inbounds {}**, {}*** %13, i64 2305843009213693954, !dbg !1159
  %43 = bitcast {}*** %ptls_field2174 to i8**, !dbg !1159
  %ptls_load227576 = load i8*, i8** %43, align 8, !dbg !1159, !tbaa !89
  %44 = call noalias nonnull {} addrspace(10)* @julia.gc_alloc_obj(i8* %ptls_load227576, i64 noundef 8, {} addrspace(10)* noundef addrspacecast ({}* inttoptr (i64 139965210088864 to {}*) to {} addrspace(10)*)) #18, !dbg !1159
  %45 = bitcast {} addrspace(10)* %44 to double addrspace(10)*, !dbg !1159
  store double %39, double addrspace(10)* %45, align 8, !dbg !1159, !tbaa !91
  %46 = call cc38 nonnull {} addrspace(10)* bitcast ({} addrspace(10)* ({} addrspace(10)*, {} addrspace(10)**, i32, {} addrspace(10)*)* @jl_invoke to {} addrspace(10)* ({} addrspace(10)*, {} addrspace(10)*, {} addrspace(10)*, {} addrspace(10)*, {} addrspace(10)*)*)({} addrspace(10)* addrspacecast ({}* inttoptr (i64 139963675749408 to {}*) to {} addrspace(10)*), {} addrspace(10)* addrspacecast ({}* inttoptr (i64 139965262278592 to {}*) to {} addrspace(10)*), {} addrspace(10)* addrspacecast ({}* inttoptr (i64 139965431103064 to {}*) to {} addrspace(10)*), {} addrspace(10)* addrspacecast ({}* inttoptr (i64 139965208227648 to {}*) to {} addrspace(10)*), {} addrspace(10)* nonnull %44) #17, !dbg !1159
  %47 = addrspacecast {} addrspace(10)* %46 to {} addrspace(12)*, !dbg !1159
  call void @jl_throw({} addrspace(12)* %47) #19, !dbg !1159
  unreachable, !dbg !1159

L85:                                              ; preds = %L37
  %48 = mul i64 %34, %2, !dbg !1160
  %49 = add i64 %48, %1, !dbg !1162
  %50 = fptosi double %39 to i64, !dbg !1163
  %51 = freeze i64 %50, !dbg !1163
  %52 = add i64 %51, 1, !dbg !1165
  %.inv = icmp slt i64 %52, 27, !dbg !1166
  %53 = select i1 %.inv, i64 %52, i64 27, !dbg !1166
  %54 = getelementptr inbounds [2 x i64], [2 x i64]* %6, i64 0, i64 0, !dbg !1167
  store i64 %49, i64* %54, align 8, !dbg !1167, !tbaa !27
  %55 = getelementptr inbounds [2 x i64], [2 x i64]* %6, i64 0, i64 1, !dbg !1167
  store i64 %4, i64* %55, align 8, !dbg !1167, !tbaa !27
  %56 = getelementptr inbounds [2 x i64], [2 x i64]* %8, i64 0, i64 0, !dbg !1167
  store i64 %2, i64* %56, align 8, !dbg !1167, !tbaa !27
  %57 = getelementptr inbounds [2 x i64], [2 x i64]* %8, i64 0, i64 1, !dbg !1167
  store i64 %4, i64* %57, align 8, !dbg !1167, !tbaa !27
  %58 = addrspacecast [2 x i64]* %6 to [2 x i64] addrspace(11)*, !dbg !1168
  call fastcc void @julia_TwicePrecision_1264([2 x double]* noalias nocapture noundef nonnull writeonly sret([2 x double]) align 8 dereferenceable(16) %10, [2 x i64] addrspace(11)* nocapture noundef nonnull readonly align 8 dereferenceable(16) %58) #13, !dbg !1168
  %59 = addrspacecast [2 x i64]* %8 to [2 x i64] addrspace(11)*, !dbg !1169
  call fastcc void @julia_TwicePrecision_1264([2 x double]* noalias nocapture noundef nonnull writeonly sret([2 x double]) align 8 dereferenceable(16) %12, [2 x i64] addrspace(11)* nocapture noundef nonnull readonly align 8 dereferenceable(16) %59) #13, !dbg !1169
  %60 = icmp slt i64 %53, 0, !dbg !1170
  %61 = shl nsw i64 -1, %53, !dbg !1174
  %62 = icmp ugt i64 %53, 63, !dbg !1174
  %63 = select i1 %62, i64 0, i64 %61, !dbg !1174
  %64 = sub i64 0, %53, !dbg !1175
  %65 = lshr i64 -1, %64, !dbg !1176
  %66 = icmp ugt i64 %64, 63, !dbg !1176
  %67 = select i1 %66, i64 0, i64 %65, !dbg !1176
  %68 = select i1 %60, i64 %67, i64 %63, !dbg !1171
  %69 = bitcast [2 x double]* %12 to i64*, !dbg !1177
  %70 = load i64, i64* %69, align 8, !dbg !1177, !tbaa !27
  %71 = and i64 %70, %68, !dbg !1179
  %72 = icmp slt i64 %33, 1, !dbg !1180
  %73 = icmp sgt i64 %33, %3, !dbg !1181
  %value_phi8 = or i1 %72, %73, !dbg !1181
  br i1 %value_phi8, label %L96, label %L94, !dbg !1181

L94:                                              ; preds = %L85
  %74 = getelementptr inbounds [2 x double], [2 x double]* %12, i64 0, i64 1, !dbg !1184
  %75 = load double, double* %74, align 8, !dbg !1186, !tbaa !27
  %76 = bitcast i64 %71 to double, !dbg !1177
  %.cast = bitcast i64 %70 to double, !dbg !1187
  %77 = fsub double %.cast, %76, !dbg !1187
  %78 = fadd double %77, %75, !dbg !1186
  %.sroa.0.sroa.0.0..sroa.0.0..sroa_cast30.sroa_cast = bitcast [2 x double]* %10 to i8*, !dbg !1188
  %79 = bitcast { [2 x double], [2 x double], i64, i64 }* %0 to i8*
  call void @llvm.memcpy.p0i8.p0i8.i64(i8* nocapture noundef nonnull writeonly align 8 dereferenceable(16) %79, i8* noundef nonnull align 8 dereferenceable(16) %.sroa.0.sroa.0.0..sroa.0.0..sroa_cast30.sroa_cast, i64 noundef 16, i1 noundef false) #17, !dbg !1188
  %.sroa.0.sroa.2.0..sroa.0.0..sroa_cast.sroa_idx37 = getelementptr inbounds { [2 x double], [2 x double], i64, i64 }, { [2 x double], [2 x double], i64, i64 }* %0, i64 0, i32 1, i64 0, !dbg !1167
  %80 = bitcast double* %.sroa.0.sroa.2.0..sroa.0.0..sroa_cast.sroa_idx37 to i64*, !dbg !1167
  store i64 %71, i64* %80, align 8, !dbg !1167
  %.sroa.0.sroa.3.0..sroa.0.0..sroa_cast.sroa_idx38 = getelementptr inbounds { [2 x double], [2 x double], i64, i64 }, { [2 x double], [2 x double], i64, i64 }* %0, i64 0, i32 1, i64 1, !dbg !1167
  store double %78, double* %.sroa.0.sroa.3.0..sroa.0.0..sroa_cast.sroa_idx38, align 8, !dbg !1167
  %.sroa.3.0..sroa_idx32 = getelementptr inbounds { [2 x double], [2 x double], i64, i64 }, { [2 x double], [2 x double], i64, i64 }* %0, i64 0, i32 2, !dbg !1167
  store i64 %3, i64* %.sroa.3.0..sroa_idx32, align 8, !dbg !1167
  %.sroa.4.0..sroa_idx33 = getelementptr inbounds { [2 x double], [2 x double], i64, i64 }, { [2 x double], [2 x double], i64, i64 }* %0, i64 0, i32 3, !dbg !1167
  store i64 %33, i64* %.sroa.4.0..sroa_idx33, align 8, !dbg !1167
  ret void, !dbg !1167

L96:                                              ; preds = %L85
  %81 = call nonnull {} addrspace(10)* @jl_box_int64(i64 signext %3) #17, !dbg !1181
  %82 = call nonnull {} addrspace(10)* @jl_box_int64(i64 signext %33) #17, !dbg !1181
  %83 = call cc38 nonnull {} addrspace(10)* bitcast ({} addrspace(10)* ({} addrspace(10)*, {} addrspace(10)**, i32, {} addrspace(10)*)* @jl_invoke to {} addrspace(10)* ({} addrspace(10)*, {} addrspace(10)*, {} addrspace(10)*, {} addrspace(10)*, {} addrspace(10)*, {} addrspace(10)*)*)({} addrspace(10)* addrspacecast ({}* inttoptr (i64 139965236871888 to {}*) to {} addrspace(10)*), {} addrspace(10)* addrspacecast ({}* inttoptr (i64 139965223278400 to {}*) to {} addrspace(10)*), {} addrspace(10)* addrspacecast ({}* inttoptr (i64 139965282456800 to {}*) to {} addrspace(10)*), {} addrspace(10)* nonnull %81, {} addrspace(10)* addrspacecast ({}* inttoptr (i64 139965282456864 to {}*) to {} addrspace(10)*), {} addrspace(10)* nonnull %82) #17, !dbg !1181
  %ptls_field1065 = getelementptr inbounds {}**, {}*** %13, i64 2305843009213693954, !dbg !1181
  %84 = bitcast {}*** %ptls_field1065 to i8**, !dbg !1181
  %ptls_load116667 = load i8*, i8** %84, align 8, !dbg !1181, !tbaa !89
  %85 = call noalias nonnull {} addrspace(10)* @julia.gc_alloc_obj(i8* %ptls_load116667, i64 noundef 8, {} addrspace(10)* noundef addrspacecast ({}* inttoptr (i64 139965222667824 to {}*) to {} addrspace(10)*)) #18, !dbg !1181
  %86 = bitcast {} addrspace(10)* %85 to {} addrspace(10)* addrspace(10)*, !dbg !1181
  store {} addrspace(10)* %83, {} addrspace(10)* addrspace(10)* %86, align 8, !dbg !1181, !tbaa !91
  %87 = addrspacecast {} addrspace(10)* %85 to {} addrspace(12)*, !dbg !1181
  call void @jl_throw({} addrspace(12)* %87) #19, !dbg !1181
  unreachable, !dbg !1181

L108:                                             ; preds = %top
  %88 = getelementptr inbounds [2 x i64], [2 x i64]* %11, i64 0, i64 0, !dbg !1189
  store i64 %1, i64* %88, align 8, !dbg !1189, !tbaa !27
  %89 = getelementptr inbounds [2 x i64], [2 x i64]* %11, i64 0, i64 1, !dbg !1189
  store i64 %4, i64* %89, align 8, !dbg !1189, !tbaa !27
  %90 = getelementptr inbounds [2 x i64], [2 x i64]* %9, i64 0, i64 0, !dbg !1189
  store i64 %2, i64* %90, align 8, !dbg !1189, !tbaa !27
  %91 = getelementptr inbounds [2 x i64], [2 x i64]* %9, i64 0, i64 1, !dbg !1189
  store i64 %4, i64* %91, align 8, !dbg !1189, !tbaa !27
  %92 = addrspacecast [2 x i64]* %11 to [2 x i64] addrspace(11)*, !dbg !1190
  call fastcc void @julia_TwicePrecision_1264([2 x double]* noalias nocapture noundef nonnull writeonly sret([2 x double]) align 8 dereferenceable(16) %7, [2 x i64] addrspace(11)* nocapture noundef nonnull readonly align 8 dereferenceable(16) %92) #13, !dbg !1190
  %93 = addrspacecast [2 x i64]* %9 to [2 x i64] addrspace(11)*, !dbg !1191
  call fastcc void @julia_TwicePrecision_1264([2 x double]* noalias nocapture noundef nonnull writeonly sret([2 x double]) align 8 dereferenceable(16) %5, [2 x i64] addrspace(11)* nocapture noundef nonnull readonly align 8 dereferenceable(16) %93) #13, !dbg !1191
  %94 = getelementptr inbounds [2 x double], [2 x double]* %5, i64 0, i64 0, !dbg !1192
  %95 = load double, double* %94, align 8, !dbg !1192, !tbaa !27
  %96 = icmp slt i64 %3, 0, !dbg !1196
  br i1 %96, label %L133, label %L127, !dbg !1198

L127:                                             ; preds = %L108
  %97 = getelementptr inbounds [2 x double], [2 x double]* %5, i64 0, i64 1, !dbg !1201
  %98 = load double, double* %97, align 8, !dbg !1203, !tbaa !27
  %99 = fsub double %95, %95, !dbg !1204
  %100 = fadd double %99, %98, !dbg !1203
  %.sroa.0.sroa.048.0..sroa.0.0..sroa_cast39.sroa_cast = bitcast [2 x double]* %7 to i8*, !dbg !1205
  %101 = bitcast { [2 x double], [2 x double], i64, i64 }* %0 to i8*
  call void @llvm.memcpy.p0i8.p0i8.i64(i8* nocapture noundef nonnull writeonly align 8 dereferenceable(16) %101, i8* noundef nonnull align 8 dereferenceable(16) %.sroa.0.sroa.048.0..sroa.0.0..sroa_cast39.sroa_cast, i64 noundef 16, i1 noundef false) #17, !dbg !1205
  %.sroa.0.sroa.250.0..sroa.0.0..sroa_cast.sroa_idx51 = getelementptr inbounds { [2 x double], [2 x double], i64, i64 }, { [2 x double], [2 x double], i64, i64 }* %0, i64 0, i32 1, i64 0, !dbg !1189
  store double %95, double* %.sroa.0.sroa.250.0..sroa.0.0..sroa_cast.sroa_idx51, align 8, !dbg !1189
  %.sroa.0.sroa.352.0..sroa.0.0..sroa_cast.sroa_idx53 = getelementptr inbounds { [2 x double], [2 x double], i64, i64 }, { [2 x double], [2 x double], i64, i64 }* %0, i64 0, i32 1, i64 1, !dbg !1189
  store double %100, double* %.sroa.0.sroa.352.0..sroa.0.0..sroa_cast.sroa_idx53, align 8, !dbg !1189
  %.sroa.341.0..sroa_idx42 = getelementptr inbounds { [2 x double], [2 x double], i64, i64 }, { [2 x double], [2 x double], i64, i64 }* %0, i64 0, i32 2, !dbg !1189
  store i64 %3, i64* %.sroa.341.0..sroa_idx42, align 8, !dbg !1189
  %.sroa.443.0..sroa_idx44 = getelementptr inbounds { [2 x double], [2 x double], i64, i64 }, { [2 x double], [2 x double], i64, i64 }* %0, i64 0, i32 3, !dbg !1189
  store i64 1, i64* %.sroa.443.0..sroa_idx44, align 8, !dbg !1189
  ret void, !dbg !1189

L133:                                             ; preds = %L108
  %102 = call fastcc nonnull {} addrspace(10)* @julia_string_1234({} addrspace(10)* noundef nonnull align 16 addrspacecast ({}* inttoptr (i64 139965282456752 to {}*) to {} addrspace(10)*), i64 signext %3) #13, !dbg !1198
  %ptls_field359 = getelementptr inbounds {}**, {}*** %13, i64 2305843009213693954, !dbg !1198
  %103 = bitcast {}*** %ptls_field359 to i8**, !dbg !1198
  %ptls_load46061 = load i8*, i8** %103, align 8, !dbg !1198, !tbaa !89
  %104 = call noalias nonnull {} addrspace(10)* @julia.gc_alloc_obj(i8* %ptls_load46061, i64 noundef 8, {} addrspace(10)* noundef addrspacecast ({}* inttoptr (i64 139965222667824 to {}*) to {} addrspace(10)*)) #18, !dbg !1198
  %105 = bitcast {} addrspace(10)* %104 to {} addrspace(10)* addrspace(10)*, !dbg !1198
  store {} addrspace(10)* %102, {} addrspace(10)* addrspace(10)* %105, align 8, !dbg !1198, !tbaa !91
  %106 = addrspacecast {} addrspace(10)* %104 to {} addrspace(12)*, !dbg !1198
  call void @jl_throw({} addrspace(12)* %106) #19, !dbg !1198
  unreachable, !dbg !1198
}

Cannot handle unknown binary operator   %71 = and i64 %70, %68, !dbg !132

Caused by:
Stacktrace:
 [1] &
   @ ./int.jl:336
 [2] truncmask
   @ ./twiceprecision.jl:29
 [3] truncbits
   @ ./twiceprecision.jl:34
 [4] twiceprecision
   @ ./twiceprecision.jl:237
 [5] TwicePrecision
   @ ./twiceprecision.jl:226
 [6] steprangelen_hp
   @ ./twiceprecision.jl:335
 [7] floatrange
   @ ./twiceprecision.jl:381

Stacktrace:
  [1] julia_error(cstr::Cstring, val::Ptr{LLVM.API.LLVMOpaqueValue}, errtype::Enzyme.API.ErrorType, data::Ptr{Nothing})
    @ Enzyme.Compiler /mnt/Data/git/Enzyme.jl/src/compiler.jl:2042
  [2] EnzymeCreatePrimalAndGradient(logic::Enzyme.Logic, todiff::LLVM.Function, retType::Enzyme.API.CDIFFE_TYPE, constant_args::Vector{Enzyme.API.CDIFFE_TYPE}, TA::Enzyme.TypeAnalysis, returnValue::Bool, dretUsed::Bool, mode::Enzyme.API.CDerivativeMode, width::Int64, additionalArg::Ptr{Nothing}, typeInfo::Enzyme.FnTypeInfo, uncacheable_args::Vector{Bool}, augmented::Ptr{Nothing}, atomicAdd::Bool)
    @ Enzyme.API /mnt/Data/git/Enzyme.jl/src/api.jl:110
  [3] enzyme!(job::GPUCompiler.CompilerJob{Enzyme.Compiler.EnzymeTarget, Enzyme.Compiler.EnzymeCompilerParams, GPUCompiler.FunctionSpec{typeof(f), Tuple{Float64}}}, mod::LLVM.Module, primalf::LLVM.Function, adjoint::GPUCompiler.FunctionSpec{typeof(f), Tuple{Active{Float64}}}, mode::Enzyme.API.CDerivativeMode, parallel::Bool, actualRetType::Type, dupClosure::Bool, wrap::Bool)
    @ Enzyme.Compiler /mnt/Data/git/Enzyme.jl/src/compiler.jl:2613
  [4] codegen(output::Symbol, job::GPUCompiler.CompilerJob{Enzyme.Compiler.EnzymeTarget, Enzyme.Compiler.EnzymeCompilerParams, GPUCompiler.FunctionSpec{typeof(f), Tuple{Float64}}}; libraries::Bool, deferred_codegen::Bool, optimize::Bool, ctx::LLVM.Context, strip::Bool, validate::Bool, only_entry::Bool, parent_job::Nothing)
    @ Enzyme.Compiler /mnt/Data/git/Enzyme.jl/src/compiler.jl:3310
  [5] _thunk(job::GPUCompiler.CompilerJob{Enzyme.Compiler.EnzymeTarget, Enzyme.Compiler.EnzymeCompilerParams, GPUCompiler.FunctionSpec{typeof(f), Tuple{Float64}}})
    @ Enzyme.Compiler /mnt/Data/git/Enzyme.jl/src/compiler.jl:3675
  [6] cached_compilation(cache::Dict{UInt64, Any}, job::GPUCompiler.CompilerJob, compiler::typeof(Enzyme.Compiler._thunk), linker::typeof(Enzyme.Compiler._link))
    @ GPUCompiler ~/.julia/packages/GPUCompiler/1FdJy/src/cache.jl:90
  [7] thunk(f::typeof(f), df::Nothing, ::Type{Active{Float64}}, tt::Type{Tuple{Active{Float64}}}, ::Val{Enzyme.API.DEM_ReverseModeCombined})
    @ Enzyme.Compiler /mnt/Data/git/Enzyme.jl/src/compiler.jl:3728
  [8] autodiff
    @ /mnt/Data/git/Enzyme.jl/src/Enzyme.jl:197 [inlined]
  [9] autodiff(f::typeof(f), args::Active{Float64})
    @ Enzyme /mnt/Data/git/Enzyme.jl/src/Enzyme.jl:227
 [10] top-level scope
    @ /mnt/Data/git/Enzyme.jl/minor.jl:3
in expression starting at /mnt/Data/git/Enzyme.jl/minor.jl:3

throwing this one back to you

@vchuravy vchuravy self-assigned this Apr 8, 2022
@vchuravy
Member Author

vchuravy commented Apr 8, 2022

So this is interesting. We basically have a custom floating point type, so teaching Enzyme about that is going to be fun. Probably best to wait for #177
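
For readers following the stack trace: the `and i64` that Enzyme rejects comes from `truncmask`/`truncbits` in `twiceprecision.jl`, which clear the low bits of a float's bit pattern with an integer mask. The following is a minimal sketch of that idea, not Base's actual code; `truncbits_sketch` and `split_hi_lo` are hypothetical names introduced here for illustration only.

```julia
# Hypothetical stand-in for the bit-truncation helper seen in the stack trace.
# It clears the `nb` lowest bits of the Float64 bit pattern.
function truncbits_sketch(x::Float64, nb::Integer)
    mask = typemax(UInt64) << nb            # ones everywhere except the low nb bits
    bits = reinterpret(UInt64, x) & mask    # integer `and` applied to float bits
    return reinterpret(Float64, bits)
end

# Split x into a coarse "hi" part and the remainder "lo" (assumed illustration
# of the high/low pair idea behind a twice-precision value).
function split_hi_lo(x::Float64, nb::Integer = 27)
    hi = truncbits_sketch(x, nb)
    lo = x - hi
    return hi, lo
end

hi, lo = split_hi_lo(0.1)
```

The point of the sketch is that a floating-point value flows through an integer operation and back, so a tool that tracks derivatives through float instructions only has to be taught what this `and` means for the derivative.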

@ChrisRackauckas
Contributor

As a bandaid, should Enzyme just define a rule over range construction?
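
Separately from a rule inside Enzyme, a user-side way to sidestep range construction is to build the time points with plain multiplications. This is only a sketch under the assumption that the number of steps is known up front; `f_norange` and `df_norange` are hypothetical names, and the thread does not confirm how Enzyme behaves on this variant.

```julia
# Hypothetical range-free variant: assumes the step count n is fixed in advance.
function f_norange(x, n::Integer = 12)
    ts = [i * x for i in 0:n]   # plain multiplications, no range arithmetic
    return sum(ts)
end

# Hand-derived reference derivative for fixed n: d/dx sum(i * x) = n(n + 1) / 2.
df_norange(n::Integer = 12) = n * (n + 1) / 2
```

With `x = 0.25` and `n = 12` this produces the same points as `0.0:0.25:3.0`, and the hand-derived value gives something to compare an autodiff result against. Note that it treats the length as constant, whereas a range's length changes discontinuously with the step.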

ChrisRackauckas added a commit to SciML/OrdinaryDiffEq.jl that referenced this issue Jul 13, 2024
MWE now works:

```julia
using Enzyme, OrdinaryDiffEq, StaticArrays

Enzyme.EnzymeCore.EnzymeRules.inactive_type(::Type{SciMLBase.DEStats}) = true
Enzyme.EnzymeCore.EnzymeRules.inactive(::typeof(OrdinaryDiffEq.increment_nf!), args...) = true
Enzyme.EnzymeCore.EnzymeRules.inactive(::typeof(OrdinaryDiffEq.increment_nf_from_initdt!), args...) = true
Enzyme.EnzymeCore.EnzymeRules.inactive(::typeof(OrdinaryDiffEq.fixed_t_for_floatingpoint_error!), args...) = true
Enzyme.EnzymeCore.EnzymeRules.inactive(::typeof(OrdinaryDiffEq.increment_accept!), args...) = true
Enzyme.EnzymeCore.EnzymeRules.inactive(::typeof(OrdinaryDiffEq.increment_reject!), args...) = true
Enzyme.EnzymeCore.EnzymeRules.inactive(::typeof(DiffEqBase.fastpow), args...) = true
Enzyme.EnzymeCore.EnzymeRules.inactive(::typeof(OrdinaryDiffEq.increment_nf_perform_step!), args...) = true
Enzyme.EnzymeCore.EnzymeRules.inactive(::typeof(OrdinaryDiffEq.check_error!), args...) = true
Enzyme.EnzymeCore.EnzymeRules.inactive(::typeof(OrdinaryDiffEq.log_step!), args...) = true

function lorenz!(du, u, p, t)
    du[1] = 10.0(u[2] - u[1])
    du[2] = u[1] * (28.0 - u[3]) - u[2]
    du[3] = u[1] * u[2] - (8 / 3) * u[3]
end

const _saveat =  SA[0.0,0.25,0.5,0.75,1.0,1.25,1.5,1.75,2.0,2.25,2.5,2.75,3.0]

function f(y::Array{Float64}, u0::Array{Float64})
    tspan = (0.0, 3.0)
    prob = ODEProblem{true, SciMLBase.FullSpecialize}(lorenz!, u0, tspan)
    sol = DiffEqBase.solve(prob, Tsit5(), saveat = _saveat, sensealg = DiffEqBase.SensitivityADPassThrough())
    y .= sol[1,:]
    return nothing
end;
u0 = [1.0; 0.0; 0.0]
d_u0 = zeros(3)
y  = zeros(13)
dy = zeros(13)

Enzyme.autodiff(Reverse, f,  Duplicated(y, dy), Duplicated(u0, d_u0));
```

Core issues to finish this:

1. I shouldn't have to pull all of the logging out to a separate function, but there seems to be a bug in Enzyme with integer inactivity: EnzymeAD/Enzyme.jl#1636
2. `saveat` has issues because it uses Julia ranges, which rely on a floating-point correction that Enzyme currently fails on: EnzymeAD/Enzyme.jl#274
3. Adding the `zero(u), zero(u)` is required because Enzyme does not seem to support non-fully initialized types (@wsmoses is that known?) and segfaults when trying to use the uninitialized memory. So making the inner constructor not use `undef` is an easy fix for that, but it's not memory optimal. It would take a bit of a refactor to make it memory optimal, but it's no big deal and it's probably something that improves the package anyway.
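
To illustrate item 3, here is a minimal sketch of the difference between a partially initialized type and a fully initialized one. The struct names `LazyCache` and `EagerCache` are hypothetical and are not taken from OrdinaryDiffEq; they only show the pattern being described.

```julia
# Hypothetical cache whose inner constructor leaves `tmp` undefined.
mutable struct LazyCache{T}
    u::T
    tmp::T
    LazyCache(u::T) where {T} = new{T}(u)            # tmp is not initialized
end

# Hypothetical cache that fills every field, at the cost of one extra allocation.
mutable struct EagerCache{T}
    u::T
    tmp::T
    EagerCache(u::T) where {T} = new{T}(u, zero(u))  # tmp starts as zeros
end

lazy = LazyCache([1.0, 2.0])
eager = EagerCache([1.0, 2.0])
```

The eager version trades memory for the guarantee that every field can be read safely, which is the trade-off the item above describes.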
@ChrisRackauckas
Contributor

I've been working from this MWE:

```julia
using Enzyme
function f(x)
    ts = Array(0.0:x:3.0)
    sum(ts)
end
f(0.25)
Enzyme.autodiff(Forward, f,  Duplicated(0.25, 1.0))
Enzyme.autodiff(Reverse, f,  Active, Active(0.25))
```

ERROR: Enzyme execution failed.
Enzyme compilation failed.
Current scope: 
; Function Attrs: mustprogress willreturn
define internal fastcc void @preprocess_julia_steprangelen_hp_2378({ [2 x double], [2 x double], i64, i64 }* noalias nocapture noundef nonnull sret({ [2 x double], [2 x double], i64, i64 }) align 8 dereferenceable(48) "enzyme_type"="{[-1]:Pointer, [-1,0]:Float@double, [-1,8]:Float@double, [-1,16]:Float@double, [-1,24]:Float@double, [-1,32]:Integer, [-1,33]:Integer, [-1,34]:Integer, [-1,35]:Integer, [-1,36]:Integer, [-1,37]:Integer, [-1,38]:Integer, [-1,39]:Integer, [-1,40]:Integer, [-1,41]:Integer, [-1,42]:Integer, [-1,43]:Integer, [-1,44]:Integer, [-1,45]:Integer, [-1,46]:Integer, [-1,47]:Integer}" %0, [2 x i64] addrspace(11)* nocapture noundef nonnull readonly align 8 dereferenceable(16) "enzyme_inactive" "enzyme_type"="{[-1]:Pointer, [-1,0]:Integer, [-1,1]:Integer, [-1,2]:Integer, [-1,3]:Integer, [-1,4]:Integer, [-1,5]:Integer, [-1,6]:Integer, [-1,7]:Integer, [-1,8]:Integer, [-1,9]:Integer, [-1,10]:Integer, [-1,11]:Integer, [-1,12]:Integer, [-1,13]:Integer, [-1,14]:Integer, [-1,15]:Integer}" "enzymejl_parmtype"="4767707152" "enzymejl_parmtype_ref"="1" %1, [2 x i64] addrspace(11)* nocapture noundef nonnull readonly align 8 dereferenceable(16) "enzyme_inactive" "enzyme_type"="{[-1]:Pointer, [-1,0]:Integer, [-1,1]:Integer, [-1,2]:Integer, [-1,3]:Integer, [-1,4]:Integer, [-1,5]:Integer, [-1,6]:Integer, [-1,7]:Integer, [-1,8]:Integer, [-1,9]:Integer, [-1,10]:Integer, [-1,11]:Integer, [-1,12]:Integer, [-1,13]:Integer, [-1,14]:Integer, [-1,15]:Integer}" "enzymejl_parmtype"="4767707152" "enzymejl_parmtype_ref"="1" %2, i64 signext "enzyme_inactive" "enzyme_type"="{[-1]:Integer}" "enzymejl_parmtype"="4839854064" "enzymejl_parmtype_ref"="0" %3, i64 signext "enzyme_inactive" "enzyme_type"="{[-1]:Integer}" "enzymejl_parmtype"="4839854064" "enzymejl_parmtype_ref"="0" %4, i64 signext "enzyme_inactive" "enzyme_type"="{[-1]:Integer}" "enzymejl_parmtype"="4839854064" "enzymejl_parmtype_ref"="0" %5) unnamed_addr #29 !dbg !1953 {
top:
  %newstruct = alloca [2 x double], align 8
  %newstruct5 = alloca [2 x double], align 8
  %6 = alloca [2 x double], align 8
  %newstruct11 = alloca [2 x double], align 8
  %newstruct13 = alloca [2 x double], align 8
  %7 = alloca [2 x double], align 8
  %8 = call {}*** @julia.get_pgcstack() #31
  %current_task144 = getelementptr inbounds {}**, {}*** %8, i64 -14
  %current_task1 = bitcast {}*** %current_task144 to {}**
  %ptls_field45 = getelementptr inbounds {}**, {}*** %8, i64 2
  %9 = bitcast {}*** %ptls_field45 to i64***
  %ptls_load4647 = load i64**, i64*** %9, align 8, !tbaa !22
  %10 = getelementptr inbounds i64*, i64** %ptls_load4647, i64 2
  %safepoint = load i64*, i64** %10, align 8, !tbaa !26
  fence syncscope("singlethread") seq_cst
  call void @julia.safepoint(i64* %safepoint) #31, !dbg !1954
  fence syncscope("singlethread") seq_cst
  %11 = getelementptr inbounds [2 x i64], [2 x i64] addrspace(11)* %1, i64 0, i64 0, !dbg !1955
  %12 = getelementptr inbounds [2 x i64], [2 x i64] addrspace(11)* %1, i64 0, i64 1, !dbg !1956
  %unbox = load i64, i64 addrspace(11)* %11, align 8, !dbg !1959, !tbaa !26, !alias.scope !43, !noalias !46
  %13 = sitofp i64 %unbox to double, !dbg !1959
  %bitcast_coercion = bitcast double %13 to i64, !dbg !1963
  %14 = and i64 %bitcast_coercion, -134217728, !dbg !1966
  %bitcast_coercion2 = bitcast i64 %14 to double, !dbg !1963
  %15 = fcmp ult double %bitcast_coercion2, 0xC3E0000000000000, !dbg !1967
  %16 = fcmp uge double %bitcast_coercion2, 0x43E0000000000000, !dbg !1968
  %17 = or i1 %15, %16, !dbg !1968
  br i1 %17, label %L22, label %L14, !dbg !1968

L14:                                              ; preds = %top
  %18 = call double @llvm.trunc.f64(double %bitcast_coercion2) #31, !dbg !1972
  %19 = fsub double %bitcast_coercion2, %18, !dbg !1976
  %20 = fcmp une double %19, 0.000000e+00, !dbg !1977
  br i1 %20, label %L22, label %L20, !dbg !1968

L20:                                              ; preds = %L14
  %21 = fptosi double %bitcast_coercion2 to i64, !dbg !1979
  %22 = freeze i64 %21, !dbg !1979
  %23 = sub i64 %unbox, %22, !dbg !1981
  %24 = sitofp i64 %23 to double, !dbg !1983
  %25 = fadd double %bitcast_coercion2, %24, !dbg !1984
  %26 = fsub double %bitcast_coercion2, %25, !dbg !1986
  %27 = fadd double %26, %24, !dbg !1988
  %28 = getelementptr inbounds [2 x double], [2 x double]* %newstruct, i64 0, i64 0, !dbg !1989
  store double %25, double* %28, align 8, !dbg !1989, !tbaa !100, !alias.scope !102, !noalias !1990
  %29 = getelementptr inbounds [2 x double], [2 x double]* %newstruct, i64 0, i64 1, !dbg !1989
  store double %27, double* %29, align 8, !dbg !1989, !tbaa !100, !alias.scope !102, !noalias !1990
  %unbox4 = load i64, i64 addrspace(11)* %12, align 8, !dbg !1993, !tbaa !26, !alias.scope !43, !noalias !46
  %30 = sitofp i64 %unbox4 to double, !dbg !1993
  %31 = getelementptr inbounds [2 x double], [2 x double]* %newstruct5, i64 0, i64 0, !dbg !1997
  store double %30, double* %31, align 8, !dbg !1997, !tbaa !100, !alias.scope !102, !noalias !1990
  %memcpy_refined_dst = getelementptr inbounds [2 x double], [2 x double]* %newstruct5, i64 0, i64 1, !dbg !1997
  store double 0.000000e+00, double* %memcpy_refined_dst, align 8, !dbg !1997, !tbaa !100, !alias.scope !102, !noalias !1990
  %32 = addrspacecast [2 x double]* %newstruct to [2 x double] addrspace(11)*, !dbg !1996
  %33 = addrspacecast [2 x double]* %newstruct5 to [2 x double] addrspace(11)*, !dbg !1996
  call fastcc void @julia___2382([2 x double]* noalias nocapture noundef nonnull writeonly sret([2 x double]) align 8 dereferenceable(16) %6, [2 x double] addrspace(11)* nocapture noundef nonnull readonly align 8 dereferenceable(16) %32, [2 x double] addrspace(11)* nocapture noundef nonnull readonly align 8 dereferenceable(16) %33) #31, !dbg !1996
  %34 = getelementptr inbounds [2 x i64], [2 x i64] addrspace(11)* %2, i64 0, i64 0, !dbg !2000
  %35 = getelementptr inbounds [2 x i64], [2 x i64] addrspace(11)* %2, i64 0, i64 1, !dbg !2001
  %unbox6 = load i64, i64 addrspace(11)* %34, align 8, !dbg !2004, !tbaa !26, !alias.scope !43, !noalias !46
  %36 = sitofp i64 %unbox6 to double, !dbg !2004
  %bitcast_coercion7 = bitcast double %36 to i64, !dbg !2008
  %37 = and i64 %bitcast_coercion7, -134217728, !dbg !2011
  %bitcast_coercion8 = bitcast i64 %37 to double, !dbg !2008
  %38 = fcmp ult double %bitcast_coercion8, 0xC3E0000000000000, !dbg !2012
  %39 = fcmp uge double %bitcast_coercion8, 0x43E0000000000000, !dbg !2013
  %40 = or i1 %38, %39, !dbg !2013
  br i1 %40, label %L60, label %L52, !dbg !2013

L22:                                              ; preds = %L14, %top
  %box33 = call noalias nonnull dereferenceable(8) "enzyme_type"="{[-1]:Pointer, [-1,-1]:Float@double}" {} addrspace(10)* @julia.gc_alloc_obj({}** nonnull %current_task1, i64 noundef 8, {} addrspace(10)* noundef addrspacecast ({}* inttoptr (i64 4839853264 to {}*) to {} addrspace(10)*)) #32, !dbg !2017
  %41 = bitcast {} addrspace(10)* %box33 to i64 addrspace(10)*, !dbg !2017
  store i64 %14, i64 addrspace(10)* %41, align 8, !dbg !2017, !tbaa !132, !alias.scope !136, !noalias !2018
  %42 = call nonnull "enzyme_type"="{[-1]:Pointer}" {} addrspace(10)* ({} addrspace(10)* ({} addrspace(10)*, {} addrspace(10)**, i32, {} addrspace(10)*)*, {} addrspace(10)*, {} addrspace(10)*, ...) @julia.call2({} addrspace(10)* ({} addrspace(10)*, {} addrspace(10)**, i32, {} addrspace(10)*)* noundef nonnull @ijl_invoke, {} addrspace(10)* noundef addrspacecast ({}* inttoptr (i64 4818200976 to {}*) to {} addrspace(10)*), {} addrspace(10)* noundef addrspacecast ({}* inttoptr (i64 4818200384 to {}*) to {} addrspace(10)*), {} addrspace(10)* addrspacecast ({}* inttoptr (i64 4311988784 to {}*) to {} addrspace(10)*), {} addrspace(10)* addrspacecast ({}* inttoptr (i64 4839854064 to {}*) to {} addrspace(10)*), {} addrspace(10)* nofree nonnull %box33) #33, !dbg !2017
  %43 = addrspacecast {} addrspace(10)* %42 to {} addrspace(12)*, !dbg !2017
  call void @ijl_throw({} addrspace(12)* %43) #34, !dbg !2017
  unreachable, !dbg !2017

L52:                                              ; preds = %L20
  %44 = call double @llvm.trunc.f64(double %bitcast_coercion8) #31, !dbg !2019
  %45 = fsub double %bitcast_coercion8, %44, !dbg !2023
  %46 = fcmp une double %45, 0.000000e+00, !dbg !2024
  br i1 %46, label %L60, label %L58, !dbg !2013

L58:                                              ; preds = %L52
  %47 = fptosi double %bitcast_coercion8 to i64, !dbg !2026
  %48 = freeze i64 %47, !dbg !2026
  %49 = sub i64 %unbox6, %48, !dbg !2028
  %50 = sitofp i64 %49 to double, !dbg !2030
  %51 = fadd double %bitcast_coercion8, %50, !dbg !2031
  %52 = fsub double %bitcast_coercion8, %51, !dbg !2033
  %53 = fadd double %52, %50, !dbg !2035
  %54 = getelementptr inbounds [2 x double], [2 x double]* %newstruct11, i64 0, i64 0, !dbg !2036
  store double %51, double* %54, align 8, !dbg !2036, !tbaa !100, !alias.scope !102, !noalias !1990
  %55 = getelementptr inbounds [2 x double], [2 x double]* %newstruct11, i64 0, i64 1, !dbg !2036
  store double %53, double* %55, align 8, !dbg !2036, !tbaa !100, !alias.scope !102, !noalias !1990
  %unbox12 = load i64, i64 addrspace(11)* %35, align 8, !dbg !2037, !tbaa !26, !alias.scope !43, !noalias !46
  %56 = sitofp i64 %unbox12 to double, !dbg !2037
  %57 = getelementptr inbounds [2 x double], [2 x double]* %newstruct13, i64 0, i64 0, !dbg !2041
  store double %56, double* %57, align 8, !dbg !2041, !tbaa !100, !alias.scope !102, !noalias !1990
  %memcpy_refined_dst14 = getelementptr inbounds [2 x double], [2 x double]* %newstruct13, i64 0, i64 1, !dbg !2041
  store double 0.000000e+00, double* %memcpy_refined_dst14, align 8, !dbg !2041, !tbaa !100, !alias.scope !102, !noalias !1990
  %58 = addrspacecast [2 x double]* %newstruct11 to [2 x double] addrspace(11)*, !dbg !2040
  %59 = addrspacecast [2 x double]* %newstruct13 to [2 x double] addrspace(11)*, !dbg !2040
  call fastcc void @julia___2382([2 x double]* noalias nocapture noundef nonnull writeonly sret([2 x double]) align 8 dereferenceable(16) %7, [2 x double] addrspace(11)* nocapture noundef nonnull readonly align 8 dereferenceable(16) %58, [2 x double] addrspace(11)* nocapture noundef nonnull readonly align 8 dereferenceable(16) %59) #31, !dbg !2040
  %60 = getelementptr inbounds [2 x double], [2 x double]* %7, i64 0, i64 0, !dbg !2044
  %bitcast = load double, double* %60, align 8, !dbg !2046, !tbaa !100, !alias.scope !102, !noalias !171
  %61 = getelementptr inbounds [2 x double], [2 x double]* %7, i64 0, i64 1, !dbg !2049
  %unbox18 = load double, double* %61, align 8, !dbg !2051, !tbaa !100, !alias.scope !102, !noalias !171
  %62 = icmp slt i64 %4, 0, !dbg !2052
  br i1 %62, label %L111, label %L96, !dbg !2054

L60:                                              ; preds = %L52, %L20
  %box29 = call noalias nonnull dereferenceable(8) "enzyme_type"="{[-1]:Pointer, [-1,-1]:Float@double}" {} addrspace(10)* @julia.gc_alloc_obj({}** nonnull %current_task1, i64 8, {} addrspace(10)* addrspacecast ({}* inttoptr (i64 4839853264 to {}*) to {} addrspace(10)*)) #32, !dbg !2057
  %63 = bitcast {} addrspace(10)* %box29 to i64 addrspace(10)*, !dbg !2057
  store i64 %37, i64 addrspace(10)* %63, align 8, !dbg !2057, !tbaa !132, !alias.scope !136, !noalias !2018
  %64 = call nonnull "enzyme_type"="{[-1]:Pointer}" {} addrspace(10)* ({} addrspace(10)* ({} addrspace(10)*, {} addrspace(10)**, i32, {} addrspace(10)*)*, {} addrspace(10)*, {} addrspace(10)*, ...) @julia.call2({} addrspace(10)* ({} addrspace(10)*, {} addrspace(10)**, i32, {} addrspace(10)*)* nonnull @ijl_invoke, {} addrspace(10)* addrspacecast ({}* inttoptr (i64 4818200976 to {}*) to {} addrspace(10)*), {} addrspace(10)* addrspacecast ({}* inttoptr (i64 4818200384 to {}*) to {} addrspace(10)*), {} addrspace(10)* addrspacecast ({}* inttoptr (i64 4311988784 to {}*) to {} addrspace(10)*), {} addrspace(10)* addrspacecast ({}* inttoptr (i64 4839854064 to {}*) to {} addrspace(10)*), {} addrspace(10)* nonnull %box29) #33, !dbg !2057
  %65 = addrspacecast {} addrspace(10)* %64 to {} addrspace(12)*, !dbg !2057
  call void @ijl_throw({} addrspace(12)* %65) #31, !dbg !2057
  unreachable, !dbg !2057

L96:                                              ; preds = %L58
  %66 = icmp slt i64 %5, 1, !dbg !2058
  %67 = call i64 @llvm.smax.i64(i64 %4, i64 1) #31, !dbg !2059
  %68 = icmp slt i64 %67, %5, !dbg !2059
  %value_phi21.off0 = select i1 %66, i1 true, i1 %68, !dbg !2059
  br i1 %value_phi21.off0, label %L107, label %L105, !dbg !2059

L105:                                             ; preds = %L96
  %69 = icmp slt i64 %3, 0, !dbg !2060
  %70 = sub i64 0, %3, !dbg !2062
  %71 = icmp ugt i64 %70, 63, !dbg !2063
  %72 = lshr i64 -1, %70, !dbg !2063
  %73 = select i1 %71, i64 0, i64 %72, !dbg !2063
  %74 = icmp ugt i64 %3, 63, !dbg !2064
  %75 = shl nsw i64 -1, %3, !dbg !2064
  %76 = select i1 %74, i64 0, i64 %75, !dbg !2064
  %77 = select i1 %69, i64 %73, i64 %76, !dbg !2065
  %bitcast_coercion15 = bitcast double %bitcast to i64, !dbg !2046
  %78 = and i64 %77, %bitcast_coercion15, !dbg !2066
  %bitcast_coercion16 = bitcast i64 %78 to double, !dbg !2046
  %79 = fsub double %bitcast, %bitcast_coercion16, !dbg !2067
  %80 = fadd double %unbox18, %79, !dbg !2051
  %newstruct22.sroa.0.sroa.0.0.newstruct22.sroa.0.0..sroa_cast36.sroa_cast = bitcast [2 x double]* %6 to i8*, !dbg !2068
  %81 = bitcast { [2 x double], [2 x double], i64, i64 }* %0 to i8*
  call void @llvm.memcpy.p0i8.p0i8.i64(i8* noundef nonnull align 8 dereferenceable(16) %81, i8* noundef nonnull align 8 dereferenceable(16) %newstruct22.sroa.0.sroa.0.0.newstruct22.sroa.0.0..sroa_cast36.sroa_cast, i64 16, i1 false) #31, !dbg !2068, !noalias !2069
  %newstruct22.sroa.0.sroa.2.0.newstruct22.sroa.0.0..sroa_cast.sroa_idx42 = getelementptr inbounds { [2 x double], [2 x double], i64, i64 }, { [2 x double], [2 x double], i64, i64 }* %0, i64 0, i32 1, i64 0, !dbg !2068
  %82 = bitcast double* %newstruct22.sroa.0.sroa.2.0.newstruct22.sroa.0.0..sroa_cast.sroa_idx42 to i64*, !dbg !2068
  store i64 %78, i64* %82, align 8, !dbg !2068, !noalias !2069
  %newstruct22.sroa.0.sroa.3.0.newstruct22.sroa.0.0..sroa_cast.sroa_idx43 = getelementptr inbounds { [2 x double], [2 x double], i64, i64 }, { [2 x double], [2 x double], i64, i64 }* %0, i64 0, i32 1, i64 1, !dbg !2068
  store double %80, double* %newstruct22.sroa.0.sroa.3.0.newstruct22.sroa.0.0..sroa_cast.sroa_idx43, align 8, !dbg !2068, !noalias !2069
  %newstruct22.sroa.3.0..sroa_idx38 = getelementptr inbounds { [2 x double], [2 x double], i64, i64 }, { [2 x double], [2 x double], i64, i64 }* %0, i64 0, i32 2, !dbg !2068
  store i64 %4, i64* %newstruct22.sroa.3.0..sroa_idx38, align 8, !dbg !2068, !noalias !2069
  %newstruct22.sroa.4.0..sroa_idx39 = getelementptr inbounds { [2 x double], [2 x double], i64, i64 }, { [2 x double], [2 x double], i64, i64 }* %0, i64 0, i32 3, !dbg !2068
  store i64 %5, i64* %newstruct22.sroa.4.0..sroa_idx39, align 8, !dbg !2068, !noalias !2069
  ret void, !dbg !2068

L107:                                             ; preds = %L96
  %83 = call noalias nonnull "enzyme_inactive" "enzyme_type"="{[-1]:Pointer, [-1,-1]:Integer}" {} addrspace(10)* @ijl_box_int64(i64 signext %4) #35, !dbg !2059
  %84 = call noalias nonnull "enzyme_inactive" "enzyme_type"="{[-1]:Pointer, [-1,-1]:Integer}" {} addrspace(10)* @ijl_box_int64(i64 signext %5) #35, !dbg !2059
  %85 = call nonnull "enzyme_type"="{[-1]:Pointer}" {} addrspace(10)* ({} addrspace(10)* ({} addrspace(10)*, {} addrspace(10)**, i32, {} addrspace(10)*)*, {} addrspace(10)*, {} addrspace(10)*, ...) @julia.call2({} addrspace(10)* ({} addrspace(10)*, {} addrspace(10)**, i32, {} addrspace(10)*)* nonnull @ijl_invoke, {} addrspace(10)* addrspacecast ({}* inttoptr (i64 4796769728 to {}*) to {} addrspace(10)*), {} addrspace(10)* addrspacecast ({}* inttoptr (i64 4772842608 to {}*) to {} addrspace(10)*), {} addrspace(10)* addrspacecast ({}* inttoptr (i64 4906089056 to {}*) to {} addrspace(10)*), {} addrspace(10)* nonnull %83, {} addrspace(10)* addrspacecast ({}* inttoptr (i64 4906089024 to {}*) to {} addrspace(10)*), {} addrspace(10)* nonnull %84) #36, !dbg !2059
  %box = call noalias nonnull dereferenceable(8) "enzyme_inactive" "enzyme_type"="{[-1]:Pointer, [-1,-1]:Pointer}" {} addrspace(10)* @julia.gc_alloc_obj({}** nonnull %current_task1, i64 8, {} addrspace(10)* addrspacecast ({}* inttoptr (i64 4838711792 to {}*) to {} addrspace(10)*)) #32, !dbg !2059
  %86 = bitcast {} addrspace(10)* %box to [1 x {} addrspace(10)*] addrspace(10)*, !dbg !2059
  %87 = getelementptr [1 x {} addrspace(10)*], [1 x {} addrspace(10)*] addrspace(10)* %86, i64 0, i64 0, !dbg !2059
  store {} addrspace(10)* %85, {} addrspace(10)* addrspace(10)* %87, align 8, !dbg !2059, !tbaa !132, !alias.scope !136, !noalias !2018
  %88 = addrspacecast {} addrspace(10)* %box to {} addrspace(12)*, !dbg !2059
  call void @ijl_throw({} addrspace(12)* %88) #31, !dbg !2059
  unreachable, !dbg !2059

L111:                                             ; preds = %L58
  %89 = call nonnull {} addrspace(10)* @julia_string_2373({} addrspace(10)* nofree noundef nonnull align 32 addrspacecast ({}* inttoptr (i64 4906089120 to {}*) to {} addrspace(10)*), i64 signext %4) #31, !dbg !2054
  %box25 = call noalias nonnull dereferenceable(8) "enzyme_inactive" "enzyme_type"="{[-1]:Pointer, [-1,-1]:Pointer}" {} addrspace(10)* @julia.gc_alloc_obj({}** nonnull %current_task1, i64 8, {} addrspace(10)* addrspacecast ({}* inttoptr (i64 4838711792 to {}*) to {} addrspace(10)*)) #32, !dbg !2054
  %90 = bitcast {} addrspace(10)* %box25 to [1 x {} addrspace(10)*] addrspace(10)*, !dbg !2054
  %91 = getelementptr [1 x {} addrspace(10)*], [1 x {} addrspace(10)*] addrspace(10)* %90, i64 0, i64 0, !dbg !2054
  store {} addrspace(10)* %89, {} addrspace(10)* addrspace(10)* %91, align 8, !dbg !2054, !tbaa !132, !alias.scope !136, !noalias !2018
  %92 = addrspacecast {} addrspace(10)* %box25 to {} addrspace(12)*, !dbg !2054
  call void @ijl_throw({} addrspace(12)* %92) #31, !dbg !2054
  unreachable, !dbg !2054
}

 constantarg[{ [2 x double], [2 x double], i64, i64 }* %0] = 0 type: {[-1]:Pointer, [-1,0]:Float@double, [-1,8]:Float@double, [-1,16]:Float@double, [-1,24]:Float@double, [-1,32]:Integer, [-1,33]:Integer, [-1,34]:Integer, [-1,35]:Integer, [-1,36]:Integer, [-1,37]:Integer, [-1,38]:Integer, [-1,39]:Integer, [-1,40]:Integer, [-1,41]:Integer, [-1,42]:Integer, [-1,43]:Integer, [-1,44]:Integer, [-1,45]:Integer, [-1,46]:Integer, [-1,47]:Integer} - vals: {}
 constantarg[[2 x i64] addrspace(11)* %1] = 1 type: {[-1]:Pointer, [-1,-1]:Integer} - vals: {}
 constantarg[[2 x i64] addrspace(11)* %2] = 1 type: {[-1]:Pointer, [-1,-1]:Integer} - vals: {}
 constantarg[i64 %3] = 1 type: {[-1]:Integer} - vals: {}
 constantarg[i64 %4] = 1 type: {[-1]:Integer} - vals: {}
 constantarg[i64 %5] = 1 type: {[-1]:Integer} - vals: {}
 constantinst[  %newstruct = alloca [2 x double], align 8] = 1 val:1 type: {[-1]:Pointer, [-1,-1]:Float@double}
 constantinst[  %newstruct5 = alloca [2 x double], align 8] = 1 val:1 type: {[-1]:Pointer, [-1,-1]:Float@double}
 constantinst[  %6 = alloca [2 x double], align 8] = 1 val:0 type: {[-1]:Pointer, [-1,-1]:Float@double}
 constantinst[  %newstruct11 = alloca [2 x double], align 8] = 1 val:1 type: {[-1]:Pointer, [-1,-1]:Float@double}
 constantinst[  %newstruct13 = alloca [2 x double], align 8] = 1 val:1 type: {[-1]:Pointer, [-1,-1]:Float@double}
 constantinst[  %7 = alloca [2 x double], align 8] = 1 val:0 type: {[-1]:Pointer, [-1,-1]:Float@double}
 constantinst[  %8 = call {}*** @julia.get_pgcstack() #31] = 1 val:1 type: {[-1]:Pointer, [-1,16]:Pointer}
 constantinst[  %current_task144 = getelementptr inbounds {}**, {}*** %8, i64 -14] = 1 val:1 type: {[-1]:Pointer}
 constantinst[  %current_task1 = bitcast {}*** %current_task144 to {}**] = 1 val:1 type: {[-1]:Pointer}
 constantinst[  %ptls_field45 = getelementptr inbounds {}**, {}*** %8, i64 2] = 1 val:1 type: {[-1]:Pointer, [-1,0]:Pointer}
 constantinst[  %9 = bitcast {}*** %ptls_field45 to i64***] = 1 val:1 type: {[-1]:Pointer, [-1,0]:Pointer}
 constantinst[  %ptls_load4647 = load i64**, i64*** %9, align 8, !tbaa !22] = 1 val:1 type: {[-1]:Pointer}
 constantinst[  %10 = getelementptr inbounds i64*, i64** %ptls_load4647, i64 2] = 1 val:1 type: {[-1]:Pointer}
 constantinst[  %safepoint = load i64*, i64** %10, align 8, !tbaa !26] = 1 val:1 type: {}
 constantinst[  fence syncscope("singlethread") seq_cst] = 1 val:1 type: {}
 constantinst[  call void @julia.safepoint(i64* %safepoint) #31, !dbg !28] = 1 val:1 type: {}
 constantinst[  fence syncscope("singlethread") seq_cst] = 1 val:1 type: {}
 constantinst[  %11 = getelementptr inbounds [2 x i64], [2 x i64] addrspace(11)* %1, i64 0, i64 0, !dbg !29] = 1 val:1 type: {[-1]:Pointer, [-1,-1]:Integer}
 constantinst[  %12 = getelementptr inbounds [2 x i64], [2 x i64] addrspace(11)* %1, i64 0, i64 1, !dbg !32] = 1 val:1 type: {[-1]:Pointer, [-1,-1]:Integer}
 constantinst[  %unbox = load i64, i64 addrspace(11)* %11, align 8, !dbg !36, !tbaa !26, !alias.scope !43, !noalias !46] = 1 val:1 type: {[-1]:Integer}
 constantinst[  %13 = sitofp i64 %unbox to double, !dbg !36] = 1 val:1 type: {[-1]:Float@double}
 constantinst[  %bitcast_coercion = bitcast double %13 to i64, !dbg !51] = 1 val:1 type: {[-1]:Float@double}
 constantinst[  %14 = and i64 %bitcast_coercion, -134217728, !dbg !58] = 1 val:1 type: {[-1]:Float@double}
 constantinst[  %bitcast_coercion2 = bitcast i64 %14 to double, !dbg !51] = 1 val:1 type: {[-1]:Float@double}
 constantinst[  %15 = fcmp ult double %bitcast_coercion2, 0xC3E0000000000000, !dbg !61] = 1 val:1 type: {[-1]:Integer}
 constantinst[  %16 = fcmp uge double %bitcast_coercion2, 0x43E0000000000000, !dbg !63] = 1 val:1 type: {[-1]:Integer}
 constantinst[  %17 = or i1 %15, %16, !dbg !63] = 1 val:1 type: {[-1]:Integer}
 constantinst[  br i1 %17, label %L22, label %L14, !dbg !63] = 1 val:1 type: {}
 constantinst[  %18 = call double @llvm.trunc.f64(double %bitcast_coercion2) #31, !dbg !71] = 1 val:1 type: {[-1]:Float@double}
 constantinst[  %19 = fsub double %bitcast_coercion2, %18, !dbg !80] = 1 val:1 type: {[-1]:Float@double}
 constantinst[  %20 = fcmp une double %19, 0.000000e+00, !dbg !82] = 1 val:1 type: {[-1]:Integer}
 constantinst[  br i1 %20, label %L22, label %L20, !dbg !63] = 1 val:1 type: {}
 constantinst[  %21 = fptosi double %bitcast_coercion2 to i64, !dbg !85] = 1 val:1 type: {[-1]:Integer}
 constantinst[  %22 = freeze i64 %21, !dbg !85] = 1 val:1 type: {[-1]:Integer}
 constantinst[  %23 = sub i64 %unbox, %22, !dbg !88] = 1 val:1 type: {[-1]:Integer}
 constantinst[  %24 = sitofp i64 %23 to double, !dbg !91] = 1 val:1 type: {[-1]:Float@double}
 constantinst[  %25 = fadd double %bitcast_coercion2, %24, !dbg !92] = 1 val:1 type: {[-1]:Float@double}
 constantinst[  %26 = fsub double %bitcast_coercion2, %25, !dbg !96] = 1 val:1 type: {[-1]:Float@double}
 constantinst[  %27 = fadd double %26, %24, !dbg !98] = 1 val:1 type: {[-1]:Float@double}
 constantinst[  %28 = getelementptr inbounds [2 x double], [2 x double]* %newstruct, i64 0, i64 0, !dbg !99] = 1 val:1 type: {[-1]:Pointer, [-1,-1]:Float@double}
 constantinst[  store double %25, double* %28, align 8, !dbg !99, !tbaa !100, !alias.scope !102, !noalias !103] = 1 val:1 type: {}
 constantinst[  %29 = getelementptr inbounds [2 x double], [2 x double]* %newstruct, i64 0, i64 1, !dbg !99] = 1 val:1 type: {[-1]:Pointer, [-1,-1]:Float@double}
 constantinst[  store double %27, double* %29, align 8, !dbg !99, !tbaa !100, !alias.scope !102, !noalias !103] = 1 val:1 type: {}
 constantinst[  %unbox4 = load i64, i64 addrspace(11)* %12, align 8, !dbg !106, !tbaa !26, !alias.scope !43, !noalias !46] = 1 val:1 type: {[-1]:Integer}
 constantinst[  %30 = sitofp i64 %unbox4 to double, !dbg !106] = 1 val:1 type: {[-1]:Float@double}
 constantinst[  %31 = getelementptr inbounds [2 x double], [2 x double]* %newstruct5, i64 0, i64 0, !dbg !111] = 1 val:1 type: {[-1]:Pointer, [-1,-1]:Float@double}
 constantinst[  store double %30, double* %31, align 8, !dbg !111, !tbaa !100, !alias.scope !102, !noalias !103] = 1 val:1 type: {}
 constantinst[  %memcpy_refined_dst = getelementptr inbounds [2 x double], [2 x double]* %newstruct5, i64 0, i64 1, !dbg !111] = 1 val:1 type: {[-1]:Pointer, [-1,-1]:Float@double}
 constantinst[  store double 0.000000e+00, double* %memcpy_refined_dst, align 8, !dbg !111, !tbaa !100, !alias.scope !102, !noalias !103] = 1 val:1 type: {}
 constantinst[  %32 = addrspacecast [2 x double]* %newstruct to [2 x double] addrspace(11)*, !dbg !109] = 1 val:1 type: {[-1]:Pointer, [-1,-1]:Float@double}
 constantinst[  %33 = addrspacecast [2 x double]* %newstruct5 to [2 x double] addrspace(11)*, !dbg !109] = 1 val:1 type: {[-1]:Pointer, [-1,-1]:Float@double}
 constantinst[  call fastcc void @julia___2382([2 x double]* noalias nocapture noundef nonnull writeonly sret([2 x double]) align 8 dereferenceable(16) %6, [2 x double] addrspace(11)* nocapture noundef nonnull readonly align 8 dereferenceable(16) %32, [2 x double] addrspace(11)* nocapture noundef nonnull readonly align 8 dereferenceable(16) %33) #31, !dbg !109] = 0 val:1 type: {}
 constantinst[  %34 = getelementptr inbounds [2 x i64], [2 x i64] addrspace(11)* %2, i64 0, i64 0, !dbg !114] = 1 val:1 type: {[-1]:Pointer, [-1,-1]:Integer}
 constantinst[  %35 = getelementptr inbounds [2 x i64], [2 x i64] addrspace(11)* %2, i64 0, i64 1, !dbg !115] = 1 val:1 type: {[-1]:Pointer, [-1,-1]:Integer}
 constantinst[  %unbox6 = load i64, i64 addrspace(11)* %34, align 8, !dbg !118, !tbaa !26, !alias.scope !43, !noalias !46] = 1 val:1 type: {[-1]:Integer}
 constantinst[  %36 = sitofp i64 %unbox6 to double, !dbg !118] = 1 val:1 type: {[-1]:Float@double}
 constantinst[  %bitcast_coercion7 = bitcast double %36 to i64, !dbg !122] = 1 val:1 type: {[-1]:Float@double}
 constantinst[  %37 = and i64 %bitcast_coercion7, -134217728, !dbg !125] = 1 val:1 type: {[-1]:Float@double}
 constantinst[  %bitcast_coercion8 = bitcast i64 %37 to double, !dbg !122] = 1 val:1 type: {[-1]:Float@double}
 constantinst[  %38 = fcmp ult double %bitcast_coercion8, 0xC3E0000000000000, !dbg !126] = 1 val:1 type: {[-1]:Integer}
 constantinst[  %39 = fcmp uge double %bitcast_coercion8, 0x43E0000000000000, !dbg !127] = 1 val:1 type: {[-1]:Integer}
 constantinst[  %40 = or i1 %38, %39, !dbg !127] = 1 val:1 type: {[-1]:Integer}
 constantinst[  br i1 %40, label %L60, label %L52, !dbg !127] = 1 val:1 type: {}
 constantinst[  %box33 = call noalias nonnull dereferenceable(8) "enzyme_type"="{[-1]:Pointer, [-1,-1]:Float@double}" {} addrspace(10)* @julia.gc_alloc_obj({}** nonnull %current_task1, i64 noundef 8, {} addrspace(10)* noundef addrspacecast ({}* inttoptr (i64 4839853264 to {}*) to {} addrspace(10)*)) #32, !dbg !131] = 1 val:1 type: {[-1]:Pointer, [-1,-1]:Float@double}
 constantinst[  %41 = bitcast {} addrspace(10)* %box33 to i64 addrspace(10)*, !dbg !131] = 1 val:1 type: {}
 constantinst[  store i64 %14, i64 addrspace(10)* %41, align 8, !dbg !131, !tbaa !132, !alias.scope !136, !noalias !137] = 1 val:1 type: {}
 constantinst[  %42 = call nonnull "enzyme_type"="{[-1]:Pointer}" {} addrspace(10)* ({} addrspace(10)* ({} addrspace(10)*, {} addrspace(10)**, i32, {} addrspace(10)*)*, {} addrspace(10)*, {} addrspace(10)*, ...) @julia.call2({} addrspace(10)* ({} addrspace(10)*, {} addrspace(10)**, i32, {} addrspace(10)*)* noundef nonnull @ijl_invoke, {} addrspace(10)* noundef addrspacecast ({}* inttoptr (i64 4818200976 to {}*) to {} addrspace(10)*), {} addrspace(10)* noundef addrspacecast ({}* inttoptr (i64 4818200384 to {}*) to {} addrspace(10)*), {} addrspace(10)* addrspacecast ({}* inttoptr (i64 4311988784 to {}*) to {} addrspace(10)*), {} addrspace(10)* addrspacecast ({}* inttoptr (i64 4839854064 to {}*) to {} addrspace(10)*), {} addrspace(10)* nofree nonnull %box33) #33, !dbg !131] = 1 val:0 type: {[-1]:Pointer}
 constantinst[  %43 = addrspacecast {} addrspace(10)* %42 to {} addrspace(12)*, !dbg !131] = 1 val:0 type: {}
 constantinst[  call void @ijl_throw({} addrspace(12)* %43) #34, !dbg !131] = 1 val:1 type: {}
 constantinst[  unreachable, !dbg !131] = 1 val:1 type: {}
 constantinst[  %44 = call double @llvm.trunc.f64(double %bitcast_coercion8) #31, !dbg !138] = 1 val:1 type: {[-1]:Float@double}
 constantinst[  %45 = fsub double %bitcast_coercion8, %44, !dbg !142] = 1 val:1 type: {[-1]:Float@double}
 constantinst[  %46 = fcmp une double %45, 0.000000e+00, !dbg !143] = 1 val:1 type: {[-1]:Integer}
 constantinst[  br i1 %46, label %L60, label %L58, !dbg !127] = 1 val:1 type: {}
 constantinst[  %47 = fptosi double %bitcast_coercion8 to i64, !dbg !145] = 1 val:1 type: {[-1]:Integer}
 constantinst[  %48 = freeze i64 %47, !dbg !145] = 1 val:1 type: {[-1]:Integer}
 constantinst[  %49 = sub i64 %unbox6, %48, !dbg !147] = 1 val:1 type: {[-1]:Integer}
 constantinst[  %50 = sitofp i64 %49 to double, !dbg !149] = 1 val:1 type: {[-1]:Float@double}
 constantinst[  %51 = fadd double %bitcast_coercion8, %50, !dbg !150] = 1 val:1 type: {[-1]:Float@double}
 constantinst[  %52 = fsub double %bitcast_coercion8, %51, !dbg !152] = 1 val:1 type: {[-1]:Float@double}
 constantinst[  %53 = fadd double %52, %50, !dbg !154] = 1 val:1 type: {[-1]:Float@double}
 constantinst[  %54 = getelementptr inbounds [2 x double], [2 x double]* %newstruct11, i64 0, i64 0, !dbg !155] = 1 val:1 type: {[-1]:Pointer, [-1,-1]:Float@double}
 constantinst[  store double %51, double* %54, align 8, !dbg !155, !tbaa !100, !alias.scope !102, !noalias !103] = 1 val:1 type: {}
 constantinst[  %55 = getelementptr inbounds [2 x double], [2 x double]* %newstruct11, i64 0, i64 1, !dbg !155] = 1 val:1 type: {[-1]:Pointer, [-1,-1]:Float@double}
 constantinst[  store double %53, double* %55, align 8, !dbg !155, !tbaa !100, !alias.scope !102, !noalias !103] = 1 val:1 type: {}
 constantinst[  %unbox12 = load i64, i64 addrspace(11)* %35, align 8, !dbg !156, !tbaa !26, !alias.scope !43, !noalias !46] = 1 val:1 type: {[-1]:Integer}
 constantinst[  %56 = sitofp i64 %unbox12 to double, !dbg !156] = 1 val:1 type: {[-1]:Float@double}
 constantinst[  %57 = getelementptr inbounds [2 x double], [2 x double]* %newstruct13, i64 0, i64 0, !dbg !160] = 1 val:1 type: {[-1]:Pointer, [-1,-1]:Float@double}
 constantinst[  store double %56, double* %57, align 8, !dbg !160, !tbaa !100, !alias.scope !102, !noalias !103] = 1 val:1 type: {}
 constantinst[  %memcpy_refined_dst14 = getelementptr inbounds [2 x double], [2 x double]* %newstruct13, i64 0, i64 1, !dbg !160] = 1 val:1 type: {[-1]:Pointer, [-1,-1]:Float@double}
 constantinst[  store double 0.000000e+00, double* %memcpy_refined_dst14, align 8, !dbg !160, !tbaa !100, !alias.scope !102, !noalias !103] = 1 val:1 type: {}
 constantinst[  %58 = addrspacecast [2 x double]* %newstruct11 to [2 x double] addrspace(11)*, !dbg !159] = 1 val:1 type: {[-1]:Pointer, [-1,-1]:Float@double}
 constantinst[  %59 = addrspacecast [2 x double]* %newstruct13 to [2 x double] addrspace(11)*, !dbg !159] = 1 val:1 type: {[-1]:Pointer, [-1,-1]:Float@double}
 constantinst[  call fastcc void @julia___2382([2 x double]* noalias nocapture noundef nonnull writeonly sret([2 x double]) align 8 dereferenceable(16) %7, [2 x double] addrspace(11)* nocapture noundef nonnull readonly align 8 dereferenceable(16) %58, [2 x double] addrspace(11)* nocapture noundef nonnull readonly align 8 dereferenceable(16) %59) #31, !dbg !159] = 0 val:1 type: {}
 constantinst[  %60 = getelementptr inbounds [2 x double], [2 x double]* %7, i64 0, i64 0, !dbg !163] = 1 val:0 type: {[-1]:Pointer, [-1,-1]:Float@double}
 constantinst[  %bitcast = load double, double* %60, align 8, !dbg !168, !tbaa !100, !alias.scope !102, !noalias !171] = 0 val:0 type: {[-1]:Float@double}
 constantinst[  %61 = getelementptr inbounds [2 x double], [2 x double]* %7, i64 0, i64 1, !dbg !172] = 1 val:0 type: {[-1]:Pointer, [-1,-1]:Float@double}
 constantinst[  %unbox18 = load double, double* %61, align 8, !dbg !174, !tbaa !100, !alias.scope !102, !noalias !171] = 0 val:0 type: {[-1]:Float@double}
 constantinst[  %62 = icmp slt i64 %4, 0, !dbg !175] = 1 val:1 type: {[-1]:Integer}
 constantinst[  br i1 %62, label %L111, label %L96, !dbg !180] = 1 val:1 type: {}
 constantinst[  %box29 = call noalias nonnull dereferenceable(8) "enzyme_type"="{[-1]:Pointer, [-1,-1]:Float@double}" {} addrspace(10)* @julia.gc_alloc_obj({}** nonnull %current_task1, i64 8, {} addrspace(10)* addrspacecast ({}* inttoptr (i64 4839853264 to {}*) to {} addrspace(10)*)) #32, !dbg !186] = 1 val:1 type: {[-1]:Pointer, [-1,-1]:Float@double}
 constantinst[  %63 = bitcast {} addrspace(10)* %box29 to i64 addrspace(10)*, !dbg !186] = 1 val:1 type: {}
 constantinst[  store i64 %37, i64 addrspace(10)* %63, align 8, !dbg !186, !tbaa !132, !alias.scope !136, !noalias !137] = 1 val:1 type: {}
 constantinst[  %64 = call nonnull "enzyme_type"="{[-1]:Pointer}" {} addrspace(10)* ({} addrspace(10)* ({} addrspace(10)*, {} addrspace(10)**, i32, {} addrspace(10)*)*, {} addrspace(10)*, {} addrspace(10)*, ...) @julia.call2({} addrspace(10)* ({} addrspace(10)*, {} addrspace(10)**, i32, {} addrspace(10)*)* nonnull @ijl_invoke, {} addrspace(10)* addrspacecast ({}* inttoptr (i64 4818200976 to {}*) to {} addrspace(10)*), {} addrspace(10)* addrspacecast ({}* inttoptr (i64 4818200384 to {}*) to {} addrspace(10)*), {} addrspace(10)* addrspacecast ({}* inttoptr (i64 4311988784 to {}*) to {} addrspace(10)*), {} addrspace(10)* addrspacecast ({}* inttoptr (i64 4839854064 to {}*) to {} addrspace(10)*), {} addrspace(10)* nonnull %box29) #33, !dbg !186] = 1 val:0 type: {[-1]:Pointer}
 constantinst[  %65 = addrspacecast {} addrspace(10)* %64 to {} addrspace(12)*, !dbg !186] = 1 val:0 type: {}
 constantinst[  call void @ijl_throw({} addrspace(12)* %65) #31, !dbg !186] = 1 val:1 type: {}
 constantinst[  unreachable, !dbg !186] = 1 val:1 type: {}
 constantinst[  %66 = icmp slt i64 %5, 1, !dbg !187] = 1 val:1 type: {[-1]:Integer}
 constantinst[  %67 = call i64 @llvm.smax.i64(i64 %4, i64 1) #31, !dbg !188] = 1 val:1 type: {[-1]:Integer}
 constantinst[  %68 = icmp slt i64 %67, %5, !dbg !188] = 1 val:1 type: {[-1]:Integer}
 constantinst[  %value_phi21.off0 = select i1 %66, i1 true, i1 %68, !dbg !188] = 1 val:1 type: {[-1]:Integer}
 constantinst[  br i1 %value_phi21.off0, label %L107, label %L105, !dbg !188] = 1 val:1 type: {}
 constantinst[  %69 = icmp slt i64 %3, 0, !dbg !189] = 1 val:1 type: {[-1]:Integer}
 constantinst[  %70 = sub i64 0, %3, !dbg !192] = 1 val:1 type: {[-1]:Anything}
 constantinst[  %71 = icmp ugt i64 %70, 63, !dbg !193] = 1 val:1 type: {[-1]:Integer}
 constantinst[  %72 = lshr i64 -1, %70, !dbg !193] = 1 val:1 type: {[-1]:Anything}
 constantinst[  %73 = select i1 %71, i64 0, i64 %72, !dbg !193] = 1 val:1 type: {[-1]:Anything}
 constantinst[  %74 = icmp ugt i64 %3, 63, !dbg !195] = 1 val:1 type: {[-1]:Integer}
 constantinst[  %75 = shl nsw i64 -1, %3, !dbg !195] = 1 val:1 type: {[-1]:Anything}
 constantinst[  %76 = select i1 %74, i64 0, i64 %75, !dbg !195] = 1 val:1 type: {[-1]:Anything}
 constantinst[  %77 = select i1 %69, i64 %73, i64 %76, !dbg !196] = 1 val:1 type: {[-1]:Anything}
 constantinst[  %bitcast_coercion15 = bitcast double %bitcast to i64, !dbg !168] = 0 val:0 type: {[-1]:Float@double}
 constantinst[  %78 = and i64 %77, %bitcast_coercion15, !dbg !198] = 0 val:0 type: {[-1]:Float@double}
 constantinst[  %bitcast_coercion16 = bitcast i64 %78 to double, !dbg !168] = 0 val:0 type: {[-1]:Float@double}
 constantinst[  %79 = fsub double %bitcast, %bitcast_coercion16, !dbg !199] = 0 val:0 type: {[-1]:Float@double}
 constantinst[  %80 = fadd double %unbox18, %79, !dbg !174] = 0 val:0 type: {[-1]:Float@double}
 constantinst[  %newstruct22.sroa.0.sroa.0.0.newstruct22.sroa.0.0..sroa_cast36.sroa_cast = bitcast [2 x double]* %6 to i8*, !dbg !200] = 1 val:0 type: {[-1]:Pointer, [-1,-1]:Float@double}
 constantinst[  %81 = bitcast { [2 x double], [2 x double], i64, i64 }* %0 to i8*] = 1 val:0 type: {[-1]:Pointer, [-1,0]:Float@double, [-1,8]:Float@double, [-1,16]:Float@double, [-1,24]:Float@double, [-1,32]:Integer, [-1,33]:Integer, [-1,34]:Integer, [-1,35]:Integer, [-1,36]:Integer, [-1,37]:Integer, [-1,38]:Integer, [-1,39]:Integer, [-1,40]:Integer, [-1,41]:Integer, [-1,42]:Integer, [-1,43]:Integer, [-1,44]:Integer, [-1,45]:Integer, [-1,46]:Integer, [-1,47]:Integer}
 constantinst[  call void @llvm.memcpy.p0i8.p0i8.i64(i8* noundef nonnull align 8 dereferenceable(16) %81, i8* noundef nonnull align 8 dereferenceable(16) %newstruct22.sroa.0.sroa.0.0.newstruct22.sroa.0.0..sroa_cast36.sroa_cast, i64 16, i1 false) #31, !dbg !200, !noalias !201] = 0 val:1 type: {}
 constantinst[  %newstruct22.sroa.0.sroa.2.0.newstruct22.sroa.0.0..sroa_cast.sroa_idx42 = getelementptr inbounds { [2 x double], [2 x double], i64, i64 }, { [2 x double], [2 x double], i64, i64 }* %0, i64 0, i32 1, i64 0, !dbg !200] = 1 val:0 type: {[-1]:Pointer, [-1,0]:Float@double}
 constantinst[  %82 = bitcast double* %newstruct22.sroa.0.sroa.2.0.newstruct22.sroa.0.0..sroa_cast.sroa_idx42 to i64*, !dbg !200] = 1 val:0 type: {[-1]:Pointer, [-1,0]:Float@double}
 constantinst[  store i64 %78, i64* %82, align 8, !dbg !200, !noalias !201] = 0 val:1 type: {}
 constantinst[  %newstruct22.sroa.0.sroa.3.0.newstruct22.sroa.0.0..sroa_cast.sroa_idx43 = getelementptr inbounds { [2 x double], [2 x double], i64, i64 }, { [2 x double], [2 x double], i64, i64 }* %0, i64 0, i32 1, i64 1, !dbg !200] = 1 val:0 type: {[-1]:Pointer, [-1,0]:Float@double}
 constantinst[  store double %80, double* %newstruct22.sroa.0.sroa.3.0.newstruct22.sroa.0.0..sroa_cast.sroa_idx43, align 8, !dbg !200, !noalias !201] = 0 val:1 type: {}
 constantinst[  %newstruct22.sroa.3.0..sroa_idx38 = getelementptr inbounds { [2 x double], [2 x double], i64, i64 }, { [2 x double], [2 x double], i64, i64 }* %0, i64 0, i32 2, !dbg !200] = 1 val:0 type: {[-1]:Pointer, [-1,0]:Integer, [-1,1]:Integer, [-1,2]:Integer, [-1,3]:Integer, [-1,4]:Integer, [-1,5]:Integer, [-1,6]:Integer, [-1,7]:Integer}
 constantinst[  store i64 %4, i64* %newstruct22.sroa.3.0..sroa_idx38, align 8, !dbg !200, !noalias !201] = 1 val:1 type: {}
 constantinst[  %newstruct22.sroa.4.0..sroa_idx39 = getelementptr inbounds { [2 x double], [2 x double], i64, i64 }, { [2 x double], [2 x double], i64, i64 }* %0, i64 0, i32 3, !dbg !200] = 1 val:0 type: {[-1]:Pointer, [-1,0]:Integer, [-1,1]:Integer, [-1,2]:Integer, [-1,3]:Integer, [-1,4]:Integer, [-1,5]:Integer, [-1,6]:Integer, [-1,7]:Integer}
 constantinst[  store i64 %5, i64* %newstruct22.sroa.4.0..sroa_idx39, align 8, !dbg !200, !noalias !201] = 1 val:1 type: {}
 constantinst[  ret void, !dbg !200] = 1 val:1 type: {}
 constantinst[  %83 = call noalias nonnull "enzyme_inactive" "enzyme_type"="{[-1]:Pointer, [-1,-1]:Integer}" {} addrspace(10)* @ijl_box_int64(i64 signext %4) #35, !dbg !188] = 1 val:1 type: {[-1]:Pointer, [-1,-1]:Integer}
 constantinst[  %84 = call noalias nonnull "enzyme_inactive" "enzyme_type"="{[-1]:Pointer, [-1,-1]:Integer}" {} addrspace(10)* @ijl_box_int64(i64 signext %5) #35, !dbg !188] = 1 val:1 type: {[-1]:Pointer, [-1,-1]:Integer}
 constantinst[  %85 = call nonnull "enzyme_type"="{[-1]:Pointer}" {} addrspace(10)* ({} addrspace(10)* ({} addrspace(10)*, {} addrspace(10)**, i32, {} addrspace(10)*)*, {} addrspace(10)*, {} addrspace(10)*, ...) @julia.call2({} addrspace(10)* ({} addrspace(10)*, {} addrspace(10)**, i32, {} addrspace(10)*)* nonnull @ijl_invoke, {} addrspace(10)* addrspacecast ({}* inttoptr (i64 4796769728 to {}*) to {} addrspace(10)*), {} addrspace(10)* addrspacecast ({}* inttoptr (i64 4772842608 to {}*) to {} addrspace(10)*), {} addrspace(10)* addrspacecast ({}* inttoptr (i64 4906089056 to {}*) to {} addrspace(10)*), {} addrspace(10)* nonnull %83, {} addrspace(10)* addrspacecast ({}* inttoptr (i64 4906089024 to {}*) to {} addrspace(10)*), {} addrspace(10)* nonnull %84) #36, !dbg !188] = 1 val:1 type: {[-1]:Pointer}
 constantinst[  %box = call noalias nonnull dereferenceable(8) "enzyme_inactive" "enzyme_type"="{[-1]:Pointer, [-1,-1]:Pointer}" {} addrspace(10)* @julia.gc_alloc_obj({}** nonnull %current_task1, i64 8, {} addrspace(10)* addrspacecast ({}* inttoptr (i64 4838711792 to {}*) to {} addrspace(10)*)) #32, !dbg !188] = 1 val:1 type: {[-1]:Pointer, [-1,-1]:Pointer}
 constantinst[  %86 = bitcast {} addrspace(10)* %box to [1 x {} addrspace(10)*] addrspace(10)*, !dbg !188] = 1 val:1 type: {}
 constantinst[  %87 = getelementptr [1 x {} addrspace(10)*], [1 x {} addrspace(10)*] addrspace(10)* %86, i64 0, i64 0, !dbg !188] = 1 val:1 type: {}
 constantinst[  store {} addrspace(10)* %85, {} addrspace(10)* addrspace(10)* %87, align 8, !dbg !188, !tbaa !132, !alias.scope !136, !noalias !137] = 1 val:1 type: {}
 constantinst[  %88 = addrspacecast {} addrspace(10)* %box to {} addrspace(12)*, !dbg !188] = 1 val:1 type: {}
 constantinst[  call void @ijl_throw({} addrspace(12)* %88) #31, !dbg !188] = 1 val:1 type: {}
 constantinst[  unreachable, !dbg !188] = 1 val:1 type: {}
 constantinst[  %89 = call nonnull {} addrspace(10)* @julia_string_2373({} addrspace(10)* nofree noundef nonnull align 32 addrspacecast ({}* inttoptr (i64 4906089120 to {}*) to {} addrspace(10)*), i64 signext %4) #31, !dbg !180] = 1 val:1 type: {[-1]:Pointer}
 constantinst[  %box25 = call noalias nonnull dereferenceable(8) "enzyme_inactive" "enzyme_type"="{[-1]:Pointer, [-1,-1]:Pointer}" {} addrspace(10)* @julia.gc_alloc_obj({}** nonnull %current_task1, i64 8, {} addrspace(10)* addrspacecast ({}* inttoptr (i64 4838711792 to {}*) to {} addrspace(10)*)) #32, !dbg !180] = 1 val:1 type: {[-1]:Pointer, [-1,-1]:Pointer}
 constantinst[  %90 = bitcast {} addrspace(10)* %box25 to [1 x {} addrspace(10)*] addrspace(10)*, !dbg !180] = 1 val:1 type: {}
 constantinst[  %91 = getelementptr [1 x {} addrspace(10)*], [1 x {} addrspace(10)*] addrspace(10)* %90, i64 0, i64 0, !dbg !180] = 1 val:1 type: {}
 constantinst[  store {} addrspace(10)* %89, {} addrspace(10)* addrspace(10)* %91, align 8, !dbg !180, !tbaa !132, !alias.scope !136, !noalias !137] = 1 val:1 type: {}
 constantinst[  %92 = addrspacecast {} addrspace(10)* %box25 to {} addrspace(12)*, !dbg !180] = 1 val:1 type: {}
 constantinst[  call void @ijl_throw({} addrspace(12)* %92) #31, !dbg !180] = 1 val:1 type: {}
 constantinst[  unreachable, !dbg !180] = 1 val:1 type: {}
cannot handle unknown binary operator:   %78 = and i64 %77, %bitcast_coercion15, !dbg !198


Stacktrace:
 [1] &
   @ ./int.jl:347
 [2] truncmask
   @ ./twiceprecision.jl:29
 [3] truncbits
   @ ./twiceprecision.jl:34
 [4] twiceprecision
   @ ./twiceprecision.jl:247
 [5] TwicePrecision
   @ ./twiceprecision.jl:236
 [6] steprangelen_hp
   @ ./twiceprecision.jl:344


Stacktrace:
  [1] throwerr(cstr::Cstring)
    @ Enzyme.Compiler ~/.julia/packages/Enzyme/SiyIj/src/compiler.jl:1623
  [2] reinterpret
    @ ./essentials.jl:581 [inlined]
  [3] truncmask
    @ ./twiceprecision.jl:29 [inlined]
  [4] truncbits
    @ ./twiceprecision.jl:34 [inlined]
  [5] twiceprecision
    @ ./twiceprecision.jl:247 [inlined]
  [6] TwicePrecision
    @ ./twiceprecision.jl:236 [inlined]
  [7] steprangelen_hp
    @ ./twiceprecision.jl:344
  [8] floatrange
    @ ./twiceprecision.jl:390
  [9] Colon
    @ ./twiceprecision.jl:414
 [10] f
    @ ~/Desktop/current.jl:149 [inlined]
 [11] fwddiffejulia_f_2354wrap
    @ ~/Desktop/current.jl:0
 [12] macro expansion
    @ ~/.julia/packages/Enzyme/SiyIj/src/compiler.jl:6622 [inlined]
 [13] enzyme_call
    @ ~/.julia/packages/Enzyme/SiyIj/src/compiler.jl:6223 [inlined]
 [14] ForwardModeThunk
    @ ~/.julia/packages/Enzyme/SiyIj/src/compiler.jl:6103 [inlined]
 [15] autodiff
    @ ~/.julia/packages/Enzyme/SiyIj/src/Enzyme.jl:417 [inlined]
 [16] autodiff
    @ ~/.julia/packages/Enzyme/SiyIj/src/Enzyme.jl:333 [inlined]
 [17] autodiff(mode::ForwardMode{FFIABI}, f::typeof(f), args::Duplicated{Float64})
    @ Enzyme ~/.julia/packages/Enzyme/SiyIj/src/Enzyme.jl:318
 [18] top-level scope
    @ ~/Desktop/current.jl:153

@ChrisRackauckas
Contributor

Inside the range code there is a call to Base.truncbits, which is the truncation back down from the extended float type to Float64, and this is where the & shows up. It is effectively an identity function except for the last bit correction, so treating it as an identity function would be sufficient for automatic differentiation. However, for some reason Enzyme passes zero dvals into it, i.e. it treats the inputs as constant before execution even gets there, so a rule for truncbits alone simply results in a zero derivative.

using Enzyme, ReverseDiff, Tracker
import .EnzymeRules: forward, reverse, augmented_primal
using .EnzymeRules

function forward(func::Const{typeof(Base.truncbits)}, ::Type{<:Duplicated}, x::Duplicated, mask)
    println("Using custom rule!")
    maskval = if hasproperty(mask, :val)
        mask.val
    else
        mask
    end

    ret = func.val(x.val, maskval)
    @show x.dval
    return Duplicated(ret, one(ret))
end

function augmented_primal(config::ConfigWidth{1}, func::Const{typeof(Base.truncbits)}, ::Type{<:Union{Const,Active}},
                            x::Union{Const,Active}, mask)
    println("In custom augmented primal rule.")

    maskval = if hasproperty(mask, :val)
        mask.val
    else
        mask
    end

    # Compute primal
    if needs_primal(config)
        primal = func.val(x.val, maskval)
    else
        primal = nothing
    end

    # Return an AugmentedReturn object with shadow = nothing
    return AugmentedReturn(primal, nothing, nothing)
end

function reverse(config::ConfigWidth{1}, func::Const{typeof(Base.truncbits)}, ::Type{<:Union{Const,Active}}, dret::Union{Active,Const},
                 x::Union{Const,Active}, mask)
    println("In custom reverse rule.")
    return (one(x.val), nothing)
end

function reverse(config::ConfigWidth{1}, func::Const{typeof(Base.truncbits)}, ::Type{<:Active}, dret::Nothing,
                 x::Active, mask)
    println("In custom reverse rule.")
    return (one(x.val), nothing)
end

function reverse(config::ConfigWidth{1}, func::Const{typeof(Base.truncbits)}, ::Type{<:Const}, dret::Nothing,
                 x::Const, mask)
    println("In custom reverse rule.")
    return (nothing, nothing)
end

function reverse(config::ConfigWidth{1}, func::Const{typeof(Base.truncbits)}, ::Active, dret::Nothing,
                 x::Active, mask)
    println("In custom reverse rule.")
    return (one(x.val), nothing)
end

function reverse(config::ConfigWidth{1}, func::Const{typeof(Base.truncbits)}, ::Const, dret::Nothing,
                 x::Const, mask)
    println("In custom reverse rule.")
    return (nothing, nothing)
end

function f(x)
    ts = Array(0.0:x:3.0)
    sum(ts)
end
f(0.25)
Enzyme.autodiff(Forward, f,  Duplicated(0.25, 1.0))
Enzyme.autodiff(Reverse, f,  Active, Active(0.25))
julia> Enzyme.autodiff(Forward, f,  Duplicated(0.25, 1.0))
Using custom rule!
x.dval = 0.0
(0.0,)

julia> Enzyme.autodiff(Reverse, f,  Active, Active(0.25))
In custom augmented primal rule.
In custom augmented primal rule.
In custom augmented primal rule.
In custom reverse rule.
In custom reverse rule.
In custom reverse rule.
((0.0,),)

To me that seems like a bug in the activity detection, possibly caused by some of the reinterprets back from UInt representations. I'll see if I can fix this with a rule targeted a bit higher.
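
To make the & concrete, here is a minimal sketch of the kind of low-bit masking involved. The helper name mask_low_bits is hypothetical and this is not Base's implementation; it only assumes that the operation clears the lowest nb bits of the Float64 through its UInt64 bit pattern, which is what the `and i64 ..., -134217728` in the IR dump above suggests (-134217728 is -2^27).

```julia
# Hypothetical stand-in for the masking step; not Base's implementation.
# Clears the lowest `nb` bits of a Float64 through its UInt64 bit pattern.
function mask_low_bits(x::Float64, nb::Integer)
    mask = typemax(UInt64) << nb          # ones everywhere except the low nb bits
    return reinterpret(Float64, reinterpret(UInt64, x) & mask)
end

x = 0.1
y = mask_low_bits(x, 27)                  # 27 matches the -134217728 mask in the IR

# The result differs from x only in the low-order bits...
@assert abs(y - x) < 1e-8
# ...and a value with no low mantissa bits set passes through unchanged.
@assert mask_low_bits(1.0, 27) == 1.0
```

Because the integer & sits between two reinterprets, a tool working at the LLVM level sees an integer operation applied to float bits, which appears to be what triggers the "unknown binary operator" message.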

@ChrisRackauckas
Contributor

For reference, targeting it like this with other AD systems works well:

Base.div(x::Tracker.TrackedReal, y::Tracker.TrackedReal, r::RoundingMode) = div(Tracker.value(x), Tracker.value(y), r)
_y, back = Tracker.forward(f, 0.25)
back(1)
julia> back(1)
(78.0 (tracked),)

@ChrisRackauckas
Contributor

In my journey here, one level up:

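# Unwrap the value from an Enzyme annotation (Const, Active, ...); plain values pass through.
getval(x) = hasproperty(x, :val) ? x.val : x
function forward(func::Const{typeof(Base.steprangelen_hp)}, ::Type{<:Duplicated}, outtype::Const{Type{Float64}}, ref::Union{Const,Active}, step::Union{Const,Active}, nb, len, offset)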
    println("Using custom rule!")
    ret = func.val(getval.((outtype, ref, step, nb, len, offset))...)
    @show outtype, ref, step, nb, len, offset
    start = ref isa Const ? zero(eltype(ret)) : one(eltype(ret))
    dstep = step isa Const ? zero(eltype(ret)) : one(eltype(ret))

    return Duplicated(ret, StepRangeLen(Base.TwicePrecision(start), Base.TwicePrecision(dstep), length(ret)))
end

All of the values are Int at this level, though, so again Enzyme keeps them const. I think I know the right target now: it has to be the : dispatch itself, since

# Construct range for rational start=start_n/den, step=step_n/den
function floatrange(::Type{T}, start_n::Integer, step_n::Integer, len::Integer, den::Integer) where T
    len = len + 0 # promote with Int
    if len < 2 || step_n == 0
        return steprangelen_hp(T, (start_n, den), (step_n, den), 0, len, oneunit(len))
    end
    # index of smallest-magnitude value
    L = typeof(len)
    imin = clamp(round(typeof(len), -start_n/step_n+1), oneunit(L), len)
    # Compute smallest-magnitude element to 2x precision
    ref_n = start_n+(imin-1)*step_n  # this shouldn't overflow, so don't check
    nb = nbitslen(T, len, imin)
    @show steprangelen_hp(T, (ref_n, den), (step_n, den), nb, len, imin)
end

so everything below that is already treated as const.
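
A small sketch of why the integers carry no derivative information. It assumes a step of 0.25 and a denominator of 4 (both chosen for illustration), and uses the same round(Int, step*den) form that Base's float (:) method uses before calling floatrange. The variable names are made up for this example.

```julia
# Assumed inputs for illustration: step 0.25 with denominator 4.
den      = 4
step_val = 0.25
step_n   = round(Int, step_val * den)

# A tiny perturbation of the float input leaves the integer unchanged,
# so nothing computed from step_n can carry a derivative with respect to step_val.
step_n_perturbed = round(Int, (step_val + 1e-9) * den)

@assert step_n isa Int
@assert step_n == step_n_perturbed == 1
```

Since the rounded integer is locally constant in the float input, marking everything downstream of it as const is consistent from the point of view of activity analysis, even though the resulting range does depend on the input.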

@ChrisRackauckas
Contributor

Okay, moving a level higher, I got a forward rule to work:

using Enzyme, ReverseDiff, Tracker
import .EnzymeRules: forward, reverse, augmented_primal
using .EnzymeRules

getval(x) = hasproperty(x, :val) ? x.val : x
function forward(func::Const{Colon}, ::Type{<:Duplicated}, start::Union{Const, Active}, step::Union{Const, Active}, stop::Union{Const, Active})
    ret = func.val(getval.((start, step, stop))...)
    dstart = start isa Const ? zero(eltype(ret)) : one(eltype(ret))
    dstep = step isa Const ? zero(eltype(ret)) : one(eltype(ret))

    return Duplicated(ret, range(dstart, step=dstep, length=length(ret)))
end

function f1(x)
    ts = Array(0.0:x:3.0)
    sum(ts)
end
function f2(x)
    ts = Array(0.0:.25:3.0)
    sum(ts) + x
end
function f3(x)
    ts = Array(x:.25:3.0)
    sum(ts)
end
function f4(x)
    ts = Array(0.0:.25:x)
    sum(ts)
end
f1(0.25)
Enzyme.autodiff(Forward, f1,  Duplicated(0.25, 1.0)) == (78,)
Enzyme.autodiff(Forward, f2,  Duplicated(0.25, 1.0)) == (1.0,)
Enzyme.autodiff(Forward, f3,  Duplicated(0.25, 1.0)) == (12,)
Enzyme.autodiff(Forward, f4,  Duplicated(3.0, 1.0)) == (0,)

using ForwardDiff
ForwardDiff.derivative(f1, 0.25)
ForwardDiff.derivative(f2, 0.25)
ForwardDiff.derivative(f3, 0.25)
ForwardDiff.derivative(f4, 3.0)

🎉
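
As a sanity check on the expected values 78 and 12, here is a short sketch that derives them by hand. It assumes the length of the range is held fixed while differentiating, which is the convention the forward rule above follows.

```julia
# With the length held fixed, 0.0:x:3.0 at x = 0.25 has elements k*x for
# k = 0, ..., n-1, so d(sum)/dx is the sum of the indices.
n = length(0.0:0.25:3.0)
@assert n == 13
@assert sum(0:n-1) == 78              # expected derivative for f1

# For x:0.25:3.0 at x = 0.25 every element shifts with x, so d(sum)/dx is the length.
@assert length(0.25:0.25:3.0) == 12   # expected derivative for f3
```

By the same reasoning, f2 gives 1 from the `+ x` term alone, and f4 gives 0 because moving the stop does not change any element as long as the length stays the same.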

@ChrisRackauckas
Contributor

For the reverse mode, I need to figure out how to make Enzyme run a custom piece of code. The underlying problem is that malformed ranges are lossy:

julia> 10:1:1
10:1:9

This comes up because Enzyme does a very naive construction of the dret range for the reverse pass. You can see it like this:

using Enzyme, ReverseDiff, Tracker
import .EnzymeRules: reverse, augmented_primal
using .EnzymeRules

function augmented_primal(config::ConfigWidth{1}, func::Const{Colon}, ::Type{<:Active},
                          start, step, stop)
    println("In custom augmented primal rule.")
    # Compute primal
    if needs_primal(config)
        primal = func.val(start.val, step.val, stop.val)
    else
        primal = nothing
    end
    return AugmentedReturn(primal, nothing, nothing)
end

function reverse(config::ConfigWidth{1}, func::Const{Colon}, dret, tape::Nothing,
                 start, step, stop)
    println("In custom reverse rule.")
    _dret = @show dret.val
    #fixedreverse = if _dret.start > _dret.stop && _dret.step > 0
    #     _dret.stop:_dret.step:_dret.start
    #else
    #    _dret
    #end
    dstart = start isa Const ? nothing : one(eltype(dret.val))
    dstep = step isa Const ? nothing : one(eltype(dret.val))
    dstop = stop isa Const ? nothing : zero(eltype(dret.val))
    return (dstart, dstep, dstop)
end

Enzyme.autodiff(Reverse, f1,  Active, Active(0.25))
Enzyme.autodiff(Reverse, f2,  Active, Active(0.25)) == ((1.0,),)
Enzyme.autodiff(Reverse, f3,  Active, Active(0.25))
Enzyme.autodiff(Reverse, f4,  Active, Active(0.25)) == ((0.0,),)

using ForwardDiff
ForwardDiff.derivative(f1, 0.25)
ForwardDiff.derivative(f2, 0.25)
ForwardDiff.derivative(f3, 0.25)
ForwardDiff.derivative(f4, 3.0)

That's the set of test cases. For the first one you can see the problem:

julia> Enzyme.autodiff(Reverse, f1,  Active, Active(0.25))
In custom augmented primal rule.
(ref, step, len, offset) = (Base.TwicePrecision{Float64}(0.0, 0.0), Base.TwicePrecision{Float64}(0.25, 0.0), 13, 1)
primal = 0.0:0.25:3.0
In custom reverse rule.
dret.val = 182.0:156.0:26.0
((1.0,),)

Basically dret.val = 182.0:156.0:26.0, where the 26.0 is not the true value. Same as

julia> 10:1:1
10:1:9

the 26 is simply 182-156. But to construct the adjoint I need the real value. Since it's just the identity, I believe it only needs sum(dret.val), but that's always 0 for the malformed ranges that are naively reversed.
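
The loss of information can be checked directly. This is a minimal sketch using only the two ranges already shown in this comment:

```julia
r = 10:1:1                 # start is past stop for a positive step
@assert isempty(r)
@assert last(r) == 9       # normalized to start - step; the original stop (1) is gone
@assert sum(r) == 0        # so any sum over it is zero

# The same holds for the float range reported as dret.val.
d = 182.0:156.0:26.0
@assert length(d) == 0
@assert sum(d) == 0.0
```

Once the range has been constructed, the stop value that was originally passed in cannot be recovered from it, which is why the adjoint cannot be rebuilt from dret.val alone.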

I can patch Julia so that it is not lossy here, by using Revise on Base/twiceprecision.jl:

function (:)(start::T, step::T, stop::T) where T<:IEEEFloat
    step == 0 && throw(ArgumentError("range step cannot be zero"))
    # see if the inputs have exact rational approximations (and if so,
    # perform all computations in terms of the rationals)
    step_n, step_d = rat(step)
    if step_d != 0 && T(step_n/step_d) == step
        start_n, start_d = rat(start)
        stop_n, stop_d = rat(stop)
        if start_d != 0 && stop_d != 0 &&
                T(start_n/start_d) == start && T(stop_n/stop_d) == stop
            den = lcm_unchecked(start_d, step_d) # use same denominator for start and step
            m = maxintfloat(T, Int)
            if den != 0 && abs(start*den) <= m && abs(step*den) <= m &&  # will round succeed?
                    rem(den, start_d) == 0 && rem(den, step_d) == 0      # check lcm overflow
                start_n = round(Int, start*den)
                step_n = round(Int, step*den)
                len = max(0, Int(div(den*stop_n - stop_d*start_n + step_n*stop_d, step_n*stop_d)))
                # Integer ops could overflow, so check that this makes sense
                if isbetween(start, start + (len-1)*step, stop + step/2) &&
                        !isbetween(start, start + len*step, stop)
                    # Return a 2x precision range
                    return floatrange(T, start_n, step_n, len, den)
                end
            end
        end
    end
    # Fallback, taking start and step literally
    # n.b. we use Int as the default length type for IEEEFloats
    lf = (stop-start)/step
    if lf < 0
        len = 0
    elseif lf == 0
        len = 1
    else
        len = round(Int, lf) + 1
        stop′ = start + (len-1)*step
        # if we've overshot the end, subtract one:
        len -= (start < stop < stop′) + (start > stop > stop′)
    end
    steprangelen_hp(T, start, step, 0, len, 1)
end

becomes:

function (:)(start::T, step::T, stop::T) where T<:IEEEFloat
    step == 0 && throw(ArgumentError("range step cannot be zero"))
    # see if the inputs have exact rational approximations (and if so,
    # perform all computations in terms of the rationals)
    step_n, step_d = rat(step)
    if step_d != 0 && T(step_n/step_d) == step
        start_n, start_d = rat(start)
        stop_n, stop_d = rat(stop)
        if start_d != 0 && stop_d != 0 &&
                T(start_n/start_d) == start && T(stop_n/stop_d) == stop
            den = lcm_unchecked(start_d, step_d) # use same denominator for start and step
            m = maxintfloat(T, Int)
            if den != 0 && abs(start*den) <= m && abs(step*den) <= m &&  # will round succeed?
                    rem(den, start_d) == 0 && rem(den, step_d) == 0      # check lcm overflow
                start_n = round(Int, start*den)
                step_n = round(Int, step*den)
                len = max(0, Int(div(den*stop_n - stop_d*start_n + step_n*stop_d, step_n*stop_d)))
                # Integer ops could overflow, so check that this makes sense
                if isbetween(start, start + (len-1)*step, stop + step/2) &&
                        !isbetween(start, start + len*step, stop)
                    # Return a 2x precision range
                    return floatrange(T, start_n, step_n, len, den)
                end
            end
        end
    end
    # Fallback, taking start and step literally
    # n.b. we use Int as the default length type for IEEEFloats
    lf = (stop-start)/step
    #if lf < 0
    #    len = 0
    #elseif lf == 0
    #    len = 1
    #else
        len = round(Int, lf) + 1
        stop′ = start + (len-1)*step
        # if we've overshot the end, subtract one:
        len -= (start < stop < stop′) + (start > stop > stop′)
    #end
    if len < 0
        step = -step
        len = -len + 2
    end
    steprangelen_hp(T, start, step, 0, len, 1)
end

and with this patch:

julia> 10.0:1.0:1.0
(ref, step, len, offset) = (Base.TwicePrecision{Float64}(10.0, 0.0), Base.TwicePrecision{Float64}(-1.0, 0.0), 10, 1)
10.0:-1.0:1.0
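
The fallback branch of the patch can be isolated as a small standalone sketch. `signed_fallback` is a hypothetical helper name, not part of Base; it only reproduces the length and step arithmetic from the patched fallback and skips the rational fast path, so it is not a drop-in replacement for `(:)`.

```julia
# Hypothetical helper: mirrors only the patched fallback arithmetic.
function signed_fallback(start::Float64, step::Float64, stop::Float64)
    step == 0 && throw(ArgumentError("range step cannot be zero"))
    lf = (stop - start) / step
    len = round(Int, lf) + 1
    stop′ = start + (len - 1) * step
    # if we've overshot the end, subtract one
    len -= (start < stop < stop′) + (start > stop > stop′)
    if len < 0
        # reversed input: flip the step and recover a positive length
        step = -step
        len = -len + 2
    end
    return (step, len)
end

signed_fallback(10.0, 1.0, 1.0)   # (-1.0, 10), matching the REPL output above
signed_fallback(0.0, 0.25, 3.0)   # (0.25, 13)
```

For a well-formed input the helper leaves the step untouched, so only reversed inputs take the sign-flip path.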

So bueno, this lets me retain the information. However, when I Revise this in, Enzyme does not seem to use it in its construction of dret.val:

In custom augmented primal rule.
In custom reverse rule.
dret.val = 182.0:156.0:26.0
((1.0,),)

In custom augmented primal rule.
In custom reverse rule.
dret.val = 182.0:156.0:26.0
true

In custom augmented primal rule.
In custom reverse rule.
dret.val = 156.0:132.0:24.0
((1.0,),)

In custom augmented primal rule.
In custom reverse rule.
dret.val = 6.0:2.0:4.0
true

You can see it's still using some constructor that forces the malformed range to be lossy, and thus the rule cannot be written. So there are two paths here:

  1. @wsmoses, is there a way to make Enzyme run a custom version of the primal code so that it is not lossy here?
  2. It looks like the lossy choice stems from JuliaLang/julia@2065842, back in the v0.6 days. @StefanKarpinski, would Base be okay with accepting the construction of this malformed version so that it can be semantically reconstructed within the reverse pass? Enzyme does this construction without knowing the semantics of ranges: it's just a programmatic action on the struct values themselves, which ends up not making sense as a range. That's okay, but I can't "fix" the values in the adjoint because this constructor choice has already dropped the values required to make the fix.

wsmoses commented Jul 14, 2024

I'm not sure I follow. Enzyme wouldn't construct the dval by calling this function; it creates a zeroed tuple and accumulates (+=) the values from the uses.

wsmoses commented Jul 15, 2024

@ChrisRackauckas do you want to move this to a PR to make it easier to comment?

wsmoses commented Jul 21, 2024

bump @ChrisRackauckas can you move this to a PR?

ChrisRackauckas added a commit to ChrisRackauckas/Enzyme.jl that referenced this issue Jul 21, 2024
This is part 1 of solving EnzymeAD#274. It adds the forward mode rules, as those are simpler. A separate PR will add the WIP reverse mode rules, as those seem to be a bit more complex.
ChrisRackauckas added a commit to ChrisRackauckas/Enzyme.jl that referenced this issue Jul 21, 2024
This is part 1 of solving EnzymeAD#274. It adds the forward mode rules, as those are simpler. A separate PR will add the WIP reverse mode rules, as those seem to be a bit more complex.

Add missing `@test`

don't forget the rule
ChrisRackauckas added a commit to ChrisRackauckas/Enzyme.jl that referenced this issue Jul 21, 2024
This is the second PR to fix EnzymeAD#274. It's separate because I think the forward mode one can be merged without problems, while this one may take a little more time.

The crux of why this one is hard is how Julia deals with malformed ranges.

```
Basically dret.val = 182.0:156.0:26.0, the 26.0 is not the true value. Same as

julia> 10:1:1
10:1:9
```

Because of that behavior, the reverse `dret` does not actually carry the information about its final point, and its length is "incorrect" because the constructor has changed it. To "fix" the reverse, we'd want to flip `step` to negative and then use the same start/stop, but that information is already lost, so it cannot be fixed within the rule. You can see the commented-out code that would do the fixing if the information were there; without it we cannot get a correctly sized reversed range for the rule.

But it's a bit puzzling to figure out how to remove that behavior. In Base Julia it seems to be done in `function (:)(start::T, step::T, stop::T) where T<:IEEEFloat`. As I showed in the issue, I can overload that function and the behavior goes away, but Enzyme's constructed range still has the truncation behavior, which means I missed a spot or something.
ChrisRackauckas added a commit to ChrisRackauckas/Enzyme.jl that referenced this issue Jul 21, 2024
namespace ConfigWidth

namespace

namespace needs_primal

namespace AugmentedReturn
wsmoses added a commit that referenced this issue Jul 22, 2024
* Add internal forward-mode rules for ranges

This is part 1 of solving #274. It adds the forward mode rules, as those are simpler. A separate PR will add the WIP reverse mode rules, as those seem to be a bit more complex.

Add missing `@test`

don't forget the rule

* namespace

* Update internal_rules.jl

* Update internal_rules.jl

* Update src/internal_rules.jl

* Update internal_rules.jl

* Update internal_rules.jl

---------

Co-authored-by: William Moses <gh@wsmoses.com>
wsmoses pushed a commit to ChrisRackauckas/Enzyme.jl that referenced this issue Aug 26, 2024
namespace ConfigWidth

namespace

namespace needs_primal

namespace AugmentedReturn
wsmoses added a commit that referenced this issue Aug 26, 2024
* WIP: Add internal reverse-mode rules for ranges

namespace ConfigWidth

namespace

namespace needs_primal

namespace AugmentedReturn

* Complete implementation

* fix

* fix

---------

Co-authored-by: Billy Moses <wmoses@google.com>