Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

automatic recompilation of dependent functions #265

Closed
vtjnash opened this issue Nov 26, 2011 · 63 comments · Fixed by #17057 or #18413
Closed

automatic recompilation of dependent functions #265

vtjnash opened this issue Nov 26, 2011 · 63 comments · Fixed by #17057 or #18413
Assignees
Labels
bug Indicates an unexpected problem or unintended behavior
Milestone

Comments

@vtjnash
Copy link
Member

vtjnash commented Nov 26, 2011

if I define a function d2 before a function d1 which calls d2 then change d2, d1 uses the old definition for d2.
I assume this is because it is all precompiled, but maybe there should be a note warning of this? Or would it be possible to replace the old definition with a longjmp to the new one?
(Mostly important for the REPL, since I don't always do a full load)

julia> function d2()
       a
       end

julia> function d()
         d2()
       end

julia> d()
in d: a not defined

julia> function d2()
       b=2
       end

julia> d()
in d: a not defined

julia> d2
Methods for generic function d2
d2() at prompt:2

julia> function d()
         d2()
       end

julia> d()
2
@StefanKarpinski
Copy link
Member

I believe this is because the code for d2() is generated when the function is defined so the d() method is resolved at that time. As you noted, you can redefine d2() as it was originally defined to make it work as expected. Ideally, we would automatically re-define anything that depends on d() when it is changed like this, but we don't have any of the metadata in place that would allow that. I guess we could put a warning about this behavior in the manual. Better still would be to fix it, but I think it's a bit tricky to do :-/

@ghost ghost assigned JeffBezanson Nov 28, 2011
@JeffBezanson
Copy link
Member

I was hoping nobody would notice this, but I guess that wasn't realistic of me :)

@vtjnash
Copy link
Member Author

vtjnash commented Nov 28, 2011

It's more fun than that; it really lets you see the JIT at work, since it isn't resolved until execution:

julia> function f2()
       a
       end

julia> function f1()
       f2()
       end

julia> function f2()
       2
       end

julia> f1()
2

However, I fear this can lead to some unexpected results at the REPL, although I don't really know the best way to deal with it. Although I used an error to show the difference in dramatic fashion, it could also happens if I was trying to see the effect of changing an inner equation or constant!

@StefanKarpinski
Copy link
Member

There are two types of locations that may need to be updated when a method is altered:

  • places where the method is called explicitly as a function (call/ret)
  • places where the method has been inlined

The former could be updated just by changing the function body: it could be altered in place, or the calling sites could be actively updated to call a new location, or the old function body could be replaced with a stub that jumps to the new version, optionally patching the calling site. For inlining, we need to track the callers that have inlined the function and re-JIT them.

@JeffBezanson
Copy link
Member

Really all we can do is discard code, because even if a method wasn't inlined any function that calls it might depend on type behavior that can change if the method changes. This is mostly for interactive cases so re-compiling code is not a huge deal.
Requires same work as issue #47.

@StefanKarpinski
Copy link
Member

Bummer. Well, since it's primarily the repl that's affected, doing the right thing slowly is fine. At run-time, this should almost never kick in — why would someone define something and then redefine it again except interactively? If the solution ends up inducing a lot of overhead maybe turning it one should be optional and automatically done in the repl?

@StefanKarpinski
Copy link
Member

We need to document that this is undefined currently.

@StefanKarpinski
Copy link
Member

Technically, this isn't bug, it's an undefined behavior. When you redefine a method, the resulting behavior is undefined. So Julia can do whatever it likes, including what it currently does. Providing a well-defined behavior upon method redefinition is a feature, not a bug fix. I'm also not convinced that this is a v1.0 issue since going from an undefined behavior to providing a well-defined behavior is not a breaking change. This could be implemented in v1.1 without breaking any valid v1.0 code.

@glycerine
Copy link

Greg Clayton from Apple's LLVM/LLDB staff was kind enough to document how to elicit (with lldb libraries, a subproject of llvm) the necessary information to determine a binary's dependencies from the embedded symbol info (symbol imports); as well as those symbols exported by a binary (necessary to construct the complete dependency graph).

  • Jason

    On Mar 31, 2012, at 11:02 PM, Jason E. Aten wrote:

    Dear LLDB enthusiasts,

    I'm wondering if I can use the lldb library/libraries to replace the certain code running on OSX that now returns two lists of symbols-- similar to the output of (dyldinfo -lazy_bind -exports ); i.e. I need to list the symbols imported and exported by a binary shared object or bundle.

    My hope was that by using an lldb library, I would be able to use the same client code on OSX as on linux. (The linux version of the code currently uses libbfd and libdld to do the same thing, but the later is getting little love/maintenance).

    I'm looking through include/lldb/, as it seems like lldb would need this same info (imported symbol list, and exported symbol list for a Mach-O file) to function, but it's not clear which API to use. All suggestions/pointers to example code in lldb would be welcome!

    Thank you.
    Jason

    In case it is unclear what dyldinfo does, here is an example: (but I only need the symbol names; not the addresses or segments or sections):

    $ file /tmp/sample_bundle
    /tmp/sample_bundle: Mach-O 64-bit bundle x86_64

    $dyldinfo -lazy_bind -export /tmp/sample_bundle

    lazy binding information (from lazy_bind part of dyld info):
    segment section address index dylib symbol
    __DATA __la_symbol_ptr 0x00001030 0x0000 flat-namespace __mm_pop_chunk
    __DATA __la_symbol_ptr 0x00001038 0x0015 flat-namespace _dh_define
    export information (from trie):
    0x000008A0 _C_ipair
    0x00000920 _init_ipair
    0x00000BC0 _C_iprot
    0x00000C40 _C_ipi2
    0x00000CC0 _C_ipi1
    0x00001040 _K_ipair_R43808f40
    0x00001160 _K_ipi1_R5cb4475d
    0x00001260 _K_ipi2_R5cb4475d
    0x00001360 _K_iprot_Rfc8fe739
    0x00001460 _majver_ipair
    0x00001464 _minver_ipair

On Mon, Apr 2, 2012 at 3:13 PM, Greg Clayton gclayton@apple.com wrote:

Yes you can do this with LLDB. If you load a binary and dump its symbol table, you will see the information you want. For symbols that are lazily bound, you can look for "Trampoline" symbols:

cd lldb/test/lang/objc/foundation
make
lldb a.out
(lldb) target modules dump symtab a.out
Symtab, file = .../lldb/test/lang/objc/foundation/a.out, num_symbols = 54:
              Debug symbol
              |Synthetic symbol
              ||Externally Visible
              |||
Index   UserID DSX Type         File Address/Value Load Address       Size               Flags      Name
------- ------ --- ------------ ------------------ ------------------ ------------------ ---------- ----------------------------------
[    0]      0 D   SourceFile   0x0000000000000000                    Sibling -> [   15] 0x00640000 /Volumes/work/gclayton/Documents/src/lldb/test/lang/objc/foundation/main.m
[    1]      2 D   ObjectFile   0x000000004f79f1ca                    0x0000000000000000 0x00660001 /Volumes/work/gclayton/Documents/src/lldb/test/lang/objc/foundation/main.o
[    2]      4 D   Code         0x00000001000010f0                    0x00000000000000c0 0x000e0000 -[MyString initWithNSString:]
[    3]      8 D   Code         0x00000001000011b0                    0x0000000000000090 0x000e0000 -[MyString dealloc]
[    4]     12 D   Code         0x0000000100001240                    0x00000000000000a0 0x000e0000 -[MyString description]
[    5]     16 D   Code         0x00000001000012e0                    0x0000000000000020 0x000e0000 -[MyString descriptionPauses]
[    6]     20 D   Code         0x0000000100001300                    0x0000000000000030 0x000e0000 -[MyString setDescriptionPauses:]
[    7]     24 D   Code         0x0000000100001330                    0x0000000000000030 0x000e0000 -[MyString str_property]
[    8]     28 D   Code         0x0000000100001360                    0x0000000000000050 0x000e0000 -[MyString setStr_property:]
[    9]     32 D   Code         0x00000001000013b0                    0x0000000000000040 0x000f0000 Test_Selector
[   10]     36 D   Code         0x00000001000013f0                    0x0000000000000130 0x000f0000 Test_NSString
[   11]     40 D   Code         0x0000000100001520                    0x0000000000000120 0x000f0000 Test_MyString
[   12]     44 D   Code         0x0000000100001640                    0x00000000000001b0 0x000f0000 Test_NSArray
[   13]     48 D   Code         0x00000001000017f0                    0x00000000000000e1 0x000f0000 main
[   14]     56 D X Data         0x0000000100002680                    0x0000000000000000 0x00200000 my_global_str
[   15]     58 D   SourceFile   0x0000000000000000                    Sibling -> [   19] 0x00640000 /Volumes/work/gclayton/Documents/src/lldb/test/lang/objc/foundation/my-base.m
[   16]     60 D   ObjectFile   0x000000004f79f1ca                    0x0000000000000000 0x00660001 /Volumes/work/gclayton/Documents/src/lldb/test/lang/objc/foundation/my-base.o
[   17]     62 D   Code         0x00000001000018e0                    0x0000000000000020 0x000e0000 -[MyBase propertyMovesThings]
[   18]     66 D   Code         0x0000000100001900                    0x000000000000001f 0x000e0000 -[MyBase setPropertyMovesThings:]
[   19]     82     Data         0x0000000100002000                    0x0000000000000460 0x000e0000 pvars
[   20]     83     ObjCIVar     0x0000000100002518                    0x0000000000000148 0x001e0000 MyBase.propertyMovesThings
[   21]     84   X Data         0x0000000100002660                    0x0000000000000008 0x000f0000 NXArgc
[   22]     85   X Data         0x0000000100002668                    0x0000000000000008 0x000f0000 NXArgv
[   23]     86   X ObjCClass    0x00000001000024d8                    0x0000000000000028 0x000f0000 MyBase
[   24]     87   X ObjCClass    0x0000000100002460                    0x0000000000000028 0x000f0000 MyString
[   25]     88   X ObjCIVar     0x0000000100002510                    0x0000000000000008 0x000f0000 MyString._desc_pauses
[   26]     89   X ObjCIVar     0x0000000100002508                    0x0000000000000008 0x000f0000 MyString.date
[   27]     90   X ObjCIVar     0x0000000100002500                    0x0000000000000008 0x000f0000 MyString.str
[   28]     91   X ObjCMetaClass 0x00000001000024b0                    0x0000000000000028 0x000f0000 MyBase
[   29]     92   X ObjCMetaClass 0x0000000100002488                    0x0000000000000028 0x000f0000 MyString
[   30]     97   X Data         0x0000000100002678                    0x0000000000000008 0x000f0000 __progname
[   31]     98   X Data         0x0000000100000000                    0x00000000000010b0 0x000f0010 _mh_execute_header
[   32]     99   X Data         0x0000000100002670                    0x0000000000000008 0x000f0000 environ
[   33]    101   X Data         0x0000000100002680                    0x0000000000000000 0x000f0000 my_global_str
[   34]    102   X Code         0x00000001000010b0                    0x0000000000000040 0x000f0000 start
[   35]    103     Trampoline   0x0000000100001938                    0x0000000000000006 0x00010200 NSLog
[   36]    104   X Undefined    0x0000000000000000                    0x0000000000000000 0x00010400 OBJC_CLASS_$_NSArray
[   37]    105   X Undefined    0x0000000000000000                    0x0000000000000000 0x00010200 OBJC_CLASS_$_NSAutoreleasePool
[   38]    106   X Undefined    0x0000000000000000                    0x0000000000000000 0x00010400 OBJC_CLASS_$_NSDate
[   39]    107   X Undefined    0x0000000000000000                    0x0000000000000000 0x00010400 OBJC_CLASS_$_NSObject
[   40]    108   X Undefined    0x0000000000000000                    0x0000000000000000 0x00010200 OBJC_CLASS_$_NSString
[   41]    109   X Undefined    0x0000000000000000                    0x0000000000000000 0x00010400 OBJC_METACLASS_$_NSObject
[   42]    110   X Undefined    0x0000000000000000                    0x0000000000000000 0x00010400 __CFConstantStringClassReference
[   43]    111   X Undefined    0x0000000000000000                    0x0000000000000000 0x00010100 _objc_empty_cache
[   44]    112   X Undefined    0x0000000000000000                    0x0000000000000000 0x00010100 _objc_empty_vtable
[   45]    113     Trampoline   0x000000010000193e                    0x0000000000000006 0x00010300 exit
[   46]    114     Trampoline   0x0000000100001920                    0x0000000000000006 0x00010100 objc_getProperty
[   47]    115     Trampoline   0x0000000100001926                    0x0000000000000006 0x00010100 objc_msgSend
[   48]    116     Trampoline   0x000000010000192c                    0x0000000000000006 0x00010100 objc_msgSendSuper2
[   49]    117   X Undefined    0x0000000000000000                    0x0000000000000000 0x00010100 objc_msgSend_fixup
[   50]    118     Trampoline   0x0000000100001932                    0x0000000000000006 0x00010100 objc_setProperty
[   51]    119     Trampoline   0x0000000100001944                    0x0000000000000006 0x00010300 printf
[   52]    120     Trampoline   0x000000010000194a                    0x0000000000000006 0x00010300 usleep
[   53]    121   X Undefined    0x0000000000000000                    0x0000000000000000 0x00010300 dyld_stub_binder
(lldb)


All lazily bound symbols will have type Trampoline:

[   45]    113     Trampoline   0x000000010000193e                    0x0000000000000006 0x00010300 exit
[   46]    114     Trampoline   0x0000000100001920                    0x0000000000000006 0x00010100 objc_getProperty
[   47]    115     Trampoline   0x0000000100001926                    0x0000000000000006 0x00010100 objc_msgSend
[   48]    116     Trampoline   0x000000010000192c                    0x0000000000000006 0x00010100 objc_msgSendSuper2
[   50]    118     Trampoline   0x0000000100001932                    0x0000000000000006 0x00010100 objc_setProperty
[   51]    119     Trampoline   0x0000000100001944                    0x0000000000000006 0x00010300 printf
[   52]    120     Trampoline   0x000000010000194a                    0x0000000000000006 0x00010300 usleep

The other symbols that are exernal are marked with an "X" (which is a boolean flag on each symbol).

The symbols can be accessed via the SBModule:

   size_t
   SBModule::GetNumSymbols ();

   lldb::SBSymbol
   SBModule::GetSymbolAtIndex (size_t idx);

And then you can get the symbol type from each SBSymbol:

   SymbolType
   SBSymbol::GetType ();


I just added the ability to see if a symbol is externally visible:
% svn commit
Sending        include/lldb/API/SBSymbol.h
Sending        scripts/Python/interface/SBSymbol.i
Sending        source/API/SBSymbol.cpp
Transmitting file data ...
Committed revision 153893.


   bool
   SBSymbol::IsExternal();


So your flow should be:

SBDebugger::Initialize();
SBDebugger debugger(SBDebugger::Create());
SBTarget target (debugger.CreateTarget (const char *filename,
                                       const char *target_triple,
                                       const char *platform_name,
                                       bool add_dependent_modules,
                                       lldb::SBError& error));

SBFileSpec exe_file_spec (filename);
SBModule exe_module (target.FindModule(exe_file_spec));
if (exe_module.IsValid()
{
   const size_t num_symbols = exe_module. GetNumSymbols();
   for (size_t i=0; i<num_symbols; ++i)
   {
       SBSymbol symbol (exe_module. GetSymbolAtIndex(i));
       if (symbol.IsExternal())
       {
       }

       if (symbol.GetType() == lldb::eSymbolTypeTrampoline)
       {
       }
   }
}




> _______________________________________________
> lldb-dev mailing list
> lldb-dev@cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/lldb-dev

On Mon, Apr 2, 2012 at 4:05 PM, Greg Clayton gclayton@apple.com wrote:

A quick clarification on the args to CreateTarget:

"filename" is just a full path to the local object file you want to observer. "target_triple" is your <arch>-<vendor>-<os>, or something like "x86_64-apple-darwin" or "i386-pc-linux" and can be NULL if the file is only a single architecture. "platform_name" can be NULL. "add_dependent_modules" should be false, since you are only interested in seeing the one object file itself, an SBError instance  should be created and passed in.

@StefanKarpinski
Copy link
Member

@staticfloat
Copy link
Member

This is dredging up an amazingly old thread, but I realized I am possibly flirting around the edges of this problem with my codespeed benchmarking code. I repeatedly Core.include() files that contain two functions, listTests() and runTests(). I then simply call listTests() and runTests(), thereby redefining and then calling them every time I load a new benchmark file. Because I'm not redefining 2nd-tier functions I don't think I'll run into the problems listed here, but is redefining things like this a bad way to go about loading in multiple files with the same "API"?

@StefanKarpinski
Copy link
Member

Might be better to include them in a module repeatedly, but I realize that bumps up against the anonymous module problem that's also been raised on the dev list. In this case, it may be fine, however, since you can just define the same module multiple times and ignore the warning that emits.

@JeffBezanson
Copy link
Member

Perhaps an old thread, but as relevant as ever.
Maybe another approach is to use evalfile and have the file "return" a tuple of functions. What's the anonymous module problem?

@StefanKarpinski
Copy link
Member

I was referring to this: #3661.

@chriselrod
Copy link
Contributor

chriselrod commented Feb 3, 2021

I'm still getting this on Julia 1.5 and master on occasion, where code_typed changes but code_llvm and observed behavior do not. The behavior seems to be dependent on package-load order.
Here, I am redefining a couple functions from VectorizationBase. Some other functions in VectorizationBase depend on these, and are then called in VectorizedRNG. I get the correct behavior if I load both VectorizationBase and VectorizedRNG before redefining:

julia> using VectorizationBase, VectorizedRNG

julia> @code_typed VectorizedRNG.setrand64counter!(local_rng(), 0x01) # we have a constant offset of 514
CodeInfo(
1%1 = Base.getfield(rng, :ptr)::Ptr{UInt64}%2 = Base.bitcast(Ptr{UInt8}, %1)::Ptr{UInt8}%3 = VectorizationBase.llvmcall(("", "%ptr.0 = inttoptr i64 %0 to i8*\n%ptr.1 = getelementptr inbounds i8, i8* %ptr.0, i64 %2\nstore i8 %1, i8* %ptr.1, align 1\nret void"), VectorizationBase.Cvoid, Tuple{Ptr{UInt8},UInt8,Int64}, %2, v, 514)::Nothing
└──      return %3
) => Nothing

julia> @code_llvm VectorizedRNG.setrand64counter!(local_rng(), 0x01) # we have a constant offset of 514

;  @ /home/chriselrod/.julia/dev/VectorizedRNG/src/xoshiro.jl:97 within `setrand64counter!'
define void @"julia_setrand64counter!_1785"([1 x i64]* nocapture nonnull readonly dereferenceable(8), i8) {
top:
;  @ /home/chriselrod/.julia/dev/VectorizedRNG/src/xoshiro.jl:98 within `setrand64counter!'
; ┌ @ essentials.jl:392 within `unsafe_convert'
; │┌ @ pointer.jl:30 within `convert'
    %2 = bitcast [1 x i64]* %0 to i8**
    %3 = load i8*, i8** %2, align 8
; └└
; ┌ @ /home/chriselrod/.julia/dev/VectorizationBase/src/llvm_intrin/memory_addr.jl:782 within `vstore!' @ /home/chriselrod/.julia/dev/VectorizationBase/src/llvm_intrin/memory_addr.jl:638
; │┌ @ /home/chriselrod/.julia/dev/VectorizationBase/src/llvm_intrin/memory_addr.jl:638 within `macro expansion'
    %ptr.1.i = getelementptr inbounds i8, i8* %3, i64 514
    store i8 %1, i8* %ptr.1.i, align 1
; └└
  ret void
}

julia> VectorizationBase.has_feature(::Val{:x86_64_avx512f}) = VectorizationBase.False()

julia> VectorizationBase.has_feature(::Val{:x86_64_avx2}) = VectorizationBase.False()

julia> @code_typed VectorizedRNG.setrand64counter!(local_rng(), 0x01) # we have a constant offset of 130
CodeInfo(
1%1 = Base.getfield(rng, :ptr)::Ptr{UInt64}%2 = Base.bitcast(Ptr{UInt8}, %1)::Ptr{UInt8}%3 = VectorizationBase.llvmcall(("", "%ptr.0 = inttoptr i64 %0 to i8*\n%ptr.1 = getelementptr inbounds i8, i8* %ptr.0, i64 %2\nstore i8 %1, i8* %ptr.1, align 1\nret void"), VectorizationBase.Cvoid, Tuple{Ptr{UInt8},UInt8,Int64}, %2, v, 130)::Nothing
└──      return %3
) => Nothing

julia> @code_llvm VectorizedRNG.setrand64counter!(local_rng(), 0x01) # we have a constant offset of 130

;  @ /home/chriselrod/.julia/dev/VectorizedRNG/src/xoshiro.jl:97 within `setrand64counter!'
define void @"julia_setrand64counter!_1790"([1 x i64]* nocapture nonnull readonly dereferenceable(8), i8) {
top:
;  @ /home/chriselrod/.julia/dev/VectorizedRNG/src/xoshiro.jl:98 within `setrand64counter!'
; ┌ @ essentials.jl:392 within `unsafe_convert'
; │┌ @ pointer.jl:30 within `convert'
    %2 = bitcast [1 x i64]* %0 to i8**
    %3 = load i8*, i8** %2, align 8
; └└
; ┌ @ /home/chriselrod/.julia/dev/VectorizationBase/src/llvm_intrin/memory_addr.jl:782 within `vstore!' @ /home/chriselrod/.julia/dev/VectorizationBase/src/llvm_intrin/memory_addr.jl:638
; │┌ @ /home/chriselrod/.julia/dev/VectorizationBase/src/llvm_intrin/memory_addr.jl:638 within `macro expansion'
    %ptr.1.i = getelementptr inbounds i8, i8* %3, i64 130
    store i8 %1, i8* %ptr.1.i, align 1
; └└
  ret void
}

We start with a 514, and then after the redefinitions, the offset is 130 (the offset equals 2 + 8 * number of integers it can process efficiently with SIMD instructions).
But now, we'll redefine the functions before loading VectorizedRNG:

julia> using VectorizationBase

julia> VectorizationBase.has_feature(::Val{:x86_64_avx512f}) = VectorizationBase.False()

julia> VectorizationBase.has_feature(::Val{:x86_64_avx2}) = VectorizationBase.False()

julia> using VectorizationBase, VectorizedRNG

julia> @code_typed VectorizedRNG.setrand64counter!(local_rng(), 0x01) # we have a constant offset of 130
CodeInfo(
1%1 = Base.getfield(rng, :ptr)::Ptr{UInt64}%2 = Base.bitcast(Ptr{UInt8}, %1)::Ptr{UInt8}%3 = VectorizationBase.llvmcall(("", "%ptr.0 = inttoptr i64 %0 to i8*\n%ptr.1 = getelementptr inbounds i8, i8* %ptr.0, i64 %2\nstore i8 %1, i8* %ptr.1, align 1\nret void"), VectorizationBase.Cvoid, Tuple{Ptr{UInt8},UInt8,Int64}, %2, v, 130)::Nothing
└──      return %3
) => Nothing

julia> @code_llvm VectorizedRNG.setrand64counter!(local_rng(), 0x01) # we have a constant offset of 130

;  @ /home/chriselrod/.julia/dev/VectorizedRNG/src/xoshiro.jl:97 within `setrand64counter!'
define void @"julia_setrand64counter!_1660"([1 x i64]* nocapture nonnull readonly dereferenceable(8), i8) {
top:
;  @ /home/chriselrod/.julia/dev/VectorizedRNG/src/xoshiro.jl:98 within `setrand64counter!'
; ┌ @ essentials.jl:392 within `unsafe_convert'
; │┌ @ pointer.jl:30 within `convert'
    %2 = bitcast [1 x i64]* %0 to i8**
    %3 = load i8*, i8** %2, align 8
; └└
; ┌ @ /home/chriselrod/.julia/dev/VectorizationBase/src/llvm_intrin/memory_addr.jl:782 within `vstore!' @ /home/chriselrod/.julia/dev/VectorizationBase/src/llvm_intrin/memory_addr.jl:638
; │┌ @ /home/chriselrod/.julia/dev/VectorizationBase/src/llvm_intrin/memory_addr.jl:638 within `macro expansion'
    %ptr.1.i = getelementptr inbounds i8, i8* %3, i64 514
    store i8 %1, i8* %ptr.1.i, align 1
; └└
  ret void
}

@code_typed is correct in the sense that it updated. It shows the correct new constant, 130.
@code_llvm is correct in the sense that this is the actual behavior the function shows. It shows the old constant, 514, which is still being used.

@mbauman
Copy link
Member

mbauman commented Feb 3, 2021

Might be related: https://github.com/chriselrod/VectorizationBase.jl/search?q=pure

@pure is the mechanism that opts out of this fix. Does it still occur without those annotations?

@chriselrod
Copy link
Contributor

Oops, might be my fault! Thanks -- I'll try taking them out.

Without @pure Julia won't optimize them, but LLVM shouldn't have a problem.

@chriselrod
Copy link
Contributor

chriselrod commented Feb 3, 2021

Unfortunately not:

(@v1.7) pkg> st VectorizationBase
      Status `~/.julia/environments/v1.7/Project.toml`
  [3d5dd08c] VectorizationBase v0.18.3 `~/.julia/dev/VectorizationBase`

julia> using VectorizationBase

julia> run(`grep -nr "pure" $(dirname(pathof(VectorizationBase)))`) # all instances of `pure` are commented out
/home/chriselrod/.julia/dev/VectorizationBase/src/VectorizationBase.jl:14:# Base.@pure asvalbool(r) = Val(map(Bool, r))
/home/chriselrod/.julia/dev/VectorizationBase/src/VectorizationBase.jl:15:# Base.@pure asvalint(r) = Val(map(Int, r))
/home/chriselrod/.julia/dev/VectorizationBase/src/llvm_intrin/vbroadcast.jl:54:        # $(Expr(:meta,:pure,:inline))
Process(`grep -nr pure /home/chriselrod/.julia/dev/VectorizationBase/src`, ProcessExited(0))

julia> VectorizationBase.has_feature(::Val{:x86_64_avx512f}) = VectorizationBase.False()

julia> VectorizationBase.has_feature(::Val{:x86_64_avx2}) = VectorizationBase.False()

julia> using VectorizedRNG
[ Info: Precompiling VectorizedRNG [33b4df10-0173-11e9-2a0c-851a7edac40e]

julia> @code_typed VectorizedRNG.setrand64counter!(local_rng(), 0x01) # we have a constant offset of 130
CodeInfo(
1%1 = Base.getfield(rng, :ptr)::Ptr{UInt64}%2 = Base.bitcast(Ptr{UInt8}, %1)::Ptr{UInt8}%3 = VectorizationBase.llvmcall(("    \n\n    define void @entry(i64, i8, i64) alwaysinline {\n    top:\n        %ptr.0 = inttoptr i64 %0 to i8*\n%ptr.1 = getelementptr inbounds i8, i8* %ptr.0, i64 %2\nstore i8 %1, i8* %ptr.1, align 1\nret void\n    }\n", "entry"), VectorizationBase.Cvoid, Tuple{Ptr{UInt8}, UInt8, Int64}, %2, v, 130)::Nothing
└──      return %3
) => Nothing

julia> @code_llvm VectorizedRNG.setrand64counter!(local_rng(), 0x01) # we have a constant offset of 514
;  @ /home/chriselrod/.julia/dev/VectorizedRNG/src/xoshiro.jl:97 within `setrand64counter!'
define void @"julia_setrand64counter!_1414"([1 x i64]* nocapture nonnull readonly align 8 dereferenceable(8) %0, i8 zeroext %1) {
top:
;  @ /home/chriselrod/.julia/dev/VectorizedRNG/src/xoshiro.jl:98 within `setrand64counter!'
; ┌ @ essentials.jl:402 within `unsafe_convert'
; │┌ @ pointer.jl:30 within `convert'
    %2 = bitcast [1 x i64]* %0 to i8**
    %3 = load i8*, i8** %2, align 8
; └└
; ┌ @ /home/chriselrod/.julia/dev/VectorizationBase/src/llvm_intrin/memory_addr.jl:782 within `vstore!' @ /home/chriselrod/.julia/dev/VectorizationBase/src/llvm_intrin/memory_addr.jl:638
; │┌ @ /home/chriselrod/.julia/dev/VectorizationBase/src/llvm_intrin/memory_addr.jl:638 within `macro expansion'
    %ptr.1.i = getelementptr inbounds i8, i8* %3, i64 514
    store i8 %1, i8* %ptr.1.i, align 1
; └└
  ret void
}

EDIT: I just pushed the commit to remove all the pures to master, to remove that as a possible cause in case anyone wants to try the example.

EDIT: Although ArrrayInterface.jl is also using it at some locations. I'll try taking those out too.

EDIT: Also removing them from ArrayInterface solved the problem.

@chriselrod
Copy link
Contributor

chriselrod commented Feb 3, 2021

On 1.5, removing all @pure fixed it:

julia> using VectorizationBase

julia> !success(run(`grep -nr "pure" $(dirname(pathof(VectorizationBase)))`, wait=false))
true

julia> VectorizationBase.has_feature(::Val{:x86_64_avx512f}) = VectorizationBase.False()

julia> VectorizationBase.has_feature(::Val{:x86_64_avx2}) = VectorizationBase.False()

julia> using VectorizedRNG

julia> @code_typed VectorizedRNG.setrand64counter!(local_rng(), 0x01) # we have a constant offset of 130
CodeInfo(
1%1 = Base.getfield(rng, :ptr)::Ptr{UInt64}%2 = Base.bitcast(Ptr{UInt8}, %1)::Ptr{UInt8}%3 = VectorizationBase.llvmcall(("", "%ptr.0 = inttoptr i64 %0 to i8*\n%ptr.1 = getelementptr inbounds i8, i8* %ptr.0, i64 %2\nstore i8 %1, i8* %ptr.1, align 1\nret void"), VectorizationBase.Cvoid, Tuple{Ptr{UInt8},UInt8,Int64}, %2, v, 130)::Nothing
└──      return %3
) => Nothing

julia> @code_llvm VectorizedRNG.setrand64counter!(local_rng(), 0x01) # we have a constant offset of 514

;  @ /home/chriselrod/.julia/dev/VectorizedRNG/src/xoshiro.jl:97 within `setrand64counter!'
define void @"julia_setrand64counter!_1692"([1 x i64]* nocapture nonnull readonly dereferenceable(8), i8) {
top:
;  @ /home/chriselrod/.julia/dev/VectorizedRNG/src/xoshiro.jl:98 within `setrand64counter!'
; ┌ @ essentials.jl:392 within `unsafe_convert'
; │┌ @ pointer.jl:30 within `convert'
    %2 = bitcast [1 x i64]* %0 to i8**
    %3 = load i8*, i8** %2, align 8
; └└
; ┌ @ /home/chriselrod/.julia/dev/VectorizationBase/src/llvm_intrin/memory_addr.jl:782 within `vstore!' @ /home/chriselrod/.julia/dev/VectorizationBase/src/llvm_intrin/memory_addr.jl:638
; │┌ @ /home/chriselrod/.julia/dev/VectorizationBase/src/llvm_intrin/memory_addr.jl:638 within `macro expansion'
    %ptr.1.i = getelementptr inbounds i8, i8* %3, i64 130
    store i8 %1, i8* %ptr.1.i, align 1
; └└
  ret void
}

julia> using ArrayInterface

julia> !success(run(`grep -nr "pure" $(dirname(pathof(ArrayInterface)))`, wait=false))
true

julia> versioninfo()
Julia Version 1.5.3
Commit 788b2c77c1 (2020-11-09 13:37 UTC)
Platform Info:
  OS: Linux (x86_64-generic-linux)
  CPU: Intel(R) Core(TM) i9-7900X CPU @ 3.30GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-9.0.1 (ORCJIT, skylake-avx512)
Environment:
  JULIA_NUM_THREADS = auto

But on master:

julia> using VectorizationBase

julia> !success(run(`grep -nr "pure" $(dirname(pathof(VectorizationBase)))`, wait=false))
true

julia> VectorizationBase.has_feature(::Val{:x86_64_avx512f}) = VectorizationBase.False()

julia> VectorizationBase.has_feature(::Val{:x86_64_avx2}) = VectorizationBase.False()

julia> using VectorizedRNG

julia> @code_typed VectorizedRNG.setrand64counter!(local_rng(), 0x01) # we have a constant offset of 130
CodeInfo(
1%1 = Base.getfield(rng, :ptr)::Ptr{UInt64}%2 = Base.bitcast(Ptr{UInt8}, %1)::Ptr{UInt8}%3 = VectorizationBase.llvmcall(("    \n\n    define void @entry(i64, i8, i64) alwaysinline {\n    top:\n        %ptr.0 = inttoptr i64 %0 to i8*\n%ptr.1 = getelementptr inbounds i8, i8* %ptr.0, i64 %2\nstore i8 %1, i8* %ptr.1, align 1\nret void\n    }\n", "entry"), VectorizationBase.Cvoid, Tuple{Ptr{UInt8}, UInt8, Int64}, %2, v, 130)::Nothing
└──      return %3
) => Nothing

julia> @code_llvm VectorizedRNG.setrand64counter!(local_rng(), 0x01) # we have a constant offset of 514
;  @ /home/chriselrod/.julia/dev/VectorizedRNG/src/xoshiro.jl:97 within `setrand64counter!'
define void @"julia_setrand64counter!_1161"([1 x i64]* nocapture nonnull readonly align 8 dereferenceable(8) %0, i8 zeroext %1) {
top:
;  @ /home/chriselrod/.julia/dev/VectorizedRNG/src/xoshiro.jl:98 within `setrand64counter!'
; ┌ @ essentials.jl:402 within `unsafe_convert'
; │┌ @ pointer.jl:30 within `convert'
    %2 = bitcast [1 x i64]* %0 to i8**
    %3 = load i8*, i8** %2, align 8
; └└
; ┌ @ /home/chriselrod/.julia/dev/VectorizationBase/src/llvm_intrin/memory_addr.jl:782 within `vstore!' @ /home/chriselrod/.julia/dev/VectorizationBase/src/llvm_intrin/memory_addr.jl:638
; │┌ @ /home/chriselrod/.julia/dev/VectorizationBase/src/llvm_intrin/memory_addr.jl:638 within `macro expansion'
    %ptr.1.i = getelementptr inbounds i8, i8* %3, i64 514
    store i8 %1, i8* %ptr.1.i, align 1
; └└
  ret void
}

julia> using ArrayInterface

julia> !success(run(`grep -nr "pure" $(dirname(pathof(ArrayInterface)))`, wait=false))
true

julia> versioninfo()
Julia Version 1.7.0-DEV.421
Commit 22858a0d29* (2021-02-01 19:04 UTC)
Platform Info:
  OS: Linux (x86_64-generic-linux)
  CPU: Intel(R) Core(TM) i9-7900X CPU @ 3.30GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-11.0.1 (ORCJIT, skylake-avx512)
Environment:
  JULIA_NUM_THREADS = auto

EDIT: No longer works on 1.5.

@vtjnash
Copy link
Member Author

vtjnash commented Feb 3, 2021

Note that @generated also may opt out of the fix, and it is similarly your responsibility to avoid writing them in such a way that the effect is likely to be visible to your users.

@chriselrod
Copy link
Contributor

chriselrod commented Feb 3, 2021

Could that explain why @code_typed is updated, but @code_llvm and the actual behavior are not?
I've been working on a minimal example, but haven't gotten a reproducer yet.

While there is a lot of @generated, I think that part of the code mostly relies on dispatch.

@chriselrod
Copy link
Contributor

chriselrod commented Feb 3, 2021

Also, redefining after the module has been loaded does trigger recompilation:

julia> @code_llvm VectorizedRNG.setrand64counter!(local_rng(), 0x01) # we have a constant offset of 514
;  @ /home/chriselrod/.julia/dev/VectorizedRNG/src/xoshiro.jl:97 within `setrand64counter!'
define void @"julia_setrand64counter!_1161"([1 x i64]* nocapture nonnull readonly align 8 dereferenceable(8) %0, i8 zeroext %1) {
top:
;  @ /home/chriselrod/.julia/dev/VectorizedRNG/src/xoshiro.jl:98 within `setrand64counter!'
; ┌ @ essentials.jl:402 within `unsafe_convert'
; │┌ @ pointer.jl:30 within `convert'
    %2 = bitcast [1 x i64]* %0 to i8**
    %3 = load i8*, i8** %2, align 8
; └└
; ┌ @ /home/chriselrod/.julia/dev/VectorizationBase/src/llvm_intrin/memory_addr.jl:782 within `vstore!' @ /home/chriselrod/.julia/dev/VectorizationBase/src/llvm_intrin/memory_addr.jl:638
; │┌ @ /home/chriselrod/.julia/dev/VectorizationBase/src/llvm_intrin/memory_addr.jl:638 within `macro expansion'
    %ptr.1.i = getelementptr inbounds i8, i8* %3, i64 514
    store i8 %1, i8* %ptr.1.i, align 1
; └└
  ret void
}

julia> VectorizationBase.has_feature(Val{:x86_64_avx2}())
False()

julia> VectorizationBase.has_feature(::Val{:x86_64_avx2}) = VectorizationBase.False()

julia> @code_llvm VectorizedRNG.setrand64counter!(local_rng(), 0x01) # we have a constant offset of 514
;  @ /home/chriselrod/.julia/dev/VectorizedRNG/src/xoshiro.jl:97 within `setrand64counter!'
define void @"julia_setrand64counter!_1287"([1 x i64]* nocapture nonnull readonly align 8 dereferenceable(8) %0, i8 zeroext %1) {
top:
;  @ /home/chriselrod/.julia/dev/VectorizedRNG/src/xoshiro.jl:98 within `setrand64counter!'
; ┌ @ essentials.jl:402 within `unsafe_convert'
; │┌ @ pointer.jl:30 within `convert'
    %2 = bitcast [1 x i64]* %0 to i8**
    %3 = load i8*, i8** %2, align 8
; └└
; ┌ @ /home/chriselrod/.julia/dev/VectorizationBase/src/llvm_intrin/memory_addr.jl:782 within `vstore!' @ /home/chriselrod/.julia/dev/VectorizationBase/src/llvm_intrin/memory_addr.jl:638
; │┌ @ /home/chriselrod/.julia/dev/VectorizationBase/src/llvm_intrin/memory_addr.jl:638 within `macro expansion'
    %ptr.1.i = getelementptr inbounds i8, i8* %3, i64 130
    store i8 %1, i8* %ptr.1.i, align 1
; └└
  ret void
}

The code created the bodies of generated functions shouldn't depend on any of these methods being redefined

julia> VectorizedRNG.getoffset()
514

julia> @code_typed VectorizedRNG.getoffset()
CodeInfo(
1return 130
) => Int64

julia> VectorizedRNG.getoffset()
514

julia> @code_warntype VectorizedRNG.getoffset()
MethodInstance for VectorizedRNG.getoffset()
  from getoffset() in VectorizedRNG at /home/chriselrod/.julia/dev/VectorizedRNG/src/xoshiro.jl:97
Arguments
  #self#::Core.Const(VectorizedRNG.getoffset)
Body::Int64
1%1 = VectorizedRNG.simd_integer_register_size()::Core.Const(static(16))
│   %2 = (4 * %1)::Core.Const(64)
│   %3 = (%2 * 2)::Core.Const(128)
│   %4 = (%3 + 2)::Core.Const(130)
└──      return %4


julia> VectorizedRNG.simd_integer_register_size()
static(16)

Downstream of simd_integer_register_size, all we have is multiplication.

By checking out the master branches, there is this simpler reproducer:

julia> using VectorizationBase

julia> VectorizationBase.has_feature(::Val{:x86_64_avx2}) = VectorizationBase.False()

julia> using VectorizedRNG
[ Info: Precompiling VectorizedRNG [33b4df10-0173-11e9-2a0c-851a7edac40e]

julia> VectorizedRNG.getoffset()
514

julia> @code_warntype VectorizedRNG.getoffset()
MethodInstance for VectorizedRNG.getoffset()
  from getoffset() in VectorizedRNG at /home/chriselrod/.julia/dev/VectorizedRNG/src/xoshiro.jl:98
Arguments
  #self#::Core.Const(VectorizedRNG.getoffset)
Body::Int64
1%1 = VectorizedRNG.sirs()::Core.Const(static(16))
│   %2 = (4 * %1)::Core.Const(64)
│   %3 = (%2 * 2)::Core.Const(128)
│   %4 = (%3 + 2)::Core.Const(130)
└──      return %4


julia> VectorizedRNG.sirs()
static(16)

julia> VectorizedRNG.getoffset()
514

sirs is updated above. But the downstream multiplciation by integers is not.

topolarity added a commit to topolarity/julia that referenced this issue Jan 17, 2025
The intermediate data structure here (used for edge de-duplication)
was accidentally recording `invoke` edges as if they were `call` edges.

This bug is _very_ frequently benign, but if there are identical call
and invoke edges in the edge list and the invoke edge is scanned first,
the call edge will be unsoundly dropped, leading to invalidation (JuliaLang#265)
bugs.
topolarity added a commit to topolarity/julia that referenced this issue Jan 17, 2025
The intermediate data structure here (used for edge de-duplication)
was accidentally recording `invoke` edges as if they were `call` edges.

This bug is _very_ frequently benign, but if there are identical call
and invoke edges in the edge list and the invoke edge is scanned first,
the call edge will be unsoundly dropped, leading to invalidation (JuliaLang#265)
bugs.
topolarity added a commit to topolarity/julia that referenced this issue Jan 17, 2025
The intermediate data structure here (used for edge de-duplication)
was accidentally recording `invoke` edges as if they were `call` edges.

This bug is _very_ frequently benign, but if there are identical call
and invoke edges in the edge list and the invoke edge is scanned first,
the call edge will be unsoundly dropped, leading to invalidation (JuliaLang#265)
bugs.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Indicates an unexpected problem or unintended behavior
Projects
None yet