Timer function wrapper and instant timer #90
base: main
Conversation
tests/util/timer.py
Outdated
return psutil.swap_memory().used


class FunctionTimer:
This is a wrapper for a particular function. It can be used as follows:

timer = FunctionTimer()

@timer
def some_function_to_time():
    time.sleep(1)

@timer
def another_function_to_time():
    time.sleep(2)

some_function_to_time()
another_function_to_time()
print(timer.get_log())
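For reference, a minimal sketch of what such a wrapper could look like (internals assumed here, not necessarily the PR's actual implementation):

import functools
import time

class FunctionTimer:
    # Hypothetical sketch: records each decorated function's wall-clock time by name.
    def __init__(self):
        self.log = {}

    def __call__(self, func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            start = time.time()
            result = func(*args, **kwargs)
            self.log[func.__name__] = time.time() - start
            return result
        return wrapper

    def get_log(self):
        return self.log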
Can we add name as an argument for FunctionTimer? For example, we may want to do something like below:

@timer("some")
def some_function_to_time():
    time.sleep(1)

@timer("another")
def another_function_to_time():
    time.sleep(2)
@timer("some") # label = "some"
def some_function_to_time():
time.sleep(1)
@timer # label = "another_function_to_time" # which is equivalent to func.__name__
def another_function_to_time():
time.sleep(2)
Now it supports both.
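For illustration, a hedged sketch of how a decorator can support both the bare @timer form and the @timer("label") form (the actual tests/util/timer.py may differ):

import functools
import time

class FunctionTimer:
    # Hypothetical sketch: accepts either a function (bare @timer) or a label string.
    def __init__(self):
        self.log = {}

    def __call__(self, arg):
        if callable(arg):
            # Used as @timer: the label defaults to func.__name__.
            return self._wrap(arg, arg.__name__)
        # Used as @timer("label"): return a decorator bound to that label.
        return lambda func: self._wrap(func, arg)

    def _wrap(self, func, label):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            start = time.time()
            result = func(*args, **kwargs)
            self.log[label] = time.time() - start
            return result
        return wrapper

    def get_log(self):
        return self.log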
tests/util/timer.py
Outdated
return self.log


class Timer:
This timer is used in a way like:

timer = Timer()
timer_name = "some_function"
timer.start_timer(timer_name, host_timer=False)
timer.stop_timer(timer_name)
timer_log = timer.get_info()  # returns a dictionary keyed by timer name
print(f"elapsed time is {timer_log[timer_name]} milliseconds")
What do you think about assigning one timer to each AnaLog instance (or each HookLogger), and printing all timer logs in update with a nice-looking format?
I am all in for that, but I am a bit reluctant to shovel the Timer()s into the current code, since:
1. The code can get slower if I set a timer for every bit.
2. To solve 1, I would have to use an "if timer_mode: then time the following" approach, which may make the code a bit dirtier (a lot of if/else branches).
3. The resolution for this is to make the current test code time itself and record the results in the PR actions.

I would like to ask your opinion on this. I will follow whatever you suggest!
Thanks for the comment! Can you measure the slowdown caused by Timer in the MNIST example (w/ and w/o LoRA)? I suspect that time.time() would only incur a negligible cost compared to neural network operations. If it's negligible in the MNIST example, which is one of the smallest NN experiments, we can ignore the cost in larger scale experiments.
WARNING:root:CUDA is not set, setting the timer is set to host timer.
[2024-02-18 23:28:50] [INFO] 1: Linear(in_features=784, out_features=512, bias=False)
[2024-02-18 23:28:50] [INFO] 3: Linear(in_features=512, out_features=256, bias=False)
[2024-02-18 23:28:50] [INFO] 5: Linear(in_features=256, out_features=10, bias=False)
[2024-02-18 23:28:50] [INFO] Tracking the following modules:
[2024-02-18 23:28:50] [INFO] Total number of parameters: 535,040
logging time: 28.088079929351807
Function Timer Logs:
register_all_module_hooks:
operation costs 0.0004165172576904297 seconds
_forward_hook_fn:
operation costs 0.013933420181274414 seconds
_grad_hook_fn:
operation costs 0.0003361701965332031 seconds
_backward_hook_fn:
operation costs 0.0050966739654541016 seconds
buffer_write:
operation costs 0.09905576705932617 seconds
flush:
operation costs 5.245208740234375e-06 seconds
update:
operation costs 0.10047507286071777 seconds
build_log_dataset:
operation costs 0.0857551097869873 seconds
build_log_dataloader:
operation costs 0.08604812622070312 seconds
precondition:
operation costs 0.12075304985046387 seconds
compute_influence_all:
operation costs 6.2306530475616455 seconds
Computation time: 6.238124847412109
Top influential data indices: [[1281, 5025, 2658, 3210, 5242, 5000, 140, 5811, 3804, 5280]]
This is how it is logged. I find that the logging part is 28 secs and the analysis part is 6.2 secs. For the logging part, it seems like the time is not coming from update but rather from the model training, which was not captured here. I tried other methods, but was not able to capture it. Can you check where else I may put the decorator?
tests/util/timer.py
Outdated
def get_gpu_memory(device_index=None):
    return torch.cuda.memory_allocated(device_index)


def get_gpu_max_memory(device_index=None):
    return torch.cuda.max_memory_allocated(device_index)


def get_device_memory():
    process = psutil.Process(os.getpid())
    return process.memory_info().rss


def get_cpu_swap_memory():
    return psutil.swap_memory().used
For memory, I think this is the best way to live-monitor memory pressure. Say we do operation A and want to measure its memory pressure. If memory is garbage-collected after operation A, the memory pressure cannot be accurately measured by subtracting the "before" reading from the "after" reading. Hence, we just need to print the memory from inside operation A using the functions above.
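For illustration, a small hedged example of sampling memory from inside the operation itself (using the helper functions from the snippet above; operation_a, model, and batch are made-up names):

def operation_a(model, batch):
    out = model(batch)
    # Sample memory at the peak of the operation, before intermediates are freed.
    print(f"GPU allocated: {get_gpu_memory() / 1e6:.1f} MB")
    print(f"GPU max allocated: {get_gpu_max_memory() / 1e6:.1f} MB")
    print(f"Process RSS: {get_device_memory() / 1e6:.1f} MB")
    return out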
We may need to do more research to figure out the best way to profile GPU memory usage. For instance, PyTorch has torch.profiler.profile, and we want to make sure that our implementation has some advantages over it (e.g. less extensive but faster).
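For comparison, a minimal torch.profiler example that reports per-operator memory (standard PyTorch API; the model and input here are placeholders):

import torch
from torch.profiler import profile, ProfilerActivity

model = torch.nn.Linear(784, 512)
inputs = torch.randn(32, 784)

with profile(
    activities=[ProfilerActivity.CPU],
    profile_memory=True,
    record_shapes=True,
) as prof:
    model(inputs)

# Per-operator time and memory summary.
print(prof.key_averages().table(sort_by="self_cpu_memory_usage", row_limit=10))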
Looks pretty fun! I will split the memory profiling off from this PR.
ae01df6 to 1f0c55b (compare)
requirements.txt
Outdated
@@ -2,6 +2,5 @@ numpy
pandas
torch
einops

psutil
I can just remove psutil, since it is for memory monitoring, and the memory monitoring is not really doing anything at this point (calculating the memory difference before & after). We may need to do this via profiling, as mentioned before.
I believe we can remove psutil for now. We can add it back once there is a request for CPU memory tracking from the users.
Overall, looks good to me. Two comments I have are:
1. Instead of letting users manually print the timer log with FunctionTimer.print_log(), we can let AnaLog automatically print it, for example, in finalize.
2. You may want to visually show the log in your terminal in the comment of this PR, as below:

### TIMER LOG

Will print log like this @sangkeun00
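A hypothetical sketch of the first suggestion, with AnaLog owning the timer and printing the log in finalize (class structure assumed; print_log is the method named above):

class AnaLog:
    def __init__(self):
        self.timer = FunctionTimer()  # one timer per AnaLog instance

    def finalize(self):
        # ... existing finalization logic ...
        self.timer.print_log()  # emit the timer log automatically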
b7042dc to 2c00270 (compare)
from torch.profiler import profile, ProfilerActivity


def memory_profiler(func):
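A hedged sketch of what a profiler-based decorator could look like given the import above (the PR's actual memory_profiler may differ):

import functools
from torch.profiler import profile, ProfilerActivity

def memory_profiler(func):
    # Hypothetical sketch: profile a single call and print its per-operator memory table.
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        with profile(activities=[ProfilerActivity.CPU], profile_memory=True) as prof:
            result = func(*args, **kwargs)
        print(f"Memory profile for {func.__name__}:")
        print(prof.key_averages().table(sort_by="self_cpu_memory_usage", row_limit=5))
        return result
    return wrapper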
Adding this to the example code will be done in the next PR. Details below.