
add profiling using cprofile #644

Draft · wants to merge 6 commits into master
Conversation

@isaacmg (Collaborator) commented Mar 15, 2023

The goal of this PR is to add automatic optimization/time-calculation metrics, and possibly to optimize some of the code by finding where it runs inefficiently.
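As one possible shape for the profiling hook (a sketch only — the `train` function here is a hypothetical stand-in, not the repo's actual entry point), the standard library's cProfile can wrap the training call and produce a per-function table like the one in the next comment:

```python
import cProfile
import io
import pstats

def train():
    # Hypothetical stand-in for the repo's train_function in trainer.py.
    return sum(i * i for i in range(100000))

profiler = cProfile.Profile()
profiler.enable()
train()
profiler.disable()

# Sort by cumulative time, like the dump below, and print the top rows.
stream = io.StringIO()
pstats.Stats(profiler, stream=stream).sort_stats("cumulative").print_stats(5)
print(stream.getvalue())
```

`cProfile.run("main()")` is the one-liner equivalent; the explicit `Profile` object is handier when only part of a run should be measured.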

@isaacmg (Collaborator, Author) commented Mar 16, 2023

 ncalls  tottime  percall  cumtime  percall filename:lineno(function)
   4224/1    0.068    0.000   76.199   76.199 {built-in method builtins.exec}
        1    0.000    0.000   76.199   76.199 trainer.py:1(<module>)
        1    0.003    0.003   72.211   72.211 trainer.py:173(main)
        1    0.001    0.001   72.207   72.207 trainer.py:81(train_function)
        1    0.000    0.000   57.669   57.669 pytorch_training.py:73(train_transformer_style)
66173/1709    1.299    0.000   36.352    0.021 module.py:1188(_call_impl)
      816    0.204    0.000   36.102    0.044 dsanet.py:333(forward)
      411    0.878    0.002   31.533    0.077 linear_regression.py:54(simple_decode)
        1    0.035    0.035   29.468   29.468 pytorch_training.py:343(torch_single_train)
        2    2.582    1.291   28.099   14.049 pytorch_training.py:452(compute_validation)
     3264    0.290    0.000   22.240    0.007 dsanet.py:112(forward)
      816    0.033    0.000   21.326    0.026 dsanet.py:179(forward)
      816    0.479    0.001   14.274    0.017 dsanet.py:241(forward)
        1    0.000    0.000   13.037   13.037 trainer.py:16(handle_model_evaluation1)
        1    0.000    0.000   12.613   12.613 evaluator.py:70(evaluate_model)
        1    0.000    0.000   12.392   12.392 evaluator.py:194(infer_on_torch_model)
        1    0.000    0.000   12.294   12.294 evaluator.py:383(generate_predictions)
        1    0.000    0.000   12.293   12.293 evaluator.py:519(generate_decoded_predictions)
     3264    1.231    0.000   12.179    0.004 dsanet.py:54(forward)
      590    0.107    0.000   11.401    0.019 __init__.py:1(<module>)
       72    0.001    0.000   10.570    0.147 _tensor.py:429(backward)
       72    0.001    0.000   10.569    0.147 __init__.py:103(backward)
       72   10.461    0.145   10.566    0.147 {method 'run_backward' of 'torch._C._EngineBase' objects}
       72    0.092    0.001   10.008    0.139 optimizer.py:135(wrapper)
       72    0.003    0.000    9.909    0.138 optimizer.py:19(_use_grad)
       72    0.118    0.002    9.905    0.138 adam.py:168(step)
       72    0.001    0.000    9.774    0.136 adam.py:257(adam)
       72    4.426    0.061    9.770    0.136 adam.py:319(_single_tensor_adam)
     3264    1.274    0.000    9.543    0.003 dsanet.py:93(forward)
     1632    0.006    0.000    9.237    0.006 conv.py:462(forward)
     1632    0.096    0.000    9.228    0.006 conv.py:454(_conv_forward)
     1632    9.132    0.006    9.132    0.006 {built-in method torch.conv2d}
     6528    0.200    0.000    5.595    0.001 conv.py:312(forward)
     6528    0.108    0.000    5.294    0.001 conv.py:304(_conv_forward)
     6528    5.186    0.001    5.186    0.001 {built-in method torch.conv1d}
    17952    0.669    0.000    4.373    0.000 linear.py:113(forward)
5065/1554    0.021    0.000    4.244    0.003 <frozen importlib._bootstrap>:1022(_find_and_load)
 4116/710    0.015    0.000    4.212    0.006 <frozen importlib._bootstrap>:987(_find_and_load_unlocked)
 3899/668    0.015    0.000    4.135    0.006 <frozen importlib._bootstrap>:664(_load_unlocked)
 3598/667    0.008    0.000    4.114    0.006 <frozen importlib._bootstrap_external>:877(exec_module)
 5111/667    0.003    0.000    4.035    0.006 <frozen importlib._bootstrap>:233(_call_with_frames_removed)
      485    0.011    0.000    3.703    0.008 dataloader.py:623(__next__)
    17952    3.679    0.000    3.679    0.000 {built-in method torch._C._nn.linear}
      485    0.006    0.000    3.652    0.008 dataloader.py:1286(_next_data)
      482    0.002    0.000    3.306    0.007 dataloader.py:1253(_get_data)
      482    0.002    0.000    3.305    0.007 dataloader.py:1107(_try_get_data)
      482    0.006    0.000    3.303    0.007 queues.py:98(get)
      892    0.459    0.001    3.197    0.004 pytorch_training.py:275(compute_loss)
 1307/133    0.003    0.000    3.094    0.023 {built-in method builtins.__import__}
     3264    0.469    0.000    3.029    0.001 dsanet.py:17(forward)
4592/2041    0.009    0.000    2.836    0.001 <frozen importlib._bootstrap>:1053(_handle_fromlist)
      410    0.004    0.000    2.397    0.006 pytorch_training.py:229(handle_scaling)
      821    0.282    0.000    2.392    0.003 pytorch_loaders.py:123(inverse_scale)
     6528    0.203    0.000    2.385    0.000 normalization.py:189(forward)
       72    0.016    0.000    2.202    0.031 optimizer.py:246(zero_grad)
     5688    2.179    0.000    2.179    0.000 {method 'zero_' of 'torch._C._TensorBase' objects}
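Tables like this mix framework internals with project code. `pstats` accepts a regex filter on the `filename:lineno(function)` column, so against the real dump something like `stats.print_stats("dsanet")` would keep only the `dsanet.py` rows. A self-contained sketch with a dummy workload:

```python
import cProfile
import io
import pstats

def work():
    # Dummy workload; in the real dump this would be the training run.
    return [i ** 0.5 for i in range(50000)]

prof = cProfile.Profile()
prof.runcall(work)

# The string argument to print_stats is a regex applied to each row's
# filename:lineno(function); here it keeps only rows mentioning "work".
stream = io.StringIO()
pstats.Stats(prof, stream=stream).sort_stats("tottime").print_stats("work")
print(stream.getvalue())
```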

@isaacmg (Collaborator, Author) commented Mar 17, 2023

Looking things over, the speed bottlenecks are generally where you would expect. For instance, in simple_decode the model(src) line takes up 90% of the compute. In PyTorch 2.0 we could possibly use torch.compile(); however, this raises the question of how it would affect the confidence intervals we compute using dropout. Would we still get different values?
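For context, those dropout-based confidence intervals rely on the model staying stochastic at inference time (Monte Carlo dropout). A quick way to check whether any given model still produces varying outputs — a sketch with a tiny illustrative network, not the repo's actual check; the same probe could be run on a torch.compile()-wrapped model:

```python
import torch
import torch.nn as nn

# Tiny illustrative model; the PR's models (e.g. DSANet) are far larger.
model = nn.Sequential(nn.Linear(8, 16), nn.Dropout(0.5), nn.Linear(16, 1))
model.train()  # keep dropout active, as MC-dropout intervals require

src = torch.randn(4, 8)
out1 = model(src)
out2 = model(src)

# With dropout active, two forward passes on the same input should differ.
print(torch.equal(out1, out2))
```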

@isaacmg (Collaborator, Author) commented Mar 18, 2023

Using torch.compile in simple_decode
421 ms ± 11.8 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
Not using torch.compile in simple_decode
266 ms ± 18 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

This was for generating 400 time steps into the future without a GPU. For a longer sequence it might come out faster, but in general I don't think torch.compile is worth the potential loss of the dropout-based confidence intervals that might come with it.
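The numbers above are in IPython's %timeit format; a stdlib equivalent for reproducing that kind of report outside a notebook (the `decode` function is a stand-in, not simple_decode itself):

```python
import timeit

def decode():
    # Stand-in workload for simple_decode; substitute the real call here.
    return sum(i * i for i in range(200000))

# %timeit-style report: 7 runs of 1 loop each, mean ± std. dev.
times = timeit.repeat(decode, number=1, repeat=7)
mean = sum(times) / len(times)
std = (sum((t - mean) ** 2 for t in times) / len(times)) ** 0.5
print(f"{mean * 1000:.1f} ms ± {std * 1000:.2f} ms per loop "
      f"(mean ± std. dev. of 7 runs, 1 loop each)")
```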

@isaacmg (Collaborator, Author) commented Mar 18, 2023

Timer unit: 1e-09 s

Total time: 0.0194293 s
File: <ipython-input-22-fdd33412dbaf>
Function: __getitem__ at line 102

Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
   102                                               @profiler
   103                                               def __getitem__(self, idx: int):
   104         2    4655940.0 2327970.0     24.0          rows = self.df.iloc[idx: self.forecast_history + idx]
   105         2       4402.0   2201.0      0.0          targs_idx_start = self.forecast_history + idx
   106         2       6252.0   3126.0      0.0          if self.no_scale:
   107         2     222211.0 111105.5      1.1              targ_rows = self.unscaled_df.iloc[targs_idx_start: self.forecast_length + targs_idx_start]
   108                                                   else:
   109                                                       targ_rows = self.df.iloc[
   110                                                           targs_idx_start: self.forecast_length + targs_idx_start
   111                                                       ]
   112         2   13969710.0 6984855.0     71.9          src_data = rows.to_numpy()
   113         2     430522.0 215261.0      2.2          src_data = torch.from_numpy(src_data).float()
   114         2      90531.0  45265.5      0.5          trg_dat = targ_rows.to_numpy()
   115         2      48803.0  24401.5      0.3          trg_dat = torch.from_numpy(trg_dat).float()
   116         2        931.0    465.5      0.0          return src_data, trg_dat
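The line profile above puts ~96% of __getitem__ in the df.iloc slice and the per-item rows.to_numpy() call. A common fix (a sketch of the general technique, not what this PR implements) is to convert the DataFrame to a single ndarray once in __init__ and slice that array in __getitem__; the real dataset would then wrap the slices with torch.from_numpy as before:

```python
import numpy as np
import pandas as pd

class ArrayBackedDataset:
    """Sketch: pay the DataFrame-to-ndarray conversion once, up front,
    so __getitem__ only slices contiguous memory."""

    def __init__(self, df: pd.DataFrame, forecast_history: int, forecast_length: int):
        self.data = df.to_numpy(dtype=np.float32)  # one conversion for the whole frame
        self.forecast_history = forecast_history
        self.forecast_length = forecast_length

    def __getitem__(self, idx: int):
        h, fl = self.forecast_history, self.forecast_length
        src = self.data[idx: idx + h]
        trg = self.data[idx + h: idx + h + fl]
        return src, trg  # the torch version would return torch.from_numpy(...)

df = pd.DataFrame(np.arange(40.0).reshape(10, 4))
ds = ArrayBackedDataset(df, forecast_history=3, forecast_length=2)
src, trg = ds[0]
print(src.shape, trg.shape)  # (3, 4) (2, 4)
```

This trades the flexibility of per-item scaling logic (the no_scale branch above) for slicing that no longer touches pandas at all.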

@isaacmg (Collaborator, Author) commented Mar 22, 2023

Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
   347                                           @profile
   348                                           def torch_single_train(model: PyTorchForecast,
   349                                                                  opt: optim.Optimizer,
   350                                                                  criterion: Type[torch.nn.modules.loss._Loss],
   351                                                                  data_loader: DataLoader,
   352                                                                  takes_target: bool,
   353                                                                  meta_data_model: PyTorchForecast,
   354                                                                  meta_data_model_representation: torch.Tensor,
   355                                                                  meta_loss=None,
   356                                                                  multi_targets=1,
   357                                                                  forward_params: Dict = {}) -> float:
   358                                               """Function that performs training of a single model. Runs through one epoch of the data.
   359                                           
   360                                               :param model: The PyTorchForecast model that is trained
   361                                               :type model: PyTorchForecast
   362                                               :param opt: The optimizer to use in the code
   363                                               :type opt: optim.Optimizer
   364                                               :param criterion: [description]
   365                                               :type criterion: Type[torch.nn.modules.loss._Loss]
   366                                               :param data_loader: [description]
   367                                               :type data_loader: DataLoader
   368                                               :param takes_target: A boolean that indicates whether the model takes the target during training
   369                                               :type takes_target: bool
   370                                               :param meta_data_model: If supplied a model that handles meta-data else None.
   371                                               :type meta_data_model: PyTorchForecast
   372                                               :param meta_data_model_representation: [description]
   373                                               :type meta_data_model_representation: torch.Tensor
   374                                               :param meta_loss: [description], defaults to None
   375                                               :type meta_loss: [type], optional
   376                                               :param multi_targets: [description], defaults to 1
   377                                               :type multi_targets: int, optional
   378                                               :param forward_params: [description], defaults to {}
   379                                               :type forward_params: Dict, optional
   380                                               :raises ValueError: [description]
   381                                               :return: [description]
   382                                               :rtype: float
   383                                               """
   384                                           
   385         1       1544.0   1544.0      0.0      probablistic = None
   386         1       2490.0   2490.0      0.0      if "probabilistic" in model.params["model_params"]:
   387                                                   probablistic = True
   388         1     257081.0 257081.0      0.0      print('running torch_single_train')
   389         1        364.0    364.0      0.0      i = 0
   390         1        205.0    205.0      0.0      output_std = None
   391         1        215.0    215.0      0.0      mulit_targets_copy = multi_targets
   392         1        375.0    375.0      0.0      running_loss = 0.0
   393        72  306151827.0 4252108.7     48.1      for src, trg in data_loader:
   394        72    5728173.0  79558.0      0.9          opt.zero_grad()
   395        72      30887.0    429.0      0.0          if meta_data_model:
   396                                                       representation = meta_data_model.model.generate_representation(meta_data_model_representation)
   397                                                       forward_params["meta_data"] = representation
   398                                                       if meta_loss:
   399                                                           output = meta_data_model.model(meta_data_model_representation)
   400                                                           met_loss = compute_loss(meta_data_model_representation, output, torch.rand(2, 3, 2), meta_loss, None)
   401                                                           met_loss.backward()
   402        72      25828.0    358.7      0.0          if takes_target:
   403                                                       forward_params["t"] = trg
   404        72     127133.0   1765.7      0.0          elif "TemporalLoader" == model.params["dataset_params"]["class"]:
   405                                                       forward_params["x_mark_enc"] = src[1].to(model.device)
   406                                                       forward_params["x_dec"] = trg[1].to(model.device)
   407                                                       forward_params["x_mark_dec"] = trg[0].to(model.device)
   408                                                       src = src[0]
   409                                                       pred_len = model.model.pred_len
   410                                                       trg = trg[0]
   411                                                       trg[:, -pred_len:, :] = torch.zeros_like(trg[:, -pred_len:, :].long()).float().to(model.device)
   412                                                       # Assign to avoid other if statement
   413        72      40084.0    556.7      0.0          elif "SeriesIDLoader" == model.params["dataset_params"]["class"]:
   414                                                       pass
   415        72     304913.0   4234.9      0.0          src = src.to(model.device)
   416        72     498787.0   6927.6      0.1          trg = trg.to(model.device)
   417        72   80243336.0 1114490.8     12.6          output = model.model(src, **forward_params)
   418        72     691238.0   9600.5      0.1          if hasattr(model.model, "pred_len"):
   419                                                       multi_targets = mulit_targets_copy
   420                                                       pred_len = model.model.pred_len
   421                                                       output = output[:, :, 0:multi_targets]
   422                                                       labels = trg[:, -pred_len:, 0:multi_targets]
   423                                                       multi_targets = False
   424        72     102195.0   1419.4      0.0          if model.params["dataset_params"]["class"] == "GeneralClassificationLoader":
   425                                                       labels = trg
   426        72      43360.0    602.2      0.0          elif multi_targets == 1:
   427        72    2079566.0  28882.9      0.3              labels = trg[:, :, 0]
   428                                                   elif multi_targets > 1:
   429                                                       labels = trg[:, :, 0:multi_targets]
   430        72      31689.0    440.1      0.0          if probablistic:
   431                                                       output1 = output
   432                                                       output = output.mean
   433                                                       output_std = output1.stddev
   434        72      67559.0    938.3      0.0          if type(criterion) == list:
   435                                                       loss = multi_crit(criterion, output, labels, None)
   436                                                   else:
   437        72   11777639.0 163578.3      1.9              loss = compute_loss(labels, output, src, criterion, None, probablistic, output_std, m=multi_targets)
   438        72    1583531.0  21993.5      0.2          if loss > 100:
   439                                                       print("Warning: high loss detected")
   440        72  176070293.0 2445420.7     27.7          loss.backward()
   441        72   47679070.0 662209.3      7.5          opt.step()
   442        72    2301492.0  31965.2      0.4          if torch.isnan(loss) or loss == float('inf'):
   443                                                       raise ValueError("Error infinite or NaN loss detected. Try normalizing data or performing interpolation")
   444        72     141336.0   1963.0      0.0          running_loss += loss.item()
   445        72      58359.0    810.5      0.0          i += 1
   446         1     354967.0 354967.0      0.1      print("The running loss is: ")
   447         1      99189.0  99189.0      0.0      print(running_loss)
   448         1      90326.0  90326.0      0.0      print("The number of items in train is: " + str(i))
   449         1       1506.0   1506.0      0.0      total_loss = running_loss / float(i)

@codecov codecov bot commented Mar 22, 2023

Codecov Report

Patch coverage: 75.00% and project coverage change: -11.20 ⚠️

Comparison is base (e68c1db) 75.79% compared to head (4752f22) 64.59%.

Additional details and impacted files
@@             Coverage Diff             @@
##           master     #644       +/-   ##
===========================================
- Coverage   75.79%   64.59%   -11.20%     
===========================================
  Files          67       67               
  Lines        4846     4850        +4     
===========================================
- Hits         3673     3133      -540     
- Misses       1173     1717      +544     
Flag Coverage Δ
python 64.59% <75.00%> (-11.20%) ⬇️

Flags with carried forward coverage won't be shown.

Impacted Files Coverage Δ
flood_forecast/basic/gru_vanilla.py 14.81% <ø> (ø)
flood_forecast/preprocessing/closest_station.py 69.31% <ø> (ø)
flood_forecast/pytorch_training.py 31.62% <75.00%> (-41.55%) ⬇️

... and 17 files with indirect coverage changes


☔ View full report in Codecov by Sentry.
