
reduce time not obviously #6

Open
qinxianyuzi opened this issue Jul 16, 2021 · 1 comment

Comments

qinxianyuzi commented Jul 16, 2021

Hi, thank you for your meaningful work. I changed the model in benchmark.py and set the batch size to 64.

from copy import deepcopy

import torch
from torch import nn

from benchmark import benchmark_model  # helper from this repo's benchmark.py
from contiguous_params import ContiguousParams

device = "cuda"
# model = nn.Sequential(*[nn.Linear(128, 128) for i in range(100)]).to(device)
model = LResNet18E().to(device)  # my own model
print("Number of parameters: ", sum(p.numel() for p in model.parameters()))
x = torch.randn(64, 3, 224, 224).to(device)
y = torch.ones(64).to(device).long()
model_copies = [deepcopy(model) for _ in range(2)]

# Benchmark original.
parameters = list(model_copies[0].parameters())
optimizer = torch.optim.SGD(parameters, lr=1e-3)
benchmark_model(model_copies[0], optimizer, parameters, "original_params")

# Benchmark contiguous.
parameters = ContiguousParams(model_copies[1].parameters())
optimizer = torch.optim.SGD(parameters.contiguous(), lr=1e-3)
benchmark_model(model_copies[1], optimizer, parameters.contiguous(),
                "contiguous_params")

# Ensure the parameter buffers are still valid.
parameters.assert_buffer_is_valid()

The printed results are disappointing (the first two timing lines are original_params, the last two contiguous_params):
Number of parameters: 11055816
Mean step time: 2.763813018798828 seconds. (Autograd profiler enabled: False)
Mean step time: 2.8434643745422363 seconds. (Autograd profiler enabled: True)
Mean step time: 2.057171106338501 seconds. (Autograd profiler enabled: False)
Mean step time: 2.271756172180176 seconds. (Autograd profiler enabled: True)

When the batch size is 128:
Number of parameters: 11055816
Mean step time: 4.793098592758179 seconds. (Autograd profiler enabled: False)
Mean step time: 4.904996871948242 seconds. (Autograd profiler enabled: True)
Mean step time: 4.080202102661133 seconds. (Autograd profiler enabled: False)
Mean step time: 4.198964834213257 seconds. (Autograd profiler enabled: True)

What's wrong with my code? Thanks for your answer.
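One way to check whether the library is helping is to time `optimizer.step()` on its own, with fabricated gradients, so the forward/backward cost of the large model does not drown out the saving. A minimal CPU sketch; the `time_step` helper is hypothetical and not part of benchmark.py:

```python
import time

import torch
from torch import nn

def time_step(optimizer, params, steps=50):
    """Hypothetical helper: mean wall time of optimizer.step() alone."""
    for p in params:
        p.grad = torch.ones_like(p)  # give step() something to apply
    optimizer.step()  # warm-up
    # On GPU, call torch.cuda.synchronize() before each clock read.
    start = time.perf_counter()
    for _ in range(steps):
        optimizer.step()
    return (time.perf_counter() - start) / steps

model = nn.Sequential(*[nn.Linear(64, 64) for _ in range(20)])
params = list(model.parameters())
opt = torch.optim.SGD(params, lr=1e-3)
print(f"Mean optimizer step: {time_step(opt, params):.6f} seconds.")
```

Running this once with a plain parameter list and once with a contiguous buffer should show the per-parameter overhead that ContiguousParams is designed to remove.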

PhilJd (Owner) commented Jul 20, 2021

Hi,
sorry for the late reply. Could you explain what result you would expect? To me, these numbers look reasonable:
you reduce the step time by ~0.7 seconds in both cases. (The optimizer only operates on the parameters, so changing the batch size does not affect the optimizer step time.) Did you take a look at the trace visualization?

Cheers,
Phil
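For readers wondering where the saving comes from: ContiguousParams keeps all parameters (and their gradients) in one flat buffer, so the optimizer runs a few operations on one large tensor instead of launching one small kernel per parameter. A minimal sketch of that idea in plain PyTorch — the `make_contiguous` helper below is a hypothetical illustration, not the library's actual implementation:

```python
import torch
from torch import nn

def make_contiguous(params):
    """Sketch: copy params into flat buffers, then make each
    parameter a view into those buffers."""
    params = list(params)
    size = sum(p.numel() for p in params)
    flat_w = torch.zeros(size)
    flat_g = torch.zeros(size)
    offset = 0
    for p in params:
        n = p.numel()
        flat_w[offset:offset + n] = p.data.reshape(-1)
        p.data = flat_w[offset:offset + n].view_as(p)  # param is now a view
        p.grad = flat_g[offset:offset + n].view_as(p)  # autograd accumulates here in place
        offset += n
    return flat_w, flat_g

model = nn.Sequential(nn.Linear(4, 4), nn.Linear(4, 2))
flat_w, flat_g = make_contiguous(model.parameters())

model(torch.randn(8, 4)).sum().backward()  # gradients land in flat_g via the views
flat_w.add_(flat_g, alpha=-1e-3)           # one fused SGD update for all parameters
```

This also illustrates Phil's point about batch size: the fused update touches only `size` parameter elements, no matter how expensive the forward and backward passes are.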
