Single core of TPU gives prediction results different than the CPU results #8625

Open
mohamedamara7 opened this issue Jan 25, 2025 · 1 comment

❓ Questions and Help

I encountered a discrepancy when training a model with PyTorch XLA on a TPU: the results differed significantly from those obtained on CPU or GPU. After further investigation with a toy example, I noticed that predictions made with PyTorch XLA were not consistent with those made on the CPU. Interestingly, when training on TPU with PyTorch Lightning (which itself depends on PyTorch XLA for TPU training), the results were identical to the CPU output. This led me to suspect a device-specific difference or an initialization issue when using PyTorch XLA directly.

Code

import numpy as np
import torch
import torch_xla.core.xla_model as xm
from torchvision import models

def generate_random_data(batch_size=1, num_channels=3, height=224, width=224):
    return torch.randn(batch_size, num_channels, height, width, dtype=torch.float32)

def load_model():
    return models.efficientnet_b0(weights='DEFAULT').eval()

# Helper omitted from the original report; a minimal version that moves the
# model and input to the target device and returns the output as a NumPy array.
def inference_on_device(model, device, data):
    model = model.to(device)
    with torch.no_grad():
        output = model(data.to(device))
    return output.cpu().numpy()

random_data = generate_random_data()

# CPU inference
model = load_model()
cpu_result = inference_on_device(model, torch.device('cpu'), random_data)

# TPU inference
tpu_device = xm.xla_device()  # a single TPU core
tpu_result = inference_on_device(model, tpu_device, random_data)

# Compare CPU and TPU results
print("Difference between CPU and XLA TPU results:", np.abs(cpu_result - tpu_result).max())

Output

Difference between CPU and XLA TPU results: 0.025713682
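
A difference of this size is in the range one might expect if XLA runs float32 matmuls and convolutions at reduced (bfloat16-based) precision, which is the default on TPU hardware. A tolerance-based comparison along the lines sketched below (the rtol/atol values are illustrative, not from the report) can help separate expected precision loss from a genuine numerical bug:

# Sketch: compare at strict float32 tolerances (expected to fail here)
# and at looser tolerances typical of bfloat16-accumulated TPU math.
print("strict:", np.allclose(cpu_result, tpu_result, rtol=1e-5, atol=1e-6))
print("loose: ", np.allclose(cpu_result, tpu_result, rtol=1e-2, atol=1e-2))

If the gap is precision-related, requesting full float32 matmul precision with torch.set_float32_matmul_precision('highest') before running the TPU inference may shrink it, though whether torch_xla honors this setting on TPU depends on the version in use.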

Best regards.

@miladm (Collaborator) commented Jan 27, 2025

Thank you for sharing this bug - cc @ysiraichi to assist
