Enable Tensor Parallelism (TP)

## **Summary**
Wire full TP support through the miner and validator so that a run works end to end when `torchtitan.tp_degree > 1` is set in hparams. The hparams surface and Titan parallelization are already present; what remains is the plumbing across model init, the gradient pipeline, and checkpoints.

## **Scope**

1. **Model init / mesh wiring**

* Miner & Validator: build Titan LLaMA via our factory and parallelize with TP using existing helpers. Validate degrees/world-size via the factory checks. 
* Keep validator mesh consistent with miner for evaluation parity. 

2. **Gradient pipeline (DTensor-safe)**

* Ensure `prepare_gradient_dict(...)` owner/rendezvous logic correctly handles TP-sharded DTensors during encode→compress. 
* In `outer_step(...)`, keep the per-param flow: reconstruct dense grad on source, then broadcast/`distribute_tensor` into DTensor grads on all ranks. Verify placements under TP. 

3. **Checkpointing / catch-up**

* Confirm DCP save/load and catch-up apply cleanly with TP sharding (Titan distributed state dicts + pointer publishing). 
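
A minimal sketch of the sanity checks implied by items 1 and 2, for use in tests or debug asserts. It is not the factory's actual check. The helper names (`validate_degrees`, `local_shard_len`, `check_local_shapes`) are hypothetical, and it assumes that TP and data parallelism are the only mesh dimensions and that a sharded dimension is split `torch.chunk`-style (chunk size `ceil(dim / tp_degree)`, with trailing ranks possibly shorter or empty).

```python
import math


def validate_degrees(world_size: int, tp_degree: int) -> int:
    """Return the implied data-parallel degree, or raise if TP does not fit.

    Assumption: the mesh has only data-parallel and tensor-parallel dims.
    """
    if tp_degree < 1:
        raise ValueError(f"tp_degree must be >= 1, got {tp_degree}")
    if world_size % tp_degree != 0:
        raise ValueError(
            f"world_size {world_size} is not divisible by tp_degree {tp_degree}"
        )
    return world_size // tp_degree


def local_shard_len(dim_size: int, tp_degree: int, tp_rank: int) -> int:
    """Length of one rank's slice when a dimension is split chunk-style.

    Assumption: chunks have size ceil(dim_size / tp_degree), so trailing
    ranks get a shorter (possibly empty) slice on uneven splits.
    """
    chunk = math.ceil(dim_size / tp_degree)
    start = min(tp_rank * chunk, dim_size)
    end = min(start + chunk, dim_size)
    return end - start


def check_local_shapes(dense_shape, shard_dim, tp_degree, local_shapes):
    """Compare observed per-rank local shapes with the expected ones.

    Returns a list of (rank, expected_shape, observed_shape) mismatches.
    """
    mismatches = []
    for rank, observed in enumerate(local_shapes):
        expected = list(dense_shape)
        expected[shard_dim] = local_shard_len(
            dense_shape[shard_dim], tp_degree, rank
        )
        if tuple(observed) != tuple(expected):
            mismatches.append((rank, tuple(expected), tuple(observed)))
    return mismatches
```

In `outer_step(...)`, a check like this could run after the dense grad is distributed, comparing each rank's local grad shape with the local shape of the parameter it belongs to. An empty list from `check_local_shapes` means the shapes agree.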

## **Acceptance criteria**

* With `torchtitan.tp_degree > 1` set, the miner and validator run without placement/shape errors and complete windows.
* Gradients compress/apply under TP (no DTensor/mesh asserts in `prepare_gradient_dict` or `outer_step`). 
* Checkpoints save, restore, and catch up correctly on TP meshes.

## **Notes**

* Double-check `owned_params`/ownership against TP partitioning so that no parameter is processed twice (double work) or skipped (gaps).
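
One way to make that check concrete is an audit of the ownership map, sketched below. The function name `audit_ownership` and the `rank -> names` mapping are hypothetical. Whether ownership is assigned per global rank or per TP group is not stated here, so the sketch only tests the invariant that every parameter has exactly one owner.

```python
from collections import Counter


def audit_ownership(param_names, owned_by_rank):
    """Report ownership problems across ranks.

    param_names: all parameter names that need a gradient.
    owned_by_rank: dict mapping rank -> iterable of names that rank owns
        (hypothetical shape; adapt to however owned_params is stored).

    Returns (duplicates, gaps, unknown):
      duplicates: names owned by more than one rank (double work)
      gaps: names owned by no rank (skipped params)
      unknown: owned names that are not in param_names
    """
    all_names = list(param_names)
    known = set(all_names)
    counts = Counter(
        name for names in owned_by_rank.values() for name in names
    )
    duplicates = sorted(n for n, c in counts.items() if c > 1)
    gaps = sorted(n for n in known if counts[n] == 0)
    unknown = sorted(n for n in counts if n not in known)
    return duplicates, gaps, unknown
```

Running it once at startup for each TP configuration under test, and asserting that all three lists are empty, would surface partitioning mistakes before any gradients are compressed.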


Enable Tensor Parallelism (TP) #599
