Refactors for publishing on Source Academy #7
veehz wants to merge 10 commits into source-academy:main
Conversation
Code Review
This pull request renames the package to @sourceacademy/torch, updates project documentation, and introduces a Python bridge (pyodide_bridge.py) to allow the library to be used within Pyodide. Feedback focuses on several critical issues in the Python bridge implementation: the replacement of the example bridge file with a path string will cause syntax errors, the Tensor slicing logic is inefficient and breaks PyTorch's view semantics, and the getattr wrapper fails to pass keyword arguments to the underlying JavaScript functions. Additionally, improvements are needed for the result wrapping logic to handle non-tensor objects and to ensure _NNModule calls use the forward method correctly.
examples/pyodide/bridge.py (1)
The file examples/pyodide/bridge.py has been replaced with a single line containing a relative path string ../../pyodide_bridge.py. However, examples/pyodide/main.js (line 18) reads this file's content and passes it directly to pyodide.runPython(). This will result in a SyntaxError in Python because a path string is not valid Python code.
You should either update main.js to read from the root pyodide_bridge.py or ensure that bridge.py contains valid Python code that imports or executes the logic from the root file.
pyodide_bridge.py (480-482)
_NNModule.__call__ attempts to invoke self._module.call(*js_args). In most JavaScript reimplementations of PyTorch, module instances are objects that expose a forward method rather than a call method. Unless the underlying JS library specifically adds a call method to all modules, this will fail. It is more standard and robust to have __call__ invoke self.forward(*args), which is already correctly implemented to call the JS forward method.
```python
def __call__(self, *args):
    return self.forward(*args)
```
pyodide_bridge.py (304-307)
The implementation of slicing for Tensor objects creates a new tensor by iterating over the range and converting each element to a Python list and then back to a JS tensor. This is highly inefficient for large tensors and, crucially, it returns a copy instead of a view. In PyTorch, slicing a tensor should return a view that shares the same underlying storage.
Since the JS Tensor.index method (line 351 in src/tensor.ts) already supports creating views for single indices, you should consider extending the JS implementation to support slicing or implementing a more efficient view-based slicing mechanism in the bridge to maintain PyTorch semantics.
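The difference between copy-based and view-based slicing can be illustrated with a simulation. `FakeJsTensor` is a hypothetical stand-in whose `index(i)` returns shared storage, mimicking the view behavior of the JS `Tensor.index`:

```python
class FakeJsTensor:
    """Stand-in for the JS tensor: index(i) returns a row 'view' (a shared list)."""

    def __init__(self, rows):
        self.rows = rows  # shared storage

    def index(self, i):
        return self.rows[i]


def slice_as_views(js_tensor, key, dim0):
    # Normalize the slice, then gather per-index views instead of round-tripping
    # each element through Python lists. A real bridge would combine these on
    # the JS side (e.g. via a stack/cat op) to preserve autograd connectivity.
    start, stop, step = key.indices(dim0)
    return [js_tensor.index(i) for i in range(start, stop, step)]


t = FakeJsTensor([[1, 2], [3, 4], [5, 6]])
views = slice_as_views(t, slice(0, None, 2), 3)
views[0][0] = 99      # mutating the view...
print(t.rows[0][0])   # ...is visible in the original storage: 99
```

The copy-based approach in the current code would leave `t.rows[0][0]` untouched, which is exactly the PyTorch semantics violation the comment describes.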
pyodide_bridge.py (332-334)
The method wrapper returned by __getattr__ accepts **kwargs but does not pass them to the underlying JavaScript function. Many PyTorch operations rely on keyword arguments for important options (e.g., dim, keepdim, requires_grad). Ignoring these arguments will lead to incorrect behavior when they are used from Python.
Consider transforming kwargs into a JavaScript object (using to_js(kwargs)) and passing it as an argument if the JS API supports options objects.
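A sketch of that forwarding pattern, assuming the JS API accepts a trailing options object: in real Pyodide code `to_js(kwargs)` would convert the dict, while here a plain dict and `fake_js_sum` stand in for the converted object and the JS function:

```python
def make_method(js_fn):
    """Wrapper that forwards kwargs as a trailing options object (illustrative)."""

    def method(*args, **kwargs):
        if kwargs:
            # Real bridge: append to_js(kwargs) to the converted argument list.
            return js_fn(*args, kwargs)
        return js_fn(*args)

    return method


def fake_js_sum(tensor, options=None):
    # Stand-in JS API that reads a `dim` option when one is supplied.
    if options and "dim" in options:
        return f"sum over dim {options['dim']}"
    return "sum over all elements"


s = make_method(fake_js_sum)
print(s("t"))         # sum over all elements
print(s("t", dim=1))  # sum over dim 1
```

With the current code, `dim=1` would be silently dropped and both calls would behave identically.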
pyodide_bridge.py (14-24)
_wrap_result automatically wraps any JsProxy returned from a JavaScript call into a Python Tensor. This is problematic if the JavaScript function returns something other than a tensor, such as a list of shapes, a boolean, or a plain configuration object. In such cases, the returned value will be incorrectly converted into a Tensor object.
You should add a check to verify that the JsProxy actually represents a tensor (e.g., by checking for the existence of a shape or id property) before wrapping it.
Pull request overview
This PR refactors the project for publishing under the scoped package name @sourceacademy/torch, updates consumer references (tests/examples/HTML importmaps), and introduces a standalone Pyodide bridge file intended for distribution.
Changes:
- Renames the package from `torch` to `@sourceacademy/torch` and updates imports/importmaps accordingly.
- Adjusts the npm published file set to include `pyodide_bridge.py` (and removes `examples` from the published `files` list).
- Splits publishing into a dedicated GitHub Actions workflow and updates docs/CONTRIBUTING content.
Reviewed changes
Copilot reviewed 31 out of 34 changed files in this pull request and generated 7 comments.
| File | Description |
|---|---|
| `package.json` | Renames the package scope and updates the published files list to include `pyodide_bridge.py`. |
| `yarn.lock` | Updates the workspace locator/name to `@sourceacademy/torch`. |
| `pyodide_bridge.py` | Adds a distributable Pyodide bridge implementing a PyTorch-like Python API over the JS library. |
| `.github/workflows/publish.yml` | Adds a dedicated publish workflow for package previews. |
| `.github/workflows/build-test.yml` | Removes publishing from CI and adjusts docs artifact preparation. |
| `test/*.test.{js,ts}` | Updates test imports to use `@sourceacademy/torch`. |
| `test/index.html`, `test/umd.html` | Updates import maps to use the new scoped package name. |
| `examples/pyodide/*` | Updates example dependency/imports to use `@sourceacademy/torch`. |
| `examples/basic_backpropagation.js` | Updates the Node ESM build filename extension used by the example. |
| `README.md` | Expands the project overview and contribution guidance. |
| `CONTRIBUTING.md` | Adds a commands section and corrects/updates structure notes. |
```markdown
- [`functions`](functions) contains all functions that tensors can perform.
- [`nn`](nn) contains all neural network modules (for everything under `torch.nn`).
- [`optim`](optim) contains all optimizers (for everything under `torch.optim`).
- [`creation`](creation) contains all tensor creation functions (all functions that create a tensor not from scratch, including `zeros`, `randn`).
```
These links are broken: functions, nn, optim, and creation are not top-level directories; they live under src/. Update the links to point at src/functions, src/nn, src/optim, and src/creation so the Codebase Structure section is navigable.
Suggested change:

```diff
- - [`functions`](functions) contains all functions that tensors can perform.
- - [`nn`](nn) contains all neural network modules (for everything under `torch.nn`).
- - [`optim`](optim) contains all optimizers (for everything under `torch.optim`).
- - [`creation`](creation) contains all tensor creation functions (all functions that create a tensor not from scratch, including `zeros`, `randn`).
+ - [`functions`](src/functions) contains all functions that tensors can perform.
+ - [`nn`](src/nn) contains all neural network modules (for everything under `torch.nn`).
+ - [`optim`](src/optim) contains all optimizers (for everything under `torch.optim`).
+ - [`creation`](src/creation) contains all tensor creation functions (all functions that create a tensor not from scratch, including `zeros`, `randn`).
```
```python
start, stop, step = key.indices(self.shape[0])
data = [Tensor(self._js.index(i)).tolist() for i in range(start, stop, step)]
return Tensor(data)
```
The slice path builds a Python list via .tolist() and then constructs a new tensor from that data. This loses view semantics and breaks autograd connectivity to the original tensor (it becomes a fresh leaf). Either implement slicing using JS tensor ops (e.g. repeated index + stack/cat) or explicitly raise NotImplementedError for slices to avoid silently returning a detached copy.
Suggested change:

```diff
- start, stop, step = key.indices(self.shape[0])
- data = [Tensor(self._js.index(i)).tolist() for i in range(start, stop, step)]
- return Tensor(data)
+ raise NotImplementedError(
+     "Slice indexing is not implemented because converting through Python "
+     "lists would return a detached copy and break tensor/autograd semantics"
+ )
```
```python
data = self.tolist()
if not isinstance(data, list):
    raise TypeError("iteration over a 0-d tensor")
for item in data:
    yield Tensor(item)
```
__iter__ iterates via tolist() and wraps each Python element with Tensor(item), which creates brand-new tensors and breaks gradient/view semantics. Iteration should yield views/slices of the original tensor (e.g. yield self[i] / Tensor(self._js.index(i))) so that autograd and in-place ops behave as expected.
Suggested change:

```diff
- data = self.tolist()
- if not isinstance(data, list):
-     raise TypeError("iteration over a 0-d tensor")
- for item in data:
-     yield Tensor(item)
+ if len(self.shape) == 0:
+     raise TypeError("iteration over a 0-d tensor")
+ for i in range(len(self)):
+     yield self[i]
```
```python
# bridge.py
# Provides a PyTorch-compatible Python API over js_torch (the TypeScript torch library).
#
# Before loading this file, set the following globals in Pyodide:
#   js_torch - the torch module (window.torch from the UMD build)
```
This file appears to be a copy of examples/pyodide/bridge.py (same header/comments and implementation). Maintaining two identical copies is error-prone (they can drift). Consider making one canonical source (e.g. keep pyodide_bridge.py and have the example reference/copy it during build) to avoid duplication.
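One way to avoid the drift is a small sync step run before publishing or building the example. This is only a sketch; the function name and the assumption that the example should simply mirror the root file are illustrative:

```python
# Hypothetical build step: keep pyodide_bridge.py canonical and copy it into
# the example directory instead of maintaining two diverging copies.
import shutil
from pathlib import Path


def sync_bridge(root: Path) -> None:
    src = root / "pyodide_bridge.py"
    dst = root / "examples" / "pyodide" / "bridge.py"
    dst.parent.mkdir(parents=True, exist_ok=True)
    shutil.copyfile(src, dst)
```

The same effect could be achieved with a one-line `cp` in an npm script; the point is that only one file is ever edited by hand.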
```python
def squeeze(self, dim=None):
    if dim is None:
        new_shape = [s for s in self.shape if s != 1]
        return Tensor(self._js.reshape(to_js(new_shape or [1])))
```
squeeze(dim=None) computes new_shape correctly but then forces new_shape or [1], which prevents squeezing a [1] tensor down to a 0‑d tensor (PyTorch returns a scalar/shape []). Remove the [1] fallback (and let reshape([]) happen) or delegate to the JS squeeze implementation to match semantics.
Suggested change:

```diff
- return Tensor(self._js.reshape(to_js(new_shape or [1])))
+ return Tensor(self._js.reshape(to_js(new_shape)))
```
```python
# Iteration and length
# ------------------------------------------------------------------

def __len__(self):
```
__len__ will raise IndexError for 0‑d tensors because shape[0] doesn’t exist. In PyTorch, len(scalar_tensor) raises TypeError. Consider checking self.dim() == 0 and raising TypeError("len() of a 0-d tensor") (or similar) for compatibility.
Suggested change:

```diff
  def __len__(self):
+     if self.dim() == 0:
+         raise TypeError("len() of a 0-d tensor")
```
```python
# Returned JsProxy objects are wrapped in Tensor; primitives pass through.
# ------------------------------------------------------------------

def __getattr__(self, name):
    if name.startswith('_'):
        raise AttributeError(name)
    def method(*args, **kwargs):
```
__getattr__ accepts **kwargs but ignores them, which can silently drop user arguments (e.g. tensor.some_op(dim=1) will behave like tensor.some_op()). Either forward keyword args into the JS call (transforming values via to_js as needed) or explicitly error when kwargs is non-empty to avoid surprising behavior.
Suggested change:

```diff
  # Returned JsProxy objects are wrapped in Tensor; primitives pass through.
+ # Keyword arguments are rejected explicitly so they are not silently dropped.
  # ------------------------------------------------------------------
  def __getattr__(self, name):
      if name.startswith('_'):
          raise AttributeError(name)
      def method(*args, **kwargs):
+         if kwargs:
+             unexpected = ', '.join(sorted(kwargs.keys()))
+             raise TypeError(
+                 f"{name}() does not support keyword arguments in this bridge; "
+                 f"got unexpected keyword argument(s): {unexpected}"
+             )
```
This PR includes multiple refactors to prepare for publishing under Source Academy:

- Renames `torch` to `@sourceacademy/torch`

Adds new feature: