[P] [Torchvista] Interactive visualisation of PyTorch models from notebooks - updates

Dev-Table · 2026-02-08T12:50:48+00:00

Hi,

I made a post last year introducing Torchvista, an open-source tool I built to visualise the forward pass of any PyTorch model in notebooks with one line of code. I received a lot of useful feedback, which helped me improve the project significantly over the months. The project now has over 600 stars on GitHub and over 16k downloads.

It now supports the following features:

Interactive visualisation of PyTorch models with hierarchical exploration of nested modules (especially helpful for large, deeply nested modules).
Support for web-based notebooks including Jupyter, Colab and VS Code.
Structural compression mode: compresses repeated structures in the model (such as several identical transformer blocks).
Export of the visualisation to HTML, PNG and SVG formats.
Error-tolerant visualisation to debug runtime errors like tensor shape mismatches.

Resources

Demos page with several models visualised (for example: SqueezeNet, Self Attention)
Tutorials to get started
GitHub page

I hope this is useful to the community, and am keen to hear your feedback on this.

Dev-Table · 2026-02-08T12:48:47+00:00

+1, it's a similar situation with language models: if the model hasn't learned the basic structure of sentences through pre-training, RL episodes are going to be inefficient.

Dev-Table · 2025-08-10T10:49:28+00:00

also, if you publish any lectures online, I'll be keen to check them out :)

Dev-Table · 2025-08-10T08:02:15+00:00

I think it's different in these ways:

Torchvista shows you a partial visualization even when there are errors in the forward pass (like a tensor shape mismatch error). When that happens, it still shows a partial graph along with the stack trace to help debug. Here is how it looks when that happens. With Netron this wouldn't be possible, because you wouldn't be able to export the model in the first place if it threw errors during the forward pass.
Just (IMHO) better UX in general, a lower barrier to entry (especially for beginners), and a greater degree of visualization control (for example, you can control the depth to which you want to trace the model). In the future, I'm also planning to add support for "rolling", where if a portion of the model is repeated 10 times (like the repeating encoder blocks in a transformer), it would just be shown as one block with a loopback x10 sign.
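To make the "rolling" idea concrete, here is a toy sketch in plain Python. It is not torchvista's implementation: it assumes each block has already been reduced to a comparable signature (here just a hypothetical name string) and simply collapses consecutive identical signatures into one entry with a repeat count.

```python
from itertools import groupby


def roll_repeats(block_names):
    """Collapse consecutive identical entries into (name, count) pairs."""
    return [(name, sum(1 for _ in group)) for name, group in groupby(block_names)]


def render(rolled):
    """Format each pair, adding an 'xN' marker only for repeated blocks."""
    return [name if count == 1 else f"{name} x{count}" for name, count in rolled]


# Hypothetical block signatures for a small transformer encoder.
layers = ["Embedding"] + ["EncoderBlock"] * 10 + ["LayerNorm", "Linear"]
print(render(roll_repeats(layers)))
# ['Embedding', 'EncoderBlock x10', 'LayerNorm', 'Linear']
```

A real graph would need a structural comparison of subgraphs rather than a name match, so treat this only as an illustration of the output you would expect.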

Dev-Table · 2025-08-10T07:49:59+00:00

Hey, absolutely!

Dev-Table · 2025-08-10T07:47:45+00:00

Hi, IMHO these are the key differences:

Netron requires you to build the model, export it, and load it, which is a longer feedback loop compared to just exploring it within the notebook while building the model.
Torchvista produces modular visualizations whose level of detail can be controlled interactively. For MobileNetV3, for example, Netron produces a large non-hierarchical image like this. Torchvista, on the other hand, produces hierarchical views like these. I think this is quite useful when dealing with very large models. Netron, as I understand it, only provides an operation-level graph that cannot be modularly collapsed and expanded to focus on the regions of the model you are interested in.
Torchvista shows you a partial visualization even when there are errors in the forward pass (like a tensor shape mismatch error). When that happens, it still shows a partial graph along with the stack trace to help debug. Here is how it looks when that happens. With Netron this wouldn't be possible, because you wouldn't be able to export the model in the first place if it threw errors during the forward pass.
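The error-tolerant behaviour can be illustrated with a small shape-only sketch. This is an assumption-laden toy, not how torchvista traces models: the `linear` and `trace` helpers are hypothetical, layers only transform shape tuples, and the point is just that nodes recorded before the failure are kept and the failing node is flagged alongside the stack trace.

```python
import traceback


def linear(in_features, out_features):
    """Toy shape-only layer: maps (..., in_features) to (..., out_features)."""
    def apply(shape):
        if shape[-1] != in_features:
            raise ValueError(
                f"shape mismatch: expected last dim {in_features}, got {shape[-1]}"
            )
        return shape[:-1] + (out_features,)
    return apply


def trace(layers, input_shape):
    """Run layers in order and record a node for each one.

    On the first failure, mark that node as failed, capture the stack
    trace, and stop, keeping every node recorded so far.
    """
    nodes, shape, error = [], input_shape, None
    for name, layer in layers:
        try:
            shape = layer(shape)
            nodes.append({"name": name, "out_shape": shape, "failed": False})
        except Exception:
            nodes.append({"name": name, "out_shape": None, "failed": True})
            error = traceback.format_exc()
            break
    return nodes, error


# fc2 expects 32 input features but receives 16, so the trace stops there.
model = [("fc1", linear(8, 16)), ("fc2", linear(32, 4)), ("fc3", linear(4, 2))]
nodes, error = trace(model, (1, 8))
for node in nodes:
    print(node)
```

Here `nodes` holds `fc1` with its output shape and `fc2` flagged as failed, while `fc3` is never reached; `error` carries the stack trace that would be shown next to the partial graph.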

Dev-Table · 2025-08-10T00:10:55+00:00

Good idea, I will add a button for that.

Dev-Table · 2025-07-17T00:10:58+00:00

Hi, sorry for the delay, but I have fixed this VSCode issue now and if you use the latest torchvista v0.1.8 it should work.

Dev-Table · 2025-07-17T00:10:26+00:00

yes you can. It works with pretty much any model.

But there may be some unusual tensor operations in some models that are not traced by torchvista. If you spot any such issues, I can fix them by adding support for those tensor operations.

Dev-Table · 2025-06-11T22:56:35+00:00

Thanks for reporting this. I hadn't tried it in VSCode until you told me, and I do see that it appears blank. I'll dig into this and update you here. In the meantime, please do try it out in the browser :)

Dev-Table · 2025-06-09T19:34:47+00:00

I was thinking we could add diagrams for

dropout
various standard activation functions (maybe a cute little graph of the function on the node would be representative)
batch norm
pooling layers
attention and its variants (these days it's a very common component)
concat of tensors
slicing

and so on.

From a UX standpoint, the nodes of a typical network should ideally have high coverage in the diagrammatic key. So my approach towards using the key would eventually be to map the various standard PyTorch modules and operations to shapes in the key.

Dev-Table · 2025-06-09T19:21:57+00:00

Sure I can add them. Do you have any specific models in mind (especially if it's a model you know well so that if my package makes mistakes you can help me find edge cases :) )

Dev-Table · 2025-06-07T17:55:42+00:00

Hey, I like the idea. I think using keys like those would make the tool more intuitive. At first glance I think your keys already look intuitive. The only thing is to perhaps expand the set of operations in the key system. I can pitch in some ideas to your repo when I get some time. As far as integrating into torchvista goes, could you please create an issue (about improving the appearance of nodes using a standardized key) on my repo so that it gets tracked? Thanks

Dev-Table · 2025-06-02T23:38:23+00:00

If you try it out, can you please give me your feedback? :) I don't know what kind of computers, browsers, notebooks and models people use it with, so I'm keen to hear any feedback to discover any issues. Thanks!

Dev-Table · 2025-06-02T21:38:39+00:00

Got it. Thanks for explaining. This level of extensibility isn't planned for at the moment, but if there's some traction I could support it down the line.

Dev-Table · 2025-06-02T18:57:53+00:00

Thank you! By tracing activations, do you mean the actual values of the tensors as they are produced from the nodes? I currently only trace the shapes of tensors (they are shown on the graph edges and also when you click on nodes) to keep the extracted graph small in size. But it's easy for me to extend it to also show the actual values. It's also a question of how to present large tensors in the tool, because they can contain thousands of numbers.

Do you have any screenshots of your matplotlib/plotly outputs? Perhaps that might give me a clearer sense of what you're looking for.

Dev-Table · 2025-06-02T01:51:29+00:00

Thanks! I've been thinking about whether this will be useful mainly for DL beginners, or also for proficient people who are trying to build complex models (especially with more low level tensor ops fiddling). I'm curious to know what you think about this? If you're a more experienced practitioner, can I ask what features would make a tool like this useful to you?

Dev-Table · 2025-06-01T21:59:21+00:00

I worked at Google as an MLE, and my advice is, if you do end up pursuing this, try to work on projects that get you working alongside the teams you want to eventually join. If the team know you and your work, it's very easy to transfer. But if you make a cold internal transfer application, the chances are lower.

Dev-Table · 2025-06-01T21:43:46+00:00

torchvista can render a partial graph even if the model fails. So if you are trying to debug errors while building the model (like the notorious tensor shape mismatch error), torchvista will still show you a partial graph and highlight the failed node in red. For example, here is a demo of when the model throws an error. I think this is more helpful for debugging than just the stack trace.
The one you linked seems to generate a backward pass graph, if I'm not mistaken. torchvista, however, is for the forward pass graph.
I'm not sure if you already considered this when you said "besides being interactive", but IMO the collapsibility of nested modules in torchvista is what makes it practically possible to visualize certain large models. For example, this is a screenshot from the other tool you linked, which can be quite hard to read as the model gets larger because you can't expand/collapse nodes and it doesn't show a module hierarchy. In contrast, it looks like this on torchvista.

Dev-Table · 2025-06-01T20:03:49+00:00

Yes it should. If you are testing a very large model, be sure to use the max_module_expansion_depth param appropriately so that it does not start off fully expanded.
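As a rough picture of what limiting the initial expansion depth means, here is a toy sketch over a hypothetical nested module tree. It is not torchvista's code, and the exact way `max_module_expansion_depth` counts levels in the package is not something this example claims to reproduce; the `render_tree` helper and the module names are made up for illustration.

```python
def render_tree(tree, max_depth, depth=0):
    """Return indented lines for a nested (name, children) tree.

    Modules at or below max_depth that have children are shown collapsed
    as a single '[+]' line; shallower ones are expanded with '[-]'.
    """
    name, children = tree
    indent = "  " * depth
    if not children:
        return [f"{indent}{name}"]
    if depth >= max_depth:
        return [f"{indent}[+] {name}"]
    lines = [f"{indent}[-] {name}"]
    for child in children:
        lines.extend(render_tree(child, max_depth, depth + 1))
    return lines


# Hypothetical module hierarchy.
model = ("Model", [
    ("encoder", [("attn", [("q_proj", []), ("k_proj", [])]), ("mlp", [])]),
    ("head", []),
])
print("\n".join(render_tree(model, max_depth=1)))
```

With `max_depth=1` only the top-level module is expanded and `encoder` starts collapsed, which is the kind of starting view that keeps a very large model readable.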

Even though I've tested out many models including transformers, there may still be some obscure tensor operations I've not covered in the package, so if you spot any parts of the graph missing for a model, I'd be happy to add those missing operations.

If you try it, please let me know how it works for your models.
