| author | Anthony Wang | 2024-12-11 13:07:33 -0500 |
|---|---|---|
| committer | Anthony Wang | 2024-12-11 13:07:33 -0500 |
| commit | 9aa3c7a3e6dae93b8f0d5879ad33bca2563e9863 (patch) | |
| tree | d4630f1a782d87d32cbf14a5e6a79e4396fdf41e /content | |
| parent | a69ed79d1ec6aac4460f0168fb1a5e83ce3c4e3d (diff) | |
Link to TSP repo
Diffstat (limited to 'content')
-rw-r--r-- | content/posts/solving-shortest-paths-with-transformers.md | 2 |
1 file changed, 2 insertions, 0 deletions
diff --git a/content/posts/solving-shortest-paths-with-transformers.md b/content/posts/solving-shortest-paths-with-transformers.md
index 32431e7..94457c9 100644
--- a/content/posts/solving-shortest-paths-with-transformers.md
+++ b/content/posts/solving-shortest-paths-with-transformers.md
@@ -263,6 +263,8 @@ In this post, we've investigated off-distribution generalization behavior of tra
 We demonstrated mathematically the existence of a transformer computing shortest paths, and also found such a transformer from scratch via gradient descent. We showed that a transformer trained to compute shortest paths between two specific vertices $v_1,v_2$ can be efficiently fine-tuned to compute shortest paths to other vertices that lie on the shortest $v_1$-$v_2$ path, suggesting that our transformers' learned representations implicitly carry rich information about the graph. Finally, we showed that the transformer was able to generalize off-distribution quite well in some settings, but less well in other settings. The main conceptual take-away from our work is that it's hard to predict when models will and won't generalize.
 
+You can find our code [here](https://github.com/awestover/transformer-shortest-paths).
+
 ## Appendix
 
 ```python
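For readers skimming this commit without opening the linked repository, here is a minimal sketch of the kind of setup the conclusion describes: sample random graphs, label each with the BFS shortest-path distance between two fixed vertices (the post's $v_1$ and $v_2$), and train a small transformer encoder to predict that distance. Everything here is an assumption made for illustration (graph size `N`, edge probability `P_EDGE`, the edge-list tokenization, and the model dimensions); it is not the authors' implementation, which lives in the repository linked above.

```python
# Hypothetical illustration only -- NOT the authors' code; their actual
# implementation is in the repository linked in the commit above.
import random
from collections import deque

import torch
import torch.nn as nn
import torch.nn.functional as F

N = 8                  # vertices per graph (assumed size)
P_EDGE = 0.3           # edge probability (assumed)
MAX_LEN = N * (N - 1)  # flattened edge list: 2 tokens per edge, padded
PAD = N                # padding token id
NUM_CLASSES = N + 1    # distances up to N-1, plus class N for "unreachable"

def sample_example():
    """Sample a random graph; return (token sequence, shortest 0-to-1 distance)."""
    edges = [(u, v) for u in range(N) for v in range(u + 1, N)
             if random.random() < P_EDGE]
    adj = [[] for _ in range(N)]
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    # BFS from vertex 0 (standing in for v_1) toward vertex 1 (v_2)
    dist = [-1] * N
    dist[0] = 0
    queue = deque([0])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if dist[w] == -1:
                dist[w] = dist[u] + 1
                queue.append(w)
    label = dist[1] if dist[1] != -1 else N
    tokens = [t for edge in edges for t in edge]
    tokens += [PAD] * (MAX_LEN - len(tokens))
    return tokens, label

class PathTransformer(nn.Module):
    def __init__(self, d_model=64, nhead=4, layers=2):
        super().__init__()
        self.embed = nn.Embedding(N + 1, d_model)  # N vertex tokens + PAD
        self.pos = nn.Parameter(torch.zeros(MAX_LEN, d_model))
        layer = nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, layers)
        self.head = nn.Linear(d_model, NUM_CLASSES)

    def forward(self, x):  # x: (batch, MAX_LEN) of token ids
        h = self.encoder(self.embed(x) + self.pos)
        return self.head(h.mean(dim=1))  # mean-pool, then classify distance

model = PathTransformer()
opt = torch.optim.Adam(model.parameters(), lr=3e-4)
for step in range(500):
    batch = [sample_example() for _ in range(64)]
    x = torch.tensor([tokens for tokens, _ in batch])
    y = torch.tensor([label for _, label in batch])
    loss = F.cross_entropy(model(x), y)
    opt.zero_grad()
    loss.backward()
    opt.step()
    if step % 100 == 0:
        with torch.no_grad():
            acc = (model(x).argmax(dim=1) == y).float().mean().item()
        print(f"step {step}: loss {loss.item():.3f}, batch acc {acc:.2f}")
```

The tokenization and objective here are guesses; the post's actual experiments (including the fine-tuning to intermediate vertices mentioned in the diff) may use a different encoding and training recipe.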