From 1fe0e023ab6787f798bbfcfddc1cf0d05e64f5eb Mon Sep 17 00:00:00 2001
From: SIPB
Date: Mon, 11 Nov 2024 12:55:28 -0500
Subject: Minor README tweaks

---
 README.md    | 12 ++++++------
 proposal.pdf | Bin 22659 -> 22706 bytes
 2 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/README.md b/README.md
index f88de18..a7b45b6 100644
--- a/README.md
+++ b/README.md
@@ -17,9 +17,9 @@ humans cannot provide helpful labels. Deep learning models often perform remarka

 We will use a synthetic task to test our hypothesis that models will generalize truthfully off-distribution. The synthetic task is computing the distance between various vertices in an input graph. Our experiment will have three parts:

-1. Pre-train a transformer to predict the distance between two fixed vertices $s,t$ on graphs with $n\in [16, 64]$ vertices.
-2. Fine-tune a transformer to predict the distances between $s,t'$ for any $t'$ which is on the shortest path from $s$ to $t$, but only do fine-tuning on graphs with $n\in [16,32]$ vertices.
-3. Test whether the transformer can accurately predict the distances between $s,t'$ for any $t'$ on the shortest path from $s$ to $t$ for graphs with $n\in [16,64]$ vertices.
+1. Pre-train a transformer to predict the distance between two fixed vertices $s,t$ on graphs with $n\in [8, 32)$ vertices.
+2. Fine-tune a transformer to predict the distances between $s,t'$ for any $t'$ which is on the shortest path from $s$ to $t$, but only do fine-tuning on graphs with $n\in [8,16)$ vertices.
+3. Test whether the transformer can accurately predict the distances between $s,t'$ for any $t'$ on the shortest path from $s$ to $t$ for graphs with $n\in [16,32)$ vertices.

 ## Data

@@ -31,9 +31,9 @@ The full input to our model will additionally add the target vertex after the pa

 We have three separate datasets.

-- **Pre-train data**: For each $n \in [16,64]$, we will generate several graphs on $n$ vertices. We generate these graphs by inserting $2n$ random edges into the graph. We always set the target vertex to be $2$ here.
-- **Fine-tune data**: For each $n \in [16,32]$, we will generate several graphs on $n$ vertices. We generate these graphs by inserting $2n$ random edges into the graph. We select the target vertex to be a random vertex on the shortest path from $1$ to $2$.
-- **Generalization testing data**: The same as the fine-tune data, except we sample $n \in [32,64]$ instead.
+- **Pre-train data**: For each $n \in [8,32)$, we will generate several graphs on $n$ vertices. We generate these graphs by inserting $2n$ random edges into the graph. We always set the target vertex to be $2$ here.
+- **Fine-tune data**: For each $n \in [8,16)$, we will generate several graphs on $n$ vertices. We generate these graphs by inserting $2n$ random edges into the graph. We select the target vertex to be a random vertex on the shortest path from $1$ to $2$.
+- **Generalization testing data**: The same as the fine-tune data, except we sample $n \in [16,32)$ instead.

 As a side note, we are also curious whether the transformer learns to generalize to different distributions of graphs, such as denser graphs or graphs with different properties. Time permitting, we will also investigate this.

diff --git a/proposal.pdf b/proposal.pdf
index 20d1c52..9c8ff49 100644
Binary files a/proposal.pdf and b/proposal.pdf differ
--
cgit v1.2.3-70-g09d2
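
For reference, below is a minimal sketch of the data-generation procedure described in the README hunks above. It is not code from this repository: the `make_example` helper, the use of networkx's `gnm_random_graph` to stand in for "inserting $2n$ random edges", and the relabeling of the README's 1-indexed vertices $1$ and $2$ as $s=0$, $t=1$ are all assumptions made for illustration.

```python
# Sketch only: approximates the README's data generation under the
# assumptions noted above. Requires networkx (pip install networkx).
import random
import networkx as nx

def make_example(n, on_path_target=False):
    """Sample one graph on n vertices with 2n random edges, plus a query.

    Pre-train examples query the fixed target t; fine-tune/test examples
    query a random vertex on the shortest s-t path.
    """
    while True:
        # n vertices, 2n distinct random edges (assumption: the repo may
        # instead insert edges with replacement).
        G = nx.gnm_random_graph(n, 2 * n)
        s, t = 0, 1  # the README's vertices 1 and 2, relabeled 0-indexed
        if nx.has_path(G, s, t):
            break  # resample until s and t are connected
    if on_path_target:
        # Fine-tune/test target: a random vertex on the shortest s-t path.
        target = random.choice(nx.shortest_path(G, s, t))
    else:
        # Pre-train target: always the fixed vertex t.
        target = t
    return G, target, nx.shortest_path_length(G, s, target)

# Splits mirror the README's half-open ranges of n.
pretrain = [make_example(random.randrange(8, 32)) for _ in range(10_000)]
finetune = [make_example(random.randrange(8, 16), on_path_target=True)
            for _ in range(10_000)]
test = [make_example(random.randrange(16, 32), on_path_target=True)
        for _ in range(10_000)]
```

The resampling loop is worth noting: with only $2n$ edges the graphs are sparse, so some draws leave $s$ and $t$ disconnected and must be rejected before a distance label can be computed.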