author    SIPB  2024-11-11 12:55:28 -0500
committer SIPB  2024-11-16 15:48:36 -0500
commit    1fe0e023ab6787f798bbfcfddc1cf0d05e64f5eb
tree      f6a52d0cae5ea7e2b7c28aa187fe1b1f59dba495
parent    d3e7c9091bae6e51dcbd562a508861e89d0fb2be
Minor README tweaks
-rw-r--r--  README.md     | 12
-rw-r--r--  proposal.pdf  | bin 22659 -> 22706 bytes
2 files changed, 6 insertions(+), 6 deletions(-)
diff --git a/README.md b/README.md
index f88de18..a7b45b6 100644
--- a/README.md
+++ b/README.md
@@ -17,9 +17,9 @@ humans cannot provide helpful labels. Deep learning models often perform remarka
We will use a synthetic task to test our hypothesis that models will generalize truthfully off-distribution. The synthetic task is computing the distance between various vertices in an input graph. Our experiment will have three parts:
-1. Pre-train a transformer to predict the distance between two fixed vertices $s,t$ on graphs with $n\in [16, 64]$ vertices.
-2. Fine-tune a transformer to predict the distances between $s,t'$ for any $t'$ which is on the shortest path from $s$ to $t$, but only do fine-tuning on graphs with $n\in [16,32]$ vertices.
-3. Test whether the transformer can accurately predict the distances between $s,t'$ for any $t'$ on the shortest path from $s$ to $t$ for graphs with $n\in [16,64]$ vertices.
+1. Pre-train a transformer to predict the distance between two fixed vertices $s,t$ on graphs with $n\in [8, 32)$ vertices.
+2. Fine-tune a transformer to predict the distances between $s,t'$ for any $t'$ which is on the shortest path from $s$ to $t$, but only do fine-tuning on graphs with $n\in [8,16)$ vertices.
+3. Test whether the transformer can accurately predict the distances between $s,t'$ for any $t'$ on the shortest path from $s$ to $t$ for graphs with $n\in [16,32)$ vertices.
## Data
@@ -31,9 +31,9 @@ The full input to our model will additionally add the target vertex after the pa
We have three separate datasets.
-- **Pre-train data**: For each $n \in [16,64]$, we will generate several graphs on $n$ vertices. We generate these graphs by inserting $2n$ random edges into the graph. We always set the target vertex to be $2$ here.
-- **Fine-tune data**: For each $n \in [16,32]$, we will generate several graphs on $n$ vertices. We generate these graphs by inserting $2n$ random edges into the graph. We select the target vertex to be a random vertex on the shortest path from $1$ to $2$.
-- **Generalization testing data**: The same as the fine-tune data, except we sample $n \in [32,64]$ instead.
+- **Pre-train data**: For each $n \in [8,32)$, we will generate several graphs on $n$ vertices. We generate these graphs by inserting $2n$ random edges into the graph. We always set the target vertex to be $2$ here.
+- **Fine-tune data**: For each $n \in [8,16)$, we will generate several graphs on $n$ vertices. We generate these graphs by inserting $2n$ random edges into the graph. We select the target vertex to be a random vertex on the shortest path from $1$ to $2$.
+- **Generalization testing data**: The same as the fine-tune data, except we sample $n \in [16,32)$ instead.
As a side note, we are also curious whether the transformer learns to generalize to different distributions of graphs, such as denser graphs or graphs with different properties. Time permitting, we will also investigate this.
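The data-generation procedure above (insert $2n$ random edges into an $n$-vertex graph, then label with shortest-path distances) can be sketched as follows. This is an illustrative sketch, not code from this repository; the function names and the convention that vertices are labeled $1..n$ with undirected edges are assumptions for the example.

```python
import random
from collections import deque

def make_graph(n, seed=None):
    # Build an undirected graph on vertices 1..n by inserting 2n random
    # edges (self-loops skipped, duplicates collapse into the adjacency sets).
    rng = random.Random(seed)
    adj = {v: set() for v in range(1, n + 1)}
    for _ in range(2 * n):
        u, v = rng.randrange(1, n + 1), rng.randrange(1, n + 1)
        if u != v:
            adj[u].add(v)
            adj[v].add(u)
    return adj

def bfs_distances(adj, source):
    # BFS from `source`; returns a dict mapping each reachable vertex
    # to its shortest-path distance, which serves as the training label.
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist
```

Under this sketch, the pre-train label for a graph would be `bfs_distances(adj, 1)[2]` (distance from vertex $1$ to the fixed target $2$), and the fine-tune label would be the distance to a vertex sampled from the recovered shortest path.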
diff --git a/proposal.pdf b/proposal.pdf
index 20d1c52..9c8ff49 100644
--- a/proposal.pdf
+++ b/proposal.pdf
Binary files differ