Evaluation Report: iter2_larger_gpt_384d12

Run Name: iter2_larger_gpt_384d12

Model Type: gpt

Checkpoint: local/checkpoints/gpt_iter2_larger/best.pth

Dataset: local/datasets/single-action-shoulder-pan-700-combined

Date: 2026-03-17 23:03:22

Val Samples: 80

Analysis Notes

ITERATION 2: Larger GPT model (more capacity)
==============================================

Changes from Iteration 1:
- embed_dim: 256 -> 384 (50% increase)
- depth: 8 -> 12 (50% more transformer blocks)
- num_heads: 8 -> 12 (matched to embed_dim divisibility)
- Added gradient clipping (max_norm=1.0) via training script
- Parameter count: ~6.9M -> ~22M (3x more capacity)

Rationale: Iteration 1 showed that 7x more data improved all metrics substantially, confirming the model was data-starved. With 700 samples and a well-fitting model (train_val_gap only -0.002), there's room to increase capacity.

The main issues from iter1:
1. TF/FR gap = 0.007 (exposure bias / error compounding)
2. SSIM = 0.70 (below 0.8; predictions lack structural detail)
3. Motor consistency error = 1.17 (position/velocity disagree)
4. Dynamic MSE still 10x static MSE

More capacity should help with:
- Better learning of spatial transformations (reducing dynamic MSE)
- Richer representations for accurate visual prediction (improving SSIM)
- Better attention over context (potentially reducing error compounding)

It will NOT fix:
- Fundamental autoregressive error compounding (architectural limitation)
- Motor consistency (needs explicit consistency loss)
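As a rough sanity check on the parameter counts quoted above, the sketch below estimates the weights in the transformer blocks alone. It assumes a standard GPT-style block (four embed_dim x embed_dim attention projections plus an MLP with 4x expansion), which this report does not confirm; the function name approx_block_params and the mlp_ratio parameter are hypothetical. Embeddings, biases, layer norms, and output heads are ignored.

```python
def approx_block_params(embed_dim, depth, mlp_ratio=4):
    """Approximate weight count of `depth` GPT-style blocks (assumed layout)."""
    attn = 4 * embed_dim * embed_dim             # q, k, v and output projections
    mlp = 2 * mlp_ratio * embed_dim * embed_dim  # up- and down-projection
    return depth * (attn + mlp)

# Configurations taken from the "Changes from Iteration 1" list.
configs = {
    "iter1": {"embed_dim": 256, "depth": 8, "num_heads": 8},
    "iter2": {"embed_dim": 384, "depth": 12, "num_heads": 12},
}

estimates = {}
head_dims = {}
for name, cfg in configs.items():
    estimates[name] = approx_block_params(cfg["embed_dim"], cfg["depth"])
    # embed_dim must divide evenly by num_heads for multi-head attention.
    assert cfg["embed_dim"] % cfg["num_heads"] == 0
    head_dims[name] = cfg["embed_dim"] // cfg["num_heads"]
    print(name, estimates[name], "block params, head_dim =", head_dims[name])

print("ratio:", estimates["iter2"] / estimates["iter1"])
```

Under these assumptions the blocks account for about 6.3M and 21.2M weights, in line with the reported ~6.9M and ~22M, and the per-head dimension stays at 32 in both iterations.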

Metrics

Metric                          Value
val_mse                         0.010166
val_mse_visual                  0.010166
ssim                            0.755969
psnr                            20.385377
val_mse_motor_strip             0.005807
val_mse_action_1                0.009340
val_mse_action_2                0.011126
val_mse_static                  0.001515
val_mse_dynamic                 0.021146
motor_position_mae_mean         0.611813
motor_velocity_mae_mean         0.051904
motor_direction_accuracy        0.831250
motor_consistency_error         1.007142
gpt_teacher_forcing_loss        0.004073
gpt_free_running_loss           0.010087
gpt_tf_fr_gap                   0.006014
action_discrimination_score     0.030760
motor_discrimination_score      0.048709
motor_position_mae_per_joint    [1.7026, 0.5398, 0.6532, 0.1276, 0.6183, 0.0293]
motor_position_mae_action_1     0.590617
motor_position_mae_action_2     0.636447
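A few derived quantities help when reading the table against the iter1 issues listed in the analysis notes. The sketch below recomputes the TF/FR gap, the dynamic-to-static MSE ratio, and the change from iter1 for the three metrics the notes quote. The iter1 values are the rounded figures from the notes, so the deltas are approximate; the dictionary layout is only for illustration.

```python
# iter1 values as quoted (rounded) in the analysis notes.
iter1 = {
    "gpt_tf_fr_gap": 0.007,
    "ssim": 0.70,
    "motor_consistency_error": 1.17,
}

# iter2 values copied from the metrics table.
iter2 = {
    "gpt_teacher_forcing_loss": 0.004073,
    "gpt_free_running_loss": 0.010087,
    "gpt_tf_fr_gap": 0.006014,
    "ssim": 0.755969,
    "motor_consistency_error": 1.007142,
    "val_mse_static": 0.001515,
    "val_mse_dynamic": 0.021146,
}

# The reported gap is the free-running loss minus the teacher-forcing loss.
gap = iter2["gpt_free_running_loss"] - iter2["gpt_teacher_forcing_loss"]

# How much harder moving regions are than static ones.
dynamic_static_ratio = iter2["val_mse_dynamic"] / iter2["val_mse_static"]

# Signed change from iter1 to iter2 for the metrics the notes track.
deltas = {name: iter2[name] - iter1[name] for name in iter1}

print(f"tf/fr gap: {gap:.6f}")
print(f"dynamic/static MSE ratio: {dynamic_static_ratio:.1f}x")
for name, delta in deltas.items():
    print(f"{name}: {delta:+.6f}")
```

On these numbers SSIM rose and both the TF/FR gap and the motor consistency error fell, while dynamic MSE remains roughly 14x static MSE.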

Recommendations

Motor position and velocity predictions are inconsistent. Consider adding a consistency loss or simplifying the velocity encoding.
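One possible shape for such a consistency term is sketched below. This report does not define how motor_consistency_error is computed or how velocity is encoded, so everything here is an assumption: the function name consistency_loss, the convention that velocity at step t describes the change from step t-1 to step t, and the time step dt are all hypothetical. It is written in plain Python for clarity; in a training loop the same arithmetic would run on tensors.

```python
def consistency_loss(positions, velocities, dt=1.0):
    """Mean squared gap between predicted velocity and the velocity
    implied by consecutive predicted positions (assumed convention:
    velocities[t] covers the step from t-1 to t)."""
    if len(positions) != len(velocities):
        raise ValueError("positions and velocities must have the same length")
    if len(positions) < 2:
        raise ValueError("need at least two time steps")

    total = 0.0
    for t in range(1, len(positions)):
        implied_velocity = (positions[t] - positions[t - 1]) / dt
        total += (velocities[t] - implied_velocity) ** 2
    return total / (len(positions) - 1)


# Positions advance by 1.0 per step and the velocities agree: no penalty.
consistent = consistency_loss([0.0, 1.0, 2.0], [0.0, 1.0, 1.0])

# Positions advance but the velocities claim the joint is still: penalised.
inconsistent = consistency_loss([0.0, 1.0, 2.0], [0.0, 0.0, 0.0])

print(consistent, inconsistent)
```

A term like this would be added to the training loss with a small weight so that it nudges the two motor channels toward agreement without dominating the visual loss.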

Counterfactual Action Grids

Each grid: Row 1 = GT, Row 2 = STAY (red), Row 3 = MOVE+ (green), Row 4 = MOVE- (blue)

Sample 0

[Figure: Counterfactual grid 0]
[Figure: Error heatmap 0 (jet colormap)]

Joint   GT Pos     STAY       MOVE+      MOVE-
J0      -39.7233   -14.8561   -40.8908   -57.8810
J1      -90.9068   -89.4978   -90.6043   -90.5141
J2      66.3224    65.3759    66.8073    65.9938
J3      39.2845    39.0766    39.3497    39.3356
J4      8.5083     8.5727     8.5341     8.6252
J5      10.3896    10.4174    10.5109    10.2965

Sample 1

[Figure: Counterfactual grid 1]
[Figure: Error heatmap 1 (jet colormap)]

Joint   GT Pos     STAY       MOVE+      MOVE-
J0      -59.4725   -14.1578   -33.9736   -55.1146
J1      -89.7081   -89.0437   -89.8928   -89.7522
J2      72.1129    70.1251    72.5109    71.5609
J3      39.2845    39.0926    39.2701    39.3785
J4      8.5083     8.6670     8.4814     8.6364
J5      10.3127    10.4369    10.5217    10.2955

Sample 2

[Figure: Counterfactual grid 2]
[Figure: Error heatmap 2 (jet colormap)]

Joint   GT Pos     STAY       MOVE+      MOVE-
J0      33.1292    -12.6705   34.6836    13.4138
J1      -89.7081   -89.2315   -90.0498   -89.7038
J2      70.5801    70.5667    70.2817    70.0349
J3      39.2845    39.1667    39.2348    39.1792
J4      8.3953     8.8336     8.5933     8.8080
J5      10.5433    10.4777    10.5126    10.3922

Sample 3

[Figure: Counterfactual grid 3]
[Figure: Error heatmap 3 (jet colormap)]

Joint   GT Pos     STAY       MOVE+      MOVE-
J0      43.2233    -2.4153    41.3572    23.1997
J1      -89.1403   -86.6571   -89.4305   -89.1867
J2      84.2048    84.7296    84.1913    84.2750
J3      40.3077    39.5912    39.6176    39.6883
J4      8.5083     8.8301     8.6723     8.8720
J5      10.5433    10.5489    10.5672    10.4516
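The J0 rows are where the three counterfactual actions diverge most, so a quick way to read the four tables is to ask which action's predicted J0 position lands closest to the ground truth. The sketch below does that with the J0 values copied from the tables above. The report does not state which action was actually taken in each sample, so this is a reading aid rather than an accuracy measure; the helper name closest_action is hypothetical.

```python
# J0 values per sample: (GT Pos, STAY, MOVE+, MOVE-), copied from the tables.
j0_rows = [
    (-39.7233, -14.8561, -40.8908, -57.8810),  # Sample 0
    (-59.4725, -14.1578, -33.9736, -55.1146),  # Sample 1
    (33.1292, -12.6705, 34.6836, 13.4138),     # Sample 2
    (43.2233, -2.4153, 41.3572, 23.1997),      # Sample 3
]
actions = ("STAY", "MOVE+", "MOVE-")


def closest_action(row):
    """Return the action whose predicted J0 is nearest the ground truth,
    together with the absolute error for every action."""
    gt, *predictions = row
    errors = {a: abs(p - gt) for a, p in zip(actions, predictions)}
    return min(errors, key=errors.get), errors


closest = []
for index, row in enumerate(j0_rows):
    best, errors = closest_action(row)
    closest.append(best)
    summary = ", ".join(f"{a}={e:.2f}" for a, e in errors.items())
    print(f"Sample {index}: closest={best} ({summary})")
```

In all four samples the STAY prediction is the farthest from the ground truth on J0, and the three actions produce clearly separated J0 positions, while J1 to J5 barely move between actions.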

GPT Per-Position Loss (last frame, raster order)

Position  MSE
0         0.004366
1         0.004235
2         0.003318
3         0.004040
4         0.002635
5         0.001353
6         0.000418
7         0.000364
8         0.000265
9         0.000246
10        0.000554
11        0.004441
12        0.006497
13        0.007173
14        0.003540
15        0.001071
16        0.001446
17        0.001381
18        0.000739
19        0.000488
20        0.000415
21        0.000355
22        0.000406
23        0.002242
24        0.005738
25        0.005935
26        0.007013
27        0.010850
28        0.000748
29        0.001387
30        0.000550
31        0.001322
32        0.001046
33        0.000242
34        0.000376
35        0.000318
36        0.000559
37        0.004274
38        0.006134
39        0.005908
40        0.005478
41        0.007304
42        0.001007
43        0.000864
44        0.001030
45        0.001296
46        0.000529
47        0.000316
48        0.000372
49        0.000272
50        0.000996
51        0.004273
52        0.004642
53        0.004821
54        0.005584
55        0.007514
56        0.000399
57        0.000553
58        0.000409
59        0.000374
60        0.000364
61        0.000376
62        0.000300
63        0.000257
64        0.000903
65        0.002732
66        0.004458
67        0.004950
68        0.003752
69        0.006010
70        0.000486
71        0.000554
72        0.000332
73        0.000226
74        0.000300
75        0.000171
76        0.000147
77        0.001418
78        0.003291
79        0.002393
80        0.001631
81        0.002451
82        0.002643
83        0.002439
84        0.000826
85        0.000499
86        0.000241
87        0.000165
88        0.000203
89        0.000145
90        0.000141
91        0.000288
92        0.001330
93        0.000699
94        0.000518
95        0.001501
96        0.002196
97        0.001605
98        0.000403
99        0.000292
100       0.000256
101       0.000236
102       0.000177
103       0.000142
104       0.000150
105       0.000269
106       0.000559
107       0.000684
108       0.000365
109       0.000513
110       0.000863
111       0.004003
112       0.000776
113       0.000211
114       0.000236
115       0.000224
116       0.000210
117       0.000189
118       0.000146
119       0.000265
120       0.000439
121       0.000329
122       0.000277
123       0.000170
124       0.001398
125       0.004803
126       0.000410
127       0.000219
128       0.000220
129       0.000247
130       0.000230
131       0.000151
132       0.000131
133       0.000149
134       0.000238
135       0.000241
136       0.000186
137       0.000337
138       0.002012
139       0.002559
140       0.000188
141       0.000143
142       0.000157
143       0.000178
144       0.000183
145       0.000156
146       0.000135
147       0.000112
148       0.000228
149       0.000217
150       0.000362
151       0.001074
152       0.001461
153       0.002125
154       0.000207
155       0.000159
156       0.000153
157       0.000141
158       0.000149
159       0.000129
160       0.000121
161       0.000072
162       0.000072
163       0.000194
164       0.000284
165       0.001801
166       0.000704
167       0.001515
168       0.000172
169       0.000140
170       0.000133
171       0.000116
172       0.000118
173       0.000124
174       0.000116
175       0.000082
176       0.000070
177       0.000096
178       0.000281
179       0.000624
180       0.000224
181       0.000846
182       0.000163
183       0.000129
184       0.000129
185       0.000110
186       0.000117
187       0.000134
188       0.000232
189       0.000292
190       0.000277
191       0.000149
192       0.000168
193       0.000431
194       0.000392
195       0.000784
196       0.026444
197       0.016779
198       0.011916
199       0.013906
200       0.013947
201       0.014100
202       0.017798
203       0.015685
204       0.023945
205       0.018929
206       0.018423
207       0.022384
208       0.018985
209       0.020869
210       0.016290
211       0.011444
212       0.007635
213       0.010872
214       0.007458
215       0.006765
216       0.010697
217       0.011760
218       0.012712
219       0.014239
220       0.013095
221       0.010110
222       0.009889
223       0.008705
224       0.016485
225       0.011683
226       0.009467
227       0.011115
228       0.008075
229       0.006564
230       0.010089
231       0.011900
232       0.012526
233       0.013578
234       0.009336
235       0.006940
236       0.009023
237       0.006196
238       0.013014
239       0.010126
240       0.008455
241       0.009669
242       0.007123
243       0.006064
244       0.008147
245       0.010423
246       0.010479
247       0.010630
248       0.007288
249       0.006928
250       0.008154
251       0.008875
252       0.012215
253       0.006741
254       0.007989
255       0.007715
256       0.007044
257       0.005791
258       0.007569
259       0.008973
260       0.010641
261       0.009927
262       0.008423
263       0.007324
264       0.005127
265       0.005218
266       0.009777
267       0.005100
268       0.008150
269       0.006397
270       0.006410
271       0.007435
272       0.006923
273       0.007574
274       0.010194
275       0.006569
276       0.004474
277       0.003714
278       0.004069
279       0.004044
280       0.009561
281       0.005394
282       0.005804
283       0.005639
284       0.005988
285       0.007470
286       0.006579
287       0.012233
288       0.006802
289       0.004772
290       0.004192
291       0.002671
292       0.004689
293       0.003695
294       0.005938
295       0.006439
296       0.006257
297       0.004979
298       0.007706
299       0.003895
300       0.006391
301       0.009795
302       0.008051
303       0.004552
304       0.002750
305       0.002076
306       0.003142
307       0.002814
308       0.004724
309       0.003036
310       0.003898
311       0.004720
312       0.001165
313       0.010703
314       0.006815
315       0.004560
316       0.005716
317       0.004160
318       0.003500
319       0.003082
320       0.001376
321       0.001634
322       0.004898
323       0.002882
324       0.006093
325       0.001429
326       0.000291
327       0.010254
328       0.007048
329       0.001154
330       0.005798
331       0.004422
332       0.002151
333       0.001953
334       0.000941
335       0.001118
336       0.005934
337       0.005235
338       0.005460
339       0.000290
340       0.000201
341       0.008306
342       0.007402
343       0.000770
344       0.002799
345       0.005110
346       0.002407
347       0.001624
348       0.001121
349       0.001382
350       0.010047
351       0.007766
352       0.004653
353       0.000211
354       0.000194
355       0.011328
356       0.009775
357       0.000897
358       0.000547
359       0.004684
360       0.002996
361       0.001049
362       0.001481
363       0.000746
364       0.007438
365       0.009012
366       0.001253
367       0.000142
368       0.000303
369       0.012323
370       0.004122
371       0.001815
372       0.000307
373       0.000449
374       0.001999
375       0.002026
376       0.000923
377       0.000524
378       0.011223
379       0.003464
380       0.000136
381       0.000113
382       0.000555
383       0.007499
384       0.005095
385       0.001606
386       0.000187
387       0.000127
388       0.000720
389       0.001608
390       0.001177
391       0.000720
392       0.000494
393       0.002463
394       0.000447
395       0.000173
396       0.016853
397       0.000055
398       0.000128
399       0.019005
400       0.021321
401       0.000157
402       0.011824
403       0.003638
404       0.000020
405       0.000026
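Because the positions are listed in raster order, the flat list is easier to interpret once folded back into rows. The sketch below does this for the first 28 positions of the table. The row width of 14 is an assumption, not something the report states: it is suggested by 406 = 14 x 29 and by the loss peaking at positions 13, 27, 41, and so on. The helper name to_grid is hypothetical.

```python
def to_grid(flat, width):
    """Fold a raster-order list into rows of `width` values."""
    if len(flat) % width != 0:
        raise ValueError("length must be a multiple of the row width")
    return [flat[i:i + width] for i in range(0, len(flat), width)]


# MSE for positions 0-27, copied from the table above.
first_28 = [
    0.004366, 0.004235, 0.003318, 0.004040, 0.002635, 0.001353, 0.000418,
    0.000364, 0.000265, 0.000246, 0.000554, 0.004441, 0.006497, 0.007173,
    0.003540, 0.001071, 0.001446, 0.001381, 0.000739, 0.000488, 0.000415,
    0.000355, 0.000406, 0.002242, 0.005738, 0.005935, 0.007013, 0.010850,
]

ASSUMED_WIDTH = 14  # assumption: inferred from the table, not stated in the report

grid = to_grid(first_28, ASSUMED_WIDTH)

# Column index of the largest loss in each row.
hottest = [max(range(len(row)), key=row.__getitem__) for row in grid]
row_means = [sum(row) / len(row) for row in grid]

for r, (col, mean) in enumerate(zip(hottest, row_means)):
    print(f"row {r}: mean MSE {mean:.6f}, highest loss at column {col}")
```

Applied to all 406 values, the same fold makes it easy to see that positions from 196 onward carry markedly higher loss than the positions before them.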