It's easier to work backwards from what Pascal can do than to try to work out what the handheld is and move up from there...
Firstly, I think active cooling in the handheld is a bad idea, unless it is passively cooled on the go and the fan only runs when it's plugged into the dock. Actively cooled, the device could potentially hit normal Pascal clocks if the CUDA core count is low enough to keep the TDP down.
Looking at the X1, 256 CUDA cores @ 500 MHz is only 1.5 watts (just the GPU), so if we are talking about Pascal at 16nm for the X2, 1.1 watts at the same 500 MHz makes sense for power consumption (similar to the drop between GTX 980 and GTX 1060).
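One way to sanity-check the 1.1 watt figure is the rough sketch below. It assumes the "drop between GTX 980 and GTX 1060" means board TDP (165 W and 120 W are Nvidia's published desktop figures, not numbers from this post) and that the same ratio carries over to a mobile GPU, which is a big simplification since the two cards differ in core count and clocks.

```python
# Rough power scaling from X1 (Maxwell) to a hypothetical X2 (Pascal).
# ASSUMPTION: the Maxwell -> Pascal saving is approximated by the
# desktop TDP ratio of GTX 1060 to GTX 980.
X1_GPU_WATTS = 1.5        # from the post: 256 CUDA cores @ 500 MHz
GTX_980_TDP_W = 165.0     # Nvidia's published TDP
GTX_1060_TDP_W = 120.0    # Nvidia's published TDP

node_scale = GTX_1060_TDP_W / GTX_980_TDP_W
x2_gpu_watts = X1_GPU_WATTS * node_scale

print(f"scale factor: {node_scale:.2f}")
print(f"estimated X2 GPU power @ 500 MHz: {x2_gpu_watts:.2f} W")
```

That lands right around the 1.1 watts quoted above, so the estimate is at least self-consistent.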
Pascal can hold clocks between 1.5 GHz and 1.6 GHz indefinitely. X2 will likely have 1 of 2 configurations: either 256 CUDA cores like X1, or 384 CUDA cores. At 2 FLOPs per core per clock, that puts the docked device running at full speed somewhere between 768 GFLOPs - 819 GFLOPs for 256 CUDA cores OR 1152 GFLOPs - 1228 GFLOPs for 384 CUDA cores, targeting 1080p.
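The docked numbers come from the usual peak FP32 formula (cores x 2 FLOPs per clock x clock speed). Here is a quick sketch to reproduce them; the function name is just mine for illustration.

```python
def peak_gflops(cuda_cores: int, clock_ghz: float) -> float:
    """Peak FP32 GFLOPs: one fused multiply-add (2 FLOPs) per core per clock."""
    return cuda_cores * 2 * clock_ghz

# The two core counts and the clock range discussed above.
for cores in (256, 384):
    low = peak_gflops(cores, 1.5)
    high = peak_gflops(cores, 1.6)
    print(f"{cores} CUDA cores: {low:.1f} - {high:.1f} GFLOPs")
```

Note the 384 core top end is really 1228.8, which I've been writing as 1228.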
If the handheld is 540p, it pushes 1/4 the pixels of 1080p, so you'd need roughly 1/4 the flops, meaning 1/4 the clock. That puts the handheld at 375 MHz - 400 MHz on the go, which works out to 192 GFLOPs - 205 GFLOPs for 256 CUDA cores OR 288 GFLOPs - 307 GFLOPs for 384 CUDA cores. Because of the lower clocks, you should be looking at ~1 watt for the GPU here, give or take 0.2 watts depending on config.
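The handheld numbers can be checked the same way. This sketch assumes 540p means 960x540 and that the required flops scale linearly with pixel count, which is the simplification used above and not a measured result.

```python
DOCKED_RES = (1920, 1080)    # 1080p
HANDHELD_RES = (960, 540)    # ASSUMPTION: "540p" means 960x540

def pixel_count(resolution):
    width, height = resolution
    return width * height

# Fraction of the docked pixel load that the handheld screen needs.
scale = pixel_count(HANDHELD_RES) / pixel_count(DOCKED_RES)

# Scale the docked clock range down by the same fraction.
handheld_clocks_mhz = [docked_mhz * scale for docked_mhz in (1500, 1600)]

print(f"pixel ratio: {scale}")
for cores in (256, 384):
    low, high = (cores * 2 * mhz / 1000 for mhz in handheld_clocks_mhz)
    print(f"{cores} CUDA cores: {low:.1f} - {high:.1f} GFLOPs "
          f"@ {handheld_clocks_mhz[0]:.0f} - {handheld_clocks_mhz[1]:.0f} MHz")
```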
Pascal GFLOPs go further than GCN GFLOPs in OpenGL and DX11. Comparing these numbers to GCN, the docked 768 GFLOPs device would perform like a ~1 TFLOPs GCN part, and the docked 1228 GFLOPs device like a ~1.7 TFLOPs one. NX using X2 and running at Pascal's stable clocks should be on par with XB1 and PS4, anywhere from slightly under XB1 to slightly under PS4. Because the handheld would likely use a 540p screen, Nintendo shooting for 1228 GFLOPs makes more sense. And remember this is Nvidia we are talking about; Nintendo is just the customer. Nvidia would want to compete with AMD here and get as close to PS4 as possible, or beat it (1.7 GHz clocks needed).
It should also be noted that if an alternative dock (a console) had this same X2 chip, the two could be bridged with SLI when connected, and Nintendo could end up with a 2.5 TFLOPs Pascal device, which would be close to a ~3.2 TFLOPs GCN device. This could also be a way for Nintendo to introduce VR at a later date...
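For the SLI idea, here is a small sketch of where 2.5 TFLOPs comes from, plus the Pascal-to-GCN ratios implied by the pairs of figures quoted in this post. It assumes perfect 2x scaling across the two chips, which is the best case and not something SLI normally delivers.

```python
def peak_gflops(cuda_cores, clock_ghz):
    # Peak FP32: 2 FLOPs (one fused multiply-add) per core per clock.
    return cuda_cores * 2 * clock_ghz

single_chip = peak_gflops(384, 1.6)
paired = 2 * single_chip   # ASSUMPTION: ideal 2x scaling when bridged

print(f"one X2:  {single_chip:.1f} GFLOPs")
print(f"two X2s: {paired / 1000:.2f} TFLOPs")

# (Pascal TFLOPs, claimed GCN-equivalent TFLOPs) pairs taken from the post.
claimed_pairs = [(0.768, 1.0), (1.228, 1.7), (2.5, 3.2)]
implied_ratios = [gcn / pascal for pascal, gcn in claimed_pairs]

for (pascal, gcn), ratio in zip(claimed_pairs, implied_ratios):
    print(f"{pascal} Pascal ~ {gcn} GCN -> ratio {ratio:.2f}")
```

The implied ratios all sit in the 1.3x to 1.4x range, so treat the GCN equivalents as a rule of thumb and not exact conversions.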