Here's an image I quickly drew up of what I meant in my last post.
The circle in the middle signifies the player's camera.
Everything in the blue circle is rendered locally (everything you can directly interact with and the immediate area surrounding you).
Everything in grey is being done completely in the cloud.
The orange lines signify the player's PoV/FoV.
The brownish lines are the buffer: that's what the Wii U/Xbox/PlayStation is being sent from the cloud, so you have some leeway to compensate for latency in rendering.
Think of those 3D room renders you see on hotel websites; the cloud would basically be doing the same for medium- to far-off objects, but only sending a slice marginally larger than your FoV to save on bandwidth.
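Just to put rough numbers on the "marginally larger than your FoV" part, here's a quick back-of-the-envelope sketch (all the numbers and names are mine, purely illustrative): the slice has to be wide enough that you can't turn the camera out of it before the next update arrives from the cloud.

```python
# Illustrative sketch, not any real API: how wide a slice the cloud
# would need to send so the player can't out-turn the buffer during
# one round trip. All parameter values are assumptions.

def slice_width_deg(fov_deg, turn_rate_deg_s, round_trip_ms):
    # Extra margin on each side = how far the camera can rotate
    # while waiting one round trip for the next slice.
    margin = turn_rate_deg_s * (round_trip_ms / 1000.0)
    # Can never need more than the full 360 view.
    return min(360.0, fov_deg + 2.0 * margin)

# e.g. 90 degree FoV, flicking the stick at 180 deg/s, 100 ms ping:
print(slice_width_deg(90, 180, 100))  # 90 + 2*18 = 126.0 degrees
```

So even with a fairly fast turn rate and the worst of that 50-100 ms latency, the slice only grows by a third or so over the raw FoV, which is the bandwidth saving the idea depends on.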
The problem would be keeping those synced, but with modern latencies in the 50-100 ms range, I'd say it's completely doable. As protection against unexpected ping spikes, the game could download a "key frame" of the entire 360° view (say, once every 10 seconds), so if there is lag you wouldn't be left with absolutely no background in part of your FoV. With your attention mostly on the foreground, most people wouldn't even notice.
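The fallback logic above is simple enough to sketch in a few lines (again, my own toy names and a made-up staleness threshold, not a real engine API): prefer the fresh, narrow cloud slice, and only drop back to the last full-360 key frame when a ping spike makes the slice stale.

```python
# Toy sketch of the key-frame fallback idea. Frames are modeled as
# (timestamp, image_data) tuples; STALE_AFTER_S is an assumed tolerance.

STALE_AFTER_S = 0.5  # how old a slice can be before we stop trusting it

def pick_background(slice_frame, keyframe, now):
    """Choose what to draw behind the locally rendered foreground."""
    if slice_frame and now - slice_frame[0] <= STALE_AFTER_S:
        return slice_frame[1]   # normal case: fresh cloud slice
    if keyframe:
        return keyframe[1]      # ping spike: stale but complete 360 view
    return None                 # nothing received yet: no background

# Fresh slice wins; during a 2 s spike the 360 key frame covers the gap:
print(pick_background((10.0, "slice"), (4.0, "key360"), 10.1))  # slice
print(pick_background((10.0, "slice"), (4.0, "key360"), 12.0))  # key360
```

The nice property is that the worst case degrades to a slightly outdated background rather than a hole in the world, which matches the "most people wouldn't even notice" point.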
That would be just one way it COULD be done (not saying how it will be done). If the issues could be dealt with, it'd be a great way to free up system resources (mostly RAM, since you could then just stream nearby environments on the fly instead of preloading everything) and let you improve the area immediately around the player character, where the primary focus will be.