This is a pretty extensive question, but I will try to explain where we came from and which decisions we made.
Since our games are built for an online community that wants to get together and have a kick-ass time blowing shit up, not having online support was a non-starter. It has always been taken for granted, and hence not a budget question per se. I can’t speak for the other studios that choose not to support it, but it’s a decision they have to make based on their goals, scope, money and so on. It’s not a feature that you just “throw in”.
Networking a game is difficult. It takes time and always requires maintenance. It is a constant search for good trade-offs and for places to hide latency. We, for example, chose to make our games peer-to-peer. The main reason for this choice is that we feel the player’s view of the world should never feel incorrect. If I kill an enemy, it should die. If I get shot, I need to (have been able to) see it happen. A consequence of this is a world state that is never actually “correct” on any machine, but is an evolving state built from a constant stream of compromises. Another reason for going peer-to-peer is to avoid relying on dedicated servers, which could eventually get shut down.
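That “my view is never wrong” rule can be sketched in a few lines. This is a toy illustration with invented names (Peer, apply_local_kill, on_remote_kill), not Arrowhead’s actual code:

```python
# Toy model: each peer applies its own actions immediately and broadcasts
# them; remote claims are folded in as they arrive, so every machine holds
# a slightly different but locally consistent compromise of the world.

class Peer:
    def __init__(self, peer_id):
        self.peer_id = peer_id
        self.dead = set()     # enemy ids this peer considers dead
        self.outbox = []      # messages queued for the other peers

    def apply_local_kill(self, enemy_id):
        # The local player's action takes effect right now -- if I kill
        # an enemy, it dies on my screen with no round-trip to anyone.
        self.dead.add(enemy_id)
        self.outbox.append(("kill", self.peer_id, enemy_id))

    def on_remote_kill(self, enemy_id):
        # A remote peer's claim is accepted as-is; no machine is "the"
        # authority, so states converge rather than being dictated.
        self.dead.add(enemy_id)

a, b = Peer("A"), Peer("B")
a.apply_local_kill(42)          # dies instantly on A's machine
for _, _, enemy_id in a.outbox: # some time later, the message reaches B
    b.on_remote_kill(enemy_id)
assert 42 in a.dead and 42 in b.dead
```

The point of the sketch is the asymmetry: the local action never waits, and remote peers reconcile afterwards.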
One large drawback of this is that it is more difficult to prevent cheating, since the peers constantly negotiate the state of the world among themselves. In a competitive game, this could instead make peer-to-peer the non-starter.
We used the Bitsquid engine for Helldivers and Gauntlet, which has pretty extensive network support. It supports the peer-to-peer model as well as client-server, and has mechanisms for dealing with connections, game sessions, and sending information via Remote Procedure Calls (RPCs) or via a stream of game object data. It also has support for the major platform APIs for server browsing, friends, lobbies and so on. This saved us quite some development time. My point is that the engine is a factor in this decision as well: you might not need a team of experienced network programmers if you already have a solid and robust engine/foundation for handling networking.
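The two transport styles mentioned there serve different needs, and the distinction is worth making concrete. This is a generic illustration with made-up message shapes; Bitsquid’s actual wire format and API differ:

```python
# Generic illustration of the two transport styles: one-off RPCs for
# discrete events, vs. a repeating stream of game-object snapshots.

import json

def encode_rpc(name, **args):
    # Sent once, reliably and in order: "this event happened".
    # Suits things like "door opened" or "player joined".
    return json.dumps({"type": "rpc", "name": name, "args": args})

def encode_object_state(obj_id, state):
    # Sent every tick, typically unreliably: "latest known snapshot".
    # A lost packet is fine, because a newer snapshot follows anyway.
    return json.dumps({"type": "state", "id": obj_id, "state": state})

door_opened = encode_rpc("door_opened", door_id=7, by_peer="A")
enemy_pos = encode_object_state(42, {"x": 10.5, "y": 3.0})
```

Events that must not be missed go through the RPC path; continuously changing data like positions goes through the state stream, where dropping a stale packet costs nothing.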
Some of the twin-stick shooters out there are extremely fast paced. Depending on the network model used and the need for accuracy/anti-cheat, online play could put more strain on the bandwidth than is acceptable, and hence not be pursued. As mentioned, it’s a trade-off, and I’m sure they have their reasons for going down their path.
Now, returning to your main question: I would say that the most difficult issues we ran into stem from the choice of network model, i.e. the choice to go peer-to-peer. Since no one is fully in charge, we can receive information about something that is no longer valid. For example, a peer that kills an enemy it owns can later receive a kill request for that enemy from someone else. This requires all systems/components to validate incoming data and act accordingly. It can be difficult to remember and think of all the edge cases that can occur when implementing features for the game.
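The “validate everything that comes in” rule looks roughly like this. All names here are hypothetical; the point is only the shape of the check:

```python
# Sketch: the peer that owns an enemy may receive kill requests that are
# already stale -- the enemy died or despawned before the message arrived.
# Every handler has to tolerate that instead of assuming the world matches.

class EnemyOwner:
    def __init__(self):
        self.alive = {42: {"hp": 100}}   # enemies this peer owns

    def on_kill_request(self, enemy_id, damage):
        enemy = self.alive.get(enemy_id)
        if enemy is None:
            # Stale request: someone else already killed this enemy, or it
            # no longer exists. Ignore it rather than crash or double-count.
            return False
        enemy["hp"] -= damage
        if enemy["hp"] <= 0:
            del self.alive[enemy_id]
        return True

owner = EnemyOwner()
assert owner.on_kill_request(42, 150) is True    # first request lands
assert owner.on_kill_request(42, 150) is False   # late duplicate rejected
```

The tedious part is not this one check, but remembering to write an equivalent of it in every system that consumes remote data.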
Another big one is host migration. Since we don’t have dedicated servers, we strive to implement mechanisms for handling anyone dropping at any time, including the player who started the game. For example, if a peer owns an enemy that is executing a certain plan, and then that peer drops, someone else needs to take over. Now, we could sync all enemies’ active plans, but that might be prohibitively expensive bandwidth-wise. Another strategy is to re-evaluate what the enemy should do once it migrates to the new peer, which might cause a glitch in the matrix. Again, trade-offs.
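The cheaper of those two strategies can be sketched as follows. Everything here (the lowest-id ownership rule, the replan placeholder) is invented for illustration:

```python
# Sketch of migration-by-replanning: since plans were never synced, the
# new owner discards the old plan and re-evaluates from the enemy's
# current observable state -- cheap on bandwidth, at the cost of a
# possible visible "glitch" when behavior changes on takeover.

def choose_new_owner(peers, dropped):
    # A deterministic rule means every surviving peer picks the same new
    # owner without any extra negotiation; here, lowest remaining id.
    return sorted(p for p in peers if p != dropped)[0]

def replan(enemy):
    # Placeholder: a real game would rerun the enemy's AI here.
    return "attack_nearest"

def migrate_enemies(enemies, peers, dropped):
    new_owner = choose_new_owner(peers, dropped)
    for enemy in enemies:
        if enemy["owner"] == dropped:
            enemy["owner"] = new_owner
            enemy["plan"] = replan(enemy)   # old plan is gone; start over

enemies = [{"id": 1, "owner": "host", "plan": "flank_left"},
           {"id": 2, "owner": "B", "plan": "idle"}]
migrate_enemies(enemies, ["host", "B", "C"], dropped="host")
assert enemies[0]["owner"] == "B" and enemies[0]["plan"] == "attack_nearest"
assert enemies[1]["plan"] == "idle"   # enemies owned by survivors untouched
```

The alternative (syncing plans continuously) would make takeover seamless but pay bandwidth for state that is almost never needed.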
Lastly, the biggest is probably finding good ways to hide the latency in all cases. In Gauntlet, when an enemy gets knocked down, locally it is sent flying immediately, but it doesn’t get up until its owner decides so; the latency is hidden in the time the enemy spends lying down (so the higher the latency, the longer the enemy lies there). Another example from Gauntlet: when enemies attack you, they actually request to hit you, so that we can match up the start of the ability with the reaction of the player who gets hit; here the latency is hidden in a small gap before the enemy actually attacks. Note that we can only do this on the assumption that the player’s own local actions are the most important, which in turn follows from the co-op nature of the game and our design goals.
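The knockdown trick reduces to a simple timing model. The numbers and names below are illustrative only, not Gauntlet’s actual tuning:

```python
# Toy model of hiding latency in the knockdown: the knock-back plays
# instantly on the attacker's machine, and the round-trip to the enemy's
# owner is absorbed into how long the enemy stays on the ground.

BASE_DOWN_TIME = 2.0   # seconds the enemy is meant to lie down (invented)

def local_knockdown(t_hit):
    # Attacker's machine: the enemy goes flying at the moment of the hit,
    # with no waiting on the network.
    return {"flying_at": t_hit}

def get_up_time(t_hit, latency):
    # The owner decides when the enemy gets up; that decision arrives
    # `latency` seconds late, which on this machine just reads as the
    # enemy lying down a bit longer -- nothing ever appears frozen.
    return t_hit + BASE_DOWN_TIME + latency

assert local_knockdown(0.0)["flying_at"] == 0.0
assert get_up_time(0.0, 0.0) == 2.0
assert get_up_time(0.0, 0.5) == 2.5   # higher ping -> longer lie-down
```

The design insight is choosing a window where extra time is invisible: a downed enemy lying still for half a second longer reads as normal, while an enemy frozen mid-air would not.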
I’m sure that server-client would have a different set of challenges, but these were some of ours.
To answer your last set of questions: money is part of it; there is basically no way around that. Supporting networking in a game requires more development time and more time for testing. So if the budget is tight and the developers feel it won’t add enough to the game, it might get cut.
We were a pretty small studio and didn’t have a dedicated network team, but since online coop is one of the main features, it also got a lot of time allocated for it.
It’s not harder in an overhead/isometric twin-stick space per se, but a higher-paced game can be more difficult depending on requirements (since higher paced often means more stuff to sync).
I hope this answers your questions somewhat satisfactorily!