Today at a client site I was testing the deployment of a WCM site across to an external DMZ network. I had set up a blank site collection on the destination server (SSP etc.), configured Central Administration to accept deployment requests, then switched back to the development server to start the deployment. The next 2 hours would be painful...
When I tried to set up the deployment path, it kept failing to connect to the destination Central Administration site. With the help of a local infrastructure guru we went into troubleshooting mode. There was no proxy configured, and the web.config file of each site on the server was configured not to auto-detect one, yet on the firewall trace we could see that the server was communicating with the proxy as an anonymous user. We could have allowed anonymous access on the proxy to fix the problem (which we did initially, but this is not the preferred approach), but there had to be a better way. We went off on tangents for a while, checking the registry for each of the system accounts to see whether their settings were configured to use a proxy or to auto-detect one (they weren't). We couldn't find where this 'Autodetect' IE setting was coming from (i.e. why the server was going to the proxy).
Before I spell out how we fixed it, here's a little lesson that I learnt from my infrastructure friend. Whenever a client (in this case our MOSS server) auto-detects a proxy server, it looks up a record called 'wpad' in DNS. The wpad record tells the client where to find its proxy configuration. Here it pointed at the proxy server (i.e. the ISA Server). The client then asks this server for a configuration file, which tells it which domains it can try to talk to directly and which ones have to go via the proxy. In our case, this config file was telling the client that every request must go through the proxy.
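To make the lookup a bit more concrete, here is a rough Python sketch of the decision the configuration file makes for each request. It is only an illustration: the proxy address, the domain pattern and the function names are all made up, and a real client evaluates a script served by the proxy rather than anything like this. The one conventional detail is that DNS-based detection fetches the file from http://wpad.<your domain>/wpad.dat.

```python
from fnmatch import fnmatchcase

# Hypothetical values for illustration only; not taken from the real environment.
PROXY = "PROXY isa.example.local:8080"
DIRECT_PATTERNS = ["*.dmz.example.com"]


def wpad_url(dns_domain: str) -> str:
    """URL a client conventionally requests when auto-detecting via DNS."""
    return f"http://wpad.{dns_domain}/wpad.dat"


def find_proxy_for_host(host: str, direct_patterns=DIRECT_PATTERNS, proxy=PROXY) -> str:
    """Mimic the per-host choice an auto-configuration script makes."""
    host = host.lower()
    for pattern in direct_patterns:
        # Hosts matching a listed pattern may be contacted without the proxy.
        if fnmatchcase(host, pattern.lower()):
            return "DIRECT"
    # Everything else is sent via the proxy.
    return proxy
```

With an empty pattern list every host comes back as the proxy, which is the behaviour we were seeing.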
To fix the problem we modified ISA's automatic detection script. We added the external domains to the domain include list, which essentially tells the client that it can try to talk to those remote servers directly. We then modified the firewall rules (not ISA in this case) to allow communication from the specific source MOSS server to the destination MOSS server on the nominated Central Administration port.
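If you want to confirm a firewall change like this from the source server before retrying the deployment path, a plain TCP connection test is enough. The following is a minimal sketch, not something we ran on the day; the host name and port in the comment are placeholders, and it only proves the port is reachable, not that Central Administration will accept the deployment request.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port opens within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False


# Example call (hypothetical host and port):
# can_connect("dest-moss.example.com", 12345)
```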
This fixed the communication problem immediately.
Next steps: I configured the Quick Deploy settings to allow quick deployments to occur, then created another job for a full deployment. I started the quick deploy and let it sit for a while in the 'Preparing' status (I don't know whether this is related to the OWSTimer bug of working on UTC time rather than local time), then cancelled the job. Funny thing was, shortly after this, the status changed to Running and then eventually to Failed.
I've now kicked off the full deployment, but at the moment it is still in the 'Preparing' state. I will keep you posted on whether or not this works. (BTW, I have yet to get deployment to work in a VPC environment where the source and destination sites are on the same server, and SQL is on that server too. The error I usually get is a timeout error.) It would be good if Microsoft could provide some more technical notes about what is happening under the covers when a deployment occurs. There are some code samples, but if the underlying plumbing is not working, what's the point?