
On 6/7/19 1:16 PM, Giles Orr via talk wrote:
To forestall the inevitable suggestion: no, the solution is not to move to git. At least not yet: for various reasons, it can't happen right now. This is the last holdout; all our other repos are already git.
<snip>
I've so far failed at getting more logging out of SVN and Apache: what I do have doesn't tell me much useful, at least not related to these failures.
You could turn on trace logging for mod_dav, and if you are worried about spamming the logs, wrap the directive in a conditional that matches only the Jenkins host's IP. For example, 'LogLevel info dav:trace3' would turn on trace3-level logging for dav and leave everything else at info.
This problem is intermittent and infrequent. I'm thinking the next step is network sniffing - although I'm hoping someone can suggest something better. I'm relatively inexperienced with Wireshark and tcpdump (and SVN ...), but what experience I do have suggests all I'm going to get is to learn that SVN stopped providing data without finding out why or how to fix it.
First thing I'd look at is the MTU between Jenkins and the remote server. If there's some route churn, you could conceivably end up with different MTUs along different paths, which can lead to inconsistent fragmentation or timeouts. With a large SVN repo and lots of PROPFIND requests, the overhead of a bad MTU somewhere along the line would be quite noticeable. Try tracepath & tracepath6 to see what things look like between the hosts.
Also check to see if there's some mixed IPv4/IPv6 business going on. I doubt it, but I've seen inconsistent behaviour with dual-stack applications that aren't explicitly configured to support one or both.
Otherwise, to rule out SVN on Windows as the issue, try rsyncing the underlying repository and bypassing SVN entirely. Cygwin has SSH & rsync support, so you can do fast differential rsyncs. Then, in the Jenkins job, specify whatever svn operations you need to unlock and check out the correct branch & revision.
Let us know what you find!
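If it helps, the tracepath check above could be scripted so it runs alongside the Jenkins job and leaves a record each time. This is only a rough sketch: it assumes the iputils tracepath output format, where discovered values appear as "pmtu N", and the function names are made up for illustration.

```python
import re
import subprocess
from typing import List

# Assumption: iputils tracepath prints discovered path MTUs as "pmtu <N>",
# both on hop lines and on the final "Resume:" line.
PMTU_RE = re.compile(r"\bpmtu\s+(\d+)")


def pmtu_values(tracepath_output: str) -> List[int]:
    """Return every path-MTU value reported in the output, in order."""
    return [int(value) for value in PMTU_RE.findall(tracepath_output)]


def mtu_varies(tracepath_output: str) -> bool:
    """True if more than one distinct MTU shows up along the path."""
    return len(set(pmtu_values(tracepath_output))) > 1


def run_tracepath(host: str, ipv6: bool = False) -> str:
    """Run tracepath (or tracepath6) against host and return its stdout."""
    command = ["tracepath6" if ipv6 else "tracepath", host]
    result = subprocess.run(command, capture_output=True, text=True, check=False)
    return result.stdout
```

Something like mtu_varies(run_tracepath("svn.example.com")), with the hostname being a placeholder, would then flag a run where the MTU shrinks partway along the path. Comparing the pmtu_values lists from a good run and a failed run is probably more telling than any single result.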