Calling all networking and SVN gurus

7 Jun 2019

      To forestall the inevitable suggestion: no, the solution is not to move to
git.  At least not yet: for various reasons, it can't happen right now.
This is the last holdout, all our other repos are already git.

I apologize for the length of this dissertation: I've been doing my
homework as fast as I can, and I want to provide complete information.

I've just moved a previously semi-local (different building but "on
campus") SVN server to a cloud instance (running Debian, SVN 1.9.5, and
Apache 2.4.25, access via https://).  For the most part it went smoothly,
but now our "on campus" Jenkins server intermittently loses the network
connection on 'svn up' so the deploy fails.  That's bad.  What's far worse
is that while Jenkins initially resumed these broken updates politely when
the deployment was re-run, it's now decided that these resumes have left a
locked workspace and it has to do a fresh checkout.  One of the problems
with moving this repository to git is that SVN trunk is ~5G in size: a
checkout to a local client takes 5-10 minutes, but for some reason a
Jenkins checkout takes about 30 minutes (this is all pretty new to me, and
I haven't had time to investigate that time difference yet).  An update
takes a 60 seconds, but a new checkout takes 30 minutes - you can see where
that causes delays in deployment.

One possible mitigation is to trap the SVN failure, do a clean-up on the
directory and re-run.  I may have to try this, but ... that's just
mitigation, not a solution.

The Jenkins server is on Windows (I wasn't given a choice) and mostly works
well.  It uses Cygwin for all the SVN stuff (SVN version 1.11.x).  It's
also at a different physical location from me with different network rules.

The critical lines of the failure error:

    org.tmatesoft.svn.core.SVNException: svn: E175002: Connection reset

    svn: E175002: REPORT request failed on '/svn/repo/!svn/vcc/default'

(It being Java, the errors run to 40 or 50 lines: I think this is the only
part that's important.)  Unfortunately, this is one of those errors that
Google searches produce lots of questions, lots of speculations ... and no
solid answers.  At least not that I've found.  Likewise, a lot of people
want to know, as I did, about the relatively unusual filepath
("!svn/vcc/default") but I've never seen a solid answer as to what that's
about either.  The logs show Jenkins requests against that filepath with
both REPORT and PROPFIND, but Jenkins is only failing on REPORT.  Both of
these request types are WebDAV extensions.

Our staff don't seem to be having any trouble checking out or updating the
repository across a mix of Windows and Mac clients.

I've so far failed at getting more logging out of SVN and Apache: what I do
have doesn't tell me much useful, at least not related to these failures.

This problem is intermittent and infrequent.  I'm thinking the next step is
network sniffing - although I'm hoping someone can suggest something
better.  I'm relatively inexperienced with Wireshark and tcpdump (and SVN
...), but what experience I do have suggests all I'm going to get is to
learn that SVN stopped providing data without finding out why or how to fix
it.

Any suggestions welcomed, thanks.

-- 
Giles
https://www.gilesorr.com/
gilesorr@gmail.com

Giles Orr

William Porquet

Jamon Camisso

James Knott

Jamon Camisso

James Knott

lsorense＠csclub.uwaterloo.ca

Giles Orr

tags

participants (5)