
We have a bunch of new(ish) Debian 10 VMs, and logrotate is failing to rotate our non-standard logs. Unfortunately we deleted all the old Debian 9 VMs before I noticed this problem, so they're not readily available for comparison. The logrotate config files worked fine on Debian 9 (provisioning is with Ansible, so it's consistent). The failures aren't detailed enough to help. Here's the config: # /etc/logrotate.d/ruby /opt/rubyapp/log/*.log { daily missingok rotate 28 compress delaycompress copytruncate } The parent configuration is standard Debian 10: # /etc/logrotate.conf # (system-supplied comments removed) weekly rotate 4 create include /etc/logrotate.d Unfortunately my paranoia is such that I'm redacting or modifying machine names and folder names ... I apologize for that. But I don't think the path involved is the problem. Here's one of the errors: # systemctl status logrotate.service ● logrotate.service - Rotate log files Loaded: loaded (/lib/systemd/system/logrotate.service; static; vendor preset: enabled) Active: failed (Result: exit-code) since Thu 2019-10-17 00:00:17 EDT; 12h ago Docs: man:logrotate(8) man:logrotate.conf(5) Process: 29004 ExecStart=/usr/sbin/logrotate /etc/logrotate.conf (code=exited, status=1/FAILURE) Main PID: 29004 (code=exited, status=1/FAILURE) Oct 17 00:00:01 acctserver systemd[1]: Starting Rotate log files... Oct 17 00:00:14 acctserver logrotate[8621]: error: unable to open /opt/rubyapp/log/newrelic_agent.log.1 for compression Oct 17 00:00:14 acctserver logrotate[8621]: error: unable to open /opt/rubyapp/log/puma.stderr.log.1 for compression Oct 17 00:00:14 acctserver logrotate[8621]: error: unable to open /opt/rubyapp/log/puma.stdout.log.1 for compression Oct 17 00:00:14 acctserver logrotate[8621]: error: unable to open /opt/rubyapp/log/traffic.log.1 for compression Oct 17 00:00:17 acctserver systemd[1]: logrotate.service: Main process exited, code=exited, status=1/FAILURE Oct 17 00:00:17 acctserver systemd[1]: logrotate.service: Failed with result 'exit-code'. Oct 17 00:00:17 acctserver systemd[1]: Failed to start Rotate log files. Here's the folder contents: # cd /opt/rubyapp/log # ls -l -rw-rw-r--+ 1 root root 1982 Oct 16 15:08 newrelic_agent.log -rw-rw-r--+ 1 root root 7194 Oct 16 13:37 newrelic_agent.log.1 -rw-rw-r--+ 1 root root 2549 Oct 10 17:45 newrelic_agent.log.2.gz -rw-rw-r--+ 1 root root 154290 Oct 17 12:34 puma.stderr.log -rw-rw-r--+ 1 root root 573253 Oct 16 13:37 puma.stderr.log.1 -rw-rw-r--+ 1 root root 512648 Oct 10 17:45 puma.stderr.log.2.gz -rw-rw-r--+ 1 root root 238 Oct 16 15:08 puma.stdout.log -rw-rw-r--+ 1 root root 722 Oct 16 13:37 puma.stdout.log.1 -rw-rw-r--+ 1 root root 701 Oct 10 17:45 puma.stdout.log.2.gz -rw-rw-r--+ 1 root root 4747006453 Oct 17 12:37 traffic.log -rw-rw-r--+ 1 root root 15668065757 Oct 10 17:55 traffic.log.1 -rw-rw-r--+ 1 root root 850646513 Sep 20 18:12 traffic.log.2.gz I note that in /var/log/ - where logrotate continues to work fine - that files are owned mostly root:adm (what is 'adm', and does it matter in this context?) and the permissions are 640 rather than 664. There are ACLs attached to the files/folder shown above ... does _that_ matter? Where this gets weirder is that if I run 'logrotate --force /etc/logrotate.d/ruby' it gets rotated fine. It runs fine if run by hand, it fails if run on a SystemD timer. Which suggests a difference in permissions, but don't timers run as root:root? Any thoughts appreciated. As you can see, these are damn big logs, and we have this problem across multiple machines so I'd really like to fix it ... Errors on other servers aren't always consistent with this: a fix for this may or may not help with them, so I may be coming back for more. Thanks all. -- Giles https://www.gilesorr.com/ gilesorr@gmail.com

On 2019-10-17 1:59 p.m., Giles Orr via talk wrote:
We have a bunch of new(ish) Debian 10 VMs, and logrotate is failing to rotate our non-standard logs. Unfortunately we deleted all the old Debian 9 VMs before I noticed this problem, so they're not readily available for comparison. The logrotate config files worked fine on Debian 9 (provisioning is with Ansible, so it's consistent). The failures aren't detailed enough to help. Here's the config:
# /etc/logrotate.d/ruby /opt/rubyapp/log/*.log { daily missingok rotate 28 compress delaycompress copytruncate }
The parent configuration is standard Debian 10:
# /etc/logrotate.conf # (system-supplied comments removed) weekly rotate 4 create include /etc/logrotate.d
Unfortunately my paranoia is such that I'm redacting or modifying machine names and folder names ... I apologize for that. But I don't think the path involved is the problem.
Here's one of the errors:
# systemctl status logrotate.service ● logrotate.service - Rotate log files Loaded: loaded (/lib/systemd/system/logrotate.service; static; vendor preset: enabled) Active: failed (Result: exit-code) since Thu 2019-10-17 00:00:17 EDT; 12h ago Docs: man:logrotate(8) man:logrotate.conf(5) Process: 29004 ExecStart=/usr/sbin/logrotate /etc/logrotate.conf (code=exited, status=1/FAILURE) Main PID: 29004 (code=exited, status=1/FAILURE)
Oct 17 00:00:01 acctserver systemd[1]: Starting Rotate log files... Oct 17 00:00:14 acctserver logrotate[8621]: error: unable to open /opt/rubyapp/log/newrelic_agent.log.1 for compression Oct 17 00:00:14 acctserver logrotate[8621]: error: unable to open /opt/rubyapp/log/puma.stderr.log.1 for compression Oct 17 00:00:14 acctserver logrotate[8621]: error: unable to open /opt/rubyapp/log/puma.stdout.log.1 for compression Oct 17 00:00:14 acctserver logrotate[8621]: error: unable to open /opt/rubyapp/log/traffic.log.1 for compression Oct 17 00:00:17 acctserver systemd[1]: logrotate.service: Main process exited, code=exited, status=1/FAILURE Oct 17 00:00:17 acctserver systemd[1]: logrotate.service: Failed with result 'exit-code'. Oct 17 00:00:17 acctserver systemd[1]: Failed to start Rotate log files.
Here's the folder contents:
# cd /opt/rubyapp/log # ls -l -rw-rw-r--+ 1 root root 1982 Oct 16 15:08 newrelic_agent.log -rw-rw-r--+ 1 root root 7194 Oct 16 13:37 newrelic_agent.log.1 -rw-rw-r--+ 1 root root 2549 Oct 10 17:45 newrelic_agent.log.2.gz -rw-rw-r--+ 1 root root 154290 Oct 17 12:34 puma.stderr.log -rw-rw-r--+ 1 root root 573253 Oct 16 13:37 puma.stderr.log.1 -rw-rw-r--+ 1 root root 512648 Oct 10 17:45 puma.stderr.log.2.gz -rw-rw-r--+ 1 root root 238 Oct 16 15:08 puma.stdout.log -rw-rw-r--+ 1 root root 722 Oct 16 13:37 puma.stdout.log.1 -rw-rw-r--+ 1 root root 701 Oct 10 17:45 puma.stdout.log.2.gz -rw-rw-r--+ 1 root root 4747006453 Oct 17 12:37 traffic.log -rw-rw-r--+ 1 root root 15668065757 Oct 10 17:55 traffic.log.1 -rw-rw-r--+ 1 root root 850646513 Sep 20 18:12 traffic.log.2.gz
I note that in /var/log/ - where logrotate continues to work fine - that files are owned mostly root:adm (what is 'adm', and does it matter in this context?) and the permissions are 640 rather than 664. There are ACLs attached to the files/folder shown above ... does _that_ matter? Where this gets weirder is that if I run 'logrotate --force /etc/logrotate.d/ruby' it gets rotated fine. It runs fine if run by hand, it fails if run on a SystemD timer. Which suggests a difference in permissions, but don't timers run as root:root?
Any thoughts appreciated. As you can see, these are damn big logs, and we have this problem across multiple machines so I'd really like to fix it ...
Errors on other servers aren't always consistent with this: a fix for this may or may not help with them, so I may be coming back for more.
Thanks all.
-- Giles https://www.gilesorr.com/ gilesorr@gmail.com <mailto:gilesorr@gmail.com>
--- Post to this mailing list talk@gtalug.org Unsubscribe from this mailing list https://gtalug.org/mailman/listinfo/talk
I note that the versions of logrotate in stable (and also in unstable) have Bug "#831764 [logrotate] logrotate: does not rotate logs if delaycompress is used" logged against them. I'm not sure your problem sounds like the one posted in the bug report though. I think I remember from a previous life that logrotate will fail if the logging entity (rubyapp?) holds the file open. In that case you can try and use the post-rotate directive to "killall -hup .. " the entity, which should kick it off the file. Does /var/lib/logrotate/status have anything to say about the log? It should indicate when the last rotate occurred. -- Michael Galea

On Thu, 17 Oct 2019 at 19:46, Michael Galea via talk <talk@gtalug.org> wrote:
On 2019-10-17 1:59 p.m., Giles Orr via talk wrote:
We have a bunch of new(ish) Debian 10 VMs, and logrotate is failing to rotate our non-standard logs. Unfortunately we deleted all the old Debian 9 VMs before I noticed this problem, so they're not readily available for comparison. The logrotate config files worked fine on Debian 9 (provisioning is with Ansible, so it's consistent). The failures aren't detailed enough to help. Here's the config:
# /etc/logrotate.d/ruby /opt/rubyapp/log/*.log { daily missingok rotate 28 compress delaycompress copytruncate }
The parent configuration is standard Debian 10:
# /etc/logrotate.conf # (system-supplied comments removed) weekly rotate 4 create include /etc/logrotate.d
Unfortunately my paranoia is such that I'm redacting or modifying machine names and folder names ... I apologize for that. But I don't think the path involved is the problem.
Here's one of the errors:
# systemctl status logrotate.service ● logrotate.service - Rotate log files Loaded: loaded (/lib/systemd/system/logrotate.service; static; vendor preset: enabled) Active: failed (Result: exit-code) since Thu 2019-10-17 00:00:17 EDT; 12h ago Docs: man:logrotate(8) man:logrotate.conf(5) Process: 29004 ExecStart=/usr/sbin/logrotate /etc/logrotate.conf (code=exited, status=1/FAILURE) Main PID: 29004 (code=exited, status=1/FAILURE)
Oct 17 00:00:01 acctserver systemd[1]: Starting Rotate log files... Oct 17 00:00:14 acctserver logrotate[8621]: error: unable to open /opt/rubyapp/log/newrelic_agent.log.1 for compression Oct 17 00:00:14 acctserver logrotate[8621]: error: unable to open /opt/rubyapp/log/puma.stderr.log.1 for compression Oct 17 00:00:14 acctserver logrotate[8621]: error: unable to open /opt/rubyapp/log/puma.stdout.log.1 for compression Oct 17 00:00:14 acctserver logrotate[8621]: error: unable to open /opt/rubyapp/log/traffic.log.1 for compression Oct 17 00:00:17 acctserver systemd[1]: logrotate.service: Main process exited, code=exited, status=1/FAILURE Oct 17 00:00:17 acctserver systemd[1]: logrotate.service: Failed with result 'exit-code'. Oct 17 00:00:17 acctserver systemd[1]: Failed to start Rotate log files.
Here's the folder contents:
# cd /opt/rubyapp/log # ls -l -rw-rw-r--+ 1 root root 1982 Oct 16 15:08 newrelic_agent.log -rw-rw-r--+ 1 root root 7194 Oct 16 13:37
newrelic_agent.log.1
-rw-rw-r--+ 1 root root 2549 Oct 10 17:45 newrelic_agent.log.2.gz -rw-rw-r--+ 1 root root 154290 Oct 17 12:34 puma.stderr.log -rw-rw-r--+ 1 root root 573253 Oct 16 13:37 puma.stderr.log.1 -rw-rw-r--+ 1 root root 512648 Oct 10 17:45
puma.stderr.log.2.gz
-rw-rw-r--+ 1 root root 238 Oct 16 15:08 puma.stdout.log -rw-rw-r--+ 1 root root 722 Oct 16 13:37 puma.stdout.log.1 -rw-rw-r--+ 1 root root 701 Oct 10 17:45
puma.stdout.log.2.gz
-rw-rw-r--+ 1 root root 4747006453 Oct 17 12:37 traffic.log -rw-rw-r--+ 1 root root 15668065757 Oct 10 17:55 traffic.log.1 -rw-rw-r--+ 1 root root 850646513 Sep 20 18:12 traffic.log.2.gz
I note that in /var/log/ - where logrotate continues to work fine - that files are owned mostly root:adm (what is 'adm', and does it matter in this context?) and the permissions are 640 rather than 664. There are ACLs attached to the files/folder shown above ... does _that_ matter? Where this gets weirder is that if I run 'logrotate --force /etc/logrotate.d/ruby' it gets rotated fine. It runs fine if run by hand, it fails if run on a SystemD timer. Which suggests a difference in permissions, but don't timers run as root:root?
Any thoughts appreciated. As you can see, these are damn big logs, and we have this problem across multiple machines so I'd really like to fix it ...
Errors on other servers aren't always consistent with this: a fix for this may or may not help with them, so I may be coming back for more.
Thanks all.
-- Giles
I note that the versions of logrotate in stable (and also in unstable) have Bug "#831764 [logrotate] logrotate: does not rotate logs if delaycompress is used" logged against them. I'm not sure your problem sounds like the one posted in the bug report though.
I think I remember from a previous life that logrotate will fail if the logging entity (rubyapp?) holds the file open. In that case you can try and use the post-rotate directive to "killall -hup .. " the entity, which should kick it off the file.
Does /var/lib/logrotate/status have anything to say about the log? It should indicate when the last rotate occurred.
Yup, Ruby does hold the file open, but the logrotate option 'copytruncate' was created for exactly that reason and worked on Debian 9: again, this works when run by hand on Debian 10, which I think suggests that's not the problem. I was aware of the /var/lib/logrotate/status file, but didn't look at it closely until you mentioned it. And in looking at it closely, it mostly looks as it should BUT I see a problem, an artifact of a previous attempt to do log rotation on the parent folder: "/opt/rubyapp/*.log" 2019-9-17-0:0:0 "/opt/rubyapp/log/traffic.log" 2019-10-18-0:0:5 "/opt/rubyapp/log/puma.stdout.log" 2019-10-18-0:0:5 (lots of other lines deleted). Notice that first line: we've been inconsistent across our projects about where the Ruby/Puma logs were stored, so I tried to rotate the logs in both places. I've removed that artifact. It's a long shot, but at this point I'm giving percentage to long shots ... I'm also removing "delaycompress" from one of the hosts, we'll see if that does anything. Again, seems like a long shot, but ... Thanks. -- Giles https://www.gilesorr.com/ gilesorr@gmail.com

On Thu, Oct 17, 2019 at 01:59:18PM -0400, Giles Orr via talk wrote:
We have a bunch of new(ish) Debian 10 VMs, and logrotate is failing to rotate our non-standard logs. Unfortunately we deleted all the old Debian 9 VMs before I noticed this problem, so they're not readily available for comparison. The logrotate config files worked fine on Debian 9 (provisioning is with Ansible, so it's consistent). The failures aren't detailed enough to help. Here's the config:
# /etc/logrotate.d/ruby /opt/rubyapp/log/*.log { daily missingok rotate 28 compress delaycompress copytruncate }
The parent configuration is standard Debian 10:
# /etc/logrotate.conf # (system-supplied comments removed) weekly rotate 4 create include /etc/logrotate.d
Unfortunately my paranoia is such that I'm redacting or modifying machine names and folder names ... I apologize for that. But I don't think the path involved is the problem.
Here's one of the errors:
# systemctl status logrotate.service ● logrotate.service - Rotate log files Loaded: loaded (/lib/systemd/system/logrotate.service; static; vendor preset: enabled) Active: failed (Result: exit-code) since Thu 2019-10-17 00:00:17 EDT; 12h ago Docs: man:logrotate(8) man:logrotate.conf(5) Process: 29004 ExecStart=/usr/sbin/logrotate /etc/logrotate.conf (code=exited, status=1/FAILURE) Main PID: 29004 (code=exited, status=1/FAILURE)
Oct 17 00:00:01 acctserver systemd[1]: Starting Rotate log files... Oct 17 00:00:14 acctserver logrotate[8621]: error: unable to open /opt/rubyapp/log/newrelic_agent.log.1 for compression Oct 17 00:00:14 acctserver logrotate[8621]: error: unable to open /opt/rubyapp/log/puma.stderr.log.1 for compression Oct 17 00:00:14 acctserver logrotate[8621]: error: unable to open /opt/rubyapp/log/puma.stdout.log.1 for compression Oct 17 00:00:14 acctserver logrotate[8621]: error: unable to open /opt/rubyapp/log/traffic.log.1 for compression Oct 17 00:00:17 acctserver systemd[1]: logrotate.service: Main process exited, code=exited, status=1/FAILURE Oct 17 00:00:17 acctserver systemd[1]: logrotate.service: Failed with result 'exit-code'. Oct 17 00:00:17 acctserver systemd[1]: Failed to start Rotate log files.
Here's the folder contents:
# cd /opt/rubyapp/log # ls -l -rw-rw-r--+ 1 root root 1982 Oct 16 15:08 newrelic_agent.log -rw-rw-r--+ 1 root root 7194 Oct 16 13:37 newrelic_agent.log.1 -rw-rw-r--+ 1 root root 2549 Oct 10 17:45 newrelic_agent.log.2.gz -rw-rw-r--+ 1 root root 154290 Oct 17 12:34 puma.stderr.log -rw-rw-r--+ 1 root root 573253 Oct 16 13:37 puma.stderr.log.1 -rw-rw-r--+ 1 root root 512648 Oct 10 17:45 puma.stderr.log.2.gz -rw-rw-r--+ 1 root root 238 Oct 16 15:08 puma.stdout.log -rw-rw-r--+ 1 root root 722 Oct 16 13:37 puma.stdout.log.1 -rw-rw-r--+ 1 root root 701 Oct 10 17:45 puma.stdout.log.2.gz -rw-rw-r--+ 1 root root 4747006453 Oct 17 12:37 traffic.log -rw-rw-r--+ 1 root root 15668065757 Oct 10 17:55 traffic.log.1 -rw-rw-r--+ 1 root root 850646513 Sep 20 18:12 traffic.log.2.gz
I note that in /var/log/ - where logrotate continues to work fine - that files are owned mostly root:adm (what is 'adm', and does it matter in this context?) and the permissions are 640 rather than 664. There are ACLs attached to the files/folder shown above ... does _that_ matter? Where this gets weirder is that if I run 'logrotate --force /etc/logrotate.d/ruby' it gets rotated fine. It runs fine if run by hand, it fails if run on a SystemD timer. Which suggests a difference in permissions, but don't timers run as root:root?
Any thoughts appreciated. As you can see, these are damn big logs, and we have this problem across multiple machines so I'd really like to fix it ...
Errors on other servers aren't always consistent with this: a fix for this may or may not help with them, so I may be coming back for more.
Thanks all.
Maybe systemd runs the logrotate with some priviledges dropped and perhaps your ACLs are affecting things. What are the permissions on the log directory itself? -- Len Sorensen

On Fri, 18 Oct 2019 at 12:11, Lennart Sorensen <lsorense@csclub.uwaterloo.ca> wrote:
We have a bunch of new(ish) Debian 10 VMs, and logrotate is failing to rotate our non-standard logs. Unfortunately we deleted all the old Debian 9 VMs before I noticed this problem, so they're not readily available for comparison. The logrotate config files worked fine on Debian 9 (provisioning is with Ansible, so it's consistent). The failures aren't detailed enough to help. Here's the config:
# /etc/logrotate.d/ruby /opt/rubyapp/log/*.log { daily missingok rotate 28 compress delaycompress copytruncate }
The parent configuration is standard Debian 10:
# /etc/logrotate.conf # (system-supplied comments removed) weekly rotate 4 create include /etc/logrotate.d
Unfortunately my paranoia is such that I'm redacting or modifying machine names and folder names ... I apologize for that. But I don't think the path involved is the problem.
Here's one of the errors:
# systemctl status logrotate.service ● logrotate.service - Rotate log files Loaded: loaded (/lib/systemd/system/logrotate.service; static; vendor preset: enabled) Active: failed (Result: exit-code) since Thu 2019-10-17 00:00:17 EDT; 12h ago Docs: man:logrotate(8) man:logrotate.conf(5) Process: 29004 ExecStart=/usr/sbin/logrotate /etc/logrotate.conf (code=exited, status=1/FAILURE) Main PID: 29004 (code=exited, status=1/FAILURE)
Oct 17 00:00:01 acctserver systemd[1]: Starting Rotate log files... Oct 17 00:00:14 acctserver logrotate[8621]: error: unable to open /opt/rubyapp/log/newrelic_agent.log.1 for compression Oct 17 00:00:14 acctserver logrotate[8621]: error: unable to open /opt/rubyapp/log/puma.stderr.log.1 for compression Oct 17 00:00:14 acctserver logrotate[8621]: error: unable to open /opt/rubyapp/log/puma.stdout.log.1 for compression Oct 17 00:00:14 acctserver logrotate[8621]: error: unable to open /opt/rubyapp/log/traffic.log.1 for compression Oct 17 00:00:17 acctserver systemd[1]: logrotate.service: Main
exited, code=exited, status=1/FAILURE Oct 17 00:00:17 acctserver systemd[1]: logrotate.service: Failed with result 'exit-code'. Oct 17 00:00:17 acctserver systemd[1]: Failed to start Rotate log files.
Here's the folder contents:
# cd /opt/rubyapp/log # ls -l -rw-rw-r--+ 1 root root 1982 Oct 16 15:08 newrelic_agent.log -rw-rw-r--+ 1 root root 7194 Oct 16 13:37 newrelic_agent.log.1 -rw-rw-r--+ 1 root root 2549 Oct 10 17:45 newrelic_agent.log.2.gz -rw-rw-r--+ 1 root root 154290 Oct 17 12:34 puma.stderr.log -rw-rw-r--+ 1 root root 573253 Oct 16 13:37 puma.stderr.log.1 -rw-rw-r--+ 1 root root 512648 Oct 10 17:45 puma.stderr.log.2.gz -rw-rw-r--+ 1 root root 238 Oct 16 15:08 puma.stdout.log -rw-rw-r--+ 1 root root 722 Oct 16 13:37 puma.stdout.log.1 -rw-rw-r--+ 1 root root 701 Oct 10 17:45 puma.stdout.log.2.gz -rw-rw-r--+ 1 root root 4747006453 Oct 17 12:37 traffic.log -rw-rw-r--+ 1 root root 15668065757 Oct 10 17:55 traffic.log.1 -rw-rw-r--+ 1 root root 850646513 Sep 20 18:12 traffic.log.2.gz
I note that in /var/log/ - where logrotate continues to work fine - that files are owned mostly root:adm (what is 'adm', and does it matter in
context?) and the permissions are 640 rather than 664. There are ACLs attached to the files/folder shown above ... does _that_ matter? Where this gets weirder is that if I run 'logrotate --force /etc/logrotate.d/ruby' it gets rotated fine. It runs fine if run by hand, it fails if run on a SystemD timer. Which suggests a difference in permissions, but don't timers run as root:root?
Any thoughts appreciated. As you can see, these are damn big logs, and we have this problem across multiple machines so I'd really like to fix it ...
Errors on other servers aren't always consistent with this: a fix for
On Thu, Oct 17, 2019 at 01:59:18PM -0400, Giles Orr via talk wrote: process this this
may or may not help with them, so I may be coming back for more.
Thanks all.
Maybe systemd runs the logrotate with some priviledges dropped and perhaps your ACLs are affecting things. What are the permissions on the log directory itself?
Good point ...: # getfacl log/ # file: log/ # owner: builder # group: builder user::rwx group::rwx #effective:r-x group:deployers:rwx #effective:r-x mask::r-x other::r-x default:user::rwx default:group::rwx default:group:deployers:rwx default:mask::rwx default:other::r-x The ownership of /opt/rubyapp/ (and all its children) is modified to allow a specific non-privileged users to be able to do deployments. How do I determine the exact permissions that SystemD runs its timers with? I haven't dug deeply for that - I've mostly been looking at logrotate itself. I notice there's a 'su <user> <group>' option for logrotate: I haven't tried yet, but I thought 'su root root' would be good but probably wouldn't work because if it doesn't already have those perms it probably can't get them. But this suggests that 'su builder deployers' might work? Except, except ... everything in the folder still claims to be owned root.root (but is group writable ...). (Michael G - still parsing yours out, I'll get back to you. This was more immediately answerable.) -- Giles https://www.gilesorr.com/ gilesorr@gmail.com

On Thu, 17 Oct 2019 at 13:59, Giles Orr <gilesorr@gmail.com> wrote:
We have a bunch of new(ish) Debian 10 VMs, and logrotate is failing to rotate our non-standard logs. Unfortunately we deleted all the old Debian 9 VMs before I noticed this problem, so they're not readily available for comparison. The logrotate config files worked fine on Debian 9 (provisioning is with Ansible, so it's consistent). The failures aren't detailed enough to help. Here's the config:
# /etc/logrotate.d/ruby /opt/rubyapp/log/*.log { daily missingok rotate 28 compress delaycompress copytruncate }
The parent configuration is standard Debian 10:
# /etc/logrotate.conf # (system-supplied comments removed) weekly rotate 4 create include /etc/logrotate.d
Unfortunately my paranoia is such that I'm redacting or modifying machine names and folder names ... I apologize for that. But I don't think the path involved is the problem.
Here's one of the errors:
# systemctl status logrotate.service ● logrotate.service - Rotate log files Loaded: loaded (/lib/systemd/system/logrotate.service; static; vendor preset: enabled) Active: failed (Result: exit-code) since Thu 2019-10-17 00:00:17 EDT; 12h ago Docs: man:logrotate(8) man:logrotate.conf(5) Process: 29004 ExecStart=/usr/sbin/logrotate /etc/logrotate.conf (code=exited, status=1/FAILURE) Main PID: 29004 (code=exited, status=1/FAILURE)
Oct 17 00:00:01 acctserver systemd[1]: Starting Rotate log files... Oct 17 00:00:14 acctserver logrotate[8621]: error: unable to open /opt/rubyapp/log/newrelic_agent.log.1 for compression Oct 17 00:00:14 acctserver logrotate[8621]: error: unable to open /opt/rubyapp/log/puma.stderr.log.1 for compression Oct 17 00:00:14 acctserver logrotate[8621]: error: unable to open /opt/rubyapp/log/puma.stdout.log.1 for compression Oct 17 00:00:14 acctserver logrotate[8621]: error: unable to open /opt/rubyapp/log/traffic.log.1 for compression Oct 17 00:00:17 acctserver systemd[1]: logrotate.service: Main process exited, code=exited, status=1/FAILURE Oct 17 00:00:17 acctserver systemd[1]: logrotate.service: Failed with result 'exit-code'. Oct 17 00:00:17 acctserver systemd[1]: Failed to start Rotate log files.
Here's the folder contents:
# cd /opt/rubyapp/log # ls -l -rw-rw-r--+ 1 root root 1982 Oct 16 15:08 newrelic_agent.log -rw-rw-r--+ 1 root root 7194 Oct 16 13:37 newrelic_agent.log.1 -rw-rw-r--+ 1 root root 2549 Oct 10 17:45 newrelic_agent.log.2.gz -rw-rw-r--+ 1 root root 154290 Oct 17 12:34 puma.stderr.log -rw-rw-r--+ 1 root root 573253 Oct 16 13:37 puma.stderr.log.1 -rw-rw-r--+ 1 root root 512648 Oct 10 17:45 puma.stderr.log.2.gz -rw-rw-r--+ 1 root root 238 Oct 16 15:08 puma.stdout.log -rw-rw-r--+ 1 root root 722 Oct 16 13:37 puma.stdout.log.1 -rw-rw-r--+ 1 root root 701 Oct 10 17:45 puma.stdout.log.2.gz -rw-rw-r--+ 1 root root 4747006453 Oct 17 12:37 traffic.log -rw-rw-r--+ 1 root root 15668065757 Oct 10 17:55 traffic.log.1 -rw-rw-r--+ 1 root root 850646513 Sep 20 18:12 traffic.log.2.gz
I note that in /var/log/ - where logrotate continues to work fine - that files are owned mostly root:adm (what is 'adm', and does it matter in this context?) and the permissions are 640 rather than 664. There are ACLs attached to the files/folder shown above ... does _that_ matter? Where this gets weirder is that if I run 'logrotate --force /etc/logrotate.d/ruby' it gets rotated fine. It runs fine if run by hand, it fails if run on a SystemD timer. Which suggests a difference in permissions, but don't timers run as root:root?
Any thoughts appreciated. As you can see, these are damn big logs, and we have this problem across multiple machines so I'd really like to fix it ...
Errors on other servers aren't always consistent with this: a fix for this may or may not help with them, so I may be coming back for more.
Thanks all.
I thought I should report the slightly unsatisfactory conclusion to this month-plus epic fight. I'm still fairly sure that the problem is caused by the ACLs applied to the folder and files. Which would suggest that the SystemD timer that runs 'logrotate' each night has some weird not-quite-root permissions. I did try to use logrotate's "su" directive to run logrotate as the user/group that have the added perms. This didn't work either. For all that, I never conclusively proved it was a perms or ACLs issue. Having previously and repeatedly proven that running logrotate by hand as root always successfully rotated the logs, I moved /etc/logrotate.d/ruby to /etc/logrotate.ruby.conf and it's now run on a nightly root cron job instead of the SystemD timer. Essentially I've reverted the behaviour - for the one application log folder - from Debian 10 to Debian 9. <sigh> I'm not happy about adding complexity in the form of an oddball exception, but with Ansible automation it's not too hard to maintain. -- Giles https://www.gilesorr.com/ gilesorr@gmail.com
participants (3)
-
Giles Orr
-
lsorense@csclub.uwaterloo.ca
-
Michael Galea