How much do you know about how Linux executes binaries?

Someone at work just made a slight mistake:
    root@ECA:~# cd /var
    root@ECA:/var# mv core.* /*
    [clearly that trailing * was not meant to be there]
    [bunch of errors about target already existing]
    root@ECA:/var# ls
    -bash: /bin/ls: No such file or directory
    root@ECA:/var#
So what do you do now?
I was able to fix it in about 5 minutes without using anything other than what was on the running system. Interesting little problem.
The original state of the system was:
    root@ECA:/var# ls -l /
    drwxr-xr-x   2 root root 4096 Dec  9 11:47 bin
    drwxr-xr-x   2 root root 4096 Dec  6 11:16 boot
    drwxr-xr-x  10 root root 2340 Dec  6 11:18 dev
    drwxr-xr-x  46 root root 4096 Dec  9 11:48 etc
    drwxr-xr-x   6 root root 4096 Dec  5 12:09 home
    lrwxrwxrwx   1 root root   10 Dec  5 12:09 init -> /sbin/init
    drwxr-xr-x   4 root root 4096 Dec  5 12:09 lib
    drwxr-xr-x   5 root root 4096 Dec  5 12:09 lib64
    drwx------   2 root root 4096 Dec  5 12:09 lost+found
    drwxr-xr-x   2 root root 4096 Nov 19 10:07 media
    drwxr-xr-x   4 root root 4096 Dec  5 12:09 mnt
    drwxrwxrwx   6 root root 4096 Dec  5 12:24 opt
    drwxr-xr-x   3 root root 4096 Dec  5 12:09 persistdata
    drwxr-xr-x   5 root root 4096 May 25  2018 persistent
    dr-xr-xr-x 473 root root    0 Dec  6 11:17 proc
    drwxr-xr-x   5 root root 4096 Dec  9 11:38 root
    drwxr-xr-x  12 root root  780 Dec  6 11:18 run
    drwxr-xr-x   2 root root 4096 Dec  9 11:47 sbin
    dr-xr-xr-x  11 root root    0 Dec  6 11:17 sys
    drwxrwxrwt   8 root root  400 Dec 11 12:14 tmp
    drwxr-xr-x  12 root root 4096 Dec  5 12:09 usr
    drwxr-xr-x  15 root root 4096 Dec  5 12:15 var
-- Len Sorensen

On 12/11/19 12:27 PM, Lennart Sorensen via talk wrote:
Someone at work just made a slight mistake:
    root@ECA:~# cd /var
    root@ECA:/var# mv core.* /*
    [clearly that trailing * was not meant to be there]
    [bunch of errors about target already existing]
    root@ECA:/var# ls
    -bash: /bin/ls: No such file or directory
    root@ECA:/var#
So what do you do now?
My first idea would be to echo $PATH and see whether the path is messed up, or whether /etc/profile or other bash startup scripts are screwed up. Not sure what the errors were, but that's the first place to start.
If that's fine, then since /bin is a symlink to /usr/bin these days, I would check whether my symlinks for /bin are now screwed up and fix that. The real question, though, is how much of root is overwritten in this case.
Nick

On Wed, Dec 11, 2019 at 01:46:04PM -0500, Nicholas Krause via talk wrote:
My first idea would be to echo $PATH and see whether the path is messed up, or whether /etc/profile or other bash startup scripts are screwed up. Not sure what the errors were, but that's the first place to start.
If that's fine, then since /bin is a symlink to /usr/bin these days, I would check whether my symlinks for /bin are now screwed up and fix that. The real question, though, is how much of root is overwritten in this case.
Nothing at all was overwritten. A lot was moved, however, to the wrong place.
-- Len Sorensen

On 12/11/19 4:47 PM, Lennart Sorensen wrote:
On Wed, Dec 11, 2019 at 01:46:04PM -0500, Nicholas Krause via talk wrote:
My first idea would be to echo $PATH and see whether the path is messed up, or whether /etc/profile or other bash startup scripts are screwed up. Not sure what the errors were, but that's the first place to start.
If that's fine, then since /bin is a symlink to /usr/bin these days, I would check whether my symlinks for /bin are now screwed up and fix that. The real question, though, is how much of root is overwritten in this case.
Nothing at all was overwritten. A lot was moved, however, to the wrong place.
I got your own answer. Were you the person Hugh and I were talking to? Maybe you were. If so, during the holidays I'm going to start writing up my ideas for multi-threading GCC on the GCC wiki. A lot of it will apply to LLVM, just in different ways. The linker way works correctly; off the top of my head I couldn't remember the exact variable for it.
Nick

On Wed, Dec 11, 2019 at 04:55:00PM -0500, Nicholas Krause via talk wrote:
I got your own answer. Were you the person Hugh and I were talking to? Maybe you were. If so, during the holidays I'm going to start writing up my ideas for multi-threading GCC on the GCC wiki. A lot of it will apply to LLVM, just in different ways.
Yep, that was me. Maybe you can give us a link to the wiki to look at.
The linker way works correctly; off the top of my head I couldn't remember the exact variable for it.
Does LLVM use binutils for linking and assembling, or does it have its own for that? I have never looked at the details of how it does things.
-- Len Sorensen

On Wednesday, December 11 2019, Lennart Sorensen via talk wrote:
Someone at work just made a slight mistake:
    root@ECA:~# cd /var
    root@ECA:/var# mv core.* /*
    [clearly that trailing * was not meant to be there]
    [bunch of errors about target already existing]
    root@ECA:/var# ls
    -bash: /bin/ls: No such file or directory
    root@ECA:/var#
So what do you do now?
Everything that could be moved was moved under /var, because it's the last directory (alphabetically) on /.
You will have to invoke the dynamic loader by hand in this case, because it has been moved as well. On Debian GNU/Linux 64-bit, you can find it at /var/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2. You will also have to set LD_LIBRARY_PATH accordingly, otherwise the binary to be executed will not be able to find its required libraries (mainly libc.so, in this case):
    # export LD_LIBRARY_PATH=/var/lib/x86_64-linux-gnu
    # /var/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2 /var/bin/ls
You can now invoke 'mv' and move the contents back to /.
Cheers,
--
Sergio
GPG key ID: 237A 54B1 0287 28BF 00EF 31F4 D0EB 7628 65FC 5E36
Please send encrypted e-mail if possible
http://sergiodj.net/
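The loader invocation described in this message can be tried harmlessly on a healthy machine. What follows is only a minimal sketch, assuming an x86-64 glibc system: the candidate loader paths and the `libdir` variable are illustrative assumptions, not the paths from the damaged box, and the script skips itself if no such loader is found.

```shell
#!/bin/sh
# Sketch: run a dynamically linked program by naming the loader explicitly.
# Assumption: x86-64 glibc; these are the usual loader locations there.
loader=
for candidate in /lib64/ld-linux-x86-64.so.2 \
                 /lib/x86_64-linux-gnu/ld-linux-x86-64.so.2; do
    if [ -x "$candidate" ]; then
        loader=$candidate
        break
    fi
done

status=skipped
out=
if [ -n "$loader" ] && [ -x /bin/ls ]; then
    # Directory searched for shared libraries such as libc.so. On a healthy
    # system the defaults already work; after the accident it would be the
    # relocated directory (for example /var/lib/x86_64-linux-gnu).
    libdir=$(dirname "$loader")
    # The kernel executes the loader; the loader maps /bin/ls and its libraries.
    if out=$(LD_LIBRARY_PATH="$libdir" "$loader" /bin/ls -d /); then
        status=ok
    else
        status=failed
    fi
    echo "ls via explicit loader printed: $out"
fi
echo "status: $status"
```

Setting LD_LIBRARY_PATH only for the one command, as done here, avoids leaving an exported variable behind in the rescue shell; exporting it as in the message above works just as well while the repair is in progress.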

On Wed, Dec 11, 2019 at 03:28:30PM -0500, Sergio Durigan Junior via talk wrote:
Everything that could be moved was moved under /var, because it's the last directory (alphabetically) on /.
You will have to invoke the dynamic loader by hand in this case, because it has been moved as well. On Debian GNU/Linux 64-bit, you can find it at /var/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2. You will also have to set LD_LIBRARY_PATH accordingly, otherwise the binary to be executed will not be able to find its required libraries (mainly libc.so, in this case):
    # export LD_LIBRARY_PATH=/var/lib/x86_64-linux-gnu
    # /var/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2 /var/bin/ls
You can now invoke 'mv' and move the contents back to /.
Congratulations. You know your dynamic loader. :)
This particular system is Yocto multilib based, so it was /var/lib64 that had the needed files. mv happens to be in busybox, with a symlink pointing the wrong way, so busybox had to be called explicitly with mv as an argument, but that wasn't the important part.
-- Len Sorensen
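Busybox is a multi-call binary: it picks the applet from the name it was invoked under or from its first argument, which is why a broken mv symlink can be bypassed. A small sketch in a scratch directory, assuming a busybox binary with the mv applet is on PATH (the script skips itself otherwise); the file names are made up for illustration.

```shell
#!/bin/sh
# Sketch: call a busybox applet by name instead of through its symlink.
status=skipped
scratch=$(mktemp -d)
touch "$scratch/core.1234"
mkdir "$scratch/dest"

if command -v busybox >/dev/null 2>&1; then
    # The first argument selects the applet, so no "mv" symlink is needed.
    if busybox mv "$scratch/core.1234" "$scratch/dest/"; then
        status=ok
    else
        status=failed
    fi
fi
echo "status: $status"
# On the damaged system the same call would additionally go through the
# relocated loader, roughly: <loader> <path to busybox> mv <source> <target>
```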

On Wednesday, December 11 2019, Lennart Sorensen wrote:
On Wed, Dec 11, 2019 at 03:28:30PM -0500, Sergio Durigan Junior via talk wrote:
Everything that could be moved was moved under /var, because it's the last directory (alphabetically) on /.
You will have to invoke the dynamic loader by hand in this case, because it has been moved as well. On Debian GNU/Linux 64-bit, you can find it at /var/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2. You will also have to set LD_LIBRARY_PATH accordingly, otherwise the binary to be executed will not be able to find its required libraries (mainly libc.so, in this case):
    # export LD_LIBRARY_PATH=/var/lib/x86_64-linux-gnu
    # /var/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2 /var/bin/ls
You can now invoke 'mv' and move the contents back to /.
Congratulations. You know your dynamic loader. :)
This particular system is Yocto multilib based, so it was /var/lib64 that had the needed files. mv happens to be in busybox, with a symlink pointing the wrong way, so busybox had to be called explicitly with mv as an argument, but that wasn't the important part.
Cool. BTW, Debian offers busybox-static to help with this kind of scenario.
Cheers,
--
Sergio
GPG key ID: 237A 54B1 0287 28BF 00EF 31F4 D0EB 7628 65FC 5E36
Please send encrypted e-mail if possible
http://sergiodj.net/

Sergio Durigan Junior via talk wrote:
... BTW, Debian offers busybox-static to help with this kind of scenario.
The problem is that you have to already have that installed before that kind of oops, and it has to be just that kind of oops, i.e. not something minor that doesn't need that kind of tool, nor something major that calls for a complete reinstall/restore because on a clear disk you can seek forever.
Back in the day, when Unix was a complete PITA to install, that kind of heroic effort was something sysadmins aspired to, but especially with test instances and virtual machines nowadays it's almost always easier to just roll a fresh one. Granted, if you have root on both, it can be too easy to pull a trigger on your workstation or server rather than the test instance you meant to test, so it's good to see that those skills haven't died out of the world and are still exercised in 2019.
Pro tip from back in the day was to always resist the urge to hit reboot, because a partially-hosed machine that still had a root shell open or such gave you a state you might not be able to boot back to.
-- Anthony de Boer

On 2019-12-11 12:27 p.m., Lennart Sorensen via talk wrote:
Someone at work just made a slight mistake:
    root@ECA:~# cd /var
    root@ECA:/var# mv core.* /*
    [clearly that trailing * was not meant to be there]
    [snip]
So what do you do now?
All the files and directories in the root directory will have been moved into the /var directory, with the exception of the /lib and /opt directories, which already existed in /var.
First step is to move the bin directory back to its normal position using "bin/mv bin /". After that command you will have normal access to ls and mv once more, and you can move other files and directories out of /var and back to / where they belong.
--
Cheers!
Kevin.
http://www.ve3syb.ca/               | "Nerds make the shiny things that
https://www.patreon.com/KevinCozens | distract the mouth-breathers, and
                                    | that's why we're powerful"
Owner of Elecraft K2 #2172          |
#include <disclaimer/favourite>     |             --Chris Hardwick

On Wed, Dec 11, 2019 at 03:46:25PM -0500, Kevin Cozens via talk wrote:
All the files and directories in the root directory will have been moved in to the /var directory with the exception of the /lib and /opt directories which already existed in /var.
First step is move the bin directory back to its normal position using "bin/mv bin /". After that command you will have normal access to ls and mv once more and you can move other files and directories out of /var and back to / where they belong.
Unfortunately /lib64 did move to /var and made executing anything tricky. /lib was fine, but not helpful for system tools. -- Len Sorensen
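Why a missing /lib64 makes "executing anything tricky": a dynamically linked ELF binary records the absolute path of its loader, and the kernel cannot start the program if that path no longer exists, even when the binary itself is found. A sketch that prints the recorded path on a healthy system, assuming readelf from binutils is installed (the script skips itself otherwise).

```shell
#!/bin/sh
# Sketch: show which loader path a binary asks the kernel to start.
status=skipped
interp=
if command -v readelf >/dev/null 2>&1 && [ -x /bin/ls ]; then
    # readelf -l prints the program headers, including a line such as
    # "[Requesting program interpreter: <path>]" for dynamically linked files.
    interp=$(LC_ALL=C readelf -l /bin/ls 2>/dev/null |
        sed -n 's/.*Requesting program interpreter: \(.*\)\]/\1/p')
    if [ -n "$interp" ]; then
        status=ok
        echo "/bin/ls asks the kernel to start: $interp"
    fi
fi
echo "status: $status"
```

If the printed path lives under a directory that was moved, running the binary directly fails until that directory is restored or the loader is named by hand.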

| From: Lennart Sorensen via talk <talk@gtalug.org>
| root@ECA:~# cd /var
| root@ECA:/var# mv core.* /* [clearly that trailing * was not meant to be there]
| [bunch of errors about target already existing]
"/*" matched every name in / (not dotfiles).
The last name matched was /var. We know this because of the list that Lennart gave us later -- we might have had to figure this out ourselves.
Ignoring the clashing names and /var, everything that was in / was moved to /var.
| root@ECA:/var# ls
| -bash: /bin/ls: No such file or directory
Clearly some fundamental tools are impaired.
The first reason (but not the only one) is that /bin has been moved to /var/bin (but we may not realize this right away).
Note: usually echo is built into the shell, so echo * would work as a substitute for ls.
| So what do you do now?
Take a break to calm down. Anything you do in a panic is likely to make things worse.
Make a plan before you take any step that might be irreversible. Exploring is OK.
| I was able to fix it in about 5 minutes without using anything other
| than what was on the running system? Interesting little problem.
It's true that this is an interesting problem. But getting tricky when confronted with an emergency isn't always wise.
I would probably boot from a live system on a USB stick, mount / somewhere, and look around. Fixing things from such a system is actually simpler than trying to do so from the busted system. You don't need a deep understanding of the mechanisms for run-time linking. You won't need to get the dynamic linker and libraries back onstream.
A clean shutdown will be a challenge but it probably doesn't matter. After booting the live system, do an fsck on the original system's / to make up for a bad shutdown.
| The original state of the system was:
That's really nice to know. In many scenarios you would not know this.
- this lets you figure out where everything went without any detective work (/var)
- this lets you figure out what must be moved back.
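The expansion described in this message can be reproduced safely in a scratch directory. A minimal sketch, assuming a POSIX shell and an mv that treats the last operand as the destination directory when given more than two operands; the directory and file names are invented stand-ins for the real root, and nothing outside the temporary directory is touched.

```shell
#!/bin/sh
# Sandbox reproduction of "mv core.* /*" with a fake root directory.
sandbox=$(mktemp -d)
mkdir "$sandbox/bin" "$sandbox/etc" "$sandbox/lib" "$sandbox/var"
mkdir "$sandbox/var/lib"                 # name clash, like /var/lib
touch "$sandbox/var/lib/placeholder"     # non-empty, so it cannot be replaced
touch "$sandbox/bin/ls" "$sandbox/var/core.1234"

cd "$sandbox/var"
# Print the command as the shell expands it: the glob yields bin etc lib var
# in sorted order, so the last word (var) becomes the destination directory.
echo mv core.* "$sandbox"/*
# Errors are expected for the clashing name and for moving var into itself.
mv core.* "$sandbox"/* 2>/dev/null || true

moved=$(echo *)
echo "now in var: $moved"
```

In this sandbox bin and etc end up inside var while lib stays where it was, which mirrors the pattern reported in the thread: directories whose names already existed in /var did not move, everything else did.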

On Wed, Dec 11, 2019 at 07:12:57PM -0500, D. Hugh Redelmeier via talk wrote:
"/*" matched every name in / (not dotfiles)
The last name matched was /var. We know this because of the list that Lennart gave us later -- we might have had to figure this out ourselves.
Ignoring the clashing names and /var, everything that was in / was moved to /var.
Clearly some fundamental tools are impaired.
The first reason (but not the only one) is that /bin has been moved to /var/bin (but we may not realize this right away).
Note: usually echo is built into the shell so echo * would work as a substitute for ls
Yes echo * is very important.
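A short sketch of the echo * trick, assuming a shell such as bash or dash where echo and [ are builtins; the scratch directory and its file names are made up. mkdir and touch are used only to set up the example, the listing itself needs no external program.

```shell
#!/bin/sh
# Sketch: list a directory using only shell builtins (no /bin/ls needed).
scratch=$(mktemp -d)
mkdir "$scratch/bin" "$scratch/lib64"
touch "$scratch/core.1234" "$scratch/.hidden"

cd "$scratch"
names=$(echo *)          # the shell expands *, echo just prints the words
echo "$names"

# A slightly richer listing: mark directories, still builtins only.
listing=
for entry in *; do
    if [ -d "$entry" ]; then
        listing="$listing$entry/ "
    else
        listing="$listing$entry "
    fi
done
echo "$listing"
```

As with the "/*" in the original accident, a plain * skips dotfiles; echo .* would be needed to see those.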
| So what do you do now?
Take a break to calm down. Anything you do in a panic is likely to make things worse.
Rebooting would have made it a lot harder for sure. Fortunately the guy came and asked me for help instead.
Make a plan before you take any step that might be irreversible. Exploring is OK.
It's true that this is an interesting problem. But getting tricky when confronted with an emergency isn't always wise.
Well it was only his test system, but it would have been annoying to have to recreate it.
I would probably boot from a live system on a USB stick, mount / somewhere, and look around. Fixing things from such a system is actually simpler than trying to do so from the busted system. You don't need a deep understanding of the mechanisms for run-time linking. You won't need to get the dynamic linker and libraries back onstream.
I am not actually sure that particular system can even boot from USB, and it has serial console only, no VGA.
A clean shutdown will be a challenge but it probably doesn't matter. After booting the live system, do an fsck on the original system's / to make up for a bad shutdown.
That's really nice to know. In many scenarios you would not know this.
That's for sure.
- this lets you figure out where everything went without any detective work (/var)
- this lets you figure out what must be moved back.
-- Len Sorensen
participants (6): Anthony de Boer, D. Hugh Redelmeier, Kevin Cozens, lsorense@csclub.uwaterloo.ca, Nicholas Krause, Sergio Durigan Junior