Discussion:
Why '-c DEV' option for switch_root?
ChenQi
2013-12-05 06:59:54 UTC
Hi All,

I noticed that switch_root in our busybox supports an extra option '-c
DEV' which is not supported by the same command from util-linux.
According to the source file, it's used to 'reopen stdio to DEV after
switch'.

Does anyone know why we need this option? Is there any use case where
this option is useful or maybe necessary?

Best Regards,
Chen Qi
Harald Becker
2013-12-05 10:10:39 UTC
Hi !
Post by ChenQi
I noticed that switch_root in our busybox supports an extra
option '-c DEV' which is not supported by the same command from
util-linux. According to the source file, it's used to 'reopen
stdio to DEV after switch'.
Does anyone know why we need this option? Is there any use case
where this option is useful or maybe necessary?
You probably need to reopen stdio after switching your root. As
long as you keep devices open, there are open file descriptors in
the kernel and the device i-nodes on the old file system can't
be deleted. Some programs do their own handling of devices and
reopen stdio in the new root, others may miss this reopen step
and you have to take care of it yourself. In this case the -c
option simplifies the switch usage, as it bundles the reopen in
the same step as switching the root file system.
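
A minimal sketch of the reopen step described above, in Python for illustration only (BusyBox itself is written in C, and this is not its code). The helper name reopen_fds and its target_fds parameter are made up for this example; the idea is simply to open the device once and duplicate it onto the descriptors that still point into the old root.

```python
import os


def reopen_fds(path, target_fds=(0, 1, 2)):
    """Point every descriptor in target_fds at a fresh open of path.

    Hypothetical helper for illustration; the default targets are
    stdin, stdout and stderr.
    """
    new_fd = os.open(path, os.O_RDWR)
    try:
        for fd in target_fds:
            # dup2 closes whatever fd referred to before and makes it
            # refer to the same open file as new_fd.
            os.dup2(new_fd, fd)
    finally:
        # Drop the temporary descriptor unless it is one of the targets.
        if new_fd not in target_fds:
            os.close(new_fd)
```

After switching root, a call such as reopen_fds("/dev/console") would leave fds 0, 1 and 2 referring to the new root's console node, assuming that node exists there.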

--
Harald
ChenQi
2013-12-06 02:25:26 UTC
Post by Harald Becker
Hi !
Post by ChenQi
I noticed that switch_root in our busybox supports an extra
option '-c DEV' which is not supported by the same command from
util-linux. According to the source file, it's used to 'reopen
stdio to DEV after switch'.
Does anyone know why we need this option? Is there any use case
where this option is useful or maybe necessary?
You probably need to reopen stdio after switching your root. As
long as you keep devices open, there are open file descriptors in
the kernel and the device i-nodes on the old file system can't
be deleted. Some programs do their own handling of devices and
reopen stdio in the new root, others may miss this reopen step
and you have to take care of it yourself. In this case the -c
option simplifies the switch usage, as it bundles the reopen in
the same step as switching the root file system.
--
Harald
Hi Harald,

Thanks for your explanation.

I don't know if I understand it right.
Does it mean that as long as we don't do some strange redirections of
standard IO in our initramfs, we don't need to add this option when
switching root? So in most cases we don't need to reopen the stdio, right?

Best Regards,
Chen Qi
Laurent Bercot
2013-12-06 10:28:26 UTC
Post by ChenQi
I don't know if I understand it right.
Does it mean that as long as we don't do some strange redirections of standard IO in our initramfs, we don't need to add this option when switching root? So in most cases we don't need to reopen the stdio, right?
Unless your real root's /dev/console is the same file as your
initramfs's /dev/console, you probably do.

When your system boots, process 1 (and all its offspring, if there's
no redirection) has fds 0, 1 and 2 pointing to your initramfs's
/dev/console. That device will remain open as long as you don't close
those fds. switch_root needs to reopen 0, 1 and 2 to your new /dev/console
to be able to really clean up your initramfs.
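
The reason an open fd keeps the old /dev/console around can be seen with any file: removing the name does not free the inode while a descriptor is still open. A small sketch, assuming Linux and an ordinary temporary file standing in for the device node (the function name is hypothetical):

```python
import os
import tempfile


def unlinked_but_open():
    """Delete a file while holding it open.

    Returns (link count seen through the fd, data read back).
    """
    fd, path = tempfile.mkstemp()
    try:
        os.unlink(path)                # the name is gone from the directory
        os.write(fd, b"still here")    # the inode is still usable via the fd
        os.lseek(fd, 0, os.SEEK_SET)
        return os.fstat(fd).st_nlink, os.read(fd, 64)
    finally:
        os.close(fd)                   # only now can the inode be released
```

The same holds for fds 0, 1 and 2 of process 1: until they are closed or replaced, the initramfs node they were opened from stays pinned.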

(The best solution is to stop using initramfs and boot directly on your
real root filesystem, using a tmpfs to perform early operations that need
writable space, and using pivot_root if you really need to change root
filesystems. initramfs is a trap.)
--
Laurent
ChenQi
2013-12-09 02:09:51 UTC
Post by Laurent Bercot
Post by ChenQi
I don't know if I understand it right.
Does it mean that as long as we don't do some strange redirections of
standard IO in our initramfs, we don't need to add this option when
switching root? So in most cases we don't need to reopen the stdio, right?
Unless your real root's /dev/console is the same file as your
initramfs's /dev/console, you probably do.
When your system boots, process 1 (and all its offspring, if there's
no redirection) has fds 0, 1 and 2 pointing to your initramfs's
/dev/console. That device will remain open as long as you don't close
those fds. switch_root needs to reopen 0, 1 and 2 to your new
/dev/console
to be able to really clean up your initramfs.
Got it.

Thanks,
Chen Qi
Post by Laurent Bercot
(The best solution is to stop using initramfs and boot directly on your
real root filesystem, using a tmpfs to perform early operations that need
writable space, and using pivot_root if you really need to change root
filesystems. initramfs is a trap.)
Rob Landley
2013-12-16 01:08:18 UTC
Post by ChenQi
Post by Laurent Bercot
Post by ChenQi
I don't know if I understand it right.
Does it mean that as long as we don't do some strange redirections of
standard IO in our initramfs, we don't need to add this option when
switching root? So in most cases we don't need to reopen the stdio, right?
Unless your real root's /dev/console is the same file as your
initramfs's /dev/console, you probably do.
When your system boots, process 1 (and all its offspring, if there's
no redirection) has fds 0, 1 and 2 pointing to your initramfs's
/dev/console. That device will remain open as long as you don't close
those fds. switch_root needs to reopen 0, 1 and 2 to your new
/dev/console
to be able to really clean up your initramfs.
Got it.
It's actually probably obsolete these days, now that we have initmpfs
and can mount --move the initmpfs into the new root filesystem. My code
did it because the klibc code was doing it, but I vaguely recall the
initial reason was not relying on a node that would no longer be
accessible and thus provide confusing information in /proc/1/fd, or some
such. (Really, you'd have to ask the klibc guys why.)
Post by ChenQi
Thanks,
Chen Qi
Post by Laurent Bercot
(The best solution is to stop using initramfs and boot directly on your
real root filesystem, using a tmpfs to perform early operations that need
writable space, and using pivot_root if you really need to change root
filesystems. initramfs is a trap.)
That's actively bad advice.

The most recent kernel has my initmpfs patches, meaning initramfs can
now be a tmpfs instead of ramfs.

This means it has size limits (50% by default, changeable with -o
remount, so "cat /dev/zero > /blah" won't fill up memory and freeze the
box). It means it reports available space so trying to use something
like "rpm" that checks available space before proceeding should work out
of one of those. And it means that it should show up in a sanely written
df (which filters out 0 block filesystems so it doesn't show /proc and
/sys and stuff). Plus the usual "memory in this can swap to swap
partitions if you've got 'em" that tmpfs provides.
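
A rough sketch of the two checks mentioned above: asking a filesystem how much space it has, and skipping filesystems that report zero blocks. It uses os.statvfs from the Python standard library; the function names are invented for this example, and how any particular df or rpm actually does this is an assumption, not something taken from their sources.

```python
import os


def fs_space(path):
    """Return (total_bytes, available_bytes) for the filesystem holding path."""
    st = os.statvfs(path)
    total = st.f_blocks * st.f_frsize
    avail = st.f_bavail * st.f_frsize   # space available to unprivileged users
    return total, avail


def df_would_list(path):
    """True if the filesystem reports a non-zero size (hypothetical df filter)."""
    return os.statvfs(path).f_blocks > 0
```

On a typical Linux system a tmpfs mount reports a real size here, while pseudo-filesystems such as /proc usually report zero blocks and would be filtered out.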

Of course some people special case initramfs in their code because they
don't understand how the system works, and thus when you use initramfs
as your rootfs their code doesn't see it even when it's not overmounted.
But that's because those people wrote broken code.

Rob
Laurent Bercot
2013-12-16 08:05:55 UTC
Post by Rob Landley
The most recent kernel has my initmpfs patches, meaning initramfs
can now be a tmpfs instead of ramfs.
[snip blurb]
You're listing reasons why initramfs (or initmpfs, if you prefer) is
more logical than it was before, more convenient, etc. All this may be
true, but it does not mean initramfs is actually *useful*.

I have yet to see a case where initramfs is really needed. Every time
I've seen a system boot on initramfs, the same goals could have been
achieved via booting on the real root filesystem and doing work during
initialization, which implies a lot less code, and is more maintainable,
and safer (if something fails early on).

I liked initramfs back in the day. It looked flexible and powerful,
which it is, and maybe your initmpfs patches make it even more so. But
I've come to realize it's just a fancy toy, and yes, a trap : people are
blinded by the shinies and diverted from simpler solutions.
--
Laurent
James B
2013-12-16 08:33:36 UTC
On Mon, 16 Dec 2013 08:05:55 +0000
Post by Laurent Bercot
I have yet to see a case where initramfs is really needed. Every time
I've seen a system boot on initramfs, the same goals could have been
achieved via booting on the real root filesystem and doing work during
initialization, which implies a lot less code, and is more maintainable,
and safer (if something fails early on).
You are making a very *big* assumption that the kernel can find the real root filesystem without userspace help. In cases where the kernel can't - initramfs/initmpfs is in fact very useful.
--
James B <jamesbond3142 at gmail.com>
Laurent Bercot
2013-12-16 11:00:34 UTC
Post by James B
You are making a very *big* assumption that the kernel can find
the real root filesystem without userspace help. In cases where
the kernel can't - initramfs/initmpfs is in fact very useful.
Show me a precise, real-life example, and I'll tell you how I would
proceed. Or agree with you.

Actually, Lauri has a point: if the root filesystem has to be in RAM
(storage-less machines) then yes, initramfs is useful - as a real
root filesystem; you probably don't even need to switch_root, which
simplifies things a lot.
But as far as machines with some kind of storage go, I've seen my
fair share of them, and I've never encountered a case where initramfs
was the best solution. I don't claim to have seen it all, though -
there might be use cases I'm not aware of.
--
Laurent
James B
2013-12-16 15:14:38 UTC
On Mon, 16 Dec 2013 11:00:34 +0000
Post by Laurent Bercot
Post by James B
You are making a very *big* assumption that the kernel can find
the real root filesystem without userspace help. In cases where
the kernel can't - initramfs/initmpfs is in fact very useful.
Show me a precise, real-life example, and I'll tell you how I would
proceed. Or agree with you.
Lauri and Didier have given you good examples - I thought their use cases (PXE / NFS root) are immediately obvious as the "classic" need for initramfs. I will add another one - running aufs or overlayfs as root filesystem.
Post by Laurent Bercot
Actually, Lauri has a point: if the root filesystem has to be in RAM
(storage-less machines) then yes, initramfs is useful - as a real
root filesystem; you probably don't even need to switch_root, which
simplifies things a lot.
Yes. Whether to switch_root or not depends on the needs, though - if one wants to have persistent changes saved across sessions, it's probably best to switch root to a network-based filesystem or use some kind of overlayfs, as I hinted above.
Post by Laurent Bercot
But as far as machines with some kind of storage go, I've seen my
fair share of them, and I've never encountered a case where initramfs
was the best solution. I don't claim to have seen it all, though -
there might be use cases I'm not aware of.
In this case there are always alternative solutions. In devices with storage where the initial root filesystem is always accessible (e.g. in ROM) you can always do your initialisation straight in the real root and then pivot_root to do something else if need be (this effectively makes your real root's init script do the work that is normally done in initramfs). Whether this is better than doing it in initramfs is debatable; but as I said earlier, the initramfs/initmpfs is indispensable if you need to use a rootfs which can't be mounted directly by the kernel without userspace help.
--
James B <jamesbond3142 at gmail.com>
Lauri Kasanen
2013-12-16 09:33:01 UTC
Post by Laurent Bercot
Post by Rob Landley
The most recent kernel has my initmpfs patches, meaning initramfs
can now be a tmpfs instead of ramfs.
[snip blurb]
You're listing reasons why initramfs (or initmpfs, if you prefer) is
more logical than it was before, more convenient, etc. All this may be
true, but it does not mean initramfs is actually *useful*.
I have yet to see a case where initramfs is really needed. Every time
I've seen a system boot on initramfs, the same goals could have been
achieved via booting on the real root filesystem and doing work during
initialization, which implies a lot less code, and is more maintainable,
and safer (if something fails early on).
I disagree, initramfs (and now enhanced upstream with tmpfs) is very
useful for running in RAM without additional steps. There may not be a
"real root", maybe we booted via PXE or an unsupported SCSI controller,
etc.

Without initramfs you'd have hacks like initrd -> switch to tmpfs ->
more code, more complexity. Without initrd either you'd have even worse
hacks with the kernel itself.

- Lauri
--
Didier Kryn
2013-12-16 10:33:20 UTC
Hi Laurent.

I am using initramfs with static Busybox on VME Single Board
Computers. These are used as servers and can be plugged into various
locations where there is a dedicated NFS server. A USB or sata disk can
also be connected. The kernel boots from onboard flash and the console
is a serial port.

The user-space code in the initramfs configures the network,
requests its own hostname from DNS, detects available bootable
filesystems (disk or NFS), figures out from a list at which address
there might be an NFS server, then presents a list of systems to boot to
-- after 15 seconds, selects the first in the list -- Then mounts proc,
sys and dev in the mounted filesystem, and switch_root to /sbin/init.
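
The "present a list, fall back to the first entry after 15 seconds" step could look roughly like the sketch below. This is not Didier's code; the function name, the target strings and the numbered-menu convention are all assumptions made for illustration, and it relies on select working on the input stream (true for a serial console or pipe on Linux).

```python
import select


def choose_boot_target(targets, infile, timeout=15.0):
    """Return the target picked on infile, or the first one on timeout.

    The user is expected to type a 1-based index followed by newline.
    Anything else, or silence until the timeout, selects targets[0].
    """
    if not targets:
        raise ValueError("no boot targets found")
    ready, _, _ = select.select([infile], [], [], timeout)
    if ready:
        line = infile.readline().strip()
        try:
            index = int(line)
        except ValueError:
            index = 0
        if 1 <= index <= len(targets):
            return targets[index - 1]
    return targets[0]
```

With sys.stdin as infile and a list such as ["local disk", "nfs server"], the caller would then mount the returned target and hand over to switch_root.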
The mounted filesystem is currently Debian-7.

I appreciate your knowledge of Linux and the boot process. But about
the usefulness of initramfs, I think you are wrong. I don't think I
could do the job without it.

Didier
Post by Laurent Bercot
Post by Rob Landley
The most recent kernel has my initmpfs patches, meaning initramfs
can now be a tmpfs instead of ramfs.
[snip blurb]
You're listing reasons why initramfs (or initmpfs, if you prefer) is
more logical than it was before, more convenient, etc. All this may be
true, but it does not mean initramfs is actually *useful*.
I have yet to see a case where initramfs is really needed. Every time
I've seen a system boot on initramfs, the same goals could have been
achieved via booting on the real root filesystem and doing work during
initialization, which implies a lot less code, and is more maintainable,
and safer (if something fails early on).
I liked initramfs back in the day. It looked flexible and powerful,
which it is, and maybe your initmpfs patches make it even more so. But
I've come to realize it's just a fancy toy, and yes, a trap : people are
blinded by the shinies and diverted from simpler solutions.
Laurent Bercot
2013-12-16 18:45:13 UTC
But about the usefulness of initramfs, I think you are wrong.
I don't think I could do the job without it.
Hi Didier,

Eh, you could. You have some flash to boot the kernel from: you could
as well have a small squashfs root filesystem on it with a script that
performs all the initialization work you're currently performing in the
initramfs.
Whether that's better than initramfs is debatable. I find
it better - and I think that you'll find it simpler to maintain if you
try it out - but I realize that my views about this are not shared by many.
The best approach, as always, is whatever works for you, your team, and
your users.
--
Laurent
Harald Becker
2013-12-16 19:11:44 UTC
Hi Laurent,

I agree on many of your topics about initramfs, but to pick one
question at this point ...
Post by Laurent Bercot
as well have a small squashfs root filesystem on it with a
script that performs all the initialization work you're
currently performing in the initramfs.
Whether that's better than initramfs is
debatable.
No debate on this; everybody has to decide for himself which
solution fits "better" (in his eyes).
Post by Laurent Bercot
- and I think that you'll find it simpler to maintain
if you try it out -
And here the question: How is it simpler to maintain? squashfs is
a read only file system, so you can't change things directly.
Same as in initramfs: You need a copy of the root file system
tree to put in your changes, then you need to create your new
root file system (squashfs on one hand, cpio archive on the other
hand). Then you need to install your new root file system to the
flash. How can you feel this being simpler to maintain?

--
Harald
Laurent Bercot
2013-12-16 21:57:31 UTC
Post by Harald Becker
And here the question: How is it simpler to maintain? squashfs is
a read only file system, so you can't change things directly.
Same as in initramfs: You need a copy of the root file system
tree to put in your changes, then you need to create your new
root file system (squashfs on one hand, cpio archive on the other
hand). Then you need to install your new root file system to the
flash. How can you feel this being simpler to maintain?
A real filesystem (squashfs or otherwise) is independent from the
kernel. This gives you more flexibility. You don't have to reboot to
test it in a real live working environment. You can develop it in an
emulator to avoid cross-compiling. You can copy the archive around
and mount it as is on another machine. You can keep your userland
firmware and your kernel entirely separate, and even perform live
firmware upgrades.
Additionally, disk or even flash is cheaper than RAM. If you have
a little mass storage, you can store a bajillion utilities or
recovery stuff on it, and fail gracefully if your remote server is
down, whereas you probably don't want to keep too many things in an
initramfs.
--
Laurent
Harald Becker
2013-12-16 22:48:09 UTC
Hi Laurent,

I won't say you are wrong, as most of your arguments are the
reason why I prefer to use real root file systems, but ...
Post by Laurent Bercot
A real filesystem (squashfs or otherwise) is independent from
the kernel.
Although it's possible to bundle an initramfs with the kernel, the
initramfs is just a cpio archive and may be loaded separately from
the kernel. So it's independent of a specific kernel.
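
Since the initramfs is just a cpio archive, it can be inspected on any machine without booting it. A sketch of listing the entries of an uncompressed archive in the "newc" format, which is the cpio variant the kernel unpacks; a compressed image would have to be decompressed first, and the function name is made up for this example:

```python
HEADER_LEN = 110  # 6-byte magic + 13 fields of 8 hex digits each


def _pad4(offset):
    """Round offset up to the next multiple of 4."""
    return (offset + 3) & ~3


def list_newc(data):
    """Return [(name, filesize), ...] for an uncompressed newc cpio archive."""
    entries = []
    pos = 0
    while pos + HEADER_LEN <= len(data):
        header = data[pos:pos + HEADER_LEN]
        if header[:6] not in (b"070701", b"070702"):
            raise ValueError("no newc header at offset %d" % pos)
        fields = [int(header[6 + 8 * i:14 + 8 * i], 16) for i in range(13)]
        filesize, namesize = fields[6], fields[11]
        name_start = pos + HEADER_LEN
        # namesize counts the trailing NUL byte, which is dropped here
        name = data[name_start:name_start + namesize - 1].decode()
        if name == "TRAILER!!!":
            break
        entries.append((name, filesize))
        data_start = _pad4(name_start + namesize)
        pos = _pad4(data_start + filesize)
    return entries
```

Reading the archive file in binary mode and passing the bytes to list_newc gives a quick view of what the image contains, independent of whichever kernel will later load it.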
Post by Laurent Bercot
You don't have to reboot to test it in a real live working
environment.
Mount a tmpfs and extract the cpio archive to this tmpfs, then you may
test in your live working environment. That may be one command
more to extract the archive, compared with other file systems (if
you have a mount which does loop device handling).
Post by Laurent Bercot
You can develop it in an emulator to avoid cross-compiling.
Why shall this not be possible with an initramfs? Mount tmpfs,
extract archive, run emulator?
Post by Laurent Bercot
You can copy the archive around and mount it as is on another
machine.
Can't you copy around a cpio archive? You only describe initramfs
bundled with a kernel, but this is a special case.
Post by Laurent Bercot
You can keep your userland firmware and your kernel entirely
separate, and even perform live firmware upgrades.
Live firmware upgrades on a read only file system? How do you do
this? You need to at least create a new squashfs image. If your
flash does not contain an underlying file system with lots of
extra space you can't install the new image. If your flash device
does only contain the squashfs as bare file system, you need to
overwrite this image. Live update?
Post by Laurent Bercot
Additionally, disk or even flash is cheaper than RAM. If you
have a little mass storage, you can store a bajillion utilities
or recovery stuff on it, and fail gracefully if your remote
server is down,
What is different here? Why can't you access your utilities
from a mass storage drive when booting with an initramfs? Just
check and mount the local file system then you have access.
Post by Laurent Bercot
whereas you probably don't want to keep too many things in an
initramfs.
Right, initramfs shouldn't be overloaded. Put only the
required minimum there to bring up the system; from there, getting
to more local storage is just a mount.

... still have the question: How is a read only file system, like
squashfs easier to maintain than an initramfs? IMO you only
compare with kernel bundled initramfs usage, but initramfs means
cpio archive which may be loaded separately.

Please don't misunderstand. I'm pro static root file system and
avoid using a big kernel bundled initramfs or initrd, but I
don't see it being easier to maintain, except on fully
read-writable file systems (but we were talking about squashfs).

--
Harald
Laurent Bercot
2013-12-17 10:50:36 UTC
Hi Harald,
Post by Harald Becker
... still have the question: How is a read only file system, like
squashfs easier to maintain than an initramfs? IMO you only
compare with kernel bundled initramfs usage, but initramfs means
cpio archive which may be loaded separately.
Well, my point was comparing with a kernel-bundled initramfs, yes.
I agree that a simple cpio archive isn't any worse than any other
file archive made with mksquashfs/mkcramfs/whatever.
Honestly, you might be right, and it has also been some time since
I've used initramfs; back then, I found it impractical, then tried
doing without it when I had full powers on a project, and liked that
approach better.

My main dislike isn't actually with initramfs, it's with switch_root.
I find pivot_root elegant and intuitive, whereas switch_root is kludgy
and ugly (especially since I had to do everything by hand for some
reason I can't remember). I wouldn't argue much against initramfs if
Post by Harald Becker
Live firmware upgrades on a read only file system? How do you do
this? You need to at least create a new squashfs image. If your
flash does not contain an underlying file system with lots of
extra space you can't install the new image. If your flash device
does only contain the squashfs as bare file system, you need to
overwrite this image. Live update?
Of course I assumed you had enough space on your storage for both
your old and your new image. You can mount your new image,
pivot_root to it, live test, and set bootloader flags if it works. Just
pivot_root back if it fails. Of course you need to restart your
application anyway, but you might have some lighter test procedures to
perform first.
If your firmware lives in an initramfs, you have to reboot for the
upgrade, and reboot again if you need to roll back.
Post by Harald Becker
What is different here? Why can't you access your utilities
from a mass storage drive when booting with an initramfs? Just
check and mount the local file system then you have access.
Well if you have local storage, then you don't need an initramfs. xD
It's just less of a hassle to have a fully populated /bin as soon as
you boot. With an initramfs, you have to duplicate a few binaries and
perform additional work to get to the same point. Nothing deal-breaking,
but it still makes your life harder than it needs to be.
--
Laurent
Michael Tokarev
2013-12-17 03:31:57 UTC
Post by Laurent Bercot
Post by Harald Becker
And here the question: How is it simpler to maintain? squashfs is
a read only file system, so you can't change things directly.
Same as in initramfs: You need a copy of the root file system
tree to put in your changes, then you need to create your new
root file system (squashfs on one hand, cpio archive on the other
hand). Then you need to install your new root file system to the
flash. How can you feel this being simpler to maintain?
A real filesystem (squashfs or otherwise) is independent from the
kernel. This gives you more flexibility. You don't have to reboot to
test it in a real live working environment. You can develop it in an
emulator to avoid cross-compiling. You can copy the archive around
and mount it as is on another machine. You can keep your userland
firmware and your kernel entirely separate, and even perform live
firmware upgrades.
initramfs is no different from squashfs here. It is not supposed
to be kernel-dependent (or environment-dependent), and you don't need
to reboot to test it.

Guys, seriously, this (almost religious) question has been discussed
so many times, back when initrd was introduced in the kernel. The
same arguments are being repeated now.

initramfs gives you flexibility. It is mostly due to this reason all
general-purpose distributions adopted it.

For a general-purpose machine there's no need to build its own kernel
unless you use it in a very special environment or for some very very
special task. Not using some mechanism for an initial boot environment
basically forces you to do so, unless you're using just the simplest
configuration. And it takes time to configure, build and test (with
reboots!) stuff, which isn't cheap.

If you don't need or like this, you're not forced to use initramfs.

There's no need to re-iterate all this again after 10+ years.

Thanks,

/mjt
Didier Kryn
2013-12-17 13:14:50 UTC
Hi Laurent.

I actually have a jffs2 filesystem on another part of the flash
memory, but I decided not to use it for the purpose, for one reason: A
single stupid bug might make the device non-bootable. Because, even if
the jffs2 or squashfs is initially mounted read-only, the main reason to
use it is the possibility to remount it read-write and modify it. You
see the point? Better work hard in the beginning to work out a good
stable initramfs which nobody can change/corrupt.

With the initramfs, you restart always from a clean situation (once
debugged of course). And the userland in my initramfs has a possibility
to escape the normal sequence and start an interactive session, which
allows debugging the environment.

By the way, you wrote that initramfs is a trap. What is the trap,
is it initramfs or the need for switch_root? Is it different if I
switch_root from another filesystem?

Didier
Post by Laurent Bercot
But about the usefulness of initramfs, I think you are wrong.
I don't think I could do the job without it.
Hi Didier,
Eh, you could. You have some flash to boot the kernel from: you could
as well have a small squashfs root filesystem on it with a script that
performs all the initialization work you're currently performing in the
initramfs.
Whether that's better than initramfs is debatable. I find
it better - and I think that you'll find it simpler to maintain if you
try it out - but I realize that my views about this are not shared by many.
The best approach, as always, is whatever works for you, your team, and
your users.
Laurent Bercot
2013-12-17 15:05:07 UTC
Hi Didier,
Post by Didier Kryn
I actually have a jffs2 filesystem on another part of the flash
memory, but I decided not to use it for the purpose, for one reason: A
single stupid bug might make the device non-bootable. Because, even if
the jffs2 or squashfs is initially mounted read-only, the main reason to
use it is the possibility to remount it read-write and modify it. You
see the point? Better work hard in the beginning to work out a good
stable initramfs which nobody can change/corrupt.
Having an unmodifiable rootfs is indeed a very important guarantee to
have. But having it on the flash device doesn't mean it's a good idea
to remount it read-write ! As you say, your rootfs should remain
read-only all the time - that's why I traditionally used cramfs or
squashfs. You can partition the flash to contain both your read-only
rootfs and some jffs2 containing your read-write data.
Post by Didier Kryn
With the initramfs, you restart always from a clean situation (once
debugged of course). And the userland in my initramfs has a possibility
to escape the normal sequence and start an interactive session, which
allows debugging the environment.
Same thing with a read-only rootfs stored in flash, with the additional
benefit that your debug environment doesn't have to be loaded into RAM
if you don't need it. Put a full recovery system in your rootfs, make it
as big and friendly as will fit in half your flash (to leave room for
firmware upgrades!), don't get constrained by RAM.
Post by Didier Kryn
By the way, you wrote that initramfs is a trap. What is the trap,
is it initramfs or the need for switch_root? Is it different if I
switch_root from another filesystem?
initramfs is not a lethal trap, it's just that in my experience, using
it will end up making you work more than not using it, for several
reasons. switch_root is one of those reasons; if you're booting from
another filesystem, nothing forces you to switch_root. The simplest,
easiest design is to directly boot on your real rootfs - and whenever
you need to change roots, pivot_root is a lot friendlier than
switch_root, letting you keep working, open fds on the old root and
clean up later.
--
Laurent