Busybox tar(i) finally got in my way enough that it found itself promoted to my punching bag of the week - a week which turned into two by the time I was through with it. The more problems I worked on, the more acquainted I got with the code, the more potential problems I spotted, and the better equipped I became to fix them; and I was keen to get to the end of it while the whole structure was still fresh in memory.
jfw: in gales linux annoyances, busybox tar not preserving extracted directory timestamps.
jfw: or rather, it does when encountering a record in the tar file for the directory but if any files contained in it are listed later (as they typically are), their creation bumps the directory mtime and tar doesn't restore it.jfw: search turns up the addition of a related TODO but no fixes.
jfw: this suggests a workaround at least when the source filesystem is still available: "find . -type d -exec tar --no-recursion -cf dirtimes.tar {} +" then extract that after the main one with the file data
jfw: ha, worked perfect for my present case of migrating a directory containing many years of projects.jfw: I suppose fixing it would require, gasp, a tar that uses memory, hence busybox hasn't done it.
jfw: or it could be done with a two-pass scan of the tar file, but that would have significant overhead (especially if compressed) and prevent streaming usage
jfw: ah well, onto the TODOs it goes.jfw: ooh, maybe it creates a temporary file in the destination as it goes, into which it copies the directory records instead of acting on them, then feeds that right back into the usual extractor.
jfw: perhaps with the unix trick of unlinking the temp file as soon as it's opened, so it's automatically freed on either exit or abort.dorion: http://jfxpt.com/2024/jwrd-logs-for-Mar-2024/#10704 -- oh noes !!
[2024-03-11 04:13:59 (#jwrd) jfw: I suppose fixing it would require, gasp, a tar that uses memory, hence busybox hasn't done it.]
Since the public channel didn't really engage me on it, the discussion went private (Eulora2, of course), where it got a lot more juice. I was still finding that my first plans aren't always the best ones; in particular the "uses unbounded memory" thing doesn't sit well no matter which way you try to push it.
Vivian Sporepress: I have a coding not-quite-problem - uneasiness, perhaps - which seems pretty obscure but maybe there's something deeper to it. I set to fixing busybox tar so that it actually preserves directory timestamps when extracting (it tries, but fails when files get later created under the directory in question)
Vivian Sporepress: mentioned at http://jfxpt.com/2024/jwrd-logs-for-Mar-2024/#10699 , and I worked up a solution along the lines of http://jfxpt.com/2024/jwrd-logs-for-Mar-2024/#10707 (not exactly, since learning the actual code informed otherwise)
[jfw: in gales linux annoyances, busybox tar not preserving extracted directory timestamps.]
[jfw: ooh, maybe it creates a temporary file in the destination as it goes, into which it copies the directory records instead of acting on them, then feeds that right back into the usual extractor.]
Vivian Sporepress: but basically, it has to store the timestamp metadata somewhere as it processes the otherwise stream-oriented extraction, so as to apply it afterward. it could store in working memory (heap), but this seems very wrong to me because the volume of data in question is unlimited, perhaps up to the whole filesystem size. so I'm writing it to a temp file in the working directory i.e. where the tarball is being extracted to (either the initial one or as modified by tar -C)
Vivian Sporepress: but I'm still not quite happy with this; for one thing it has side effects besides what's strictly required for the extraction (changing mtime of that top directory), and it's not necessarily the case that directory is on the same filesystem where the data is ultimately going
Vivian Sporepress: http://trilema.com/2020/forum-logs-for-25-oct-2018/#2489518 came up, it had informed going with "." in the first place as opposed to /tmp or something, and has me figuring it's probably fine or best that can be done for now anyway
[mircea_popescu: http://btcbase.org/log/2018-10-24#1865690 << can not put files above . !]
[Logged on 2018-10-24 19:57 phf: http://btcbase.org/log/2018-10-23#1865503 << i threw your patch on btcbase, it looks good, though i'm not sure i agree with the decision to put temp file in /tmp. the point of putting it in same hierarchy as press, was to avoid the whole cross-file-system issue]Diana Coman: Vivian Sporepress, what's your uneasiness there, the fact that you need that ./tmp?
Vivian Sporepress: something like that; that it's touching things which it wasn't explicitly asked to touchDiana Coman: from what you describe in there, if the issue is that files in a dir bump up the dir's own mtime, it would need basically to go back and change that at the end i.e. a change to how it processes?
Vivian Sporepress: that's showing up at two levels - the first, the directories that the program is creating, which indeed I'm having it go back and change at the end; but second, in order to do that it's accidentally bumping also the initial directory which it previously wasn't (e.g. if no new subdirs were created, but files extracted into existing ones)
Vivian Sporepress: plus possibly writing data to a different, unintended filesystem
Vivian Sporepress: basically there's no solution that doesn't introduce new problems in one case or another !!Diana Coman: tbh at this first pass and without having actually looked into how it works, it sounds like an issue with how it approaches the ("simple!") task of preserving the timestamps so if there is some clean solution for that root cause, it would likely require a relatively deep change to how it processes the files, hm
Diana Coman: I'm not really familiar with how it processes - e.g. why can't it treat each file with full path and thus change (every time if needed) the mtime for all that so it doesn't need to keep a full list of anything?
Vivian Sporepress: files/directories are listed in a tar file in arbitrary order, as full paths indeed and each with its own metadata (owner/mode/mtime)Diana Coman: ok, so file a/b/c/f1.txt means it updates/creates the full path a/b/c/f1.txt as needed + updates the timestamps for a ,b, c, f1.txt according to the mtimes in the tar file - are these present?
Vivian Sporepress: typically yes, but if the records for a/, a/b/, and a/b/c/ come before a/b/c/f1.txt, then it's already processed them and no longer has their mtime info available
Vivian Sporepress: I guess it could check the parent dir's mtime any time it might be creating a new file and restore each time based on that
Diana Coman: yes
Diana Coman: either it has /keeps a full view (which is not quite desirable for the reasons you identified earlier) OR it checks what is there at every step, not sure what other option there is
Diana Coman: it seems to me quite correct that every step should set the timestamps that it has and otherwise *preserve* any other timestamps that it doesn't have, no?
Vivian Sporepress: more work, but it doesDiana Coman: kind of fixing the side effect of file creation, perhaps so in this sense the uneasiness of that ./tmp does seem even more justified if anything since myeah, it's a sort of more mess (of the same sort, too!) to fix earlier mess
Vivian Sporepress: like we really want a "create file without bumping directory timestamp" system call, but lacking one, have to check-create-restore
Diana Coman: and yes, more work for sure and not very elegant either but it seems to me it's caused by the lack of the environment i.e. exactly that, as you say, the "create file without bumping directory timestamp"
Diana Coman: exactly
Diana Coman: so you can "fix it" by... adding that system call instead, if you prefer :DVivian Sporepress: except ugh, if the tarball only lists a/b/c/f1.txt, without any of the directories, then you *would* expect mtime to bump on a/b/c/.
Diana Coman: why?
Vivian Sporepress: because the idea is to extract timestamps as listed in the archive, not to suppress timestamp updates altogether
Diana Coman: uhm, this seems to me a rather conflicted requirement really. Specifically: asking it to "preserve timestamps as listed in the archive" *means* that you do NOT want local timestamps so ...no, no updating of local timestamps, I don't see why would it
Diana Coman: if you want to have it both ways , well, it's bound to spell trouble, indeed
Diana Coman: either you want the current timestamps or you don't want them and if you want a mix then indeed, there is no clean way to have it without an overall index of sorts so either a first pass or an accumulated view hence memory or disk with its own troublesVivian Sporepress: for a simpler example, say it just has one file in it at top level, a.txt. you extract and want a.txt to have the mtime from the archive. but now, even the working directory will not be bumped. dunno, maybe that's fine, I never quite thought that far about it
Diana Coman: yeah, I got what you meant but from my pov it's exactly what I would expect and want, precisely for the reasons above
Diana Coman: maybe give it some more thought and see
Vivian Sporepress: will do, thanks!Vivian Sporepress: heh, perhaps this is a first example of gales busybox becoming more solid than not just the original busybox but than the GNU stuff too.
Raphael Nethersmell: Vivian Sporepress, and cheers to that @!@!!Diana Coman: how it tends to happen with things that are genuinely still useful - regrinding, pretty much, so yes, more solid
Diana Coman: hopefully there isn't any snag in there wrt unix/other stuff's expectations on those dirs' timestamps/updatesVivian Sporepress: well, one coming to mind that might contribute to my reservations about changing the behavior to preserve timestamps of all touched dirs, is backup tools that rely on metadata to guess whether something has changed (eg rsync I believe does that by default)
Vivian Sporepress: but directories might not be relevant there, as you can't skip descending just because the top level hasn't changed, as lower levels still might have
Vivian Sporepress: I also noticed busybox tar has the abort-on-first-error behavior whereas gnu tar will print the warning, continue then return the error status. thoughts on whether that's a big deal?
Vivian Sporepress: for something involved in data backup & restoration I tend to think it is.Diana Coman: Vivian Sporepress, does rsync really stop/check based on mtime alone?
Diana Coman: this actually reminds me of the "filesystem in unix is a mess" discussion that I had with MP for VaMP, hm
Diana Coman: I wouldn't be very surprised if the correct clean cut above for preserving timestamps from the archive is then flaring up all sorts of weird assumptions otherwise, indeedDiana Coman: abort-on-first-error or not depends really on what sort of error and esp on what is printed then as result. E.g. on unacceptable inputs, VaMP quite on purpose does not really "abort on first error" precisely to collect and give a clear view of *all* that needs fixing because *that* is most helpful for the user really.
Diana Coman: most helpful for the user because if there are problem files, I want to know at the very least which ones but also how many - maybe if it's just a couple I'll look at them manually but maybe if it's a lot of them a script is needed and maybe if there's really just too much it needs further considerationDiana Coman: getting back to timestamps, from a purely logical perspective, preserving timestamps in itself already an anomaly really, possibly exposing the meaninglessness of those timestamps in the first place - because what exactly do they stand for? Creation time? Last change/update time and then is a move change/update or creation? And a dir's own timestamp is even worse in that if it's to mean anything, it would logically be extracted from its contents rather than on its own but well, if one accepts empty dirs and empty files it gets even more nonsensical for sure
Diana Coman: not to mention when the time...changes, of courseVivian Sporepress: "Without [the -c] option, rsync uses a "quick check" that (by default) checks if each file's size and time of last modification match between the sender and receiver. This option changes this to compare a 128-bit checksum for each file that has a matching size. Generating the checksums means that both sides will expend a lot of disk I/O reading all the data in the files in the transfer (and this is prior to any reading that will be done to transfer changed files), so this can slow things down significantly."
Vivian Sporepress: mtime stands for modification time; there's no field for creation time afaik. a move is neither a change nor a creation, as far as the file contents are concerned which is what the timestamps refer to; it is however a modification of whichever directories, since that's where the names are.
Vivian Sporepress: pretty sure the rsync thing isn't a problem here, for the reason noted that size-and-mtime match on a directory doesn't save it from having to look inside the directory.Vivian Sporepress: meanwhile on implementation, it might actually be less code to *not* restore the top-level dir's timestamp when it's not named explicitly, as it's a special case (extracting a/b/c means restoring mtime on the "a/b" prefix but extracting just "a" means restoring on ".", a completely different string)
Vivian Sporepress: not a big difference but it's there.Vivian Sporepress: the change to restore parent directory times is made & refined a bit (the function in question is a bit complicated since there's all different file types and special cases to deal with) and seems working fine as far as it goes. there's one more case to deal with, where it calls out to an external function to create the chain of parent dirs when missing. and then, it comes to my attention on looking into one of their TODOs that the thing is totally, hopelessly vulnerable to symlink attacks when extracting untrusted tarballs
Vivian Sporepress: ie attacker can overwrite any file accessible to the running user.Vivian Sporepress: looking at the strace, it's a pretty similar situation where GNU tar deals with it by memorizing all the symlinks it's asked to create then doing it at the end.
Vivian Sporepress: I expect the "wasteful" check of parent directory at each step to preserve mtimes will serve to check for symlinks too. I guess a question is if you already have a symlink in the tree, does that mean you want tar to follow it and put stuff wherever it may lead? or just throw an error
Vivian Sporepress: GNU does neither - it deletes the symlink and replaces.Diana Coman: Vivian Sporepress, the . vs top level dir distinction and different treatment makes sense to me. Symlinks in tar archives seem iffy to me. How's your experience with them in practice? The GNU approach seems to me at least consistent with the rest in the sense that a file or a symlink is similarly treated as overwriting from what I gather. For my part, I tend to prefer to make any symlinks myself either manually or via script(s) at most i.e. as a matter of "install" sort of thing but I don't know whether this is possibly too restrictive overall - is it?
Diana Coman: possibly the default would list the symlinks and exit with error + info on a flag to force it through in which case it would go pretty much gnu style i.e. delete the symlink and replace.Vivian Sporepress: Diana Coman: on "different treatment makes sense to me" do you mean as in the simpler implementation where times aren't restored on . as they would be for named subdirs, or different treatment by the code for the seemingly more consistent behavior?
Vivian Sporepress: I use tar as a backup tool, and see it as the de facto standard for archiving as in creating a compact and exportable representation of a unix filesystem preserving all the relevant structure. thus if we have symlinks at all, it needs to handle themDiana Coman: Vivian Sporepress, as in the simpler implementation
Vivian Sporepress: cool
Diana Coman: wrt symlinks, I see what you mean - how's that default list+error & flag to overwrite otherwise sound?
Vivian Sporepress: the delete-and-replace approach makes perfect sense on consideration, since that's what it does by default to anything else
Vivian Sporepress: there is an overwrite-in-place (O_TRUNC) mode, there I might go with an error since you can't do that to a symlink
Diana Coman: ahaVivian Sporepress: grr, my view of the gnu behavior was incomplete. if the archive explicitly lists a directory that is a symlink locally, it indeed replaces; but if the archive doesn't have an entry for that directory as such, and then it gives a file under it, it will trustingly follow the path ie any existing symlinks.
Vivian Sporepress: i.e. "tar -cf test.tar a/b/file.txt" then extract in a place where a (or a/b) is a link
Vivian Sporepress: but lacking the global view (deferred link creations), I can't distinguish that from the situation where this very tarball sneakily created that link (mkdir a; ln -s / a/b; tar -cf sneaky.tar a; rm a/b; mkdir a/b; touch a/b/youlose.txt; tar --append -f sneaky.tar a/b/youlose.txt) so I guess that has to be an error
Vivian Sporepress: and when unpacking a/b/file.txt I have to check not just the immediate parent a/b but also each step of the path (a, a/b) for symlinks. now basically supplementing the kernel's path traversal for want of an "open without following symlinks" flag much like the prior "open preserving parent's mtime"
Vivian Sporepress: and makes for a quadratic algorithm in the path depth. which could probably be addressed with openat/fstatat tricks, though I don't expect it'll be necessaryDiana Coman: Vivian Sporepress, well, basically trusting what is present locally seems to me was at least initially the approach
Vivian Sporepress: in the seemingly unending tar tuning, I've fixed the symlink handling at various levels, changed the death on first error behavior to instead continue to next archive entry and report all errors, fixed the atomic replace-by-renaming mode which had clearly been a half-assed effort, and fixed the lack of error detection on restoring metadata (owners, permissions & times)
Vivian Sporepress: latest is that, as I expected, now that it's reporting those errors, it becomes more annoying that it's attempting to restore owner/group even when run as non-root where that's not possible & doesn't make sense
Vivian Sporepress: it has only a -o option to disable that, nothing to enable if it were off by default. my plan is to follow GNU, add a --same-owner option and make it the default for root only. but, there could be an argument for not making any such user vs root distinction, so never restoring owners from the archive unless explicitly told; I'm used to how it is but did find it kind of weird.
Vivian Sporepress: I mean I'm used to the gnu behavior, restoring owners when possible.Vivian Sporepress: basically all of data_extract_all.c , bb's generic extraction code, is now reground as my own work, though it was certainly helpful to have the original structure & working/tested system rather than starting from zero.
Diana Coman: that sounds like quite some work done on tar, possibly quite a bit more than tuning, heh
Diana Coman: Vivian Sporepress on the root vs non-root, I guess the original approach was to do stuff "when possible/it makes sense", hence the distinction. The root "user" is an outlier from all points of view and it seems to me that if one looks at weirdness, one could easily say it starts from there really i.e. the user that is not quite "just a user" and all that
Diana Coman: I suppose there's also the fact that running it as root rarely means that one wants the resulting files to actually be owned by root and esp in restore use cases, it seems more likely to aim at least to fully restore hence including the owners and permissions indeed
Diana Coman: so dunno, from my pov the gnu behaviour seems fine really
Vivian Sporepress: yep, and got it working.
Diana Coman: and congrats on the regrind!
Vivian Sporepress: "implementing tar for production use" basically
Let's indulge briefly in a comparison with the grass on that other side of the fence, where by first impressions everything was already working fine and I wouldn't have had to do anything at all. Maybe I just enjoy coding work too much and so I masochistically pick things that ensure me a never-ending supply of it? Here's the weight of the files that constitute the bulk of the busybox tar implementation, in their initial state, noting that some are shared with other archiving programs; in lines, words, and characters:
15 48 418 archival/libarchive/data_align.c 213 701 6154 archival/libarchive/data_extract_all.c 138 349 3504 archival/libarchive/data_extract_to_command.c 14 34 328 archival/libarchive/data_extract_to_stdout.c 12 33 294 archival/libarchive/data_skip.c 17 58 410 archival/libarchive/filter_accept_all.c 19 60 463 archival/libarchive/filter_accept_list.c 61 200 1772 archival/libarchive/filter_accept_list_reassign.c 38 123 883 archival/libarchive/filter_accept_reject_list.c 54 188 1183 archival/libarchive/find_list_entry.c 453 1992 14635 archival/libarchive/get_header_tar.c 12 44 305 archival/libarchive/header_list.c 10 33 226 archival/libarchive/header_skip.c 69 162 1607 archival/libarchive/header_verbose_list.c 22 57 580 archival/libarchive/init_handle.c 359 1044 8770 archival/libarchive/open_transformer.c 19 54 371 archival/libarchive/seek_by_jump.c 16 66 388 archival/libarchive/seek_by_read.c 36 103 632 archival/libarchive/unsafe_prefix.c 1224 5096 40596 archival/tar.c 263 1106 9645 include/bb_archive.h 3064 11551 93164 total
On the other hand, just the file count from the GNU tar 1.29 release:
# tar -tf /usr/portage/distfiles/tar-1.29.tar.bz2 | wc -l 1051
To show it in the most flattering light possible, let's exclude all documentation, build system files, auto-generated files, test cases, translation files, and automagically-copy-n-pasted "library" files typical of GNU releases and weigh only what they would have you believe (by inventing a "src" namespace) is the relevant source code for tar itself:
28 202 1279 arith.h 2048 6028 52008 buffer.c 438 1192 9012 checkpoint.c 974 3861 31768 common.h 649 1731 15764 compare.c 1972 6772 52969 create.c 394 1160 10120 delete.c 329 981 7351 exclist.c 39 160 1085 exit.c 1820 6027 48757 extract.c 1805 5862 46045 incremen.c 1463 4987 40977 list.c 283 809 5893 map.c 1250 4474 30758 misc.c 1838 5660 44901 names.c 1295 4034 36846 sparse.c 114 401 2687 suffix.c 900 2485 20398 system.c 2852 8833 77208 tar.c 379 2017 13183 tar.h 637 1720 13818 transform.c 237 699 5375 unlink.c 234 779 6283 update.c 99 323 2221 utf8.c 107 257 2367 warning.c 755 2367 19273 xattrs.c 50 271 2184 xattrs.h 1798 5525 44341 xheader.c 24787 79617 644871 total
That's 7-8 times as much code by any metric. So while the Busybox version is certainly rather "batteries not included" and covered with splinters besides, there IS an end to the mess. For our pains in cleaning it we get to a more favorable place: breathable air, essentially, with a data archival and recovery standard I'm much happier with using and signing off on.
As it happens, I did enjoy the process, too, which I'd say speaks well of the Busybox code for all its various failings.
For the sake of showing and explaining the work, I've again formatted my git commits as patch files.(ii) They are given as I made them, with mistakes and later fixes and all, as I can't quite be bothered to revise the history into something more "logical", digestible, or separable; they build on each other as well as prior work in the Gales tree. If any other Busybox publishers out there want to apply the fixes to their own trees, they're invited to write in and share their progress, and maybe I'll give it a look.
- 01-distinguish-read-errors.patch(iii)
- 02-preserve-parent-times.patch(iv)
- 03-preserve-nanoseconds.patch(v)
- 04-ar-extraction-fixes.patch(vi)
- 05-note-symlink-vulns.patch(vii)
- 06-full-extract-path-validation.patch(viii)
- 07-less-special-hardlink-extraction.patch(ix)
- 08-more-symlink-and-dir-fixes.patch(x)
- 09-deferred-reindent.patch(xi)
- 10-fix-optimistic-comment.patch(xii)
- 11-unlink-consistency.patch(xiii)
- 12-continue-on-error.patch(xiv)
- 13-refine-selinux-context-extraction.patch(xv)
- 14-fix-rename-mode-and-improve-metadata-restoring.patch(xvi)
- 15-check-size-and-fix-improper-perror.patch(xvii)
- 16-same-owner-complementary-options.patch(xviii)
- 17-link-unlink-posix-compat.patch(xix)
- 18-fix-size-overflows-upstream.patch(xx)
Since most of the changes accrue to archival/libarchive/data_extract_all.c, the format-agnostic workhorse of the extraction process, to the extent that I've touched almost every line, I'll wrap up here by showing the final state of this file in its entirety, so that the effect of the changes can be fully seen in context.
/* vi: set sw=4 ts=4: */ /* * Licensed under GPLv2 or later, see file LICENSE in this source tree.(xxi) */ #include "libbb.h" #include "bb_archive.h" /* Emulate Linux behavior for systems with the POSIX behavior that unlinking a directory returns the ambiguous EPERM (this includes BSDs). * Even worse, UFS on Solaris reportedly allows root to unlink non-empty directories, leading to orphaned files. Dealing with that would seem to require an lstat pre-check, which could still fail as a race condition. * TODO see if this is applicable elsewhere in BB. */ static int bb_unlink(char const *path) { int ret = unlink(path); if (ret != 0 && errno == EPERM) { if (is_directory(path, 0)) { errno = EISDIR; } else { errno = EPERM; /* needs restoring after the is_directory call */ } } return ret; } /* Like bb_copyfd_exact_size but without death on error. */ static int copyfd_exact_size(int src, int dst, off_t size) { off_t sz = bb_copyfd_size(src, dst, size); if (sz == size) { return 0; } if (sz != -1) { bb_error_msg("short read"); } /* if sz == -1, bb_copyfd_XX already complained */ return -1; } static int restore_uid_gid(char *name, archive_handle_t *archive_handle) { file_header_t *file_header = archive_handle->file_header; int res; uid_t uid = file_header->uid; gid_t gid = file_header->gid; #if ENABLE_FEATURE_TAR_UNAME_GNAME if (!(archive_handle->ah_flags & ARCHIVE_NUMERIC_OWNER)) { if (file_header->tar__uname) { //TODO: cache last name/id pair? struct passwd *pwd = getpwnam(file_header->tar__uname); if (pwd) uid = pwd->pw_uid; } if (file_header->tar__gname) { struct group *grp = getgrnam(file_header->tar__gname); if (grp) gid = grp->gr_gid; } } #endif /* GNU tar 1.15.1 uses chown, not lchown * ...but so what? If it really uses chown when restoring symlinks, then it's broken and perhaps even exploitable. */ res = lchown(name, uid, gid); if (res) bb_perror_msg("can't set ownership on '%s'", name); return res; } static int restore_date(char *name, file_header_t *file_header) { struct timespec t[2]; int res; /* For portability, it's unclear if we can assume struct timespec has only the two known fields. */ memset(t, 0, sizeof t); t[1].tv_sec = file_header->mtime; t[1].tv_nsec = file_header->mtime_nsec; /* Setting atime the same as mtime. */ t[0] = t[1]; res = utimensat(AT_FDCWD, name, t, AT_SYMLINK_NOFOLLOW); if (res) bb_perror_msg("can't set times on '%s'", name); return res; } #ifdef ARCHIVE_REPLACE_VIA_RENAME(xxii) static void atomic_replace_file(archive_handle_t *archive_handle) { file_header_t *file_header = archive_handle->file_header; char *tmp_name = 0; int dst_fd; if (S_ISREG(file_header->mode)) { /* rpm-style temp file name */ tmp_name = xasprintf("%s;newXXXXXX", file_header->name); dst_fd = mkstemp(tmp_name); if (dst_fd < 0) { bb_perror_msg("can't open temp file for '%s'", file_header->name); goto fail; } if (copyfd_exact_size(archive_handle->src_fd, dst_fd, file_header->size) == -1) { close(dst_fd); goto fail_unlink; } close(dst_fd); } else { /* Doesn't quite guarantee a unique name, but at least symlink() and mknod() don't clobber existing files like open() can. */ tmp_name = xasprintf("%s;new%d", file_header->name, (int) getpid()); if (S_ISLNK(file_header->mode)) { if (symlink(file_header->link_target, tmp_name)) { bb_perror_msg("can't create %slink from %s to %s", "sym", tmp_name, file_header->link_target); goto fail; } } else if (mknod(tmp_name, file_header->mode, file_header->device)) { bb_perror_msg("can't create node %s", tmp_name); goto fail; } } /* Need to restore metadata *before* the rename, since the point of this extraction mode is to prevent even momentary breakage when replacing components on a live system. */ if (!(archive_handle->ah_flags & ARCHIVE_DONT_RESTORE_OWNER)) if (restore_uid_gid(tmp_name, archive_handle)) goto fail_unlink; /* Symlinks have no meaningful mode or at least no available way to set it. */ if (!S_ISLNK(file_header->mode)) { mode_t mode = file_header->mode; if (S_ISREG(file_header->mode) && (archive_handle->ah_flags & ARCHIVE_DONT_RESTORE_PERM) ) { /* XXX we have to noisily derive the normal local file creation mode, because mkstemp sets the unusual mode 0600. Maybe busybox needs its own mkstemp. */ mode_t mask = umask(0); umask(mask); mode = 0666 & ~mask; } if (chmod(tmp_name, mode)) { bb_perror_msg("can't set mode on '%s'", tmp_name); goto fail_unlink; } } if (archive_handle->ah_flags & ARCHIVE_RESTORE_DATE) if (restore_date(tmp_name, file_header)) goto fail_unlink; if (rename(tmp_name, file_header->name)) { bb_perror_msg("can't move '%s' to '%s'", tmp_name, file_header->name); goto fail_unlink; } free(tmp_name); /* A pity gcc never invented safe stack allocation. */ return; fail_unlink: unlink(tmp_name); fail: free(tmp_name); archive_handle->status = EXIT_FAILURE; } #endif /* ARCHIVE_REPLACE_VIA_RENAME */ void FAST_FUNC data_extract_all(archive_handle_t *archive_handle) { file_header_t *file_header = archive_handle->file_header; int restore_parent_times = 0; struct timespec parent_times[2]; char *slash = 0; char *lastslash = 0; #if ENABLE_FEATURE_TAR_SELINUX char *sctx = archive_handle->tar__sctx[PAX_NEXT_FILE]; if (!sctx) sctx = archive_handle->tar__sctx[PAX_GLOBAL]; #endif if (file_header->size < 0) { bb_error_msg("BUG: bad header: negative size for %s", file_header->name); goto abort; } if (S_ISLNK(file_header->mode) && !file_header->link_target) { bb_error_msg("inconsistent file header: symlink without target %s", file_header->name); goto abort; } /* Walk the path, validating and creating dirs as needed. */ for (slash = strchr(file_header->name, '/'); slash; slash = strchr(slash+1, '/')) { struct stat st; *slash = '\0'; if (lstat(file_header->name, &st) == -1) { if (errno != ENOENT) { bb_perror_msg("can't stat prefix %s", file_header->name); goto abort; } if (!(archive_handle->ah_flags & ARCHIVE_CREATE_LEADING_DIRS)) { /* Prefix doesn't exist but we weren't asked to create. (Could probably ignore this and let whatever subsequent syscall catch the problem.) */ bb_simple_perror_msg(file_header->name); goto abort; } if (mkdir(file_header->name, 0777) == -1) { bb_perror_msg("can't create prefix %s", file_header->name); goto abort; } if (restore_parent_times) { *lastslash = '\0'; utimensat(AT_FDCWD, file_header->name, parent_times, 0); *lastslash = '/'; /* This level didn't exist, so the next one has nothing to restore. */ restore_parent_times = 0; } } else { /* Prefix exists */ if (!S_ISDIR(st.st_mode)) { /* Don't defer catching this to a later syscall, as it could be a directory symlink, which would lead us to successfully write outside the designated path (symlink attack)! */ bb_error_msg("prefix not a directory: %s", file_header->name); goto abort; } if (archive_handle->ah_flags & ARCHIVE_RESTORE_DATE) { /* We were asked to preserve archive timestamps; but directory mtimes are bumped by creating/renaming/unlinking anything inside, so we need to restore them afterward. It might be preferred to do this only for directories explicitly created by the archive; but without maintaining a space-consuming list of those, there's no way to distinguish. */ parent_times[0] = st.st_atim; parent_times[1] = st.st_mtim; restore_parent_times = 1; } } *slash = '/'; lastslash = slash; } if (archive_handle->ah_flags & ARCHIVE_UNLINK_OLD(xxiii)) { /* Remove the entry if it exists */ /* Is it hardlink? * We encode hard links as regular files of size 0 with a symlink */ if (S_ISREG(file_header->mode) && file_header->link_target && file_header->size == 0 ) { /* Ugly special case: * tar cf t.tar hardlink1 hardlink2 hardlink1 * results in this tarball structure: * hardlink1 * hardlink2 -> hardlink1 * hardlink1 -> hardlink1 <== !!! */ if (strcmp(file_header->link_target, file_header->name) == 0) goto ret; } /* Proceed with deleting */ /* Note: skipping the unlink when S_ISDIR(file_header->mode), as formerly done here, is not valid because it fails to replace an existing non-directory file. Although it doesn't seem to directly open a symlink attack due to the earlier path check. */ if (bb_unlink(file_header->name) == -1 && errno != ENOENT && !(errno == EISDIR && S_ISDIR(file_header->mode)) ) { bb_perror_msg("can't remove old file %s", file_header->name); goto abort; } } else if (archive_handle->ah_flags & (ARCHIVE_EXTRACT_NEWER(xxiv) | ARCHIVE_O_TRUNC(xxv))) { /* Both these modes require looking at the inode before we leap. */ struct stat existing_sb; if (lstat(file_header->name, &existing_sb) == -1) { if (errno != ENOENT) { bb_perror_msg("can't stat old file %s", file_header->name); goto abort; } /* Doesn't exist, no problem */ } else if (archive_handle->ah_flags & ARCHIVE_EXTRACT_NEWER) { /* Remove the existing entry if it's older than the extracted entry */ if (existing_sb.st_mtime >= file_header->mtime) { if (!(archive_handle->ah_flags & ARCHIVE_EXTRACT_QUIET) && !S_ISDIR(file_header->mode) ) { bb_error_msg("%s not created: newer or " "same age file exists", file_header->name); } data_skip(archive_handle); goto ret; } /* As above. * Note: we could skip the ENOENT case this time, since we just determined that the file exists. But why leave a race condition when it's easily prevented? * In general though we're not bothering about such conditions, so security will depend on only one process creating entries based on untrusted data in the same directory at the same time. * This could perhaps be improved by walking the path using openat() with O_NOFOLLOW and doing the rest relative to the resulting fd. */ if (bb_unlink(file_header->name) == -1 && errno != ENOENT && !(errno == EISDIR && S_ISDIR(file_header->mode)) ) { bb_perror_msg("can't remove old file %s", file_header->name); goto abort; } } else { /* ARCHIVE_O_TRUNC mode */ /* A directory may naturally block the overwrite, but not all special file types will; especially do NOT try writing into a symlink, fifo or device node! */ if (!S_ISREG(existing_sb.st_mode) && !(S_ISDIR(existing_sb.st_mode) && S_ISDIR(file_header->mode)) ) { bb_error_msg("can't overwrite non-regular file %s", file_header->name); goto abort; } /* Regular file, no problem */ } } #if ENABLE_FEATURE_TAR_SELINUX(xxvi) if (sctx) { /* setfscreatecon is 4 syscalls, avoid if possible */ setfscreatecon(sctx); free(archive_handle->tar__sctx[PAX_NEXT_FILE]); archive_handle->tar__sctx[PAX_NEXT_FILE] = NULL; } #endif /* Create the filesystem entry */ switch (file_header->mode & S_IFMT) { case S_IFREG: { /* Regular file */ /* We encode hard links as regular files of size 0 with a symlink */ if (file_header->link_target && file_header->size == 0) { /* Using linkat() with no AT_SYMLINK_FOLLOW flag specifies more clearly that we want the Linux link() behavior, i.e. symlinks not followed. */ if (linkat(AT_FDCWD, file_header->link_target, AT_FDCWD, file_header->name, 0) == -1) { bb_perror_msg("can't create %slink from %s to %s", "hard", file_header->name, file_header->link_target); goto abort; } /* Hardlinks have no separate mode/ownership, skip chown/chmod */ goto ret; } #ifdef ARCHIVE_REPLACE_VIA_RENAME if (archive_handle->ah_flags & ARCHIVE_REPLACE_VIA_RENAME) { atomic_replace_file(archive_handle); goto ret; /* Skip chown/chmod/utimes, already done. */ /* And don't goto abort on error; we may have already modified the parent directory, so need to restore its times. */ } #endif { int dst_fd; int flags = O_WRONLY | O_CREAT | O_EXCL; if (archive_handle->ah_flags & ARCHIVE_O_TRUNC) flags = O_WRONLY | O_CREAT | O_TRUNC; dst_fd = open(file_header->name, flags, file_header->mode); if (dst_fd < 0) { bb_perror_msg("can't open '%s'", file_header->name); goto abort; } if (copyfd_exact_size(archive_handle->src_fd, dst_fd, file_header->size) == -1) { close(dst_fd); goto abort; } close(dst_fd); } break; } case S_IFDIR: if ((mkdir(file_header->name, file_header->mode) == -1) && (errno != EISDIR) /* btw, Linux doesn't return this */ && (errno != EEXIST) ) { bb_perror_msg("can't make dir %s", file_header->name); goto abort; } /* TODO maybe warn also if it exists and ISN'T a dir. Due to the initial path check, this doesn't by itself open a symlink attack, but it does mean we failed to reproduce the archive structure. (This can happen when neither ARCHIVE_UNLINK_OLD nor ARCHIVE_O_TRUNC is set, for instance with tar -k.) */ break; case S_IFLNK: /* Symlink */ #ifdef ARCHIVE_REPLACE_VIA_RENAME if (archive_handle->ah_flags & ARCHIVE_REPLACE_VIA_RENAME) { atomic_replace_file(archive_handle); goto ret; /* Skip chown/chmod/utimes, already done. */ } #endif if (symlink(file_header->link_target, file_header->name)) { bb_perror_msg("can't create %slink from %s to %s", "sym", file_header->name, file_header->link_target); goto abort; } break; case S_IFSOCK: case S_IFBLK: case S_IFCHR: case S_IFIFO: #ifdef ARCHIVE_REPLACE_VIA_RENAME if (archive_handle->ah_flags & ARCHIVE_REPLACE_VIA_RENAME) { atomic_replace_file(archive_handle); goto ret; /* Skip chown/chmod/utimes, already done. */ } #endif if (mknod(file_header->name, file_header->mode, file_header->device) == -1) { bb_perror_msg("can't create node %s", file_header->name); goto abort; } break; default: bb_error_msg("unrecognized file type for %s", file_header->name); goto abort; } if (!(archive_handle->ah_flags & ARCHIVE_DONT_RESTORE_OWNER)) if (restore_uid_gid(file_header->name, archive_handle)) archive_handle->status = EXIT_FAILURE; /* but continue */ /* symlinks have no meaningful mode or at least no available way to set it */ if (!S_ISLNK(file_header->mode)) if (!(archive_handle->ah_flags & ARCHIVE_DONT_RESTORE_PERM)) if (chmod(file_header->name, file_header->mode)) { bb_perror_msg("can't set mode on '%s'", file_header->name); archive_handle->status = EXIT_FAILURE; /* but continue */ } if (archive_handle->ah_flags & ARCHIVE_RESTORE_DATE) if (restore_date(file_header->name, file_header)) archive_handle->status = EXIT_FAILURE; /* but continue */ ret: if (restore_parent_times) { *lastslash = '\0'; utimensat(AT_FDCWD, file_header->name, parent_times, 0); *lastslash = '/'; } #if ENABLE_FEATURE_TAR_SELINUX if (sctx) { /* reset the context after creating an entry */ setfscreatecon(NULL); } #endif return; abort: if (slash) *slash = '/'; if (lastslash) *lastslash = '/'; #if ENABLE_FEATURE_TAR_SELINUX if (sctx) setfscreatecon(NULL); #endif archive_handle->status = EXIT_FAILURE; }
- Meaning, the implementation of the classic Unix "tape archival" program, as developed in the Busybox project. [^]
- Albeit with less shiny tabulation of stats, as it seems I didn't quite come up with something reusable for that. But for the one-liner at work for the listing here:
for f in *.patch; do echo -n "<li><a href=\"/code/os/busybox/extraction-regrind/$f\">$f</a> (("; sed -n '7,/^$/ {s/^ *//;p}' $f | raw2wp | head -c -2; echo '))</li>'; done
[^]
- busybox: tar as well as xread more generally didn't check for read errors, mistaking them for short reads.
Test: "tar -tf ."
Used to print: "tar: short read"
Now prints: "tar: read error: Is a directory" [^] - busybox/archival (tar etc): when restoring timestamps from an archive, preserve parent directory times by checking and restoring after each operation that would unintentionally change them. [^]
- busybox/archival: switch from utimes to utimensat, allowing for setting nanoseconds when provided by the format
This brings it into agreement with the new code for saving and restoring parent directory timestamps. Using utimes wasn't a good choice there as it could lose resolution. [^]
- busybox ar: cleaner and more GNU-like extraction behavior.
I found the directory mtimes work was less relevant to ar since it's not supposed to extract directories, but that it would do so anyway if instructed by the input, and it also failed to extract some valid (or at least GNU-produced) archives*.
* I think that's the link, except the derps broke it already in the barely two weeks since I read it and I didn't yet stash a copy. [^]
- busybox/archival: add notes on possible symlink attacks & prevention [^]
- busybox/archival: replace bb_make_directory with full prefix walk, ruling out symlink attacks and unintended dir mtime changes at that stage.
No longer treating the special case of preserving mtime on "."; it gets too complicated. [^]
- busybox/archival: refactor: hard link creation doesn't need separate pre-checking [^]
- busybox/archival: close the remaining symlink attack vector; related fixes
Re-indenting is deferred to better highlight the actual changes.
In ARCHIVE_UNLINK_OLD mode we can't ignore an existing local file, assuming it to be a directory, just because the archive entry is a directory. This enabled incorrect/incomplete extraction without warning, but not actually a separate symlink attack from the main one previously closed. (It could leave a symlink where a directory was intended, but preventing that link from being followed when extracting a later file remains the responsibility of the code doing that extracting.)
In ARCHIVE_O_TRUNC mode we need to pre-check that existing local files are of suitable type, lest we proceed to writing into a symlink (or even device node), especially since arbitrary ones can be created by earlier entries in a malicious archive.
Complete their TODO on checking for null link_target pointer from a bogus archive. [^]
- busybox/archival: deferred reindent [^]
- busybox/archival: fix earlier comment [^]
- busybox/archival: bring the unlink logic for the ARCHIVE_EXTRACT_NEWER case into agreement with the ARCHIVE_UNLINK_OLD one; in particular, don't ignore EISDIR if we're not in fact creating a directory. [^]
- busybox/archival(tar,cpio): don't die on first extraction error
This more closely follows the behavior of GNU tar, and uses the same deferred failure message on exit, as this is all helpful to the user and costs little.
In the other relevant applets (ar,rpm,dpkg) the calling code is more complex and it's not clear to me if deferred exit is desirable, so preserving their prior behavior for now. [^]
- busybox/archival: hopefully fix SELinux context restoration (untested)
Defer setfscreatecon() until after creating parent directories, as we probably don't want to give them the context intended for the leaf file. [^]
- busybox/archival: safe temp file handling, gapless metadata, symlink and node restoration in ARCHIVE_REPLACE_VIA_RENAME mode; don't skip restoring owner & date for symlinks just because mode isn't possible; catch metadata restoration errors.
The renaming cases are now disentangled from the normal ones, as they have little in common besides low-level details which are now factored out.
So far this mode is just used by the rpm applet, but looks nice to have more generally, perhaps eventually to replace the gpkg shell archive format. [^]
- busybox/archival: range check file_header->size and correct my earlier addition of bb_perror_msg for a similar internal error [^]
- busybox/archival: implement tar --same-owner as default only for root, & related changes.
Now that we properly report failure to chmod, non-root extraction no longer spews permission errors, and as an added bonus no longer hammers on passwd/group database lookups.
Rename related getopt constants for consistency.
Since I had to dig up its meaning anyway, make better use of opt_complementary to set the --same-owner and --same-permissions pairs properly (so later options override earlier ones). [^]
- busybox/archival: BSD/POSIX compatibility for directory unlink error handling and hard link creation. [^]
- busybox/archival: fix integer overflows leading to possible heap buffer overflow in reading tar headers [^]
- Did it do them any good ? [^]
- This code is currently only used by the busybox "rpm" implementation, not "tar", so it's not even built in the present Gales configuration, but now that I've put in the effort to study this code, it's interesting to me as a potential replacement for the slow yet effective shell-based Gales package installation machinery. I tested my changes by temporarily hardcoding the mode in tar.c. [^]
- The default mode for "tar". [^]
- This mode is only used by the busybox "cpio"; that might be handy for assembling a Linux initramfs without having to build the binary in the kernel tree, but we wouldn't need this mode there. It seems a good candidate for removal, as it does rather complicate this block. [^]
- Used by "tar --overwrite". [^]
- I don't anticipate using this, but I get what it's for and it's not doing me any harm so far. [^]
First off, congrats on getting this out and a big thanks to Diana Coman for thinking through it with you, "implementing tar for production use", indeed, win.
That being said,
leaves a rather foul taste in my mouth.
On the one hand, I sympathize with why you went this way, i.e. at this point, there's no V available for production use on Gales as we're not ready to add Gnat, and thus VaMP, there and since you need to publish while it's fresh before fixing everything, this is how it goes and even this, explaining in such detail as you did, is muchmuchmuchismo more effort than the 99%. Fine. Even saying all that, it seems like you left it selling it short. I think the first take away for me is you're on the right path with the Gnat work you're starting on. Gotta get Gales on VaMP at some point to do your work here justice.
On the other hand, I know how much you value working with others and desire to work with more competent people since there's so much to fix everywhere we look. Is that "maybe I'll look at it if you write in" very inviting ?
I understand you don't want to promise effort for free to anyone and everyone, and I suppose the argument could be made that the work itself in this case and as generally demonstrated on Fixpoint is well fucking inviting already anyways. But I dunno, it just didn't sit right with me. I don't know if I'm right, but thought it's worth raising.
Anyways, that's my cents on that front and I'm welcome to be corrected.
We should tease out footnote xxii on rpm vs gales' shell-based package manager. Can you explain the slowness of the latter ? Given that some of the highest compliments on Gales have been on that shell-based package manager, I'd be very reluctant to making a change there, even if you would enjoy it ;-p
Comment by Robinson Dorion — 2024-05-07 @ 04:37
There was a cleaning process of a messy thing; the process itself was somewhat messy; I chose to show the history as-is, rather than regrinding 18 patches to basically redo the cleanup from zero based on new knowledge of how it *could* have been done in fewer steps. That I point this out at all is to acknowledge that there are situations where this matters more but I didn't see this being one. There's no equality and resources are limited; VaMP or no VaMP, not every piece of code is going to get the same level of treatment.
Maybe you have a point about selling short regarding "more digestible"; for all I know they're digestible enough already and I didn't mean to imply otherwise. Because the main file is completely rewritten, the patches are there as supporting background but one would just as easily read the final state straight through, which is why that's what's there directly in the text.
To me it seems very inviting, even the maximum possible level of inviting. Put it this way, "if you're someone who's already supposedly responsible for this code, yet has left it in this sad state, and you now do your homework and show me, maybe I'll help check that you got it right despite having no idea who you are."
I don't mean literally using rpm, probably more like adding a tar option to use that rename mode; then built packages could just be tarballs, perhaps with some standard metadata enclosed if need be for post-install odds and ends.
Currently, they're "shell archives", self-extracting executable scripts containing the files as plain text or base64 in the form of here-documents (shell syntax that allows passing literal text to a command's stdin). This allowed full control over both the archive creation (eg setting arbitrary owners and permissions without having to run the build as root, or deterministic ordering) and extraction process (atomic file replacement so there's no point at which a file being updated is incomplete or absent; update of version symlink last so there's no point at which it refers to an incomplete package). However, there's still something that doesn't love the "need to run it to find out what's in it" situation (Pelosi on Obamacare, much?), although in theory you can read them or grep/awk them. It was a no formats, no format wars kind of thing but meanwhile I don't see so much merit in the approach (and lo, shell script is itself a format and more complex than tar).
I know more about this now than when I cooked it up, from poking around inside ksh. There's the base64 decoding but I don't think that's even the biggest part. Basically it works the shell implementation pretty hard: it has to parse the whole command, possibly reading the whole here-document into memory; then it writes it into a temporary file (in /tmp), doing whatever processing might be required, like variable expansion even though that isn't used here; then the command can run with its input redirected from the temp file. So yeah, on large files like say the 19MB python2.7 executable it ends up quite slow, and I didn't see any low-hanging fruit for improving this in the shell. Besides speed, the unbounded demand for memory and/or temp space doesn't sit well, much like the GNU approach to tar extraction discussed here.
The one you link is about the non-requirement of scripting languages besides the shell. That would not change. (Although the precursor to all of this did involve Scheme...) The package manager already requires various C programs to work reasonably well; basically it would be adding "tar" to that list, now that the coast is clear to do so, while removing others. And so far I'm not itching to do it; there will be a bunch of questions to resolve, and a bunch of existing ports to update for however the new scheme works; meanwhile the current one still works well enough.
Comment by Jacob Welsh — 2024-05-07 @ 18:36