Comments on: Regrinding Busybox archive extraction, fixing directory timestamps, symlink attacks, a buffer overflow and more

By: October Busyboxing with grep and other bug fixes « Fixpoint

October Busyboxing with grep and other bug fixes « Fixpoint — Sun, 09 Nov 2025 17:53:05 +0000

[...] Fixpoint series on Busybox code study and correction [...]

By: Jacob Welsh

Jacob Welsh — Tue, 07 May 2024 18:36:29 +0000

There was a cleaning process of a messy thing; the process itself was somewhat messy; I chose to show the history as-is, rather than regrinding 18 patches to basically redo the cleanup from zero based on new knowledge of how it *could* have been done in fewer steps. That I point this out at all is to acknowledge that there are situations where this matters more but I didn't see this being one. There's no equality and resources are limited; VaMP or no VaMP, not every piece of code is going to get the same level of treatment.

Maybe you have a point about selling short regarding "more digestible"; for all I know they're digestible enough already and I didn't mean to imply otherwise. Because the main file is completely rewritten, the patches are there as supporting background but one would just as easily read the final state straight through, which is why that's what's there directly in the text.

Is that "maybe I'll look at it if you write in" very inviting ?

To me it seems very inviting, even the maximum possible level of inviting. Put it this way, "if you're someone who's already supposedly responsible for this code, yet has left it in this sad state, and you now do your homework and show me, maybe I'll help check that you got it right despite having no idea who you are."

We should tease out footnote xxii on rpm vs gales' shell-based package manager.

I don't mean literally using rpm, probably more like adding a tar option to use that rename mode; then built packages could just be tarballs, perhaps with some standard metadata enclosed if need be for post-install odds and ends.
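To make the "packages could just be tarballs" idea concrete, here is a minimal sketch of the creation side, assuming Python's standard `tarfile` module rather than any Gales or Busybox tooling. The function name `build_package` and the `files` mapping are hypothetical. It only illustrates that a plain tar archive can carry arbitrary owners and permissions without the build running as root, and can be written in a deterministic order.

```python
import io
import tarfile


def build_package(files, owner="root", group="root", mtime=0):
    """Build an uncompressed tarball in memory.

    files maps an archive path to a (mode, data) pair. This layout is a
    hypothetical one chosen for the sketch, not an existing package format.
    """
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w", format=tarfile.USTAR_FORMAT) as tar:
        # Sorting the names makes member order independent of dict order.
        for name in sorted(files):
            mode, data = files[name]
            info = tarfile.TarInfo(name)
            info.size = len(data)
            info.mode = mode
            # Ownership is just header metadata, so no root is needed here.
            info.uid = info.gid = 0
            info.uname, info.gname = owner, group
            # A fixed timestamp keeps the output byte-for-byte reproducible.
            info.mtime = mtime
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()
```

Because nothing in the headers depends on the build environment, two builds from the same inputs should produce identical bytes, which also makes the result easy to hash and list without running it.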

Currently, they're "shell archives": self-extracting executable scripts containing the files as plain text or base64 in the form of here-documents (shell syntax that allows passing literal text to a command's stdin). This allowed full control over both the archive creation (e.g. setting arbitrary owners and permissions without having to run the build as root, or deterministic ordering) and the extraction process (atomic file replacement so there's no point at which a file being updated is incomplete or absent; update of the version symlink last so there's no point at which it refers to an incomplete package). However, there's still something that doesn't love the "need to run it to find out what's in it" situation (Pelosi on Obamacare, much?), although in theory you can read them or grep/awk them. It was a "no formats, no format wars" kind of thing, but by now I don't see so much merit in the approach (and lo, shell script is itself a format, and a more complex one than tar).
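The two extraction guarantees described above can be sketched as follows. This is an illustration under stated assumptions, not the actual Gales installer (which is a shell script): the directory layout, the link name `current`, and the function names are all hypothetical. It relies on POSIX `rename` semantics, which Python exposes as `os.replace`.

```python
import os
import tempfile


def atomic_write(path, data):
    """Write data so that path always holds either the old or the new file."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temp file must be on the same filesystem for the rename to be atomic.
    fd, tmp = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp, path)  # atomic replacement on POSIX
    except BaseException:
        # Clean up the partial temp file; the destination is untouched.
        if os.path.exists(tmp):
            os.unlink(tmp)
        raise


def atomic_symlink(target, link_path):
    """Repoint link_path at target with no moment where it is missing."""
    directory = os.path.dirname(os.path.abspath(link_path))
    tmp = os.path.join(directory, ".tmp-link-%d" % os.getpid())
    os.symlink(target, tmp)
    os.replace(tmp, link_path)  # renames over the old link itself


def install(pkg_root, name, version, files):
    """Install files (relative path -> bytes), then switch the version link."""
    version_dir = os.path.join(pkg_root, name, version)
    for rel, data in files.items():
        dest = os.path.join(version_dir, rel)
        os.makedirs(os.path.dirname(dest), exist_ok=True)
        atomic_write(dest, data)
    # Done last, so the link never refers to a partially installed version.
    atomic_symlink(version, os.path.join(pkg_root, name, "current"))
```

Note that `mkstemp` creates the file with restrictive permissions, so a real installer would also need to set mode and ownership before the rename.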

Can you explain the slowness of the latter ?

I know more about this now than when I cooked it up, from poking around inside ksh. There's the base64 decoding but I don't think that's even the biggest part. Basically it works the shell implementation pretty hard: it has to parse the whole command, possibly reading the whole here-document into memory; then it writes it into a temporary file (in /tmp), doing whatever processing might be required, like variable expansion even though that isn't used here; then the command can run with its input redirected from the temp file. So yeah, on large files like say the 19MB python2.7 executable it ends up quite slow, and I didn't see any low-hanging fruit for improving this in the shell. Besides speed, the unbounded demand for memory and/or temp space doesn't sit well, much like the GNU approach to tar extraction discussed here.
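The sequence described above (whole body in memory, spooled to a temp file, command run with stdin redirected from it) can be imitated outside the shell to see where the cost comes from. The sketch below is only a model of that behavior, not ksh's code. The helper names are hypothetical, and a child Python process stands in for the base64 decoder.

```python
import base64
import subprocess
import sys
import tempfile

# Stand-in for the decoding command a shell archive would invoke.
DECODER = [
    sys.executable,
    "-c",
    "import sys, base64;"
    "sys.stdout.buffer.write(base64.b64decode(sys.stdin.buffer.read()))",
]


def run_with_heredoc(argv, body):
    """Feed body to argv the way the shell is described as feeding a here-document."""
    # Step 1: body is already fully in memory, as after parsing the command.
    # Step 2: spool all of it to a temporary file before anything runs.
    with tempfile.TemporaryFile() as spool:
        spool.write(body)
        spool.seek(0)
        # Step 3: only now start the command, with stdin read from the spool.
        result = subprocess.run(
            argv, stdin=spool, stdout=subprocess.PIPE, check=True
        )
    return result.stdout


def roundtrip(payload):
    """Encode payload as an archive would, then extract it via the spool."""
    return run_with_heredoc(DECODER, base64.b64encode(payload))
```

Every byte of the payload is held in memory once and written to temp space once before the consumer sees any of it, which is the unbounded memory and temp demand noted above; a streaming format like tar avoids both.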

Given that some of the highest compliments on Gales have been on that shell-based package manager, I'd be very reluctant to make a change there

The one you link is about the non-requirement of scripting languages besides the shell. That would not change. (Although the precursor to all of this did involve Scheme...) The package manager already requires various C programs to work reasonably well; basically it would be adding "tar" to that list, now that the coast is clear to do so, while removing others. And so far I'm not itching to do it; there will be a bunch of questions to resolve, and a bunch of existing ports to update for however the new scheme works; meanwhile the current one still works well enough.

By: Robinson Dorion

Robinson Dorion — Tue, 07 May 2024 04:37:47 +0000

First off, congrats on getting this out and a big thanks to Diana Coman for thinking through it with you, "implementing tar for production use", indeed, win.

That being said,

as I can't quite be bothered to revise the history into something more "logical", digestible, or separable; they build on each other as well as prior work in the Gales tree. If any other Busybox publishers out there want to apply the fixes to their own trees, they're invited to write in and share their progress, and maybe I'll give it a look.

leaves a rather foul taste in my mouth.

On the one hand, I sympathize with why you went this way, i.e. at this point, there's no V available for production use on Gales as we're not ready to add Gnat, and thus VaMP, there, and since you need to publish while it's fresh before fixing everything, this is how it goes, and even this, explaining in such detail as you did, is muchmuchmuchismo more effort than the 99%. Fine. Even saying all that, it seems like you left it selling it short. I think the first takeaway for me is you're on the right path with the Gnat work you're starting on. Gotta get Gales on VaMP at some point to do your work here justice.

On the other hand, I know how much you value working with others and desire to work with more competent people since there's so much to fix everywhere we look. Is that "maybe I'll look at it if you write in" very inviting ?

I understand you don't want to promise effort for free to anyone and everyone, and I suppose the argument could be made that the work itself in this case and as generally demonstrated on Fixpoint is well fucking inviting already anyways. But I dunno, it just didn't sit right with me. I don't know if I'm right, but thought it's worth raising.

Anyways, that's my two cents on that front and I'm open to being corrected.

We should tease out footnote xxii on rpm vs gales' shell-based package manager. Can you explain the slowness of the latter ? Given that some of the highest compliments on Gales have been on that shell-based package manager, I'd be very reluctant to make a change there, even if you would enjoy it ;-p