mmastrac 7 hours ago

The major takeaway from this is that Rust will be making environment setters unsafe in the next edition. With luck, this will filter down into crates that trigger these crashes (https://github.com/alexcrichton/openssl-probe/issues/30 filed upstream in the meantime).

  • usefulcat 5 hours ago

    But that won't actually fix the underlying problem, namely that getenv and setenv (or unsetenv, probably) cannot safely be called from different threads.

    It seems like the only reliable way to fix this is to change these functions so that they exclusively acquire a mutex.

    • eqvinox 5 hours ago

      I have a different perspective: the underlying problem is calling setenv(). As far as I'm concerned, the environment is a read-only input parameter set on process creation like argv. It's not a mechanism for exchanging information within a process, as used here with SSL_CERT_FILE.

      And remember that the exec* family of calls has a version with an envp argument, which is what should be used if a child process is to be started with a different environment — build a completely new structure, don't touch the existing one. Same for posix_spawn.
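
      A minimal sketch of that approach in C: the child's environment is built as a fresh array and handed to posix_spawn(), so the parent's own environment is never touched. The helper name run_with_env is a hypothetical name chosen for illustration, not something from the article.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <spawn.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>

/* Hypothetical helper: run `path` with an explicitly supplied environment.
 * The parent's environment is neither read nor modified.
 * Returns the child's exit status, or -1 on failure. */
int run_with_env(const char *path, char *const argv[], char *const envp[])
{
    pid_t pid;
    int status;

    assert(path != NULL && argv != NULL && envp != NULL);

    /* posix_spawn() reports failure through its return value, not errno. */
    if (posix_spawn(&pid, path, NULL, NULL, argv, envp) != 0)
        return -1;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

      Called with an array such as {"PATH=/usr/bin:/bin", "SSL_CERT_FILE=/tmp/ca.pem", NULL} (example values only), the variable is visible in the child alone. execve() accepts the same kind of array.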

      And, lastly, compatibility with ancient systems strikes again: the environment is also accessible through this:

         extern char **environ;
      
      Which is, of course, best described as bullshit.

      • lamontcg an hour ago

        Environment variables are a gigantic, decades-old hack that nobody should be using... but instead everyone has rejected file-based configuration management and everyone is abusing environment variables to inject config into "immutable" docker containers...

      • diroussel 5 hours ago

        Indeed, environment variables should be used to configure child processes, not to configure the current process, for non-shell programs, IMHO.

        Note that Java, and the JVM, doesn't allow changing environment variables. It was the right choice, even if painful at times.

        • hinkley 3 hours ago

          I think there's a narrow window, at least in some programming languages, when environment variables can be set at the start of a process. But since it's global shared state, it needs to be write (0,1) and read many. No libraries should set them. No frameworks should set them, only application authors and it should be dead obvious to the entire team what the last responsible moment is to write an environment variable.

          I am fairly certain that somewhere inside the polyhedron that satisfies those constraints, is a large subset that could be statically analyzed and proven sound. But I'm less certain if Rust could express it cleanly.

        • jamesfinlayson 3 hours ago

          Sure is painful (mostly when writing tests where the environment variables aren't abstracted in some way).

          But I think it was actually possible to hack around up until Java 17.

      • Joker_vD 5 hours ago

        > As far as I'm concerned, the environment is a read-only input parameter set on process creation like argv.

        Mutating argv is actually quite popular, or at least it used to be.

        • GuB-42 3 hours ago

          Mutating argv is fine for how it is usually done. That is, to permute the arguments in a getopt() call so that all nonoptions are at the end.

          It is fine because it is usually done during the initialization phase, before starting any other thread. setenv() can be used here too, though I prefer to avoid doing that in any case. I also prefer not to touch argv, but since that's how GNU getopt() works, I just go with it.

          Once the program is running and has started its threads, I consider setenv() a big no-no. The Rust documentation agrees with me: "In multi-threaded programs on other operating systems, the only safe option is to not use set_var or remove_var at all." Note: here, "other operating systems" means "not Windows".

        • eqvinox 5 hours ago

          Yes, and if there were "setargv()" or "getargv()" functions, they'd have the same issues ;) … but argv is a function parameter to main()¹, and only that.

          ¹ or technically whatever your ELF entry point is, _start in crt0 or your poison of choice.

          • gpm 5 hours ago

            > but argv is a function parameter to main()¹, and only that.

            > ¹ or technically whatever your ELF entry point is, _start in crt0 or your poison of choice.

            Once you include the footnote, at least on linux/macos (not sure about Windows), you could take the same perspective with regards to envp and the auxiliary array. It's libc that decided to store a pointer to these before calling your `main`, not the abi. At the time of the ELF entry point these are all effectively stack local variables.

            • eqvinox 4 hours ago

              I mean, yes, we're in "violent agreement" there. It's nice that libc squirrels away a copy and gives you a `getenv()` function with a string lookup, but… setenv… that was just a horrible idea. It's not really wrong to view it as a tool that allows you to muck around with main()'s local variables. Which to me sounds like one should take a shower after using it ;D

              (Ed.: the man page should say "you are required to take a shower after writing code that uses setenv(), both to get off the dirt, but also to give you time to think about what you are doing" :D)

              • gpm 4 hours ago

                Oops, didn't mean to come across as disagreeing at all, more of a "yes, and <once you include the footnote>".

                • eqvinox 4 hours ago

                  Ah, after rereading I think I accidentally read that in, sorry

    • debugnik 4 hours ago

      No amount of locking can make the getenv API thread-safe, because it returns a pointer which gets invalidated by setenv, but lacks a way to release ownership over it and unblock setenv safely (or to free a returned copy).

      So setenv's existence makes getenv inherently unsafe unless you can ensure the entire application is at a safe point to use them.
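
      A small C sketch of the usual mitigation, copying the value immediately after getenv() returns. The helper name getenv_copy is hypothetical. The copy survives later setenv() calls, but another thread can still call setenv() between the getenv() and the copy, so this narrows the window rather than closing it.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: return a heap copy of an environment value, or NULL
 * if the variable is unset or allocation fails. The caller must free() it. */
char *getenv_copy(const char *name)
{
    assert(name != NULL);

    const char *value = getenv(name);   /* storage owned by libc */
    if (value == NULL)
        return NULL;

    size_t len = strlen(value) + 1;
    char *copy = malloc(len);
    if (copy != NULL)
        memcpy(copy, value, len);       /* from here on, setenv() cannot hurt us */
    return copy;
}
```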

      • josefx 3 hours ago

        C could provide functions to lock/unlock a mutex and require that any attempt to access the environment has to be done holding the mutex. This would still leave the correctness in the hands of the user, but at least it would provide a standard API to secure the environment in a multi threaded application that library and application developers could adopt.
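
        Roughly what such a convention could look like, sketched in C with a pthread mutex. No such standard API exists; the names env_get_locked and env_set_locked are made up for illustration, and the lock only protects code that agrees to use it.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

/* One process-wide lock that cooperating code holds around every environment
 * access. It is a convention only: code that calls getenv() or touches
 * environ directly is not affected by it. */
static pthread_mutex_t env_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Copy the value of `name` into `buf` while holding the lock.
 * Returns 0 on success, -1 if the variable is unset or `buf` is too small. */
int env_get_locked(const char *name, char *buf, size_t size)
{
    int rc = -1;

    assert(name != NULL && buf != NULL);
    pthread_mutex_lock(&env_mutex);
    const char *value = getenv(name);
    if (value != NULL && strlen(value) < size) {
        strcpy(buf, value);             /* fits, including the terminator */
        rc = 0;
    }
    pthread_mutex_unlock(&env_mutex);
    return rc;
}

/* Set `name` to `value` while holding the same lock. */
int env_set_locked(const char *name, const char *value)
{
    assert(name != NULL && value != NULL);
    pthread_mutex_lock(&env_mutex);
    int rc = setenv(name, value, 1);
    pthread_mutex_unlock(&env_mutex);
    return rc;
}
```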

    • pshc 5 hours ago

      The underlying problem is that setenv is mutable global state and should never have existed

      • Joker_vD 5 hours ago

        The process's current directory is mutable global state as well, and yet chdir(2) is thread-safe.

        • plorkyeran 4 hours ago

          chdir is thread-safe, but interacting with the current directory in any context other than parsing command-line arguments is still nearly always a mistake. Everything past a program's entry point should be working exclusively in absolute paths.

          • fanf2 2 hours ago

            Yeah if you chdir() in a multithreaded program, all cwd-relative file accesses in other threads are fucked.

            As well as absolute paths, it’s ok to work with descriptor-relative paths using openat() and friends.
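
            A short C sketch of the descriptor-relative style. The helper name read_at is hypothetical. The file is resolved against a directory descriptor opened earlier, so a chdir() in another thread does not change which file is read.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: read up to `size` bytes of `name`, resolved relative
 * to the directory descriptor `dirfd` rather than to the process-wide
 * current directory. Returns the byte count, or -1 on error. */
ssize_t read_at(int dirfd, const char *name, char *buf, size_t size)
{
    assert(name != NULL && buf != NULL);

    int fd = openat(dirfd, name, O_RDONLY);
    if (fd < 0)
        return -1;
    ssize_t n = read(fd, buf, size);
    close(fd);
    return n;
}
```

            The directory descriptor would typically come from open(dir, O_RDONLY | O_DIRECTORY) during startup.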

      • josefx 4 hours ago

        Welcome to the C standard library, the application of mutable global state to literally everything in it has to be the most consistent and predictable feature of the language standard.

    • ModernMech 5 hours ago

      It's the same problem with global vars, but at a machine scope. The real solution here would be for the OS to have a better interface to read and write env vars, more like a file where you have to get rw permission (whether that's implemented as a mutex or what).

      • eqvinox 5 hours ago

        This is neither an OS nor a machine scope problem. The environment is provided by the OS at startup. What the process does with it from there on is its own concern.

        • ModernMech 5 hours ago

          > The environment is provided by the OS at startup.

          That's part of the design of the OS. How the OS implements this is primitive, and so it leaves it up to every language to handle. The blog mentions the issue is with getenv, setenv, and realloc, all system calls. To me, that sounds like bad OS design is causing issues downstream with languages, leaving it up to individual programmers to deal with the fallout.

          • eqvinox 5 hours ago

            > getenv, setenv, and realloc, all system calls

            None of these 3 functions is a system call. open(), mmap(), sbrk(), poll(), etc. are system calls. What you're referring to is C library API, which as Go has shown (both to its benefit and its detriment) is optional on almost all operating systems (a major exception being OpenBSD.)

            If you really want to lose some sanity I would recommend reading the man page for getauxval(), and then look up how that works on the machine level when the process is started. Especially on some of the older architectures. (No liability accepted for any grey hair induced by this.)

            ed.: https://lwn.net/Articles/631631/

          • Joker_vD 5 hours ago

            Neither getenv, setenv, nor realloc is a system call; they are all functions from the C standard library, some parts of which, for historical reasons, are required to be almost impossible to use safely or reliably.

  • benatkin 6 hours ago

    People get trained to ignore the ____UNSAFE_payattention__nevermindthatthisappears50timesinthisfile___ blocks and prefixes

    This also shows up in web frameworks where Vue has the v-html directive and react has dangerouslySetInnerHTML. Vue definitely has it better.

    • crooked-v 6 hours ago

      In the React world, the only times I've seen dangerouslySetInnerHTML consistently used is for outputting string literal CSS content (and this one is increasingly rare as build tools need less handholding), string literal JSON content (for JSON+LD), and string literal premade scripts (i.e. pixel tags from the marketing content). That's not to say there's no danger surface there, but it's not broadly used as a tool outside of code that's either really bad or really exhaustively hand-tuned.

      • rerdavies 4 hours ago

        Code syntax highlighting libraries for react use dangerouslySetInnerHTML.

      • javier2 6 hours ago

        I've only really seen dangerouslySetInnerHTML used while transitioning from certain kinds of server side rendering to React. There are still lots of really old internal tools in ancient HTML out there.

      • benatkin 6 hours ago

        React doesn't have a tag and attribute sanitizer built in, so having non-js-programmers edit JSX isn't especially safe anyways, as an img or a href could exfiltrate data. If it were they could just block out an innerHTML attribute. A js programmer can get around it by setting up a ref and then using the reference to set innerHTML without the word dangerously appearing.

        • koito17 6 hours ago

          > A js programmer can get around it by setting up a ref and then using the reference to set innerHTML without the word dangerously appearing.

          If DOM nodes during the next render differ from what react-dom expects (i.e. the DOM nodes from the previous render), then react-dom may throw a DOMException. Mutating innerHTML via a ref may violate React's invariants, and the library correctly throws an error when programmers, browser extensions, etc. mutate the DOM such that a node's parent unexpectedly changes.

          There are workarounds[1] to mutate DOM nodes managed by React and avoid DOMExceptions, but I haven't worked on a codebase where anything like this was necessary.

          [1] https://github.com/facebook/react/issues/11538#issuecomment-...

ChrisSD 7 hours ago

In the Rust std, `set_var` and `remove_var` will correctly require using an `unsafe {}` block in the next edition (2024). The documentation does now mention the safety issue but obviously it was a mistake to make these functions safe originally (albeit a mistake even higher level languages have made).

https://doc.rust-lang.org/stable/std/env/fn.set_var.html

There is a patch for glibc which makes `getenv` safe in more cases where the environment is modified but C still allows direct access to the environ so it can't be completely safe in the face of modification https://github.com/bminor/glibc/commit/7a61e7f557a97ab597d6f...

  • Thaxll 7 hours ago

    Why require unsafe when the std implementation could take care of the synchronisation?

    • masklinn 7 hours ago

      Because the std implementation can not force synchronisation on the libc, so any call into a C library which uses getenv will break... which is exactly what happened in TFA: `openssl-probe` called env::set_var on the Rust side, and the Python interpreter called getenv(3) directly.

      • rerdavies 4 hours ago

        But the standard implementation could copy the environment at startup, and only use its copy.

        And the library's use of setenv is clearly a bug as setenv is documented to be not threadsafe in the C standard library. So that would take care of that problem.

        • demurgos 3 hours ago

          If you clone the environment at startup, then you get a situation where code in the same binary can see different values depending if it uses libc or Rust's std. It's also no longer the same environment as in the process metadata.

          Using a copy by default may have worked if it was designed as such before Rust 1.0, but Rust took the decision to expose the real environment and changing this now would be more disruptive than marking mutations as unsafe.

      • miohtama 5 hours ago

        Is it possible to skip libc completely or would this introduce too many portability concerns?

        • duped 5 hours ago

          In general, no, because of FFI. In special circumstances, yes, but this isn't really important because the libc implementation is trivial (on all platforms that matter, envp is a char** to strings formatted as KEY=VALUE, set_env(key, value) is equivalent to allocating a new KEY=VALUE string and finding the index of a key if it exists or appending to the array).
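
          To make that layout concrete, here is a minimal C sketch of the read side: a linear scan over a NULL-terminated array of KEY=VALUE strings. The function name envp_lookup is hypothetical and this is not any particular libc's code. Note that the result points into the array's own storage, which is why a concurrent writer is dangerous.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Look up `key` in a NULL-terminated array of "KEY=VALUE" strings.
 * Returns a pointer to the value inside the matching entry, or NULL when
 * the key is absent. */
const char *envp_lookup(char *const envp[], const char *key)
{
    assert(envp != NULL && key != NULL);

    size_t key_len = strlen(key);
    for (size_t i = 0; envp[i] != NULL; i++) {
        /* An entry matches only if the key is followed directly by '='. */
        if (strncmp(envp[i], key, key_len) == 0 && envp[i][key_len] == '=')
            return envp[i] + key_len + 1;
    }
    return NULL;
}
```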

          Under the hood the pointer is initialized by the loader, in a special place in executable memory. Most of the time, the loader gets the initial environment variable list by looking at argv* (try reading past the end of the null separator, you'll find the initial environment variables).

          It would be possible for a language to hack it such that on load they initialize their own env var set without using libc and be able to safely set/get those env vars without going through libc, and to inherit them when spawning child processes by reading the special location instead of the standard location initialized by your platforms' loader/updated by libc. But how useful is a language with FFI that's fundamentally broken since callees can't set environment variables? (probably very useful, since software that relies on this is questionably designed in the first place)

          If you wanted to make a bulletproof solution, you would specify the location of an envp mutex in the loader's format and make it libc's (or any language runtime's) problem to acquire that mutex.

          * there are platforms where this isn't true

        • jcotton42 5 hours ago

          It's not just libc, it's any C or C++ library that calls getenv or setenv.

          • rerdavies 4 hours ago

            Specifically, any C or C++ library that calls setenv (despite documentation that says that setenv is not threadsafe).

    • ChrisSD 7 hours ago

      It can only synchronize if everything is using Rust's functions. But that's not a given. People can use C libraries (especially libc) which won't be aware of Rust's locks. Or they could even use a high level runtime with its own locking but then they'll be distinct from Rust's locks.

      The only way to coordinate locking would be to do so in libc itself.

      • wahern 6 hours ago

        libc does do locking, but it's insufficient. The semantics of getenv/setenv/putenv just aren't safe for multi-threaded mutation, period, because the addresses are exposed. It's not really even a C language issue; were you to design a thread-safe env API, for C or Rust, it would look much different, likely relying on string copying even on reads rather than passing strings by reference (reference counted immutable strings would work, too, but is probably too heavy handed), and definitely not exposing the environ array.

        The closest libc can get to MT safety is to never deallocate an environment string or an environ array. Solaris does this--if you continually add new variables with setenv it just leaks environ array memory, or if you continually overwrite a key it just leaks the old value. (IIRC, glibc is halfway there.) But even then it still requires the application to abstain from doing crazy stuff, like modifying the strings you get back from getenv. NetBSD tried adding safer interfaces, like getenv_r, but it's ultimately insufficient to meaningfully address the problem.

        The right answer for safe, portable programs is to not mutate the environment once you go multi-threaded, or even better just treat process environment as immutable once you enter your main loop or otherwise finish with initial process setup. glibc could (and maybe should) fully adopt the Solaris solution (currently, IIRC, glibc leaks env strings but not environ arrays), but if applications are using the environment variable table as a global, shared, mutable key-value store, then leaking memory probably isn't what they want, either. Either way, the best solution is to stop treating it as mutable.
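
        One way to act on "treat it as immutable", sketched in C: deep-copy the environment once during single-threaded startup and read only from the copy afterwards. The struct and function names are hypothetical. The trade-off is that the copy will not reflect later setenv() calls made by other code in the process.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Private copy of an environment array; never modified after creation. */
struct env_snapshot {
    char **entries;   /* NULL-terminated array of owned "KEY=VALUE" strings */
    size_t count;
};

/* Deep-copy `envp`. Intended to be called once, before any thread starts.
 * Returns 0 on success, -1 on allocation failure. */
int env_snapshot_take(struct env_snapshot *snap, char *const envp[])
{
    assert(snap != NULL && envp != NULL);

    size_t count = 0;
    while (envp[count] != NULL)
        count++;

    char **entries = calloc(count + 1, sizeof *entries);
    if (entries == NULL)
        return -1;
    for (size_t i = 0; i < count; i++) {
        entries[i] = strdup(envp[i]);
        if (entries[i] == NULL) {
            while (i > 0)               /* undo the copies made so far */
                free(entries[--i]);
            free(entries);
            return -1;
        }
    }
    snap->entries = entries;
    snap->count = count;
    return 0;
}

/* Look up `key` in the snapshot; later setenv() calls cannot affect it. */
const char *env_snapshot_get(const struct env_snapshot *snap, const char *key)
{
    assert(snap != NULL && key != NULL);

    size_t key_len = strlen(key);
    for (size_t i = 0; i < snap->count; i++) {
        const char *entry = snap->entries[i];
        if (strncmp(entry, key, key_len) == 0 && entry[key_len] == '=')
            return entry + key_len + 1;
    }
    return NULL;
}
```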

    • demurgos 7 hours ago

      It can't ensure synchronization because any code using libc could bypass the sync wrapper. In particular, Rust lets you link C libs which wouldn't use the Rust stdlib.

    • msully4321 7 hours ago

      Because it can still race with C code using the standard library. getenv calls are common in C libraries; the call to getenv in this post was inside of strerror.

    • fsckboy 6 hours ago

      you've gotten a lot of answers which say the same thing, but which I don't think answer your question:

      synchronization methods impose various complexity and performance penalties, and single threaded applications which don't need that would pay those penalties and get no benefit.

      Unix was designed around a lightweight ethos that allowed simple combining of functions by the user on the command line. See "worse is better", but tl;dr that way of doing things proved better, and that's why you find yourself confronting what it doesn't do.

      • davidt84 5 hours ago

        The real problem is that getenv() and setenv() were created before threads were really a thing.

      • sunshowers 5 hours ago

        Well it was better in the short term but is worse in the long term. In particular, the error handling situation is generally atrocious, which is fine for interactive/sysadmin use but much worse for serious production use.

vlovich123 7 hours ago

Even if C stdlib maintainers are resistant to making setenv multi-thread safe, at a minimum there should be a new alternative thread-safe API defined, whether within POSIX or as a de facto standard that POSIX is pushed to adopt over time. If the effort spent explaining why nothing could be done had instead been spent fixing this problem, a new thread-safe API could have replaced the old setenv, which could then have been deprecated and removed from many software projects.

I'm also not convinced by Musl's maintainer that it can't be fixed within Musl considering glibc is making changes to make this a non-issue.

  • usefulcat 5 hours ago

    The biggest problem is not the absence of a thread safe API, it's the existence of this:

        extern char **environ;
    
    As long as environ is publicly accessible, there's no guarantee that setenv and getenv will be used at all, since they're not necessary.

    If you're willing to get rid of environ, it's pretty trivial to make setenv and getenv thread safe. If not, then it's impossible, although one could still argue that making setenv and getenv thread safe is at least an improvement, even if it's not a complete solution (aka don't let the perfect be the enemy of the good).

    • vlovich123 3 hours ago

      > aka don't let the perfect be the enemy of the good

      Exactly my point. Over time *environ would disappear, at least from the major software projects that everyone uses (assuming it's even in use in them in the first place).

  • panzi 5 hours ago

    Guess that would also require some locking for all the exec() functions that don't take the environment as a parameter or that search PATH for the executable.

  • davidt84 4 hours ago

    I'm not convinced by you that you know more than the experts who have determined there is no backwards-compatible way to fix this.

StillBored 6 hours ago

It's like a rite of passage to be hit by an environment related bug on linux, which is mysteriously less a problem on other unix's. Which is sorta funny given how pragmatic Linus and the kernel are about fixing POSIX bugs by making them not happen, while glibc is still lagging here decades after people tried to at least make the problem better. Sure, there is all the crap around TZ/etc, but simply providing getenv_r(), synchronizing it with setenv(), and warning during compile/link on getenv() would have killed much of the problem. Never mind actually doing a COW style system where the env pointer(s) are read only. Instead the problem is pushed to the individual application, which is a huge mistake, because application writers are rarely aware of what their dependencies are doing. Which is the situation I found myself in many, many years ago. The closed source library vendor, at the time, told us to stop using that toy unix clone (linux).

  • kelnos 5 hours ago

    > environment related bug on linux, which is mysteriously less a problem on other unix's.

    How do you figure? The problem isn't the implementation, it's the API. setenv(), unsetenv(), putenv(), and especially environ, are inherently unsafe in a multithreaded program. Even getenv_r() can't really save you, since another thread may be calling setenv() while the (old) value of an env var is being copied into the provided buffer. Sure, a getenv_r() fixes the case where you get something back from getenv(), and then another thread calls setenv() and makes that memory invalid, but there's no way to protect the other calls breaking the API.

    There are ways to mitigate some of the issues, like having libc hold a mutex when inside getenv()/setenv()/putenv()/unsetenv(), but there's still no way for libc to guarantee that something returned by getenv() remains valid long enough for the calling code to use it (which, right, can be fixed by getenv_r(), which could also be protected by that mutex). But there's no good way to make direct access to environ safe. I suppose you could make environ a thread-local, but then different threads' views of the environment could become out of sync, permanently (and you could get different results between calling getenv_r() and examining environ directly).

    Back-compat here is just really hard to do. Even adding a mutex to protect those functions could change the semantics enough to break existing programs. (Arguably they're already broken in that case, but still...)

    • rerdavies 4 hours ago

      Why does adding a mutex break the API? I guess it breaks `char**environ`. But the API wouldn't be broken.

      • benmmurphy 3 hours ago

        I think you would have to change the API to return a copy of the string as the get_env result, which the caller is responsible for freeing; or the env implementation would have to ensure that values returned from get_env are stable and never change, which is effectively a memory leak.

    • einpoklum 5 hours ago

      > Even getenv_r() can't really save you, since another thread may be calling setenv() while the (old) value of an env var is being copied into the provided buffer.

      Won't that depend on the libc implementation? For example, maybe setenv writes to another buffer, then swaps pointers atomically; wouldn't that work?
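
      A standalone C11 sketch of that idea for a single value, using <stdatomic.h>. It is not how any libc implements setenv, and value_set/value_get are hypothetical names. The swap itself is easy; the hard part is that the old buffer can never be freed safely, because the writer cannot tell whether a reader still holds it.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>
#include <string.h>

/* A single "variable" whose current value is published through an atomic
 * pointer. */
static _Atomic(char *) current_value = NULL;

/* Writer: build the new string completely, then publish it in one step. */
int value_set(const char *text)
{
    assert(text != NULL);

    size_t len = strlen(text) + 1;
    char *fresh = malloc(len);
    if (fresh == NULL)
        return -1;
    memcpy(fresh, text, len);

    /* The old string is deliberately leaked: without reference counts or
     * some other reclamation scheme, a reader may still be using it. */
    (void)atomic_exchange(&current_value, fresh);
    return 0;
}

/* Reader: sees either the complete old string or the complete new one. */
const char *value_get(void)
{
    return atomic_load(&current_value);
}
```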

masklinn 7 hours ago

Previously on setenv being a terrible thing: https://www.evanjones.ca/setenv-is-not-thread-safe.html (discussion: https://news.ycombinator.com/item?id=38342642 first comment is even about it causing issues in Rust)

  • Animats 6 hours ago

    Yes. That's known.

    Most of the rest of the problem here seems to be the development environment. They're testing on a remote machine in an Amazon data center and using Docker. This rig fails to report that a process has crashed. Then they don't have enough debug symbol info inside their container to get a backtrace. If they'd gotten a clean backtrace reported on the first failure, this would have been obvious.

    Why is anyone using "setenv" anyway?

    • mmastrac 6 hours ago

      Yup, it's mostly just the story and tools we used to get ourselves out of a mess that was made harder by some decisions made earlier -- the tests were running in a container with stripped symbols (we're going to ship symbols after this, no reason to over-optimize), our custom test runner failed to report process death (an oversight).

      There's no reason setenv should have been called here. The `openssl-probe` library could simply return the paths to the system cert files and callers could plug those directly into the OpenSSL config.
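
      The shape of that alternative, as a minimal C sketch rather than the actual `openssl-probe` code (which is Rust): a probe that returns the first readable candidate path and leaves the environment alone. The function name probe_cert_file and the idea of passing the candidate list in are assumptions made for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical probe: return the first readable path from `candidates`
 * (a NULL-terminated list), or NULL if none is readable. It reports its
 * finding to the caller and never touches the environment. */
const char *probe_cert_file(const char *const candidates[])
{
    assert(candidates != NULL);

    for (size_t i = 0; candidates[i] != NULL; i++) {
        if (access(candidates[i], R_OK) == 0)
            return candidates[i];
    }
    return NULL;
}
```

      The caller would then pass the returned path to the TLS library's own configuration call instead of exporting SSL_CERT_FILE.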

      Oversights all around and hopefully this continues to improve.

    • masklinn 6 hours ago

      > Why is anyone using "setenv" anyway?

      Because it’s there and it looks like a good idea until it takes one of your fingers.

      • einpoklum 5 hours ago

        It really does not look like a good idea to setenv() . The very notion is quite terrifying. Messing with a bunch of globals, that other code knows about as well? Nuh-uh.

        The thing is, the OP people weren't doing that at all, it was some irresponsible library maintainers. If your code does that, you have to include something like the "surgeon general's warning" everywhere: "CAREFUL: USING THIS LIBRARY MAY CAUSE TERMINAL CRASHES".

        • SAI_Peregrinus 16 minutes ago

          It's OpenSSL. It's basically a sea urchin turned into code in terms of safe handling.

kelnos 5 hours ago

This reminded me of that whole "12-factor app" movement, which several of my former coworkers had really bought into. One of the "factors" is that apps should be configured by environment variables.

I always thought this was kinda foolish: your configuration method is a flat-namespace basket of stringly-typed values. The perils of getenv()/setenv()/environ are also, I think, a great argument against using env vars for configuration.

Sure, there aren't always great, well-supported options out there. I prefer using a configuration file (you can have templated config and a system that fills in different values for e.g. dev/stage/prod), and I'll usually use YAML, despite its faults and gotchas. There are probably better configuration file formats, but IMO YAML is still significantly better than using env vars.

  • __MatrixMan__ 2 hours ago

    I have similar reservations about env vars. I dislike how they can be read from anywhere--it interrupts the ability to reason about a function's behavior from its signature and makes impure plenty of functions that could otherwise have been pure.

    If there were a language feature that let me mark apps such that during any process env vars are not writable and are readable only once (together, in a batch, not once per var), I'd use it everywhere.

  • eqvinox 4 hours ago

    getenv() is perfectly fine, it's setenv() that is the problem. Which in theory this wouldn't be using since the env would be set up prior to starting that mystical app.

    But yes, a flat namespace, with string values, shared as a free-for-all with who knows what libraries and modules you're loading… that's not a good idea even if it didn't have safety issues in setenv().

rikthevik 4 hours ago

Great article about digging into a non-obvious bug. This one had it all! Intermittent bug, architecture-specific, hidden in a dependency, rust, the python GIL, gettext. Fantastic stuff.

These kinds of detailed troubleshooting reports are the closest thing you can get to having to do it yourself. Thanks to the authors. It's easy to say "don't use X duh" until a dependency relies on it, and how were you supposed to know?

HarHarVeryFunny 4 hours ago

What is the rationale for libc not making setenv/getenv thread safe? It does seem rather odd given how environment variables are explicitly defined as shared between threads in the same process!

It doesn't seem it would take much to do it efficiently, even retaining the poor getenv() pointer-returning API (which could point to a thread local buffer). The coordination between getenv and setenv could be very lightweight - spinlock vs mutex.

shikon7 7 hours ago

I wonder why it is so hard for Rust to implement its own safe stdlib independent of C.

  • dgrunwald 6 hours ago

    How exactly would that help in this situation?

    If both Rust and C have independent standard libraries loaded into the same process, each would have an independent set of environment variables. So setting a variable from Rust wouldn't make it visible to the C code, which would break the article's usecase of configuring OpenSSL.

    The only real solution is to have the operating system provide a thread-safe way of managing environment variables. Windows does so; but in Linux that's the job of libc, which refuses to provide thread-safety.

  • do_not_redeem 7 hours ago

    The crash in the article happened when Python called C's getenv. Rust could very well throw away libc, but then it would also be throwing away its great C interop story. Rust can't force Python to use its own stdlib instead of libc.

  • kbolino 6 hours ago

    They did, it's called core. But it assumes no operating system at all, and environment variables require an operating system.

    • nomel 6 hours ago

      > and environment variables require an operating system

      Is that true? It's just a process global string -> string map, that can be pre-loaded with values before the process starts, with a copy of the current state being passed to any sub-process. This could be trivially implemented with batch processing/supervisory programs.

      • kbolino 5 hours ago

        Sure, there's a broader concept here, which doesn't require any operating system. But any alternate string->string map you define won't answer to C code calling getenv, won't be passed to child processes created with fork, won't be visible through /proc/$PID/environ, etc.

        • nomel 23 minutes ago

          This is the context:

          > They did, it's called core. But it assumes no operating system at all, and environment variables require an operating system.

          I think there's some confusion here. The C standard library is an abstraction layer that exists to implement standard behavior on hardware. It's entirely unrelated to the existence of an OS. Things like "/proc/$PID/environ" have nothing to do with C.

          There are many standard libraries, for embedded, that implement these things, like getenv, on bare metal [1].

          Standard C libraries exist to implement functionality. The standard does not define how to implement that functionality. That's the whole point of C: it's an abstraction that has very few requirements.

          The implementation of environment variables doesn't require an OS. If they made this "core", they could trivially implement the concept.

          [1] https://en.wikipedia.org/wiki/Newlib [2] getenv: https://sourceware.org/newlib/libc.html

      • panzi 5 hours ago

        Well, it's used by the OS when exec-ing a new process, but at least the Linux syscall for that takes the environment as an explicit parameter. So it could be managed in whatever way by the runtime until execve() is called.
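
        A minimal sketch of that idea, assuming a POSIX system: the parent builds a fresh envp array and hands it to posix_spawn(), so its own environment is never modified. The helper name run_with_env is hypothetical and not from any library.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <spawn.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>

/* Hypothetical helper: run `path` with an explicitly supplied environment.
 * envp is a NULL-terminated array of "NAME=value" strings built by the
 * caller; the caller's own environment is not touched.
 * Returns the child's exit status, or -1 on failure. */
int run_with_env(const char *path, char *const argv[], char *const envp[])
{
    assert(path != NULL && argv != NULL && envp != NULL);

    pid_t pid;
    if (posix_spawn(&pid, path, NULL, NULL, argv, envp) != 0)
        return -1;

    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;

    /* Report only normal exits; a child killed by a signal counts as failure. */
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

        The same envp array could be passed to execve() after fork(); either way no setenv() call is needed in the parent.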

      • sunshowers 5 hours ago

        Environment variables are not just technical, they're social. You need to get everyone on board with your scheme.

  • steveklabnik 6 hours ago

    Linux is an unusual platform in that it allows you to call into it via assembly. Most other platforms require you to go through libc to do so. It's not really in Rust's hands.

    • PaulDavisThe1st 5 hours ago

      This is not unusual at all. Windows allowed it for years before Linux came along. It was also true of some other *nix systems - IIRC, Ultrix (DEC) allowed this, and so did Dynix (Sequent).

      *BSD allows it too, or used to as of 2022.

      What is unusual about Linux is that it guarantees a syscall ABI, meaning that if you follow it, you can make a system call "portably" across "any" version of Linux.

      • steveklabnik 5 hours ago

        Sure, I’m speaking about platforms that are relevant today, not historical ones. Windows, MacOS, {Free,Open,Net}BSD, Solaris, illumos, none of these do.

        • eqvinox 4 hours ago

          It's quite easy to find out the actual situation on this since Go decided to do it their way. Last I checked, OpenBSD is the only OS where they go through libc, but I haven't really kept up.

          • steveklabnik 4 hours ago

            In my understanding, Go initially disregarded various platforms' rules here, and has ended up walking it back. I could be wrong though.

            It's hard to find good details here, but here's a mailing list thread from 2019 mentioning libc usage: https://groups.google.com/g/golang-nuts/c/uX8eUeyuuAY/m/Cfhl...

            > On Solaris (and Windows), and more recently in macOS as well we link with libc (or equivalent).

            > Go used to do raw system calls on macOS, and binaries were occasionally broken by kernel updates. Now Go uses libc on macOS.

            • PaulDavisThe1st 4 hours ago

              Yep, in 2022 it finally started using libc on *BSD too.

              But ... there's a difference between being able to do direct syscalls via asm, and them being portable across kernel versions, which is what this subthread was about.

              Granted, most people want version portability, but still on a technical level, it's not the same thing.

              • steveklabnik 3 hours ago

                No, my comment was about what APIs a platform considers to be their stable, external API. That you can technically call them anyway (except for ones like OpenBSD that actively check and prevent you) doesn't mean you're not doing something unsupported.

  • zanderwohl 7 hours ago

    It would be a tremendous amount of work, and would take years. Meanwhile, the problems are avoidable. It's not exactly the "rust way" to just remember and avoid problems, but everything in language design is compromises.

    • IshKebab 6 hours ago

      "Impossibru!!"

      https://github.com/sunfishcode/eyra

      Oh look:

      > Why use Eyra? It fixes Rust's set_var unsoundness issue. The environment-variable implementation leaks memory internally (it is optional, but enabled by default), so setenv etc. are thread-safe.
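
      To illustrate what "leaks memory so it is thread-safe" can mean, here is a rough sketch of a leak-on-overwrite variable store. This is not Eyra's or glibc's actual code; the leaky_* names are made up, and it is a private table rather than the real process environment. The point is only that a pointer returned by the getter stays valid forever, because a replaced value is never freed.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

struct leaky_var {
    char *name;
    char *value;             /* replaced on update, never freed */
    struct leaky_var *next;
};

static struct leaky_var *leaky_head;
static pthread_mutex_t leaky_lock = PTHREAD_MUTEX_INITIALIZER;

/* Copy a string with malloc; returns NULL on allocation failure. */
static char *copy_string(const char *s)
{
    size_t n = strlen(s) + 1;
    char *p = malloc(n);
    if (p != NULL)
        memcpy(p, s, n);
    return p;
}

/* Look up an entry by name. The caller must hold leaky_lock. */
static struct leaky_var *leaky_find(const char *name)
{
    for (struct leaky_var *v = leaky_head; v != NULL; v = v->next)
        if (strcmp(v->name, name) == 0)
            return v;
    return NULL;
}

int leaky_setenv(const char *name, const char *value)
{
    assert(name != NULL && value != NULL);
    char *copy = copy_string(value);
    if (copy == NULL)
        return -1;

    pthread_mutex_lock(&leaky_lock);
    struct leaky_var *v = leaky_find(name);
    if (v == NULL) {
        v = malloc(sizeof *v);
        char *name_copy = (v != NULL) ? copy_string(name) : NULL;
        if (name_copy == NULL) {
            pthread_mutex_unlock(&leaky_lock);
            free(v);
            free(copy);
            return -1;
        }
        v->name = name_copy;
        v->next = leaky_head;
        leaky_head = v;
    }
    /* The previous string is deliberately leaked: another thread may
     * still be reading it through a pointer from leaky_getenv(). */
    v->value = copy;
    pthread_mutex_unlock(&leaky_lock);
    return 0;
}

const char *leaky_getenv(const char *name)
{
    assert(name != NULL);
    pthread_mutex_lock(&leaky_lock);
    struct leaky_var *v = leaky_find(name);
    const char *value = (v != NULL) ? v->value : NULL;
    pthread_mutex_unlock(&leaky_lock);
    return value;
}
```

      The cost is one leaked string per overwrite, which is the trade-off being discussed in the replies.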

      • sunshowers 5 hours ago

        That only works on Linux though right?

      • kbolino 6 hours ago

        That's quite a trade-off

        • mmastrac 4 hours ago

          I think glibc made the same trade-off. It makes sense for most types of programs, but there's certainly a lot of classes of programs that wouldn't take it.

        • IshKebab 4 hours ago

          What is? Leaking memory? It's going to be a few kB at absolute most. Not an issue unless you are doing something very weird.

datadeft 7 hours ago

Couldn't we have a better pattern for this?

    if (__environ == NULL || name[0] == '\0')
      return NULL;

gavinhoward 6 hours ago

It is weird that I got this right before Rust did.

Because I use structured concurrency, I can make it so every thread has its own environment stack. To add to an environment, I duplicate it, add the new variable, and push the new environment on the stack.

Then I can use code blocks to delimit where that stack should be popped. [1]

This is all perfectly safe, no `unsafe` required, and can even extend to other things like the current working directory. [2]

IMO, Rust got this wrong 10 years ago when Leakpocalypse broke. [3]

[1]: https://git.yzena.com/Yzena/Yc/src/branch/master/tests/yao/e...

[2]: https://gavinhoward.com/2024/09/rewriting-rust-a-response/#g...

[3]: https://gavinhoward.com/2024/05/what-rust-got-wrong-on-forma...

  • mmastrac 6 hours ago

    This isn't _really_ a Rust problem. Rust is a victim of POSIX.

    If you have C FFI interop in Yao, there's still a chance you might have two C libraries cause a crash without your code even being involved.

    • gavinhoward 3 hours ago

      Except if there is dynamic linking, I can use that to inject my own setenv and getenv, just like people inject jemalloc or other malloc alternatives.

cuno 5 hours ago

We ended up overriding and replacing with our own thread-safe version years ago when we also hit this.

hauntsaninja 5 hours ago

We had so many of these issues that we ended up LD_PRELOAD-ing patched getenv / setenv / putenv.

  • msully4321 5 hours ago

    With a fixed implementation that leaks environments (like the one that just landed in glibc)?

jandrese 7 hours ago

Yet another person is burned by calling setenv() in a multi-threaded context. There really needs to be a big warning banner on the manpage for setenv() that warns about this because it seems like a far more common problem than you would expect.

  • umpalumpaaa 7 hours ago

    The man page says:

    > POSIX.1 does not require setenv() or unsetenv() to be reentrant.

    A non-reentrant function cannot be thread safe.

    In general (for POSIX, libc and many other libraries): if the docs do not explicitly say "this function is thread safe", it is not.

    • wmf 7 hours ago

      It's time to move beyond this attitude and make things safe by default. For example, Solaris has a safer version of setenv().

      "It is ridiculous that this has been a known problem for so long. It has wasted thousands of hours of people's time, either debugging the problems, or debating what to do about it. We know how to fix the problem." https://www.evanjones.ca/setenv-is-not-thread-safe.html

      • PaulDavisThe1st 5 hours ago

        One of the major differences between X Window and the win32 GUI APIs is that the windows one builds in thread safety, and it cannot be removed. This means that you pay the price of mutexes and the like (what the windows world likes to call "critical sections"), even if you have a single threaded GUI. X Window, on the other hand, decided to do nothing about threads at all, leaving it up to the application.

        30 years after these decisions were made, most sensible people do single threaded GUIs anyway (that is, all calls to the windowing API come from a single thread, and all redraws occur synchronously with respect to that thread; this does not block the use of threads functioning as workers on behalf of the GUI, but they are not allowed to make windowing API calls themselves).

        Consequently, the overhead present in the win32 API is basically just dead-weight, there to make sure that "things are safe by default".

        There's a design lesson here for everyone, though precisely what it is will likely still be argued about.

        • wmf 4 hours ago

          Yet 30 years later people are calling setenv()/getenv() from different threads even though "it is known" that it crashes. For whatever reason the lesson from GUIs doesn't apply here.

          • PaulDavisThe1st 4 hours ago

            Judging from a lot of the comments in this thread, the idea that there could even be parts of the *POSIX API* that are not thread-safe seems like an idea that hasn't even occurred to a lot of (younger?) programmers ...

      • cogman10 6 hours ago

        You can't.

        You could wrap setenv in a mutex, but that's not good enough. It can still be called from different processes, which means you'd need to do a more expensive and complex syncing system to make it safe.

        That balloons out to other env related methods needing to honor the synchronization primitive in order for there to be a semblance of safety.

        However, you still end up in a scenario where you can call

            setenv
            getenv
        
        and that would be incorrect because between the set and the get, even with mutexes properly in place and coordinated amongst different applications, you have a race condition where your set can be overwritten by another application's set before your get can run. Now, instead of actually making these functions safe you've buried the fact that external processes (or your own threads) can mess with env state.

        The solution is to stop using env as some sort of global variable and instead treat it as a constant when the application starts. Using setenv should be mostly discouraged because of these issues.

        • ryao 6 hours ago

          How does an external process mess with env state? As far as I know, you pass the environment when doing the execvpe() and then you cannot touch it from outside of the process anymore.

          • jenadine 6 hours ago

            You're correct. Parent comment is inaccurate. The problem is that a different library in the same process can use getenv without locking (or without locking the same lock as your code)

        • tsimionescu 6 hours ago

          Of course you can. Mutexes are system objects, so it's not a huge problem to sync across processes, if you really have to (is it really expected that one process can set env vars inside another process?).

          Making global state, especially state that has no reason to be modified or even read very often like the env, thread safe is a trivial issue, well studied and understood. Could an intern do it? Probably not. Could literally any maintainer of a standard C library? Easily.

          This is much more of a culture problem preventing such obvious flaws from being recognized as such.

          Side-note: your set-then-get example is a theoretical problem in search of a use case. Why would you ever want to concurrently set an env var and expect to be guaranteed to read that same value? And even if this is a real thing that applications really use, exposing a new function to sync anything on the env mutex is, again, trivial. So, if you really needed that, you could do

            lockenv
            setenv
            getenv
            unlockenv
          
          And problem solved.
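
          A sketch of what that could look like, assuming a POSIX system. lockenv() and unlockenv() do not exist in any libc; they are hypothetical here, and the scheme only protects callers that cooperate by taking the same lock. Code that calls getenv() or setenv() directly is not covered.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

static pthread_mutex_t env_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Hypothetical functions: no libc provides these. */
void lockenv(void)   { pthread_mutex_lock(&env_mutex); }
void unlockenv(void) { pthread_mutex_unlock(&env_mutex); }

/* Set name=value and read it back as one step with respect to every other
 * caller that also uses lockenv(). The value is copied into buf while the
 * lock is held, so the caller never keeps a pointer into the environment.
 * Returns 0 on success, -1 on failure or if buf is too small. */
int set_and_read_back(const char *name, const char *value,
                      char *buf, size_t buflen)
{
    assert(name != NULL && value != NULL && buf != NULL && buflen > 0);

    int rc = -1;
    lockenv();
    if (setenv(name, value, 1) == 0) {
        const char *seen = getenv(name);
        if (seen != NULL && strlen(seen) < buflen) {
            strcpy(buf, seen);
            rc = 0;
        }
    }
    unlockenv();
    return rc;
}
```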

          • kelnos 5 hours ago

            That doesn't solve anything. You could be using a library (perhaps a closed-source one) that doesn't use these hypothetical lockenv()/unlockenv() functions.

            This needs to be fixed inside libc, but there's no way to do so completely without breaking backward-compatibility.

          • sunshowers 5 hours ago

            That is a technical solution. What is your solution to the much more serious social problem of adding this check to every codebase in existence? What points of leverage do you have?

        • wmf 4 hours ago

          You didn't read the link, did you?

      • umpalumpaaa 7 hours ago

        I am not sure making things safe by default is a good idea. This always comes with a cost. That's also the reason why basic data types (array, dictionaries, etc) are generally not thread safe… because it's usually not needed or handled on a much higher level.

        It's a different story for languages/environments that are supposed to be safe by default and where you have language features that ensure safety (actors, optionals etc) but not for something like libc which has a standard it has to conform to and like 100 years of history.

        • dgrunwald 6 hours ago

          The problem with `setenv` is that people expect one process to have one set of environment variables, which is shared across multiple languages running in that process. This implies every language must let its environment variables be managed by a central language-independent library -- and on POSIX systems, that's libc. So if libc refuses to provide thread-safety, that impacts not just C, but all possible languages (except for those that cannot call into C-libraries; as those don't need to bother synchronizing the environment with libc).

          • PaulDavisThe1st 5 hours ago

            It's not just that "libc refuses to provide thread-safety" ... the POSIX standard specifies that these functions are non-reentrant.

        • tsimionescu 6 hours ago

          In some cases this is true. In the case of setting and getting env vars, it is not. There is no conceivable reason for making a process that spends any significant portion of its runtime calling setenv() or getenv(). Even if those calls were a thousand times slower than today, it would still be a non-issue.

    • jabl 6 hours ago

      > A non-reentrant function cannot be thread safe.

      Actually, a non-reentrant function can be thread-safe. A common example of such a function in libc being malloc().
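
      A small sketch of the distinction, assuming "reentrant" is used in the sense of "safe to re-enter on the same thread, e.g. from a signal handler" (other definitions exist). The function next_id is made up for illustration: it is safe to call from many threads, but not from a signal handler that might interrupt it.

```c
#include <assert.h>
#include <pthread.h>

static pthread_mutex_t counter_lock = PTHREAD_MUTEX_INITIALIZER;
static long counter;

/* Thread-safe: concurrent callers are serialized by the mutex.
 * Not reentrant: if a signal handler interrupts this function while the
 * lock is held and calls it again on the same thread, the second call
 * waits for a default (non-recursive) mutex that its own thread already
 * owns, which can deadlock. */
long next_id(void)
{
    int rc = pthread_mutex_lock(&counter_lock);
    assert(rc == 0);
    (void)rc;                 /* keeps NDEBUG builds warning-free */

    long id = ++counter;

    pthread_mutex_unlock(&counter_lock);
    return id;
}
```

      malloc() in common libcs is in a similar position: documented as thread-safe, but not on the POSIX list of async-signal-safe functions.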

      • adrian_b 6 hours ago

        By definition, a "reentrant function" is a function that may be invoked even when it has not returned yet from a previous invocation.

        So a non-reentrant function is a function that may not be invoked again between a previous invocation and returning from that invocation.

        When a function may be invoked from different threads, then it is certain that sometimes it will be invoked by a thread before returning from a previous invocation from a different thread.

        Therefore any function that may be invoked from different threads must be reentrant. Otherwise the behavior of the program is unpredictable. Reentrant functions may be required even in single-thread programs, when they may be invoked recursively, or they may be invoked by signal handlers.

        An implementation of "malloc" may be reentrant or it may be non-reentrant.

        Old "malloc" implementations were usually non-reentrant because they used global variables for managing the heap. Such "malloc" functions could not be used in multi-threaded programs.

        Modern "malloc" implementations are reentrant, either by using only thread-local storage or by using shared global variables to which some method for concurrent access is implemented, e.g. with mutual exclusion.

        • tedunangst 6 hours ago

          Who has a signal safe malloc?

          • adrian_b 4 hours ago

            POSIX does not require malloc to be signal safe.

            Therefore I do not think that anyone has bothered to implement a signal-safe malloc, as this is likely to be complicated.

            Allocating memory in a signal handler makes no sense in a well designed program, so not being allowed to use malloc and related functions is not a problem.

      • sumtechguy 6 hours ago

        I could be wrong but isn't that because each thread has its own heap?

wakawaka28 5 hours ago

Sounds like you just didn't know it's not threadsafe. This is common knowledge in the C and C++ world.

einpoklum 5 hours ago

A function which sets global process state is not thread safe? Why, I'm shocked; shocked and chagrined.

But really, I don't understand why a sensitive security-related library would implicitly use an unsafe function like setenv().

  • bangaladore 4 hours ago

    > A function which sets global process state is not thread safe? Why, I'm shocked; shocked and chagrined.

    This is an oversimplification. Windows has essentially the exact same API and it works just fine in multithreaded contexts.

    The issue here is unix allows the underlying pointer to be accessed, bypassing any possible thread-safe APIs.
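
    A short sketch of that exposure, assuming a POSIX system. The helper find_environ_entry is hypothetical; it walks the public environ array directly, with no lock that a libc could enforce. On common implementations getenv() typically returns a pointer into the same "NAME=value" string, which is why a concurrent setenv() can pull the storage out from under a reader.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>
#include <string.h>

extern char **environ;

/* Return the raw "NAME=value" entry for `name` straight from the environ
 * array, or NULL if it is absent. Nothing here takes a lock, and nothing
 * in libc can force a caller like this to take one. */
char *find_environ_entry(const char *name)
{
    assert(name != NULL);
    if (environ == NULL)
        return NULL;

    size_t len = strlen(name);
    for (char **ep = environ; *ep != NULL; ep++) {
        if (strncmp(*ep, name, len) == 0 && (*ep)[len] == '=')
            return *ep;
    }
    return NULL;
}
```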

forrestthewoods 7 hours ago

Mutable global state is evil. Friends don’t let friends use mutable global state.

I hate envvars. It’s “the Linux way”. I avoid them like the plague. A++ strong recommend.

libc is terrible. The world needs to move on.

  • 01HNNWZ0MV43FF 7 hours ago

    Env vars are good if you treat them as read-only within the process

    • msully4321 7 hours ago

      Yeah, setenv should probably just not exist, and environment variables should be only set when spawning new processes.

      • plorkyeran 6 hours ago

        The problem is that applications sometimes need to set environment variables which will be read by libraries in the same process. This is safe to do during startup, but at no later times.

        Ideally all libraries which use environment variables should have APIs allowing you to override the env variables without calling setenv(), but that isn't always the case.

        • docandrew 6 hours ago

          I’d argue that libraries shouldn’t read environment variables at all. They’re passed on the initial program stack and look just like stack vars, so the issue here is essentially the same as taking the address of a stack variable and misusing it.

          Just like a library wouldn’t try to use argv directly, it shouldn’t use envp either (even if done via getenv/setenv)

        • kelnos 5 hours ago

          > The problem is that applications sometimes need to set environment variables which will be read by libraries in the same process. This is safe to do during startup, but at no later times.

          No, the problem is that libraries try to do this at all. Libraries should just have those APIs you mention, and not touch env vars, period. If you, the library user, really want to use env vars for those settings, you can getenv() them yourself and pass them to the library's APIs.
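
          Roughly what that split could look like, as a sketch. The library "libfoo", its foo_config struct and both function names are invented for illustration. The library only accepts explicit settings; the application decides whether an environment variable is the source.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical library "libfoo": configuration arrives through an explicit
 * API, and the library itself never calls getenv() or setenv(). */
struct foo_config {
    char cert_file[256];
};

/* Library side. Returns 0 on success, -1 if path is NULL or too long. */
int foo_config_set_cert_file(struct foo_config *cfg, const char *path)
{
    assert(cfg != NULL);
    if (path == NULL || strlen(path) >= sizeof cfg->cert_file)
        return -1;
    strcpy(cfg->cert_file, path);
    return 0;
}

/* Application side: read the environment once, ideally at startup before
 * any other threads exist, and pass the value on. `fallback` is used when
 * the variable is not set. */
int app_configure_foo(struct foo_config *cfg, const char *env_name,
                      const char *fallback)
{
    assert(env_name != NULL);
    const char *path = getenv(env_name);
    return foo_config_set_cert_file(cfg, path != NULL ? path : fallback);
}
```

          An application that wants the SSL_CERT_FILE behavior would call app_configure_foo(&cfg, "SSL_CERT_FILE", some_default) itself, and nothing in the process would need to call setenv().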

          Obviously we can't change history; there are libraries that do this anyway. But we should encourage library authors to (in the future) pretend that env vars don't exist.

          • plorkyeran 3 hours ago

            The place where it makes sense for a library to read environment variables is where the program is not written to use that specific library. For example, I can link a program whose author has never heard of TCMalloc against TCMalloc rather than the system malloc, and then configure TCMalloc via environment variables. This does not require modifying a single line of code, while manually forwarding configuration onto the allocator would. Another common example is configuring sanitizers. Not having to do anything other than pass another command-line switch to the compiler is one of the things that makes them really painless to use.

            I do think you'd be hard-pressed to find a situation where a program calling setenv() to configure a library actually makes sense. It's a pretty strong sign that someone made a bad decision. People will, however, make mistakes in API design.

          • PaulDavisThe1st 5 hours ago

            If env vars don't exist, that makes it much harder (and more likely impossible) for users to modify library/application behavior at run time.

            I agree with you that it would be much better if, when libA needs to set behavior Foo in libB, it called libB:setBehavior (Foo) rather than setenv ("LibBehavior", "Foo")

            But let's not throw the baby out with the bathwater.

        • msully4321 6 hours ago

          Yeah, the cows have certainly gotten out already.

    • forrestthewoods 3 hours ago

      I’ll take a config file over an envvar 100% of the time.

  • maep 7 hours ago

    > Mutable global state is evil. Friends don’t let friends use mutable global state.

    Throw away your CPU and RAM then.

    • incrudible 2 hours ago

      Your CPU has an MMU in order to (among other things) let the OS prevent mutable global state.

    • titzer 7 hours ago

      And disks. And the cloud. Or basically, you know, computers.

      • kibwen 6 hours ago

        Don't threaten me with a good time.

      • layer8 7 hours ago

        The universe, you mean.

      • incrudible 2 hours ago

        Ah yes, the cloud where we all happily share compute resources without any restrictions to avoid stomping on each other's toes.

    • forrestthewoods 4 hours ago

      I can not possibly roll my eyes hard enough.

      Go ahead and write lots of mutable global statics. But when your program crashes randomly and you need my help to debug and it is, once again, a global mutable then you have to perform a walk of shame.

  • sim7c00 7 hours ago

    what do you suggest as alternative?

    the problem is not linux, not mutable global state or resources and not libc.

    the problem is not getting time at work to do things properly. like spotting this in GDB before the issue hit, because your boss gave you time to tirelessly debug and reverse your code and anything it touches....

    there is too much money in halfbaked code. sad but true.

    • viraptor 6 hours ago

      It definitely is the current libc. That one's proven by systems which do not have the same problem. Then the next layer problem is trying to pretend we can get everyone to pay attention and avoid bugs in code instead of forcing interfaces and implementations where those bugs are not possible.

  • glouwbug 7 hours ago

    libc moved the world into the Information Age

    • kibwen 6 hours ago

      In the same way that Yersinia pestis moved the world into the Renaissance?

      • glouwbug 6 hours ago

        Yes, neither were memory or thread safe

  • jimbob45 7 hours ago

    What's your preferred alternative?

lopkeny12ko 7 hours ago

The whole point of Rust is memory safety, not thread safety...

  • masklinn 6 hours ago

    Rust literally bakes data race safety into the language. While it does not resolve general race conditions, thread safety issues which cause memory unsafety (which a UAF or dangling pointer would be) are very much within its remit.