LLVM Bugzilla is read-only and serves as the historical archive of all LLVM issues filed before November 26, 2021. Use GitHub to submit LLVM bugs.

Bug 42540 - "version" include wreaks havoc on case-insensitive filesystems
Status: RESOLVED WONTFIX
Alias: None
Product: libc++
Classification: Unclassified
Component: All Bugs
Version: unspecified
Hardware: All, OS: All
Importance: P normal
Assignee: Unassigned Clang Bugs
Reported: 2019-07-08 09:31 PDT by Quentin Smith
Modified: 2021-09-20 13:21 PDT

Description Quentin Smith 2019-07-08 09:31:59 PDT
- libc++ >= 7.0 has an include file called `version` (as part of C++20)
- Many open source packages contain a file called `VERSION` (this is a standard file in autoconf-based distributions)
- On a case-insensitive filesystem, #include <version> finds `VERSION` and hilarity ensues.
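
The failure described in the list above can be modelled with a small sketch. This is a toy model of header search, not Clang's actual lookup code: the function name `resolve_include`, the directory names, and the file listings are all hypothetical, and it assumes the project root is on the include path ahead of the libc++ directory.

```python
# Hypothetical model of header search; not Clang's real implementation.
def resolve_include(name, search_dirs, case_insensitive):
    """Return (directory, filename) of the first match for `name`, or None.

    `search_dirs` is an ordered list of (directory, filenames) pairs that
    stands in for the -I directories followed by the system directories.
    """
    for directory, filenames in search_dirs:
        for filename in filenames:
            if case_insensitive:
                matched = filename.lower() == name.lower()
            else:
                matched = filename == name
            if matched:
                return directory, filename
    return None


# Assumed layout: the project root is passed with -I, so it is searched
# before the libc++ header directory.
SEARCH_DIRS = [
    ("project-root", ["config.h", "VERSION", "main.cpp"]),
    ("libcxx-include", ["mutex", "optional", "version"]),
]

print(resolve_include("version", SEARCH_DIRS, case_insensitive=False))
# ('libcxx-include', 'version')
print(resolve_include("version", SEARCH_DIRS, case_insensitive=True))
# ('project-root', 'VERSION')
```

With exact-case matching the lookup reaches the real header; with case-insensitive matching it stops at the project's `VERSION` file, which is not C++ at all.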

I'm really not sure what the best fix would be. Asking thousands of open-source projects to rename their `VERSION` files seems like a huge challenge. Asking clang to be case-sensitive even when the underlying filesystem is not seems like a non-starter. libc++ could at least avoid including <version> within the standard library itself, but that just postpones the pain until other libraries start including it.

Is it too late to raise this as a standards issue?
Comment 1 Marshall Clow (home) 2019-07-08 09:53:53 PDT
We had this discussion back when the patch was landed (last September)
Discussion starts here: http://lists.llvm.org/pipermail/cfe-commits/Week-of-Mon-20181001/245438.html

Jonathan Wakely (the head libstdc++ maintainer) summarized it thus:

> Exactly the same issue exists with every new header, e.g.
> <mutex> in C++11 and <optional> in C++17. Keep extension-less files
> out of your header paths.
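
One way to act on that advice is to check the files in an include directory against the extension-less standard header names. The following is a minimal sketch, not an existing tool: `shadowing_names`, the `STANDARD_HEADERS` subset, and the sample listing are assumptions made for illustration.

```python
import os

# Illustrative subset of extension-less standard header names (assumption:
# extend this set to cover the headers your toolchain ships).
STANDARD_HEADERS = {"mutex", "optional", "version"}


def shadowing_names(filenames, headers=STANDARD_HEADERS):
    """Return the extension-less names that match a standard header name
    when compared case-insensitively."""
    hits = []
    for name in filenames:
        stem, ext = os.path.splitext(name)
        if not ext and stem.lower() in headers:
            hits.append(name)
    return sorted(hits)


# Hypothetical listing of a directory that is passed to the compiler with -I.
print(shadowing_names(["config.h", "VERSION", "Makefile", "optional.txt"]))
# ['VERSION']
```

To check a real directory, pass `os.listdir(path)` as the listing. A file literally named `version` is reported as well, since it shadows the header even on a case-sensitive filesystem.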


Side note: Even if we "fixed" clang to be case-sensitive when the underlying filesystem is not, all this code would still break for gcc users.

I remember this discussion in the committee when this was being debated, and they still decided that "version" was the right name for this header.
Comment 2 Quentin Smith 2019-07-08 10:06:26 PDT
The difference between "mutex", "optional", and "version" is that "VERSION" is far more prevalent than either of the other two. Did the committee specifically discuss this in the context of case-insensitive filesystems?
Comment 3 James Y Knight 2019-07-08 10:26:30 PDT
I think the new thing here is that it's not just a potential conflict, it's an _actual_ conflict with a common software package layout.

Introducing any other headers is always theoretically the same issue, but there were just not many people with those filenames lying around in the way, while there are many such packages that contain a file named "VERSION" -- and put their root directory on the include path (ill-advised as that may be...).

So, I think it was likely a mistake to choose the name "version" for a standard header, given the popularity of that filename. Similarly, C++ should avoid using other such common suffix-less filenames, like "authors", "copying", "license", "maintainers" in the future.

And when the initial reports of issues started coming in, it probably should've been taken more seriously.

However, by now, it seems like it might be too late to fix this in practice, even if not in theory. While C++20 isn't yet standardized and maybe could be changed, releases of Clang have already shipped with this 'version' header, and software is thus already broken, and needing to be patched -- even if it doesn't use C++20.

By the time the standard could be changed, and clang changed, and a new clang release deployed, enough software might already have been fixed that it may not even be worth changing anymore...

The answer might depend on just how big a problem this is -- is only a relatively small (even if large enough to be notable) number of packages affected overall? Or is this an extremely widespread issue? How many packages does this _actually_ affect?
Comment 4 Werner Lemberg 2019-07-08 12:17:25 PDT
My guess is a few hundred, some of them quite popular (like groff or LilyPond).  For programs using automake or autoconf it is quite common to have a top-level VERSION file.  The very problem is that a configure script creates a file `config.h` in the top-level directory, too, to be included by virtually all source files.  This is a guaranteed recipe for trouble...
Comment 5 Werner Lemberg 2019-07-08 12:18:16 PDT
I mean, a few hundred popular packages, some of them *very* popular.
Comment 6 Marshall Clow (home) 2019-07-08 14:31:02 PDT
Some more info:

We shipped a <version> header in LLVM 7 (September 2018) and LLVM 8 (March 2019).
GCC shipped theirs in GCC 9.1 (May 2019).
MSVC will be shipping their implementation in VS 2019 16.2 (real soon now).

Casey found a record of the discussion in LEWG (in ABQ), and they didn't record any discussion about conflicts with existing names, but "version" was overwhelmingly the choice that they wanted.
Comment 7 Marshall Clow (home) 2019-07-08 14:32:04 PDT
I searched my hard disk for files named VERSION.

There were 50 of them. Only one of them was in a place that could be on a header include path, and that was a "maybe".
Comment 8 Quentin Smith 2019-07-08 19:44:28 PDT
"VERSION" files are typically part of source packages, not binary packages. It's not surprising you don't see them installed in include paths - you need to look at the source packages for your system. That's where VERSION is likely in the same directory as config.h.
Comment 9 Anders Kaseorg 2019-07-09 00:17:08 PDT
FreeBSD had to patch 21 packages back in March to work around the conflict with (lowercase) ‘version’ files:

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=236192
Comment 10 Quentin Smith 2019-07-09 09:08:39 PDT
I managed to quantify the impact of this using Debian source packages. (Of course, Debian is not typically used with a case-insensitive filesystem, but it's an easy way to get $large_collection_of_packages.)

At a first cut of estimating the size of the problem, 227 Debian buster source packages contain a file named "version", and a further 991 Debian buster source packages contain a file named "VERSION" or a different mixed case. (This is only packages that ship the file directly in their source; some packages also generate these files at build time and would not be counted in this.)

Of the 1218 Debian source packages that contain a file named "version" (with some case), 392 contain C++ code by file extension (".cpp", ".c++", ".C", etc.). Of *those*, 217 are definitely affected because they have a "config.h" in their root directory, plus some number of the remainder whose include path is not so easily determined from a file listing.
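
The filtering steps above can be sketched as a classifier over a package's file listing. This is a reconstruction under stated assumptions, not the script actually used for the Debian survey: the function name `classify`, the category labels, and the ".cc" and ".cxx" extensions (standing in for "etc.") are hypothetical.

```python
import posixpath

# ".cpp", ".c++" and ".C" come from the comment; ".cc" and ".cxx" are
# assumed stand-ins for its "etc.".
CXX_EXTENSIONS = {".cpp", ".c++", ".C", ".cc", ".cxx"}

# Sanity check on the quoted figures: exact-case plus other-case packages.
assert 227 + 991 == 1218


def classify(paths):
    """Classify one source package from its file listing.

    `paths` are paths relative to the package root, using "/" separators.
    """
    has_version = any(posixpath.basename(p).lower() == "version" for p in paths)
    # The extension test is case-sensitive on purpose: ".C" is C++, ".c" is C.
    has_cxx = any(posixpath.splitext(p)[1] in CXX_EXTENSIONS for p in paths)
    has_root_config = "config.h" in paths
    if not has_version:
        return "no version file"
    if not has_cxx:
        return "version file, no C++"
    if has_root_config:
        return "definitely affected"
    return "undetermined"


print(classify(["VERSION", "config.h", "src/main.cpp"]))
# definitely affected
```

Like the description it follows, the sketch accepts a "version" file anywhere in the package; requiring it to sit next to "config.h" in the root would be a stricter test.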

I know LLVM 7.0 has been shipping since September, but only now is it really starting to be picked up by distributions and these conflicts are being discovered.
Comment 11 Marshall Clow (home) 2019-07-09 15:54:16 PDT
(In reply to Quentin Smith from comment #10)
> 
> I know LLVM 7.0 has been shipping since September, but only now is it really
> starting to be picked up by distributions and these conflicts are being
> discovered.

That suggests (to me) that whatever we do now is more or less irrelevant, since by the time that distributions pick up a hypothetical newer version of clang that has this "fixed", all the projects that people care about will have been updated to no longer have this problem.
Comment 12 Quentin Smith 2019-07-10 14:22:16 PDT
(In reply to Marshall Clow (home) from comment #11)
> (In reply to Quentin Smith from comment #10)
> > 
> > I know LLVM 7.0 has been shipping since September, but only now is it really
> > starting to be picked up by distributions and these conflicts are being
> > discovered.
> 
> That suggests (to me) that whatever we do now is more or less irrelevant,
> since by the time that distributions pick up a hypothetical newer version of
> clang that has this "fixed", all the projects that people care about will
> have been updated to no longer have this problem.

I think this is a false equivalency; distros take bug fixes and feature enhancements on different schedules. Granted, the fix for this problem is likely to not be a trivial patch, but I think you should leave that decision up to distros and not presuppose that they will not be able to take an upstream fix.
Comment 13 Louis Dionne 2021-09-20 13:21:12 PDT
I'm going to close this.

When the C++ Standards Committee tells us to add a header named `<version>`, we do it. IMO, that's a problem that should have been fixed at the source, when <version> was proposed, not after the fact.