Empty symlinks and full POSIX compliance
Symbolic links are a mechanism to make one pathname be an alias for another. One would think that there would be little value in an empty symbolic link — one where the destination pathname is the empty string — but that doesn't keep people from trying to create such links, an act that the Linux virtual filesystem layer does not allow. It turns out that its refusal to allow the creation of empty symbolic links puts Linux out of compliance with the POSIX standard. The real question, though, might be: how much does that really matter?
Pointing to the void
It turns out that many Unix-based systems will happily allow a command like:
ln -s "" link-to-nothing
On a Linux system, though, that command will fail with a "No such file or directory" error. This is, as was pointed out in a bug report last January, a somewhat confusing message. If the empty string is replaced by the name of a nonexistent file, no such error results. In other words, for most cases, the lack of an existing file is not a concern. So it seems strange that Linux would gripe about "no such file" in the empty string case.
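The contrast can be checked directly against the symlink() system call. The following is a minimal sketch, assuming Python's os.symlink() as a thin wrapper around that call; the helper name try_symlink is invented for illustration, and the result for the empty target depends on the system (ENOENT is what the article reports for Linux).

```python
import errno
import os
import tempfile


def try_symlink(target: str, link_path: str) -> str:
    """Attempt symlink(target, link_path) and report the outcome as a string."""
    try:
        os.symlink(target, link_path)
    except OSError as exc:
        # Report the symbolic errno name (e.g. "ENOENT") instead of raising.
        return errno.errorcode.get(exc.errno, str(exc.errno))
    return "created"


with tempfile.TemporaryDirectory() as tmp:
    # A target that does not exist: creation is expected to succeed.
    print("dangling target:", try_symlink("/no-such-file", os.path.join(tmp, "dangling")))
    # The empty target: refused on Linux, accepted on some other systems.
    print("empty target:   ", try_symlink("", os.path.join(tmp, "link-to-nothing")))
```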
As part of the ensuing discussion, it turned out that the POSIX standard was not consistent with how empty symbolic links (that already exist in the filesystem) are handled in existing systems. Solaris systems will, when such a link is dereferenced, treat it as a link to the current directory; essentially, an empty link is treated as if it were "." instead. BSD systems respond differently: they take the position that no such file can exist and duly return the "no such file" error. Neither of those responses was compliant with POSIX, a problem which was only fixed in early May, when the standard was updated to allow either the Solaris or the BSD behavior. The result is a standard that explicitly says one cannot know how the system will resolve an empty symbolic link; they might work, or they might not.
How Linux handles an attempt to resolve an empty symbolic link that already exists within a filesystem is not well defined. Some of the work of link resolution is pushed down into filesystem-specific code, so the behavior may depend on which filesystem type is in use. It is hard to test because, as mentioned above, Linux does not allow the creation of empty symbolic links, so they can only come by way of a filesystem from another system. But, in general, an attempt to resolve an empty symbolic link can be expected to return a "No such file or directory" response.
The refusal to create an empty symbolic link, as it turns out, is contrary to how POSIX thinks the symlink() system call should work. The standard text says explicitly that the target string "shall be treated only as a character string and shall not be validated as a pathname". Empty strings are valid character strings, and the implementation is not allowed to care that they cannot be the name of a real file, so, by the standard, the creation of such a symbolic link should be allowed.
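For non-empty targets, the "character string" rule is easy to observe: whatever is stored comes back unchanged from readlink(), whether or not it names a file. A small sketch, assuming Python's os.symlink() and os.readlink(); the helper name roundtrip and the sample strings are illustrative only.

```python
import os
import tempfile


def roundtrip(target: str) -> str:
    """Create a symlink holding `target` and return what readlink() reports."""
    with tempfile.TemporaryDirectory() as tmp:
        link = os.path.join(tmp, "link")
        os.symlink(target, link)   # the target need not name a real file
        return os.readlink(link)   # the stored string comes back verbatim


for text in ("/no-such-file", "user@example.com", "not a path at all"):
    assert roundtrip(text) == text
    print(repr(text), "->", repr(roundtrip(text)))
```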
Back in January, Pádraig Brady posted a patch enabling the creation of empty symbolic links. The patch did not generate much interest at that time. He followed up in May after the standard had been updated; this time Al Viro expressed his feelings on the matter:
And that is pretty much where the discussion stopped.
Linux and POSIX compliance
That said, it would not be entirely surprising if such a patch were to make it into the kernel at some point. The cost of enabling the creation of empty symbolic links is essentially zero, and adding that capability would bring Linux a little closer to POSIX compliance. But true POSIX compliance, which is a function of both the kernel and the low-level libraries that sit above it, still seems like a distant goal for Linux distributions as a whole.
As a Unix-like system, Linux is not that far removed from compliance with the POSIX standard. Linux developers normally try to adhere to such standards when it makes sense to do so, but they generally feel no need to apply changes that, in their opinion, do not make technical sense just because a standard document calls for it. The reaction to the creation of empty symbolic links is a case in point. The value of closer adherence to POSIX is not seen as being high enough to justify the addition of a "feature" that seems nonsensical.
The value question is an interesting one. Getting certified to the point where one can use the POSIX trademark is a matter of passing the verification test suite, applying for certification, and handing over a relatively small amount of money as described in the fee schedule [PDF]. An enterprise Linux distributor wishing to claim POSIX compliance could almost certainly attain this certification in a relatively short period of time with an investment that would be far smaller than was required for, say, Common Criteria security certification. Carrying some non-mainline patches to the kernel and C library would likely be necessary, but enterprise distributors have generally shown little reluctance to do that when it suits their interests.
But there are no POSIX-certified Linux distributions on the market now. As far as your editor can tell, the only time a distribution has achieved that certification was when Linux-FT claimed it in 1995. That work was (or was not, depending on which side of the argument you listen to) acquired by Caldera shortly thereafter; Caldera, too, intended to achieve POSIX certification. That certification does not appear to have happened, and Caldera, of course, followed its own unhappy path to its doom. Since then, corporate interest in POSIX certification for Linux has been subdued, to say the least.
One can only conclude that the commercial value of a 100% certified POSIX-compliant distribution is not enough to justify even a relatively small level of effort. If distributors were losing business due to the lack of certification, they would be doing something about it. But, it seems, "almost POSIX" is good enough for users, especially in an era where Linux is the preferred platform for many applications.
POSIX still has its place; it sets the expectations for the low-level interface provided by the operating system and helps to ensure compatibility. But, increasingly, most current development work is outside of the scope of POSIX. The standard cannot hope to keep up with the changes being made to Linux at the kernel level and above. We live in a fast-changing world where, in many cases, "what does Linux do?" is the real standard. The developers who are busily pushing Linux forward have little time or patience for working toward complete POSIX compatibility when the interesting problems are elsewhere, so a fully POSIX-compliant distribution seems unlikely to show up in the near future.
Empty symlinks and full POSIX compliance
Posted May 23, 2013 6:56 UTC (Thu) by ogj (guest, #3024) [Link]
Empty symlinks and full POSIX compliance
Posted May 23, 2013 8:48 UTC (Thu) by Cyberax (✭ supporter ✭, #52523) [Link]
Empty symlinks and full POSIX compliance
Posted May 23, 2013 17:35 UTC (Thu) by wahern (subscriber, #37304) [Link]
In fact, in this case the issue was reported in January of this year and resolved in a few months.
The Linux kernel team is unique in their indifference toward the standard. It's usually companies (Red Hat in particular, because of its stewardship of glibc) who "represent" Linux. The BSD teams have no problem keeping POSIX compliance as a goal while continuing to evolve their own proprietary interfaces.
Perhaps if Linux kernel developers were more engaged a more satisfactory outcome would have resulted.
Empty symlinks and full POSIX compliance
Posted May 23, 2013 17:44 UTC (Thu) by Cyberax (✭ supporter ✭, #52523) [Link]
Sure, reasonable compatibility with POSIX is required (for process control, sockets, etc.) but trying to conform to all of the requirements? Nah, not worth it.
Empty symlinks and full POSIX compliance
Posted May 23, 2013 20:43 UTC (Thu) by dlang (guest, #313) [Link]
Actually, most of the POSIX interfaces work just fine. You are using them all over the place without realizing it.
However, there are some POSIX standards that are broken or just plain 'odd'. These are mostly historical accidents.
You have to remember, POSIX was not a standard dreamed up by Ivory Tower folks, it was a standard created to document what is actually in use to limit/prevent fragmentation between implementations.
This does mean that POSIX is always going to lag what's implemented, because it requires that multiple people implement something before they will standardize it (specifically to avoid Ivory Tower type problems).
There is value in being POSIX compliant, but you get 99.999%+ of the value by being "almost" POSIX compliant the way Linux is.
Empty symlinks and full POSIX compliance
Posted May 23, 2013 21:31 UTC (Thu) by rgmoore (✭ supporter ✭, #75) [Link]
There is value in being POSIX compliant, but you get 99.999%+ of the value by being "almost" POSIX compliant the way Linux is.
It seems to me that refusing to follow the POSIX behavior when it's really stupid, as Linux is doing by refusing to create symlinks whose target is the empty string, probably gives you slightly more value than rigid adherence to the standard would.
Empty symlinks and full POSIX compliance
Posted May 23, 2013 22:37 UTC (Thu) by dlang (guest, #313) [Link]
Empty symlinks and full POSIX compliance
Posted Jun 2, 2013 0:40 UTC (Sun) by yeti-dn (guest, #46560) [Link]
So the question is not whether to change nice Linux behaviour to stupid POSIX behaviour, but whether to swap one kind of stupid behaviour for another. If one of them is standard-compliant it should be the clear winner.
Empty symlinks and full POSIX compliance
Posted Jun 3, 2013 8:53 UTC (Mon) by micka (subscriber, #38720) [Link]
The change is just not worth it, and if you must have a stupid behaviour, it's better to keep the one you have.
It would be different to change to a *less* stupid behaviour (even non standard-compliant) like changing the 'no such file or directory' to 'symlink to empty name not allowed'. Not compliant but sure less stupid and confusing.
Empty symlinks and full POSIX compliance
Posted Jul 20, 2013 1:49 UTC (Sat) by kirv (guest, #89016) [Link]
$ ln -s /no-such-file foo
$ ln -s "" bar
ln: creating symbolic link `bar' -> `': No such file or directory
Since the non-existence of the target file is not a problem, what 'file or directory' is being referred to?
I have no particular use in mind for empty symlink targets, and just ran into this case while working up some test scripts. I do have use for arbitrary strings in the symlink target, and would think it preferable to allow the empty string there just for consistency, but at least the error message should be meaningful.
The whole POSIX question is beyond me, but I don't think any special handling should be necessary for empty symlinks. Just let them resolve as they are, e.g., so maybe cat bar or ls bar would look like:
$ cat ""
cat: : No such file or directory
$ ls ""
ls: cannot access : No such file or directory
Best solved in the C-library?
Posted May 23, 2013 8:58 UTC (Thu) by shalem (subscriber, #4062) [Link]
Best solved in the C-library?
Posted May 23, 2013 13:33 UTC (Thu) by corbet (editor, #1) [Link]
Because then if you do, say, readlink() on the link you get something other than what you stored there. That, too, has standards-compliance issues.
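The round-trip problem can be illustrated with a hypothetical library-level wrapper. This is only a sketch: the wrapper name and the choice of "." as a placeholder are assumptions made for the example, not a description of any actual proposal or C library behavior.

```python
import os
import tempfile

PLACEHOLDER = "."  # assumption: a stand-in the wrapper would store instead of ""


def wrapped_symlink(target: str, link_path: str) -> None:
    """Hypothetical wrapper: rewrite an empty target before calling symlink()."""
    os.symlink(target if target else PLACEHOLDER, link_path)


with tempfile.TemporaryDirectory() as tmp:
    link = os.path.join(tmp, "link")
    wrapped_symlink("", link)
    stored = os.readlink(link)
    # The caller asked for "", but readlink() reports the placeholder instead.
    print("requested:", repr(""), "read back:", repr(stored))
```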
Empty symlinks and full POSIX compliance
Posted May 23, 2013 12:41 UTC (Thu) by etienne (guest, #25256) [Link]
I.e. the read-only backing store contains a file named "dummy"; the read/write overlay receives a "delete dummy" request, so the overlay removes "dummy" by creating an empty "dummy" symlink.
Empty symlinks and full POSIX compliance
Posted May 28, 2013 20:18 UTC (Tue) by ms-tg (subscriber, #89231) [Link]
Empty symlinks and full POSIX compliance
Posted May 29, 2013 9:04 UTC (Wed) by etienne (guest, #25256) [Link]
Did not read the source code, but extracted from:
http://lwn.net/Articles/325126/
-------
If a file is deleted which exists at the bottom layer, a so-called
whiteout file with the same name is created at the top layer. Users
never get to see this file; it is not included in readdir results, and
trying to open it fails with errno == ENOENT. If a file with the same
name is later created, this file replaces the whiteout.
-------
readdir() could forget to list empty symlinks, but anyway if the file is listed in readdir() trying to open it would look like the file did exist but has now disappeared - i.e. has been removed since readdir() was initialised.
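The whiteout semantics quoted above can be modeled in a few lines. This is a toy in-memory sketch, not the union-mount code: the class and method names are invented, and whiteouts are kept in a plain set, where the suggestion in this thread would store each one on disk as an empty symlink.

```python
import errno


class UnionDir:
    """Toy model of a two-layer union directory with whiteouts."""

    def __init__(self, bottom: dict):
        self.bottom = dict(bottom)   # read-only lower layer: name -> contents
        self.top = {}                # writable upper layer: name -> contents
        self.whiteouts = set()       # names deleted from the lower layer

    def readdir(self) -> list:
        # Whiteouts are never listed; they only hide lower-layer names.
        return sorted(set(self.top) | (set(self.bottom) - self.whiteouts))

    def open(self, name: str) -> str:
        if name in self.top:
            return self.top[name]
        if name in self.bottom and name not in self.whiteouts:
            return self.bottom[name]
        raise FileNotFoundError(errno.ENOENT, "No such file or directory", name)

    def unlink(self, name: str) -> None:
        if name not in self.readdir():
            raise FileNotFoundError(errno.ENOENT, "No such file or directory", name)
        self.top.pop(name, None)
        if name in self.bottom:
            self.whiteouts.add(name)   # hide the lower-layer copy

    def create(self, name: str, data: str) -> None:
        self.whiteouts.discard(name)   # a new file replaces the whiteout
        self.top[name] = data


d = UnionDir({"dummy": "old contents"})
d.unlink("dummy")
print(d.readdir())        # [] : the whiteout hides the lower-layer file
d.create("dummy", "new contents")
print(d.open("dummy"))    # new contents
```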
Empty symlinks and full POSIX compliance
Posted May 28, 2013 20:32 UTC (Tue) by mathstuf (subscriber, #69389) [Link]
Seems a small concern
Posted May 23, 2013 17:48 UTC (Thu) by justincormack (subscriber, #70439) [Link]
Seems a small concern
Posted May 27, 2013 21:22 UTC (Mon) by nix (subscriber, #2304) [Link]
Empty symlinks and full POSIX compliance
Posted May 23, 2013 20:08 UTC (Thu) by viro (subscriber, #7872) [Link]
1) there was a kinda-sorta sane followup and yes, it might make sense to remove that check from symlink creation. Giving that rationale from the very beginning, instead of "we have decided that existing behaviour is bug; we are POSIX, so a bug it is" would've worked a lot better.
2) empty symlinks present on filesystem are, indeed, handled - semantics of those happens to be "just stay in parent".
3) to all kinds of kooks with agenda^W^W^Wactivists: get a life, would you? There are places for all kinds of perversions, including advocacy; if you feel like buggering ducks, wanking at the photo of Nixon/Brezhnev kiss, promoting *BSD/systemd/whatnot - it's your business, just keep the discussion in alt.sex.* or appropriate pr0n sites.
Empty symlinks and full POSIX compliance
Posted May 23, 2013 20:12 UTC (Thu) by Cyberax (✭ supporter ✭, #52523) [Link]
Empty symlinks and full POSIX compliance
Posted May 25, 2013 9:00 UTC (Sat) by tpo (subscriber, #25713) [Link]
> would you? There are places for all kinds of perversions,
> including advocacy; if you feel like buggering ducks,
> wanking at the photo of Nixon/Brezhnev kiss,
> promoting *BSD/systemd/whatnot - it's your business,
> just keep the discussion in alt.sex.* or appropriate pr0n sites.
Take a short break from the kernel business to fix that mirror of yours.
*t
Empty symlinks and full POSIX compliance
Posted May 27, 2013 18:40 UTC (Mon) by pr1268 (subscriber, #24648) [Link]
...the [POSIX] standard was updated to allow either the Solaris or the BSD behavior.
What a washout. POSIX should choose one are the other and stick with it.
...it was a standard created to document what is actually in use to limit/prevent fragmentation between implementations. -dlang
By not choosing one specific behavior of empty symlinks, POSIX is only adding to the fragmentation. Sigh...
Historical side note bearing similarity: In 1960, the Federal Communications Commission (FCC) was tasked with choosing a single stereophonic FM broadcast standard from among 14 submissions (from a number of different companies). They chose one based on the technical merits known at the time, and upset the other companies which had submitted a proposal. Oh well. Fast forward to 1980, and again, the FCC was tasked with choosing a single AM stereo[1] specification (from four submissions, each from a different company). Instead, the FCC chose not to choose one, deciding that any of the four standards could be used[2]. I believe this caused more ire than the FM row twenty years prior.
[1] I can't argue the need for AM stereo - it's a low-fidelity medium not well-suited for stereophonic content anyway.
[2] If I understand 47 CFR §73.128 correctly, the FCC has since regulated stereo AM to a single specification. I think it's a moot point, anyway; when's the last time anyone saw consumer audio hardware proudly sporting AM stereo capability?
Typo correction
Posted May 27, 2013 18:44 UTC (Mon) by pr1268 (subscriber, #24648) [Link]
POSIX should choose one are the other
s/are/or/
Typo correction
Posted May 31, 2013 15:28 UTC (Fri) by mmendez (subscriber, #81435) [Link]
omalloc (was Re: Empty symlinks and full POSIX compliance)
Posted May 29, 2013 12:37 UTC (Wed) by mirabilos (subscriber, #84359) [Link]
This is apparently cheaper than doing full file I/O, which isn’t unimportant in a malloc I’d guess.
Empty symlinks and full POSIX compliance
Posted Jun 13, 2013 18:42 UTC (Thu) by dmuc (guest, #91417) [Link]
Years ago we ran software which was configured with a lot of symlinks, and a few plain files.
It worked like that:
/etc/foo/mail/to -> foo@example.com
/etc/foo/mail/subject-prefix ->
The benefit is: You don't need a config file parser, you can access the config simply by readlink("/etc/foo/mail/subject-prefix").
With an empty symlink you have another value you can use. Think of NULL vs "" in SQL.
file not found -> NULL
symlink to "" -> ""
symlink to "x" -> "x"
The semantics for subject-prefix were: if the symlink is not there, a default is used for subject-prefix. If you want no prefix, just set the symlink to "".
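A reader for that scheme might look like the sketch below. It assumes Python's os.readlink(); the function names and the default prefix "[foo] " are made up for illustration, and the plain-file entries mentioned above are not handled (readlink() fails on a regular file).

```python
import os
import tempfile
from typing import Optional


def read_setting(path: str) -> Optional[str]:
    """Return the symlink's stored string, or None if the entry is absent."""
    try:
        return os.readlink(path)
    except FileNotFoundError:
        return None


def subject_prefix(config_dir: str, default: str = "[foo] ") -> str:
    """Apply the three-way rule: absent -> default, "" -> no prefix, else the value."""
    value = read_setting(os.path.join(config_dir, "subject-prefix"))
    return default if value is None else value


with tempfile.TemporaryDirectory() as cfg:
    os.symlink("foo@example.com", os.path.join(cfg, "to"))
    print(read_setting(os.path.join(cfg, "to")))   # the stored address
    print(repr(subject_prefix(cfg)))               # no symlink yet, so the default
```

On a system that refuses empty targets, the "no prefix" case cannot be stored at all, which is the limitation this comment is describing.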
Empty symlinks and full POSIX compliance
Posted Jun 14, 2013 0:31 UTC (Fri) by mpr22 (subscriber, #60784) [Link]
An amusing hack, but ln makes a lousy configuration editor.