
Empty symlinks and full POSIX compliance

By Jonathan Corbet
May 22, 2013
Symbolic links are a mechanism to make one pathname be an alias for another. One would think that there would be little value in an empty symbolic link — one where the destination pathname is the empty string — but that doesn't keep people from trying to create such links, an act that the Linux virtual filesystem layer does not allow. It turns out that its refusal to allow the creation of empty symbolic links puts Linux out of compliance with the POSIX standard. The real question, though, might be: how much does that really matter?

Pointing to the void

It turns out that many Unix-based systems will happily allow a command like:

    ln -s "" link-to-nothing

On a Linux system, though, that command will fail with a "No such file or directory" error. This is, as was pointed out in a bug report last January, a somewhat confusing message. If the empty string is replaced by the name of a nonexistent file, no such error results. In other words, for most cases, the lack of an existing file is not a concern. So it seems strange that Linux would gripe about "no such file" in the empty string case.
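
The contrast between a dangling target and an empty one can be reproduced without the ln command. The following is a minimal sketch, not anything taken from the bug report: the helper name try_symlink() is hypothetical, and it simply reports whatever error the running system returns. On a Linux system the empty target is expected to yield ENOENT, while other systems may accept it.

```python
import errno
import os
import tempfile


def try_symlink(target: str) -> str:
    """Try to create a symlink to `target` inside a scratch directory.

    Returns "ok" on success, or the symbolic errno name (e.g. "ENOENT").
    The helper name is illustrative only.
    """
    with tempfile.TemporaryDirectory() as scratch:
        link_path = os.path.join(scratch, "link")
        try:
            os.symlink(target, link_path)
        except OSError as exc:
            return errno.errorcode.get(exc.errno, str(exc.errno))
        return "ok"


if __name__ == "__main__":
    # A dangling target is accepted; nothing checks that it exists.
    print("nonexistent target:", try_symlink("/no-such-file"))
    # The empty target is where systems differ.
    print("empty target:", try_symlink(""))
```

The function returns the error name rather than asserting a particular outcome because, as the article goes on to explain, the result for the empty string depends on the operating system.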

As part of the ensuing discussion, it turned out that the POSIX standard was not consistent with how empty symbolic links (that already exist in the filesystem) are handled in existing systems. Solaris systems will, when such a link is dereferenced, treat it as a link to the current directory; essentially, an empty link is treated as if it were "." instead. BSD systems respond differently: they take the position that no such file can exist and duly return the "no such file" error. Neither of those responses was compliant with POSIX, a problem which was only fixed in early May, when the standard was updated to allow either the Solaris or the BSD behavior. The result is a standard that explicitly says one cannot know how the system will resolve an empty symbolic link; they might work, or they might not.

How Linux handles an attempt to resolve an empty symbolic link that already exists within a filesystem is not well defined. Some of the work of link resolution is pushed down into filesystem-specific code, so the behavior may depend on which filesystem type is in use. It is hard to test because, as mentioned above, Linux does not allow the creation of empty symbolic links, so they can only come by way of a filesystem from another system. But, in general, an attempt to resolve an empty symbolic link can be expected to return a "No such file or directory" response.

The refusal to create an empty symbolic link, as it turns out, is contrary to how POSIX thinks the symlink() system call should work. The standard text says explicitly that the target string "shall be treated only as a character string and shall not be validated as a pathname". Empty strings are valid character strings, and the implementation is not allowed to care that they cannot be the name of a real file, so, by the standard, the creation of such a symbolic link should be allowed.

Back in January, Pádraig Brady posted a patch enabling the creation of empty symbolic links. The patch did not generate much interest at that time. He followed up in May after the standard had been updated; this time Al Viro expressed his feelings on the matter:

Functionality in question is utterly pointless, seeing that semantics of such symlinks is OS-dependent anyway *and* that blanket refusal to traverse such beasts is a legitimate option. What's the point in allowing to create them in the first place?

And that is pretty much where the discussion stopped.

Linux and POSIX compliance

That said, it would not be entirely surprising if such a patch were to make it into the kernel at some point. The cost of enabling the creation of empty symbolic links is essentially zero, and adding that capability would bring Linux a little closer to POSIX compliance. But true POSIX compliance, which is a function of both the kernel and the low-level libraries that sit above it, still seems like a distant goal for Linux distributions as a whole.

As a Unix-like system, Linux is not that far removed from compliance with the POSIX standard. Linux developers normally try to adhere to such standards when it makes sense to do so, but they generally feel no need to apply changes that, in their opinion, do not make technical sense just because a standard document calls for it. The reaction to the creation of empty symbolic links is a case in point. The value of closer adherence to POSIX is not seen as being high enough to justify the addition of a "feature" that seems nonsensical.

The value question is an interesting one. Getting certified to the point where one can use the POSIX trademark is a matter of passing the verification test suite, applying for certification, and handing over a relatively small amount of money as described in the fee schedule [PDF]. An enterprise Linux distributor wishing to claim POSIX compliance could almost certainly attain this certification in a relatively short period of time with an investment that would be far smaller than was required for, say, Common Criteria security certification. Carrying some non-mainline patches to the kernel and C library would likely be necessary, but enterprise distributors have generally shown little reluctance to do that when it suits their interests.

But there are no POSIX-certified Linux distributions on the market now. As far as your editor can tell, the only time a distribution has achieved that certification was when Linux-FT claimed it in 1995. That work was (or was not, depending on which side of the argument you listen to) acquired by Caldera shortly thereafter; Caldera, too, intended to achieve POSIX certification. That certification does not appear to have happened, and Caldera, of course, followed its own unhappy path to its doom. Since then, corporate interest in POSIX certification for Linux has been subdued, to say the least.

One can only conclude that the commercial value of a 100% certified POSIX-compliant distribution is not enough to justify even a relatively small level of effort. If distributors were losing business due to the lack of certification, they would be doing something about it. But, it seems, "almost POSIX" is good enough for users, especially in an era where Linux is the preferred platform for many applications.

POSIX still has its place; it sets the expectations for the low-level interface provided by the operating system and helps to ensure compatibility. But, increasingly, most current development work is outside of the scope of POSIX. The standard cannot hope to keep up with the changes being made to Linux at the kernel level and above. We live in a fast-changing world where, in many cases, "what does Linux do?" is the real standard. The developers who are busily pushing Linux forward have little time or patience for working toward complete POSIX compatibility when the interesting problems are elsewhere, so a fully POSIX-compliant distribution seems unlikely to show up in the near future.



Empty symlinks and full POSIX compliance

Posted May 23, 2013 6:56 UTC (Thu) by ogj (guest, #3024) [Link]

Why not aim for a change in the standard? As it is, the standard is utterly pointless. No real value of following the standard as it is. After all, the standard was very recently changed to conform to the behavior of BSD and Solaris.

Empty symlinks and full POSIX compliance

Posted May 23, 2013 8:48 UTC (Thu) by Cyberax (✭ supporter ✭, #52523) [Link]

Changes in standards like POSIX take years to materialize. And many features like cgroups or namespaces have about zero chance to be approved, because no other vendor can realistically implement them quickly enough.

Empty symlinks and full POSIX compliance

Posted May 23, 2013 17:35 UTC (Thu) by wahern (subscriber, #37304) [Link]

Changes to the specification of existing interfaces can occur quickly. You can file a bug report against the specification, discuss, and expect a resolution in a reasonable timeframe.

In fact, in this case the issue was reported in January of this year and resolved in a few months.

The Linux kernel team is unique in their indifference toward the standard. It's usually companies (Red Hat in particular, because of its stewardship of glibc) who "represent" Linux. The BSD teams have no problem keeping POSIX compliance as a goal while continuing to evolve their own proprietary interfaces.

Perhaps if Linux kernel developers were more engaged a more satisfactory outcome would have resulted.

Empty symlinks and full POSIX compliance

Posted May 23, 2013 17:44 UTC (Thu) by Cyberax (✭ supporter ✭, #52523) [Link]

What does POSIX compatibility achieve? Most of POSIX interfaces are either braindead simple or totally broken.

Sure, reasonable compatibility with POSIX is required (for process control, sockets, etc.) but trying to conform to all of the requirements? Nah, not worth it.

Empty symlinks and full POSIX compliance

Posted May 23, 2013 20:43 UTC (Thu) by dlang (guest, #313) [Link]

> Most of POSIX interfaces are either braindead simple or totally broken.

Actually, most of the POSIX interfaces work just fine. You are using them all over the place without realizing it.

However, there are some POSIX standards that are broken or just plain 'odd'. These are mostly historical accidents.

You have to remember, POSIX was not a standard dreamed up by Ivory Tower folks, it was a standard created to document what is actually in use to limit/prevent fragmentation between implementations.

This does mean that POSIX is always going to lag what's implemented, because it requires that multiple people implement something before they will standardize it (specifically to avoid Ivory Tower type problems)

There is value in being POSIX compliant, but you get 99.999%+ of the value by being "almost" POSIX compliant the way Linux is.

Empty symlinks and full POSIX compliance

Posted May 23, 2013 21:31 UTC (Thu) by rgmoore (✭ supporter ✭, #75) [Link]

> There is value in being POSIX compliant, but you get 99.999%+ of the value by being "almost" POSIX compliant the way Linux is.

It seems to me that refusing to follow the POSIX behavior when it's really stupid, as Linux is doing by refusing to create symlinks named the empty string, probably gives you slightly more value than rigid adherence to the standard would.

Empty symlinks and full POSIX compliance

Posted May 23, 2013 22:37 UTC (Thu) by dlang (guest, #313) [Link]

In case it's not clear, I completely agree with you. I was just talking about the value to be gained by following POSIX at all (I was replying to a post effectively claiming that POSIX is completely irrelevant and should be ignored/scrapped)

Empty symlinks and full POSIX compliance

Posted Jun 2, 2013 0:40 UTC (Sun) by yeti-dn (guest, #46560) [Link]

Certainly it can be argued that the POSIX-specified behaviour is stupid. But the same holds for the Linux behaviour. Giving `no such file or directory' for a symlink target -- what is more odd and confusing than that?

So the question is not whether to change nice Linux behaviour to stupid POSIX behaviour, but whether to swap one kind of stupid behaviour for another. If one of them is standard-compliant it should be the clear winner.

Empty symlinks and full POSIX compliance

Posted Jun 3, 2013 8:53 UTC (Mon) by micka (subscriber, #38720) [Link]

I'm not sure I agree that a change TO a stupid behaviour is a good idea, even if the current state is a stupid behaviour and the next state is standard-compliant.

The change is just not worth it, and if you must have a stupid behaviour, it's better to keep the one you have.

It would be different to change to a *less* stupid behaviour (even non standard-compliant) like changing the 'no such file or directory' to 'symlink to empty name not allowed'. Not compliant but sure less stupid and confusing.

Empty symlinks and full POSIX compliance

Posted Jul 20, 2013 1:49 UTC (Sat) by kirv (guest, #89016) [Link]

Bingo. As the originator of this 'bug report', the reason I posted it was that the error message was confusing. Consider:

    $ ln -s /no-such-file foo
    $ ln -s "" bar
    ln: creating symbolic link `bar' -> `': No such file or directory

Since the non-existence of the target file is not a problem, what 'file or directory' is being referred to?

I have no particular use in mind for empty symlink targets, and just ran into this case while working up some test scripts. I do have use for arbitrary strings in the symlink target, and would think it preferable to allow the empty string there just for consistency, but at least the error message should be meaningful.

The whole POSIX question is beyond me, but I don't think any special handling should be necessary for empty symlinks. Just let them resolve as they are, e.g., so maybe cat bar or ls bar would look like:

    $ cat ""
    cat: : No such file or directory
    $ ls ""
    ls: cannot access : No such file or directory

Best solved in the C-library?

Posted May 23, 2013 8:58 UTC (Thu) by shalem (subscriber, #4062) [Link]

Since the POSIX API is defined at the C-lib level, and not system call level AFAIK, why not simply let the C-lib deal with this, and replace the empty string with, for example, "."? It could perhaps even depend on feature macros ...

Best solved in the C-library?

Posted May 23, 2013 13:33 UTC (Thu) by corbet (editor, #1) [Link]

Because then if you do, say, readlink() on the link you get something other than what you stored there. That, too, has standards-compliance issues.
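
To make the round-trip problem concrete, here is a minimal sketch of what such a library-level substitution would look like from the caller's side. The wrapper symlink_with_substitution() is hypothetical; it only stands in for the idea proposed above and is not how any C library actually behaves.

```python
import os
import tempfile


def symlink_with_substitution(target: str, link_path: str) -> None:
    """Hypothetical libc-style wrapper: store "." when asked for an empty target."""
    os.symlink(target if target else ".", link_path)


def round_trip(target: str) -> str:
    """Create a link through the wrapper and return what readlink() reports."""
    with tempfile.TemporaryDirectory() as scratch:
        link_path = os.path.join(scratch, "link")
        symlink_with_substitution(target, link_path)
        return os.readlink(link_path)


if __name__ == "__main__":
    # The caller asked for "", but the stored string is something else.
    print(repr(round_trip("")))
    # Any other target comes back unchanged.
    print(repr(round_trip("/no-such-file")))
```

A caller that stores the empty string and reads it back gets a different string, which is the mismatch being described.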

Empty symlinks and full POSIX compliance

Posted May 23, 2013 12:41 UTC (Thu) by etienne (guest, #25256) [Link]

Would empty symlink be a documented way to remove a file in a unionfs?
I.e. the read-only backing store contains a file named "dummy", the read/write overlay receives a "delete dummy", so the overlay removes "dummy" by creating an empty "dummy" symlink.

Empty symlinks and full POSIX compliance

Posted May 28, 2013 20:18 UTC (Tue) by ms-tg (subscriber, #89231) [Link]

That's very interesting -- if that's not how it's done, how is it done?

Empty symlinks and full POSIX compliance

Posted May 29, 2013 9:04 UTC (Wed) by etienne (guest, #25256) [Link]

> if that's not how it's done

Did not read the source code, but extracted from:
http://lwn.net/Articles/325126/
-------
If a file is deleted which exists at the bottom layer, a so-called
whiteout file with the same name is created at the top layer. Users
never get to see this file; it is not included in readdir results, and
trying to open it fails with errno == ENOENT. If a file with the same
name is later created, this file replaces the whiteout.
-------
readdir() could forget to list empty symlinks, but anyway if the file is listed in readdir() trying to open it would look like the file did exist but has now disappeared - i.e. has been removed since readdir() was initialised.

Empty symlinks and full POSIX compliance

Posted May 28, 2013 20:32 UTC (Tue) by mathstuf (subscriber, #69389) [Link]

Why not just make a self-referential symlink? Same result, but avoids the corner case.
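
For comparison, a minimal sketch of the self-referential case follows; the helper name open_self_link() is made up for illustration. Opening such a link does fail, but the error is ELOOP (too many levels of symbolic links) rather than the ENOENT that the whiteout description quoted above calls for, so "same result" holds only in the sense that the open does not succeed.

```python
import errno
import os
import tempfile


def open_self_link() -> str:
    """Create a symlink that points at its own name and try to open it.

    Returns "opened" if the open somehow succeeds, otherwise the errno name.
    """
    with tempfile.TemporaryDirectory() as scratch:
        link_path = os.path.join(scratch, "dummy")
        # Relative target: "dummy" resolves to the link itself.
        os.symlink("dummy", link_path)
        try:
            with open(link_path):
                return "opened"
        except OSError as exc:
            return errno.errorcode.get(exc.errno, str(exc.errno))


if __name__ == "__main__":
    print(open_self_link())
```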

Seems a small concern

Posted May 23, 2013 17:48 UTC (Thu) by justincormack (subscriber, #70439) [Link]

There are much more significant ones, if anyone cared about POSIX and Linux, e.g. some of the problems caused by the fact that Linux has a different process/thread model and does not provide a means to paper over this.

Seems a small concern

Posted May 27, 2013 21:22 UTC (Mon) by nix (subscriber, #2304) [Link]

Most of those problems vanished when LinuxThreads was supplanted by NPTL, thank goodness. The largest remaining ones consist of problems caused by glibc's insistence on synthesising its own TID rather than just handing up the kernel's, causing trouble talking to kernel interfaces that expect the real TID.

Empty symlinks and full POSIX compliance

Posted May 23, 2013 20:08 UTC (Thu) by viro (subscriber, #7872) [Link]

Sigh...

1) there was a kinda-sorta sane followup and yes, it might make sense to remove that check from symlink creation. Giving that rationale from the very beginning, instead of "we have decided that existing behaviour is bug; we are POSIX, so a bug it is" would've worked a lot better.

2) empty symlinks present on filesystem are, indeed, handled - semantics of those happens to be "just stay in parent".

3) to all kinds of kooks with agenda^W^W^Wactivists: get a life, would you? There are places for all kinds of perversions, including advocacy; if you feel like buggering ducks, wanking at the photo of Nixon/Brezhnev kiss, promoting *BSD/systemd/whatnot - it's your business, just keep the discussion in alt.sex.* or appropriate pr0n sites.

Empty symlinks and full POSIX compliance

Posted May 23, 2013 20:12 UTC (Thu) by Cyberax (✭ supporter ✭, #52523) [Link]

alt.sex.systemd? Now that's a thought. Integration of RedTube into Plymouth might be... interesting.

Empty symlinks and full POSIX compliance

Posted May 25, 2013 9:00 UTC (Sat) by tpo (subscriber, #25713) [Link]

> 3) to all kinds of kooks with agenda^W^W^Wactivists: get a life,
> would you? There are places for all kinds of perversions,
> including advocacy; if you feel like buggering ducks,
> wanking at the photo of Nixon/Brezhnev kiss,
> promoting *BSD/systemd/whatnot - it's your business,
> just keep the discussion in alt.sex.* or appropriate pr0n sites.

Take a short break from the kernel business to fix that mirror of yours.
*t

Empty symlinks and full POSIX compliance

Posted May 27, 2013 18:40 UTC (Mon) by pr1268 (subscriber, #24648) [Link]

> ...the [POSIX] standard was updated to allow either the Solaris or the BSD behavior.

What a washout. POSIX should choose one are the other and stick with it.

> ...it was a standard created to document what is actually in use to limit/prevent fragmentation between implementations. -dlang

By not choosing one specific behavior of empty symlinks, POSIX is only adding to the fragmentation. Sigh...

Historical side note bearing similarity: In 1960, the Federal Communications Commission (FCC) was tasked with choosing a single stereophonic FM broadcast standard from among 14 submissions (from a number of different companies). They chose one based on the technical merits known at the time, and upset the other companies which had submitted a proposal. Oh well. Fast forward to 1980, and again, the FCC was tasked with choosing a single AM stereo[1] specification (from four submissions, each from a different company). Instead, the FCC chose not to choose one, deciding that any of the four standards could be used[2]. I believe this caused more ire than the FM row twenty years prior.

[1] I can't argue the need for AM stereo - it's a low-fidelity medium not well-suited for stereophonic content anyway.

[2] If I understand 47 CFR §73.128 correctly, the FCC has since regulated stereo AM to a single specification. I think it's a moot point, anyway; when's the last time anyone saw consumer audio hardware proudly sporting AM stereo capability?

Typo correction

Posted May 27, 2013 18:44 UTC (Mon) by pr1268 (subscriber, #24648) [Link]

> POSIX should choose one are the other

s/are/or/

Typo correction

Posted May 31, 2013 15:28 UTC (Fri) by mmendez (subscriber, #81435) [Link]

My brain saw no typo there!

omalloc (was Re: Empty symlinks and full POSIX compliance)

Posted May 29, 2013 12:37 UTC (Wed) by mirabilos (subscriber, #84359) [Link]

Interestingly enough, symlinks as strings are used in BSD omalloc, where /etc/malloc.conf is a *symlink* pointing to a string that’s then parsed as system-wide malloc options (an environment variable is parsed the same way).
This is apparently cheaper than doing full file I/O, which isn’t unimportant in a malloc I’d guess.

Empty symlinks and full POSIX compliance

Posted Jun 13, 2013 18:42 UTC (Thu) by dmuc (guest, #91417) [Link]

Empty symlink targets are not stupid.

Years ago we ran software that was configured with a lot of symlinks and a few plain files.

It worked like that:

    /etc/foo/mail/to -> foo@example.com
    /etc/foo/mail/subject-prefix ->

The benefit is: You don't need a config file parser, you can access the config simply by readlink("/etc/foo/mail/subject-prefix").

With an empty symlink you have another value you can use. Think of NULL vs "" in SQL.

    file not found -> NULL
    symlink to "" -> ""
    symlink to "x" -> "x"

The semantic for subject-prefix was: if the symlink is not there, there was a default for subject-prefix. If you want no prefix, just set the symlink to "".
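
The lookup described here can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the original software: the function name read_setting() and the default value "[foo] " are invented for the example, and the demonstration uses a scratch directory rather than /etc/foo.

```python
import os
import tempfile
from typing import Optional


def read_setting(path: str, default: Optional[str] = None) -> Optional[str]:
    """Return the symlink's target string, or `default` if no link exists.

    A missing link plays the role of NULL; a link whose target is "" would
    be an explicit empty value, distinct from "not set".
    """
    try:
        return os.readlink(path)
    except FileNotFoundError:
        return default


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as conf_dir:
        # The target is just a string; it never has to name a real file.
        os.symlink("foo@example.com", os.path.join(conf_dir, "to"))
        print(read_setting(os.path.join(conf_dir, "to")))
        # No link present: fall back to the (hypothetical) default prefix.
        print(read_setting(os.path.join(conf_dir, "subject-prefix"), default="[foo] "))
```

The third state, a link to the empty string, is the one that cannot be created on Linux, which is why the sketch does not try to set it up.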

Empty symlinks and full POSIX compliance

Posted Jun 14, 2013 0:31 UTC (Fri) by mpr22 (subscriber, #60784) [Link]

An amusing hack, but ln makes a lousy configuration editor.


Copyright © 2013, Eklektix, Inc.
This article may be redistributed under the terms of the Creative Commons CC BY-SA 4.0 license
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds