18

[^a] means any character other than a, we know, but what does [^] (with no following characters) mean? Just as - loses its meaning as a range operator in cases such as [-], I assumed that [^] would match a literal caret. I spent way too long debugging this problem, only to find that, at least in Chrome 19, it appears to match anything; in other words, it behaves like . (the dot). Is there a spec that applies here, and what is the expected behavior?

Yes, I'm aware that I can and probably should use [\^]. This question is more in the nature of morbid curiosity.

  • Hmm. It negates the set but if the set is empty... match anything as long as it is not nothing? That doesn't seem right. What does [] match? [^] should match anything that [] does not match. Jun 1, 2012 at 1:48
  • Based on the answers below it sounds like it means "this expression should not be used"!
    – jahroy
    Jun 1, 2012 at 2:36
  • Related performance test: jsperf.com/match-any-char-regex Jun 1, 2012 at 9:54

3 Answers

30

According to the JavaScript specification (ES3 and ES5), [^] matches any single code unit, the same as [\s\S], [\0-\uffff], (.|\s) (don't use that; unlike the others, it relies on backtracking), etc. The difference from . is that the dot doesn't match the four newline code points (\r, \n, \u2028, and \u2029).
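The difference between [^] and the dot can be checked directly. The following is a minimal sketch, assuming a spec-compliant engine such as a current V8; the variable names are only illustrative.

```javascript
// [^] is an empty negated class; per the spec it matches any single code unit.
const anyUnit = /^[^]$/;
const dotOnly = /^.$/;

// The four line terminators that the dot refuses to match.
const lineTerminators = ["\r", "\n", "\u2028", "\u2029"];

const comparison = lineTerminators.map((ch) => ({
  codeUnit: "U+" + ch.charCodeAt(0).toString(16).toUpperCase().padStart(4, "0"),
  matchedByCaretClass: anyUnit.test(ch),
  matchedByDot: dotOnly.test(ch),
}));
console.log(comparison);

// "Code unit", not "character": without the u flag, an astral symbol such as
// U+1F600 is two UTF-16 code units, so a single [^] cannot span it.
const astral = "\u{1F600}";
console.log(anyUnit.test(astral));     // false
console.log(/^[^][^]$/.test(astral));  // true
```

Every entry should report that [^] matches and the dot does not.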

I don't recommend using [^] or [], because they don't work consistently cross-browser, and they prevent your regexes from working in other programming languages. IE <= 8 and older versions of Safari use the traditional (non-JavaScript) regex behavior for empty character classes. Older versions of Opera reverse the correct JavaScript behavior, so that [] matches any code unit and [^] never matches. The traditional regex behavior is that a leading, unescaped ] within a character class is treated as a literal character and does not end the character class.
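To make the two readings concrete, here is a small sketch of what a spec-compliant JavaScript engine does with empty classes. It assumes the web-compatibility grammar that lets a lone ] stand as a literal outside the u flag (with the u flag the last pattern would be a syntax error). The traditional behavior is described only in comments, since it cannot be reproduced in such an engine.

```javascript
// Spec behavior: [] is an empty class that can never match,
// and [^] is its negation, which matches any single code unit.
const emptyClass = /[]/;
const negatedEmptyClass = /[^]/;

console.log(emptyClass.test("abc"));        // false
console.log(negatedEmptyClass.test("\n"));  // true
console.log(negatedEmptyClass.test(""));    // false: it still has to consume one code unit

// Where JavaScript and the traditional reading disagree:
// JavaScript parses [^]] as "[^]" followed by a literal "]";
// a traditional engine parses it as "any character except ]".
const jsReading = /^[^]]$/;
console.log(jsReading.test("a]"));  // true in JavaScript
console.log(jsReading.test("a"));   // false in JavaScript; a traditional engine would accept "a"
```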

If you use the XRegExp library, [] and [^] work correctly and consistently cross-browser. XRegExp also adds the s (aka dotall or singleline) flag that makes a dot match any code unit (the same as [^] in a browser that correctly follows the JavaScript spec).
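For comparison, the same dotall idea can be sketched without any library. Note the assumption: this uses the native s flag, which only exists in engines that implement ES2018 or later, and it is not XRegExp's API. The sample string and variable names are made up for illustration.

```javascript
// Assumption: the engine supports the native `s` (dotAll) flag (ES2018 or later).
const sample = "first\nsecond";

const plainDot = /first.second/.test(sample);          // dot stops at the newline
const dotAllFlag = /first.second/s.test(sample);       // with s, dot matches the newline too
const caretClass = /first[^]second/.test(sample);      // JavaScript-only spelling
const portableClass = /first[\s\S]second/.test(sample); // works in most regex flavors

console.log({ plainDot, dotAllFlag, caretClass, portableClass });
```

Of the four spellings, only [\s\S] carries over unchanged to most other regex flavors, which fits the advice above about avoiding [^].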

  • Great post! Could you be more specific about the old Safari and Opera versions? Jun 1, 2012 at 4:58
  • Thanks. I'm not sure which versions fixed the problems. I know Safari 3 got it wrong. (Early v3 Safari had lots of little-known RegExp surprises since it was running PCRE with a too-simple JS layer on top of it.) Opera was still getting it wrong when I first wrote xregexp.com/cross_browser. They probably fixed it shortly after Acid3 was released, since Acid3 explicitly tests empty character classes (to my dismay, since until that came out I was hoping ES could change to match the traditional behavior). It looks like IE actually didn't fix the problem until v9 (I've edited my post).
    – slevithan
    Jun 1, 2012 at 6:30
2

The caret ^ has many meanings - as with most characters in the regular expression syntax. Furthermore, all characters heavily depend on their context. To complicate things further, some characters and syntax depend on the underlying engine (Perl, Java).

Let's break apart [^]:

[] is a character class.

[^ opens a negated character class, which matches a character not listed in the character class.

You didn't define any characters in the character class. So the behavior is undefined. Meaning there is nothing to negate and therefore it matches anything.

  • @Derek Because . doesn’t match newline characters. Jun 1, 2012 at 4:58
  • "So the behavior is undefined." This might lead people to believe that it is undefined behavior, which is not true: it is defined in the ECMA spec, though implementations vary.
    – nhahtdh
    May 11, 2015 at 10:23
1

The meaning is the negation of what follows. Nothing follows here, therefore:

anything except nothing = everything

However, most other regex engines throw an error on this expression:

  • ereg(): REG_EBRACK
  • preg_match(): Compilation failed: missing terminating ]
