Changeset 203350 in webkit


Ignore:
Timestamp:
Jul 18, 2016 11:32:52 AM (8 years ago)
Author:
fpizlo@apple.com
Message:

WTF::Lock should be fair eventually
https://bugs.webkit.org/show_bug.cgi?id=159384

Reviewed by Geoffrey Garen.
Source/WTF:


In https://webkit.org/blog/6161/locking-in-webkit/ we showed how relaxing the fairness of
locks makes them fast. That post presented lock fairness as a trade-off between two
extremes:

  • Barging. A barging lock, like WTF::Lock, releases the lock in unlock() even if there was a thread on the queue. If there was a thread on the queue, the lock is released and that thread is made runnable. That thread may then grab the lock, or some other thread may grab the lock first (it may barge). Usually, the barging thread is the thread that released the lock in the first place. This maximizes throughput but hurts fairness. There is no good theoretical bound on how unfair the lock may become, but empirical data suggests that it's fair enough for the cases we previously measured.


  • FIFO. A FIFO lock, like HandoffLock in ToyLocks.h, does not release the lock in unlock() if there is a thread waiting. If there is a thread waiting, unlock() will make that thread runnable and inform it that it now holds the lock. This ensures perfect round-robin fairness and allows us to reason theoretically about how long it may take for a thread to grab the lock. For example, if we know that only N threads are running and each one may contend on a critical section, and each one may hold the lock for at most S seconds, then the time it takes to grab the lock is N * S. Unfortunately, FIFO locks perform very badly in most cases. This is because for the common case of short critical sections, they force a context switch after each critical section if the lock is contended.


This change makes WTF::Lock almost as fair as FIFO while still being as fast as barging.
Thanks to this new algorithm, you can now have both of these things at the same time.

This change makes WTF::Lock eventually fair. We can almost (more on the caveats below)
guarantee that the time it takes to grab a lock is N * max(1ms, S). In other words, critical
sections that are longer than 1ms are always fair. For shorter critical sections, the amount
of time that any thread waits is 1ms times the number of threads. There are some caveats
that arise from our use of randomness, but even then, in the limit as the critical section
length goes to infinity, the lock becomes fair. The corner cases are unlikely to happen; our
experiments show that the lock becomes exactly as fair as a FIFO lock for any critical
section that is 1ms or longer.

The fairness mechanism is broken into two parts. WTF::Lock can now choose to unlock a lock
fairly or unfairly thanks to the new ParkingLot token mechanism. WTF::Lock knows when to use
fair unlocking based on a timeout mechanism in ParkingLot called timeToBeFair.

ParkingLot::unparkOne() and ParkingLot::parkConditionally() can now communicate with each
other via a token. unparkOne() can pass a token, which parkConditionally() will return. This
change also makes parkConditionally() a lot more precise about when it was unparked due to a
call to unparkOne(). If unparkOne() is told that a thread was unparked then this thread is
guaranteed to report that it was unparked rather than timing out, and that thread is
guaranteed to get the token that unparkOne() passed. The token is an intptr_t. We use it as
a boolean variable in WTF::Lock, but you could use it to pass arbitrary data structures. By
default, the token is zero. WTF::Lock's unlock() will pass 1 as the token if it is doing
fair unlocking. In that case, unlock() will not release the lock, and lock() will know that
it holds the lock as soon as parkConditionally() returns. Note that this algorithm relies
on unparkOne() invoking WTF::Lock's callback while the queue lock is held, so that WTF::Lock
can make a decision about unlock strategy and inject a token while it has complete knowledge
over the state of the queue. As such, it's not immediately obvious how to implement this
algorithm on top of futexes. You really need ParkingLot!

WTF::Lock does not use fair unlocking every time. We expose a new API, Lock::unlockFairly(),
which forces the fair unlocking behavior. Additionally, ParkingLot now maintains a
per-bucket stochastic fairness timeout. When the timeout fires, the unparkOne() callback
sees UnparkResult::timeToBeFair = true. This timeout is set to be anywhere from 0ms to 1ms
at random. When a dequeue happens and there are threads that actually get dequeued, we check
if the time since the last unfair unlock (the last time timeToBeFair was set to true) is
more than the timeout amount. If so, then we set timeToBeFair to true and reset the timeout.
This means that in the absence of ParkingLot collisions, unfair unlocking is guaranteed to
happen at least once per millisecond. It will happen at 2 KHz on average. If there are
collisions, then each collision adds one millisecond to the worst case (and 0.5 ms to the
average case). The reason why we don't just use a fixed 1ms timeout is that we want to avoid
resonance. Imagine a program in which some thread acquires a lock at 1 KHz in-phase with the
timeToBeFair timeout. Then this thread would be the benefactor of fairness to the detriment
of everyone else. Randomness ensures that we aren't too fair to any one thread.

Empirically, this is neutral on our major benchmarks like JetStream but it's an enormous
improvement in LockFairnessTest. It's common for an unfair lock (either our BargingLock, the
old WTF::Lock, any of the other futex-based locks that barge, or new os_unfair_lock) to
allow only one thread to hold the lock during a whole second in which each thread is holding
the lock for 1ms at a time. This is because in a barging lock, releasing a lock after
holding it for 1ms and then reacquiring it immediately virtually ensures that none of the
other threads can wake up in time to grab it before it's relocked. But the new WTF::Lock
handles this case like a champ: each thread gets equal turns.

Here's some data. If we launch 10 threads and have each of them run for 1 second while
repeatedly holding a critical section for 1ms, then here's how many times each thread gets
to hold the lock using the old WTF::Lock algorithm:

799, 6, 1, 1, 1, 1, 1, 1, 1, 1

One thread hogged the lock for almost the whole time! With the new WTF::Lock, the lock
becomes totally fair:

80, 79, 79, 79, 79, 79, 79, 80, 80, 79

I don't know of anyone creating such an automatically-fair adaptive lock before, so I think
that this is a pretty awesome advancement to the state of the art!

This change is good for three reasons:

  • We do have long critical sections in WebKit and we don't want to have to worry about starvation. This reduces the likelihood that we will see starvation due to our lock strategy.


  • I was talking to ggaren about bmalloc's locking needs, and he wanted unlockFairly() or lockFairly() or some moral equivalent for the scavenger thread.


  • If we use a WTF::Lock to manage heap access in a multithreaded GC, we'll need the ability to unlock and relock without barging.
  • benchmarks/LockFairnessTest.cpp:

(main):

  • benchmarks/ToyLocks.h:
  • wtf/Condition.h:

(WTF::ConditionBase::waitUntil):
(WTF::ConditionBase::notifyOne):

  • wtf/Lock.cpp:

(WTF::LockBase::lockSlow):
(WTF::LockBase::unlockSlow):
(WTF::LockBase::unlockFairlySlow):
(WTF::LockBase::unlockSlowImpl):

  • wtf/Lock.h:

(WTF::LockBase::try_lock):
(WTF::LockBase::unlock):
(WTF::LockBase::unlockFairly):
(WTF::LockBase::isHeld):
(WTF::LockBase::isFullyReset):

  • wtf/ParkingLot.cpp:

(WTF::ParkingLot::parkConditionallyImpl):
(WTF::ParkingLot::unparkOne):
(WTF::ParkingLot::unparkOneImpl):
(WTF::ParkingLot::unparkAll):

  • wtf/ParkingLot.h:

(WTF::ParkingLot::parkConditionally):
(WTF::ParkingLot::compareAndPark):
(WTF::ParkingLot::unparkOne):

Tools:

  • TestWebKitAPI/Tests/WTF/ParkingLot.cpp:
Location:
trunk
Files:
10 edited

Legend:

Unmodified
Added
Removed
  • trunk/Source/WTF/ChangeLog

    r203330 r203350  
     12016-07-02  Filip Pizlo  <fpizlo@apple.com>
     2
     3        WTF::Lock should be fair eventually
     4        https://bugs.webkit.org/show_bug.cgi?id=159384
     5
     6        Reviewed by Geoffrey Garen.
     7       
     8        In https://webkit.org/blog/6161/locking-in-webkit/ we showed how relaxing the fairness of
     9        locks makes them fast. That post presented lock fairness as a trade-off between two
     10        extremes:
     11       
     12        - Barging. A barging lock, like WTF::Lock, releases the lock in unlock() even if there was a
     13          thread on the queue. If there was a thread on the queue, the lock is released and that
     14          thread is made runnable. That thread may then grab the lock, or some other thread may grab
     15          the lock first (it may barge). Usually, the barging thread is the thread that released the
     16          lock in the first place. This maximizes throughput but hurts fairness. There is no good
     17          theoretical bound on how unfair the lock may become, but empirical data suggests that it's
     18          fair enough for the cases we previously measured.
     19       
     20        - FIFO. A FIFO lock, like HandoffLock in ToyLocks.h, does not release the lock in unlock()
     21          if there is a thread waiting. If there is a thread waiting, unlock() will make that thread
     22          runnable and inform it that it now holds the lock. This ensures perfect round-robin
     23          fairness and allows us to reason theoretically about how long it may take for a thread to
     24          grab the lock. For example, if we know that only N threads are running and each one may
     25          contend on a critical section, and each one may hold the lock for at most S seconds, then
     26          the time it takes to grab the lock is N * S. Unfortunately, FIFO locks perform very badly
     27          in most cases. This is because for the common case of short critical sections, they force
     28          a context switch after each critical section if the lock is contended.
     29       
     30        This change makes WTF::Lock almost as fair as FIFO while still being as fast as barging.
     31        Thanks to this new algorithm, you can now have both of these things at the same time.
     32       
     33        This change makes WTF::Lock eventually fair. We can almost (more on the caveats below)
     34        guarantee that the time it takes to grab a lock is N * max(1ms, S). In other words, critical
     35        sections that are longer than 1ms are always fair. For shorter critical sections, the amount
     36        of time that any thread waits is 1ms times the number of threads. There are some caveats
     37        that arise from our use of randomness, but even then, in the limit as the critical section
     38        length goes to infinity, the lock becomes fair. The corner cases are unlikely to happen; our
     39        experiments show that the lock becomes exactly as fair as a FIFO lock for any critical
     40        section that is 1ms or longer.
     41       
     42        The fairness mechanism is broken into two parts. WTF::Lock can now choose to unlock a lock
     43        fairly or unfairly thanks to the new ParkingLot token mechanism. WTF::Lock knows when to use
     44        fair unlocking based on a timeout mechanism in ParkingLot called timeToBeFair.
     45       
     46        ParkingLot::unparkOne() and ParkingLot::parkConditionally() can now communicate with each
     47        other via a token. unparkOne() can pass a token, which parkConditionally() will return. This
     48        change also makes parkConditionally() a lot more precise about when it was unparked due to a
     49        call to unparkOne(). If unparkOne() is told that a thread was unparked then this thread is
     50        guaranteed to report that it was unparked rather than timing out, and that thread is
     51        guaranteed to get the token that unparkOne() passed. The token is an intptr_t. We use it as
     52        a boolean variable in WTF::Lock, but you could use it to pass arbitrary data structures. By
     53        default, the token is zero. WTF::Lock's unlock() will pass 1 as the token if it is doing
     54        fair unlocking. In that case, unlock() will not release the lock, and lock() will know that
     55        it holds the lock as soon as parkConditionally() returns. Note that this algorithm relies
     56        on unparkOne() invoking WTF::Lock's callback while the queue lock is held, so that WTF::Lock
     57        can make a decision about unlock strategy and inject a token while it has complete knowledge
     58        over the state of the queue. As such, it's not immediately obvious how to implement this
     59        algorithm on top of futexes. You really need ParkingLot!
     60       
     61        WTF::Lock does not use fair unlocking every time. We expose a new API, Lock::unlockFairly(),
     62        which forces the fair unlocking behavior. Additionally, ParkingLot now maintains a
     63        per-bucket stochastic fairness timeout. When the timeout fires, the unparkOne() callback
     64        sees UnparkResult::timeToBeFair = true. This timeout is set to be anywhere from 0ms to 1ms
     65        at random. When a dequeue happens and there are threads that actually get dequeued, we check
     66        if the time since the last unfair unlock (the last time timeToBeFair was set to true) is
     67        more than the timeout amount. If so, then we set timeToBeFair to true and reset the timeout.
     68        This means that in the absence of ParkingLot collisions, unfair unlocking is guaranteed to
     69        happen at least once per millisecond. It will happen at 2 KHz on average. If there are
     70        collisions, then each collision adds one millisecond to the worst case (and 0.5 ms to the
     71        average case). The reason why we don't just use a fixed 1ms timeout is that we want to avoid
     72        resonance. Imagine a program in which some thread acquires a lock at 1 KHz in-phase with the
     73        timeToBeFair timeout. Then this thread would be the benefactor of fairness to the detriment
     74        of everyone else. Randomness ensures that we aren't too fair to any one thread.
     75       
     76        Empirically, this is neutral on our major benchmarks like JetStream but it's an enormous
     77        improvement in LockFairnessTest. It's common for an unfair lock (either our BargingLock, the
     78        old WTF::Lock, any of the other futex-based locks that barge, or new os_unfair_lock) to
     79        allow only one thread to hold the lock during a whole second in which each thread is holding
     80        the lock for 1ms at a time. This is because in a barging lock, releasing a lock after
     81        holding it for 1ms and then reacquiring it immediately virtually ensures that none of the
     82        other threads can wake up in time to grab it before it's relocked. But the new WTF::Lock
     83        handles this case like a champ: each thread gets equal turns.
     84       
     85        Here's some data. If we launch 10 threads and have each of them run for 1 second while
     86        repeatedly holding a critical section for 1ms, then here's how many times each thread gets
     87        to hold the lock using the old WTF::Lock algorithm:
     88       
     89        799, 6, 1, 1, 1, 1, 1, 1, 1, 1
     90       
     91        One thread hogged the lock for almost the whole time! With the new WTF::Lock, the lock
     92        becomes totally fair:
     93       
     94        80, 79, 79, 79, 79, 79, 79, 80, 80, 79
     95       
     96        I don't know of anyone creating such an automatically-fair adaptive lock before, so I think
     97        that this is a pretty awesome advancement to the state of the art!
     98       
     99        This change is good for three reasons:
     100       
     101        - We do have long critical sections in WebKit and we don't want to have to worry about
     102          starvation. This reduces the likelihood that we will see starvation due to our lock
     103          strategy.
     104       
     105        - I was talking to ggaren about bmalloc's locking needs, and he wanted unlockFairly() or
     106          lockFairly() or some moral equivalent for the scavenger thread.
     107       
     108        - If we use a WTF::Lock to manage heap access in a multithreaded GC, we'll need the ability
     109          to unlock and relock without barging.
     110
     111        * benchmarks/LockFairnessTest.cpp:
     112        (main):
     113        * benchmarks/ToyLocks.h:
     114        * wtf/Condition.h:
     115        (WTF::ConditionBase::waitUntil):
     116        (WTF::ConditionBase::notifyOne):
     117        * wtf/Lock.cpp:
     118        (WTF::LockBase::lockSlow):
     119        (WTF::LockBase::unlockSlow):
     120        (WTF::LockBase::unlockFairlySlow):
     121        (WTF::LockBase::unlockSlowImpl):
     122        * wtf/Lock.h:
     123        (WTF::LockBase::try_lock):
     124        (WTF::LockBase::unlock):
     125        (WTF::LockBase::unlockFairly):
     126        (WTF::LockBase::isHeld):
     127        (WTF::LockBase::isFullyReset):
     128        * wtf/ParkingLot.cpp:
     129        (WTF::ParkingLot::parkConditionallyImpl):
     130        (WTF::ParkingLot::unparkOne):
     131        (WTF::ParkingLot::unparkOneImpl):
     132        (WTF::ParkingLot::unparkAll):
     133        * wtf/ParkingLot.h:
     134        (WTF::ParkingLot::parkConditionally):
     135        (WTF::ParkingLot::compareAndPark):
     136        (WTF::ParkingLot::unparkOne):
     137
    11382016-07-17  Myles C. Maxfield  <mmaxfield@apple.com>
    2139
  • trunk/Source/WTF/benchmarks/LockFairnessTest.cpp

    r202791 r203350  
    4949NO_RETURN void usage()
    5050{
    51     printf("Usage: LockFairnessTest yieldspinlock|pausespinlock|wordlock|lock|barginglock|bargingwordlock|thunderlock|thunderwordlock|cascadelock|cascadewordlockhandofflock|mutex|all <num threads> <seconds per test>\n");
     51    printf("Usage: LockFairnessTest yieldspinlock|pausespinlock|wordlock|lock|barginglock|bargingwordlock|thunderlock|thunderwordlock|cascadelock|cascadewordlockhandofflock|mutex|all <num threads> <seconds per test> <microseconds in critical section>\n");
    5252    exit(1);
    5353}
     
    5555unsigned numThreads;
    5656double secondsPerTest;
     57unsigned microsecondsInCriticalSection;
    5758
    5859struct Benchmark {
     
    7374                "Benchmark Thread",
    7475                [&, threadIndex] () {
     76                    if (!microsecondsInCriticalSection) {
     77                        while (keepGoing) {
     78                            lock.lock();
     79                            counts[threadIndex]++;
     80                            lock.unlock();
     81                        }
     82                        return;
     83                    }
     84                   
    7585                    while (keepGoing) {
    7686                        lock.lock();
    7787                        counts[threadIndex]++;
     88                        usleep(microsecondsInCriticalSection);
    7889                        lock.unlock();
    7990                    }
     
    8697        sleep(secondsPerTest);
    8798   
     99        keepGoing = false;
    88100        lock.lock();
    89         keepGoing = false;
    90101   
    91102        dataLog(name, ": ");
     
    107118    WTF::initializeThreading();
    108119   
    109     if (argc != 4
     120    if (argc != 5
    110121        || sscanf(argv[2], "%u", &numThreads) != 1
    111         || sscanf(argv[3], "%lf", &secondsPerTest) != 1)
     122        || sscanf(argv[3], "%lf", &secondsPerTest) != 1
     123        || sscanf(argv[4], "%u", &microsecondsInCriticalSection) != 1)
    112124        usage();
    113125   
  • trunk/Source/WTF/benchmarks/ToyLocks.h

    r200444 r203350  
    236236        ParkingLot::unparkOne(
    237237            &m_state,
    238             [this] (ParkingLot::UnparkResult result) {
     238            [this] (ParkingLot::UnparkResult result) -> intptr_t {
    239239                if (result.mayHaveMoreThreads)
    240240                    m_state.store(hasParkedBit);
    241241                else
    242242                    m_state.store(0);
     243                return 0;
    243244            });
    244245    }
     
    431432           
    432433            if (m_state.compareExchangeWeak(state, state + parkedCountUnit)) {
    433                 bool result = ParkingLot::compareAndPark(&m_state, state + parkedCountUnit);
     434                bool result = ParkingLot::compareAndPark(&m_state, state + parkedCountUnit).wasUnparked;
    434435                m_state.exchangeAndAdd(-parkedCountUnit);
    435436                if (result)
  • trunk/Source/WTF/wtf/Condition.h

    r199760 r203350  
    8181                },
    8282                [&lock] () { lock.unlock(); },
    83                 timeout);
     83                timeout).wasUnparked;
    8484        }
    8585        lock.lock();
     
    181181        ParkingLot::unparkOne(
    182182            &m_hasWaiters,
    183             [this] (ParkingLot::UnparkResult result) {
     183            [this] (ParkingLot::UnparkResult result) -> intptr_t {
    184184                if (!result.mayHaveMoreThreads)
    185185                    m_hasWaiters.store(false);
     186                return 0;
    186187            });
    187188    }
  • trunk/Source/WTF/wtf/Lock.cpp

    r199760 r203350  
    11/*
    2  * Copyright (C) 2015 Apple Inc. All rights reserved.
     2 * Copyright (C) 2015-2016 Apple Inc. All rights reserved.
    33 *
    44 * Redistribution and use in source and binary forms, with or without
     
    6868
    6969        // We now expect the value to be isHeld|hasParked. So long as that's the case, we can park.
    70         ParkingLot::compareAndPark(&m_byte, isHeldBit | hasParkedBit);
     70        ParkingLot::ParkResult parkResult =
     71            ParkingLot::compareAndPark(&m_byte, isHeldBit | hasParkedBit);
     72        if (parkResult.wasUnparked) {
     73            switch (static_cast<Token>(parkResult.token)) {
     74            case DirectHandoff:
     75                // The lock was never released. It was handed to us directly by the thread that did
     76                // unlock(). This means we're done!
     77                RELEASE_ASSERT(isHeld());
     78                return;
     79            case BargingOpportunity:
     80                // This is the common case. The thread that called unlock() has released the lock,
     81                // and we have been woken up so that we may get an opportunity to grab the lock. But
     82                // other threads may barge, so the best that we can do is loop around and try again.
     83                break;
     84            }
     85        }
    7186
    7287        // We have awoken, or we never parked because the byte value changed. Either way, we loop
     
    7590}
    7691
    77 NEVER_INLINE void LockBase::unlockSlow()
     92void LockBase::unlockSlow()
     93{
     94    unlockSlowImpl(Unfair);
     95}
     96
     97void LockBase::unlockFairlySlow()
     98{
     99    unlockSlowImpl(Fair);
     100}
     101
     102NEVER_INLINE void LockBase::unlockSlowImpl(Fairness fairness)
    78103{
    79104    // We could get here because the weak CAS in unlock() failed spuriously, or because there is
     
    90115        }
    91116
    92         // Someone is parked. Unpark exactly one thread, possibly leaving the parked bit set if
    93         // there is a chance that there are still other threads parked.
     117        // Someone is parked. Unpark exactly one thread. We may hand the lock to that thread
     118        // directly, or we will unlock the lock at the same time as we unpark to allow for barging.
     119        // When we unlock, we may leave the parked bit set if there is a chance that there are still
     120        // other threads parked.
    94121        ASSERT(oldByteValue == (isHeldBit | hasParkedBit));
    95122        ParkingLot::unparkOne(
    96123            &m_byte,
    97             [this] (ParkingLot::UnparkResult result) {
     124            [&] (ParkingLot::UnparkResult result) -> intptr_t {
    98125                // We are the only ones that can clear either the isHeldBit or the hasParkedBit,
    99126                // so we should still see both bits set right now.
    100127                ASSERT(m_byte.load() == (isHeldBit | hasParkedBit));
     128               
     129                if (result.didUnparkThread && (fairness == Fair || result.timeToBeFair)) {
     130                    // We don't unlock anything. Instead, we hand the lock to the thread that was
     131                    // waiting.
     132                    return DirectHandoff;
     133                }
    101134
    102135                if (result.mayHaveMoreThreads)
     
    104137                else
    105138                    m_byte.store(0);
     139                return BargingOpportunity;
    106140            });
    107141        return;
  • trunk/Source/WTF/wtf/Lock.h

    r200068 r203350  
    11/*
    2  * Copyright (C) 2015 Apple Inc. All rights reserved.
     2 * Copyright (C) 2015-2016 Apple Inc. All rights reserved.
    33 *
    44 * Redistribution and use in source and binary forms, with or without
     
    4141// competetive to a spinlock (uncontended locking is inlined and is just a CAS, microcontention is
    4242// handled by spinning and yielding), and a slow path that is competetive to std::mutex (if a lock
    43 // cannot be acquired in a short period of time, the thread is put to sleep until the lock is available
    44 // again). It uses less memory than a std::mutex.
     43// cannot be acquired in a short period of time, the thread is put to sleep until the lock is
     44// available again). It uses less memory than a std::mutex. This lock guarantees eventual stochastic
     45// fairness, even in programs that relock the lock immediately after unlocking it. Except when there
     46// are collisions between this lock and other locks in the ParkingLot, this lock will guarantee that
     47// at worst one call to unlock() per millisecond will do a direct hand-off to the thread that is at
     48// the head of the queue. When there are collisions, each collision increases the fair unlock delay
     49// by one millisecond in the worst case.
    4550
    4651// This is a struct without a constructor or destructor so that it can be statically initialized.
     
    7479    }
    7580
     81    // Relinquish the lock. Either one of the threads that were waiting for the lock, or some other
     82    // thread that happens to be running, will be able to grab the lock. This bit of unfairness is
     83    // called barging, and we allow it because it maximizes throughput. However, we bound how unfair
     84    // barging can get by ensuring that every once in a while, when there is a thread waiting on the
     85    // lock, we hand the lock to that thread directly. Every time unlock() finds a thread waiting,
     86    // we check if the last time that we did a fair unlock was more than roughly 1ms ago; if so, we
     87    // unlock fairly. Fairness matters most for long critical sections, and this virtually
     88    // guarantees that long critical sections always get a fair lock.
    7689    void unlock()
    7790    {
     
    8295
    8396        unlockSlow();
     97    }
     98
     99    // This is like unlock() but it guarantees that we unlock the lock fairly. For short critical
     100    // sections, this is much slower than unlock(). For long critical sections, unlock() will learn
     101    // to be fair anyway. However, if you plan to relock the lock right after unlocking and you want
     102    // to ensure that some other thread runs in the meantime, this is probably the function you
     103    // want.
     104    void unlockFairly()
     105    {
     106        if (LIKELY(m_byte.compareExchangeWeak(isHeldBit, 0, std::memory_order_release))) {
     107            // Lock released and nobody was waiting!
     108            return;
     109        }
     110
     111        unlockFairlySlow();
    84112    }
    85113
     
    102130    WTF_EXPORT_PRIVATE void lockSlow();
    103131    WTF_EXPORT_PRIVATE void unlockSlow();
     132    WTF_EXPORT_PRIVATE void unlockFairlySlow();
     133   
     134    enum Fairness {
     135        Fair,
     136        Unfair
     137    };
     138    void unlockSlowImpl(Fairness);
     139   
     140    enum Token {
     141        BargingOpportunity,
     142        DirectHandoff
     143    };
    104144
    105145    // Method used for testing only.
  • trunk/Source/WTF/wtf/ParkingLot.cpp

    r201433 r203350  
    2727#include "ParkingLot.h"
    2828
     29#include "CurrentTime.h"
    2930#include "DataLog.h"
    3031#include "HashFunctions.h"
     
    3334#include "ThreadingPrimitives.h"
    3435#include "Vector.h"
     36#include "WeakRandom.h"
    3537#include "WordLock.h"
    3638#include <condition_variable>
     
    5961   
    6062    ThreadData* nextInQueue { nullptr };
     63   
     64    intptr_t token { 0 };
    6165};
    6266
     
    7074    WTF_MAKE_FAST_ALLOCATED;
    7175public:
     76    Bucket()
     77        : random(static_cast<unsigned>(bitwise_cast<intptr_t>(this))) // Cannot use default seed since that recurses into Lock.
     78    {
     79    }
     80   
    7281    void enqueue(ThreadData* data)
    7382    {
     
    124133        ThreadData** currentPtr = &queueHead;
    125134        ThreadData* previous = nullptr;
     135
     136        double time = monotonicallyIncreasingTimeMS();
     137        bool timeToBeFair = false;
     138        if (time > nextFairTime)
     139            timeToBeFair = true;
     140       
     141        bool didDequeue = false;
     142       
    126143        while (shouldContinue) {
    127144            ThreadData* current = *currentPtr;
     
    130147            if (!current)
    131148                break;
    132             DequeueResult result = functor(current);
     149            DequeueResult result = functor(current, timeToBeFair);
    133150            switch (result) {
    134151            case DequeueResult::Ignore:
     
    144161                if (current == queueTail)
    145162                    queueTail = previous;
     163                didDequeue = true;
    146164                *currentPtr = current->nextInQueue;
    147165                current->nextInQueue = nullptr;
     
    151169            }
    152170        }
     171       
     172        if (timeToBeFair && didDequeue)
     173            nextFairTime = time + random.get();
    153174
    154175        ASSERT(!!queueHead == !!queueTail);
     
    159180        ThreadData* result = nullptr;
    160181        genericDequeue(
    161             [&] (ThreadData* element) -> DequeueResult {
     182            [&] (ThreadData* element, bool) -> DequeueResult {
    162183                result = element;
    163184                return DequeueResult::RemoveAndStop;
     
    172193    // this lock.
    173194    WordLock lock;
     195   
     196    double nextFairTime { 0 };
     197   
     198    WeakRandom random;
    174199
    175200    // Put some distane between buckets in memory. This is one of several mitigations against false
     
    531556} // anonymous namespace
    532557
    533 NEVER_INLINE bool ParkingLot::parkConditionallyImpl(
     558NEVER_INLINE ParkingLot::ParkResult ParkingLot::parkConditionallyImpl(
    534559    const void* address,
    535560    const ScopedLambda<bool()>& validation,
     
    541566   
    542567    ThreadData* me = myThreadData();
     568    me->token = 0;
    543569
    544570    // Guard against someone calling parkConditionally() recursively from beforeSleep().
    545571    RELEASE_ASSERT(!me->address);
    546572
    547     bool result = enqueue(
     573    bool enqueueResult = enqueue(
    548574        address,
    549575        [&] () -> ThreadData* {
     
    555581        });
    556582
    557     if (!result)
    558         return false;
     583    if (!enqueueResult)
     584        return ParkResult();
    559585
    560586    beforeSleep();
     
    583609    if (didGetDequeued) {
    584610        // Great! We actually got dequeued rather than the timeout expiring.
    585         return true;
     611        ParkResult result;
     612        result.wasUnparked = true;
     613        result.token = me->token;
     614        return result;
    586615    }
    587616
    588617    // Have to remove ourselves from the queue since we timed out and nobody has dequeued us yet.
    589618
    590     // It's possible that we get unparked right here, just before dequeue() grabs a lock. It's
    591     // probably worthwhile to detect when this happens, and return true in that case, to ensure
    592     // that when we return false it really means that no unpark could have been responsible for us
    593     // waking up, and that if an unpark call did happen, it woke someone else up.
     619    bool didDequeue = false;
    594620    dequeue(
    595621        address, BucketMode::IgnoreEmpty,
    596         [&] (ThreadData* element) {
    597             if (element == me)
     622        [&] (ThreadData* element, bool) {
     623            if (element == me) {
     624                didDequeue = true;
    598625                return DequeueResult::RemoveAndStop;
     626            }
    599627            return DequeueResult::Ignore;
    600628        },
    601629        [] (bool) { });
    602 
    603     ASSERT(!me->nextInQueue);
     630   
     631    // If didDequeue is true, then we dequeued ourselves. This means that we were not unparked.
     632    // If didDequeue is false, then someone unparked us.
     633   
     634    RELEASE_ASSERT(!me->nextInQueue);
    604635
    605636    // Make sure that no matter what, me->address is null after this point.
    606637    {
    607638        std::lock_guard<std::mutex> locker(me->parkingLock);
     639        if (!didDequeue) {
     640            // If we were unparked then our address would have been reset by the unparker.
     641            RELEASE_ASSERT(!me->address);
     642        }
    608643        me->address = nullptr;
    609644    }
    610645
    611     // If we were not found in the search above, then we know that someone unparked us.
    612     return false;
     646    ParkResult result;
     647    result.wasUnparked = !didDequeue;
     648    if (!didDequeue) {
     649        // If we were unparked then there should be a token.
     650        result.token = me->token;
     651    }
     652    return result;
    613653}
    614654
     
    624664        address,
    625665        BucketMode::EnsureNonEmpty,
    626         [&] (ThreadData* element) {
     666        [&] (ThreadData* element, bool) {
    627667            if (element->address != address)
    628668                return DequeueResult::Ignore;
     
    644684        std::unique_lock<std::mutex> locker(threadData->parkingLock);
    645685        threadData->address = nullptr;
     686        threadData->token = 0;
    646687    }
    647688    threadData->parkingCondition.notify_one();
     
    652693NEVER_INLINE void ParkingLot::unparkOneImpl(
    653694    const void* address,
    654     const ScopedLambda<void(ParkingLot::UnparkResult)>& callback)
     695    const ScopedLambda<intptr_t(ParkingLot::UnparkResult)>& callback)
    655696{
    656697    if (verbose)
    657698        dataLog(toString(currentThread(), ": unparking one the hard way.\n"));
    658 
     699   
    659700    ThreadData* threadData = nullptr;
     701    bool timeToBeFair = false;
    660702    dequeue(
    661703        address,
    662704        BucketMode::EnsureNonEmpty,
    663         [&] (ThreadData* element) {
     705        [&] (ThreadData* element, bool passedTimeToBeFair) {
    664706            if (element->address != address)
    665707                return DequeueResult::Ignore;
    666708            threadData = element;
     709            timeToBeFair = passedTimeToBeFair;
    667710            return DequeueResult::RemoveAndStop;
    668711        },
     
    671714            result.didUnparkThread = !!threadData;
    672715            result.mayHaveMoreThreads = result.didUnparkThread && mayHaveMoreThreads;
    673             callback(result);
     716            if (timeToBeFair)
     717                RELEASE_ASSERT(threadData);
     718            result.timeToBeFair = timeToBeFair;
     719            intptr_t token = callback(result);
     720            if (threadData)
     721                threadData->token = token;
    674722        });
    675723
     
    695743        address,
    696744        BucketMode::IgnoreEmpty,
    697         [&] (ThreadData* element) {
     745        [&] (ThreadData* element, bool) {
    698746            if (verbose)
    699747                dataLog(toString(currentThread(), ": Observing element with address = ", RawPointer(element->address), "\n"));
  • trunk/Source/WTF/wtf/ParkingLot.h

    r201433 r203350  
    4444    // Parks the thread in a queue associated with the given address, which cannot be null. The
    4545    // parking only succeeds if the validation function returns true while the queue lock is held.
     46    //
    4647    // If validation returns false, it will unlock the internal parking queue and then it will
    47     // return without doing anything else. If validation returns true, it will enqueue the thread,
    48     // unlock the parking queue lock, call the beforeSleep function, and then it will sleep so long
    49     // as the thread continues to be on the queue and the timeout hasn't fired. Finally, this
    50     // returns true if we actually got unparked or false if the timeout was hit. Note that
    51     // beforeSleep is called with no locks held, so it's OK to do pretty much anything so long as
    52     // you don't recursively call parkConditionally(). You can call unparkOne()/unparkAll() though.
    53     // It's useful to use beforeSleep() to unlock some mutex in the implementation of
     48    // return a null ParkResult (wasUnparked = false, token = 0) without doing anything else.
     49    //
     50    // If validation returns true, it will enqueue the thread, unlock the parking queue lock, call
     51    // the beforeSleep function, and then it will sleep so long as the thread continues to be on the
     52    // queue and the timeout hasn't fired. Finally, this returns wasUnparked = true if we actually
     53    // got unparked or wasUnparked = false if the timeout was hit. When wasUnparked = true, the
     54    // token will contain whatever token was returned from the callback to unparkOne(), or 0 if the
     55    // thread was unparked using unparkAll() or the form of unparkOne() that doesn't take a
     56    // callback.
     57    //
     58    // Note that beforeSleep is called with no locks held, so it's OK to do pretty much anything so
     59    // long as you don't recursively call parkConditionally(). You can call unparkOne()/unparkAll()
     60    // though. It's useful to use beforeSleep() to unlock some mutex in the implementation of
    5461    // Condition::wait().
     62    struct ParkResult {
     63        bool wasUnparked { false };
     64        intptr_t token { 0 };
     65    };
    5566    template<typename ValidationFunctor, typename BeforeSleepFunctor>
    56     static bool parkConditionally(
     67    static ParkResult parkConditionally(
    5768        const void* address,
    5869        ValidationFunctor&& validation,
     
    7081    // indefinitely so long as the value at the given address hasn't changed.
    7182    template<typename T, typename U>
    72     static bool compareAndPark(const Atomic<T>* address, U expected)
     83    static ParkResult compareAndPark(const Atomic<T>* address, U expected)
    7384    {
    7485        return parkConditionally(
     
    8293    }
    8394
     95    // Unparking status given to you anytime you unparkOne().
     96    struct UnparkResult {
     97        // True if some thread was unparked.
     98        bool didUnparkThread { false };
     99        // True if there may be more threads on this address. This may be conservatively true.
     100        bool mayHaveMoreThreads { false };
     101        // This bit is randomly set to true indicating that it may be profitable to unlock the lock
     102        // using a fair unlocking protocol. This is most useful when used in conjunction with
     103        // unparkOne(address, callback).
     104        bool timeToBeFair { false };
     105    };
     106
    84107    // Unparks one thread from the queue associated with the given address, which cannot be null.
    85108    // Returns true if there may still be other threads on that queue, or false if there definitely
    86109    // are no more threads on the queue.
    87     struct UnparkResult {
    88         bool didUnparkThread { false };
    89         bool mayHaveMoreThreads { false };
    90     };
    91110    WTF_EXPORT_PRIVATE static UnparkResult unparkOne(const void* address);
    92111
     112    // This is an expert-mode version of unparkOne() that allows for really good thundering herd
     113    // avoidance and eventual stochastic fairness in adaptive mutexes.
     114    //
    93115    // Unparks one thread from the queue associated with the given address, and calls the given
    94     // functor while the address is locked. Reports to the callback whether any thread got unparked
    95     // and whether there may be any other threads still on the queue. This is an expert-mode version
    96     // of unparkOne() that allows for really good thundering herd avoidance in adaptive mutexes.
    97     // Without this, a lock implementation that uses unparkOne() has to have some trick for knowing
    98     // if there are still threads parked on the queue, so that it can set some bit in its lock word
    99     // to indicate that the next unlock() also needs to unparkOne(). But there is a race between
    100     // manipulating that bit and some other thread acquiring the lock. It's possible to work around
    101     // that race - see Rusty Russel's well-known usersem library - but it's not pretty. This form
    102     // allows that race to be completely avoided, since there is no way that a thread can be parked
    103     // while the callback is running.
     116    // callback while the address is locked. Reports to the callback whether any thread got
     117    // unparked, whether there may be any other threads still on the queue, and whether this may be
     118    // a good time to do fair unlocking. The callback returns an intptr_t token, which is returned
     119    // to the unparked thread via ParkResult::token.
     120    //
     121    // WTF::Lock and WTF::Condition both use this form of unparkOne() because it allows them to use
     122    // the ParkingLot's internal queue lock to serialize some decision-making. For example, if
     123    // UnparkResult::mayHaveMoreThreads is false inside the callback, then we know that at that
     124    // moment nobody can add any threads to the queue because the queue lock is still held. Also,
     125    // WTF::Lock uses the timeToBeFair and token mechanism to implement eventual fairness.
    104126    template<typename Callback>
    105127    static void unparkOne(const void* address, Callback&& callback)
    106128    {
    107         unparkOneImpl(address, scopedLambda<void(UnparkResult)>(std::forward<Callback>(callback)));
     129        unparkOneImpl(address, scopedLambda<intptr_t(UnparkResult)>(std::forward<Callback>(callback)));
    108130    }
    109131
     
    127149
    128150private:
    129     WTF_EXPORT_PRIVATE static bool parkConditionallyImpl(
     151    WTF_EXPORT_PRIVATE static ParkResult parkConditionallyImpl(
    130152        const void* address,
    131153        const ScopedLambda<bool()>& validation,
     
    134156   
    135157    WTF_EXPORT_PRIVATE static void unparkOneImpl(
    136         const void* address, const ScopedLambda<void(UnparkResult)>& callback);
     158        const void* address, const ScopedLambda<intptr_t(UnparkResult)>& callback);
    137159
    138160    WTF_EXPORT_PRIVATE static void forEachImpl(const std::function<void(ThreadIdentifier, const void*)>&);
  • trunk/Tools/ChangeLog

    r203338 r203350  
     12016-07-02  Filip Pizlo  <fpizlo@apple.com>
     2
     3        WTF::Lock should be fair eventually
     4        https://bugs.webkit.org/show_bug.cgi?id=159384
     5
     6        Reviewed by Geoffrey Garen.
     7
     8        * TestWebKitAPI/Tests/WTF/ParkingLot.cpp:
     9
    1102016-07-17  Sam Weinig  <sam@webkit.org>
    211
  • trunk/Tools/TestWebKitAPI/Tests/WTF/ParkingLot.cpp

    r198053 r203350  
    149149           
    150150            // We need to wait.
    151             if (ParkingLot::compareAndPark(&semaphore, newSemaphoreValue)) {
     151            if (ParkingLot::compareAndPark(&semaphore, newSemaphoreValue).wasUnparked) {
    152152                // We did wait, and then got woken up. This means that someone who up'd the semaphore
    153153                // passed ownership onto us.
Note: See TracChangeset for help on using the changeset viewer.