Category: Operating System

volatile considered harmful in kernel 

Why the "volatile" type class should not be used

C programmers have often taken volatile to mean that the variable could be
changed outside of the current thread of execution; as a result, they are
sometimes tempted to use it in kernel code when shared data structures are
being used.  In other words, they have been known to treat volatile types
as a sort of easy atomic variable, which they are not.  The use of volatile in
kernel code is almost never correct; this document describes why.

The key point to understand with regard to volatile is that its purpose is
to suppress optimization, which is almost never what one really wants to
do.  In the kernel, one must protect shared data structures against
unwanted concurrent access, which is very much a different task.  The
process of protecting against unwanted concurrency will also avoid almost
all optimization-related problems in a more efficient way.

Like volatile, the kernel primitives which make concurrent access to data
safe (spinlocks, mutexes, memory barriers, etc.) are designed to prevent
unwanted optimization.  If they are being used properly, there will be no
need to use volatile as well.  If volatile is still necessary, there is
almost certainly a bug in the code somewhere.  In properly-written kernel
code, volatile can only serve to slow things down.

Consider a typical block of kernel code:


If all the code follows the locking rules, the value of shared_data cannot
change unexpectedly while the_lock is held.  Any other code which might
want to play with that data will be waiting on the lock.  The spinlock
primitives act as memory barriers - they are explicitly written to do so -
meaning that data accesses will not be optimized across them.  So the
compiler might think it knows what will be in shared_data, but the
spin_lock() call, since it acts as a memory barrier, will force it to
forget anything it knows.  There will be no optimization problems with
accesses to that data.

If shared_data were declared volatile, the locking would still be
necessary.  But the compiler would also be prevented from optimizing access
to shared_data _within_ the critical section, when we know that nobody else
can be working with it.  While the lock is held, shared_data is not
volatile.  When dealing with shared data, proper locking makes volatile
unnecessary - and potentially harmful.

The volatile storage class was originally meant for memory-mapped I/O
registers.  Within the kernel, register accesses, too, should be protected
by locks, but one also does not want the compiler "optimizing" register
accesses within a critical section.  But, within the kernel, I/O memory
accesses are always done through accessor functions; accessing I/O memory
directly through pointers is frowned upon and does not work on all
architectures.  Those accessors are written to prevent unwanted
optimization, so, once again, volatile is unnecessary.

Another situation where one might be tempted to use volatile is
when the processor is busy-waiting on the value of a variable.  The right
way to perform a busy wait is:

    while (my_variable != what_i_want)

The cpu_relax() call can lower CPU power consumption or yield to a
hyperthreaded twin processor; it also happens to serve as a compiler
barrier, so, once again, volatile is unnecessary.  Of course, busy-
waiting is generally an anti-social act to begin with.

There are still a few rare situations where volatile makes sense in the

  - The above-mentioned accessor functions might use volatile on
    architectures where direct I/O memory access does work.  Essentially,
    each accessor call becomes a little critical section on its own and
    ensures that the access happens as expected by the programmer.

  - Inline assembly code which changes memory, but which has no other
    visible side effects, risks being deleted by GCC.  Adding the volatile
    keyword to asm statements will prevent this removal.

  - The jiffies variable is special in that it can have a different value
    every time it is referenced, but it can be read without any special
    locking.  So jiffies can be volatile, but the addition of other
    variables of this type is strongly frowned upon.  Jiffies is considered
    to be a "stupid legacy" issue (Linus's words) in this regard; fixing it
    would be more trouble than it is worth.

  - Pointers to data structures in coherent memory which might be modified
    by I/O devices can, sometimes, legitimately be volatile.  A ring buffer
    used by a network adapter, where that adapter changes pointers to
    indicate which descriptors have been processed, is an example of this
    type of situation.

For most code, none of the above justifications for volatile apply.  As a
result, the use of volatile is likely to be seen as a bug and will bring
additional scrutiny to the code.  Developers who are tempted to use
volatile should take a step back and think about what they are truly trying
to accomplish.

Patches to remove volatile variables are generally welcome - as long as
they come with a justification which shows that the concurrency issues have
been properly thought through.




Original impetus and research by Randy Dunlap
Written by Jonathan Corbet
Improvements via comments from Satyam Sharma, Johannes Stezenbach, Jesper
	Juhl, Heikki Orsila, H. Peter Anvin, Philipp Hahn, and Stefan

good explanation about spinlock and semaphor

saw this post from stackoverflow. very good explanation.

Spinlock and semaphore differ mainly in four things:

1. What they are
A spinlock is one possible implementation of a lock, namely one that is implemented by busy waiting (“spinning”). A semaphore is a generalization of a lock (or, the other way around, a lock is a special case of a semaphore). Usually, but not necessarily, spinlocks are only valid within one process whereas semaphores can be used to synchronize between different processes, too.

A lock works for mutual exclusion, that is one thread at a time can acquire the lock and proceed with a “critical section” of code. Usually, this means code that modifies some data shared by several threads.
A semaphore has a counter and will allow itself being acquired by one or several threads, depending on what value you post to it, and (in some implementations) depending on what its maximum allowable value is.

Insofar, one can consider a lock a special case of a semaphore with a maximum value of 1.

2. What they do
As stated above, a spinlock is a lock, and therefore a mutual exclusion (strictly 1 to 1) mechanism. It works by repeatedly querying and/or modifying a memory location, usually in an atomic manner. This means that acquiring a spinlock is a “busy” operation that possibly burns CPU cycles for a long time (maybe forever!) while it effectively achieves “nothing”.
The main incentive for such an approach is the fact that a context switch has an overhead equivalent to spinning a few hundred (or maybe thousand) times, so if a lock can be acquired by burning a few cycles spinning, this may overall very well be more efficient. Also, for realtime applications it may not be acceptable to block and wait for the scheduler to come back to them at some far away time in the future.

A semaphore, by contrast, either does not spin at all, or only spins for a very short time (as an optimization to avoid the syscall overhead). If a semaphore cannot be acquired, it blocks, giving up CPU time to a different thread that is ready to run. This may of course mean that a few milliseconds pass before your thread is scheduled again, but if this is no problem (usually it isn’t) then it can be a very efficient, CPU-conservative approach.

3. How they behave in presence of congestion
It is a common misconception that spinlocks or lock-free algorithms are “generally faster”, or that they are only useful for “very short tasks” (ideally, no synchronization object should be held for longer than absolutely necessary, ever).
The one important difference is how the different approaches behave in presence of congestion.

A well-designed system normally has low or no congestion (this means not all threads try to acquire the lock at the exact same time). For example, one would normally not write code that acquires a lock, then loads half a megabyte of zip-compressed data from the network, decodes and parses the data, and finally modifies a shared reference (append data to a container, etc.) before releasing the lock. Instead, one would acquire the lock only for the purpose of accessing the shared resource.
Since this means that there is considerably more work outside the critical section than inside it, naturally the likelihood for a thread being inside the critical section is relatively low, and thus few threads are contending for the lock at the same time. Of course every now and then two threads will try to acquire the lock at the same time (if this couldn’t happen you wouldn’t need a lock!), but this is rather the exception than the rule in a “healthy” system.

In such a case, a spinlock greatly outperforms a semaphore because if there is no lock congestion, the overhead of acquiring the spinlock is a mere dozen cycles as compared to hundreds/thousands of cycles for a context switch or 10-20 million cycles for losing the remainder of a time slice.

On the other hand, given high congestion, or if the lock is being held for lengthy periods (sometimes you just can’t help it!), a spinlock will burn insane amounts of CPU cycles for achieving nothing.
A semaphore (or mutex) is a much better choice in this case, as it allows a different thread to run useful tasks during that time. Or, if no other thread has something useful to do, it allows the operating system to throttle down the CPU and reduce heat / conserve energy.

Also, on a single-core system, a spinlock will be quite inefficient in presence of lock congestion, as a spinning thread will waste its complete time waiting for a state change that cannot possibly happen (not until the releasing thread is scheduled, which isn’t happening while the waiting thread is running!). Therefore, given any amount of contention, acquiring the lock takes around 1 1/2 time slices in the best case (assuming the releasing thread is the next one being scheduled), which is not very good behaviour.

4. How they’re implemented
A semaphore will nowadays typically wrap sys_futex under Linux (optionally with a spinlock that exits after a few attempts).
A spinlock is typically implemented using atomic operations, and without using anything provided by the operating system. In the past, this meant using either compiler intrinsics or non-portable assembler instructions. Meanwhile both C++11 and C11 have atomic operations as part of the language, so apart from the general difficulty of writing provably correct lock-free code, it is now possible to implement lock-free code in an entirely portable and (almost) painless way.

What is spinlock, Peterson’s algorithm,Semaphore (programming),Dining philosophers problem, paging, etc starvation and deadlock

What is fragmentation? Different types of fragmentation? – Fragmentation occurs in a dynamic memory allocation system when many of the free blocks are too small to satisfy any request. External Fragmentation: External Fragmentation happens when a dynamic memory allocation algorithm allocates some memory and a small piece is left over that cannot be effectively used. If too much external fragmentation occurs, the amount of usable memory is drastically reduced. Total memory space exists to satisfy a request, but it is not contiguous.Internal Fragmentation: Internal fragmentation is the space wasted inside of allocated memory blocks because of restriction on the allowed sizes of allocated blocks. Allocated memory may be slightly larger than requested memory; this size difference is memory internal to a partition, but not being used

What is a Safe State and what is its use in deadlock avoidance? – When a process requests an available resource, system must decide if immediate allocation leaves the system in a safe state. System is in safe state if there exists a safe sequence of all processes. Deadlock Avoidance: ensure that a system will never enter an unsafe state.

What is CPU Scheduler? – Selects from among the processes in memory that are ready to execute, and allocates the CPU to one of them. CPU scheduling decisions may take place when a process: 1.Switches from running to waiting state. 2.Switches from running to ready state. 3.Switches from waiting to ready. 4.Terminates. Scheduling under 1 and 4 is non-preemptive. All other scheduling is preemptive.


When paging is used, a problem called “thrashing” can occur, in which the computer spends an unsuitable amount of time swapping pages to and from a backing store, hence slowing down useful work. Adding real memory is the simplest response, but improving application design, scheduling, and memory usage can help.

Journaling file system