The concepts of process, thread, and task are fundamental to understanding how an operating system works. You need a good understanding of threads and how they work before you can learn asynchrony, parallelism, and concurrency. This article discusses these concepts in detail, with relevant code examples wherever appropriate.

To work with the code examples discussed in this article, you need the following installed on your system:

  • Visual Studio 2022
  • .NET 9.0
  • ASP.NET 9.0 Runtime

If you don't already have Visual Studio 2022 installed on your computer, you can download it from here: https://visualstudio.microsoft.com/downloads/.

I'll start by showing you the basic concepts and then explore how you can program against each of them.

Overview of Threads, Multithreading, and Multitasking

In this section, I'll examine threads, multithreading, multitasking, multiprocessing, and other related concepts. Before I delve deeply into these concepts, let's understand what resources are in the context of an operating system.

Resources

Resources, in an operating system (OS), are components required by a process or thread to function. Typical examples include hardware components such as central processing unit (CPU) time, memory space, and input/output (I/O) devices (printers, disk drives), as well as files. The operating system manages these resources to ensure that processes have access to them in a controlled and efficient manner. It's extremely important for an operating system to apply proper algorithms and strategies to manage and control the allocation, scheduling, and release of resources to ensure seamless functioning of operations.

From the operating system's perspective, there are two types of resources: preemptive resources and non-preemptive resources.

Preemptive resources can be taken away from a process and then re-allocated when the need arises. In other words, their allocation can be preemptively interrupted, and then assigned to another process, with the intention of reallocating back to the original process at a later time. The state of the resource can be saved and restored later, so the process can continue from where it was halted.

A good example of a preemptive resource is the CPU: An operating system can interrupt running processes (context switch), let other programs use the CPU for some share of time, and finally give it back to the first one without causing any harm. It should be noted that saving and restoring resource states of preemptive resources can be complex and costly.

Non-preemptive resources are also known as non-sharable resources because they remain with the process that holds them until that process releases them. Removing a non-preemptive resource from a process could lead to processing errors or data corruption. Thus, no other process can use the resource until the owning process has finished with it and voluntarily releases it.

Typical examples of non-preemptive resources include printers and scanners. A printer is a non-preemptive resource because, once started, a printing job can't be stopped halfway through and handed to another process without possible loss or damage in the output. The main problem associated with non-preemptive resources is avoiding resource deadlocks, because a resource can't be taken from a process until the process no longer requires it.

CPU-Bound and I/O-Bound Operations

Any operation can be classified as either CPU-bound (or CPU-intensive) or I/O-bound (or I/O-intensive). A CPU-bound operation is one that spends most of its time executing on the CPU. Long computations, such as massive data processing, are the most common examples. An I/O-bound operation is one that spends most of its time waiting for I/O operations to complete. Typical examples are reading or writing files, copying files, reading or writing data to and from a database, and downloading data off the internet.

Thread

A thread is a low-level concept that represents the smallest unit of CPU use in a process. Memory space is shared between threads belonging to the same process. Threads are used to achieve parallelism on multi-core processors by enabling the execution of several threads concurrently. However, shared memory space in multithreading is complicated because synchronization and state management become difficult to manage. It should be noted that any process must have at least one thread. This thread is also known as the main or primary thread. All other threads in a process are known as worker threads.
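To make the point about shared memory concrete, here's a minimal sketch (not taken from any particular application) in which two threads increment the same variable. The names counter, counterLock, and Increment are hypothetical, and the sketch assumes a console app that uses top-level statements. The lock statement serializes the increments so that no update is lost.

```csharp
using System;
using System.Threading;

// Hypothetical shared state: both threads see the same variable
// because they belong to the same process.
int counter = 0;
object counterLock = new object();

void Increment()
{
    for (int i = 0; i < 100_000; i++)
    {
        // Without the lock, the read-modify-write performed by counter++
        // could interleave across threads and lose updates.
        lock (counterLock)
        {
            counter++;
        }
    }
}

Thread first = new Thread(Increment);
Thread second = new Thread(Increment);
first.Start();
second.Start();

// Wait for both worker threads to finish before reading the result.
first.Join();
second.Join();

Console.WriteLine($"Final counter value: {counter}");
```

With the lock in place, the final value is always 200000. If you remove the lock, the result may vary from run to run, which is exactly the kind of synchronization problem described above.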

Task

A task is a high-level abstraction on top of a thread designed to support asynchronous operations. A task is often used in the context of asynchronous programming and encapsulates the job that needs to be performed. Unlike threads, which are managed by the underlying operating system, tasks are managed by the runtime environment in which they are executed, i.e., the CLR, if you're using .NET Core.
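As a rough illustration of the difference, the following sketch queues a small job as a task instead of creating a thread by hand. The SumTo helper is a hypothetical example of work to be done, and the sketch assumes a console app with top-level statements.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical unit of work: add up the integers from 1 to n.
long SumTo(int n)
{
    long total = 0;
    for (int i = 1; i <= n; i++)
    {
        total += i;
    }
    return total;
}

// Task.Run hands the work to the runtime, which schedules it on a
// thread pool thread. You never create or manage the thread yourself.
Task<long> sumTask = Task.Run(() => SumTo(1_000));
Console.WriteLine("Work queued; the caller is free to continue.");

// Awaiting the task yields its result once the work has completed.
long sum = await sumTask;
Console.WriteLine($"Sum: {sum}");
```

Note that the task encapsulates the job and its eventual result, while the decision about which thread runs it is left to the runtime.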

Multiprogramming

Basically, there are two types of processing: Serial Processing and Batch Processing. In serial processing, jobs are processed in a pre-determined sequence, meaning that each job is processed after the previous one has completed its execution. It should be noted that the processor is capable of executing one and only one job at a given point of time. Incidentally, serial processing was used before disk technology came into being.

Batch processing involves sending a bunch of programs to a tape drive at once. Typically, the jobs that have similar functionalities are grouped into a batch. These jobs are not processed at the same time. Instead, each request is processed one at a time, and the results are returned to the user after it has been processed. All such batches of jobs are read and run by the CPU and the result or output is written to another tape drive.

Multiprogramming is where an operating system loads multiple programs in the main memory (RAM) and runs them concurrently, switching the CPU to another program whenever the running one has to wait. A variation of batch processing, multiprogramming attempts to maximize CPU use and enhance system throughput. This implies that it keeps the CPU busy most of the time and gives each program a share of the CPU processing time.

Figure 1 shows a typical multiprogramming operating system.

Figure 1: A typical multiprogramming operating system

Multithreading

Typically, applications are either single-threaded or multithreaded based on how the processor executes the threads. In a single-threaded application, when the running thread has finished execution, the operating system schedules another thread for execution. This approach doesn't provide better system throughput (a measure of work completed per unit time). You cannot schedule another thread once a thread is in execution, and a single thread monopolizes the processor.

In a multithreaded application, several threads reside in the memory simultaneously, with one being in an execution state. Without waiting for the turnaround time of the currently executing thread to complete, new threads may be scheduled in these applications.

The operating system supports two types of scheduling: preemptive and non-preemptive. In preemptive scheduling, the thread in execution state is interrupted by the operating system and replaced by a different thread. This increases the system throughput because the threads waiting for the CPU are scheduled even before the executing thread relinquishes control of the processor. In non-preemptive scheduling, the running thread doesn't relinquish control of the CPU until its execution is complete.

The key benefits of multithreading include the following:

  • Increased responsiveness
  • Better resource sharing within the processes
  • Enhanced scalability
  • Better use of multiple processing cores

Figure 2 demonstrates a multithreaded application at work.

Figure 2: A typical multithreaded system

Multitasking

Multitasking is an operating system's capability to make multiple tasks appear to run simultaneously by switching between them to enhance scalability and system throughput. In a typical multitasking system, the tasks are not actually executed simultaneously. Instead, the operating system uses its task scheduler to schedule tasks as needed and manages multiple tasks by allocating time slices to each of them.

Multitasking is of the following types:

  • Cooperative Multitasking: This is a type of multitasking in which a running task voluntarily relinquishes control of the CPU to allow other tasks to run. It should be noted that Windows 3.x supported cooperative multitasking.
  • Preemptive Multitasking: In this case, the operating system allocates a limited time slice to each task, which allows the operating system to manage and control access to the CPU. As soon as this allotted time span elapses, the operating system preempts the running task and schedules another runnable task from the queue. This type of multitasking is supported by Windows 9x and later versions of the Windows operating systems.

Figure 3 shows a typical multitasking system at work.

Figure 3: A typical multitasking system in action

Multiprocessing

Multiprocessing is defined as the ability of an operating system to run several programs or processes concurrently, either on a single processor with many cores or across multiple processors in the same system. This can improve performance, throughput and efficiency by allowing parallel execution of processes or threads across multiple CPUs or cores.

On a single core system, the threads pertaining to a multithreaded application never execute in parallel because they have to share a single core. On a multi-core system, there are multiple cores within a single processor and you can execute tasks in parallel. Figure 4 shows a multiprocessor system with multiple cores.

Figure 4: A typical multiprocessing system in action

Programming Threads in C#

In C#, you work with threads using the Thread class in the System.Threading namespace. Remember, you can have two types of threads: the application (main) thread and one or more worker threads. The application thread is created automatically by the runtime, whereas any thread you create programmatically is a worker thread.

The Thread class provides three important properties: IsAlive, IsBackground, and IsThreadPoolThread. You can check whether a thread is alive using the read-only IsAlive property, which is true if the thread has been started and hasn't yet terminated, and false otherwise. The IsBackground property is true if the thread is a background thread, and false otherwise. Unlike the other two, it's writable: setting it to true transforms a thread into a background thread.

IsThreadPoolThread is yet another read-only property; it indicates whether the thread belongs to the managed thread pool. If it's a thread pool thread, this property is true, and false otherwise.
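The following sketch shows one way to inspect these three properties. It assumes a console app with top-level statements, and the variable names are hypothetical. The worker reads its own IsBackground and IsThreadPoolThread values through Thread.CurrentThread while it's still running.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

bool workerIsBackground = false;
bool workerIsPoolThread = true;

// A manually created worker thread that records its own properties.
Thread worker = new Thread(() =>
{
    workerIsBackground = Thread.CurrentThread.IsBackground;
    workerIsPoolThread = Thread.CurrentThread.IsThreadPoolThread;
});

// IsBackground is writable; set it before starting the thread.
worker.IsBackground = true;

bool aliveBeforeStart = worker.IsAlive;  // false: the thread hasn't started
worker.Start();
worker.Join();
bool aliveAfterJoin = worker.IsAlive;    // false: the thread has finished

// Work queued with Task.Run executes on a managed thread pool thread.
bool taskRanOnPoolThread =
    await Task.Run(() => Thread.CurrentThread.IsThreadPoolThread);

Console.WriteLine($"Alive before start: {aliveBeforeStart}");
Console.WriteLine($"Alive after join: {aliveAfterJoin}");
Console.WriteLine($"Worker is background: {workerIsBackground}");
Console.WriteLine($"Worker is pool thread: {workerIsPoolThread}");
Console.WriteLine($"Task ran on pool thread: {taskRanOnPoolThread}");
```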

Creating a Thread

When a thread is first created, it's in a dormant (unstarted) state. To start a thread in C#, make a call to the Start method of the Thread class. The following code snippet demonstrates how you can create a worker thread in C#.

static void ThreadMethod()
{
    Console.WriteLine("Worker thread.");
}

static void CreateThread()
{
    Console.WriteLine("Primary thread.");
    Thread threadObj = new Thread(new ThreadStart(ThreadMethod));
    threadObj.Start();
}

In the preceding code example, as the name suggests, the method named ThreadMethod is the method that gets called when the thread is started. Note that ThreadStart is a delegate used to represent the thread method that gets executed when a thread is started. The following code snippet shows how you can start a worker thread which, in turn, is attached to a method that displays integers 1 to 10.

static void CreateThread()
{
    Thread threadObj = new Thread(new ThreadStart(DisplayNumbers));
    threadObj.Start();
}

static void DisplayNumbers()
{
    for (int i = 1; i <= 10; i++)
    {
        Console.WriteLine(i);
    }
}

You can also create a thread by passing the name of the thread method directly in the constructor of the Thread class, as shown in the following code example:

static void CreateThread()
{
    Console.WriteLine("Primary thread.");
    Thread threadObject = new Thread(ThreadMethod);
    threadObject.Start();
}

static void ThreadMethod()
{
    Console.WriteLine("Worker thread.");
}

Because creation and disposal of threads are resource-intensive operations, you can use a thread from the managed thread pool to queue tasks without having to manage the threads individually. The following code snippet illustrates how this can be achieved.

ThreadPool.QueueUserWorkItem((state) =>
{
    Console.WriteLine("This code will be executed by a thread " +
        "pertaining to the managed thread pool.");
});

Putting a Thread to Sleep in C#

You can leverage the Thread.Sleep() method to put a running thread to sleep. The Thread.Sleep method accepts an integer as a parameter that represents the timeout in milliseconds, as shown in the code snippet below.

static void CreateThread()
{
    Thread threadObject = new Thread(ThreadMethod);
    threadObject.Start();

    // Suspends the calling (primary) thread for one second.
    Thread.Sleep(1000);
}

When you invoke the Thread.Sleep method, it suspends the thread that called it for the period of time specified in milliseconds as a parameter. You can keep a thread alive indefinitely by using while(true) in your thread method, but such a busy loop consumes CPU cycles and is resource intensive. When the thread only needs to stay blocked, a better alternative is passing Timeout.Infinite as a parameter to the Thread.Sleep method, as shown below.

Thread.Sleep(Timeout.Infinite);

A call to the Thread.Sleep method puts the current execution context to sleep by invoking the sleep function of the OS kernel, allowing the CPU to continue to do other work.

Setting Thread Priority

The System.Threading namespace provides an enum called ThreadPriority that contains all supported thread priorities: Lowest, BelowNormal, Normal, AboveNormal, and Highest. By default, a managed thread is created with the priority ThreadPriority.Normal. You can set the priority of a managed thread through its Priority property before starting it, as shown in the following piece of code:

static void SetThreadPriority()
{
    Thread threadObj = new Thread(new ThreadStart(DisplayNumbers));
    threadObj.Priority = ThreadPriority.AboveNormal;
    threadObj.Name = "SetThreadPriority";
    threadObj.Start();
}

static void DisplayNumbers()
{
    for (int i = 1; i <= 10; i++)
    {
        Console.WriteLine(i);
    }
}

Retrieving the State of a Thread

You can take advantage of the ThreadState property to get the state of a thread, as shown in the code snippet below:

static void CreateThread()
{
    Console.WriteLine("Primary thread.");
    Thread threadObject = new Thread(ThreadMethod);

    // The thread hasn't been started yet.
    Console.WriteLine("The thread state is: " + threadObject.ThreadState);

    threadObject.Start();
    Thread.Sleep(1000);

    // By now, the worker thread has had time to finish.
    Console.WriteLine("The thread state is: " + threadObject.ThreadState);
}

static void ThreadMethod()
{
    Console.WriteLine("Worker thread.");
}

All supported states of a managed thread are defined in the ThreadState enum, and the read-only ThreadState property returns the current state of a particular managed thread as a value of that enum.

Suspending and Resuming a Thread

In C#, a thread can be suspended or resumed using the Suspend and Resume methods. These methods are deprecated, and their use is discouraged in both .NET Framework and .NET Core. This is because if you suspend a thread that's inside a critical section where it's holding a lock on a critical resource, the entire application might deadlock. A better way to handle this is by using a WaitHandle.

Wait handles help threads communicate with one another through signaling: a thread waits until it receives a notification from another thread. In C#, two types derive from EventWaitHandle: AutoResetEvent and ManualResetEvent. The basic difference between them is that an AutoResetEvent releases only one waiting thread per signal, whereas a ManualResetEvent lets all waiting threads continue until you reset it.

In an AutoResetEvent, WaitOne() and Reset() are effectively executed as a single atomic operation. To be more precise, when the Set() method is called, an AutoResetEvent enables one of the waiting threads to pass and then resets automatically as soon as that thread passes through eventObj.WaitOne(). A ManualResetEvent, by contrast, allows all waiting threads to pass when Set() is called and remains signaled until you call Reset() explicitly.
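Here's a minimal sketch of the difference, assuming a console app with top-level statements; all variable names are hypothetical. It first uses an AutoResetEvent to signal a waiting worker thread, and then polls both event types with WaitOne(0), which returns immediately, to show which one stays signaled.

```csharp
using System;
using System.Threading;

using var autoEvent = new AutoResetEvent(false);
using var manualEvent = new ManualResetEvent(false);

// Signaling: the worker blocks until the primary thread calls Set().
bool workerSignaled = false;
Thread worker = new Thread(() =>
{
    autoEvent.WaitOne();
    workerSignaled = true;
});
worker.Start();
autoEvent.Set();
worker.Join();
Console.WriteLine($"Worker received the signal: {workerSignaled}");

// AutoResetEvent: the first successful wait consumes the signal.
autoEvent.Set();
bool autoFirst = autoEvent.WaitOne(0);    // true, and the event resets
bool autoSecond = autoEvent.WaitOne(0);   // false, it was reset automatically

// ManualResetEvent: stays signaled until Reset() is called.
manualEvent.Set();
bool manualFirst = manualEvent.WaitOne(0);      // true
bool manualSecond = manualEvent.WaitOne(0);     // true, still signaled
manualEvent.Reset();
bool manualAfterReset = manualEvent.WaitOne(0); // false

Console.WriteLine($"Auto: {autoFirst}, {autoSecond}");
Console.WriteLine($"Manual: {manualFirst}, {manualSecond}, {manualAfterReset}");
```

In practice, an AutoResetEvent tends to fit hand-offs where one signal should wake exactly one waiter, and a ManualResetEvent fits gates that open for every waiter at once.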

Terminating a Thread

In .NET Framework, the Thread.Abort() method can be used to abort a running thread. The following code snippet shows how the Thread.Abort method can be used in C#:

Thread threadObj = new Thread(PerformSomeWork);
threadObj.Start();
// Usual code
threadObj.Abort();

However, this isn't a recommended approach because it terminates threads in an unsafe manner, and it isn't supported in .NET Core and later versions, where calling Thread.Abort throws a PlatformNotSupportedException. A recommended approach to thread termination is by using a CancellationToken, as shown in the code snippet given below:

public static void Start()
{
    CancellationTokenSource cancellationToken = new CancellationTokenSource();
    Thread thread = new Thread(() => MyThreadMethod(cancellationToken.Token));
    thread.Start();
    Thread.Sleep(1000);

    // Request cancellation and wait for the thread to exit.
    cancellationToken.Cancel();
    thread.Join();
}

public static void MyThreadMethod(CancellationToken cancellationToken)
{
    while (!cancellationToken.IsCancellationRequested)
    {
        Thread.Sleep(1000);
        Console.WriteLine("The thread method is running...");
    }

    Console.WriteLine("Cancellation requested, exiting thread.");
}

Deadlock

In a multiprogramming environment, several processes may often vie for a limited number of resources at the same time. When a process needs access to a resource that isn't available at the moment, it enters a waiting state. A deadlock is when two or more processes (or threads) wait for one another to release one or more resources, creating a cyclic dependency in which none of them can continue. Each of those processes is waiting, and each causes the others to wait, because it's awaiting a condition that only another waiting process can fulfill.

In other words, a deadlock occurs when multiple processes each hold a shared resource while waiting to acquire one held by another.

Let's say that process P1 and process P2 have acquired resources R1 and R2 respectively. Process P1 enters a waiting state to acquire resource R2 because process P2 hasn't relinquished control of it. Process P2 enters a waiting state to acquire control of resource R1, which has been acquired by process P1. The result is deadlock because each process waits indefinitely for access to a shared resource that has already been acquired by the other, thereby becoming entangled in a vicious cycle.

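The following code snippet illustrates the situation. The two outer lock statements run on different threads:

// Thread 1
lock (R1)
{
    // Some code
    lock (R2)
    {
        // Some code
    }
}

// Thread 2 (acquires the same locks in the opposite order)
lock (R2)
{
    // Some code
    lock (R1)
    {
        // Some code
    }
}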

The following piece of code shows how to use the Monitor.TryEnter method to acquire an exclusive lock on a shared object.

public class DeadlockExample
{
    private static readonly object lockObj = new object();

    public static void PreventDeadlock()
    {
        // Set to true by TryEnter only if the lock was acquired.
        bool lockTaken = false;

        try
        {
            Monitor.TryEnter(lockObj,
                TimeSpan.FromMilliseconds(10), ref lockTaken);

            if (lockTaken)
            {
                // This is the critical section
            }
        }
        catch (Exception ex)
        {
            Console.WriteLine(ex.ToString());
        }
        finally
        {
            // Release the lock only if it was actually acquired;
            // calling Monitor.Exit otherwise throws an exception.
            if (lockTaken)
                Monitor.Exit(lockObj);
        }
    }
}

The complete source code of the DeadlockExample class is given in Listing 3. In Listing 1, ProcessA and ProcessB acquire locks on resources objA and objB respectively. ProcessA waits to acquire access to resource objB after acquiring objA, and ProcessB waits for access to resource objA after acquiring access to resource objB, resulting in a deadlock situation.

Listing 1: Demonstrating a potential deadlock situation

static void ExecuteProcessA()
{
    lock (objA)
    {
        Console.WriteLine("Process A is holding lock on resource objA");
        Thread.Sleep(1000);
      
        Console.WriteLine("Process A is waiting for resource objB");
  
        lock (objB)
        {
            Console.WriteLine("Process A is holding lock on objB");
        }
    }
}

static void ExecuteProcessB()
{
    lock (objB)
    {
        Console.WriteLine("Process B is holding lock on resource objB");
        Thread.Sleep(1000);

        Console.WriteLine("Process B is waiting for resource objA");
        lock (objA)
        {
            Console.WriteLine("Process B is holding lock on objA");
        }
    }
}

In C#, several methods of the classes pertaining to the System.Threading namespace provide support for time-outs to help you detect and avoid deadlocks. For example, the following code snippet shows how you can gain a lock on a shared resource within a specific time interval. If it's not possible to gain the lock on the shared resource within the specified time frame, the Monitor.TryEnter method returns false.

if (Monitor.TryEnter(lockObj, 10))
{
    try
    {
        // This is the critical section.
        // Write your code here.
    }
    finally
    {
        Monitor.Exit(lockObj);
    }
}
else
{
    // This code will execute if the
    // request for acquiring a lock
    // on the shared resource times out.
}

Listing 2 shows how you can avoid the deadlock situation by establishing a consistent locking order, ensuring that the processes acquire locks on the resources in the same order to avoid any potential circular dependency. The ExecuteProcessA() method contains the necessary source code for process A, and the ExecuteProcessB() method includes the source code for process B. Note how circular dependencies have been prevented in these two methods by using a lock hierarchy, i.e., nested locks that are always taken in the same sequence. Because processes A and B acquire the locks in the same order, there can be no circular wait, and deadlock is prevented.

Listing 2: Solution to deadlock by avoiding circular dependencies

static void ExecuteProcessA()
{
    lock (objA)
    {
        Console.WriteLine("Process A is holding lock on resource objA");
        Thread.Sleep(1000);

        Console.WriteLine("Process A is waiting for resource objB");

        lock (objB)
        {
            Console.WriteLine("Process A is holding lock on objB");
        }
    }
}

static void ExecuteProcessB()
{
    lock (objA)
    {
        Console.WriteLine("Process B is holding lock on resource objA");
        Thread.Sleep(1000);

        Console.WriteLine("Process B is waiting for resource objB");

        lock (objB)
        {
            Console.WriteLine("Process B is holding lock on objB");
        }
    }
}

Listing 3 shows the complete source for the DeadlockExample class.

Listing 3: The complete source of the DeadlockExample class

public static class DeadlockExample
{
    static readonly object objA = new object();
    static readonly object objB = new object();

    public static void DeadlockDemo()
    {
        Thread threadA = new Thread(ExecuteProcessA);
        Thread threadB = new Thread(ExecuteProcessB);

        threadA.Start();
        threadB.Start();

        threadA.Join();
        threadB.Join();

        Console.WriteLine("End of program");
    }

    static void ExecuteProcessA()
    {
        lock (objA)
        {
            Console.WriteLine("Process A is holding lock on resource objA");
            Thread.Sleep(1000);

            Console.WriteLine("Process A is waiting for res. objB");

            lock (objB)
            {
                Console.WriteLine("Process A is holding lock on objB");
            }
        }
    }

    static void ExecuteProcessB()
    {
        lock (objA)
        {
            Console.WriteLine("Process B is holding lock on resource objA");
            Thread.Sleep(1000);
 
            Console.WriteLine("Process B is waiting for res. objB");
            lock (objB)
            {
                Console.WriteLine("Process B is holding lock on objB");
            }
        }
    }
}

Asynchronous Programming

Asynchrony is a programming paradigm that allows tasks to execute independently of the primary or the main thread of execution, thereby enabling programs to proceed with other tasks while waiting for an operation to complete. You can take advantage of callbacks, promises, futures, or events to handle completion of an asynchronous task.

.NET Core provides support for executing a piece of code or a method in a synchronous or asynchronous manner. Synchronous execution is a programming model that is blocking in nature. This implies that the execution of the program halts at each statement and waits for the statement to terminate before executing the next statement. As a result, the program pauses at each statement, resulting in execution delays. Figure 5 shows how a synchronous execution model works.

Figure 5: A typical synchronous execution model

Asynchronous programming refers to a programming paradigm where the tasks can run independently from one another. This ensures high throughput and improved responsiveness for your application. In other words, with asynchronous programming, you can execute tasks without blocking the execution flow or responsiveness of the application. By using asynchronous programming, you can make your application more responsive, scalable, and performant. Figure 6 shows how an asynchronous execution model works.
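The difference between the two models can be sketched as follows. FetchAsync is a hypothetical method that simulates an I/O-bound operation with Task.Delay, and the sketch assumes a console app with top-level statements. The first half awaits each operation before starting the next; the second half starts both and awaits them together with Task.WhenAll.

```csharp
using System;
using System.Diagnostics;
using System.Threading.Tasks;

// Hypothetical I/O-bound operation, simulated with a delay.
async Task<string> FetchAsync(string name, int delayMs)
{
    await Task.Delay(delayMs);
    return $"{name} done";
}

// One after the other: the second call starts only
// after the first one has completed.
var stopwatch = Stopwatch.StartNew();
string a = await FetchAsync("A", 300);
string b = await FetchAsync("B", 300);
long sequentialMs = stopwatch.ElapsedMilliseconds;

// Independently: both operations are in flight at the same time.
stopwatch.Restart();
Task<string> taskA = FetchAsync("A", 300);
Task<string> taskB = FetchAsync("B", 300);
string[] results = await Task.WhenAll(taskA, taskB);
long concurrentMs = stopwatch.ElapsedMilliseconds;

Console.WriteLine($"Sequential: {sequentialMs} ms ({a}, {b})");
Console.WriteLine($"Concurrent: {concurrentMs} ms ({string.Join(", ", results)})");
```

On a typical run, the second measurement is roughly half of the first because the two delays overlap, although the exact figures depend on your machine.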

Figure 6: A typical asynchronous execution model

Why Asynchrony Is Important

A non-blocking or asynchronous execution allows for executing tasks without waiting or slowing down the execution flow of your application.

Here are the benefits of asynchronous programming at a glance:

  • It makes the user interface responsive.
  • It increases scalability and performance of applications.
  • It prevents thread pool starvation.

There are certain downsides as well:

  • Asynchronous code is complex and difficult to maintain.
  • Asynchronous code might consume additional resources (memory, CPU, etc.).
  • Debugging asynchronous programming can be extremely challenging.

The async and await Keywords

The async keyword converts a method into an asynchronous one, enabling the use of the await keyword inside the method body. Note that Task and Task<T> are two awaitable types provided by .NET Core. An async method typically returns a Task or Task<T> object and performs non-blocking operations. You typically use the await keyword when calling an async method so that the caller resumes only after the operation has completed.

Some key features of the async keyword include:

  • You can use the async keyword for methods, lambda expressions, and anonymous methods, but not for constructors or properties.
  • An async method should have a minimum of one await expression in its body. In an async method, you can have multiple await keywords to handle multiple non-blocking operations.
  • Methods that are async can be chained together.

Using the await keyword suspends the async method that contains it until the awaited task finishes, and meanwhile returns control to that method's caller. In other words, when the await keyword is used to invoke an asynchronous method, it pauses execution of the awaiting method and asynchronously waits for the Task to complete, while the currently executing thread is freed to do other work (for example, it's sent back to the thread pool) in lieu of being kept in a blocked state. It should be noted that the await keyword can be used only in an async method. The caller method, i.e., the method that invokes the async method using the await keyword, should therefore also be made asynchronous.

Here are the key features of the await keyword:

  • You can use the await keyword only inside an async method.
  • It's possible to use the await keyword in any expression that returns a Task or a Task<T> object.
  • When the await keyword is used, the result of the Task object is unwrapped. This helps you to work directly with the result of the Task.
  • Usage of the await keyword causes the calling method to be paused and relinquish the control back to the caller method until the awaited task is finished.
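The points above can be sketched in a few lines. GetLengthAsync and the log list are hypothetical names, and the sketch assumes a console app with top-level statements. The log records the order in which the caller and the awaited method run, and the final await unwraps the Task<int> into a plain int.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

var log = new List<string>();

async Task<int> GetLengthAsync(string text)
{
    log.Add("callee: before await");

    // The method pauses here and control goes back to the caller.
    await Task.Delay(200);

    log.Add("callee: after await");
    return text.Length;
}

// Calling the method without await gives you the Task itself.
Task<int> pending = GetLengthAsync("hello");
log.Add("caller: running while the callee is paused");

// Awaiting the Task<int> unwraps its result into an int.
int length = await pending;
log.Add($"caller: length = {length}");

foreach (string entry in log)
{
    Console.WriteLine(entry);
}
```

The caller's second log entry is written while the awaited method is still paused at Task.Delay, which shows that awaiting doesn't block the thread that made the call.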

Consider the following piece of code that demonstrates an asynchronous method:

public async Task MyAsyncMethod()
{
    await Task.Delay(100);
}

You can invoke this method asynchronously using this code snippet:

await MyAsyncMethod();

In C#, you can define async void methods by placing the async keyword before a void return type in the method signature. Because such a method returns no Task for the caller to await, async void is generally reserved for event handlers. The following code snippet shows an async void method:

public async void AnAsyncVoidMethod()
{
    // Write your code here
}

CPU-Bound and I/O-Bound Code

You can use asynchronous code for CPU-bound as well as I/O-bound code but in different ways. You typically need to leverage Task and Task<T> in async code, which are constructs used to model work done in the background. When a method has the async keyword, it's converted into an asynchronous method, which allows you to include the await keyword within the body of the method. Note that unlike I/O-bound tasks, CPU-bound tasks require a worker thread to operate. These tasks leverage a thread from the thread pool to function.

The following method illustrates how you can implement CPU-bound async method in C#.

public async Task<int> GenerateFactorialAsync(int number)
{
    int factorial = 1;

    for (int i = 1; i <= number; i++)
    {
        factorial *= i;
    }

    return await Task.FromResult(factorial);
}

You can use the following piece of code to invoke the GenerateFactorialAsync async method you just created.

int number = 5;
int result = await Task.Run(() =>
{
    return GenerateFactorialAsync(number);
});
Console.WriteLine($"The factorial of {number} is: {result}");

The asynchronous programming model uses the async and the await keywords to implement asynchronous operations. In this programming paradigm you should:

  • Await an asynchronous operation that returns a Task or a Task<T> object for I/O-bound code.
  • Await an asynchronous operation initiated on a background thread by invoking the Task.Run() method for CPU-bound code.
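For the CPU-bound case, a common shape is to keep the computation itself synchronous and let Task.Run move it onto a thread pool thread. The following is a minimal sketch under that assumption; the Factorial helper is hypothetical, and the code assumes a console app with top-level statements.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical CPU-bound work, written as an ordinary synchronous method.
long Factorial(int number)
{
    long result = 1;
    for (int i = 2; i <= number; i++)
    {
        result *= i;
    }
    return result;
}

int number = 10;

// Task.Run executes the computation on a thread pool thread, and
// await resumes this code once the result is available.
long factorial = await Task.Run(() => Factorial(number));

Console.WriteLine($"The factorial of {number} is: {factorial}");
```

Keeping the computation synchronous leaves the choice of where it runs to the caller, which can invoke it directly or offload it with Task.Run as needed.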

The following code snippet is an example of an I/O-bound operation that shows how you can use await keyword to retrieve the content of a webpage asynchronously:

HttpClient httpClient = new HttpClient();
HttpResponseMessage httpResponseMessage =
    await httpClient.GetAsync("https://www.joydipkanjilal.com");
string content = await httpResponseMessage.Content.ReadAsStringAsync();
Console.WriteLine(content);

Asynchronous File Operations

Create two string variables that contain the file name or path of the input file and the destination file respectively. The following code snippet shows the ReadFileAsync method that can read a file asynchronously.

public async Task<string> ReadFileAsync(string inputFileName)
{
    using var streamReader = new StreamReader(inputFileName);
    return await streamReader.ReadToEndAsync();
}

You can now invoke the ReadFileAsync method using the following piece of code:

var data = await ReadFileAsync(@"D:\SampleData.txt");
Console.WriteLine(data);

To copy files asynchronously, you can use the CopyToAsync method of the FileStream class as shown below:

public async Task CopyToAsync(string inputFileName, string destinationFileName)
{
    using FileStream fileStream = File.Open(inputFileName, FileMode.Open);
    using FileStream destination = File.Create(destinationFileName);

    await fileStream.CopyToAsync(destination);
}

The following code snippet shows how you can write a file asynchronously:

public async Task WriteFileAsync(string destinationFileName, string text)
{
    byte[] buffer = Encoding.Unicode.GetBytes(text);
    int offset = 0;
    const int Buffer_Size = 4096;
    bool isAsync = true;

    using var fs = new FileStream(
        destinationFileName, FileMode.Append,
        FileAccess.Write, FileShare.None,
        Buffer_Size, isAsync);

    await fs.WriteAsync(buffer, offset, buffer.Length);
}

The complete source code is given in Listing 4.

Listing 4: Async File Operations

using System.IO;
using System.Text;

string sourceFileName = @"D:\Input\SampleData.txt";
string destinationFileName = @"D:\Output\SampleData.txt";

string text = "Hello World!";
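// Write to the source file so that the copy and read steps below
// operate on the data that was just written
await WriteFileAsync(sourceFileName, text);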

await CopyToAsync(sourceFileName, destinationFileName);

var data = await ReadFileAsync(sourceFileName);
Console.WriteLine(data);

Console.Read();
async Task<string> ReadFileAsync(string inputFileName)
{
    using var streamReader = new StreamReader(inputFileName);
    return await streamReader.ReadToEndAsync();
}

async Task WriteFileAsync(string destinationFileName, string text)
{
    // UTF-8 matches the default encoding that StreamReader uses in ReadFileAsync
    byte[] buffer = Encoding.UTF8.GetBytes(text);
    int offset = 0;
    const int Buffer_Size = 4096;
    bool isAsync = true;

    using var fileStream = new FileStream(
        destinationFileName, FileMode.Append,
        FileAccess.Write, FileShare.None,
        Buffer_Size, isAsync);

    await fileStream.WriteAsync(buffer, offset, buffer.Length);
}

async Task CopyToAsync(string sourceFileName, string destinationFileName)
{
    using FileStream fs = File.Open(sourceFileName, FileMode.Open);
    using FileStream destination = File.Create(destinationFileName);

    await fs.CopyToAsync(destination);
}

Handling Exceptions

Synchronous and asynchronous methods handle exceptions differently. In a synchronous method, if the runtime raises an exception, the exception object is sent up the call stack until it hits a suitable catch block capable of handling the instance.

If an exception is thrown in an async method, it's stored in the Task object that's returned by the method. These exceptions aren't visible until the task is awaited. Notably, asynchronous methods with a return type of void lack an accompanying Task object. If exceptions occur in these methods, they're triggered on the SynchronizationContext that was in use when the asynchronous method was invoked. Listing 5 shows the RetrieveAllExceptionsAsync method that illustrates how exceptions can be retrieved from a Task instance.

Listing 5: The RetrieveAllExceptionsAsync method

public async Task RetrieveAllExceptionsAsync()
{
    var firstTask = Task.Run(() => 
      throw new ArithmeticException("Error occurred: " + 
      typeof(ArithmeticException).ToString()));

    var secondTask = Task.Run(() => 
      throw new IndexOutOfRangeException("Error occurred: " +  
      typeof(IndexOutOfRangeException).ToString()));

    var thirdTask = Task.Run(() => 
      throw new InvalidOperationException("Error occurred: " +
      typeof(InvalidOperationException).ToString()));

    var fourthTask = Task.Run(() => 
      throw new DivideByZeroException("Error occurred: " +
      typeof(DivideByZeroException).ToString()));

    var fifthTask = Task.Run(() => 
      throw new IOException("Error occurred: " + 
      typeof(IOException).ToString()));

    Task tasks = Task.WhenAll(firstTask, secondTask, thirdTask, 
      fourthTask, fifthTask);

    try
    {
        await tasks;
    }
    catch
    {
        AggregateException exceptions = tasks.Exception;
    }
}

Whenever a task faults, its exceptions are wrapped in an AggregateException instance that's exposed through the Exception property of the Task object. When you await the task, only the first of these exceptions is rethrown; you can retrieve all of them using the InnerExceptions property of that AggregateException. Listing 6 shows how you can handle multiple exceptions thrown in an asynchronous method.

Listing 6: The MultipleExceptionsAsync method

public async Task MultipleExceptionsAsync()
{
    Task tasks = null;
    try
    {
        var firstTask = Task.Run(() => throw new
        ArithmeticException(typeof(ArithmeticException).ToString()));

        var secondTask = Task.Run(() => throw new
        IndexOutOfRangeException(typeof(IndexOutOfRangeException).ToString()));

        var thirdTask = Task.Run(() => throw new
        InvalidOperationException(typeof(InvalidOperationException).ToString()));

        tasks = Task.WhenAll(firstTask, secondTask, thirdTask);
        await tasks;
    }
    catch
    {
        AggregateException exceptions = tasks.Exception;

        foreach (var ex in tasks.Exception?.InnerExceptions ?? 
          new(Array.Empty<Exception>()))
        {
            Console.WriteLine(ex.GetType().ToString());
        }
    }
}
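
The difference between what await rethrows and what the task stores can be observed with a minimal sketch such as the one below. The FailAsync helper and the exception messages are hypothetical and exist only for illustration; the snippet assumes a console application that uses top-level statements.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical helper: an async method that always fails with the given exception.
async Task FailAsync(Exception exception)
{
    await Task.Yield();
    throw exception;
}

Task all = Task.WhenAll(
    FailAsync(new InvalidOperationException("first failure")),
    FailAsync(new ArgumentException("second failure")));

string rethrown = "";
int captured = 0;

try
{
    await all;
}
catch (Exception ex)
{
    // await surfaces a single exception...
    rethrown = ex.GetType().Name;
    // ...while the task keeps every exception in its AggregateException
    captured = all.Exception?.InnerExceptions.Count ?? 0;
}

Console.WriteLine($"await rethrew {rethrown}; the task captured {captured} exceptions.");
```

This is why the listings above keep a reference to the task returned by Task.WhenAll: the catch block needs it to inspect every failure rather than only the one that await rethrows.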

The primary difference between async void and async Task methods lies in how exceptions are handled. When exceptions occur in an async Task method, they're captured by the Task object returned by the method. This enables you to await the Task and handle the exceptions right away or at a later point. On the other hand, async void methods don't return a Task object, so you can't await them. Hence, any exceptions that occur inside these methods are raised on the SynchronizationContext that was active when the async void method started.

Consider the following code snippet that shows two methods, ProcessData and ProcessDataAsync. The former handles the exception and the latter throws an exception after a delay of 100 milliseconds.

public async void ProcessData()
{
    try
    {
        ProcessDataAsync();
    }
    catch (Exception ex)
    {
        Console.WriteLine("Error: " + ex.Message);
    }
}

public async void ProcessDataAsync()
{
    await Task.Delay(100);
    throw new Exception("This is a test error message.");
}

When you run the above piece of code, the exception thrown inside the ProcessDataAsync method isn't caught inside the catch block of the caller method, the ProcessData method. To solve this, change the signature of the ProcessDataAsync method from async void to async Task as shown in the code snippet given below.

public async Task ProcessDataAsync()
{
    await Task.Delay(100);
    throw new Exception("This is a test error message.");
}

public async void ProcessData()
{
    try
    {
        await ProcessDataAsync();
    }
    catch (Exception ex)
    {
        Console.WriteLine("Error: " + ex.Message);
    }
}

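Note that you can also leverage AggregateException.Handle to handle the exceptions you'd like to handle and leave the rest unhandled, as shown in Listing 7. The Handle method invokes the predicate for each inner exception: those for which the predicate returns true are considered handled, and any remaining ones are rethrown in a new AggregateException. Note how IndexOutOfRangeException and InvalidOperationException are handled while ArithmeticException is not.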

Listing 7: The MultipleExceptionsInAsyncCodeDemo method

public async Task MultipleExceptionsInAsyncCodeDemo()
{
    try
    {
        var firstTask = Task.Run(() => 
          throw new ArithmeticException("Error occurred: " 
          + typeof(ArithmeticException).ToString()));

        var secondTask = Task.Run(() => 
          throw new IndexOutOfRangeException("Error occurred: " +
          typeof(IndexOutOfRangeException).ToString()));

        var thirdTask = Task.Run(() => 
          throw new InvalidOperationException("Error occurred: " +
          typeof(InvalidOperationException).ToString()));

        Task.WaitAll(firstTask, secondTask, thirdTask);
    }
    catch (AggregateException ae)
    {
        ae.Handle(ex =>
        {
            if (ex is IndexOutOfRangeException)
                Console.WriteLine("""
                    An exception of type IndexOutOfRangeException occurred: 
                    """ + ex.Message);
            if (ex is InvalidOperationException)
                Console.WriteLine("""
                    An exception of type InvalidOperationException occurred: 
                    """ + ex.Message);
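            // Return true for every exception type treated as handled;
            // anything else is rethrown in a new AggregateException
            return ex is IndexOutOfRangeException
                || ex is InvalidOperationException;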
        });
    }
}

Microsoft recommends that you await an async method that returns either a Task or a Task<T> for I/O-bound code. For CPU-bound code, start the work on a background thread using the Task.Run() method and await the task it returns.
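
As a minimal sketch of the CPU-bound case, the snippet below offloads a computation to a thread pool thread with Task.Run and awaits the result. The SumOfSquares helper is hypothetical and merely stands in for any computationally intensive method; the snippet assumes a console application that uses top-level statements.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical CPU-bound helper: sums the squares of the integers 1..n.
static long SumOfSquares(int n)
{
    long total = 0;
    for (int i = 1; i <= n; i++)
        total += (long)i * i;
    return total;
}

// Offload the computation to a thread pool thread and await its result
long sum = await Task.Run(() => SumOfSquares(1_000));
Console.WriteLine($"Sum of squares: {sum}");   // prints "Sum of squares: 333833500"
```

The calling thread stays free while the computation runs, which matters most on a UI thread. For I/O-bound work, wrapping the call in Task.Run is unnecessary because the awaited operation doesn't occupy a thread while it waits.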

Best Practices

Here are some of the best practices you should adhere to when working with asynchronous methods in C#:

  • Mark methods that contain one or more await expressions with the async keyword so that the method signature clearly shows that such methods are asynchronous.
  • In asynchronous methods, return a Task or Task<T> rather than void, so that the caller method can await the results and handle errors elegantly.
  • Avoid async void methods because they can't be awaited and their exceptions can't be caught by the caller. Reserve async void for event handlers, whose delegate signatures require a void return type, and have the handler await an async Task method inside a try/catch block so that exceptions are handled gracefully.
  • Use ConfigureAwait(false) in library code so that continuations don't have to resume on the captured context. This reduces the likelihood of deadlocks when a caller blocks on the returned task.
  • Leverage Task.Run for CPU-bound operations in your application. This helps offload the work onto a separate thread and prevents blocking the main or the primary thread.
  • Take advantage of SemaphoreSlim to throttle the maximum number of concurrent tasks when you call an asynchronous method inside a loop, and use Task.WhenAll to await them together. This avoids excessive usage of system resources.
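
The throttling practice from the list above can be sketched as follows. This is an illustration under stated assumptions rather than a definitive implementation: ProcessAsync is a hypothetical method in which Task.Delay stands in for a real I/O call, and the limit of three concurrent operations is arbitrary.

```csharp
using System;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

const int MaxConcurrency = 3;   // assumed limit; tune it for your workload
using var throttle = new SemaphoreSlim(MaxConcurrency);

int running = 0;                // operations currently in flight
int peak = 0;                   // highest number observed in flight at once
object gate = new();

// Hypothetical unit of work; Task.Delay stands in for a real I/O call.
async Task<int> ProcessAsync(int id)
{
    await throttle.WaitAsync();             // wait asynchronously for a free slot
    try
    {
        lock (gate) { running++; peak = Math.Max(peak, running); }
        await Task.Delay(50);
        return id * 2;
    }
    finally
    {
        lock (gate) { running--; }
        throttle.Release();                 // always give the slot back
    }
}

// Start all ten operations; at most MaxConcurrency of them run at any moment
int[] results = await Task.WhenAll(
    Enumerable.Range(1, 10).Select(id => ProcessAsync(id)));

Console.WriteLine($"Completed {results.Length} operations; peak concurrency: {peak}");
```

Task.WhenAll by itself doesn't limit anything; it only waits for the tasks to finish. The semaphore is what keeps the number of in-flight operations at or below the limit.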

ASP.NET Core, unlike classic ASP.NET, doesn't have a SynchronizationContext. When an async method resumes execution, it runs on a thread pool thread because there's no request context to re-enter. As a result, you need not use ConfigureAwait(false) in ASP.NET Core application code. You may still need it in library code or in your desktop apps.

Parallelism

In computing, parallelism is defined as the capability of a system to execute multiple tasks simultaneously by leveraging multiple processors or cores. You can take advantage of parallelism to increase the overall performance and responsiveness of applications that process large amounts of data or handle computationally intensive tasks.

Why Parallelism Is Important

Because it can leverage all of the CPUs present in multi-core PCs and high-performance computing (HPC) clusters, parallel programming is indispensable when handling problems that involve massive computations. In parallel programming, you decompose a problem into smaller steps that can be executed concurrently, so that they run simultaneously on different cores within a CPU. Parallel programming is useful when there's a massive amount of data and application performance is crucial.

Although parallelism promises significant gains in performance and scalability, it's not a panacea. Not all workloads should be parallelized, particularly when interdependencies are involved. For example, one task might need to wait for another to complete before it can move ahead. In this case, parallelizing your tasks becomes quite a challenge.

Data and Task Parallelism

Data parallelism and task parallelism are two approaches to parallel computation often used to improve application performance on multi-core processors by splitting a problem into subtasks that can be solved at the same time. Additionally, the Task Parallel Library (TPL) supports data-parallel operations, such as executing loops in parallel (Parallel.For and Parallel.ForEach).

Data parallelism involves dividing data into subsets and performing identical operations on each subset at the same time. This approach is very effective for numerical algorithms that operate intensively on large datasets by repeatedly performing identical steps over elements.

In image processing, you use data parallelism if you need to go through every pixel in an image (like applying a filter), or for simulations where calculations have to be done again for lots of data points. Note that big data and other distributed systems need data parallelism because it enables developers to solve problems faster using more machines.
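
As a minimal sketch of data parallelism, the snippet below applies the same operation (inverting an intensity value) to every element of an array using Parallel.For. The pixels array holds made-up sample values rather than data from a real image.

```csharp
using System;
using System.Linq;
using System.Threading.Tasks;

// Sample data standing in for pixel intensities (assumed values in 0..255)
int[] pixels = Enumerable.Range(0, 1_000).Select(i => i % 256).ToArray();
int[] inverted = new int[pixels.Length];

// The same operation is applied to every element. Each iteration writes
// only to its own index, so no locking is required.
Parallel.For(0, pixels.Length, i =>
{
    inverted[i] = 255 - pixels[i];
});

Console.WriteLine($"First: {inverted[0]}, last: {inverted[^1]}");   // prints "First: 255, last: 24"
```

The loop body must not depend on the order in which iterations run, because Parallel.For is free to schedule them on different threads in any order.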

Unlike data parallelism, which entails performing the same operation on different subsets of data, task parallelism is concerned with performing different operations concurrently. When you use task parallelism, you break a problem into separate parts, called tasks, which can be executed concurrently. Leverage this type of parallelism when the tasks involve complex operations that might not operate on the same dataset.

In task parallelism, each task typically does a different thing and may operate on its own dataset or on data shared with other tasks. This type of parallelism is appropriate when the tasks are largely independent of one another.
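
A minimal sketch of task parallelism using Parallel.Invoke is shown below. Three different operations run as independent tasks over the same array; the variable names are chosen purely for illustration.

```csharp
using System;
using System.Linq;
using System.Threading.Tasks;

int[] numbers = Enumerable.Range(1, 100).ToArray();
int sum = 0, max = 0;
double average = 0;

// Three different operations run as independent tasks. Each lambda writes
// to its own variable, so the tasks don't interfere with one another.
Parallel.Invoke(
    () => sum = numbers.Sum(),
    () => max = numbers.Max(),
    () => average = numbers.Average());

Console.WriteLine($"Sum: {sum}, Max: {max}, Average: {average}");
```

Parallel.Invoke returns only after all the supplied actions have completed, so the results can be read safely on the line that follows the call.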

Parallel Programming

Parallel programming, or parallel processing, is a type of computation in which several computations are performed at once. This can help you solve a problem faster, generally by splitting the work across multiple processors or across multiple cores within the CPU. This technique exploits modern computers that have multiple CPUs or cores to achieve higher throughput in less time.

Here are the key use cases of parallel programming:

  • Climate research
  • Applied physics
  • Quantum mechanics
  • Advanced graphics
  • Modular modeling

Parallel Extensions

Parallel Extensions, formerly known as Parallel Framework Extensions (PFX), is an open-source, lightweight, managed concurrency library. With it, you can support both imperative and declarative parallelism as well as data and task parallelism. This allows LINQ developers to take advantage of multi-core systems while having complete support for all .NET query operators with minimal impact on the existing LINQ model. The library includes APIs that enable you to use multiple cores in your system and implement data parallelism and task parallelism in your applications.

The following libraries are a part of PFX:

  • Task Parallel Library (TPL)
  • Parallel LINQ (PLINQ)

Task Parallel Library (TPL)

The Task Parallel Library (TPL) encompasses a collection of APIs in the System.Threading and System.Threading.Tasks namespaces that are used to implement parallelism and concurrency. It provides an abstraction over lower-level constructs like threads, synchronization primitives, etc., allowing you to write more scalable code without having to deal with these low-level details directly.

The TPL includes the Parallel class, located in the System.Threading.Tasks namespace. This class provides static methods, such as For, ForEach, and Invoke. You should leverage the Parallel.For and Parallel.ForEach methods to parallelize loops or achieve imperative data parallelism. Use the Parallel.Invoke method to implement task parallelism.

The TPL is available in the System.Threading.Tasks namespace and is used to work with Task and Task<T> types. TPL abstracts several complexities associated with multithreading that include creation of threads, synchronization, and thread pooling.

A task is the basic unit of work in the TPL, represented as a single asynchronous operation analogous to a thread or a ThreadPool work item at a higher level of abstraction. A task can be executed concurrently with other tasks and can be created using the Task class or by invoking the Task.Run() method explicitly.

The following code snippet shows how you can create a basic task in C#:

Task task = Task.Run(() =>
{
    Console.WriteLine("Demonstrating a basic task in execution");
});
task.Wait();

Note that you should use Task<T> when you need to return a value. Here, T refers to the type of the value that should be returned from the async method. The following code snippet shows how you can return a value from a task in C#.

public static async Task Start()
{
    int x = 5, y = 10;
    Task<int> task = MultiplyAsync(x, y);
    int result = await task;
    Console.WriteLine($"The result is: {result}");
}

public static async Task<int> MultiplyAsync(int x, int y)
{
    await Task.Delay(100);
    return x * y;
}

The following piece of code shows how you can leverage the AsParallel() method to store the prime numbers between 1 and 100 in a collection, while handling the exceptions (if any) thrown in the try block.

try
{
    int[] integers = Enumerable.Range(1, 100).ToArray();
    var result = integers
        .AsParallel()
        .Select(n =>
        {
            if (n % 100 == 0)
                throw new InvalidOperationException(
                    $"Error occurred when the value of n was {n}");
            return n;
        })
        .Where(n => IsPrime(n))   // IsPrime is defined later in this article
        .ToList();
}
catch (AggregateException ae)
{
    foreach (var ex in ae.InnerExceptions)
    {
        Console.WriteLine($"Exception: {ex.Message}");
    }
}

Parallel LINQ (PLINQ)

Parallel LINQ, generally referred to as PLINQ, is a query execution engine for .NET Framework and .NET Core applications that runs LINQ to Objects or LINQ to XML queries on multiple processors and cores. It's a part of the Parallel Extensions library (previously known as Parallel Framework Extensions, or PFX) and splits the input data into partitions that can be handled independently by one or more threads.

Under the hood, PLINQ uses partitioning. This means that the input data set is divided into chunks/blocks/data partitions (these are in-memory). Each processing core leverages its own partitioned subset of the total collection being processed. Once all these results have been obtained individually from each thread running at different CPU cores simultaneously, the data is merged. This merging process involves combining the results from each thread to form the final set of data.

The following code snippet illustrates a basic PLINQ query:

var integers = Enumerable.Range(1, 100);
var result = integers.AsParallel()
    .Where(n => n % 2 == 0)
    .ToList();
Console.WriteLine(
    $"The total number of even numbers between 1 and 100 is: {result.Count}.");

Handling Exceptions in PLINQ

Leverage AggregateException to handle exceptions that occur when working with PLINQ using C#. The following code snippet shows how AggregateException can be used to handle exceptions in PLINQ:

try
{
    int[] integers = Enumerable.Range(1, 100).ToArray();
    var result = integers
        .AsParallel()
        .Select(n =>
        {
            if (n % 100 == 0)
                throw new InvalidOperationException("Error");
            return n;
        })
        .Where(n => IsPrime(n))
        .ToList();
}
catch (AggregateException ae)
{
    foreach (var ex in ae.InnerExceptions)
    {
        Console.WriteLine($"Exception: {ex.Message}");
    }
}

The IsPrime() method is given below:

static bool IsPrime(int integer)
{
    if (integer <= 1) return false;
    if (integer == 2) return true;

    for (int i = 2; i * i <= integer; i++)
        if (integer % i == 0)
            return false;

    return true;
}

The degree of parallelism is the maximum number of concurrently executing tasks that PLINQ uses to process a query. PLINQ allows you to control the degree of parallelism as well. For example, you can programmatically restrict your PLINQ query to at most two concurrent tasks, as shown in this code snippet:

var results = integers.AsParallel()
    .WithDegreeOfParallelism(2)
    .Select(n => n * 5)
    .ToList();

In C#, you can use the ThreadPool class instead of creating and destroying threads yourself. In this case, the work is executed by a thread that belongs to the thread pool. The ThreadPool class lets you queue work items, as shown below:

ThreadPool.QueueUserWorkItem((state) =>
{
    Console.WriteLine("This is a thread pool thread.");
});

The goal of PLINQ is to optimize the performance of a query while ensuring that the data returned from execution of the query is accurate. Ordering refers to arranging the elements of a sequence or collection based on one or more predefined criteria. Because preserving order can be computationally expensive, PLINQ doesn't preserve the order of the source sequence by default. Hence, the result set obtained from executing a PLINQ query may or may not be ordered.

Use PLINQ when the workload is compute-intensive, the dataset is large, and the system has enough resources (CPU cores). Remember that for smaller datasets, the overhead of partitioning the data and merging the results can outweigh the gains, making a PLINQ query slower than its sequential counterpart.

Here is an example PLINQ query that doesn't order the data:

var data = integers.AsParallel()
    .Where(n => n % 2 == 0)
    .ToList();

You can enable ordering of the result set in PLINQ as shown below:

var data = integers
    .AsParallel()
    .AsOrdered()
    .Where(n => n % 2 == 0)
    .ToList();

To improve performance, turn off ordering as shown below:

var data = integers.AsParallel()
    .Where(n => n > 25)
    .AsUnordered()
    .Select(n => n % 2 == 0)
    .ToList();
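
To see the effect of ordering on a result set, here's a minimal sketch that runs the same filter with and without AsOrdered. The range of 1 to 20 is an arbitrary choice made to keep the output short.

```csharp
using System;
using System.Linq;

int[] integers = Enumerable.Range(1, 20).ToArray();

// Without AsOrdered, the order of the results isn't guaranteed
var unordered = integers.AsParallel()
    .Where(n => n % 2 == 0)
    .ToList();

// AsOrdered preserves the order of the source sequence in the results
var ordered = integers.AsParallel()
    .AsOrdered()
    .Where(n => n % 2 == 0)
    .ToList();

Console.WriteLine("Unordered: " + string.Join(", ", unordered));
Console.WriteLine("Ordered:   " + string.Join(", ", ordered));
```

Both lists contain the same ten even numbers. Only the second one is guaranteed to list them from 2 to 20 in ascending order; the first may or may not, depending on how the partitions are scheduled.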

Concurrency

Concurrency is defined as the capability of a system to let several processes or threads make progress during overlapping periods of time. This is one of the most important features of modern operating systems, enabling efficient use of CPU resources and enhancing system performance. Concurrent execution can happen on a single processor by sharing CPU time among tasks, or on computers with multiple processors, where tasks truly run at the same time. By overlapping operations, for example doing useful work while another task waits for I/O, concurrency leads to faster completion times.

Although you can achieve concurrency on a single processor, parallelism is possible only on multi-core systems or distributed computers. This is because a single-core processor can execute only one thread at any given point in time.

Thread Safety and Synchronization Primitives

Synchronization is imperative in multithreaded applications to ensure that your applications execute correctly and maintain consistency of shared resources. There are several synchronization primitives available as part of .NET Core that can help you implement thread safety and synchronization in your applications.

Thread safety guarantees that shared data can be accessed and modified by several threads without corrupting the data or causing unexpected results. Synchronization primitives are language constructs or types used for ensuring thread safety. A critical section is a block of code that accesses a shared resource and must be executed by only one thread at any given time.
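
To illustrate why a critical section needs protection, here's a minimal sketch that increments three counters from many threads: one without synchronization, one inside a lock, and one using Interlocked.Increment. The counter names and the iteration count are arbitrary choices made for this illustration.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

int unsafeCounter = 0;
int lockedCounter = 0;
int interlockedCounter = 0;
object counterLock = new();

Parallel.For(0, 100_000, _ =>
{
    // Race condition: the read-modify-write sequence isn't atomic,
    // so concurrent increments can be lost
    unsafeCounter++;

    // Critical section: only one thread at a time executes this block
    lock (counterLock) { lockedCounter++; }

    // Atomic increment without an explicit lock
    Interlocked.Increment(ref interlockedCounter);
});

Console.WriteLine($"Unsynchronized: {unsafeCounter}");
Console.WriteLine($"With lock:      {lockedCounter}");
Console.WriteLine($"Interlocked:    {interlockedCounter}");
```

The last two counters always end at 100000. The unsynchronized counter may end at a lower value on a multi-core machine, and the exact number can vary from run to run.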

I'll examine some of the synchronization primitives available in .NET Core in this section.

A lock statement is a synchronization primitive that enables one thread to enter a critical section of code at a given point in time while all other threads that try to access the shared resource are blocked until the lock on the shared resource is released.

private static readonly object objLock = new object();

lock (objLock)
{
    //This is the critical section
}

Note that objLock is declared as readonly. A readonly field can be assigned only in its declaration or inside a constructor of the class to which it belongs, so the lock object can't be replaced while threads are synchronizing on it.

You can also implement thread safety using another synchronization primitive, the Monitor class, that provides several methods to implement thread synchronization such as Enter, Exit, etc., as shown in the code snippet below:

Monitor.Enter(objLock);
try
{
    // This is the critical section
}
finally
{
    Monitor.Exit(objLock);
}

You can also use the Monitor.TryEnter method, as shown in the code snippet below:

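bool lockTaken = false;
try
{
    // Wait up to 10 milliseconds to acquire the lock
    Monitor.TryEnter(objLock, 10, ref lockTaken);
    if (lockTaken)
    {
        //This is the critical section
    }
}
finally
{
    // Release the lock only if it was actually acquired; calling
    // Monitor.Exit without owning the lock throws an exception
    if (lockTaken)
        Monitor.Exit(objLock);
}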

The Semaphore class can be used as a synchronization primitive to allow multiple threads to access shared resources while restricting the number of threads allowed to execute the critical section concurrently. A semaphore maintains a count that's decremented each time a thread enters the semaphore and incremented when a thread releases it. When the count reaches zero, subsequent requests block until other threads release the semaphore. In the snippet below, the semaphore is created with an initial count of three and a maximum count of five.

private static readonly Semaphore semaphoreObj = new Semaphore(3, 5);

semaphoreObj.WaitOne();
try
{
    // This is the critical section
}
finally
{
    semaphoreObj.Release();
}

A mutex is a synchronization primitive used to lock a shared resource exclusively, i.e., it allows you to acquire a mutually exclusive lock so that only one thread can access the shared resource at a given point in time. You can take advantage of the Mutex class in C# to create mutexes. You can call the constructor of the Mutex class with the name of the mutex as a parameter.

The following code snippet illustrates how you can work with Mutex in C#:

Mutex mutexObj = new Mutex(false, "MyExampleMutex");
if (mutexObj.WaitOne())
{
    try
    {
        //This is the critical section
    }
    finally
    {
        mutexObj.ReleaseMutex();
    }
}

The ReaderWriterLockSlim class is another synchronization primitive that can be used whenever many threads read from and one thread writes to a shared resource. It achieves this by acquiring a read lock for reading operations and an exclusive write lock for writing operations. It's an optimized version of the ReaderWriterLock class designed to manage access to a resource that's frequently read and occasionally written.

The following piece of code shows how you can work with the ReaderWriterLockSlim class:

private static readonly ReaderWriterLockSlim readerWriterLock =
    new ReaderWriterLockSlim();

readerWriterLock.EnterReadLock();
try
{
    //Read data
}
finally
{
    readerWriterLock.ExitReadLock();
}

readerWriterLock.EnterWriteLock();
try
{
    //Write data
}
finally
{
    readerWriterLock.ExitWriteLock();
}

The complete source code is given in Listing 8.

Listing 8: The ThreadSynchronizationExample class

public class ThreadSynchronizationExample
{
    private static readonly object objLock = new object();

    private static readonly Semaphore semaphoreObj = new Semaphore(3, 5);

    private static readonly ReaderWriterLockSlim readerWriterLock = 
      new ReaderWriterLockSlim();

    public void AccessSharedDataUsingLock()
    {
        lock (objLock)
        {
            //This is the critical section
        }
    }

    public void AccessSharedDataUsingMonitor()
    {
        Monitor.Enter(objLock);
        try
        {
            //This is the critical section
        }
        finally
        {
            Monitor.Exit(objLock);
        }
    }

    public void AccessSharedDataUsingSemaphore()
    {
        semaphoreObj.WaitOne();
        try
        {
            //This is the critical section
        }
        finally
        {
            semaphoreObj.Release();
        }
    }

    public void AccessSharedDataUsingMutex()
    {
        Mutex mutexObj = new Mutex(false, "MyExampleMutex");
        if (mutexObj.WaitOne())
        {
            try
            {
                //This is the critical section
            }
            finally
            {
                mutexObj.ReleaseMutex();
            }
        }
    }

    public void AccessSharedDataUsingReaderWriterLock()
    {
        readerWriterLock.EnterReadLock();
        try
        {
            //Read data
        }
        finally
        {
            readerWriterLock.ExitReadLock();
        }

        readerWriterLock.EnterWriteLock();
        try
        {
            //Write data
        }
        finally
        {
            readerWriterLock.ExitWriteLock();
        }
    }
}
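
The critical sections in Listing 8 are empty placeholders. As a hedged illustration of what one might contain, the sketch below uses ReaderWriterLockSlim to protect a dictionary that's read frequently and written occasionally. The settings dictionary, the Read and Write helpers, and the "timeout" key are hypothetical and used only for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

var settings = new Dictionary<string, int>();   // shared state (hypothetical)
using var rwLock = new ReaderWriterLockSlim();

void Write(string key, int value)
{
    rwLock.EnterWriteLock();        // exclusive: blocks readers and other writers
    try { settings[key] = value; }
    finally { rwLock.ExitWriteLock(); }
}

int? Read(string key)
{
    rwLock.EnterReadLock();         // shared: many readers may hold it at once
    try { return settings.TryGetValue(key, out int value) ? value : null; }
    finally { rwLock.ExitReadLock(); }
}

Write("timeout", 30);

// Four readers run concurrently while a single writer updates the value
Task writer = Task.Run(() =>
{
    for (int i = 0; i < 100; i++) Write("timeout", 30 + i);
});
Task[] readers = Enumerable.Range(0, 4)
    .Select(_ => Task.Run(() =>
    {
        for (int i = 0; i < 1_000; i++) Read("timeout");
    }))
    .ToArray();

await Task.WhenAll(readers.Append(writer));

int? finalValue = Read("timeout");
Console.WriteLine($"Final value: {finalValue}");   // prints "Final value: 129"
```

Dictionary<TKey, TValue> isn't safe for concurrent reads and writes, which is why every access goes through the lock. For simple key-value scenarios, ConcurrentDictionary is often a more convenient alternative.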

Points to Ponder

Asynchrony, parallelism, and concurrency, if used judiciously, can boost an application's performance and scalability considerably. Be careful to write your code in such a way that deadlocks and race conditions are avoided. Of course, none of them is a good choice in every scenario. As a rule of thumb, the right choice depends on whether your program is CPU-intensive or I/O-intensive: use asynchrony for I/O-bound work and parallelism for CPU-bound work.