Linux threads and CPU affinity

Linux provides native threading. This allows a single process to run multiple threads of execution that all access the same shared data, without the overhead of interprocess communication (IPC) calls. Essentially, a process calls the system pthread library to create threads, which can be thought of as “lightweight processes” (this is the Sun System V term for them, and ps labels its thread ID column LWP).

We first show the C code we will use to test how Linux assigns threads to CPUs for execution.

#include <stdio.h>
#include <string.h>
#include <pthread.h>

pthread_t thisThread[4];

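/* Thread body: spin in an empty loop so the thread stays CPU-bound.
   Build without optimization, or the compiler may remove the loop. */
void* runNativeThread(void *arg) {
  (void)arg;  /* unused */
  unsigned long i = 0;
  for(i=0; i<1000000000000L;i++);
  return NULL;
}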

int main(void) {
  int i = 0;
  int returnCode;

  while(i < 4) {
    returnCode = pthread_create(&(thisThread[i]), NULL, &runNativeThread, NULL);
    if (returnCode != 0) {
      printf("\ncan't create thread :[%s]", strerror(returnCode));
    }
    i++;
  }

  for (i = 0; i < 4; i++) {
    pthread_join(thisThread[i], NULL);
  }
  return 0;
}

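We then compile our program, which must be linked against the pthread library. On newer toolchains, gcc -pthread -o threads threads.c is the safer form, because a library named before the source file may be dropped by the linker.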

qa04:oracle:qa4:/u01/orahome/>gcc -lpthread -o threads threads.c

We then run our program.

qa04:oracle:qa4:/u01/orahome/>./threads &
[1] 16830

Next, we use the taskset command to show the processor affinity of the main thread and of each thread it created. The ps -L option prints one line per thread, with the thread ID in the second (LWP) column.

qa04:oracle:qa4:/u01/orahome/>for p in $(ps -L -p 16830 | grep -v LWP | awk '{print $2}'); do taskset -p $p; done
pid 16830's current affinity mask: ff
pid 16831's current affinity mask: ff
pid 16832's current affinity mask: ff
pid 16833's current affinity mask: ff
pid 16834's current affinity mask: ff
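
The same mask can be read from inside a program. The following is a minimal sketch, assuming glibc on Linux; the helper names get_affinity_mask and print_affinity are hypothetical and not part of the original test. It calls sched_getaffinity() and folds the result into a bitmask printed in the same shape as the output above.

```c
#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>
#include <sys/types.h>

/* Read the affinity of thread `tid` (0 means the calling thread) and fold it
 * into a bitmask with bit N set when CPU N is allowed. CPUs numbered beyond
 * the width of unsigned long are ignored in this sketch.
 * Returns 0 on success, -1 on failure (errno set by sched_getaffinity). */
int get_affinity_mask(pid_t tid, unsigned long *mask_out) {
    cpu_set_t set;
    unsigned long mask = 0;
    int bits = (int)(sizeof(mask) * 8);

    CPU_ZERO(&set);
    if (sched_getaffinity(tid, sizeof(set), &set) != 0) return -1;
    for (int cpu = 0; cpu < bits && cpu < CPU_SETSIZE; cpu++) {
        if (CPU_ISSET(cpu, &set)) mask |= 1UL << cpu;
    }
    *mask_out = mask;
    return 0;
}

/* Print the mask of `tid` as a hexadecimal value. */
int print_affinity(pid_t tid) {
    unsigned long mask;
    if (get_affinity_mask(tid, &mask) != 0) {
        perror("sched_getaffinity");
        return -1;
    }
    printf("pid %d's current affinity mask: %lx\n", (int)tid, mask);
    return 0;
}
```

Calling print_affinity(0) reports the mask of the calling thread; passing one of the thread IDs listed by ps -L reports that thread instead.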

The hexadecimal value 0xff is a bitmask with one bit per processor; a set bit means the process (or any given thread in the process) may run on that processor. Our test server has eight cores, so 0xff means all of them can be used.

In other words:

• 00000011 = decimal 3, or 0x3 in hexadecimal. This would indicate two processors (CPUs 0 and 1) could be used
• 00001111 = decimal 15, or 0xF in hexadecimal. This would indicate four processors (CPUs 0 through 3) could be used
• 11111111 = decimal 255, or 0xFF in hexadecimal. This would indicate eight processors (CPUs 0 through 7) could be used
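
The mapping from mask to processors in the list above can be sketched in a few lines of C. The function names count_cpus and list_cpus are made up for this illustration; the code only does bit arithmetic and does not query the system.

```c
#include <stdio.h>
#include <string.h>

/* Count the set bits in an affinity mask: one bit per usable CPU. */
int count_cpus(unsigned long mask) {
    int count = 0;
    while (mask != 0) {
        count += (int)(mask & 1UL);
        mask >>= 1;
    }
    return count;
}

/* Write the CPU numbers selected by `mask` into buf as "0,1,2,...".
 * If buf is too small the list is cut short. */
void list_cpus(unsigned long mask, char *buf, size_t len) {
    size_t used = 0;

    if (len == 0) return;
    buf[0] = '\0';
    for (int cpu = 0; mask != 0; cpu++, mask >>= 1) {
        if (mask & 1UL) {
            int n = snprintf(buf + used, len - used, "%s%d",
                             used ? "," : "", cpu);
            if (n < 0 || (size_t)n >= len - used) break;  /* buffer full */
            used += (size_t)n;
        }
    }
}
```

For example, count_cpus(0xF) counts four processors, and list_cpus(0xF, buf, sizeof(buf)) fills buf with the CPU numbers 0 through 3.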

We then check per-CPU utilization with sar while the program runs.

qa04:oracle:qa4:/u01/orahome/>sar -P ALL 5 1 | grep ^Average
Average:        CPU     %user     %nice   %system   %iowait    %steal     %idle
Average:        all     51.75      0.00      1.81      0.05      0.00     46.39
Average:          0     64.71      0.00      2.64      0.00      0.00     32.66
Average:          1     99.60      0.00      0.40      0.00      0.00      0.00
Average:          2     82.20      0.00      0.60      0.00      0.00     17.20
Average:          3     10.48      0.00      1.61      0.40      0.00     87.50
Average:          4      7.29      0.00      3.64      0.00      0.00     89.07
Average:          5     22.42      0.00      2.02      0.00      0.00     75.56
Average:          6     25.81      0.00      4.03      0.00      0.00     70.16
Average:          7    100.00      0.00      0.00      0.00      0.00      0.00

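Notice that almost all CPUs are being used, even though only four threads are running: with a mask of ff, the scheduler is free to move each thread among all eight processors. We then run our command again to determine CPU distribution, and we see that the distribution has changed, although the load is still spread across the machine.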

qa04:oracle:qa4:/u01/orahome/>sar -P ALL 5 1 | grep ^Average
Average:        CPU     %user     %nice   %system   %iowait    %steal     %idle
Average:        all     50.98      0.00      0.53      0.03      0.00     48.46
Average:          0     29.80      0.00      2.04      0.20      0.00     67.96
Average:          1     44.56      0.00      0.81      0.00      0.00     54.64
Average:          2     88.78      0.00      0.40      0.00      0.00     10.82
Average:          3     55.73      0.00      0.00      0.00      0.00     44.27
Average:          4      7.83      0.00      0.60      0.20      0.00     91.37
Average:          5     86.20      0.00      0.20      0.00      0.00     13.60
Average:          6     15.49      0.00      0.60      0.20      0.00     83.70
Average:          7     78.24      0.00      0.20      0.00      0.00     21.56
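
The "all" row sits near 50% in both samples, which is what four CPU-bound threads on eight processors should produce. As a small worked check (the function name expected_all_cpu_percent is hypothetical, and the estimate assumes nothing else is competing for the CPUs):

```c
/* Expected "all" utilization, in percent, when `busy_threads` CPU-bound
 * threads run on a machine with `cpus` processors. */
double expected_all_cpu_percent(int busy_threads, int cpus) {
    if (cpus <= 0) return 0.0;
    if (busy_threads > cpus) busy_threads = cpus;  /* cannot exceed 100% */
    return 100.0 * busy_threads / cpus;
}
```

With four threads and eight CPUs this gives 50, close to the 51.75 and 50.98 reported by sar; the small excess is consistent with other activity on the server.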

Next, we change the CPU affinity of each thread to the mask f (CPUs 0 through 3) using the taskset command. For the sake of completeness, we run taskset under strace to show the underlying sched_setaffinity() system call.

qa04:oracle:qa4:/u01/orahome/>for p in $(ps -L -p 16830 | grep -v LWP | awk '{print $2}'); do strace -f -etrace=sched_setaffinity taskset -p f $p 2>&1; done
pid 16830's current affinity mask: ff
sched_setaffinity(16830, 256,  { f, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 }) = 0
pid 16830's new affinity mask: f
pid 16831's current affinity mask: ff
sched_setaffinity(16831, 256,  { f, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 }) = 0
pid 16831's new affinity mask: f
pid 16832's current affinity mask: ff
sched_setaffinity(16832, 256,  { f, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 }) = 0
pid 16832's new affinity mask: f
pid 16833's current affinity mask: ff
sched_setaffinity(16833, 256,  { f, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 }) = 0
pid 16833's new affinity mask: f
pid 16834's current affinity mask: ff
sched_setaffinity(16834, 256,  { f, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 }) = 0
pid 16834's new affinity mask: f
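
The system call traced above can also be made directly from C. This is a minimal sketch, assuming glibc's sched_setaffinity() wrapper and the CPU_SET macros; set_affinity_mask is a hypothetical helper name, and it handles only as many CPUs as fit in an unsigned long.

```c
#define _GNU_SOURCE
#include <sched.h>
#include <sys/types.h>

/* Restrict thread `tid` (0 means the calling thread) to the CPUs whose bits
 * are set in `mask`. Returns 0 on success, -1 on failure with errno set. */
int set_affinity_mask(pid_t tid, unsigned long mask) {
    cpu_set_t set;
    int bits = (int)(sizeof(mask) * 8);

    CPU_ZERO(&set);
    for (int cpu = 0; cpu < bits && cpu < CPU_SETSIZE; cpu++) {
        if (mask & (1UL << cpu)) CPU_SET(cpu, &set);
    }
    return sched_setaffinity(tid, sizeof(set), &set);
}
```

A call such as set_affinity_mask(16831, 0xf) would request the same restriction as taskset -p f 16831. Like taskset -p, it affects only the one thread ID it is given.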

Lastly, we show our threads are now tied to the first four processors.

qa04:oracle:qa4:/u01/orahome/>sar -P ALL 5 1 | grep ^Average
Average:        CPU     %user     %nice   %system   %iowait    %steal     %idle
Average:        all     50.80      0.00      0.93      0.13      0.00     48.14
Average:          0     99.20      0.00      0.80      0.00      0.00      0.00
Average:          1     99.40      0.00      0.60      0.00      0.00      0.00
Average:          2     99.60      0.00      0.40      0.00      0.00      0.00
Average:          3    100.00      0.00      0.00      0.00      0.00      0.00
Average:          4      1.01      0.00      0.61      0.00      0.00     98.38
Average:          5      1.20      0.00      1.41      0.00      0.00     97.39
Average:          6      1.62      0.00      2.43      0.00      0.00     95.94
Average:          7      2.63      0.00      1.01      0.81      0.00     95.55
qa04:oracle:qa4:/u01/orahome/>
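
A program can also pin its own threads instead of relying on taskset after the fact. The sketch below is an alternative to the approach used in this test, not the method shown above; it assumes glibc's non-portable pthread_attr_setaffinity_np() and pthread_getaffinity_np() extensions, and run_pinned_thread is a hypothetical name.

```c
#define _GNU_SOURCE
#include <pthread.h>
#include <sched.h>

/* Thread body: record the affinity set this thread actually runs with. */
static void *report_affinity(void *arg) {
    cpu_set_t *seen = arg;
    CPU_ZERO(seen);
    pthread_getaffinity_np(pthread_self(), sizeof(*seen), seen);
    return NULL;
}

/* Start one thread restricted to `cpu`, wait for it to finish, and leave the
 * affinity set it observed in *seen. Returns 0 or an errno-style code. */
int run_pinned_thread(int cpu, cpu_set_t *seen) {
    pthread_attr_t attr;
    pthread_t tid;
    cpu_set_t wanted;
    int rc;

    CPU_ZERO(&wanted);
    CPU_SET(cpu, &wanted);

    rc = pthread_attr_init(&attr);
    if (rc != 0) return rc;
    /* The affinity is applied when the thread is created. */
    rc = pthread_attr_setaffinity_np(&attr, sizeof(wanted), &wanted);
    if (rc == 0) rc = pthread_create(&tid, &attr, report_affinity, seen);
    pthread_attr_destroy(&attr);
    if (rc != 0) return rc;
    return pthread_join(tid, NULL);
}
```

Setting the affinity through the thread attributes means the thread never runs on a CPU outside the requested set, whereas taskset changes the mask of a thread that is already running.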

Our next test is a Java program that again creates four threads.

public class mythread implements Runnable {
  public static void main (String args[]) {
    // Start four CPU-bound threads.
    for (int i = 1; i <= 4; i++) {
      new Thread(new mythread()).start();
    }
  }
  // Thread body: spin in an empty loop to keep one CPU busy.
  public void run() {
    for (long j = 1; j <= 1000000000000L; j++) {
    }
  }
}

We then compile our class and run it.

qa04:oracle:qa4:/u01/orahome>javac mythread.java
qa04:oracle:qa4:/u01/orahome>java mythread &
[1] 16483

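We see that each thread has a CPU affinity mask of ff (decimal 255). Note that ps lists 22 threads rather than five, because the JVM starts its own service threads (garbage collection, JIT compilation, and so on) alongside the four we created.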

qa04:oracle:qa4:/u01/orahome>for p in $(ps -L -p 16483 | grep -v LWP | awk '{print $2}'); do taskset -p $p; done
pid 16483's current affinity mask: ff
pid 16484's current affinity mask: ff
pid 16485's current affinity mask: ff
pid 16486's current affinity mask: ff
pid 16487's current affinity mask: ff
pid 16488's current affinity mask: ff
pid 16489's current affinity mask: ff
pid 16490's current affinity mask: ff
pid 16491's current affinity mask: ff
pid 16492's current affinity mask: ff
pid 16493's current affinity mask: ff
pid 16494's current affinity mask: ff
pid 16495's current affinity mask: ff
pid 16496's current affinity mask: ff
pid 16497's current affinity mask: ff
pid 16498's current affinity mask: ff
pid 16499's current affinity mask: ff
pid 16500's current affinity mask: ff
pid 16501's current affinity mask: ff
pid 16502's current affinity mask: ff
pid 16503's current affinity mask: ff
pid 16504's current affinity mask: ff
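
The thread IDs that ps -L prints are also visible under /proc/<pid>/task, which holds one subdirectory per thread. A minimal sketch of reading them in C follows; list_thread_ids is a hypothetical helper, and passing 0 for the PID is this sketch's own convention for "the calling process".

```c
#include <dirent.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>

/* Store up to `max` thread IDs of process `pid` in tids[] and return the
 * total number of threads found, or -1 if the task directory cannot be
 * opened. A pid of 0 means the calling process. */
int list_thread_ids(pid_t pid, pid_t *tids, int max) {
    char path[64];
    struct dirent *entry;
    int found = 0;

    if (pid == 0) {
        snprintf(path, sizeof(path), "/proc/self/task");
    } else {
        snprintf(path, sizeof(path), "/proc/%d/task", (int)pid);
    }
    DIR *dir = opendir(path);
    if (dir == NULL) return -1;

    while ((entry = readdir(dir)) != NULL) {
        char *end;
        long tid = strtol(entry->d_name, &end, 10);
        if (*end != '\0' || tid <= 0) continue;  /* skips "." and ".." */
        if (found < max) tids[found] = (pid_t)tid;
        found++;
    }
    closedir(dir);
    return found;
}
```

Run against the JVM's PID, a helper like this should report the same set of IDs as the ps -L listing, provided no threads start or exit in between.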

We first see, in two consecutive samples, that the load is spread across all eight CPUs.

qa04:oracle:qa4:/u01/orahome>sar -P ALL 5 1 | grep ^Average
Average:        CPU     %user     %nice   %system   %iowait    %steal     %idle
Average:        all     50.62      0.00      0.75      0.05      0.00     48.58
Average:          0     61.69      0.00      0.20      0.00      0.00     38.10
Average:          1     68.47      0.00      0.20      0.00      0.00     31.33
Average:          2     12.73      0.00      1.21      0.20      0.00     85.86
Average:          3     54.22      0.00      1.41      0.00      0.00     44.38
Average:          4     37.68      0.00      2.40      0.00      0.00     59.92
Average:          5    100.00      0.00      0.00      0.00      0.00      0.00
Average:          6     33.80      0.00      0.20      0.00      0.00     66.00
Average:          7     35.94      0.00      0.40      0.00      0.00     63.65
qa04:oracle:qa4:/u01/orahome>sar -P ALL 5 1 | grep ^Average
Average:        CPU     %user     %nice   %system   %iowait    %steal     %idle
Average:        all     52.42      0.00      2.31      0.08      0.00     45.19
Average:          0     45.55      0.00      2.23      0.00      0.00     52.23
Average:          1     64.33      0.00      5.21      0.20      0.00     30.26
Average:          2     45.25      0.00      1.82      0.40      0.00     52.53
Average:          3     46.79      0.00      1.20      0.20      0.00     51.81
Average:          4     45.38      0.00      4.82      0.00      0.00     49.80
Average:          5     91.80      0.00      0.20      0.00      0.00      8.00
Average:          6     32.33      0.00      2.61      0.00      0.00     65.06
Average:          7     47.70      0.00      0.20      0.00      0.00     52.10

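We then change the CPU affinity of each thread to f (decimal 15), and again show our system call. In this capture the current mask already reads f, which suggests the command had been run once before. The strace line appears first most likely because taskset's output is buffered when it is piped to awk.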

qa04:oracle:qa4:/u01/orahome>for p in $(ps -L -p 16483 | grep -v LWP | awk '{print $2}'); do strace -f -etrace=sched_setaffinity taskset -p f $p 2>&1 | awk '{print $1,$2,$3,$4,$5,$6}'; done
sched_setaffinity(16483, 256, { f, 0, 0,
pid 16483's current affinity mask: f
pid 16483's new affinity mask: f
sched_setaffinity(16484, 256, { f, 0, 0,
pid 16484's current affinity mask: f
pid 16484's new affinity mask: f
sched_setaffinity(16485, 256, { f, 0, 0,
pid 16485's current affinity mask: f
pid 16485's new affinity mask: f
sched_setaffinity(16486, 256, { f, 0, 0,
pid 16486's current affinity mask: f
pid 16486's new affinity mask: f
sched_setaffinity(16487, 256, { f, 0, 0,
pid 16487's current affinity mask: f
pid 16487's new affinity mask: f
sched_setaffinity(16488, 256, { f, 0, 0,
pid 16488's current affinity mask: f
pid 16488's new affinity mask: f
sched_setaffinity(16489, 256, { f, 0, 0,
pid 16489's current affinity mask: f
pid 16489's new affinity mask: f
sched_setaffinity(16490, 256, { f, 0, 0,
pid 16490's current affinity mask: f
pid 16490's new affinity mask: f
sched_setaffinity(16491, 256, { f, 0, 0,
pid 16491's current affinity mask: f
pid 16491's new affinity mask: f
sched_setaffinity(16492, 256, { f, 0, 0,
pid 16492's current affinity mask: f
pid 16492's new affinity mask: f
sched_setaffinity(16493, 256, { f, 0, 0,
pid 16493's current affinity mask: f
pid 16493's new affinity mask: f
sched_setaffinity(16494, 256, { f, 0, 0,
pid 16494's current affinity mask: f
pid 16494's new affinity mask: f
sched_setaffinity(16495, 256, { f, 0, 0,
pid 16495's current affinity mask: f
pid 16495's new affinity mask: f
sched_setaffinity(16496, 256, { f, 0, 0,
pid 16496's current affinity mask: f
pid 16496's new affinity mask: f
sched_setaffinity(16497, 256, { f, 0, 0,
pid 16497's current affinity mask: f
pid 16497's new affinity mask: f
sched_setaffinity(16498, 256, { f, 0, 0,
pid 16498's current affinity mask: f
pid 16498's new affinity mask: f
sched_setaffinity(16499, 256, { f, 0, 0,
pid 16499's current affinity mask: f
pid 16499's new affinity mask: f
sched_setaffinity(16500, 256, { f, 0, 0,
pid 16500's current affinity mask: f
pid 16500's new affinity mask: f
sched_setaffinity(16501, 256, { f, 0, 0,
pid 16501's current affinity mask: f
pid 16501's new affinity mask: f
sched_setaffinity(16502, 256, { f, 0, 0,
pid 16502's current affinity mask: f
pid 16502's new affinity mask: f
sched_setaffinity(16503, 256, { f, 0, 0,
pid 16503's current affinity mask: f
pid 16503's new affinity mask: f
sched_setaffinity(16504, 256, { f, 0, 0,
pid 16504's current affinity mask: f
pid 16504's new affinity mask: f
qa04:oracle:qa4:/u01/orahome>
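
The shell loop above applies the mask one thread ID at a time. The same idea can be sketched in C by walking /proc/<pid>/task and calling sched_setaffinity() on each entry. The name set_affinity_all_threads is hypothetical, and treating a PID of 0 as the calling process is an assumption of this sketch.

```c
#define _GNU_SOURCE
#include <dirent.h>
#include <sched.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>

/* Apply `set` to every thread of process `pid` (0 means the calling
 * process). Returns the number of threads updated, or -1 if the task
 * directory cannot be opened. Threads that exit during the walk, or that
 * refuse the new mask, are simply not counted. */
int set_affinity_all_threads(pid_t pid, const cpu_set_t *set) {
    char path[64];
    struct dirent *entry;
    int updated = 0;

    if (pid == 0) {
        snprintf(path, sizeof(path), "/proc/self/task");
    } else {
        snprintf(path, sizeof(path), "/proc/%d/task", (int)pid);
    }
    DIR *dir = opendir(path);
    if (dir == NULL) return -1;

    while ((entry = readdir(dir)) != NULL) {
        char *end;
        long tid = strtol(entry->d_name, &end, 10);
        if (*end != '\0' || tid <= 0) continue;  /* skips "." and ".." */
        if (sched_setaffinity((pid_t)tid, sizeof(*set), set) == 0) updated++;
    }
    closedir(dir);
    return updated;
}
```

Threads created after the walk inherit the mask of the thread that creates them, so a long-running process may still need a second pass if it was spawning threads during the first one.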

After doing this, we see that our threads use only the first four CPUs.

qa04:oracle:qa4:/u01/orahome>sar -P ALL 5 1 | grep ^Average
Average:        CPU     %user     %nice   %system   %iowait    %steal     %idle
Average:        all     50.46      0.00      0.93      0.03      0.00     48.58
Average:          0    100.00      0.00      0.00      0.00      0.00      0.00
Average:          1     98.00      0.00      2.00      0.00      0.00      0.00
Average:          2     99.60      0.00      0.40      0.00      0.00      0.00
Average:          3     98.80      0.00      1.20      0.00      0.00      0.00
Average:          4      1.42      0.00      0.61      0.00      0.00     97.98
Average:          5      1.62      0.00      0.81      0.20      0.00     97.36
Average:          6      1.01      0.00      1.21      0.00      0.00     97.79
Average:          7      1.21      0.00      0.81      0.00      0.00     97.98
qa04:oracle:qa4:/u01/orahome>
