Does NodeJS have a socket per connection, even though it is single threaded?

Yes.

We start our Express server listening on port 8080 and confirm that there are no connections yet…

[esb@cmhlcarchapp01 fuse]$ netstat -anp | grep 8080
tcp        0      0 0.0.0.0:8080                0.0.0.0:*                   LISTEN      -
[esb@cmhlcarchapp01 fuse]$

…and then, from our client, make ten requests with a Python script…

>>> import urllib2
>>> url = 'http://cmhlcarchapp01:8080/xml/1'
>>> for i in range(10):
...   a = urllib2.urlopen(url).read()
>>>

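…and then show that ten separate connections were made, each from its own client port. The requests have already completed, so the connections appear in the TIME_WAIT state…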

[esb@cmhlcarchapp01 fuse]$ netstat -anp | grep 8080
tcp        0      0 0.0.0.0:8080                0.0.0.0:*                   LISTEN      -
tcp        0      0 172.27.2.98:8080            172.26.248.151:49976        TIME_WAIT   -
tcp        0      0 172.27.2.98:8080            172.26.248.151:49991        TIME_WAIT   -
tcp        0      0 172.27.2.98:8080            172.26.248.151:49956        TIME_WAIT   -
tcp        0      0 172.27.2.98:8080            172.26.248.151:49992        TIME_WAIT   -
tcp        0      0 172.27.2.98:8080            172.26.248.151:49983        TIME_WAIT   -
tcp        0      0 172.27.2.98:8080            172.26.248.151:49978        TIME_WAIT   -
tcp        0      0 172.27.2.98:8080            172.26.248.151:49997        TIME_WAIT   -
tcp        0      0 172.27.2.98:8080            172.26.248.151:49999        TIME_WAIT   -
tcp        0      0 172.27.2.98:8080            172.26.248.151:49996        TIME_WAIT   -
tcp        0      0 172.27.2.98:8080            172.26.248.151:49981        TIME_WAIT   -
[esb@cmhlcarchapp01 fuse]$

NodeJS is single threaded, but it has no way to take traffic from more than one client over a single socket: the operating system hands the server a separate socket for every connection it accepts.
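The same point can be reproduced outside NodeJS. The following is a minimal Python 3 sketch, not taken from the tests above: the function name accept_clients and the loopback setup are assumptions made for illustration. One thread accepts several connections, and each accept() call returns a new socket with its own file descriptor, separate from the listening socket.

```python
import socket

def accept_clients(n):
    """Accept n loopback connections on one thread.

    Returns (listener_fd, [fd of each accepted connection]).
    """
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))   # port 0: let the OS pick a free port
    listener.listen(n)
    port = listener.getsockname()[1]

    clients = []
    accepted = []
    try:
        for _ in range(n):
            # Client side of the connection (kept open so descriptors are not reused).
            clients.append(socket.create_connection(("127.0.0.1", port)))
            # Server side: accept() returns a brand-new socket for this client.
            conn, _addr = listener.accept()
            accepted.append(conn)
        listener_fd = listener.fileno()
        conn_fds = [conn.fileno() for conn in accepted]
    finally:
        for s in clients + accepted + [listener]:
            s.close()
    return listener_fd, conn_fds

listener_fd, conn_fds = accept_clients(3)
print("listening fd:", listener_fd)
print("per-connection fds:", conn_fds)
```

The listening descriptor never carries client data; it only produces the per-connection sockets.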

This is not a drawback, just an observation. It is a relevant one: NodeJS may largely remove the old problem of thread pools filling up, but it is still possible for open socket connections to exhaust the slots the OS provides for them.
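One such OS limit is the number of file descriptors a single process may hold open, since every socket consumes one. As a small sketch, assuming a Unix-like system (the resource module is not available on Windows) and using a hypothetical helper name fd_limits, Python 3 can report that limit:

```python
import resource

def fd_limits():
    """Return the (soft, hard) limits on open file descriptors for this process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)

soft, hard = fd_limits()
print("soft limit on open descriptors:", soft)
print("hard limit on open descriptors:", hard)
```

The soft limit is the one a server process actually runs into; it is the value reported by ulimit -n in the shell that started the process.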

We then move on to see whether NodeJS creates multiple threads when a multithreaded Python program requests a page from our NodeJS Express web server…

>>> import thread, urllib2
>>> url = 'http://cmhlcarchapp01:8080/xml/1'
>>> def getRequest():
...   a = urllib2.urlopen(url).read()
...
>>> for i in range(50):
...   thread.start_new_thread(getRequest, ())
...
>>>

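When we run strace against our NodeJS Express web server at the same time, strace attaches to six threads, but we see only a single thread, 1540, doing all the work. Again, notice that each write occurs on an individual socket file descriptor (a descriptor number can be reused once its connection has closed)…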

-bash-4.1$ strace -f -p 1540 -e trace=\!futex -o t.txt
Process 1540 attached with 6 threads
^CProcess 1540 detached
Process 1541 detached
Process 1542 detached
Process 1543 detached
Process 1544 detached
Process 1545 detached
-bash-4.1$ egrep HTTP.*Powered t.txt  | wc -l
50
-bash-4.1$ egrep HTTP.*Powered t.txt | tail -10
1540  write(57, "HTTP/1.1 200 OK\r\nX-Powered-By: E"..., 356) = 356
1540  write(11, "HTTP/1.1 200 OK\r\nX-Powered-By: E"..., 356 
1540  write(12, "HTTP/1.1 200 OK\r\nX-Powered-By: E"..., 356) = 356
1540  write(17, "HTTP/1.1 200 OK\r\nX-Powered-By: E"..., 356) = 356
1540  write(48, "HTTP/1.1 200 OK\r\nX-Powered-By: E"..., 356) = 356
1540  write(54, "HTTP/1.1 200 OK\r\nX-Powered-By: E"..., 356) = 356
1540  write(47, "HTTP/1.1 200 OK\r\nX-Powered-By: E"..., 356 
1540  write(55, "HTTP/1.1 200 OK\r\nX-Powered-By: E"..., 356) = 356
1540  write(52, "HTTP/1.1 200 OK\r\nX-Powered-By: E"..., 356) = 356
1540  write(12, "HTTP/1.1 200 OK\r\nX-Powered-By: E"..., 356) = 356
-bash-4.1$ egrep HTTP.*Powered t.txt | awk '{print $1}' | sort -u
1540
-bash-4.1$

The question is whether another language, such as Java, exhibits the same behavior. To find out, we run a Java socket server and a similar test in Python.

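import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.io.PrintStream;
import java.net.ServerSocket;
import java.net.Socket;

class serverSocket {
  public static void main(String[] args) throws Exception {
    // Optional first argument: milliseconds to sleep per request (simulated work).
    int delayMs = args.length > 0 ? Integer.parseInt(args[0]) : 0;
    ServerSocket ss = new ServerSocket(5000);
    while (true) {
      // accept() blocks until a client connects and returns a new socket for it.
      // try-with-resources closes the streams and the socket after each request.
      try (Socket clientSocket = ss.accept();
           BufferedReader is = new BufferedReader(
               new InputStreamReader(clientSocket.getInputStream()));
           PrintStream os = new PrintStream(clientSocket.getOutputStream())) {
        System.out.println("got socket");
        String line = is.readLine();
        Thread.sleep(delayMs);
        // readLine() returns null when the client closes without sending a line.
        if (line != null) {
          os.println(line.toUpperCase());
        }
      } catch (Exception e) {
        // Report the failure and keep serving the next connection.
        e.printStackTrace();
      }
    }
  }
}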

We then run the following Python program…

>>> import socket, thread
>>> def getRequest():
...   sock = socket.socket()
...   sock.connect(("cmhlcarchapp01", 5000))
...
>>> for i in range(50):
...   thread.start_new_thread(getRequest, ())
...

When we run the command below on our server…

strace -f -o l.txt java serverSocket

…we see only a single thread accepting the connections in Java…

-bash-4.1$ grep 172.26 l.txt | head -10
4848  accept(5, {sa_family=AF_INET6, sin6_port=htons(55320), inet_pton(AF_INET6, "::ffff:172.26.8.170", &sin6_addr), sin6_flowinfo=0, sin6_scope_id=0}, [28]) = 6
4848  accept(5, {sa_family=AF_INET6, sin6_port=htons(55322), inet_pton(AF_INET6, "::ffff:172.26.8.170", &sin6_addr), sin6_flowinfo=0, sin6_scope_id=0}, [28]) = 6
4848  accept(5, {sa_family=AF_INET6, sin6_port=htons(55321), inet_pton(AF_INET6, "::ffff:172.26.8.170", &sin6_addr), sin6_flowinfo=0, sin6_scope_id=0}, [28]) = 6
4848  accept(5, {sa_family=AF_INET6, sin6_port=htons(55326), inet_pton(AF_INET6, "::ffff:172.26.8.170", &sin6_addr), sin6_flowinfo=0, sin6_scope_id=0}, [28]) = 6
4848  accept(5, {sa_family=AF_INET6, sin6_port=htons(55325), inet_pton(AF_INET6, "::ffff:172.26.8.170", &sin6_addr), sin6_flowinfo=0, sin6_scope_id=0}, [28]) = 6
4848  accept(5, {sa_family=AF_INET6, sin6_port=htons(55324), inet_pton(AF_INET6, "::ffff:172.26.8.170", &sin6_addr), sin6_flowinfo=0, sin6_scope_id=0}, [28]) = 6
4848  accept(5, {sa_family=AF_INET6, sin6_port=htons(55328), inet_pton(AF_INET6, "::ffff:172.26.8.170", &sin6_addr), sin6_flowinfo=0, sin6_scope_id=0}, [28]) = 6
4848  accept(5, {sa_family=AF_INET6, sin6_port=htons(55327), inet_pton(AF_INET6, "::ffff:172.26.8.170", &sin6_addr), sin6_flowinfo=0, sin6_scope_id=0}, [28]) = 6
4848  accept(5, {sa_family=AF_INET6, sin6_port=htons(55331), inet_pton(AF_INET6, "::ffff:172.26.8.170", &sin6_addr), sin6_flowinfo=0, sin6_scope_id=0}, [28]) = 6
4848  accept(5, {sa_family=AF_INET6, sin6_port=htons(55332), inet_pton(AF_INET6, "::ffff:172.26.8.170", &sin6_addr), sin6_flowinfo=0, sin6_scope_id=0}, [28]) = 6
-bash-4.1$ grep 172.26 l.txt | awk '{print $1}' | sort -u
4848
-bash-4.1$

So what is the difference? It is the way NodeJS handles the requests from each connection. If we were to add functionality to our serverSocket class, for example a database query, each request would be executed serially by the same thread, 4848 shown above. The 50th client would get no result until the first 49 had been served. In previous paradigms, this meant that we needed to multithread our serverSocket class, which has been the design of middleware servers such as WebLogic for years.
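The serial behavior described above can be sketched in Python 3. This is an illustration under stated assumptions, not the Java test itself: the names serial_server, client and run, the 0.1 second DELAY standing in for a database query, and the use of five clients instead of fifty are all hypothetical choices.

```python
import socket
import threading
import time

DELAY = 0.1  # hypothetical per-request work, in seconds

def serial_server(listener, n):
    """Handle n connections one after another on a single thread."""
    for _ in range(n):
        conn, _addr = listener.accept()
        with conn:
            data = b""
            while not data.endswith(b"\n"):
                chunk = conn.recv(1024)
                if not chunk:
                    break
                data += chunk
            time.sleep(DELAY)            # blocking work: nobody else is served meanwhile
            conn.sendall(data.upper())

def client(port, results, i):
    """Send one line, wait for the reply, and record how long it took."""
    start = time.monotonic()
    with socket.create_connection(("127.0.0.1", port)) as s:
        s.sendall(b"hello\n")
        s.recv(1024)
    results[i] = time.monotonic() - start

def run(n=5):
    listener = socket.socket()
    listener.bind(("127.0.0.1", 0))      # port 0: let the OS pick a free port
    listener.listen(n)
    port = listener.getsockname()[1]
    server = threading.Thread(target=serial_server, args=(listener, n))
    server.start()
    results = [0.0] * n
    clients = [threading.Thread(target=client, args=(port, results, i))
               for i in range(n)]
    for t in clients:
        t.start()
    for t in clients:
        t.join()
    server.join()
    listener.close()
    return sorted(results)

waits = run()
print(["%.2f" % w for w in waits])
```

All clients connect at about the same time, yet the wait grows with each position in line, because the server thread sleeps through every request before it accepts the next one.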

NodeJS changes this model: it keeps a single thread and places requests on a queue to be executed. The work for each request is implemented as a callback function.
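As a rough analogy only, the same idea can be sketched with Python 3's asyncio, which also runs an event loop on one thread. This is not how NodeJS is implemented, and asyncio uses coroutines where NodeJS uses callbacks; the names handle, client and main and the 0.1 second DELAY are assumptions for illustration. The point is that a waiting request does not block the thread, so other requests are served in the meantime.

```python
import asyncio
import threading
import time

DELAY = 0.1  # hypothetical per-request wait (for example a database query), in seconds

async def client(port):
    """Send one line to the server and return the reply."""
    reader, writer = await asyncio.open_connection("127.0.0.1", port)
    writer.write(b"hello\n")
    await writer.drain()
    reply = await reader.readline()
    writer.close()
    await writer.wait_closed()
    return reply

async def main(n=5):
    thread_ids = set()

    async def handle(reader, writer):
        # Record which thread runs this handler.
        thread_ids.add(threading.get_ident())
        line = await reader.readline()
        # Non-blocking wait: the event loop serves other clients meanwhile.
        await asyncio.sleep(DELAY)
        writer.write(line.upper())
        await writer.drain()
        writer.close()
        await writer.wait_closed()

    server = await asyncio.start_server(handle, "127.0.0.1", 0)
    port = server.sockets[0].getsockname()[1]
    start = time.monotonic()
    replies = await asyncio.gather(*(client(port) for _ in range(n)))
    elapsed = time.monotonic() - start
    server.close()
    await server.wait_closed()
    return replies, elapsed, thread_ids

replies, elapsed, thread_ids = asyncio.run(main())
print("threads used by handlers:", len(thread_ids))
print("total time for all clients: %.2f s" % elapsed)
```

With five clients the total time stays close to a single DELAY rather than five of them, even though every handler ran on the same thread.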

More on that in the next post.
