Diagonal Tutorial

1 Introduction
2 Getting started
3 Implementing Bubble sort easier with diag-fix
4 Simple report of elapsed time by diag-rep and diag-medi
5 Easy process pooling by diag-pool

1 Introduction

This document is a tutorial for Diagonal. It gives you a basic idea of what you can do with Diagonal.

2 Getting started

Building and installing Diagonal from its source code is easy on every supported platform. We use Git¹ for revision control, so you can fetch the source code with the following command:

$ git clone git://git.code.sf.net/p/diagonal/code diagonal

Prerequisites for building Diagonal are CMake² and a modern C compiler, such as clang³ or gcc⁴, that conforms to C99⁵. On FreeBSD, Linux, Mac OS X, and NetBSD, a typical installation goes as follows:

$ mkdir build
$ cd build
$ cmake -DCMAKE_INSTALL_PREFIX=$PREFIX ../diagonal
$ make all
# make install

where $PREFIX is the directory in which you would like to install Diagonal. On Windows, you currently need MinGW-w64⁶ as well:

$ mkdir build
$ cd build
$ cmake -DCMAKE_INSTALL_PREFIX=$PREFIX -DCMAKE_TOOLCHAIN_FILE=../diagonal/Toolchain-$HOST.cmake ../diagonal
$ make all
# make install

where $HOST is x86_64-w64-mingw32 if you are going to build it for 64-bit Windows, or i686-w64-mingw32 for 32-bit Windows.

3 Implementing Bubble sort easier with diag-fix

If you have learnt programming, you know that Bubble sort⁷ is too naive to be practical, so you probably do not want to bother thinking about it again. But what if we could implement a bubble sort even more easily with diag-fix? :)

#include <stdio.h>
#include <stdlib.h>

int main(void)
{
  int prev, next;
  int r = fscanf(stdin, "%d", &prev);
  if (r == EOF) return EXIT_SUCCESS;
  if (r == 0) {
      fprintf(stderr, "read error\n");
      return EXIT_FAILURE;
  }
  for (;;) {
      r = fscanf(stdin, "%d", &next);
      if (r == EOF) break;
      if (r == 0) {
          fprintf(stderr, "read error\n");
          return EXIT_FAILURE;
      }
      if (prev < next) {
          printf("%d ", prev);
          prev = next;
      } else {
          printf("%d ", next);
      }
  }
  printf("%d\n", prev);
  return EXIT_SUCCESS;
}

Let's compile the above source code and call the executable bubble. Each run performs a single pass of swapping over the input elements:

$ cat input
2 3 4 10 -2 2 -3 1 5 10 9 8 7 -7
$ ./bubble < input
2 3 4 -2 2 -3 1 5 10 9 8 7 -7 10
$ ./bubble < input | ./bubble
2 3 -2 2 -3 1 4 5 9 8 7 -7 10 10

Then diag-fix takes care of the rest of the sorting:

$ diag fix -i input ./bubble
-7 -3 -2 1 2 2 3 4 5 7 8 9 10 10
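
The idea behind this run can be sketched in a few lines of shell. The sketch assumes that diag-fix feeds the output of the command back into it until the output stops changing, which is what the name and the result above suggest. The function name fixpoint is hypothetical and not part of Diagonal.

```sh
# fixpoint CMD [ARG...]
# Hypothetical helper, not part of Diagonal: read stdin, run CMD on it,
# then run CMD on its own output again and again until nothing changes.
# It loops forever if the output never stabilizes.
fixpoint() {
    prev=$(cat)
    while :; do
        next=$(printf '%s\n' "$prev" | "$@")
        if [ "$next" = "$prev" ]; then
            break
        fi
        prev=$next
    done
    printf '%s\n' "$next"
}
```

With the bubble executable from above, fixpoint ./bubble < input would be expected to end with the sorted line, because a sorted sequence is left unchanged by one more pass.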

4 Simple report of elapsed time by diag-rep and diag-medi

Repeated observation is important in a scientific experiment. Let's see how diag-rep and diag-medi help us produce a reasonable estimate of a process's elapsed time from repeated trials. When you would like to know how long dd(1) takes to copy /dev/zero to /dev/null, time(1) is your friend:

$ /usr/bin/time dd if=/dev/zero of=/dev/null count=1m
1048576+0 records in
1048576+0 records out
536870912 bytes (537 MB) copied, 0.274461 s, 2.0 GB/s
0.08user 0.18system 0:00.27elapsed 98%CPU (0avgtext+0avgdata 804maxresident)k
0inputs+0outputs (0major+257minor)pagefaults 0swaps

From now on, we assume Linux's time(1), because it offers useful options that other time(1) implementations may lack. More specifically, you can combine the options -f and -o with redirection in order to focus on the elapsed wall-clock time:

$ /usr/bin/time -f "%e" -o /dev/stdout dd if=/dev/zero of=/dev/null count=1m 2> /dev/null
0.26

Of course the resulting time (in seconds) can vary every time you run the command. To make your number as reliable as possible, you need to repeat the run several times, and that is what diag-rep is for:

$ diag rep -n 15 /usr/bin/time -f "%e" -o /dev/stdout dd if=/dev/zero of=/dev/null count=1m 2> /dev/null
0.27
0.28
0.26
0.26
0.27
0.27
0.27
0.27
0.27
0.26
0.28
0.26
0.27
0.27
0.27

Here diag-rep drives fifteen trials. In addition, you can pick a representative value out of the candidate numbers with diag-medi:

$ diag rep -n 15 /usr/bin/time -f "%e" -o /dev/stdout dd if=/dev/zero of=/dev/null count=1m 2> /dev/null | diag-medi
0.27
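
A minimal sketch of what such a selection could look like, assuming that diag-medi reports the median of the numbers it reads, one per line. The function name median and the averaging of the two middle values for an even count are assumptions, not documented behaviour of Diagonal.

```sh
# median: hypothetical helper, not part of Diagonal.
# Reads one number per line from stdin and prints the median.
median() {
    LC_ALL=C sort -n | awk '
        { v[NR] = $1 }                    # values arrive in ascending order
        END {
            if (NR == 0) exit 1           # no input, no median
            if (NR % 2) print v[(NR + 1) / 2]
            else print (v[NR / 2] + v[NR / 2 + 1]) / 2
        }'
}
```

Fed with the fifteen values above, the middle (eighth) value in ascending order is 0.27, which agrees with the output of diag-medi.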

5 Easy process pooling by diag-pool

BSD's SO_REUSEPORT, which has been available on Linux since version 3.9⁸, is a neat option for a TCP/UDP server socket if you would like to serve clients with a pool of processes. diag-pool makes it even easier, as you will see below. Let's write a simple program that waits for a TCP client on a port, replies with its process ID, and exits:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>

static void usage(void)
{
  fprintf(stderr, "usage: mypid port\n");
}

int main(int argc, char *argv[])
{
  if (argc < 2) {
      usage();
      return EXIT_FAILURE;
  }
  int port = atoi(argv[1]);
  if (port == 0) {
      usage();
      return EXIT_FAILURE;
  }

  int s = socket(PF_INET, SOCK_STREAM, 0);
  if (s == -1) {
      fprintf(stderr, "failed to create an endpoint\n");
      return EXIT_FAILURE;
  }
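  int val = 1;
  /* SO_REUSEPORT lets several processes bind to the same port */
  if (setsockopt(s, SOL_SOCKET, SO_REUSEPORT, &val, sizeof(val)) != 0) {
      perror(argv[0]);
      return EXIT_FAILURE;
  }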

  struct sockaddr_in sa;
  memset(&sa, 0, sizeof(sa));
  sa.sin_family = AF_INET;
  sa.sin_port = htons(port);
  sa.sin_addr.s_addr = htonl(INADDR_ANY);
  if (bind(s, (struct sockaddr *)&sa, sizeof(sa)) != 0) {
      perror(argv[0]);
      return EXIT_FAILURE;
  }

  if (listen(s, 1) != 0) {
      perror(argv[0]);
      return EXIT_FAILURE;
  }

  socklen_t socklen = sizeof(sa);
  int d = accept(s, (struct sockaddr *)&sa, &socklen);
  if (d < 0) {
      perror(argv[0]);
      return EXIT_FAILURE;
  }
  FILE *fp = fdopen(d, "w");
  if (!fp) {
      perror(argv[0]);
      return EXIT_FAILURE;
  }
  fprintf(fp, "pid %d\n", (int)getpid());
  fclose(fp); /* also closes the underlying descriptor d */
  close(s);
  return EXIT_SUCCESS;
}

Suppose you save the above source code in a file named mypid.c; then

$ gcc -o mypid mypid.c

will give you an executable mypid. OK, here is all you have to do for process pooling:

$ diag pool ./mypid 43210

Please keep it running and switch to another terminal. You will now see five mypid processes, all listening on the same port 43210:

$ pgrep mypid
5657
5656
5655
5654
5653
$ netstat -al | head
Active Internet connections (including servers)
Proto Recv-Q Send-Q Local Address          Foreign Address        (state)
tcp4       0      0 *.43210                *.*                    LISTEN
tcp4       0      0 *.43210                *.*                    LISTEN
tcp4       0      0 *.43210                *.*                    LISTEN
tcp4       0      0 *.43210                *.*                    LISTEN
tcp4       0      0 *.43210                *.*                    LISTEN
...

In order to access the port you can use nc(1) as follows:

$ nc localhost 43210
pid 5653

Note that retrying the access many times is a typical use case of diag-rep:

$ diag rep -n 10 nc localhost 43210
pid 5654
pid 5655
pid 5656
pid 5657
pid 5741
pid 5746
pid 5748
pid 5750
pid 5752
pid 5754
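
The process IDs in the last run start with the remaining original processes and continue with new ones, which suggests that a fresh mypid is started whenever one exits. A rough shell sketch of that idea follows, assuming this restart behaviour. The functions worker and pool are hypothetical, as is the ROUNDS limit, which is added only so that the demonstration terminates. None of this says how diag-pool is actually implemented.

```sh
# worker ROUNDS CMD [ARG...]
# Hypothetical helper: run CMD again each time it exits, ROUNDS times in total.
worker() {
    rounds=$1
    shift
    r=0
    while [ "$r" -lt "$rounds" ]; do
        "$@"
        r=$((r + 1))
    done
}

# pool N ROUNDS CMD [ARG...]
# Hypothetical helper: keep N copies of CMD running side by side,
# each slot restarting its copy as described for worker.
pool() {
    n=$1
    rounds=$2
    shift 2
    i=0
    while [ "$i" -lt "$n" ]; do
        worker "$rounds" "$@" &
        i=$((i + 1))
    done
    wait    # block until every slot has finished its rounds
}
```

For example, pool 5 2 ./mypid 43210 would keep five slots busy and let each slot serve two clients before returning.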

Footnotes:

¹ http://git-scm.com/

² http://www.cmake.org/

³ http://clang.llvm.org/

⁴ http://gcc.gnu.org/

⁵ http://en.wikipedia.org/wiki/C99

⁶ http://mingw-w64.sourceforge.net/

⁷ http://en.wikipedia.org/wiki/Bubble_sort

⁸ The SO_REUSEPORT socket option

Author: Takeshi Abe <tabe@fixedpoint.jp>
