Know Your Linux: System call interface

System Call

A system call is an interface between a user-space application and a service that the kernel provides. Because the service is provided in the kernel, a direct function call cannot be performed; instead, the application must cross the user-space/kernel boundary through a dedicated mechanism. How this is done depends on the particular architecture.

Let’s see how to add a system call to the Linux 2.6 kernel and how to use it from user space on the i386 architecture.

Implementation of system call

The implementation of system calls in Linux varies based on the architecture, but it can also differ within a given architecture. For example, older x86 processors used an interrupt mechanism to migrate from user-space to kernel-space, but new IA-32 processors provide instructions that optimize this transition (using sysenter and sysexit instructions).

Each system call is multiplexed into the kernel through a single entry point. The eax register is used to identify the particular system call that should be invoked, which is specified in the C library (per the call from the user-space application). When the C library has loaded the system call index and any arguments, a software interrupt is invoked (interrupt 0x80), which results in execution (through the interrupt handler) of the system_call function. This function handles all system calls, as identified by the contents of eax. After a few simple tests, the actual system call is invoked using sys_call_table and the index contained in eax. Upon return from the system call, syscall_exit is eventually reached, and a call to resume_userspace transitions back to user space. Execution resumes in the C library, which then returns to the user application.

Figure: The simplified flow of a system call (for example, getpid) using the interrupt method.

Figure: The system call table and various linkages.

The system call table uses the index provided in eax to identify which system call to invoke from the table (sys_call_table). A sample of the contents of this table and the locations of these entities is also shown.

Adding a Linux System Call

Three basic steps to add a new system call to the kernel:

Add the new function.
Update the header files.
Update the system call table for the new function.

You can create a new file for your system calls. Let’s see a simple example.

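Step 1 is to add the kernel functions for the new system calls:

#include <linux/linkage.h>   /* asmlinkage */
#include <linux/jiffies.h>   /* get_jiffies_64() */
#include <asm/uaccess.h>     /* put_user(), used by sys_pdiffjiffies */

/* Return the current jiffies value. */
asmlinkage long sys_getjiffies( void )
{
  return (long)get_jiffies_64();
}

/* Return the difference between the current jiffies and ujiffies. */
asmlinkage long sys_diffjiffies( long ujiffies )
{
  return (long)get_jiffies_64() - ujiffies;
}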

The above two functions are used for monitoring jiffies. The first returns the current jiffies value, while the second returns the difference between the current value and the value that the caller passes in. Note the use of the asmlinkage modifier. This macro (defined in linux/include/asm-i386/linkage.h) tells the compiler to pass all function arguments on the stack.

The final kernel function for the system call example:

asmlinkage long sys_pdiffjiffies( long ujiffies,
                                  long __user *presult )
{
  long cur_jiffies = (long)get_jiffies_64();
  long result;
  int  err = 0;

  if (presult) {
    result = cur_jiffies - ujiffies;
    /* Copy the result out to the caller's user-space buffer. */
    err = put_user( result, presult );
  }

  return err ? -EFAULT : 0;
}

This function takes two arguments: a long and a pointer to a long that's marked __user. The __user annotation marks the pointer as referring to user-space memory, so it must not be dereferenced directly in the kernel; static checkers such as sparse use it to flag such accesses. The function calculates the difference between the current jiffies value and the one passed in, and then provides the result to the user through the user-space pointer. The put_user function places the result value into user space at the location that presult specifies. It returns 0 on success and -EFAULT if the address is invalid, in which case sys_pdiffjiffies returns -EFAULT to notify the user-space caller.

Step 2 is to update the header files to make room for the new functions in the system call table:

For this, update the header file linux/include/asm/unistd.h with the new system call numbers.

#define __NR_getcpu             318
#define __NR_epoll_pwait        319
#define __NR_getjiffies         320
#define __NR_diffjiffies        321
#define __NR_pdiffjiffies       322

#define NR_syscalls             323

Step 3 is updating the system call table:

Update the file linux/arch/i386/kernel/syscall_table.S with the new functions that will populate the particular indexes:

.long sys_getcpu
.long sys_epoll_pwait
.long sys_getjiffies            /* 320 */
.long sys_diffjiffies
.long sys_pdiffjiffies

Note: The size of this table is defined by the symbolic constant NR_syscalls.

Add your system call file to the kernel Makefile (/usr/src/linux-x.x.x/kernel/Makefile), for example by appending its object file to the obj-y list, so that it is compiled and linked into the kernel.

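Now the kernel needs to be recompiled to include the changes. This is done by executing the commands below in /usr/src/linux-x.x.x:

make menuconfig    (or make xconfig, or make config)

make

Note that make dep, which 2.4 kernels required, is no longer needed with the 2.6 build system.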

Now the new kernel image needs to be installed before the system calls can be used. This can be done by executing make install in /usr/src/linux-x.x.x, or by copying the image manually and editing the boot loader configuration (/etc/grub.conf for GRUB, or /etc/lilo.conf for LILO). After rebooting into the new image, you can write an application to test your system calls.

Q & A on system call

What is the difference between read and fread?

The former is a system call, while the latter is a C standard library function. For many small reads, fread is usually more efficient because it uses buffering. When fread() is called, the library typically issues a read() for more than the requested amount. The extra bytes are held in a buffer, local to the C standard library, and not directly accessible by your program. When your program next calls fread(), the library may be able to satisfy the request using bytes already in its buffer, eliminating the need for another read() system call.

Saturday, October 24, 2009
