I was busy upgrading my operating system..
Today, I looked into the detailed sequence Linux follows when it runs an executable file. Thankfully, there were some very easy and comprehensive materials that explain it thoroughly.
By the way, if you're interested in where I got this information, check out these links:
From what I know, Linux executes a program through the execve() system call. When execve() is called, sys_execve() is called (according to the system call table). sys_execve() calls do_execve(), and finally, do_execve() calls do_execveat_common(). (Obviously,) the important function here is do_execveat_common().
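To see the user-space side of this, here is a minimal sketch of a program that ends up in that kernel path. The helper name run_program() is my own invention for illustration, and the sketch assumes a Linux system where /bin/sh exists.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the kernel): fork a child, replace its
 * image with `path` via execve(), and return the child's exit status.
 * Returns -1 if fork/waitpid fails or the child did not exit normally.
 */
int run_program(const char *path, char *const argv[], char *const envp[])
{
    assert(path != NULL && argv != NULL && envp != NULL);

    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: this call is what enters the kernel's exec path. */
        execve(path, argv, envp);
        /* execve() only returns when it failed. */
        perror("execve");
        _exit(127);
    }

    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Calling run_program("/bin/sh", args, env) with args set to { "sh", "-c", "exit 7", NULL } and env set to { NULL } should hand back the shell's exit status. The 127 returned on a failed execve() is just the shell convention, not something the kernel requires.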
static int do_execveat_common(int fd, struct filename *filename,
                              struct user_arg_ptr argv,
                              struct user_arg_ptr envp,
                              int flags) {
    ...
do_execveat_common() does the following (this is very, very abbreviated!!):
And if we look at the tree view of the function calls..
do_execveat_common
  -> allocate bprm
  -> bprm_execve
    -> ...
    -> exec_binprm
      ...
      -> search_binary_handler
        -> load_binary()
           (...load_elf_binary(), in case of elf file)
          -> START_THREAD ---> program finally executed!
        -> ...
      -> ...
    -> ...
Basically, do_execveat_common() first fills in the basic information needed to load the file, and then (through bprm_execve() and exec_binprm()) search_binary_handler() gets called. search_binary_handler() looks for the handler module that suits the target executable file. When it finds a suitable one, it calls that module's loading function, which loads the program's binary code and (finally) starts the thread using the loaded binary. (lengthy tedious elaboration..)
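The handler search described above can be imitated in user space. This is only a toy sketch, not the kernel code: every name starting with toy_ is made up, and the "loaders" just check the first bytes of a buffer instead of loading anything. The one thing it borrows from the kernel is the convention that a handler answers -ENOEXEC when the file is not in its format, so the search moves on to the next handler.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Toy stand-in for the kernel's struct linux_binfmt: one loader per format. */
struct toy_binfmt {
    const char *name;
    /* Returns 0 if it accepted the image, -ENOEXEC if it is not this format. */
    int (*load_binary)(const unsigned char *image, size_t len);
};

static int load_toy_elf(const unsigned char *image, size_t len)
{
    /* ELF files start with the 4 bytes 0x7f 'E' 'L' 'F'. */
    if (len < 4 || memcmp(image, "\177ELF", 4) != 0)
        return -ENOEXEC;
    return 0;
}

static int load_toy_script(const unsigned char *image, size_t len)
{
    /* Scripts start with "#!". */
    if (len < 2 || image[0] != '#' || image[1] != '!')
        return -ENOEXEC;
    return 0;
}

static const struct toy_binfmt toy_formats[] = {
    { "elf", load_toy_elf },
    { "script", load_toy_script },
};

/*
 * Try each handler in turn; the first one that does not answer -ENOEXEC wins.
 * On success *handler_name (if non-NULL) is set to the winning handler's name.
 */
int toy_search_binary_handler(const unsigned char *image, size_t len,
                              const char **handler_name)
{
    assert(image != NULL);

    for (size_t i = 0; i < sizeof(toy_formats) / sizeof(toy_formats[0]); i++) {
        int ret = toy_formats[i].load_binary(image, len);
        if (ret != -ENOEXEC) {
            if (ret == 0 && handler_name)
                *handler_name = toy_formats[i].name;
            return ret;
        }
    }
    return -ENOEXEC;
}
```

Feeding it a buffer that starts with "#!" should pick the "script" handler, and a buffer that matches nothing falls through to -ENOEXEC.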
.. I will show you how everything works
static int do_execveat_common(int fd, struct filename *filename,
                              struct user_arg_ptr argv,
                              struct user_arg_ptr envp,
                              int flags)
{
    ...
    bprm = alloc_bprm(fd, filename);
    if (IS_ERR(bprm)) {
        retval = PTR_ERR(bprm);
        goto out_ret;
    }
    retval = count(argv, MAX_ARG_STRINGS);
    if (retval == 0)
        pr_warn_once("process '%s' launched '%s' with NULL argv: empty string added\n",
                     current->comm, bprm->filename);
    if (retval < 0)
        goto out_free;
    bprm->argc = retval;
    retval = count(envp, MAX_ARG_STRINGS);
    if (retval < 0)
        goto out_free;
    bprm->envc = retval;
    retval = bprm_stack_limits(bprm);
    if (retval < 0)
        goto out_free;
    ...
    retval = bprm_execve(bprm, fd, filename, flags);
    ...
}
The bprm_execve() call near the end of the code above is where the actual binary handler eventually gets called (through exec_binprm() and search_binary_handler()). When a handler recognizes the file format, that handler's loading function is the one that actually reads the file's binary code, loads the data and creates the thread. This is the actual module (handler) for the ELF file format.
static struct linux_binfmt elf_format = {
    .module       = THIS_MODULE,
    .load_binary  = load_elf_binary,
    .load_shlib   = load_elf_library,
#ifdef CONFIG_COREDUMP
    .core_dump    = elf_core_dump,
    .min_coredump = ELF_EXEC_PAGESIZE,
#endif
};
....
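To make "recognizes the file format" a bit more concrete, here is a small user-space sketch of the first check an ELF handler has to make: do the first bytes of the file match the ELF magic? The function name file_is_elf() is my own. ELFMAG and SELFMAG come from the <elf.h> header shipped with glibc on Linux. The real load_elf_binary() checks much more than this (file type, architecture and so on).

```c
#include <assert.h>
#include <elf.h>
#include <stdio.h>
#include <string.h>

/*
 * Returns 1 if the file at `path` starts with the ELF magic, 0 if not,
 * and -1 if it cannot be opened or is too short to tell.
 */
int file_is_elf(const char *path)
{
    assert(path != NULL);

    unsigned char ident[SELFMAG];
    FILE *f = fopen(path, "rb");
    if (!f)
        return -1;

    size_t n = fread(ident, 1, sizeof(ident), f);
    fclose(f);
    if (n != sizeof(ident))
        return -1;

    /* ELFMAG is "\177ELF" and SELFMAG is 4, both from <elf.h>. */
    return memcmp(ident, ELFMAG, SELFMAG) == 0;
}
```

On Linux, file_is_elf("/proc/self/exe") is an easy way to try it, since that path points at the running program's own binary.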
OK. That was a longer journey than I expected. The amazing thing is that Linux actually keeps a list of modules that detect the file type, and that is exactly what I wanted. I needed some kind of module that detects the file type, because the system call table needs to be different for different executable formats. Now what I have to figure out is how to implement system call table switching in the kernel. To do that, I need to understand how Linux manages system calls..
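Just to pin down the idea before digging into the real thing, here is a purely hypothetical user-space toy of "one system call table per executable format". Nothing here is kernel code: the enum, the toy_task struct and the two fake system calls are all assumptions made up for illustration. The point is only that the same system call number can land on different functions depending on which format the task was loaded from.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical formats; EXEC_FMT_FOREIGN stands for a non-native format. */
enum exec_format { EXEC_FMT_NATIVE, EXEC_FMT_FOREIGN, EXEC_FMT_COUNT };

#define TOY_NR_SYSCALLS 2

typedef long (*toy_syscall_fn)(long arg);

/* Fake system calls that only return a value identifying themselves. */
static long toy_sys_read(long arg)  { (void)arg; return 100; }
static long toy_sys_write(long arg) { (void)arg; return 200; }

/* One table per format: the same number maps to a different call. */
static const toy_syscall_fn toy_tables[EXEC_FMT_COUNT][TOY_NR_SYSCALLS] = {
    [EXEC_FMT_NATIVE]  = { toy_sys_read,  toy_sys_write },
    [EXEC_FMT_FOREIGN] = { toy_sys_write, toy_sys_read  },
};

/* The format would be recorded when the binary handler loads the program. */
struct toy_task {
    enum exec_format format;
};

/* Pick the table by the task's format, then index it by syscall number. */
long toy_dispatch_syscall(const struct toy_task *task, size_t nr, long arg)
{
    assert(task != NULL && task->format < EXEC_FMT_COUNT);

    if (nr >= TOY_NR_SYSCALLS)
        return -ENOSYS;
    return toy_tables[task->format][nr](arg);
}
```

With this layout, number 0 reaches toy_sys_read() for a native task and toy_sys_write() for a foreign one. How the real kernel would store and switch such a table is exactly what I still have to find out.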