mov eax , cr0
or eax , 0x01
mov cr0 , eax




Project : Research on Multi-platform System Call Table

Journal Entry Date : 2024.03.14

Happy π day! Today I planned and finally constructed the system call detection system, and also learned lots of new things about how Linux executes a file.

I learned that there is a structure called "linux_binprm" that contains all the information about an executable file. This structure is created and passed to search_binary_handler(), which searches for the matching binary handler and ultimately executes the file.

/*
 * This structure is used to hold the arguments that are used when loading binaries.
 */
struct linux_binprm {
#ifdef CONFIG_MMU
	struct vm_area_struct *vma;
	unsigned long vma_pages;
#else
# define MAX_ARG_PAGES	32
	struct page *page[MAX_ARG_PAGES];
#endif
	struct mm_struct *mm;
	unsigned long p; /* current top of mem */
	unsigned long argmin; /* rlimit marker for copy_strings() */
	unsigned int
		/* Should an execfd be passed to userspace? */
		have_execfd:1,

		/* Use the creds of a script (see binfmt_misc) */
		execfd_creds:1,
		/*
		 * Set by bprm_creds_for_exec hook to indicate a
		 * privilege-gaining exec has happened. Used to set
		 * AT_SECURE auxv for glibc.
		 */
		secureexec:1,
		/*
		 * Set when errors can no longer be returned to the
		 * original userspace.
		 */
		point_of_no_return:1;
	struct file *executable; /* Executable to pass to the interpreter */
	struct file *interpreter;
	struct file *file;
	struct cred *cred;	/* new credentials */
	int unsafe;		/* how unsafe this exec is (mask of LSM_UNSAFE_*) */
	unsigned int per_clear;	/* bits to clear in current->personality */
	int argc, envc;
	const char *filename;	/* Name of binary as seen by procps */
	const char *interp;	/* Name of the binary really executed. Most
				   of the time same as filename, but could be
				   different for binfmt_{misc,script} */
	const char *fdpath;	/* generated filename for execveat */
	unsigned interp_flags;
	int execfd;		/* File descriptor of the executable */
	unsigned long loader, exec;

	struct rlimit rlim_stack; /* Saved RLIMIT_STACK used during exec. */

	char buf[BINPRM_BUF_SIZE];
} __randomize_layout;

One thing that caught my interest is the "buf" field. Surprisingly, Linux saves the first 256 bytes (BINPRM_BUF_SIZE) of whatever file is about to be executed. I think this is for identifying the file's signature quickly, because most executable formats put their signature at the start of the file. The function prepare_binprm() does the job of copying the first part of the file into the buffer.

static int search_binary_handler(struct linux_binprm *bprm)
{
	...
	retval = prepare_binprm(bprm);
	...
}

/* --- omitted --- */

/*
 * Fill the binprm structure from the inode.
 * Read the first BINPRM_BUF_SIZE bytes
 *
 * This may be called multiple times for binary chains (scripts for example).
 */
static int prepare_binprm(struct linux_binprm *bprm)
{
	loff_t pos = 0;

	memset(bprm->buf, 0, BINPRM_BUF_SIZE);
	return kernel_read(bprm->file, bprm->buf, BINPRM_BUF_SIZE, &pos);
}
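To play with the same idea outside the kernel, here is a minimal userspace sketch of what prepare_binprm() does: zero a fixed-size buffer, then read the start of the file into it. The names read_head() and HEAD_SIZE are made up for this example (HEAD_SIZE just mirrors the 256 bytes mentioned above), and it uses stdio instead of kernel_read(), so it is an analogy and not the kernel's code.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define HEAD_SIZE 256 /* assumption: mirrors the 256-byte buffer described above */

/*
 * Hypothetical helper: zero the buffer, then read up to HEAD_SIZE bytes
 * from the start of the file. Returns the number of bytes read, or -1
 * if the file cannot be opened.
 */
static long read_head(const char *path, unsigned char buf[HEAD_SIZE])
{
	FILE *f;
	size_t n;

	assert(path != NULL && buf != NULL);

	/* Short files leave the tail zeroed, just like the memset() above. */
	memset(buf, 0, HEAD_SIZE);

	f = fopen(path, "rb");
	if (!f)
		return -1;

	n = fread(buf, 1, HEAD_SIZE, f);
	fclose(f);
	return (long)n;
}
```

Calling read_head("/bin/ls", buf) on a typical Linux machine should leave the ELF signature in the first four bytes of buf, which is exactly what the handlers look at.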

One more interesting thing that I learned is that a bash script is not recognized by bash alone: the kernel itself handles the first step. There is a binary handler for scripts in the Linux source! It reads the "#!" line at the start of the file and then executes the interpreter named there (bash, for example) with the script as its argument.

And if we look at the ELF loader... we can see that it uses the "buf" field to identify the file type.
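As a rough illustration of the first thing a script handler has to do, the sketch below pulls the interpreter path out of a "#!" line. parse_shebang() is a hypothetical helper written for this journal, not the kernel's function, and it ignores details the real handler cares about (optional interpreter arguments, lines longer than the buffer, and so on).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper. head is the NUL-terminated start of a file.
 * On success, copies the interpreter path into interp and returns 0.
 * Returns -1 if head is not a "#!" script or the path does not fit.
 */
static int parse_shebang(const char *head, char *interp, size_t interp_size)
{
	const char *start, *end;
	size_t len;

	assert(head != NULL && interp != NULL);

	if (head[0] != '#' || head[1] != '!')
		return -1;

	/* Skip blanks between "#!" and the interpreter path. */
	start = head + 2;
	while (*start == ' ' || *start == '\t')
		start++;

	/* The path ends at the first blank, newline, or end of buffer. */
	end = start;
	while (*end != '\0' && *end != '\n' && *end != ' ' && *end != '\t')
		end++;

	len = (size_t)(end - start);
	if (len == 0 || len >= interp_size)
		return -1;

	memcpy(interp, start, len);
	interp[len] = '\0';
	return 0;
}
```

For a file that starts with "#!/bin/bash", this yields "/bin/bash"; the handler would then go on to execute that interpreter and pass it the script.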

static int load_elf_binary(struct linux_binprm *bprm)
{
	...
	struct elfhdr *elf_ex = (struct elfhdr *)bprm->buf;
	struct elfhdr *interp_elf_ex = NULL;
	struct arch_elf_state arch_state = INIT_ARCH_ELF_STATE;
	struct mm_struct *mm;
	struct pt_regs *regs;

	retval = -ENOEXEC;
	/* First of all, some simple consistency checks */
	if (memcmp(elf_ex->e_ident, ELFMAG, SELFMAG) != 0)
		goto out;
	
	...

How cool is that? Being able to look at the actual code that does such a fundamental and crucial job for everything... Linux is definitely an astonishing piece of artwork created by thousands of people.

...Enough of that. I actually planned some major parts of this research today. It's much the same as the diagram from the previous journal entry, but a bit more detailed.

It's a picture... so it's a little bit hazy.. sorry

Basically, I separated the system into two options. The first one is designed to work only on Linux (...and on other kernels/OSes that have a similar binary handler system). The binary handler determines which system call handler the executable file should use, which removes the need for one more detector-like system that detects which system calls the program uses. It avoids duplicated work by simply using a convenient feature of the Linux kernel. The second one is designed to work basically everywhere (on every kernel/OS that has the basic ability to create and execute a process). It has a system call detector (like the Linux kernel's binary handler) that identifies which system calls a process uses. This is necessary for kernels/OSes that do not have a system like Linux's binary handler, and it requires some extra detection of the file handler (or whatever handler...).
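For the second option, the detector could start out as simple as the signature checks the binary handlers already do. Here is a minimal sketch under that assumption: detect_abi() and the abi_kind values are names invented for this example, and deciding the system call table from the magic bytes alone is a simplification (ELF is used by more than one OS, and "MZ" only marks the DOS header that PE files begin with).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical labels for which system call table a file would need. */
enum abi_kind { ABI_UNKNOWN, ABI_ELF, ABI_PE, ABI_SCRIPT };

/*
 * Hypothetical detector: classify a file by the signature at its start.
 * head points at the first len bytes of the file.
 */
static enum abi_kind detect_abi(const unsigned char *head, size_t len)
{
	assert(head != NULL);

	/* ELF: 0x7f 'E' 'L' 'F' (the same check load_elf_binary() makes). */
	if (len >= 4 && memcmp(head, "\x7f" "ELF", 4) == 0)
		return ABI_ELF;

	/* Microsoft executables begin with the DOS header magic "MZ". */
	if (len >= 2 && head[0] == 'M' && head[1] == 'Z')
		return ABI_PE;

	/* Scripts begin with "#!". */
	if (len >= 2 && head[0] == '#' && head[1] == '!')
		return ABI_SCRIPT;

	return ABI_UNKNOWN;
}
```

The returned label is what would pick the system call table for the new process; a real detector would need more than the first few bytes to be reliable.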

And once the correct system call table is wired to every process, the global kernel system call table is swapped for the process's own system call table on every context switch. The system call table will probably live in each process's PCB.

...Now we need a place for every system call table that will exist in the kernel... and the binary handler for Microsoft exe files... and actually incorporating this system into the kernel... and LOTS and LOTS of other things that I definitely did not consider or even expect.
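To make the table swapping concrete, here is a toy model of the plan in plain C. Everything in it is hypothetical (struct pcb, switch_to(), do_syscall(), and the two dummy tables); it only shows the shape of the idea, a table pointer stored per process and installed on each context switch, and says nothing about how a real kernel entry path would do it.

```c
#include <assert.h>
#include <stddef.h>

#define NR_SYSCALLS 2 /* tiny tables, for illustration only */

typedef long (*syscall_fn)(long arg);

/* Hypothetical PCB: each process carries a pointer to its own table. */
struct pcb {
	int pid;
	const syscall_fn *syscall_table;
};

/* The "global" table that the system call entry path consults. */
static const syscall_fn *active_syscall_table;

/* Dummy system calls so the two tables behave differently. */
static long table_a_call0(long arg) { return arg + 100; }
static long table_a_call1(long arg) { return arg + 200; }
static long table_b_call0(long arg) { return arg + 300; }
static long table_b_call1(long arg) { return arg + 400; }

static const syscall_fn table_a[NR_SYSCALLS] = { table_a_call0, table_a_call1 };
static const syscall_fn table_b[NR_SYSCALLS] = { table_b_call0, table_b_call1 };

/* Called on every context switch: install the next process's table. */
static void switch_to(const struct pcb *next)
{
	assert(next != NULL && next->syscall_table != NULL);
	active_syscall_table = next->syscall_table;
}

/* Dispatch through whichever table is currently installed. */
static long do_syscall(size_t nr, long arg)
{
	if (active_syscall_table == NULL || nr >= NR_SYSCALLS)
		return -1;
	return active_syscall_table[nr](arg);
}
```

With two processes, one pointing at table_a and one at table_b, the same system call number lands in a different function depending on which process was switched in last.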

I really hope this goes well..