2024 Fall NCU Linux OS Project 2
- Implement custom wait-queue-like functionality in kernel space, allowing user applications to operate on it through a system call.
- A thread can add itself to the wait queue to wait; removing threads from the wait queue lets them exit in FIFO order.
Links: Project2.pdf, github, hackmd, DEMO
Environment
OS: Ubuntu 22.04 on VMware workstation 17
ARCH: x86_64
Kernel Version: 5.15.137
The following is the kernel space code.
wait queue node
typedef struct wait_queue_node {
    struct list_head list;     // links this node into my_wait_queue
    struct task_struct *task;  // process descriptor of the waiting thread
} wait_queue_node_t;
- Each node represents one entry in `my_wait_queue`; each node corresponds to one waiting process and stores:
  - `list_head list`: a `list_head` (circular doubly linked list) that chains all waiting processes together.
  - `task_struct *task`: a pointer to the process descriptor of the waiting process.
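Given only the embedded `struct list_head`, the enclosing `wait_queue_node_t` can be recovered with `list_first_entry` (a wrapper around `container_of`). A minimal sketch, assuming the `my_wait_queue` head and `queue_lock` defined just below; `peek_first_waiter` is a hypothetical helper, not part of the project code:

```c
/* Return the task at the front of the FIFO, or NULL if the queue is
 * empty. Caller must hold queue_lock. (Hypothetical helper.) */
static struct task_struct *peek_first_waiter(void)
{
    wait_queue_node_t *node;

    if (list_empty(&my_wait_queue))
        return NULL;

    /* list_first_entry(head, type, member) computes the address of the
     * containing struct from the first embedded list_head. */
    node = list_first_entry(&my_wait_queue, wait_queue_node_t, list);
    return node->task;
}
```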
global var.
static LIST_HEAD(my_wait_queue);     // self-defined FIFO queue
static DEFINE_SPINLOCK(queue_lock);  // protects list_add / list_del
- `my_wait_queue`: `LIST_HEAD` defines an empty circular doubly linked list that serves as the head node of the wait queue.
- `queue_lock`: a spin lock (statically initialized here with `DEFINE_SPINLOCK`) that protects concurrent operations on `my_wait_queue` from race conditions. Only one thread at a time can take the lock, so additions and deletions on `my_wait_queue` never run concurrently.
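If the lock cannot be initialized statically with `DEFINE_SPINLOCK`, it must be initialized at runtime before first use; an uninitialized `spinlock_t` is not valid, especially with lock debugging enabled. A minimal sketch of the runtime alternative, assuming an init function exists:

```c
static spinlock_t queue_lock;

static int __init my_wait_queue_init(void)
{
    spin_lock_init(&queue_lock); /* runtime equivalent of DEFINE_SPINLOCK */
    return 0;
}
```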
enter_wait_queue
static int enter_wait_queue(void)
{
    wait_queue_node_t *node;

    // allocate a new node
    node = kmalloc(sizeof(wait_queue_node_t), GFP_KERNEL);
    if (!node)
        return 0;
    node->task = current;

    spin_lock(&queue_lock);
    ////////// Critical Region ///////////////////
    list_add_tail(&node->list, &my_wait_queue); // append to the FIFO queue
    printk(KERN_INFO "Thread ID %d added to wait queue\n", current->pid);
    set_current_state(TASK_INTERRUPTIBLE); // mark the sleep state before dropping the lock, so a wake-up cannot slip in between and get lost
    ////////// Critical Region End ///////////////
    spin_unlock(&queue_lock);
    schedule(); // sleep until wake_up_process()
    return 1;   // joined successfully
}
- `spin_lock` protects `list_add_tail`: only one thread at a time can add a node to `my_wait_queue`, so no race condition can occur.
- `list_add_tail` appends the node to the tail of the queue.
- The process state is set to `TASK_INTERRUPTIBLE` and `schedule()` puts the current process to sleep; it can leave the wait queue only after `wake_up_process` is called. Note that the state is set before the spinlock is dropped, so a wake-up issued right after the unlock cannot be lost.
:::warning
Remember to put `schedule();` outside the spinlock.
Calling `schedule();` inside the critical region would make the process sleep while still holding the lock, leading to deadlock.
:::
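For reference, a minimal sketch of the canonical sleep pattern behind this warning: set the task state before dropping the lock, then sleep, re-checking the wake condition in a loop to tolerate spurious wake-ups. `done` is a hypothetical flag owned by the sleeper, not part of the project code:

```c
spin_lock(&queue_lock);
list_add_tail(&node->list, &my_wait_queue);
set_current_state(TASK_INTERRUPTIBLE); /* 1. declare intent to sleep */
spin_unlock(&queue_lock);              /* 2. drop the lock ...       */

while (!READ_ONCE(done)) {             /* 3. tolerate spurious wake-ups */
    schedule();                        /* 4. ... then actually sleep    */
    set_current_state(TASK_INTERRUPTIBLE);
}
__set_current_state(TASK_RUNNING);
```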
clean_wait_queue
static int clean_wait_queue(void)
{
    wait_queue_node_t *node, *temp;
    LIST_HEAD(tmp_list);

    spin_lock(&queue_lock);
    ////////// Critical Region ///////////////////
    list_splice_init(&my_wait_queue, &tmp_list); // detach every node while holding the lock
    ////////// Critical Region End ///////////////
    spin_unlock(&queue_lock);

    // Wake the threads in FIFO order outside the lock:
    // msleep() sleeps, and sleeping while holding a spinlock would deadlock
    list_for_each_entry_safe(node, temp, &tmp_list, list) {
        list_del(&node->list); // remove from the (detached) queue
        printk(KERN_INFO "Thread ID %d removed from wait queue\n", node->task->pid);
        wake_up_process(node->task); // wake the process
        kfree(node);                 // free the node's memory
        msleep(100);                 // sleep 100 ms
    }
    return 1; // cleaned successfully
}
- `list_for_each_entry_safe` traverses the wait queue safely even while entries are being removed.
- `list_del` removes each node from the queue.
- `wake_up_process` wakes the corresponding process, and the node's memory is then freed with `kfree`.
- `list_splice_init` first detaches the whole queue while holding the lock, so the wake-up loop (which sleeps in `msleep`) can run outside the spinlock.
:::success
`msleep(100)` is added here because the print order in user space can be scrambled by multithreading.
Even though the kernel-space wait queue is guaranteed to remove entries in FIFO order, user-space I/O cannot guarantee the messages are printed in that order,
so sleeping 100 ms after each `wake_up_process` gives user space time to print its message before the next thread is woken.
:::
list_head
/*
* Circular doubly linked list implementation.
*
* Some of the internal functions ("__xxx") are useful when
* manipulating whole lists rather than single entries, as
* sometimes we already know the next/prev entries and we can
* generate better code by using them directly rather than
* using the generic single-entry routines.
*/
#define LIST_HEAD_INIT(name) { &(name), &(name) }
#define LIST_HEAD(name) \
struct list_head name = LIST_HEAD_INIT(name)
As the comment says, this is a circular doubly linked list, the data structure we use to store the wait queue.
Both `next` and `prev` of the `list_head` point to the variable `name` itself, forming an empty circular doubly linked list.
In other words, `LIST_HEAD` defines an empty doubly linked list that serves as the head node of the wait queue.
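A minimal sketch of what the macro expands to, and the empty-list invariant it creates (`list_empty()` simply checks `head->next == head`):

```c
struct list_head my_wait_queue = { &my_wait_queue, &my_wait_queue };
/* what LIST_HEAD(my_wait_queue) expands to:
 * next == prev == &my_wait_queue, so list_empty(&my_wait_queue) is true */
```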
list_add_tail
/*
* Insert a new entry between two known consecutive entries.
*
* This is only for internal list manipulation where we know
* the prev/next entries already!
*/
static inline void __list_add(struct list_head *new,
struct list_head *prev,
struct list_head *next)
{
if (!__list_add_valid(new, prev, next))
return;
next->prev = new;
new->next = next;
new->prev = prev;
WRITE_ONCE(prev->next, new);
}
// ...
/**
* list_add_tail - add a new entry
* @new: new entry to be added
* @head: list head to add it before
*
* Insert a new entry before the specified head.
* This is useful for implementing queues.
*/
static inline void list_add_tail(struct list_head *new, struct list_head *head)
{
__list_add(new, head->prev, head);
// = add (new, tail-node , head)
}
- `head` is the head node of the circular linked list.
- `head->prev` yields the tail node, so the new node is inserted between the tail node and the head node, i.e., appended to the very end of the linked list.
- `__list_add` simply inserts `new` between `prev` and `next` through four pointer updates:
  - `next->prev = new` and `new->next = next`
  - `new->prev = prev` and `prev->next = new`
Illustration:
# circular doubly linked list
before add: head <-> A <-> B <-> C <-> (back to head)
# append the new node to the end:
after add : head <-> A <-> B <-> C <-> new <-> (back to head)
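To make the four pointer updates concrete, here is a small self-contained user-space sketch (a simplified re-implementation for illustration, not the kernel headers):

```c
#include <stdio.h>

struct list_head { struct list_head *next, *prev; };

/* the same four updates as the kernel's __list_add() */
static void my_list_add(struct list_head *new,
                        struct list_head *prev,
                        struct list_head *next)
{
    next->prev = new;
    new->next = next;
    new->prev = prev;
    prev->next = new;
}

int main(void)
{
    struct list_head head = { &head, &head }; /* empty circular list */
    struct list_head a, b;

    my_list_add(&a, head.prev, &head); /* append a: head <-> a        */
    my_list_add(&b, head.prev, &head); /* append b: head <-> a <-> b  */

    printf("%d\n", head.next == &a && a.next == &b && b.next == &head); /* prints 1 */
    return 0;
}
```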
list_del
/*
 * Delete a list entry by making the prev/next entries point to each other.
 *
 * This is only for internal list manipulation where we know
 * the prev/next entries already!
 */
static inline void __list_del(struct list_head * prev, struct list_head * next)
{
next->prev = prev;
WRITE_ONCE(prev->next, next);
}
/*
* Check validity of deletion
* */
static inline void __list_del_entry(struct list_head *entry)
{
if (!__list_del_entry_valid(entry))
return;
__list_del(entry->prev, entry->next);
}
/**
* list_del - deletes entry from list.
* @entry: the element to delete from the list.
* Note: list_empty() on entry does not return true after this, the entry is
* in an undefined state.
*/
static inline void list_del(struct list_head *entry)
{
__list_del_entry(entry);
entry->next = LIST_POISON1; // (void *) 0x00100100, invalid addr.
entry->prev = LIST_POISON2; // (void *) 0x00200200, invalid addr.
}
- Deletes the given node from the doubly linked list (`list_head`).
- The node is unlinked by setting `next->prev = prev` and `prev->next = next`.
- `LIST_POISON1` and `LIST_POISON2` explicitly mark the deleted node, making accidental reuse easy to detect.
Illustration:
# circular doubly linked list
before delete: head <-> A <-> prev <-> dnode <-> next
# delete dnode:
after delete : head <-> A <-> prev <-> next
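Continuing the user-space sketch from the `list_add_tail` section, deletion is the mirror image: two pointer updates plus poisoning:

```c
/* the same two updates as the kernel's __list_del(), plus poisoning */
static void my_list_del(struct list_head *entry)
{
    entry->next->prev = entry->prev;
    entry->prev->next = entry->next;
    entry->next = (void *)0x00100100; /* LIST_POISON1 */
    entry->prev = (void *)0x00200200; /* LIST_POISON2 */
}
```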
list_for_each_entry_safe
/**
* list_for_each_entry_safe - iterate over list of given type safe against removal of list entry
* @pos: the type * to use as a loop cursor.
* @n: another type * to use as temporary storage
* @head: the head for your list.
* @member: the name of the list_head within the struct.
*/
#define list_for_each_entry_safe(pos, n, head, member) \
for (pos = list_first_entry(head, typeof(*pos), member), \
n = list_next_entry(pos, member); \
!list_entry_is_head(pos, head, member); \
pos = n, n = list_next_entry(n, member))
- `pos` holds the current node, and `n` holds the node after `pos` (it is usually a temporary, here named `temp`).
- `n` guarantees that the address of the next node is still available after the current node (`pos`) has been deleted.
- After `pos` is deleted, `pos = n` and `n = list_next_entry(n, member)` move the cursor to the next node, so deletion during iteration stays correct.
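A minimal sketch of why the `_safe` variant is required when nodes are freed during traversal (kernel style, reusing the project's `wait_queue_node_t`):

```c
wait_queue_node_t *node, *temp;

/* With plain list_for_each_entry(), the loop would advance via
 * node->list.next *after* kfree(node), i.e. read freed memory.
 * The _safe variant saves the successor in temp first. */
list_for_each_entry_safe(node, temp, &my_wait_queue, list) {
    list_del(&node->list); /* safe: temp already points past node */
    kfree(node);
}
```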
:::spoiler wait_queue.c
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <pthread.h>
#include <sys/syscall.h>
#define NUM_THREADS 10
#define SYS_CALL_MY_WAIT_QUEUE 452
void *enter_wait_queue(void *thread_id)
{
    pid_t tid = syscall(SYS_gettid); // get the kernel thread id
    fprintf(stderr, "enter wait queue thread_id: %d, current thread id = %d\n", *(int *)thread_id, tid);
    syscall(SYS_CALL_MY_WAIT_QUEUE, 1); // enter the wait queue and sleep
    fprintf(stderr, "exit wait queue thread_id: %d, current thread id = %d\n", *(int *)thread_id, tid);
    return NULL;
}
void *clean_wait_queue()
{
    syscall(SYS_CALL_MY_WAIT_QUEUE, 2); // wake every waiting thread in FIFO order
    return NULL;
}
int main()
{
    void *ret;
    pthread_t id[NUM_THREADS];
    int thread_args[NUM_THREADS];
    for (int i = 0; i < NUM_THREADS; i++)
    {
        thread_args[i] = i;
        pthread_create(&id[i], NULL, enter_wait_queue, (void *)&thread_args[i]);
    }
    sleep(1); // give every thread time to enter the wait queue
    fprintf(stderr, "start clean queue ...\n");
    clean_wait_queue();
    for (int i = 0; i < NUM_THREADS; i++)
    {
        pthread_join(id[i], &ret);
    }
    return 0;
}
:::
User space result:
Checking the loop index `i` from the original user-space code against the real thread IDs shows that
the order of entering and leaving the wait queue stays FIFO.
Kernel space result:
:::spoiler my_wait_queue.c
#include <linux/sched.h>
#include <linux/kernel.h>
#include <linux/syscalls.h>
#include <linux/delay.h>
static int condition = 0;
DECLARE_WAIT_QUEUE_HEAD(my_wait_queue);
static DEFINE_MUTEX(my_wait_queue_lock);
static int enter_wait_queue(void)
{
    condition = 0;
    DECLARE_WAITQUEUE(wait_entry, current);
    mutex_lock(&my_wait_queue_lock); // lock
    add_wait_queue_exclusive(&my_wait_queue, &wait_entry);
    printk("before join wait queue, %d", current->pid);
    wait_event_exclusive_cmd(my_wait_queue, condition, mutex_unlock(&my_wait_queue_lock), mutex_lock(&my_wait_queue_lock));
    printk("after waiting event, %d", current->pid);
    remove_wait_queue(&my_wait_queue, &wait_entry);
    if (waitqueue_active(&my_wait_queue) != 0) {
        msleep(100);
        wake_up(&my_wait_queue);
    }
    mutex_unlock(&my_wait_queue_lock); // unlock
    return 0;
}
static int clean_wait_queue(void)
{
    printk("start waking up process.");
    mutex_lock(&my_wait_queue_lock); // lock
    condition = 1;
    if (waitqueue_active(&my_wait_queue) != 0)
        wake_up(&my_wait_queue);
    mutex_unlock(&my_wait_queue_lock); // unlock
    return 0;
}
SYSCALL_DEFINE1(call_my_wait_queue, int, id)
{
    switch (id) {
    case 1:
        enter_wait_queue();
        break;
    case 2:
        clean_wait_queue();
        break;
    }
    return 0;
}
:::
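The write-up does not show how syscall number 452 gets registered. On x86_64 this is done by adding a row to the syscall table and declaring the entry point, roughly as follows (a sketch; the free slot number and exact rows depend on the kernel tree):

```
# arch/x86/entry/syscalls/syscall_64.tbl  (sketch)
452    common    call_my_wait_queue    sys_call_my_wait_queue

/* include/linux/syscalls.h  (sketch) */
asmlinkage long sys_call_my_wait_queue(int id);
```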
static int condition = 0;
DECLARE_WAIT_QUEUE_HEAD(my_wait_queue);
static DEFINE_MUTEX(my_wait_queue_lock);
This block initializes the variables.
6:
The `condition` variable, which `wait_event()` uses to decide whether the process has the resource it needs, is first set to 0 (telling `wait_event` that the process cannot obtain the resource yet and has to wait).
7:
`DECLARE_WAIT_QUEUE_HEAD()` creates a wait queue head instance as a global variable: `my_wait_queue`.
8:
`DEFINE_MUTEX` defines a static mutex lock instance: `my_wait_queue_lock`.
static int enter_wait_queue(void)
{
    condition = 0;
    DECLARE_WAITQUEUE(wait_entry, current);
    mutex_lock(&my_wait_queue_lock); // lock
    add_wait_queue_exclusive(&my_wait_queue, &wait_entry);
    printk("before join wait queue, %d", current->pid);
    wait_event_exclusive_cmd(my_wait_queue, condition, mutex_unlock(&my_wait_queue_lock), mutex_lock(&my_wait_queue_lock));
    printk("after waiting event, %d", current->pid);
    remove_wait_queue(&my_wait_queue, &wait_entry);
    if (waitqueue_active(&my_wait_queue) != 0) {
        msleep(100);
        wake_up(&my_wait_queue);
    }
    mutex_unlock(&my_wait_queue_lock); // unlock
    return 0;
}
11:
`condition` is set back to 0 so that a process wanting to join the wait queue will correctly enter the wait state.
12:
The `DECLARE_WAITQUEUE()` macro from `wait.h` creates a new wait queue entry.
13:
The mutex is taken so that no race condition can occur while `add_wait_queue()` runs.
14:
The `wait.h` function `add_wait_queue_exclusive()` adds the process to the wait queue. Exclusive insertion was chosen for two reasons: first, a later plain `wake_up()` wakes exactly one process at a time (reducing the chance that the printf order in the fixed, unmodifiable user-space code comes out wrong); second, `add_wait_queue_exclusive()` appends the entering process to the tail of the wait queue, which exactly matches the FIFO requirement of this assignment.
17:
The `wait.h` function `wait_event_exclusive_cmd()` puts the process into the wait state (it lacks the resource it needs, `condition == 0`, so it waits). The last two cmd arguments run right before the process starts waiting and right after it is woken, respectively: here the mutex is unlocked before waiting begins and re-acquired after wake-up. While waiting, the process is effectively parked on this line.
18:
Once woken (and after re-acquiring the mutex), the process resumes execution from this line.
19:
The `wait.h` function `remove_wait_queue()` removes the process from the wait queue.
20~23:
The process then checks whether the wait queue still contains other waiting processes, and if so wakes the next one (again to deal with the printf ordering issue in the unmodifiable user-space code).
24:
Before the woken process returns to user space, it unlocks the mutex so that the process it just woke with `wake_up()` can start running.
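For reference, `wait_event_exclusive_cmd()` in v5.15 has roughly the following shape (paraphrased from `include/linux/wait.h`; `cmd1` runs before each `schedule()` and `cmd2` after it):

```c
#define wait_event_exclusive_cmd(wq_head, condition, cmd1, cmd2)     \
do {                                                                  \
    if (condition)                                                    \
        break;                                                        \
    /* queues an exclusive entry, then loops:                         \
     *   set TASK_INTERRUPTIBLE; if (condition) break;                \
     *   cmd1; schedule(); cmd2;                                      \
     */                                                               \
    __wait_event_exclusive_cmd(wq_head, condition, cmd1, cmd2);       \
} while (0)
```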
static int clean_wait_queue(void)
{
    printk("start waking up process.");
    mutex_lock(&my_wait_queue_lock); // lock
    condition = 1;
    if (waitqueue_active(&my_wait_queue) != 0)
        wake_up(&my_wait_queue);
    mutex_unlock(&my_wait_queue_lock); // unlock
    return 0;
}
The implementation inside this function is identical to lines 19~24 of `enter_wait_queue()` above, which are responsible for waking other processes; its main job is to wake the first process.
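Putting it together, the wake-ups form a FIFO chain. A sketch of the sequence, assuming three threads entered in the order T1, T2, T3:

```
clean_wait_queue(): condition = 1, wake_up()      -> wakes T1 (exclusive)
T1: wakes, removes itself, msleep(100), wake_up() -> wakes T2
T2: wakes, removes itself, msleep(100), wake_up() -> wakes T3
T3: wakes, removes itself, queue empty            -> no further wake_up()
```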