This repository contains all the code for my BSc thesis project, including a MERNG application for data collection and the Jupyter notebook that performs the information extraction.

ParsaHejabi/BSc-Thesis-Project

Bachelor Thesis Project: Information Extraction from users’ requests in the Soha system

About The Project

Soha is the name of a Persian chatbot that will be used by the faculty, students, and members of the Department of Computer Science and Engineering of Shahid Beheshti University.

The chatbot will process users' inputs and guide them through their requests. This project handles the information extraction phase of the larger Soha project by extracting entities from unstructured inputs, such as:

  • The student's ID
  • The student's entry year
  • The student's GPA
  • The name of the student
  • The name of the course
  • The type of request (such as dropping a course or semester withdrawal)

A combination of rule-based methods (using regex) and deep learning methods (using the BERT language model) was used for this task.
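As a rough illustration of the rule-based side, the sketch below pulls a student ID and a GPA out of free text with regular expressions. It is not the code from the notebook: the 8-digit ID format, the 0 to 20 GPA scale, and the names `extract_entities`, `STUDENT_ID_RE`, and `GPA_RE` are all assumptions made for this example. Persian digits are mapped to ASCII first so that one set of patterns covers both scripts.

```python
import re

# Map Persian digits to ASCII digits before matching.
PERSIAN_DIGITS = str.maketrans("۰۱۲۳۴۵۶۷۸۹", "0123456789")

# Hypothetical formats, chosen only for illustration:
# an 8-digit student ID that is not part of a longer number.
STUDENT_ID_RE = re.compile(r"(?<!\d)\d{8}(?!\d)")
# a GPA on an assumed 0-20 scale, written with one or two decimals.
GPA_RE = re.compile(r"(?<![\d.])(?:20(?:\.0{1,2})?|1?\d\.\d{1,2})(?![\d.])")


def extract_entities(text):
    """Return the first student ID and GPA found in text, or None for each."""
    normalized = text.translate(PERSIAN_DIGITS)
    student_id = STUDENT_ID_RE.search(normalized)
    gpa = GPA_RE.search(normalized)
    return {
        "student_id": student_id.group() if student_id else None,
        "gpa": float(gpa.group()) if gpa else None,
    }


if __name__ == "__main__":
    print(extract_entities("My student ID is 96243012 and my GPA is 17.25"))
```

Fixed-format fields like these suit regex well. Entities with open-ended wording, such as course names or the request type, are where a learned model like BERT would be expected to help more.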

Because no existing dataset met this project's needs, a crowdsourcing website was launched that encouraged people to submit their possible requests during a two-week contest with prizes. The resulting dataset will also be used for other subprojects of the Soha system.

This repository contains:

  • The code for a MERNG (MongoDB, Express, React, Node.js, GraphQL) crowdsourcing web application that collects students' sample inputs using gamification methods
  • A Jupyter notebook containing the information extraction phase of the project, which uses the BERT language model

Contact

Parsa Hejabi - @callme_parsa

Project Link: https://github.com/ParsaHejabi/DS-BankFinder-Project
