Description
In this course, you will learn to:
- Generate assessments such as MCQs and True/False questions from any content using cutting-edge natural language processing algorithms.
- Use recent breakthroughs such as BERT, OpenAI GPT-2, and T5 transformers to solve real-world edtech problems.
- Make use of NLP libraries such as spaCy, NLTK, AllenNLP, and Hugging Face Transformers.
- Deploy transformer models such as T5 to production in a serverless manner utilising ONNX quantization and dockerization with FastAPI.
- Run all of these algorithms in the Google Colab environment.
Syllabus:
1. Generate distractors (wrong choices) for MCQ options
- Theory - Generate distractors using WordNet
- Code - Generate distractors using WordNet
- Theory - Generate distractors using ConceptNet
- Code - Generate distractors using ConceptNet
- Theory - Generate distractors using Sense2vec
- Code - Generate distractors using Sense2vec
- Theory - Generate distractors using Sentence Transformers
- Code - Generate distractors using Sentence Transformers
2. Generate True or False Questions using Constituency Parsing and OpenAI GPT-2
- Theory - Constituency Parsing and OpenAI GPT-2
- Code - Split a sentence using constituency parsing
- Code - Another example to split a sentence using constituency parsing
- Code - Generate alternate endings to a split sentence using OpenAI GPT-2
- Assignment - Sort the generated sentences in the order of dissimilarity
- Assignment Solution - Sort the generated sentences using Sentence-BERT
3. Train a question generation model using T5 transformer
- Training methodology, dataset and decoding methods for text generation
- Code - Download and preprocess the SQuAD dataset
- Code - Understanding T5 Tokenizer
- Code - Prepare a PyTorch Dataset class for T5
- Code - Train T5 transformer model
- Code - Use the trained T5 model to perform inference
4. Generate Fill in the blanks questions from any content
- Theory - Generate fill in the blanks
- Code - Generate fill in the blanks
5. Generate Match the following questions from any content
- Theory - Generate Match the following
- Code - Extract keywords from any content
- Code - BERT Word Sense Disambiguation (WSD)
6. Production deployment of Question Generation Models
- Speed up T5 model by ONNX conversion and use Gradio app for easy visualization
- Install Docker locally on your operating system
- Dockerize T5 model with FastAPI and create a local API
- Serverless deployment on Google Cloud Run