Semester 3 Review of MTech DSE
Written: 04 May 2023 by Vinayak Nayak 🏷 ["miscellaneous"]
Introduction
Hello World! I completed the third semester of the MTech in Data Science and Engineering (DSE) offered by the Work Integrated Learning Programmes (WILP) division of BITS Pilani.
In this post, we shall look at the following topics one by one.
- Courses offered
- Personal Experience
- Coursework
- Course Assignment/s
- Evaluation component involving tests
- Tips for getting the most out of this semester
Courses
In my opinion, this is the most important semester of the entire course so far. It is primarily elective-based: you have to complete a minimum of 15 credits this semester and select 3 elective courses from those offered. One course, Deep Learning, is mandatory. During my time, the courses offered were as follows:
Course | Credits |
---|---|
Information Retrieval | 3 |
Natural Language Processing | 3 |
Deep Learning | 4 |
Probabilistic Graphical Models | 4 |
Big Data Systems | 5 |
Stream Processing & Analytics | 5 |
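To make the credit arithmetic concrete, here is a minimal sketch that lists which choices of 3 electives reach the 15-credit minimum. It assumes the minimum includes the 4 credits of the mandatory Deep Learning course, which is my reading and not an official rule; the names `ELECTIVES`, `MANDATORY_CREDITS`, `MIN_CREDITS` and `valid_combinations` are purely illustrative.

```python
from itertools import combinations

# Credits copied from the table above (Deep Learning is handled separately).
ELECTIVES = {
    "Information Retrieval": 3,
    "Natural Language Processing": 3,
    "Probabilistic Graphical Models": 4,
    "Big Data Systems": 5,
    "Stream Processing & Analytics": 5,
}
MANDATORY_CREDITS = 4  # Deep Learning
MIN_CREDITS = 15       # assumption: the minimum counts the mandatory course too


def valid_combinations(electives=ELECTIVES, k=3):
    """Return (total_credits, names) for every k-elective choice meeting the minimum."""
    results = []
    for combo in combinations(sorted(electives), k):
        total = MANDATORY_CREDITS + sum(electives[name] for name in combo)
        if total >= MIN_CREDITS:
            results.append((total, combo))
    return sorted(results)


if __name__ == "__main__":
    for total, combo in valid_combinations():
        print(total, ", ".join(combo))
```

Under this assumption, only Information Retrieval + Natural Language Processing + Probabilistic Graphical Models falls short (14 credits), while the heaviest choice (Probabilistic Graphical Models, Big Data Systems and Stream Processing & Analytics) adds up to 18 credits.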
The choice of electives should be made keeping several factors in mind, some of which are as follows:
- Which elective interests you the most?
- Which elective offers you skills that are relevant to your current work?
- Which elective may offer skills that can help you transition from your current role to your dream role?
- Who are the instructors for that elective? What is their background, and what feedback do your seniors have about them?
- Which courses can help you gain the skills needed to undertake a dissertation topic of your interest in the subsequent semester?
- What you already know vs what you want to learn?
Personal Experience
I opted for the electives Probabilistic Graphical Models (PGM), Big Data Systems (BDS) and Stream Processing and Analytics (SPA). I was really keen on learning Information Retrieval, but the coursework seemed far too basic and introductory, and I was already familiar with many of the topics it covered. The same was the case with Natural Language Processing, so I took the remaining electives. I was especially keen to learn PGM and BDS, and these courses didn't let me down: I thoroughly enjoyed taking them. SPA is interesting and very relevant these days, primarily because it is essential for handling the large volume, variety and velocity of data generated in real time, and it has been gaining traction rapidly over the last decade or so.
Let me highlight a few important topics that we covered in each of the above electives, along with Deep Learning (DL), which was the mandatory subject.
Deep Learning
- Perceptron/Multi-Layer Perceptron/Artificial Neural Networks
- Optimization Techniques for DL (GD, SGD, SGD with Momentum, R-Prop, Adam)
- Convolutional Neural Networks
- Recurrent Neural Networks
- Basics of Transformers & Attention Mechanism
- Basics of GANs and Autoencoders
Probabilistic Graphical Models
- Basics of Probability Theory & Statistics
- Directed Graphical Models (Bayesian Networks)
- Undirected Graphical Models (Markov Networks)
- Exact Inference
- Approximate Inference
- Markov Chains
- Parameter Learning
- Structure Learning
Big Data Systems
- Refresher on Memory Hierarchy, Caching, Spatial & Temporal Locality
- Big Data Systems & Distributed Computing
- Hadoop Ecosystem
- CAP Theorem & NoSQL Datastores
- In Memory Computing (Apache Spark Core)
- Cloud Computing
Stream Processing and Analytics
- Scalable Streaming Systems
- Generalized Stream System architecture
- Apache Kafka
- Apache Spark Streaming
- Streaming system Algorithms
The assignments for each of the courses were meticulously crafted so that we could demonstrate what we had learned in the course. We simulated stock trading, generated captions for images and analyzed huge datasets (~1 GB).
They were centered around learning things that are useful in the real world, and at the same time made us pick up the most relevant tech stack for the respective project. Some of the technologies we used first-hand while coding the assignments were as follows:
- Hadoop (HDFS) + Map-Reduce
- MongoDB
- Hive/Pig/HBase
- Apache Spark
- Apache Kafka
- TensorFlow/Keras
A complete overview of the assignments, along with the source code, can be found here and here.
Attending lectures gives a lot of food for thought and helps us understand many things, but assignments are the most critical part: they are where the rubber meets the road, where the concepts and ideas conveyed by the professor crystallize, and where you become more confident in your ability to do things in the real world. As much as 20-25% of the evaluation is reserved for assignments, which is indicative of their importance.
Please note that the assignments are not a piece of cake, which is why the program creators have made them group assignments as opposed to individual ones. Since the target audience is working professionals, who are not in a position to dedicate 5-6 hours a day to studies and assignments, it is very important that you choose your group members wisely. Have a look at their LinkedIn profiles, understand their interests, get on a quick call with them, and spend even as little as 3-4 hours with them before you decide to form a group. This can go a long way, as you will have someone to assist you with your learning and vice versa. I say this from personal experience; it will get very hectic if you don't choose your assignment partners wisely. Of course, you might feel you will learn more on your own if you are stuck with lazy partners, but it will come at the cost of your health and well-being. So choose wisely, because with good partners you can complement each other's learning and understanding, and also get a wider perspective on several things instead of just one.
The mid-semester test (30%) and the comprehensive test (40%) are designed to test your understanding and not your agility. The faculty understands that we aren't taking the CET to get into an engineering college; we are already working professionals, and they value clarity and critical thinking over speed. Hence, the papers are set accordingly, i.e.
- Most of the time, the paper would be more than 50% numericals, with little theory.
- Theory questions would also be along the lines of comparison/contrast/application as opposed to definition.
- Time is rarely a constraint; the papers are mostly doable well within the provided time.
- Most of the questions are original, so there is no use turning to Chegg or searching online for exact solutions.
- You can expect proofs, justifying/rejecting arguments, logical reasoning kind of questions in the exams.
- Although the examination platform permits you to type your answers, I would recommend writing them down on paper and uploading scanned pictures, because there are some glitches with the submission of typed answers and people have suffered because of that. Trust me, even if you are a slow writer, you will be able to complete the paper if you've attended the classes and know your concepts well.
How to get the most out of this Semester
- Select the elective topics based on any or all of the following points:
  - Interest/Curiosity
  - Immediate relevance to your current job or the job that you aspire to be in
  - Professors' aptitude and pedagogy (look at the slides that I have uploaded for reference here, or connect with your seniors for feedback)
  - Which skills/technologies are relevant right now, or will be by the time you graduate
- Actively engage in lectures, ask questions and get most of your doubts cleared there. This will prove really helpful in your mid-semester and comprehensive examinations.
- Fully participate in the assignments. They are really thought-provoking and at the same time very practical; they will build your knowledge of many libraries, frameworks and platforms, which will improve your standing in the job market.
- Select your peers in assignment groups very mindfully. You may select different people for different electives; it is not compulsory to have one group for all the subjects' assignments.
I hope this article gave you a decent idea of how you could go about the final coursework semester in the BITS MTech Programme for Data Science. If you liked what you read, you can read my other posts here and connect with me on LinkedIn here. Thanks for reading through!