JUSTIN SEONYONG LEE


My Experience Taking "XCS224n: NLP with Deep Learning"

December 29, 2022


This is a short post on my experience taking XCS224n, or "NLP with Deep Learning," this fall through the Stanford Center for Professional Development (SCPD). Before taking this course, I couldn't find much information about it online. Here, I will share what it covers, how it differs from the better-known CS224n, and who I think the course would benefit.



What is XCS224n?

XCS224n is an entirely online version of Stanford University's on-campus course CS224n, also called "NLP with Deep Learning," taught by Christopher Manning. Because it is offered through the SCPD, its target audience is working professionals with some machine learning experience. The course cost $1595 when I took it, although the SCPD is increasing tuition to $1750 for all of their "AI Professional Program" courses.

Some fast facts: the course is an in-depth and technical survey of natural language processing using neural networks. It covers the progression of NLP technology from the first efforts to create word vectors, on to RNN-based approaches for various tasks (e.g. classification, NER, translation) and the motivation for attention in RNNs, leading into the current Transformer-based state of affairs. Along the way, Prof. Manning gives a detailed review of general deep learning concepts, such as gradient descent, computation graphs, and the backpropagation algorithm.

The course requires comfort with multivariable calculus, probability, and linear algebra. Basic knowledge of data structures and algorithms is also helpful for understanding both the course concepts (e.g. topological sort for backprop) and their applications (such as how stacks are used in dependency parsers). All coding assignments are done in Python and require the use of PyTorch and NumPy. The course provides lectures and tutorials on these libraries if you are not familiar with them; that said, I would personally advise getting some hands-on exposure to both before taking this course.
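To make the topological sort remark concrete, here is a minimal sketch of backpropagation on a scalar computation graph. It is not taken from the course materials: the `Node` class and the `topological_order` and `backward` functions are hypothetical names chosen for illustration, and only addition and multiplication are supported.

```python
class Node:
    """A scalar value in a computation graph (hypothetical helper, not course code)."""

    def __init__(self, value, parents=(), local_grads=()):
        self.value = value
        self.parents = parents          # nodes this node was computed from
        self.local_grads = local_grads  # d(self)/d(parent), one entry per parent
        self.grad = 0.0                 # d(output)/d(self), filled in by backward()

    def __add__(self, other):
        return Node(self.value + other.value, (self, other), (1.0, 1.0))

    def __mul__(self, other):
        return Node(self.value * other.value, (self, other), (other.value, self.value))


def topological_order(output):
    """Return the graph's nodes so that each node appears after all of its parents."""
    order, visited = [], set()

    def visit(node):
        if id(node) in visited:
            return
        visited.add(id(node))
        for parent in node.parents:
            visit(parent)
        order.append(node)

    visit(output)
    return order


def backward(output):
    """Accumulate d(output)/d(node) into node.grad for every node in the graph."""
    output.grad = 1.0
    # Walking the order in reverse guarantees a node's gradient is complete
    # before it is pushed back to its parents via the chain rule.
    for node in reversed(topological_order(output)):
        for parent, local in zip(node.parents, node.local_grads):
            parent.grad += node.grad * local


x = Node(2.0)
y = Node(3.0)
z = x * y + x  # z = xy + x, so dz/dx = y + 1 and dz/dy = x
backward(z)
print(x.grad, y.grad)
```

Because `x` feeds into `z` along two paths, its gradient is only correct once both contributions have been summed, which is exactly what processing nodes in reverse topological order ensures.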

The course is conducted entirely online and lasts around 10 weeks. It is designed to require an average commitment of 10-15 hours per week, which I found to be more or less correct. Course materials are delivered through a combination of GitHub and SCPD's own online portal. The course is conducted asynchronously: students are provided with a series of CS224n video lectures pre-recorded by Prof. Manning, edited for brevity and split into shorter sub-videos for ease of navigation. The SCPD portal keeps track of your progress on each lecture as you move through the curriculum. The lectures can be watched at any pace; however, there are five assignments with hard deadlines distributed throughout the 10 weeks. At the end, all students with a score of 70% or higher receive a certificate from the SCPD acknowledging that they passed the course.

There is no interaction with students of the on-campus CS224n, nor is there in-person instruction by Prof. Manning. Each student is assigned to a Course Facilitator (CF), all of whom are Stanford affiliates who have taken CS224n previously. Communication with CFs, course staff, and fellow students is done via a Slack community maintained by SCPD.

The main difference between XCS224n and CS224n is the final project. The capstone of CS224n is a project in which students work individually or in groups to apply what they have learned to a problem of their choosing; XCS224n has no such project. If this is a dealbreaker for you, there does seem to be a way of taking CS224n online. It costs in the neighborhood of $5000 and, unlike XCS224n, grants Stanford academic credit. I cannot speak further to this as I didn't take this option.



Isn't all of this available online for free?

Stanford has made the lecture videos for CS224n (in fact, both the Winter 2019 and Winter 2021 iterations) available on YouTube for free. These, in and of themselves, are some of the best resources available anywhere for getting an introduction to NLP. Since the assignments are also available on the CS224n website, you could definitely go through the course curriculum on your own.

I won't go into the broader pros and cons of paid courses versus self-guided learning; what you prefer would depend on your learning style, personal circumstances, and professional goals. That said, I can think of several benefits to taking the course in this instance.

First, having access to the CFs was by far the best part of the course. One difficulty in self-studying a topic is that when you have a question, you may not have anyone readily available to give you a precise, reliable answer. Stack Overflow or Reddit may have some gems, but your mileage will vary. Having someone available via Slack or email to answer whatever question you may have, especially ones that are theoretical or research-oriented, vastly enhances the learning experience.

The course staff also organized various events, such as talks by the CFs on careers in machine learning and issues in machine learning ethics, and a live Q&A session with Prof. Manning. If you are in the process of switching careers, you may find the opportunity to connect with and learn from the CFs, Prof. Manning, and fellow students to be valuable.

On a more technical note, Assignments 4 and 5 require the use of GPUs. The SCPD provides 65 hours of computing credits for Microsoft Azure to enable everyone to complete the assignments. It's not required to use Azure; you can use whatever cloud provider you prefer, or go local. I was able to do the assignments on a local GPU with 6 GB of memory, and I saw some chatter on the course Slack about students successfully using Apple M1/M2 chips. But for anyone who needs it, it is nice to have the computing credits and technical support from the CFs.

In closing this section, I would recommend that anyone who is interested in XCS224n take advantage of the free lecture videos and assignments beforehand. After watching the first couple of lectures and trying out the first assignment, you can decide whether to turn back and review more fundamental ML, keep going on your own, or enroll in the course. Another consideration is that your employer may cover part or all of the tuition as part of an employee professional development program.



More on the Assignments

There are five graded assignments in the course. They are a mixture of multiple-choice quizzes, coding portions that are 100% autograded via Gradescope, and (mostly extra credit) written portions that are more math-focused. These assignments are, to varying degrees, condensed versions of those in CS224n. The topics of the assignments are:

  1. Exploring Word Embeddings
  2. Understanding and Implementing Word2Vec
  3. Neural Transition-based Dependency Parsing
  4. Neural Machine Translation with RNNs
  5. Self-attention, Transformers, and Pretraining

Assignments 1 and 2 cover various techniques for training word vectors from large text corpora. Assignment 3 deals with dependency parsing, which is the problem of assigning directional relationships to the words of a sentence that collectively capture the grammatical structure and intended meaning of that sentence. Assignment 4 involves implementing and training an LSTM-based model to translate from Cherokee to English, and Assignment 5 applies a generative Transformer model to a task involving real-world knowledge.
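As a rough illustration of the stack-based idea behind Assignment 3, here is a minimal sketch of a transition-based parser. It is not the assignment's code: the `parse` function, the transition names, and the example sentence are assumptions made for illustration, and the transitions are written out by hand, whereas in the neural version a trained classifier predicts each one from the current stack and buffer.

```python
def parse(words, transitions):
    """Apply a sequence of transitions and return (head, dependent) arcs.

    Hypothetical helper for illustration only.
    """
    stack = ["ROOT"]       # partially processed words, starting with a dummy root
    buffer = list(words)   # words not yet moved onto the stack
    arcs = []

    for transition in transitions:
        if transition == "SHIFT":
            # Move the next unread word onto the stack.
            stack.append(buffer.pop(0))
        elif transition == "LEFT-ARC":
            # The second item on the stack becomes a dependent of the top item.
            dependent = stack.pop(-2)
            arcs.append((stack[-1], dependent))
        elif transition == "RIGHT-ARC":
            # The top item on the stack becomes a dependent of the item below it.
            dependent = stack.pop()
            arcs.append((stack[-1], dependent))
        else:
            raise ValueError(f"unknown transition: {transition}")

    return arcs


sentence = ["I", "parsed", "this", "sentence"]
transitions = [
    "SHIFT", "SHIFT", "LEFT-ARC",   # parsed -> I
    "SHIFT", "SHIFT", "LEFT-ARC",   # sentence -> this
    "RIGHT-ARC",                    # parsed -> sentence
    "RIGHT-ARC",                    # ROOT -> parsed
]
arcs = parse(sentence, transitions)
for head, dependent in arcs:
    print(f"{head} -> {dependent}")
```

Each word is shifted once and removed from the stack once, so a sentence of n words is parsed in 2n transitions, which is a large part of why this family of parsers is fast.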

Overall, I was happy with the pacing and difficulty level of the assignments. The coding portions were balanced between writing data preprocessing code and implementing models and training pipelines in PyTorch according to a spec to perform the actual tasks. The written portions were thought-provoking and provided mathematical rigor to concepts that are often introduced in more handwavy ways.

Because this course skips the final project, the assignments only cover about half of the curriculum. Subsequent modules delve deeper into applications of the Transformer architecture, along with techniques for training, improving, and evaluating models in NLP.


Conclusion

I had a good experience with this course, and would recommend it to anyone looking for an efficient but thorough introduction to (or review of) modern NLP. The decision to pursue the course through the SCPD versus self-study comes down to the need for guidance from the CFs and fellow students, financial considerations, and professional considerations such as certification and networking. I think that the SCPD option gets you through the curriculum faster. Either way, this course will leave you prepared to apply existing NLP models to real-world problems, develop your own models, and follow along with the latest research.