Spring 2013
|
|
IBM Watson is ushering in a new era of computing: cognitive systems. Able to
process natural language, generate hypotheses, algorithmically test possible
responses, navigate Big Data, deliver evidence-based insights, and learn through
iterations and outcomes, this new class of computing behaves more like
the world’s most sophisticated computer, the human brain. Since
its historic debut on Jeopardy!, IBM Watson has been
put to work in healthcare and financial services, helping to transform these
industries.
This course covers background concepts for working with IBM Watson and semantic
technology in general, highlighting the current research problems and
describing existing solutions. Students will gain an appreciation for what it’s
like to work in one of the most advanced software research environments in the
world. They will have the opportunity to learn from the developers of IBM
Watson and to contribute their own insights, ideas and solutions to problems.
This is a seminar-style class taught directly by the
developers of IBM Watson. Guest speakers from the IBM Watson team will
present their research areas.
Students will be required to carry out a research project in an area of interest to the Watson
Technology team, contributing to the advancement of the state of the art in the
field.
|
Selected papers from IBM Journal of Research and Development, Volume 56, Issue 3.4, “This is Watson”. |
Students will design
and carry out a research project. A list of possible projects will be provided
by the professor, but students may also propose projects of their own, provided
they are approved by the professor. Throughout the course, students will submit
incremental versions of their project. There will be no midterms or finals.
Research projects will
be assigned in the areas of Natural Language Processing, Machine Learning and Information
Retrieval, with particular focus on developing components and techniques that
could benefit the IBM Watson technology. Each project will involve
describing the state of the art in the selected task,
identifying an innovative solution to the given problem, coding UIMA-based
text analytics to implement the proposed solution, and evaluating the technique
on benchmark tasks.
All students are
required to have a Computer Science Account for this class. To
sign up for one, go to the CRF website and then click on "Apply for an Account".
| Date | Topic | Speaker | Reading (* means optional) |
| Jan 25th | Introduction: The JEOPARDY! Challenge | Alfio Gliozzo | 1. Special Questions and techniques * 2. Simulation, learning, and optimization techniques in Watson's game strategies * 3. In the game: The interface between Watson and Jeopardy! * |
| Feb 1st | The Deep QA architecture | Alfio Gliozzo | |
| Feb 8th | The Deep QA architecture; Natural Language Processing Background | Alfio Gliozzo | 1. Finding needles in the haystack: Search and candidate generation |
| Feb 15th | Natural Language Processing in Watson | Alfio Gliozzo | |
| Feb 22nd | Knowledge representation Background; Structured Knowledge in Watson (basic); Semantic Web | Alfio Gliozzo | 1. Typing candidate answers using type coercion 2. Structured data and inference in DeepQA |
| Mar 1st | Domain Adaptation | Alfio Gliozzo | |
| Mar 8th | UIMA | Siddharth Patwardhan | 2. UIMA tutorials and users guides * 3. UIMA tools * 4. UIMA references * 5. UIMA async scaleout * |
| Mar 15th | UIMA (hands on) | Siddharth Patwardhan | Recommended: bring a laptop to class. Make sure Java and Eclipse are installed. (If you have never used Eclipse, go over an online tutorial such as this one.) |
| Mar 22nd | SPRING BREAK | | |
| Mar 29th | Midterm Student Workshop | | |
| Apr 5th | Distributional Semantics | Alfio Gliozzo | 1. From Distributional to Contextual Similarity 2. Using Distributional Similarity for Lexical Expansion in Knowledge-based Word Sense Disambiguation 3. Semantic Domains in Computational Linguistics 4. http://www.machinelinking.com/ * 5. www.jobimtext.org * |
| Apr 12th | Distributional Semantics I | Alfio Gliozzo | |
| Apr 19th | Machine Learning and Strategy in Watson | David Gondek | |
| Apr 26th | Advanced Semantic Analysis, Sources Linked Data and Text, Tycor, Answer Lookup, Evidence Diffusion, Semantic Technologies (vision) Crowdsourcing, Information Extraction | Chris Welty | |
| May 17th | Final Student Workshop | | |
Slides from the classes are available on CourseWorks (if you are
auditing the class, contact Or to get access).
Deep QA publications website
Students must have
taken one of Artificial Intelligence, Natural Language Processing, Machine
Learning or Search Engine Technology as a prerequisite.
Coming soon
|
Alfio Gliozzo is a research staff member at the IBM T.J. Watson Research Center. He is currently a technical leader on the DeepQA team, coordinating a research team focused on unsupervised learning from text. He is also a key contributor to the Watson core technology for domain adaptation. He has been involved in both academic research and industry for 12 years, building a significant track record in delivering semantic technologies across different applications, patents and scientific publications. |