Multi-modal Deep Learning for Complex Document Understanding with Doug Burdick - #541
Today we’re joined by Doug Burdick, a principal research staff member at IBM Research. In a recent interview, Doug’s colleague Yunyao Li joined us to talk through some of the broader enterprise NLP problems she’s working on. One of those problems is making documents machine-consumable, especially the PDF, a file type traditionally used for archiving. That’s where Doug and his team come in. In our conversation, we discuss the multimodal approach they’ve taken to identify, interpret, contextualize, and extract things like tables from a document, the challenges they’ve faced when dealing with tables, and how they evaluate the performance of models on tables. We also explore how he’s handled generalizing across different formats, how much fine-tuning is needed for models to be effective, the problems that appear on the NLP side of things, and how deep learning models are being leveraged within the group.