
Material for my VLDB'22 and BTW'23 tutorials on applications of language models in data management


itrummer/lm4db


<iframe width="560" height="315" src="https://www.youtube.com/embed/tYKH7Q5MDcg" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe>

The introduction of Transformer-based language models has led to astonishing advances in natural language processing over the past few years. Not only do such models dominate a variety of standard benchmarks; the latest generation of language models can also be specialized to novel, previously unseen tasks with little to virtually no training data.

In this tutorial, I discuss the two key ideas enabling ultra-large language models: a new neural network architecture, the Transformer, and an unsupervised training process, based on the idea of transfer learning. After discussing the theoretical concepts behind language models, I demonstrate GPT-3 and other models and provide pointers on how to get access to this technology. Finally, I discuss novel use cases in data management that are enabled by language models, covering recent research and open problems.

Slides of the VLDB'22 tutorial (90 minutes) are here.

Slides of the BTW'23 tutorial (180 minutes) are here.

Slides of the ICDE'24 tutorial (90 minutes) are here.

Please use the following citation to refer to this tutorial:

@article{Trummer2022e,
author = {Trummer, Immanuel},
doi = {10.14778/3554821.3554896},
journal = {PVLDB},
number = {12},
pages = {3770 -- 3773},
title = {From BERT to GPT-3 Codex: Harnessing the Potential of Very Large Language Models for Data Management},
volume = {15},
year = {2022}
}
