Code and slides to accompany the online series of webinars: https://data4sci.com/nlp-with-pytorch by Data For Science.
Natural language lies at the heart of current developments in Artificial Intelligence, User Interaction, and Information Processing. The combination of unprecedented corpora of written text provided by social media and widely available computational power has led to increased interest in modern NLP tools based on state-of-the-art Deep Learning techniques.
In this course, participants are introduced to the fundamental concepts and algorithms used for Natural Language Processing (NLP) through an in-depth exploration of different examples built using the PyTorch framework for deep learning. Applications to real datasets will be explored in detail.
- One-Hot Encoding
- TF/IDF and Stemming
- Stopwords
- N-grams
- Working with Word Embeddings
- PyTorch review
- Activation Functions
- Loss Functions
- Training procedures
- Network Architectures
- Feed Forward Networks
- Convolutional Neural Networks
- Applications
- Motivations
- Skip-gram and Continuous Bag of Words
- Transfer Learning
- Recurrent Neural Networks
- Gated Recurrent Unit
- Long Short-Term Memory
- Encoder-Decoder Models
- Text Generation
Web: www.data4sci.com
- Python 3.10 or higher (up to 3.13)
- macOS, Linux, or Windows
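To confirm that the active interpreter falls within the supported range before installing, a quick check along these lines may help. This is a minimal sketch: the helper name `python_version_supported` and the two constants are illustrative and not part of the repository.

```python
import sys

# Supported range taken from the requirements above (inclusive on both ends).
MIN_VERSION = (3, 10)
MAX_VERSION = (3, 13)


def python_version_supported(version=None) -> bool:
    """Return True if the (major, minor) version lies in the supported range.

    `version` is a hypothetical convenience parameter for testing; when it is
    omitted, the running interpreter's version is used.
    """
    major_minor = tuple((version or sys.version_info)[:2])
    return MIN_VERSION <= major_minor <= MAX_VERSION


if __name__ == "__main__":
    current = ".".join(str(part) for part in sys.version_info[:3])
    status = "supported" if python_version_supported() else "not supported"
    print(f"Python {current} is {status}")
```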
uv is a fast Python package installer and resolver. If you don't have it installed:
# Install uv
curl -LsSf https://astral.sh/uv/install.sh | sh
Then clone and set up the repository:
# Clone the repository
git clone https://github.com/DataForScience/AdvancedNLP.git
cd AdvancedNLP
# Install dependencies
uv sync
# Run Jupyter
uv run jupyter notebook
Alternatively, install with pip and a virtual environment:
# Clone the repository
git clone https://github.com/DataForScience/AdvancedNLP.git
cd AdvancedNLP
# Create virtual environment
python -m venv .venv
# Activate virtual environment
# On macOS/Linux:
source .venv/bin/activate
# On Windows:
.venv\Scripts\activate
# Install dependencies
pip install -e .
# Run Jupyter
jupyter notebook
This project supports hardware acceleration for faster training:
- Apple Silicon (M1/M2/M3): Automatically uses MPS (Metal Performance Shaders) backend
- NVIDIA GPUs: Automatically uses CUDA if available
- CPU: Falls back to CPU if no GPU is available
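The fallback order above can be sketched with PyTorch's standard availability checks. The notebooks' own selection code is not shown here, so treat this as an assumption about how such logic typically looks; the helper name `pick_device_name` is hypothetical.

```python
def pick_device_name() -> str:
    """Return "cuda", "mps", or "cpu", following the fallback order above."""
    try:
        import torch
    except ImportError:
        # Assumption: without PyTorch installed, only the CPU can be reported.
        return "cpu"

    # NVIDIA GPUs: use CUDA when a usable device is present.
    if torch.cuda.is_available():
        return "cuda"

    # Apple Silicon: use the MPS backend when this build and machine support it.
    # getattr guards against older PyTorch builds that lack torch.backends.mps.
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_available():
        return "mps"

    # No accelerator found: fall back to the CPU.
    return "cpu"


device_name = pick_device_name()
print(f"Using device: {device_name}")
```

With PyTorch installed, the returned name can be passed to `torch.device(device_name)`, and models or tensors moved onto it with `.to(device)`.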
Once Jupyter is running, open any of the numbered notebooks:
- Foundations of NLP.ipynb
- Neural Networks with PyTorch.ipynb
- Text Classification.ipynb
- Word Embeddings.ipynb
- Sequence Modeling.ipynb
