Status: Ongoing Research & Implementation
Repo Stack: PyTorch, Hugging Face, LangChain, FAISS

Motivation

Modern NLP relies heavily on abstraction layers. To master the fundamentals of Large Language Models (LLMs), I implemented core deep learning architectures from scratch in PyTorch. This project serves as a testbed for benchmarking attention mechanisms, sequence modeling techniques, and retrieval strategies.


1. Sequence Modeling (RNN & LSTM)

Implemented RNN and LSTM architectures from scratch to analyze the vanishing gradient problem and long-term dependency retention.
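
The repo's exact module layout isn't shown here; as an illustrative sketch, a from-scratch LSTM cell in PyTorch (no nn.LSTM) might look like the following, with the class name and tensor sizes as placeholders:

```python
import torch
import torch.nn as nn

class LSTMCellScratch(nn.Module):
    """A single LSTM cell built from plain linear layers."""

    def __init__(self, input_size: int, hidden_size: int):
        super().__init__()
        # One linear map produces all four gate pre-activations at once.
        self.gates = nn.Linear(input_size + hidden_size, 4 * hidden_size)

    def forward(self, x, state):
        h_prev, c_prev = state
        z = self.gates(torch.cat([x, h_prev], dim=-1))
        i, f, g, o = z.chunk(4, dim=-1)
        i, f, o = torch.sigmoid(i), torch.sigmoid(f), torch.sigmoid(o)
        g = torch.tanh(g)
        # The additive cell-state update is what mitigates vanishing gradients.
        c = f * c_prev + i * g
        h = o * torch.tanh(c)
        return h, c


# Usage: unroll the cell over a toy sequence of shape (batch, time, features).
cell = LSTMCellScratch(input_size=8, hidden_size=16)
x = torch.randn(4, 10, 8)
h, c = torch.zeros(4, 16), torch.zeros(4, 16)
for t in range(x.size(1)):
    h, c = cell(x[:, t], (h, c))
```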

2. The Transformer & Attention

Replicated the “Attention Is All You Need” architecture without using nn.Transformer.
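
Without reproducing the repo's code, a sketch of the paper's core building block, scaled dot-product attention wrapped in multi-head projections, could look like this; names and shapes are illustrative:

```python
import math
import torch
import torch.nn as nn

def scaled_dot_product_attention(q, k, v, mask=None):
    # q, k, v: (batch, heads, seq, d_k) -> attention-weighted values.
    scores = q @ k.transpose(-2, -1) / math.sqrt(q.size(-1))
    if mask is not None:
        scores = scores.masked_fill(mask == 0, float("-inf"))
    return torch.softmax(scores, dim=-1) @ v

class MultiHeadAttention(nn.Module):
    def __init__(self, d_model: int, num_heads: int):
        super().__init__()
        assert d_model % num_heads == 0
        self.d_k = d_model // num_heads
        self.num_heads = num_heads
        self.q_proj = nn.Linear(d_model, d_model)
        self.k_proj = nn.Linear(d_model, d_model)
        self.v_proj = nn.Linear(d_model, d_model)
        self.out_proj = nn.Linear(d_model, d_model)

    def _split(self, x):
        # (batch, seq, d_model) -> (batch, heads, seq, d_k)
        b, t, _ = x.shape
        return x.view(b, t, self.num_heads, self.d_k).transpose(1, 2)

    def forward(self, query, key, value, mask=None):
        q = self._split(self.q_proj(query))
        k = self._split(self.k_proj(key))
        v = self._split(self.v_proj(value))
        attn = scaled_dot_product_attention(q, k, v, mask)
        b, _, t, _ = attn.shape
        # Merge heads back and apply the output projection.
        return self.out_proj(attn.transpose(1, 2).contiguous().view(b, t, -1))
```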

3. BERT & Transfer Learning
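
No write-up is included for this section yet. Purely as an illustration of the transfer-learning setup the heading implies, fine-tuning a pretrained BERT checkpoint with Hugging Face transformers might look like the sketch below; the checkpoint, task, labels, and hyperparameters are all assumptions:

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Load a pretrained BERT checkpoint and attach a fresh classification head.
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=2)

# Toy batch; in practice this would come from the task-specific dataset.
batch = tokenizer(["great movie", "terrible plot"], padding=True, return_tensors="pt")
labels = torch.tensor([1, 0])

# One fine-tuning step: the model returns the loss when labels are provided.
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)
outputs = model(**batch, labels=labels)
outputs.loss.backward()
optimizer.step()
```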

4. Retrieval-Augmented Generation (RAG)

Built a modular RAG pipeline with LangChain and FAISS to ground LLM responses in external data.
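
The pipeline itself uses LangChain (per the stack above); as a minimal sketch of the underlying retrieve-then-generate flow using FAISS directly, with the embedding model and documents as placeholder assumptions:

```python
import faiss
import numpy as np
from sentence_transformers import SentenceTransformer

# 1. Embed the external documents and index them with FAISS.
encoder = SentenceTransformer("all-MiniLM-L6-v2")  # embedding model is an assumption
docs = ["Paris is the capital of France.", "The Transformer was introduced in 2017."]
doc_vecs = encoder.encode(docs, normalize_embeddings=True)
index = faiss.IndexFlatIP(doc_vecs.shape[1])       # inner product == cosine on normalized vectors
index.add(np.asarray(doc_vecs, dtype="float32"))

# 2. Retrieve the chunks most relevant to the user query.
query = "When was the Transformer introduced?"
q_vec = encoder.encode([query], normalize_embeddings=True)
_, ids = index.search(np.asarray(q_vec, dtype="float32"), 1)
context = "\n".join(docs[i] for i in ids[0])

# 3. Ground the generator: the retrieved context is prepended to the prompt
#    before it is sent to the LLM (generation step omitted here).
prompt = f"Answer using only this context:\n{context}\n\nQuestion: {query}"
print(prompt)
```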