# tfs-mt

Transformer from scratch for Machine Translation
🏠 Homepage • ▶️ Getting started • 🤗 Hugging Face • 🎬 Demo
This project implements the Transformer architecture from scratch, using Machine Translation as the use case. It is intended primarily as an educational resource, while still providing a functional implementation of the architecture and the training/inference logic.
## Getting started
### From pip

```bash
pip install tfs-mt
```
### From source

#### Prerequisites

- [uv](https://docs.astral.sh/uv/getting-started/installation/)
#### Steps

```bash
git clone https://github.com/Giovo17/tfs-mt.git
cd tfs-mt
uv sync
cp .env.example .env
# Edit the .env file with your configuration
```
## Usage

### Training
To start training the model with the default configuration:
```bash
uv run src/train.py
```
### Inference
To run inference with the trained model from the Hugging Face repo:

```bash
uv run src/inference.py
```
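The checkpoint is pulled from the Hugging Face Hub. If you want to fetch it manually, a minimal sketch with `huggingface_hub` is shown below; the repo id and filename are assumptions, so check the project's Hugging Face page for the actual values:

```python
from huggingface_hub import hf_hub_download

# Hypothetical repo id and checkpoint name -- verify them on the
# project's Hugging Face page before use.
checkpoint_path = hf_hub_download(
    repo_id="Giovo17/tfs-mt",      # assumed repo id
    filename="model.safetensors",  # assumed checkpoint filename
)
print(checkpoint_path)  # local path to the downloaded file
```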
## Configuration

All project parameters can be configured in `src/tfs_mt/configs/config.yml`. Key configuration groups include the following (a hypothetical sketch of the file layout follows the list):
- **Model architecture**: model config, dropout, GloVe embedding initialization, ...
- **Training**: optimizer, learning-rate scheduler, number of epochs, ...
- **Data**: dataset, dataloader, tokenizer, ...
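The exact schema lives in `src/tfs_mt/configs/config.yml`; the snippet below is only a hypothetical sketch of how such a file is commonly laid out, and every key and value in it is an assumption, not the project's actual schema:

```yaml
# Hypothetical layout -- all keys below are assumptions for illustration.
model:
  size: base          # nano | small | base | original
  dropout: 0.1
  glove_init: true    # initialize embeddings from pretrained GloVe vectors
training:
  optimizer: adam
  lr_scheduler: warmup
  epochs: 20
data:
  dataset: wmt14      # assumed dataset name
  batch_size: 32
  tokenizer: bpe      # assumed tokenizer type
```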
## Architecture
For a detailed explanation of the architecture and design choices, please refer to the Architecture Documentation.
### Model sizes
The project supports various model configurations to suit different computational resources:
| Parameter | Nano | Small | Base | Original |
|---|---|---|---|---|
| Encoder Layers | 4 | 6 | 8 | 6 |
| Decoder Layers | 4 | 6 | 8 | 6 |
| d_model | 50 | 100 | 300 | 512 |
| Num Heads | 4 | 6 | 8 | 8 |
| d_ff | 200 | 400 | 800 | 2048 |
| Norm Type | PostNorm | PostNorm | PostNorm | PostNorm |
| Dropout | 0.1 | 0.1 | 0.1 | 0.1 |
| GloVe Dim | 50d | 100d | 300d | - |
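Note that for Nano, Small, and Base, `d_model` matches the GloVe dimension, which is what allows the embeddings to be initialized directly from pretrained GloVe vectors. The Original column reproduces the hyperparameters of the base model from Vaswani et al. (2017). As a point of reference, the sketch below instantiates PyTorch's stock `torch.nn.Transformer` with that Original configuration; it is an illustration with the built-in module, not this project's from-scratch implementation:

```python
import torch
import torch.nn as nn

# The "Original" column expressed with PyTorch's built-in nn.Transformer
# (illustration only -- not this project's own modules).
model = nn.Transformer(
    d_model=512,            # embedding / hidden size
    nhead=8,                # attention heads
    num_encoder_layers=6,
    num_decoder_layers=6,
    dim_feedforward=2048,   # d_ff
    dropout=0.1,
    norm_first=False,       # False = PostNorm, matching the table
)

src = torch.rand(10, 2, 512)  # (src_len, batch, d_model)
tgt = torch.rand(7, 2, 512)   # (tgt_len, batch, d_model)
out = model(src, tgt)         # -> (7, 2, 512)
print(out.shape)
```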
## Citation
If you use tfs-mt in your research or project, please cite:
```bibtex
@software{Spadaro_tfs-mt,
  author = {Spadaro, Giovanni},
  license = {MIT},
  title = {{tfs-mt}},
  url = {https://github.com/Giovo17/tfs-mt}
}
```