Minimal examples of HuggingFace LLM training

I’m sharing a minimal example of training an LLM using HuggingFace’s libraries (trl/transformers/evaluate/datasets/etc.). The example is mainly borrowed from https://wandb.ai/capecape/alpaca_ft/reports/How-to-Fine-tune-an-LLM-Part-3-The-HuggingFace-Trainer--Vmlldzo1OTEyNjMy and its GitHub repo https://github.com/tcapelle/llm_recipes/blob/main/scripts/train_hf.py. Here is the full file: Now let’s examine the code in more detail: First, we initialize a Weights & Biases project (wandb.init(…)), which is used for logging intermediate training/evaluation …
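
Below is a minimal sketch, not the exact train_hf.py script, of the ingredients the post walks through: a wandb.init(…) call for logging, a tokenized dataset, and HuggingFace’s Trainer driving the fine-tuning loop. The model name ("gpt2"), the project name, and the tatsu-lab/alpaca dataset slice are placeholder assumptions chosen for illustration.

import wandb
from datasets import load_dataset
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

# Start a W&B run; metrics logged by the Trainer are streamed here.
wandb.init(project="llm-finetune-demo")  # hypothetical project name

model_name = "gpt2"  # placeholder; the post fine-tunes a larger LLM
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token  # GPT-2 has no pad token by default
model = AutoModelForCausalLM.from_pretrained(model_name)

# Any small instruction-tuning dataset works for the sketch.
dataset = load_dataset("tatsu-lab/alpaca", split="train[:1000]")

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=512)

tokenized = dataset.map(tokenize, batched=True, remove_columns=dataset.column_names)

args = TrainingArguments(
    output_dir="out",
    per_device_train_batch_size=2,
    num_train_epochs=1,
    logging_steps=10,
    report_to="wandb",  # send train/eval metrics to the run opened above
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=tokenized,
    # Causal LM collator: pads batches and copies input_ids into labels.
    data_collator=DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm=False),
)
trainer.train()
wandb.finish()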

Dollar cost average on TQQQ vs QQQ [Real Data]

(Please cross-reference my previous post for simulation-based results: https://czxttkl.com/2023/01/15/dollar-cost-average-on-tqqq-vs-qqq/) In this post, we use real data (from April 2021 to January 2024) to show that even after a bear market (in 2022), DCA on TQQQ is still more profitable than DCA on QQQ. UPRO is also more profitable than SPY, but the margin is not that …

Dollar cost average on TQQQ vs QQQ [Simulation]

This post runs a simple simulation of using the Dollar-Cost-Average (DCA) strategy to invest in QQQ vs. TQQQ, its 3x-leveraged ETF. In the simulation, QQQ plunges 34% over 20 rounds. One round is a small up-and-down cycle: the index first moves up 1% and then down 3%, repeating until it is 34% below the top. After reaching …
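
A rough sketch of the drawdown leg described above, assuming a $100 starting price and a fixed $100 contribution per step: QQQ alternates +1%/-3% each round, TQQQ moves 3x each step’s return, and DCA buys a fixed dollar amount at every step. The recovery leg and the post’s exact parameters are omitted here.

def simulate_dca(step_returns, contribution=100.0):
    shares, price = 0.0, 100.0
    for r in step_returns:
        price *= 1.0 + r
        shares += contribution / price   # buy a fixed dollar amount each step
    return shares * price                # ending market value of the position

qqq_returns = []
for _ in range(20):                      # 20 up-and-down rounds ~= 34% drawdown
    qqq_returns += [0.01, -0.03]

tqqq_returns = [3 * r for r in qqq_returns]   # 3x-leveraged per-step moves

print("QQQ  DCA value:", round(simulate_dca(qqq_returns), 2))
print("TQQQ DCA value:", round(simulate_dca(tqqq_returns), 2))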

PyTorch Lightning template

Back in the old days, I studied how to implement highly efficient PyTorch pipelines for multi-GPU training [1]. DistributedDataParallel is the way to go, but it is cumbersome because we need boilerplate for spawning workers and constructing data readers. Now, PyTorch Lightning offers a clean API for setting up multi-GPU training easily. Here is a template …
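
As a rough sketch (not the post’s actual template), the shape of such a Lightning setup looks like this; the toy linear model, random data, and the two-GPU assumption are placeholders.

import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset
import pytorch_lightning as pl

class LitModel(pl.LightningModule):
    def __init__(self):
        super().__init__()
        self.net = nn.Linear(32, 1)

    def training_step(self, batch, batch_idx):
        x, y = batch
        loss = nn.functional.mse_loss(self.net(x), y)
        self.log("train_loss", loss)
        return loss

    def configure_optimizers(self):
        return torch.optim.Adam(self.parameters(), lr=1e-3)

if __name__ == "__main__":
    data = TensorDataset(torch.randn(1024, 32), torch.randn(1024, 1))
    loader = DataLoader(data, batch_size=64, num_workers=2)
    # strategy="ddp" replaces the hand-written worker-spawning boilerplate;
    # assumes a machine with 2 GPUs.
    trainer = pl.Trainer(max_epochs=1, accelerator="gpu", devices=2, strategy="ddp")
    trainer.fit(LitModel(), loader)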

Analyze DistributedDataParallel (DDP)’s behavior

DistributedDataParallel implements data parallelism at the module level and can run across different machines. One process runs on each device, holding one copy of the module. Each process loads its own data, which does not overlap with other processes’ data. At the initialization phase, all copies are synchronized to ensure they start from …
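
A bare-bones sketch of that setup (all names and sizes are illustrative, and the gloo backend plus a toy linear model are used so it also runs on CPU-only machines): each spawned process holds its own model replica, reads a non-overlapping shard of the data via DistributedSampler, and gradients are averaged across processes during backward.

import os
import torch
import torch.distributed as dist
import torch.multiprocessing as mp
from torch.nn.parallel import DistributedDataParallel as DDP
from torch.utils.data import DataLoader, TensorDataset, DistributedSampler

def worker(rank, world_size):
    os.environ["MASTER_ADDR"] = "127.0.0.1"
    os.environ["MASTER_PORT"] = "29500"
    dist.init_process_group("gloo", rank=rank, world_size=world_size)

    model = DDP(torch.nn.Linear(16, 1))          # replicas synced at construction
    opt = torch.optim.SGD(model.parameters(), lr=0.1)

    data = TensorDataset(torch.randn(256, 16), torch.randn(256, 1))
    sampler = DistributedSampler(data, num_replicas=world_size, rank=rank)
    loader = DataLoader(data, batch_size=32, sampler=sampler)  # non-overlapping shard

    for x, y in loader:
        opt.zero_grad()
        loss = torch.nn.functional.mse_loss(model(x), y)
        loss.backward()                          # gradients all-reduced across ranks
        opt.step()

    dist.destroy_process_group()

if __name__ == "__main__":
    mp.spawn(worker, args=(2,), nprocs=2)        # one process per "device"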

Test with torch.multiprocessing and DataLoader

As we know, PyTorch’s DataLoader is a great tool for speeding up data loading. Through my experience with DataLoader, I consolidated my understanding of Python multiprocessing. Here is a didactic code snippet:

from torch.utils.data import DataLoader, Dataset
import torch
import time
import datetime
import torch.multiprocessing as mp

num_batches = 110
print("File init")

class DataClass: …
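
Since the DataClass body is truncated above, here is a hedged stand-in (not the post’s actual class) that illustrates the same idea: a Dataset whose __getitem__ sleeps to mimic slow loading, read once with num_workers=0 and once with several worker processes so the speedup from multiprocessing is visible.

import time
import torch
from torch.utils.data import DataLoader, Dataset

class SlowDataset(Dataset):
    def __len__(self):
        return 110                      # arbitrary size for the sketch

    def __getitem__(self, idx):
        time.sleep(0.01)                # pretend each sample is slow to load
        return torch.tensor([idx], dtype=torch.float32)

if __name__ == "__main__":
    for workers in (0, 4):
        loader = DataLoader(SlowDataset(), batch_size=10, num_workers=workers)
        start = time.time()
        for _ in loader:                # iterate all batches, discarding them
            pass
        print(f"num_workers={workers}: {time.time() - start:.2f}s")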