TSX sector rotation signals
What it does: Uses an LSTM trained on eight years of TSX sector ETF data to flag rotation windows between defensive and cyclical holdings. The model works on weekly rebalancing logic rather than daily noise.
These are real projects built by students in the programme — not exercises, not demos. Each one started with a dataset, a question, and about twelve hours of confusion before something clicked.
What it does: Processes S&P 500 earnings call transcripts using a fine-tuned BERT variant and correlates language tone shifts with next-session price movement. Built on publicly available SEC filings.
What it does: Classifies market conditions into four volatility regimes using a random forest trained on VIX derivatives, put/call ratios, and historical realized vol windows. Outputs a regime label, not a price target.
What it does: Uses a GRU to estimate how long a morning gap or momentum push typically lasts before mean reversion dominates. Validated on three years of 15-minute bar data.
What it does: Combines satellite foot traffic estimates, job posting counts, and shipping data into a structured feature set fed to an XGBoost regression model for retail sector earnings estimation.
What it does: Uses a lightweight transformer architecture to detect when historical correlation relationships between asset classes break down — a leading signal for portfolio stress, not alpha generation.
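To make the correlation-breakdown idea concrete, here is a minimal baseline sketch. It is not the student's transformer: it swaps in a plain rolling-correlation comparison, flagging points where a short-window correlation drifts far from a long-window one. The function names, the window lengths (20 and 120), the 0.5 threshold, and the synthetic series are all assumptions chosen for illustration.

```python
import math
import random
from typing import List, Optional


def pearson(x: List[float], y: List[float]) -> float:
    """Pearson correlation of two equal-length samples (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)


def rolling_corr(x: List[float], y: List[float], window: int) -> List[Optional[float]]:
    """Trailing-window correlation; None until a full window is available."""
    out: List[Optional[float]] = [None] * len(x)
    for end in range(window, len(x) + 1):
        out[end - 1] = pearson(x[end - window:end], y[end - window:end])
    return out


def breakdown_flags(
    x: List[float],
    y: List[float],
    short: int = 20,     # assumed "recent" window
    long: int = 120,     # assumed "historical" window
    threshold: float = 0.5,  # assumed gap that counts as a breakdown
) -> List[bool]:
    """Flag points where recent correlation departs from the historical one."""
    short_c = rolling_corr(x, y, short)
    long_c = rolling_corr(x, y, long)
    return [
        s is not None and l is not None and abs(s - l) > threshold
        for s, l in zip(short_c, long_c)
    ]


# Synthetic returns: y tracks x for 200 steps, then the relationship flips sign.
random.seed(0)
x = [random.gauss(0, 1) for _ in range(260)]
y = [
    (xi if i < 200 else -xi) + random.gauss(0, 0.1)
    for i, xi in enumerate(x)
]

flags = breakdown_flags(x, y)
first_flag = flags.index(True) if any(flags) else None
print("first flagged index:", first_flag)
```

Only trailing windows are used, so each flag depends on data available at that point in time. With pandas, `Series.rolling(window).corr(other)` would replace the hand-written helpers; the standard-library version is used here only to keep the sketch dependency-free.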
Honest accounts. These are notes shared by participants after presenting their final work — they reflect what the process actually felt like, not what it looked like from the outside.
Completion takes longer than expected. Most students underestimate the data cleaning phase, which alone tends to consume about a third of the total build time.
I spent the first four weeks convinced my feature selection was wrong. It wasn't — the model just needed a proper validation window. Getting that part right changed everything about how I think about backtesting.
The transcripts were messier than any tutorial data I'd worked with before. Cleaning them taught me more about real NLP pipelines than the architecture decisions did — and that surprised me.
My first transformer attempt overfit badly on 2008 data. Rebuilding it with rolling windows instead of fixed splits was a painful lesson, but the final model actually generalises to out-of-sample periods.
Aggregate figures across all student projects submitted for final review — reported as-is, without rounding or selective framing.
This is what a real project environment looks like — messy, annotated, and built iteratively over the course of about nine weeks.
Yahoo Finance API combined with FRED macroeconomic series — fully reproducible, no paid data feeds required.
Python, pandas, scikit-learn, PyTorch — standard tools, no proprietary libraries. Everything runs locally.
Walk-forward cross-validation with a minimum 90-day out-of-sample holdout period — no look-ahead leakage.
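As a rough illustration of the walk-forward scheme described above, the sketch below generates expanding-window folds in which every test block sits strictly after its training data. It assumes one observation per day, so a 90-observation test block stands in for the 90-day holdout. The function name, the `min_train` and `gap` parameters, and the example sizes are hypothetical, not taken from the projects.

```python
from typing import Iterator, List, Tuple


def walk_forward_splits(
    n_obs: int,
    min_train: int,
    test_size: int = 90,  # assumed: one row per day, so 90 rows ~ 90 days
    gap: int = 0,         # rows dropped between train and test
) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train_idx, test_idx) pairs in time order.

    Each fold trains on everything before the test block (expanding window)
    and tests on the next `test_size` observations. A non-zero `gap` helps
    when labels are built from several future steps.
    """
    if min_train <= 0 or test_size <= 0 or gap < 0:
        raise ValueError("min_train and test_size must be positive, gap non-negative")
    train_end = min_train
    while train_end + gap + test_size <= n_obs:
        test_start = train_end + gap
        train_idx = list(range(0, train_end))
        test_idx = list(range(test_start, test_start + test_size))
        yield train_idx, test_idx
        train_end += test_size  # roll forward by one test block


folds = list(walk_forward_splits(n_obs=500, min_train=200, test_size=90, gap=5))
for train_idx, test_idx in folds:
    # Training data always ends before the test block starts.
    assert train_idx[-1] < test_idx[0]
    print(f"train 0..{train_idx[-1]}  test {test_idx[0]}..{test_idx[-1]}")
```

The index ordering only rules out leakage from the split itself. Features computed with centred windows, or scalers fitted on the full series, can still leak future information and need to be fitted inside each fold. scikit-learn's `TimeSeriesSplit` offers similar behaviour through its `test_size` and `gap` parameters.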