Simulate March Madness with a Monte Carlo Bracket (Python)

Filling out one bracket tells you almost nothing. A single-elimination tournament is high-variance: even a heavy favorite has to survive six straight games. So the honest question isn’t "who wins?" but "how often does each team win, across thousands of possible tournaments?" That’s a Monte Carlo simulation, and you can build a working one in a few dozen lines of Python. Full code: scripts/monte-carlo-bracket-simulator-python.py.

The idea: ratings in, probabilities out

Everything starts with a single-game win probability. We’ll use the log5 / Elo-style formula, the same logistic-on-the-rating-difference idea behind our logistic-regression predictor and the basketball Elo engine. Given two team ratings, the probability team A beats team B is:

p = 1 / (1 + 10 ** (-(rating_a - rating_b) / scale))

The scale controls how much a rating gap matters: a smaller scale makes the favorite more certain, a larger one flattens things toward a coin flip. With Elo-style numbers, a scale around 400 is the classic choice (a 400-point edge is about a 10-to-1 favorite). The crucial point: the ratings are an input you supply. Pull them from KenPom, Bart Torvik, or the NET — whatever you trust — and the simulator turns them into bracket odds.
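To make the effect of scale concrete, here is a small sketch that evaluates the formula above for a few rating gaps. The helper name win_prob, the baseline rating of 1500, and the alternative scales (200 and 800) are assumptions chosen for illustration, not values from any real rating system.

```python
def win_prob(rating_a, rating_b, scale=400.0):
    """Logistic win probability, same formula as above (hypothetical helper name)."""
    return 1.0 / (1.0 + 10.0 ** (-(rating_a - rating_b) / scale))

# Illustrative gaps and scales only (assumed values).
for scale in (200.0, 400.0, 800.0):
    probs = [win_prob(1500 + gap, 1500, scale) for gap in (0, 100, 400)]
    print(f"scale {scale:5.0f}: " + "  ".join(f"{p:.3f}" for p in probs))
```

At scale 400 the 400-point gap comes out at 10/11, which is the 10-to-1 favorite mentioned above. Halving the scale pushes the same gap closer to certainty, and doubling it pulls the gap back toward a coin flip.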

The win-probability function

import random
import numpy as np

def game_prob(rating_a, rating_b, scale=400.0):
    """Probability team A beats team B (log5 / Elo logistic)."""
    return 1.0 / (1.0 + 10.0 ** (-(rating_a - rating_b) / scale))

One function, the whole model. Everything else is just bracket bookkeeping.

To simulate a single game we draw a random number and compare it to the probability — if random.random() < game_prob(...), team A advances; otherwise team B does. That one stochastic step, repeated, is the entire engine.
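As a sanity check on that single stochastic step, the sketch below replays one matchup many times and compares the simulated win rate with the model probability. The helper play_game, the two ratings, and the run count are assumptions for illustration. It uses a dedicated random.Random instance so the run is repeatable without touching the global generator.

```python
import random

def game_prob(rating_a, rating_b, scale=400.0):
    """Probability team A beats team B (same formula as above)."""
    return 1.0 / (1.0 + 10.0 ** (-(rating_a - rating_b) / scale))

def play_game(rating_a, rating_b, rng):
    """Return True if team A wins one simulated game (hypothetical helper)."""
    return rng.random() < game_prob(rating_a, rating_b)

rng = random.Random(42)      # fixed seed so the run is repeatable
p = game_prob(1600, 1500)    # made-up ratings, 100 points apart
n = 20000
wins = sum(play_game(1600, 1500, rng) for _ in range(n))
print(f"model probability {p:.3f}, simulated frequency {wins / n:.3f}")
```

The two numbers should agree to within about a percentage point. If they do not, the comparison in the game step is probably reversed.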

A toy field (clearly made up)

You need real ratings to get real answers, so the numbers below are a toy example with invented teams — eight fictional schools so the code runs out of the box. Do not read anything into them; swap in a real 64-team field with real ratings when you run it for keeps.

# Toy 8-team bracket: (name, rating). MADE UP for illustration only.
field = [
    ("Riverside Otters", 1720),
    ("Granite State",     1635),
    ("Cedar Valley",      1610),
    ("Lakeshore Tech",    1560),
    ("Fort Banner",       1545),
    ("Pine Hollow",       1500),
    ("Bay City",          1480),
    ("Sandstorm A&M",     1450),
]

Eight fictional teams. The real version is a 64-entry list seeded into bracket order.

Simulate one bracket, round by round

A bracket is just a list in bracket order: adjacent entries meet in the first round. To play a round, pair adjacent teams (0 vs 1, 2 vs 3, …), simulate each game, and keep the winners, halving the list each time until one team remains. This works for any power-of-two field: 8, 16, 32, or the real 64.

def play_round(teams):
    """Take a list of (name, rating); return the winners, in order."""
    winners = []
    for i in range(0, len(teams), 2):
        a, b = teams[i], teams[i + 1]
        if random.random() < game_prob(a[1], b[1]):
            winners.append(a)
        else:
            winners.append(b)
    return winners

def simulate_bracket(field):
    """Play down to a champion. Return (champion, final_four_names)."""
    teams = list(field)
    final_four = []                  # stays empty if the field has fewer than 4 teams
    while len(teams) > 1:
        if len(teams) == 4:          # snapshot the Final Four
            final_four = [t[0] for t in teams]
        teams = play_round(teams)
    return teams[0][0], final_four

For a true 64-team bracket, order the list so that adjacent entries are the real first-round matchups (1 plays 16, 8 plays 9, and so on within each region); the round logic is unchanged.
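One way to produce that ordering is sketched below. The function bracket_order is a hypothetical helper, and the region is filled with placeholder names and ratings. It assumes each region arrives as a list sorted by seed, with the 1 seed first.

```python
def bracket_order(n_seeds):
    """Seed numbers in bracket order for a power-of-two field (hypothetical helper)."""
    if n_seeds < 1 or n_seeds & (n_seeds - 1):
        raise ValueError("n_seeds must be a power of two")
    order = [1]
    while len(order) < n_seeds:
        size = len(order) * 2
        # Pair each seed s with its complement, size + 1 - s.
        order = [x for s in order for x in (s, size + 1 - s)]
    return order

# Placeholder 16-team region, listed by seed. Names and ratings are invented.
region = [(f"Seed {s}", 1800 - 25 * s) for s in range(1, 17)]
ordered = [region[s - 1] for s in bracket_order(16)]
print([name for name, _ in ordered[:4]])  # ['Seed 1', 'Seed 16', 'Seed 8', 'Seed 9']
```

Adjacent pairs in the result are 1 vs 16, 8 vs 9, 4 vs 13, 5 vs 12, and so on, so play_round can consume the list directly. A full field would be four such regions concatenated, on the assumption that neighboring regions meet in the national semifinals.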

Run it ten thousand times and tally

One simulated bracket is one random outcome. Run it many times — 10,000 is plenty for stable odds — and count how often each team wins it all or reaches the Final Four. Those counts, divided by the number of simulations, are the probabilities.

def monte_carlo(field, n=10000, seed=0):
    random.seed(seed)
    titles = {t[0]: 0 for t in field}
    ff     = {t[0]: 0 for t in field}
    for _ in range(n):
        champ, final_four = simulate_bracket(field)
        titles[champ] += 1
        for name in final_four:
            ff[name] += 1
    # convert counts to probabilities
    title_odds = {k: v / n for k, v in titles.items()}
    ff_odds    = {k: v / n for k, v in ff.items()}
    return title_odds, ff_odds

title_odds, ff_odds = monte_carlo(field, n=10000)
for name, p in sorted(title_odds.items(), key=lambda kv: -kv[1]):
    print(f"{name:18s}  title {p:5.1%}   Final Four {ff_odds[name]:5.1%}")

The output is each team’s odds to win it all and to reach the Final Four. With the toy field above, the top seed leads — as it should — but never approaches certainty.

Why numpy if the core uses random? Because once this works, the natural next step is to vectorize — simulate thousands of games at once with numpy.random.random arrays instead of a Python loop — which turns 10,000 brackets from a noticeable wait into a blink. Start with the readable loop above; reach for numpy when you scale to the full field and want millions of simulated games.
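A minimal sketch of that vectorized version follows, assuming the ratings are passed as a plain sequence in bracket order. The function name monte_carlo_vectorized is hypothetical. It draws random numbers from numpy.random.default_rng rather than numpy.random.random, and it returns title odds only.

```python
import numpy as np

def monte_carlo_vectorized(ratings, n=10000, scale=400.0, seed=0):
    """Title probabilities for a power-of-two field listed in bracket order.

    Returns an array of title probabilities aligned with `ratings`.
    """
    ratings = np.asarray(ratings, dtype=float)
    rng = np.random.default_rng(seed)
    # alive[i, j] is the index of the team in slot j of simulation i.
    alive = np.tile(np.arange(len(ratings)), (n, 1))
    while alive.shape[1] > 1:
        a, b = alive[:, 0::2], alive[:, 1::2]      # adjacent slots meet
        p_a = 1.0 / (1.0 + 10.0 ** (-(ratings[a] - ratings[b]) / scale))
        a_wins = rng.random(p_a.shape) < p_a       # one draw per game
        alive = np.where(a_wins, a, b)             # winners advance
    champions = alive[:, 0]
    return np.bincount(champions, minlength=len(ratings)) / n

# Ratings of the toy field above, in the same order.
toy_ratings = [1720, 1635, 1610, 1560, 1545, 1500, 1480, 1450]
odds = monte_carlo_vectorized(toy_ratings, n=10000)
print(np.round(odds, 3))
```

Each pass through the loop plays one whole round for all n simulated brackets at once, so a 64-team field needs only six array operations per run instead of 63 Python-level games per bracket.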

Reading the output honestly

The favorite’s title odds are lower than your gut says. Winning six straight games is hard even at 75% per game (0.75 to the sixth is only about 18%). Monte Carlo makes that compounding visceral.
Garbage in, garbage out. The simulation is only as good as your ratings. It faithfully propagates whatever edge your ratings claim — including their mistakes. Use ratings you trust, and consider running it with two different rating sources to see how much the answer moves.
It assumes independence and ignores matchups. Real games have injuries, styles, and rest that a single rating can’t see. The model treats every game as a fresh draw from the ratings; reality is messier.
More simulations, smoother odds. At 1,000 runs the tail teams’ numbers jump around; at 10,000+ they settle. If a team’s odds change a lot when you re-run with a new seed, you need more iterations.
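Two of the points above are plain arithmetic and can be checked directly. The sketch below computes the six-game compounding for a few per-game probabilities, then the standard binomial standard error of a simulated probability, sqrt(p(1 - p) / n). The helper name mc_standard_error and the example values (a long shot with 2% title odds, the listed run counts) are assumptions for illustration.

```python
import math

# Chance of winning six straight games at a fixed per-game probability.
for p_game in (0.60, 0.75, 0.90):
    print(f"{p_game:.0%} per game -> {p_game ** 6:.1%} to win six in a row")

def mc_standard_error(p, n):
    """Standard error of a probability p estimated from n independent runs."""
    return math.sqrt(p * (1.0 - p) / n)

# How tightly a 2% long shot (assumed value) is pinned down at each run count.
for n in (1_000, 10_000, 100_000):
    se = mc_standard_error(0.02, n)
    print(f"n={n:>7,}: roughly +/- {2 * se:.2%} around a 2% estimate")
```

The error shrinks with the square root of the run count, so going from 1,000 to 10,000 runs narrows the noise by a factor of about three, not ten.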

That’s a complete, runnable bracket simulator: a probability formula, a round function, and a loop that counts. Point it at real ratings and you can answer the questions a single bracket never could — not "who will win," but "who’s most likely to, and by how much" — which is the only honest way to think about a one-and-done tournament.

Sources & further reading

Theory: Chapter 3: Python for Sports Analytics — a free chapter at DataField.dev.
KenPom — kenpom.com (a source of team ratings to feed the model)
Companion code: scripts/monte-carlo-bracket-simulator-python.py
Related: Logistic-regression game predictor · Basketball Elo from scratch · Win-probability charts in Python

C. B. Zakarian

C. B. Zakarian is an independent analyst who writes about what he can measure: ball sports and the player-run economies inside Roblox. He builds every model, chart, and calculator here himself from public data, shows the working, and never invents a number. When the data can't answer a question, he says so. On CollegeAthleteInsider, that means college football and basketball by the numbers, plus a plain-English read on the NIL-era rules.

