pyball Tutorial¶

This tutorial walks through each module's public functions and shows sample output for each one.

Player ID Lookup¶

Search by name for a player's unique ID keys: the MLB Advanced Media (MLBAM) key used on Baseball Savant, and the Baseball-Reference key.

def search(self, last_name: str, first_name: str = None, ignore_accents: bool = True) -> pd.DataFrame:
    """
    Searches for a player in the registry based on their name.

    Parameters:
    - last_name (str): The last name of the player to search for.
    - first_name (str, optional): The first name of the player to search for. Defaults to None.
    - ignore_accents (bool, optional): Whether to ignore accents in the search. Defaults to True.

    Returns:
    - pd.DataFrame: A DataFrame containing the search results.
    """
In [ ]:
from pyball.playerid_lookup import PlayerLookup

client = PlayerLookup()
client.search("Ramirez", "Jose")
Out[ ]:
name_last name_first key_mlbam key_retro key_bbref key_fangraphs mlb_played_first mlb_played_last
0 ramirez jose 542432 ramij004 ramirjo02 10171 2014.0 2018.0
1 ramirez jose 608070 ramij003 ramirjo01 13510 2013.0 2024.0

I was looking for Jose Ramirez of the Guardians, so I would use the second entry: its mlb_played_last value of 2024.0 shows he is still playing (as of 2024).
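
When a name matches more than one player, the `mlb_played_last` column can break the tie. The sketch below assumes the search result has been converted to plain records with pandas' `DataFrame.to_dict("records")`. The helper `most_recent_player` is hypothetical and not part of pyball, and only a subset of the columns is copied over.

```python
# Rows copied from the search output above (subset of columns),
# in the shape DataFrame.to_dict("records") would produce.
records = [
    {"name_last": "ramirez", "name_first": "jose", "key_mlbam": 542432,
     "key_bbref": "ramirjo02", "mlb_played_first": 2014.0, "mlb_played_last": 2018.0},
    {"name_last": "ramirez", "name_first": "jose", "key_mlbam": 608070,
     "key_bbref": "ramirjo01", "mlb_played_first": 2013.0, "mlb_played_last": 2024.0},
]


def most_recent_player(rows):
    """Hypothetical helper: return the row with the latest final MLB season."""
    return max(rows, key=lambda row: row["mlb_played_last"])


player = most_recent_player(records)
print(player["key_mlbam"], player["key_bbref"])  # 608070 ramirjo01
```

This heuristic only works when one candidate is clearly more recent; two active players with the same name would still need a manual check.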

Utility functions¶

Create various valid URLs easily.

def make_bbref_player_url(bbref_key):
    """
    Function to generate baseball-reference url from bbref_key

    Parameters
    ----------
    bbref_key: String
        bbref_key of the player

    Returns
    ----------
    String
        baseball-reference url of the player
    """
In [ ]:
from pyball import utils

# Use the key_bbref from the player lookup
utils.make_bbref_player_url("ramirjo01")
Out[ ]:
'https://www.baseball-reference.com/players/r/ramirjo01.shtml'
def make_bbref_team_url(team, year):
    """
    Function to generate a baseball-reference team url from team and year

    Parameters
    ----------
    team: String
        team abbreviation (e.g. "CLE")
    year: String
        season year (e.g. "2017")

    Returns
    ----------
    String
        baseball-reference team url
    """
In [ ]:
# Manually enter valid team and year
utils.make_bbref_team_url("CLE", "2017")
Out[ ]:
'https://www.baseball-reference.com/teams/CLE/2017.shtml'
def make_savant_player_url(last, first, key_mlbam):
    """
    Function to generate baseball savant url from last name, first name, and mlbam key

    Parameters
    ----------
    last: String
        last name of the player
    first: String
        first name of the player
    key_mlbam: String
        mlbam key of the player

    Returns
    ----------
    String
        baseball savant url of the player
    """
In [ ]:
# Use the key_mlbam from the player lookup
utils.make_savant_player_url("Ramirez", "Jose", "608070")
Out[ ]:
'https://baseballsavant.mlb.com/savant-player/Jose-Ramirez-608070'
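
The three sample outputs above follow simple patterns. The sketch below reproduces them with plain string formatting. It is inferred from the sample URLs only, not from pyball's source, and the `sketch_*` function names are hypothetical.

```python
def sketch_bbref_player_url(bbref_key: str) -> str:
    # Player pages appear to be grouped under the first letter of the key.
    return f"https://www.baseball-reference.com/players/{bbref_key[0]}/{bbref_key}.shtml"


def sketch_bbref_team_url(team: str, year: str) -> str:
    # Team pages appear to be keyed by abbreviation and season.
    return f"https://www.baseball-reference.com/teams/{team}/{year}.shtml"


def sketch_savant_player_url(last: str, first: str, key_mlbam: str) -> str:
    # Savant player pages appear to use a first-last-id slug.
    return f"https://baseballsavant.mlb.com/savant-player/{first}-{last}-{key_mlbam}"


print(sketch_bbref_player_url("ramirjo01"))
print(sketch_bbref_team_url("CLE", "2017"))
print(sketch_savant_player_url("Ramirez", "Jose", "608070"))
```

In practice, prefer the `utils` functions; the sketch is only meant to make the URL structure visible.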

Baseball-Reference Player Stats¶

In [ ]:
from pyball.baseball_reference_player import BaseballReferencePlayerStatsScraper

# Batter example: Hank Aaron
url = "https://www.baseball-reference.com/players/a/aaronha01.shtml"
scraper = BaseballReferencePlayerStatsScraper(url)
Using cached data
In [ ]:
scraper.batting_stats().head()
Out[ ]:
Year Age Tm Lg G PA AB R H 2B ... OPS OPS+ TB GDP HBP SH SF IBB Pos Awards
0 1954 20 MLN NL 122 509 468 58 131 27 ... .769 104 209 13 3 6 4 0 *79/H RoY-4
1 1955 21 MLN NL 153 665 602 105 189 37 ... .906 141 325 20 3 7 4 5 *974/H AS,MVP-9
2 1956 22 MLN NL 153 660 609 106 200 34 ... .923 151 340 21 2 5 7 6 *9/H AS,MVP-3
3 1957 23 MLN NL 151 675 615 118 198 27 ... .978 166 369 13 0 0 3 15 *98/H AS,MVP-1
4 1958 24 MLN NL 153 664 601 109 196 34 ... .931 153 328 21 1 0 3 16 *98 AS,MVP-3,GG

5 rows × 30 columns
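
The Awards column packs several honors into one comma-separated string, and a numeric suffix gives the player's finish in that award's voting (for example, `MVP-1` in 1957). A minimal sketch of splitting such a string follows; `parse_awards` is a hypothetical helper, not a pyball function, and it assumes the cell value is a plain string.

```python
def parse_awards(awards: str):
    """Hypothetical helper: split 'AS,MVP-9' into [('AS', None), ('MVP', 9)]."""
    parsed = []
    for item in awards.split(","):
        item = item.strip()
        if not item:
            continue  # empty cell: nothing listed for that season
        name, sep, rank = item.rpartition("-")
        if sep and rank.isdigit():
            parsed.append((name, int(rank)))  # award with a voting finish
        else:
            parsed.append((item, None))  # award with no rank, e.g. AS or GG
    return parsed


print(parse_awards("AS,MVP-3,GG"))  # [('AS', None), ('MVP', 3), ('GG', None)]
```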

In [ ]:
# Pitcher example: Clayton Kershaw
url = "https://www.baseball-reference.com/players/k/kershcl01.shtml"
scraper = BaseballReferencePlayerStatsScraper(url)
Using cached data
In [ ]:
scraper.pitching_stats().head()
Out[ ]:
Year Age Tm Lg W L W-L% ERA G GS ... BF ERA+ FIP WHIP H9 HR9 BB9 SO9 SO/W Awards
0 2008 20 LAD NL 5 5 .500 4.26 22 21 ... 470 98 4.08 1.495 9.1 0.9 4.3 8.4 1.92
1 2009 21 LAD NL 8 8 .500 2.79 31 30 ... 701 143 3.08 1.228 6.3 0.4 4.8 9.7 2.03
2 2010 22 LAD NL 13 10 .565 2.91 32 32 ... 848 133 3.12 1.179 7.0 0.6 3.6 9.3 2.62
3 2011 23 LAD NL 21 5 .808 2.28 33 33 ... 912 161 2.47 0.977 6.7 0.6 2.1 9.6 4.59 AS,CYA-1,MVP-12,GG
4 2012 24 LAD NL 14 9 .609 2.53 33 33 ... 901 150 2.89 1.023 6.7 0.6 2.5 9.1 3.63 AS,CYA-2,MVP-16

5 rows × 35 columns

Baseball-Reference Team Stats¶

In [ ]:
from pyball.baseball_reference_team import BaseballReferenceTeamStatsScraper

url = "https://www.baseball-reference.com/teams/LAD/2017.shtml"
scraper = BaseballReferenceTeamStatsScraper(url)
Fetching from URL
In [ ]:
scraper.batting_stats().head()
Out[ ]:
Rk Pos Name Age G PA AB R H 2B ... OBP SLG OPS OPS+ TB GDP HBP SH SF IBB
0 1 C Yasmani Grandal# 28 129 482 438 50 108 27 ... .308 .459 .767 101 201 10 0 1 3 0
1 2 1B Cody Bellinger* 21 132 548 480 87 128 26 ... .352 .581 .933 143 279 5 1 0 3 13
2 3 2B Logan Forsythe 30 119 439 361 56 81 19 ... .351 .327 .678 83 118 12 4 0 5 1
3 4 SS Corey Seager* 23 145 613 539 85 159 33 ... .375 .479 .854 126 258 14 4 0 3 5
4 5 3B Justin Turner 32 130 543 457 72 147 32 ... .415 .530 .945 150 242 12 19 1 7 5

5 rows × 28 columns
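
Names in the table above carry Baseball-Reference's handedness markers: a trailing `*` for a left-handed batter and `#` for a switch hitter. If you need clean names for joins or display, one option is to split the marker off. The helper below is a hypothetical sketch, not part of pyball.

```python
def split_name_marker(raw_name: str):
    """Hypothetical helper: separate a trailing '*' or '#' from a player name."""
    raw_name = raw_name.strip()
    if raw_name.endswith(("*", "#")):
        return raw_name[:-1].strip(), raw_name[-1]
    return raw_name, ""


for raw in ["Yasmani Grandal#", "Cody Bellinger*", "Justin Turner"]:
    print(split_name_marker(raw))
```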

In [ ]:
scraper.pitching_stats().head()
Out[ ]:
Rk Pos Name Age W L W-L% ERA G GS ... WP BF ERA+ FIP WHIP H9 HR9 BB9 SO9 SO/W
0 1 SP Clayton Kershaw* 29 18 4 .818 2.31 27 27 ... 4 679 179 3.07 0.949 7.0 1.2 1.5 10.4 6.73
1 2 SP Alex Wood* 26 16 3 .842 2.72 27 25 ... 2 614 152 3.32 1.057 7.3 0.9 2.2 8.9 3.97
2 3 SP Rich Hill* 37 12 8 .600 3.32 25 25 ... 2 552 125 3.72 1.091 6.6 1.2 3.3 11.0 3.39
3 4 SP Kenta Maeda 29 13 6 .684 4.22 29 25 ... 4 557 98 4.07 1.154 8.1 1.5 2.3 9.4 4.12
4 5 SP Hyun Jin Ryu* 30 5 9 .357 3.77 25 24 ... 4 541 110 4.74 1.366 9.1 1.6 3.2 8.2 2.58

5 rows × 34 columns

Baseball Savant¶

In [ ]:
from pyball import savant

ohtani_batter = savant.SavantScraper(
    "https://baseballsavant.mlb.com/savant-player/shohei-ohtani-660271?stats=statcast-r-hitting-mlb"
)
ohtani_pitcher = savant.SavantScraper(
    "https://baseballsavant.mlb.com/savant-player/shohei-ohtani-660271?stats=statcast-r-pitching-mlb&playerType=pitcher"
)
Fetching from URL
Fetching from URL
In [ ]:
ohtani_batter.get_percentile_stats().head()
Out[ ]:
Year xwOBA xBA xSLG xISO xOBP Brl Brl% EV Max EV Hard Hit% K% BB% Whiff% Chase Rate Speed OAA Arm Strength Bat Speed Swing Length
0 2018 94 80 97 97 77 76 98 96 93 98 8 68 10 51 82 NaN NaN NaN NaN
1 2019 73 84 74 62 65 65 86 97 97 88 22 36 28 36 79 NaN NaN NaN NaN
2 2020 54 29 49 57 52 55 71 55 85 68 18 79 16 69 92 NaN NaN NaN NaN
3 2021 97 71 100 100 95 100 100 97 100 97 7 98 3 54 90 NaN NaN NaN NaN
4 2022 98 88 99 98 91 99 98 97 100 93 30 81 26 57 76 NaN NaN NaN NaN
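
The two URLs passed to `SavantScraper` above differ only in their query string: the hitting view sets `stats`, while the pitching view also sets `playerType=pitcher`. The sketch below builds both from a base player URL with the standard library. `sketch_savant_stats_url` is a hypothetical helper, and the parameter values are taken from the two URLs shown rather than from any Savant documentation.

```python
from urllib.parse import urlencode

BASE = "https://baseballsavant.mlb.com/savant-player/shohei-ohtani-660271"


def sketch_savant_stats_url(base: str, pitcher: bool = False) -> str:
    """Hypothetical helper: append the query string seen in the tutorial URLs."""
    kind = "pitching" if pitcher else "hitting"
    params = {"stats": f"statcast-r-{kind}-mlb"}
    if pitcher:
        params["playerType"] = "pitcher"
    return f"{base}?{urlencode(params)}"


print(sketch_savant_stats_url(BASE))
print(sketch_savant_stats_url(BASE, pitcher=True))
```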

Pitcher-only function¶

In [ ]:
ohtani_pitcher.get_pitching_stats().head()
Out[ ]:
Season Age Pitches Batted Balls Barrels Barrel % Barrel/PA Exit Velocity Max EV Launch Angle ... XBA XSLG WOBA XWOBA XWOBACON HardHit% K% BB% ERA xERA
0 2018 23.0 853.0 125.0 6.0 4.8 2.8 87.0 111.9 16.0 ... 0.207 0.338 0.277 0.284 0.352 31.2 29.9 10.4 3.31 3.37
1 2020 25.0 80.0 5.0 0.0 0.0 0.0 97.6 102.0 10.8 ... 0.238 0.349 0.515 0.476 0.405 60.0 18.8 50.0 37.80 11.70
2 2021 26.0 2027.0 323.0 23.0 7.1 4.3 88.4 112.7 11.7 ... 0.207 0.344 0.279 0.282 0.351 39.9 29.3 8.3 3.18 3.32
3 2022 27.0 2629.0 394.0 25.0 6.3 3.8 87.1 113.3 14.5 ... 0.204 0.311 0.255 0.256 0.347 33.2 33.2 6.7 2.33 2.68
4 2023 28.0 2094.0 297.0 30.0 10.1 5.6 86.4 110.7 11.5 ... 0.206 0.377 0.277 0.302 0.383 35.0 31.5 10.4 3.14 3.82

5 rows × 21 columns
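
Small samples can dominate a table like this: the 2020 row covers only 80 pitches and shows a 37.80 ERA. A minimal sketch of filtering out such seasons follows, assuming the result has been converted with pandas' `DataFrame.to_dict("records")` and that `Season` holds integers. The threshold is arbitrary and chosen only for illustration.

```python
# Three rows copied from the output above (subset of columns).
seasons = [
    {"Season": 2018, "Pitches": 853.0, "ERA": 3.31},
    {"Season": 2020, "Pitches": 80.0, "ERA": 37.80},
    {"Season": 2021, "Pitches": 2027.0, "ERA": 3.18},
]

MIN_PITCHES = 500  # arbitrary cutoff, not a pyball or Savant setting

# Keep only seasons with enough pitches to be meaningful.
qualified = [row for row in seasons if row["Pitches"] >= MIN_PITCHES]
print([row["Season"] for row in qualified])  # [2018, 2021]
```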

Batter-only function¶

In [ ]:
ohtani_batter.get_batting_stats().head()
Out[ ]:
Season Age Pitches Batted Balls Barrels Barrel % Barrel/PA Exit Velocity Max EV Launch Angle LA Sweet-Spot % XBA XSLG WOBA XWOBA XWOBACON HardHit% K% BB%
0 2018 23.0 1455.0 225.0 36.0 16.0 9.8 92.9 113.9 12.4 35.6 0.272 0.542 0.390 0.381 0.502 51.1 27.9 10.1
1 2019 24.0 1683.0 278.0 34.0 12.2 8.0 92.8 115.1 6.8 31.7 0.280 0.487 0.352 0.350 0.446 47.1 25.9 7.8
2 2020 25.0 739.0 103.0 11.0 10.7 6.3 89.1 111.9 9.2 32.0 0.234 0.423 0.290 0.331 0.413 42.7 28.6 12.6
3 2021 26.0 2594.0 350.0 78.0 22.3 12.2 93.6 119.0 16.6 35.4 0.266 0.612 0.393 0.408 0.566 53.6 29.6 15.0
4 2022 27.0 2546.0 428.0 72.0 16.8 10.8 92.9 119.1 12.1 35.0 0.275 0.549 0.370 0.383 0.481 49.8 24.2 10.8
In [ ]:
ohtani_pitcher.get_batted_ball_profile().head()
Out[ ]:
Season GB % FB % LD % PU % Pull % Straight % Oppo % Weak % Topped % Under % Flare/Burner % Solid % Barrel % Barrel/PA
0 2018 40.0 24.0 25.6 10.4 36.8 32.8 28.8 6.4 27.2 28.8 24.8 8.0 4.8 2.8
1 2020 40.0 20.0 20.0 20.0 20.0 20.0 60.0 20.0 20.0 0.0 40.0 20.0 0.0 0.0
2 2021 46.4 23.8 22.3 7.4 38.4 37.5 24.1 6.5 31.6 24.8 22.0 8.0 7.1 4.3
3 2022 41.9 26.6 22.8 8.6 37.8 36.3 25.9 4.1 31.2 27.4 25.6 5.1 6.3 3.8
4 2023 45.8 25.6 20.5 8.1 44.1 29.6 26.3 5.1 34.0 23.9 21.2 5.4 10.1 5.6
In [ ]:
ohtani_batter.get_pitch_tracking().head()
Out[ ]:
Year Pitch Type # % PA AB H 1B 2B 3B ... BA XBA SLG XSLG WOBA XWOBA EV LA Whiff% PutAway%
0 2024 Fastball 1011 54.2 258 219 72 37 18 1 ... 0.329 0.351 0.639 0.705 0.448 0.477 95.9 13 25.9 20.0
1 2024 Breaking 545 29.2 135 120 34 15 5 1 ... 0.283 0.282 0.667 0.651 0.427 0.426 94.4 18 37.6 23.0
2 2024 Offspeed 310 16.6 75 68 22 12 5 2 ... 0.324 0.321 0.588 0.590 0.405 0.413 97.1 12 30.3 17.4
3 2023 Fastball 1140 49.5 266 216 82 43 13 3 ... 0.380 0.351 0.787 0.743 0.517 0.494 97.1 13 25.0 16.9
4 2023 Breaking 712 30.9 184 163 38 16 5 1 ... 0.233 0.233 0.571 0.587 0.369 0.378 92.2 18 40.3 26.2

5 rows × 23 columns
