pyball
Tutorial¶
This tutorial walks through each module's public functions and shows sample output for each one.
Player ID Lookup¶
Look up a player's unique ID keys by name: the MLB Advanced Media (MLBAM) key, which Baseball Savant uses, and the Baseball-Reference key.
def search(self, last_name: str, first_name: str = None, ignore_accents: bool = True) -> pd.DataFrame:
    """
    Searches for a player in the registry based on their name.

    Parameters:
    - last_name (str): The last name of the player to search for.
    - first_name (str, optional): The first name of the player to search for. Defaults to None.
    - ignore_accents (bool, optional): Whether to ignore accents in the search. Defaults to True.

    Returns:
    - pd.DataFrame: A DataFrame containing the search results.
    """
In [ ]:
from pyball.playerid_lookup import PlayerLookup
client = PlayerLookup()
client.search("Ramirez", "Jose")
Out[ ]:
| | name_last | name_first | key_mlbam | key_retro | key_bbref | key_fangraphs | mlb_played_first | mlb_played_last |
---|---|---|---|---|---|---|---|---|
0 | ramirez | jose | 542432 | ramij004 | ramirjo02 | 10171 | 2014.0 | 2018.0 |
1 | ramirez | jose | 608070 | ramij003 | ramirjo01 | 13510 | 2013.0 | 2024.0 |
The search returns two players named Jose Ramirez. To get the Jose Ramirez who plays for the Guardians, use the second entry (index 1): its mlb_played_last value of 2024 shows that he is still active (as of 2024).
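Choosing the active player can also be done in code. The following is a minimal sketch that rebuilds the sample output above as a plain pandas DataFrame, so it runs without pyball or network access, and keeps the row with the latest `mlb_played_last`. The helper name `most_recent_player` is hypothetical and is not part of pyball.

```python
import pandas as pd

# Sample rows copied from the search output above; in real use this
# DataFrame would come from PlayerLookup().search("Ramirez", "Jose").
results = pd.DataFrame(
    {
        "name_last": ["ramirez", "ramirez"],
        "name_first": ["jose", "jose"],
        "key_mlbam": [542432, 608070],
        "key_retro": ["ramij004", "ramij003"],
        "key_bbref": ["ramirjo02", "ramirjo01"],
        "key_fangraphs": [10171, 13510],
        "mlb_played_first": [2014.0, 2013.0],
        "mlb_played_last": [2018.0, 2024.0],
    }
)


def most_recent_player(df: pd.DataFrame) -> pd.Series:
    """Hypothetical helper: return the row whose last MLB season is latest."""
    return df.loc[df["mlb_played_last"].idxmax()]


player = most_recent_player(results)
print(player["key_mlbam"], player["key_bbref"])
```

This heuristic only works when one of the matches is clearly more recent than the others; if several same-named players are active at once, inspect the rows by hand.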
Utility functions¶
Helper functions that build valid Baseball-Reference and Baseball Savant URLs.
def make_bbref_player_url(bbref_key):
    """
    Function to generate baseball-reference url from bbref_key

    Parameters
    ----------
    bbref_key: String
        bbref_key of the player

    Returns
    ----------
    String
        baseball-reference url of the player
    """
In [ ]:
from pyball import utils
# Use the key_bbref from the player lookup
utils.make_bbref_player_url("ramirjo01")
Out[ ]:
'https://www.baseball-reference.com/players/r/ramirjo01.shtml'
def make_bbref_team_url(team, year):
    """
    Function to generate a baseball-reference team url from team and year

    Parameters
    ----------
    team: String
        team abbreviation, e.g. "CLE"
    year: String
        season year, e.g. "2017"

    Returns
    ----------
    String
        baseball-reference team url
    """
In [ ]:
# Manually enter valid team and year
utils.make_bbref_team_url("CLE", "2017")
Out[ ]:
'https://www.baseball-reference.com/teams/CLE/2017.shtml'
def make_savant_player_url(last, first, key_mlbam):
    """
    Function to generate baseball savant url from last name, first name, and mlbam key

    Parameters
    ----------
    last: String
        last name of the player
    first: String
        first name of the player
    key_mlbam: String
        mlbam key of the player

    Returns
    ----------
    String
        baseball savant url of the player
    """
In [ ]:
# Use the key_mlbam from the player lookup
utils.make_savant_player_url("Ramirez", "Jose", "608070")
Out[ ]:
'https://baseballsavant.mlb.com/savant-player/Jose-Ramirez-608070'
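The URL helpers combine naturally with the keys returned by the player lookup. The sketch below is not pyball's implementation: `player_urls` is a hypothetical stand-in that reproduces the URL patterns visible in the outputs above with plain string formatting, so it runs without the library. It assumes the Baseball-Reference path segment is the first letter of `key_bbref`, as in the example output.

```python
def player_urls(last: str, first: str, key_bbref: str, key_mlbam: str) -> dict:
    """Hypothetical stand-in mirroring the URL patterns shown above."""
    bbref = (
        "https://www.baseball-reference.com/players/"
        f"{key_bbref[0]}/{key_bbref}.shtml"
    )
    savant = (
        "https://baseballsavant.mlb.com/savant-player/"
        f"{first}-{last}-{key_mlbam}"
    )
    return {"bbref": bbref, "savant": savant}


# Keys taken from the second row of the player lookup output.
urls = player_urls("Ramirez", "Jose", "ramirjo01", "608070")
print(urls["bbref"])
print(urls["savant"])
```

In real code, prefer the `utils` functions themselves; the stand-in only illustrates how the lookup keys map onto each URL.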
Baseball-Reference Player Stats¶
In [ ]:
from pyball.baseball_reference_player import BaseballReferencePlayerStatsScraper
# Batter example: Hank Aaron
url = "https://www.baseball-reference.com/players/a/aaronha01.shtml"
scraper = BaseballReferencePlayerStatsScraper(url)
Using cached data
In [ ]:
scraper.batting_stats().head()
Out[ ]:
| | Year | Age | Tm | Lg | G | PA | AB | R | H | 2B | ... | OPS | OPS+ | TB | GDP | HBP | SH | SF | IBB | Pos | Awards |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 1954 | 20 | MLN | NL | 122 | 509 | 468 | 58 | 131 | 27 | ... | .769 | 104 | 209 | 13 | 3 | 6 | 4 | 0 | *79/H | RoY-4 |
1 | 1955 | 21 | MLN | NL | 153 | 665 | 602 | 105 | 189 | 37 | ... | .906 | 141 | 325 | 20 | 3 | 7 | 4 | 5 | *974/H | AS,MVP-9 |
2 | 1956 | 22 | MLN | NL | 153 | 660 | 609 | 106 | 200 | 34 | ... | .923 | 151 | 340 | 21 | 2 | 5 | 7 | 6 | *9/H | AS,MVP-3 |
3 | 1957 | 23 | MLN | NL | 151 | 675 | 615 | 118 | 198 | 27 | ... | .978 | 166 | 369 | 13 | 0 | 0 | 3 | 15 | *98/H | AS,MVP-1 |
4 | 1958 | 24 | MLN | NL | 153 | 664 | 601 | 109 | 196 | 34 | ... | .931 | 153 | 328 | 21 | 1 | 0 | 3 | 16 | *98 | AS,MVP-3,GG |
5 rows × 30 columns
In [ ]:
# Pitcher example: Clayton Kershaw
url = "https://www.baseball-reference.com/players/k/kershcl01.shtml"
scraper = BaseballReferencePlayerStatsScraper(url)
Using cached data
In [ ]:
scraper.pitching_stats().head()
Out[ ]:
| | Year | Age | Tm | Lg | W | L | W-L% | ERA | G | GS | ... | BF | ERA+ | FIP | WHIP | H9 | HR9 | BB9 | SO9 | SO/W | Awards |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 2008 | 20 | LAD | NL | 5 | 5 | .500 | 4.26 | 22 | 21 | ... | 470 | 98 | 4.08 | 1.495 | 9.1 | 0.9 | 4.3 | 8.4 | 1.92 | |
1 | 2009 | 21 | LAD | NL | 8 | 8 | .500 | 2.79 | 31 | 30 | ... | 701 | 143 | 3.08 | 1.228 | 6.3 | 0.4 | 4.8 | 9.7 | 2.03 | |
2 | 2010 | 22 | LAD | NL | 13 | 10 | .565 | 2.91 | 32 | 32 | ... | 848 | 133 | 3.12 | 1.179 | 7.0 | 0.6 | 3.6 | 9.3 | 2.62 | |
3 | 2011 | 23 | LAD | NL | 21 | 5 | .808 | 2.28 | 33 | 33 | ... | 912 | 161 | 2.47 | 0.977 | 6.7 | 0.6 | 2.1 | 9.6 | 4.59 | AS,CYA-1,MVP-12,GG |
4 | 2012 | 24 | LAD | NL | 14 | 9 | .609 | 2.53 | 33 | 33 | ... | 901 | 150 | 2.89 | 1.023 | 6.7 | 0.6 | 2.5 | 9.1 | 3.63 | AS,CYA-2,MVP-16 |
5 rows × 35 columns
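Values such as `.500` in the output suggest that the scraped columns may come back as strings rather than numbers. That is an assumption about the return dtypes, not something this tutorial confirms. If so, a short conversion step makes the stats usable for sorting and arithmetic. The sketch below uses a few columns copied from the Kershaw output, stored as strings, so it runs without pyball.

```python
import pandas as pd

# A few columns copied from the Kershaw output above, kept as strings to
# mimic scraped HTML text (an assumption about the real return dtypes).
pitching = pd.DataFrame(
    {
        "Year": ["2008", "2009", "2010", "2011", "2012"],
        "W-L%": [".500", ".500", ".565", ".808", ".609"],
        "ERA": ["4.26", "2.79", "2.91", "2.28", "2.53"],
        "Awards": ["", "", "", "AS,CYA-1,MVP-12,GG", "AS,CYA-2,MVP-16"],
    }
)

numeric_cols = ["Year", "W-L%", "ERA"]
# errors="coerce" turns unparseable cells into NaN instead of raising.
pitching[numeric_cols] = pitching[numeric_cols].apply(pd.to_numeric, errors="coerce")

# With numeric dtypes, the season with the lowest ERA is a one-liner.
best = pitching.loc[pitching["ERA"].idxmin()]
print(int(best["Year"]), best["ERA"])
```

Text columns such as `Awards` are left untouched by listing only the numeric columns explicitly.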
Baseball-Reference Team Stats¶
In [ ]:
from pyball.baseball_reference_team import BaseballReferenceTeamStatsScraper
url = "https://www.baseball-reference.com/teams/LAD/2017.shtml"
scraper = BaseballReferenceTeamStatsScraper(url)
Fetching from URL
In [ ]:
scraper.batting_stats().head()
Out[ ]:
| | Rk | Pos | Name | Age | G | PA | AB | R | H | 2B | ... | OBP | SLG | OPS | OPS+ | TB | GDP | HBP | SH | SF | IBB |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 1 | C | Yasmani Grandal# | 28 | 129 | 482 | 438 | 50 | 108 | 27 | ... | .308 | .459 | .767 | 101 | 201 | 10 | 0 | 1 | 3 | 0 |
1 | 2 | 1B | Cody Bellinger* | 21 | 132 | 548 | 480 | 87 | 128 | 26 | ... | .352 | .581 | .933 | 143 | 279 | 5 | 1 | 0 | 3 | 13 |
2 | 3 | 2B | Logan Forsythe | 30 | 119 | 439 | 361 | 56 | 81 | 19 | ... | .351 | .327 | .678 | 83 | 118 | 12 | 4 | 0 | 5 | 1 |
3 | 4 | SS | Corey Seager* | 23 | 145 | 613 | 539 | 85 | 159 | 33 | ... | .375 | .479 | .854 | 126 | 258 | 14 | 4 | 0 | 3 | 5 |
4 | 5 | 3B | Justin Turner | 32 | 130 | 543 | 457 | 72 | 147 | 32 | ... | .415 | .530 | .945 | 150 | 242 | 12 | 19 | 1 | 7 | 5 |
5 rows × 28 columns
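On Baseball-Reference batting tables, a trailing `*` after a name marks a left-handed batter and `#` marks a switch hitter; no marker means right-handed. The sketch below shows one way to split that marker from the name. It uses the five names from the output above in a standalone pandas Series, and the variable names are illustrative only.

```python
import pandas as pd

# Names copied from the team batting output above.
names = pd.Series(
    [
        "Yasmani Grandal#",
        "Cody Bellinger*",
        "Logan Forsythe",
        "Corey Seager*",
        "Justin Turner",
    ],
    name="Name",
)

# Capture a trailing "*" or "#" (NaN when absent), then strip it from the name.
marker = names.str.extract(r"([*#])$", expand=False)
clean_names = names.str.replace(r"[*#]$", "", regex=True)

# Map the marker to a batting side; no marker means right-handed.
bats = marker.map({"*": "L", "#": "S"}).fillna("R")

print(pd.DataFrame({"Name": clean_names, "Bats": bats}))
```

Cleaning the names this way also makes them easier to join against the player lookup results.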
In [ ]:
scraper.pitching_stats().head()
Out[ ]:
| | Rk | Pos | Name | Age | W | L | W-L% | ERA | G | GS | ... | WP | BF | ERA+ | FIP | WHIP | H9 | HR9 | BB9 | SO9 | SO/W |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 1 | SP | Clayton Kershaw* | 29 | 18 | 4 | .818 | 2.31 | 27 | 27 | ... | 4 | 679 | 179 | 3.07 | 0.949 | 7.0 | 1.2 | 1.5 | 10.4 | 6.73 |
1 | 2 | SP | Alex Wood* | 26 | 16 | 3 | .842 | 2.72 | 27 | 25 | ... | 2 | 614 | 152 | 3.32 | 1.057 | 7.3 | 0.9 | 2.2 | 8.9 | 3.97 |
2 | 3 | SP | Rich Hill* | 37 | 12 | 8 | .600 | 3.32 | 25 | 25 | ... | 2 | 552 | 125 | 3.72 | 1.091 | 6.6 | 1.2 | 3.3 | 11.0 | 3.39 |
3 | 4 | SP | Kenta Maeda | 29 | 13 | 6 | .684 | 4.22 | 29 | 25 | ... | 4 | 557 | 98 | 4.07 | 1.154 | 8.1 | 1.5 | 2.3 | 9.4 | 4.12 |
4 | 5 | SP | Hyun Jin Ryu* | 30 | 5 | 9 | .357 | 3.77 | 25 | 24 | ... | 4 | 541 | 110 | 4.74 | 1.366 | 9.1 | 1.6 | 3.2 | 8.2 | 2.58 |
5 rows × 34 columns
Baseball Savant¶
In [ ]:
from pyball import savant
ohtani_batter = savant.SavantScraper(
"https://baseballsavant.mlb.com/savant-player/shohei-ohtani-660271?stats=statcast-r-hitting-mlb"
)
ohtani_pitcher = savant.SavantScraper(
"https://baseballsavant.mlb.com/savant-player/shohei-ohtani-660271?stats=statcast-r-pitching-mlb&playerType=pitcher"
)
Fetching from URL
Fetching from URL
In [ ]:
ohtani_batter.get_percentile_stats().head()
Out[ ]:
| | Year | xwOBA | xBA | xSLG | xISO | xOBP | Brl | Brl% | EV | Max EV | Hard Hit% | K% | BB% | Whiff% | Chase Rate | Speed | OAA | Arm Strength | Bat Speed | Swing Length |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 2018 | 94 | 80 | 97 | 97 | 77 | 76 | 98 | 96 | 93 | 98 | 8 | 68 | 10 | 51 | 82 | NaN | NaN | NaN | NaN |
1 | 2019 | 73 | 84 | 74 | 62 | 65 | 65 | 86 | 97 | 97 | 88 | 22 | 36 | 28 | 36 | 79 | NaN | NaN | NaN | NaN |
2 | 2020 | 54 | 29 | 49 | 57 | 52 | 55 | 71 | 55 | 85 | 68 | 18 | 79 | 16 | 69 | 92 | NaN | NaN | NaN | NaN |
3 | 2021 | 97 | 71 | 100 | 100 | 95 | 100 | 100 | 97 | 100 | 97 | 7 | 98 | 3 | 54 | 90 | NaN | NaN | NaN | NaN |
4 | 2022 | 98 | 88 | 99 | 98 | 91 | 99 | 98 | 97 | 100 | 93 | 30 | 81 | 26 | 57 | 76 | NaN | NaN | NaN | NaN |
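Because the percentile table has one row per season, ordinary pandas filtering is enough to answer questions like "in which seasons was the hitter in the top 10% by xwOBA?". The sketch below copies four columns from the output above into a standalone DataFrame, so it runs without pyball. The threshold of 90 is an arbitrary choice for illustration.

```python
import pandas as pd

# Four columns copied from the percentile output above.
percentiles = pd.DataFrame(
    {
        "Year": [2018, 2019, 2020, 2021, 2022],
        "xwOBA": [94, 73, 54, 97, 98],
        "xBA": [80, 84, 29, 71, 88],
        "xSLG": [97, 74, 49, 100, 99],
    }
)

# Seasons at or above the 90th percentile in xwOBA.
elite_years = percentiles.loc[percentiles["xwOBA"] >= 90, "Year"].tolist()
print(elite_years)
```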
Pitcher-only function¶
In [ ]:
ohtani_pitcher.get_pitching_stats().head()
Out[ ]:
| | Season | Age | Pitches | Batted Balls | Barrels | Barrel % | Barrel/PA | Exit Velocity | Max EV | Launch Angle | ... | XBA | XSLG | WOBA | XWOBA | XWOBACON | HardHit% | K% | BB% | ERA | xERA |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 2018 | 23.0 | 853.0 | 125.0 | 6.0 | 4.8 | 2.8 | 87.0 | 111.9 | 16.0 | ... | 0.207 | 0.338 | 0.277 | 0.284 | 0.352 | 31.2 | 29.9 | 10.4 | 3.31 | 3.37 |
1 | 2020 | 25.0 | 80.0 | 5.0 | 0.0 | 0.0 | 0.0 | 97.6 | 102.0 | 10.8 | ... | 0.238 | 0.349 | 0.515 | 0.476 | 0.405 | 60.0 | 18.8 | 50.0 | 37.80 | 11.70 |
2 | 2021 | 26.0 | 2027.0 | 323.0 | 23.0 | 7.1 | 4.3 | 88.4 | 112.7 | 11.7 | ... | 0.207 | 0.344 | 0.279 | 0.282 | 0.351 | 39.9 | 29.3 | 8.3 | 3.18 | 3.32 |
3 | 2022 | 27.0 | 2629.0 | 394.0 | 25.0 | 6.3 | 3.8 | 87.1 | 113.3 | 14.5 | ... | 0.204 | 0.311 | 0.255 | 0.256 | 0.347 | 33.2 | 33.2 | 6.7 | 2.33 | 2.68 |
4 | 2023 | 28.0 | 2094.0 | 297.0 | 30.0 | 10.1 | 5.6 | 86.4 | 110.7 | 11.5 | ... | 0.206 | 0.377 | 0.277 | 0.302 | 0.383 | 35.0 | 31.5 | 10.4 | 3.14 | 3.82 |
5 rows × 21 columns
Batter-only function¶
In [ ]:
ohtani_batter.get_batting_stats().head()
Out[ ]:
| | Season | Age | Pitches | Batted Balls | Barrels | Barrel % | Barrel/PA | Exit Velocity | Max EV | Launch Angle | LA Sweet-Spot % | XBA | XSLG | WOBA | XWOBA | XWOBACON | HardHit% | K% | BB% |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 2018 | 23.0 | 1455.0 | 225.0 | 36.0 | 16.0 | 9.8 | 92.9 | 113.9 | 12.4 | 35.6 | 0.272 | 0.542 | 0.390 | 0.381 | 0.502 | 51.1 | 27.9 | 10.1 |
1 | 2019 | 24.0 | 1683.0 | 278.0 | 34.0 | 12.2 | 8.0 | 92.8 | 115.1 | 6.8 | 31.7 | 0.280 | 0.487 | 0.352 | 0.350 | 0.446 | 47.1 | 25.9 | 7.8 |
2 | 2020 | 25.0 | 739.0 | 103.0 | 11.0 | 10.7 | 6.3 | 89.1 | 111.9 | 9.2 | 32.0 | 0.234 | 0.423 | 0.290 | 0.331 | 0.413 | 42.7 | 28.6 | 12.6 |
3 | 2021 | 26.0 | 2594.0 | 350.0 | 78.0 | 22.3 | 12.2 | 93.6 | 119.0 | 16.6 | 35.4 | 0.266 | 0.612 | 0.393 | 0.408 | 0.566 | 53.6 | 29.6 | 15.0 |
4 | 2022 | 27.0 | 2546.0 | 428.0 | 72.0 | 16.8 | 10.8 | 92.9 | 119.1 | 12.1 | 35.0 | 0.275 | 0.549 | 0.370 | 0.383 | 0.481 | 49.8 | 24.2 | 10.8 |
In [ ]:
ohtani_pitcher.get_batted_ball_profile().head()
Out[ ]:
| | Season | GB % | FB % | LD % | PU % | Pull % | Straight % | Oppo % | Weak % | Topped % | Under % | Flare/Burner % | Solid % | Barrel % | Barrel/PA |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 2018 | 40.0 | 24.0 | 25.6 | 10.4 | 36.8 | 32.8 | 28.8 | 6.4 | 27.2 | 28.8 | 24.8 | 8.0 | 4.8 | 2.8 |
1 | 2020 | 40.0 | 20.0 | 20.0 | 20.0 | 20.0 | 20.0 | 60.0 | 20.0 | 20.0 | 0.0 | 40.0 | 20.0 | 0.0 | 0.0 |
2 | 2021 | 46.4 | 23.8 | 22.3 | 7.4 | 38.4 | 37.5 | 24.1 | 6.5 | 31.6 | 24.8 | 22.0 | 8.0 | 7.1 | 4.3 |
3 | 2022 | 41.9 | 26.6 | 22.8 | 8.6 | 37.8 | 36.3 | 25.9 | 4.1 | 31.2 | 27.4 | 25.6 | 5.1 | 6.3 | 3.8 |
4 | 2023 | 45.8 | 25.6 | 20.5 | 8.1 | 44.1 | 29.6 | 26.3 | 5.1 | 34.0 | 23.9 | 21.2 | 5.4 | 10.1 | 5.6 |
In [ ]:
ohtani_batter.get_pitch_tracking().head()
Out[ ]:
| | Year | Pitch Type | # | % | PA | AB | H | 1B | 2B | 3B | ... | BA | XBA | SLG | XSLG | WOBA | XWOBA | EV | LA | Whiff% | PutAway% |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 2024 | Fastball | 1011 | 54.2 | 258 | 219 | 72 | 37 | 18 | 1 | ... | 0.329 | 0.351 | 0.639 | 0.705 | 0.448 | 0.477 | 95.9 | 13 | 25.9 | 20.0 |
1 | 2024 | Breaking | 545 | 29.2 | 135 | 120 | 34 | 15 | 5 | 1 | ... | 0.283 | 0.282 | 0.667 | 0.651 | 0.427 | 0.426 | 94.4 | 18 | 37.6 | 23.0 |
2 | 2024 | Offspeed | 310 | 16.6 | 75 | 68 | 22 | 12 | 5 | 2 | ... | 0.324 | 0.321 | 0.588 | 0.590 | 0.405 | 0.413 | 97.1 | 12 | 30.3 | 17.4 |
3 | 2023 | Fastball | 1140 | 49.5 | 266 | 216 | 82 | 43 | 13 | 3 | ... | 0.380 | 0.351 | 0.787 | 0.743 | 0.517 | 0.494 | 97.1 | 13 | 25.0 | 16.9 |
4 | 2023 | Breaking | 712 | 30.9 | 184 | 163 | 38 | 16 | 5 | 1 | ... | 0.233 | 0.233 | 0.571 | 0.587 | 0.369 | 0.378 | 92.2 | 18 | 40.3 | 26.2 |
5 rows × 23 columns
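The pitch-tracking output is in long format, with one row per year and pitch type. Pivoting it gives one row per year and one column per pitch type, which is easier to compare across seasons. The sketch below copies the `Year`, `Pitch Type`, and `XWOBA` values from the five rows shown above into a standalone DataFrame, so it runs without pyball.

```python
import pandas as pd

# Year, pitch type, and XWOBA copied from the five rows shown above.
tracking = pd.DataFrame(
    {
        "Year": [2024, 2024, 2024, 2023, 2023],
        "Pitch Type": ["Fastball", "Breaking", "Offspeed", "Fastball", "Breaking"],
        "XWOBA": [0.477, 0.426, 0.413, 0.494, 0.378],
    }
)

# One row per year, one column per pitch type; combinations that are
# missing from the input (here 2023 Offspeed) become NaN.
by_type = tracking.pivot(index="Year", columns="Pitch Type", values="XWOBA")
print(by_type)
```

Only the first five rows of the real output are used here, so the missing 2023 offspeed value reflects the truncated sample, not the underlying data.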