@mlinenweber mlinenweber commented Sep 2, 2024

  • Fixed lahman.py - it now downloads from Dropbox [does so in a generic way, by finding the link on http://seanlahman.com].
  • Added test_lahman.py.
  • Refactored conftest.py by combining response_get_monkeypatch and bref_get_monkeypatch to use target_get_monkeypatch [they differed by only 1 line].
  • Added fixtures get_data_file_bytes and target.
  • Updated documentation.
  • Updated setup.py to install py7zr and requests_cache.

fixes #391

Contributor

bdilday commented Sep 4, 2024

there is also a fix here #434
I think there's a preference to avoid relying on 7-zip and py7zr?

Contributor

bdilday commented Sep 4, 2024

I meant this PR #435

Author

mlinenweber commented Sep 8, 2024

Respectfully, I don't know of any reason not to use 7-Zip/py7zr. The justification for using it is that it is the compression format Sean Lahman has chosen. Furthermore, I don't think it is appropriate to rely on a third party to host a zip file (the same goes for putting the zip file into this repository). I would rather go directly to the source, i.e., seanlahman.com, so that when he updates the data this library can download the latest version without delay.

Can discuss more here or on Discord. Thanks.

soup = BeautifulSoup(response.content, "html.parser")

anchor = soup.find("a", string="Comma-delimited version")
url = anchor["href"].replace("dl=0", "dl=1")
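The `replace("dl=0", "dl=1")` call above only works if the link already contains the literal text `dl=0`. As a rough sketch of a more tolerant alternative, the snippet below sets the `dl` query parameter with `urllib.parse`, whatever its current value. The helper name `force_direct_download` and the example URL are made up for illustration and are not part of this PR.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def force_direct_download(url: str) -> str:
    """Return url with its `dl` query parameter set to 1 (hypothetical helper)."""
    parts = urlsplit(url)
    # Keep every existing query parameter except `dl`, preserving their order.
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != "dl"
    ]
    query.append(("dl", "1"))
    return urlunsplit(parts._replace(query=urlencode(query)))


# Invented link, for illustration only.
example = "https://www.dropbox.com/s/abc123/lahman.7z?rlkey=xyz&dl=0"
print(force_direct_download(example))
```

Unlike a plain string replace, this also handles links that have no `dl` parameter at all, at the cost of a few extra lines.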

@mlinenweber Can you explain these 2 lines and what _get_download_url() is supposed to return?

    anchor = soup.find("a", string="Comma-delimited version")
    url = anchor["href"].replace("dl=0", "dl=1")

I get the error below when trying to run people():

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[24], line 31
     28 soup = BeautifulSoup(response.content, "html.parser")
     30 anchor = soup.find("a", string="Comma-delimited version")
---> 31 url = anchor["href"].replace("dl=0", "dl=1")

TypeError: 'NoneType' object is not subscriptable

Is soup.find("a", string="Comma-delimited version") supposed to return None?
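For context, `soup.find(...)` returns `None` when nothing matches, and subscripting `None` is what raises the `TypeError` in the traceback. Why nothing matched here is not clear from this thread. One way to make that failure explicit is sketched below: it matches the link text case-insensitively, ignores extra whitespace, and raises a descriptive error if no link is found. It uses the standard library's `html.parser` instead of BeautifulSoup so that it runs without extra packages. `AnchorCollector`, `find_link`, and the sample HTML are all hypothetical, and the HTML is not the actual content of seanlahman.com.

```python
from html.parser import HTMLParser


class AnchorCollector(HTMLParser):
    """Collect (href, text) pairs for every <a href=...> element in a page."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None  # href of the anchor currently open, if any
        self._text = []    # text fragments seen inside that anchor

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            # Collapse runs of whitespace so line breaks in the markup do not matter.
            text = " ".join("".join(self._text).split())
            self.links.append((self._href, text))
            self._href = None


def find_link(html: str, label: str) -> str:
    """Return the href of the first link whose text contains label."""
    collector = AnchorCollector()
    collector.feed(html)
    for href, text in collector.links:
        if label.lower() in text.lower():
            return href
    # Fail with a clear message instead of a TypeError on None.
    raise ValueError(f"no link whose text contains {label!r}")


# Invented sample markup, for illustration only.
sample = '<p><a href="https://example.com/x?dl=0">\n  Comma-delimited   version </a></p>'
print(find_link(sample, "comma-delimited version"))
```

The same guard can be written with BeautifulSoup by checking the result of `soup.find(...)` for `None` before indexing it.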

Development

Successfully merging this pull request may close these issues.

download_lahman() failing

3 participants