Python Bytes is a weekly podcast hosted by Michael Kennedy and Brian Okken. The show is a short discussion on the headlines and noteworthy news in the Python, developer, and data science space.
#325 It's called a merge conflict
February 28, 2023
00:39:32
38.31 MB
About the show
Sponsored by Microsoft for Startups Founders Hub.
Connect with the hosts
- Michael: @mkennedy@fosstodon.org
- Brian: @brianokken@fosstodon.org
- Show: @pythonbytes@fosstodon.org
Join us on YouTube at pythonbytes.fm/live to be part of the audience. Usually Tuesdays at 11am PT. Older video versions available there too.
Michael #1: Python Parquet and Arrow: Using PyArrow With Pandas
- Parquet is an efficient, compressed, column-oriented storage format for arrays and tables of data.
- Arrow tables are less wrangle-able than Pandas DataFrames, but way faster and lower memory
- Questions answered
- Can we use Pandas DataFrames and Arrow tables together, and if so, how is this done? (It turns out the answer is yes, and it’s quite simple, as we’ll see).
- In what ways are Arrow tables “better” than Pandas DataFrames? In other words, for which tasks are Arrow tables better suited? Conversely, what tasks are possible or easy in Pandas that are difficult or impossible in Arrow?
- As an on-disk format, how does Parquet compare to popular alternatives such as feather, orc, CSV, etc.?
Brian #2: FastAPI-Filter
- Arthur Rio
- Add query string filters to your API endpoints and show them in the Swagger UI.
- The supported backends are SQLAlchemy and MongoEngine.
- FastAPI-Filter documentation
- The philosophy of fastapi_filter is to be very declarative. You define the fields you want to be able to filter on as well as the type of operator, then tie your filter to a specific model.
- Default filters: neq, gt, gte, in, isnull, lt, lte, not/ne, not_in/nin, like/ilike
- The swagger support is actually quite cool.
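To make the declarative idea concrete, here is a standard-library-only sketch of how a field name plus an operator suffix can be turned into a filter. This is not fastapi_filter's API: the apply_filters helper, the OPERATORS table, and the field__operator naming are assumptions chosen for illustration, and the real library builds SQLAlchemy or MongoEngine queries rather than filtering Python dicts.

```python
import operator
from typing import Any, Callable, Dict, Iterable, List

# Hypothetical operator table: suffix -> predicate(field_value, filter_value).
OPERATORS: Dict[str, Callable[[Any, Any], bool]] = {
    "eq": operator.eq,
    "neq": operator.ne,
    "gt": operator.gt,
    "gte": operator.ge,
    "lt": operator.lt,
    "lte": operator.le,
    "in": lambda value, choices: value in choices,
    "not_in": lambda value, choices: value not in choices,
    "isnull": lambda value, expected: (value is None) == expected,
}


def apply_filters(rows: Iterable[dict], **filters: Any) -> List[dict]:
    """Keep the rows that satisfy every `field__operator=value` filter."""
    result = list(rows)
    for key, wanted in filters.items():
        # "age__gt" -> ("age", "gt"); a bare "age" falls back to equality.
        field, _, op_name = key.partition("__")
        predicate = OPERATORS[op_name or "eq"]
        result = [row for row in result if predicate(row.get(field), wanted)]
    return result


# Made-up sample data.
users = [
    {"name": "Ada", "age": 36},
    {"name": "Grace", "age": 45},
    {"name": "Linus", "age": None},
]

# Filters run in keyword order, so the null check comes before the comparison.
over_40 = apply_filters(users, age__isnull=False, age__gt=40)
print(over_40)
```

The point of the sketch is that the caller only declares which fields and operators are allowed; the translation into an actual query is left to the library.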
Michael #3: 12 Python Decorators to Take Your Code to the Next Level
- Decorators are awesome
- These are mostly home-grown decorators, plus some standard ones
- Notable ones:
- @wraps
- @lru_cache
- @repeat
- @timeit
- @retry ← no please use tenacity
- @countcall
- @rate_limited
- @dataclass
- @register
- @property
- @singledispatch
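For a feel of what the home-grown ones look like, here is a minimal sketch of timeit and countcall decorators built on functools.wraps, next to the standard lru_cache. These are my own simplified versions under assumed names, not the article's code; slow_square and fib are hypothetical example functions.

```python
import functools
import time


def timeit(func):
    """Print how long each call to `func` takes."""
    @functools.wraps(func)  # keep func.__name__ and __doc__ on the wrapper
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return func(*args, **kwargs)
        finally:
            elapsed = time.perf_counter() - start
            print(f"{func.__name__} took {elapsed:.6f}s")
    return wrapper


def countcall(func):
    """Track the number of calls in a `calls` attribute on the wrapper."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        wrapper.calls += 1
        return func(*args, **kwargs)
    wrapper.calls = 0
    return wrapper


@countcall
@timeit
def slow_square(n):
    """Return n squared after a short pause."""
    time.sleep(0.01)
    return n * n


@functools.lru_cache(maxsize=None)
def fib(n):
    """Naive Fibonacci, made fast by caching earlier results."""
    return n if n < 2 else fib(n - 1) + fib(n - 2)


result = slow_square(4)
print(result, slow_square.calls, fib(30))
```

Without functools.wraps, slow_square.__name__ would report "wrapper", which makes tracebacks and logs harder to read.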
Brian #4: PyHamcrest
- Contributed by Txels
- PyHamcrest is a framework for writing matcher objects, allowing you to declaratively define “match” rules.
- PyHamcrest tutorial
- Having a tool that allows you to pick out precisely the aspect under test and describe the values it should have, to a controlled level of precision, helps greatly in writing tests that are “just right.”
- From Brian: I’ve been reluctant to try matcher-style assertion helper libraries because, with pytest, assert works just fine. However, I can see cases where PyHamcrest assertions could help test readability, and that’s always a win.
- Examples:
- equality:
assert_that(theBiscuit, equal_to(myBiscuit))
- exceptions:
assert_that(calling(parse).with_args(bad_data), raises(ValueError))
- async:
assert_that(await resolved(future), future_raising(ValueError))
- boolean:
assert_that(theBiscuit.isCooked())
- There are predefined matchers for
- objects, numbers, text, logical checks, sequences, dictionaries
Extras
Brian:
- pytest tips and tricks - recent post, and discussion on upcoming Talk Python episode
- sharing pytest fixtures - placeholder page where I’ll share slides and code after my talk.
Michael: