Python Bytes is a weekly podcast hosted by Michael Kennedy and Brian Okken. The show is a short discussion on the headlines and noteworthy news in the Python, developer, and data science space.

#343 So Much Pydantic!

July 11, 2023 · 00:35:51 · 34.72 MB
Watch on YouTube

About the show

Sponsored by us! Support our work through:

Connect with the hosts

Join us on YouTube at pythonbytes.fm/live to be part of the audience. Usually Tuesdays at 11am PT. Older video versions available there too.

Michael #1: Pydantic v2 released

  • Pydantic V2 is compatible with Python 3.7 and above.
  • There is a migration guide.
  • Check out the bump-pydantic tool to automatically upgrade your classes.

Brian #2: Two Ways to Turbo-Charge tox

  • Hynek
  • Not just tox run-parallel, tox -p, or tox --parallel, though you should know about those too.
  • The 2 ways
    • Build one wheel instead of N sdists
    • Run pytest in parallel
  • tox builds source distributions, sdists, for each environment before running tests.
    • that’s not really what we want, especially if we have a test matrix.
    • It’d be better to build a wheel once, and use that for all the environments.
    • Add this to your tox.ini and now we get one wheel build:
        [testenv]
        package = wheel
        wheel_build_env = .pkg
    • It will save time, and a lot of it if you have a lengthy build.
  • Run pytest in parallel, instead of tox in parallel, with pytest -n auto
    • Requires the pytest-xdist plugin.
    • Can slow down tests if your tests are pretty fast anyway.
    • If you’re using hypothesis, you probably want to try this.
  • There are some gotchas and workarounds (like getting coverage to work) in the article.
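
A rough sketch of how the two tips might sit together in one tox.ini. The `package` and `wheel_build_env` keys come from the notes above; the `deps` and `commands` lines are assumptions added for illustration and are not taken from the article. tox reads this file itself, so `configparser` is used here only to show the layout and confirm it parses as INI.

```python
import configparser

# `package` and `wheel_build_env` are from the show notes.
# `deps` and `commands` are ASSUMPTIONS showing where pytest-xdist
# and `pytest -n auto` would typically go; adjust for your project.
TOX_INI = """\
[testenv]
package = wheel
wheel_build_env = .pkg
deps =
    pytest
    pytest-xdist
commands = pytest -n auto {posargs}
"""

# Interpolation is disabled so the text is read back exactly as written.
parser = configparser.ConfigParser(interpolation=None)
parser.read_string(TOX_INI)
testenv = parser["testenv"]

print(testenv["package"])
print(testenv["wheel_build_env"])
print(testenv["deps"].split())
print(testenv["commands"])
```

Because `wheel_build_env` names a single shared build environment, every test environment that inherits from `[testenv]` would install the same wheel rather than building its own sdist.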

Michael #3: Awesome Pydantic

  • A curated list of awesome things related to Pydantic! 🌪️
  • Notable items for me:
    • ML:
      • spaCy 🌟(26575) - spaCy is a free open-source library for Natural Language Processing in Python. It features NER, POS tagging, dependency parsing, word vectors and more.
      • ray 🌟(26496) - Ray provides a simple, universal API for building distributed applications.
      • jina 🌟(18734) - Jina is geared towards building search systems for any kind of data, including text, images, audio, video and many more. With the modular design & multi-layer abstraction, you can leverage the efficient patterns to build the system by parts, or chaining them into a Flow for an end-to-end experience.
    • Data
      • Beanie 🌟(1287) - Beanie is an asynchronous Python object-document mapper (ODM) for MongoDB, based on Motor and Pydantic.
    • Utilities
      • datamodel-code-generator 🌟(1694) - Pydantic model generator for easy conversion of JSON, OpenAPI, JSON Schema, and YAML data sources.
      • Goodconf 🌟(99) - A thin wrapper over Pydantic's settings management. Allows you to define configuration variables and load them from the environment or a JSON/YAML file. Also generates initial configuration files and documentation for your defined configuration.

Brian #4: CLI tools hidden in the Python standard library

  • Simon Willison (and hat tip to Seth Larson)
  • Simon looked for all of the command line goodies in the standard library.
  • I knew about python -m http.server to serve the current directory on port 8000, but there’s so much more.
  • Here are a few
    • python -m gzip --decompress pypi.db.gz as a gzip utility.
      • Especially handy on Windows as it doesn’t come with gzip by default
    • python -m base64 with -d to decode, -e to encode, and -t to run a self-test that encodes and decodes a sample string
    • python -m asyncio for an asyncio REPL
    • Tokenize a Python file with python -m tokenize somefile.py
    • View the AST with python -m ast somefile.py
    • Pretty print JSON with python -m json.tool
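
Each of these commands is backed by a standard library module that can also be called from Python code. The following is a minimal sketch of rough in-process equivalents, not a reproduction of what the command-line entry points do internally. The sample inputs are made up for illustration, and the `indent` argument to `ast.dump` needs Python 3.9 or newer.

```python
import ast
import base64
import gzip
import io
import json
import tokenize

# Sample inputs, invented for this sketch.
source = "x = 1 + 2\n"
record = {"show": "Python Bytes", "episode": 343}

# Roughly what `python -m gzip --decompress` does: undo gzip compression.
compressed = gzip.compress(b"pypi data")
restored = gzip.decompress(compressed)

# Roughly `python -m base64 -e` and `-d`: encode, then decode.
encoded = base64.b64encode(b"hello")
decoded = base64.b64decode(encoded)

# Roughly `python -m json.tool`: pretty-print JSON with indentation.
pretty = json.dumps(record, indent=4)

# Roughly `python -m ast somefile.py`: dump the parsed syntax tree.
tree_dump = ast.dump(ast.parse(source), indent=2)

# Roughly `python -m tokenize somefile.py`: list the token types.
token_names = [
    tokenize.tok_name[tok.type]
    for tok in tokenize.generate_tokens(io.StringIO(source).readline)
]

print(restored, encoded, decoded)
print(pretty)
print(tree_dump)
print(token_names)
```

The command-line forms are handier for one-off jobs in a shell; the function calls are the better fit when the data is already inside a Python program.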

Extras

Brian:

Michael:

Joke: