Python Bytes is a weekly podcast hosted by Michael Kennedy and Brian Okken. The show is a short discussion on the headlines and noteworthy news in the Python, developer, and data science space.
#330 Your data, validated 5x-50x faster, coming soon
Watch on YouTube About the show Sponsored by Influxdb Connect with the hosts Michael: @mkennedy@fosstodon.org Brian: @brianokken@fosstodon.org Show: @pythonbytes@fosstodon.org Join us on YouTube at pythonbytes.fm/live to be part of the audience. Usually Tuesdays at 11am PT. Older video versions available there too.
Brian #1: Pydantic V2 Pre Release, by Terrence Dorsey & Samuel Colvin. Alpha release available to everyone: pip install --pre -U "pydantic>=2.0a1". Headlines: pydantic-core: all validation logic rewritten in Rust and moved to a separate package, pydantic-core; 5-50x faster; the separation will aid safety and maintainability. Lots ready for experimentation: BaseModel, Dataclasses, Serialization, … Much still under construction: Docs, BaseSettings → pydantic-settings, …
Michael #2: microdot, the impossibly small web framework for Python and MicroPython. Microdot is a minimalistic Python web framework inspired by Flask, and designed to run on systems with limited resources such as microcontrollers. It runs on standard Python and on MicroPython. Support for async, websockets, TLS, even ASGI servers. Less memory usage by a big margin.
Brian #3: GitHub Actions Tools: watchgha, build and inspect, and pytest annotate failures. watchgha (Ned Batchelder): watch GH Actions progress on the command line. build-and-inspect-python-package (Hynek): test the build of wheels, check contents, lint README; print sdist contents, wheel contents, and metadata. pytest-github-actions-annotate-failures (utgwkk): nice traceback annotations for pytest.
Michael #4: PEP 709 – Inlined comprehensions, by Carl Meyer. Comprehensions are currently compiled as nested functions, which provides isolation of the comprehension’s iteration variable, but is inefficient at runtime. This PEP proposes to inline list, dictionary, and set comprehensions into the code where they are defined, and provide the expected isolation by pushing/popping clashing locals on the stack.
This change makes comprehensions much faster: up to 2x faster for a microbenchmark of a comprehension alone. Extras Michael: Python Web Apps that Fly with CDNs Course Joke: Can’t watch movies
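A quick illustration of the "isolation" the PEP 709 item talks about. This is only a sketch of the observable behavior, not of the compiler change itself; the function name squares_up_to is made up for the example. Whether the interpreter uses a nested function or inlining, the loop variable must not clobber a local of the same name.

```python
def squares_up_to(n):
    # A local that deliberately shares its name with the comprehension variable.
    x = "outer value"
    # The comprehension has its own x; it must not leak into the function scope.
    result = [x * x for x in range(n)]
    return result, x


result, outer_x = squares_up_to(5)
print(result)   # [0, 1, 4, 9, 16]
print(outer_x)  # outer value
```

The outer x keeps its value after the comprehension runs, which is the guarantee the PEP has to preserve while removing the nested function.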
#329 Creating very old Python code
Watch on YouTube About the show Sponsored by Microsoft for Startups Founders Hub.
Michael #1: Prefix-cache, via Brendan Hannigan. You can set an environment variable (PYTHONPYCACHEPREFIX) or use a command line argument (-X pycache_prefix=PATH) and then, instead of creating tons of __pycache__ folders to store your *.pyc files right next to the source code, Python puts them in the folder you specify. Introduced in Python 3.8.
Brian #2: NiceGUI. Suggested by several listeners. Browser based GUI. “NiceGUI is an easy-to-use, Python-based UI framework, which shows up in your web browser. You can create buttons, dialogs, Markdown, 3D scenes, plots and much more. It is great for micro web apps, dashboards, robotics projects, smart home solutions and similar use cases. You can also use it in development, for example when tweaking/configuring a machine learning algorithm or tuning motor controllers.” - from the README
Michael #3: flask-ngrok. A simple way to demo Flask apps from your machine. Makes your Flask apps running on localhost available over the internet via ngrok. Great for testing API consumers too.
    from flask import Flask
    from flask_ngrok import run_with_ngrok

    app = Flask(__name__)
    run_with_ngrok(app)  # Start ngrok when app is run

    # Endpoints ...

    if __name__ == '__main__':
        app.run()
Brian #4: No-async async with Python, by Will McGugan. Allowing async while not requiring async. "Await me (maybe)", borrowed from Simon Willison’s The “await me maybe” pattern for Python asyncio. "Optionally awaitable": providing API methods that can be called by both async and non-async code. The called method really is async, but if a caller doesn’t want to know when the code is done, it can ignore the return value and not await. MK: I had to solve a similar problem in fastapi-chameleon. MK: Syncify async functions.
Extras: Brian: PyPI has a blog. Docker no longer sunsetting free team plan.
Jokes: Long-lived software. Mysteries make life more interesting: last paragraph, discussing the cov fixture of pytest-cov.
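For the "await me maybe" item in this episode, here is a minimal sketch of the idea as I understand it: accept a plain value, a regular callable, or an async callable, and always hand back a resolved value. The helper and hook names are illustrative, and this is not code taken from Will McGugan's or Simon Willison's posts.

```python
import asyncio
import inspect


async def await_me_maybe(value):
    """Return a plain value, calling and/or awaiting it first if needed."""
    if callable(value):
        value = value()          # sync function or coroutine function
    if inspect.isawaitable(value):
        value = await value      # only await when there is something to await
    return value


def sync_hook():
    return "sync result"


async def async_hook():
    await asyncio.sleep(0)       # stand-in for real async work
    return "async result"


async def main():
    return [
        await await_me_maybe("plain value"),
        await await_me_maybe(sync_hook),
        await await_me_maybe(async_hook),
    ]


results = asyncio.run(main())
print(results)  # ['plain value', 'sync result', 'async result']
```

The point of the pattern is that library code can accept user hooks without forcing every user to write async functions.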
#328 We are going to need some context here
Watch on YouTube About the show Sponsored by Microsoft for Startups Founders Hub. Connect with the hosts Michael: @mkennedy@fosstodon.org Brian: @brianokken@fosstodon.org Show: @pythonbytes@fosstodon.org Join us on YouTube at pythonbytes.fm/live to be part of the audience. Usually Tuesdays at 11am PT. Older video versions available there too. Brian #1: zipapp Part of standard library since 3.5 Yet another thing I learned recently from Brett Cannon “This module provides tools to manage the creation of zip files containing Python code, which can be executed directly by the Python interpreter. The module provides both a Command-Line Interface and a Python API.” Including: Creating Standalone Applications with zipapp Michael #2: Reverse engineering the Apple News app with #python and #nerd power As we navigate the digital world, we often come across articles we don't have time to read but still want to save for later. One way to accomplish this is by using the Read Later feature in Apple News. But what if you want to access those articles outside the Apple News app, such as on a different device or with someone who doesn't use Apple News? Or what if you want to automatically post links to those articles on your blog? That's where the nerd powers come in. The linked article shows how to use Python to solve your own problem Leading to Rhet Turnbull’s CLI: apple-news-to-sqlite Brian #3: What is a context manager? Trey Hunner Also look at all the cool goodies in contextlib from standard library @contextmanager closing suppress redirect_stdout, redirect_stderr chdir Michael #4: nox-poetry: Use Poetry inside Nox sessions via 2 people: John Hagen and Marc Prewitt This package provides a drop-in replacement for the nox.session decorator, and for the nox.Session object passed to user-defined session functions. 
Comes from Claudio Jolowicz's hypermodern python cookiecutter. Covered this on Talk Python: talkpython.fm/episodes/show/362/hypermodern-python-projects This session performs the following steps: build a wheel from the local package; install the wheel as well as the pytest package; invoke pytest to run the test suite against the installation. Consider what would happen in this session if we had imported @session from nox instead of nox_poetry: Package dependencies would only be constrained by the wheel metadata, not by the lock file. In other words, their versions would not be pinned. The pytest dependency would not be constrained at all. Poetry would be installed as a build backend every time.
Extras: Brian: "Sharing is Caring: Sharing pytest fixtures" talk available at about 2:40:58 on the Day 2 video of PyCascades 2023. Also full Day 1 and Day 2. Michael: Wired connection to remote mesh router == wow! Using the Linksys Atlas Max 6E.
Joke: UnsafeWarnings
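To make the contextlib item from this episode a bit more concrete, here is a small sketch using three of the helpers listed there: @contextmanager, redirect_stdout, and suppress. The tagged() helper and the log list are invented for the example. chdir is left out because it needs Python 3.11 or newer.

```python
import io
from contextlib import contextmanager, redirect_stdout, suppress


@contextmanager
def tagged(label, log):
    """Record when the with-block is entered and left, even on errors."""
    log.append(f"enter {label}")
    try:
        yield label
    finally:
        log.append(f"exit {label}")


log = []
with tagged("demo", log) as name:
    log.append(f"inside {name}")

# redirect_stdout sends print() output into any file-like object.
buffer = io.StringIO()
with redirect_stdout(buffer):
    print("captured")

# suppress swallows the named exception and leaves the block early.
with suppress(KeyError):
    {}["missing"]
    log.append("never reached")

print(log)                # ['enter demo', 'inside demo', 'exit demo']
print(buffer.getvalue())  # captured
```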
#327 Untangling XML with Pydantic
Watch on YouTube About the show Sponsored by Compiler Podcast from Red Hat. Connect with the hosts Michael: @mkennedy@fosstodon.org Brian: @brianokken@fosstodon.org Show: @pythonbytes@fosstodon.org Join us on YouTube at pythonbytes.fm/live to be part of the audience. Usually Tuesdays at 11am PT. Older video versions available there too. Michael #1: pydantic-xml extension via Ilan Recall untangle. How about some pydantic in the mix? pydantic-xml is a pydantic extension providing model fields xml binding and xml serialization / deserialization. It is closely integrated with pydantic which means it supports most of its features. Brian #2: How virtual environments work Brett Cannon This should be required reading for anyone learning Python. Maybe right after “Hello World” and right before “My first pytest test”, approximately. Some history of environments Back in the day, there was global and your directory. How environments work structure: bin, include, and lib pyvenv.cfg configuration file How Python uses virtual environments What activation does, and that it’s optional. Yes, activation is optional. A new project called microvenv that helps VS Code. Mostly to fix the “Debian doesn’t ship python3 with venv” problem. It doesn’t include script activation stuff It’s super small, less than 100 lines of code, in one file. Michael #3: DbDeclare Declarative layer for your database. https://raaidarshad.github.io/dbdeclare/guide/controller/#example Sent in by creator raaid DbDeclare is a Python package that helps you create and manage entities in your database cluster, like databases, roles, access control, and (eventually) more. It aims to fill the gap between SQLAlchemy (SQLA) and infrastructure as code (IaC). You can: Declare desired state in Python Avoid maintaining raw SQL Tightly integrate your databases, roles, access control, and more with your tables Migrations like alembic coming too. 
Brian #4: Testing multiple Python versions with nox and pyenv, by Seth Michael Larson. This is a cool “what to do first” with nox. Specifically, how to use it to run pytest against your project on multiple versions of Python. Example noxfile.py is super small:
    import nox

    @nox.session(python=["3.8", "3.9", "3.10", "3.11", "3.12", "pypy3"])
    def test(session):
        session.install(".")
        session.install("-rdev-requirements.txt")
        session.run("pytest", "tests/")
How to run everything: nox or nox -s test. How to run single sessions: nox -s test-3.11 for just Python 3.11. Also how to get this to work with pyenv: pyenv global 3.8 3.9 3.10 3.11 3.12-dev This reminds me that I keep meaning to write a workflow comparison post about nox and tox.
Extras: Michael: GitHub makes 2FA mandatory next week for active developers. New adventure bike [image 1, image 2]. Who’s got good ideas for where to ride in the PNW? Wondering why I got it, here’s a fun video.
Joke: Case of the Mondays
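Related to the virtual environments item in this episode, here is a rough sketch of two things mentioned there: reading a pyvenv.cfg file and checking whether the running interpreter is inside an environment. The sample file contents and the function names are assumptions for illustration; the parsing is a simplified take on the "key = value" layout, not the interpreter's own logic.

```python
import sys


def parse_pyvenv_cfg(text):
    """Parse 'key = value' lines, as found in a venv's pyvenv.cfg."""
    settings = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if sep:  # skip lines without an equals sign
            settings[key.strip().lower()] = value.strip()
    return settings


def in_virtualenv():
    """True when running inside an environment created by venv."""
    return sys.prefix != sys.base_prefix


# Made-up sample; a real file lives at the root of the environment.
SAMPLE_CFG = """\
home = /usr/bin
include-system-site-packages = false
version = 3.11.2
"""

cfg = parse_pyvenv_cfg(SAMPLE_CFG)
print(cfg["home"], cfg["version"])
print("virtual environment active:", in_virtualenv())
```

The in_virtualenv() check works whether or not the environment was "activated", which fits the point that activation is optional.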
#326 Let's Go for a PyGWalk
Watch on YouTube About the show Sponsored by Microsoft for Startups Founders Hub. Connect with the hosts Michael: @mkennedy@fosstodon.org Brian: @brianokken@fosstodon.org Show: @pythonbytes@fosstodon.org Join us on YouTube at pythonbytes.fm/live to be part of the audience. Usually Tuesdays at 11am PT. Older video versions available there too.
Brian #1: Data Classification: Does Python still have a need for class without @dataclass? by Glyph. dataclasses have been in the language since 3.7. That’s pretty much all modern Python, right? “…, is there any point to having non-@dataclass classes any more? Is there any remaining justification for writing them in new code?” Options: class just becomes a dataclass if you have type-hinted members in it; data instead of class, to avoid decorators.
Michael #2: PyGWalker. Turn your pandas dataframe into a Tableau-style User Interface for visual analysis. Works with pandas and polars. Open-source alternative to Tableau. It allows data scientists to analyze data and visualize patterns with simple drag-and-drop operations.
Brian #3: An opinionated Python boilerplate, by Duarte O. Carmo. Tools and processes for new projects. pip-tools - Pip-tools strikes the right balance between simplicity, effectiveness, and speed, especially for generating pinned requirements.txt files, if necessary. pyproject.toml - for configuration: packaging, but also any tool that supports it. ruff, black, isort. No pre-commit hooks, just run it in CI.
Michael #4: Front Matter VS Code, via Mark Little. If you have content that supports frontmatter and is markdown-based, check this out. Stay in your editor and easily create, manage, and publish content. Don’t make front matter mistakes: When was it published? What is the timezone text formatting again? Learn new features of your existing static site (e.g. article image). Manage images and more.
Extras Brian: VSCode improves IntelliSense support for pytest in Feb release Michael: AI search wars get weird Proton Drive is Out of Beta, Available for Everyone Joke: Is your computer on? Is it on fire?
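For the @dataclass discussion in this episode, a side-by-side sketch may help show what the decorator saves you from writing. Both class names are made up; the hand-written version covers only __init__, __repr__, and __eq__, which is the part @dataclass generates by default.

```python
from dataclasses import dataclass


class PointByHand:
    """The boilerplate you write without @dataclass."""

    def __init__(self, x: int, y: int) -> None:
        self.x = x
        self.y = y

    def __repr__(self) -> str:
        return f"PointByHand(x={self.x!r}, y={self.y!r})"

    def __eq__(self, other: object) -> bool:
        if not isinstance(other, PointByHand):
            return NotImplemented
        return (self.x, self.y) == (other.x, other.y)


@dataclass
class Point:
    """Same fields; __init__, __repr__, and __eq__ are generated."""

    x: int
    y: int


print(PointByHand(1, 2))           # PointByHand(x=1, y=2)
print(Point(1, 2))                 # Point(x=1, y=2)
print(Point(1, 2) == Point(1, 2))  # True
```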
#325 It's called a merge conflict
Watch on YouTube About the show Sponsored by Microsoft for Startups Founders Hub. Connect with the hosts Michael: @mkennedy@fosstodon.org Brian: @brianokken@fosstodon.org Show: @pythonbytes@fosstodon.org Join us on YouTube at pythonbytes.fm/live to be part of the audience. Usually Tuesdays at 11am PT. Older video versions available there too. Michael #1: Python Parquet and Arrow: Using PyArrow With Pandas Parquet is an efficient, compressed, column-oriented storage format for arrays and tables of data. Less wrangle-able than Pandas, but way faster and lower memory Questions answered Can we use Pandas DataFrames and Arrow tables together, and if so, how is this done? (It turns out the answer is yes, and it’s quite simple, as we’ll see). In what ways are Arrow tables “better” than Pandas DataFrames? In other words, for which tasks are Arrow tables better suited? Conversely, what tasks are possible or easy in Pandas that are difficult or impossible in Arrow? As an on-disk format, how does Parquet compare to popular alternatives such as feather, orc, CSV, etc.? Brian #2: FastAPI-Filter Arthur Rio Add query string filters to your api endpoints and show them in the swagger UI. The supported backends are SQLAlchemy and MongoEngine. FastAPI-Filter documentation The philosophy of fastapi_filter is to be very declarative. You define the fields you want to be able to filter on as well as the type of operator, then tie your filter to a specific model. default filters: neq, gt, gte, in, isnull, lt, lte, not/ne, not_in, nin, like/ilike The swagger support is actually quite cool. 
Michael #3: 12 Python Decorators to Take Your Code to the Next Level. Decorators are awesome. These are mostly home-grown decorators, but some standard ones too. Notable ones: @wraps @lru_cache @repeat @timeit @retry ← no please use tenacity @countcall @rate_limited @dataclass @register @property @singledispatch
Brian #4: PyHamcrest. Contributed by Txels. PyHamcrest is a framework for writing matcher objects, allowing you to declaratively define “match” rules. PyHamcrest tutorial. Having a tool that allows you to pick out precisely the aspect under test and describe the values it should have, to a controlled level of precision, helps greatly in writing tests that are “just right.” From Brian: I’ve been reluctant to try matcher style assertion helper libraries, as, with pytest, assert works just fine. However, I can see cases where PyHamcrest assertions could help test readability, and that’s always a win. Examples: equality: assert_that(theBiscuit, equal_to(myBiscuit)) exceptions: assert_that(calling(parse, bad_data), raises(ValueError)) async: assert_that(await resolved(future), future_raising(ValueError)) boolean: assert_that(theBiscuit.isCooked()) There are predefined matchers for objects, numbers, text, logical checks, sequences, dictionaries.
Extras: Brian: pytest tips and tricks - recent post, and discussion on upcoming Talk Python episode. Sharing pytest fixtures - placeholder page where I’ll share slides and code after my talk. Michael: Python runtime updates. Django 4.2 beta 1 released.
Joke: A group of developers is called …
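As a sketch of what the decorators item in this episode is getting at, here is one home-grown decorator next to one from the standard library. The @countcall below is my own minimal version, not the article's code; it uses functools.wraps so the wrapped function keeps its name and docstring.

```python
from functools import lru_cache, wraps


def countcall(func):
    """Home-grown decorator: count how many times func is called."""

    @wraps(func)  # keep func.__name__ and func.__doc__ on the wrapper
    def wrapper(*args, **kwargs):
        wrapper.calls += 1
        return func(*args, **kwargs)

    wrapper.calls = 0
    return wrapper


@countcall
def greet(name):
    """Return a greeting."""
    return f"Hello, {name}!"


@lru_cache(maxsize=None)  # standard-library memoization
def fib(n):
    return n if n < 2 else fib(n - 1) + fib(n - 2)


greet("Brian")
greet("Michael")
print(greet.calls)     # 2
print(greet.__name__)  # greet
print(fib(30))         # 832040
```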
#324 JSON in My DB?
Watch on YouTube About the show Sponsored by Compiler Podcast from Red Hat. Connect with the hosts Michael: @mkennedy@fosstodon.org Brian: @brianokken@fosstodon.org Show: @pythonbytes@fosstodon.org Special guest, Erin Mullaney: @erinrachel@fosstodon.org Join us on YouTube at pythonbytes.fm/live to be part of the audience. Usually Tuesdays at 11am PT. Older video versions available there too. Brian #1: Use TOML for .env files? Brett Cannon .env files are used to store default settings that can be overridden by environmental variables. Possibly brought on by twelve-factor app design. Supported by python-dotenv, which is also used by pydantic, pipenv, and others. One issue is that it’s not a defined standard. from python-dotenv docs “The format is not formally specified and still improves over time. That being said, .env files should mostly look like Bash files.” Adafruit decided that an upcoming CircuitPython will use TOML as the format for settings.toml files, which are to be used mostly how .env files are being used. Brett notices this may fix things for Python for VS Code, and other people as well. So… Is this a good idea? I think so. Michael #2: Pydantic gets serious funding via Mark Little (was on episode 285) Sequoia backs open source data-validation framework Pydantic to commercialize with cloud services. Pydantic Services Inc. emerges from stealth today with $4.7 million in seed funding. Pydantic’s new commercial entity will incorporate a swath of new tools and services that are both “powered-by and inspired-by the Pydantic library” Pydantic will start with an initial team of six, with the first three engineers based in Montana, Chicago and Berlin. 
“With $4.7 million in the bank, Colvin said that they’re continuing to rewrite parts of Pydantic in Rust, with a view toward making it more efficient via a ten-fold performance improvement.” Erin #3: JSON Fields for performance (Denormalization) David Stokes Using JSON fields when you design your databases is a good way to improve database query performance. Brian #4: f-strings with pandas and Jupyter keyboard shortcuts Kevin Markham After a couple year break from blogging, friend of the show Kevin Markham has a couple great, short, useful posts. How to use Python's f-strings with pandas My favorite bit is the part about using f-strings for dictionary keys Fly through Jupyter with keyboard shortcuts 🚀 I’m a sucker for a rocket emoji Not an overwhelming list. Just the essentials for even the casual Jupyter user. Examples Esc and Enter for command mode/edit mode a and b for creating a new cell above or below current cell. m and y for changing the cell type to Markdown or code. Shift+m to merge cells so many more - Michael #5: BioGPT “GPT” for biomedical text generation and mining As motivation, let’s see what ChatGPT can do with arrow anti-patterns in Python. Smaller models and “Large” models Used via an API rather than chat style. BioGPT has also been integrated into the Hugging Face transformers library too Play with it here. Erin #6: Code Mentorship and Communicating with Newer Devs Sheena O’Connell Sheena O’Connell gave a talk at DjangoCon about her work at Umuzi, training unemployed young people in underserved communities in Africa and also was on Django Chat Podcast. Dmitriy Chukhin Caktus Group is trying a new mentorship program for folks who don’t have the necessary training. 
Extras: Michael: The news is, these are no longer news: Security Researchers Uncover 700+ Malicious Open-Source Packages in npm and PyPI; Git security vulnerabilities announced, again. Git ignores: https://github.com/github/gitignore and https://gitignore.io Erin: DjangoCon is in October in Durham, NC this year (Oct 15-20). Joke: Remember your pointers?
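On the .env item from this episode: the idea that a file holds defaults and real environment variables override them can be sketched in a few lines. This is a hand-rolled illustration, not python-dotenv's API or parsing rules (the format is not formally specified, as the notes say); the function names and sample keys are assumptions.

```python
import os


def parse_env_text(text):
    """Parse simple KEY=VALUE lines; skip blanks and # comments."""
    values = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if not sep:
            continue
        # Drop surrounding quotes, Bash style.
        values[key.strip()] = value.strip().strip("'\"")
    return values


def load_settings(env_text, environ=None):
    """File values are defaults; the process environment wins."""
    environ = os.environ if environ is None else environ
    settings = parse_env_text(env_text)
    for key in settings:
        if key in environ:
            settings[key] = environ[key]
    return settings


SAMPLE = """\
# defaults for local development
DEBUG=true
DATABASE_URL="sqlite:///dev.db"
"""

settings = load_settings(SAMPLE, environ={"DEBUG": "false"})
print(settings)  # {'DEBUG': 'false', 'DATABASE_URL': 'sqlite:///dev.db'}
```

Passing environ explicitly keeps the example deterministic; leaving it out falls back to os.environ.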
#323 AI search wars have begun
Watch on YouTube About the show Sponsored by Microsoft for Startups Founders Hub. Connect with the hosts Michael: @mkennedy@fosstodon.org Brian: @brianokken@fosstodon.org - may be a minute or two late. Show: @pythonbytes@fosstodon.org Special guest: Pamela Fox - @pamelafox@fosstodon.org Join us on YouTube at pythonbytes.fm/live to be part of the audience. Usually Tuesdays at 11am PT. Older video versions available there too. Michael #1: camply A tool to find campsites at sold out campgrounds through sites like recreation.gov and Yellowstone Finding reservations at sold out campgrounds can be tough. Searches the APIs of booking services like recreation.gov (which indexes thousands of campgrounds across the USA) to continuously check for cancellations and availabilities to pop up. Once a campsite becomes available, camply sends you a notification to book your spot! Want to camp in a tower in California? camply campgrounds --search "Fire Lookout Towers" --state CA Brian #2: hatch-fancy-pypi-readme Your ✨Fancy✨ Project Deserves a ✨Fancy✨ PyPI Readme! 🧐 Hynek Schlawack Include lots of extras in a README.md text fragments files, like AUTHORS.md or Changelog.md, with custom start, stop, pattern includes, etc. regular expression substitutions Several projects with examples, including black. Pamela #3: Pyodide dev branch now supports 3.11 Python 3.11 PR Benchmark Py3.11 and Py3.10 pyodide console TODO list for 0.23.0 alpha release Dis-this: specializing adaptive interpreter Recursion visualizer Michael #4: EU hates open source? via Pamphile Roy The Cyber Resilience Act (CRA) is an interesting and important proposal for a European law that aims to drive the safety and integrity of software The proposal includes a requirement for self-certification by suppliers of software to attest conformity with the requirements of the CRA including security, privacy and the absence of Critical Vulnerability Events (CVEs). 
We recognize that the European Commission has framed an exception in recital 10 attempting to ensure these provisions do not accidentally impact Open Source software. However, drawing on more than two decades of experience, we at the Open Source Initiative can clearly see that the current text will cause extensive problems for Open Source software. Since the goal is to avoid harming Open Source software this goal should be stated at the start of the paragraph as the rationale, replacing the introductory wording about avoiding harm to "research and innovation" to avoid over-narrowing the exception. The reference to "non-commercial" as a qualifier should be substituted. The term “commercial” has always led to legal uncertainty for software and is a term which should not be applied in the context of open source OSI recommends further work on the Open Source exception to the requirements within the body of the Act to exclude all activities prior to commercial deployment of the software and to clearly ensure that responsibility for CE marks does not rest with any actor who is not a direct commercial beneficiary of deployment. Brian #5: So, Single (‘) or Double (“) Quotes in Python? Marcin Kozak PEP8 doesn’t recommend anything. REPL uses single quotes. >>> x = "one" >>> x 'one' Black sides with “double quotes”, due to the apostrophe in the string problem. 'Don\'t be so sad.' vs “Don’t be sad.” You get to pick, and don’t be bullied by black-fanatics. There’s always blue, which is just like black, but defaults to single-quotes line length defaults to 79, not black’s 88. preserves whitespace before hash marks for right hanging comments (so multiple lines can line up). Pamela #6: Frozen-Flask Pamela’s PR for moving to Frozen Flask Stepping down as a maintainer Extras Brian: What does everyone think of GitHub pricing? Michael: Much much better transcripts, for example, this episode. 
Means our search works way better too The AI search wars have begun - Google Panics Over ChatGPT [The AI Wars Have Begun] video Microsoft Bing rockets to the top of the App Store after announcing ChatGPT integration Google shares lose $100 billion after company's AI chatbot makes an error during demo Free PyCharm for all the Talk Python customers Thanks for the help with finding a good Flutter dev. Important Talk Python episode: Fusion Ignition Breakthrough and Python Pamela: Github pyproject.toml support. Python Package Template Jokes: $McTitle Worst input fields
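The quotes item in this episode is easy to check for yourself. A small sketch of the REPL behavior described there: repr() prefers single quotes, but switches to double quotes when the string contains an apostrophe. The variable names are just for the example.

```python
plain = "one"
with_apostrophe = "Don't be sad."
escaped = 'Don\'t be sad.'  # same string, written with single quotes

print(repr(plain))                 # 'one'
print(repr(with_apostrophe))       # "Don't be sad."
print(escaped == with_apostrophe)  # True
```

Both spellings produce the same string object value; the choice only affects how the source reads.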
#322 Python Packages, Let Me Count The Ways
Watch on YouTube About the show Sponsored by Microsoft for Startups Founders Hub. Connect with the hosts Michael: @mkennedy@fosstodon.org Brian: @brianokken@fosstodon.org Show: @pythonbytes@fosstodon.org Special guest: @calvinhp@fosstodon.org Join us on YouTube at pythonbytes.fm/stream/live to be part of the audience. Usually Tuesdays at 11am PT. Older video versions available there too.
Brian #1: Packaging Python Projects. Tutorial from PyPA. This is a really good starting point to understand how to share Python code through packaging. Includes discussion of: directory layout; creating package files, LICENSE, pyproject.toml, README.md, tests and src dir; how to fill out the build-system section of pyproject.toml using either hatchling, setuptools, flit, or pdm as backends; metadata; using build to generate wheels and tarballs; uploading with twine. However: for small-ish pure Python projects, I still prefer flit. flit init creates pyproject.toml and LICENSE (will probably still need to hand tweak pyproject.toml), flit build replaces build, flit publish replaces twine. The process can be confusing, even for seasoned professionals. Further discussion later in the show.
Michael #2: untangle xml. Convert XML to Python objects. Children can be accessed with parent.child, attributes with element['attribute']. Call the parse() method with a filename, a URL or an XML string. Given an XML document (the sample markup is missing from these notes), access the document: obj.root.child['name'] # u'child1' A little cleaner than ElementTree perhaps.
Calvin #3: Mypy 1.0 Released. Mypy is a static type checker for Python, basically a Python linter on steroids. Started in 2012 and developed by a team at Dropbox led by https://github.com/JukkaL What’s New?
New Release Numbering Scheme, not using semver. Significant backward incompatible changes will be announced in the blog post for the previous feature release; feature flags will allow users to upgrade and turn on the new behavior. Mypy 1.0 is 40% faster than 0.991 against the Dropbox internal codebase; 20 optimizations included in this release. Mypy now warns about variables used before definition or possibly undefined variables, for example if a variable is used outside of a block of code that may not execute. Mypy now supports the new Self type introduced in PEP 673 and Python 3.11. Support ParamSpec in Type Aliases. Also, ParamSpec and Generic Self types are no longer experimental. Lots of Miscellaneous New Features. Fixes to crashes. Support for compiling Python match statements introduced in Python 3.10.
Brian #4: Thoughts on the Python packaging ecosystem, by Pradyun Gedam. Some great background on the internal tension around packaging. Brian’s note: in the meantime people are struggling to share Python code; the “best practice” answer seems to shift regularly; this might be healthy to arrive at better tooling in the long term, but in the short term, it’s hurting us. From the article: The Python packaging ecosystem unintentionally became the type of competitive space that it is today. The community needs to make an explicit decision if it should continue operating under the model that led to status quo. Pick from N different tools that do N different things is a good model. Pick from N ~equivalent choices is a really bad user experience. Picking a default doesn’t make other approaches illegal. Communication about the Python packaging ecosystem is fragmented, and we should improve that.
Pradyun: “Many of the users who write Python code are not primarily full-time software engineers or “developers”.” from Thea: “The reason there are so many tools for managing Python dependencies is because Python is not a monoculture and different folks need different things.” opening up the build backend through pyproject.toml-based builds was good but the fracturing of multiple “workflow” tools seems bad. “I am certain that it is not possible to create a single “workflow” tool for Python software. What we have today, an ecosystem of tooling where each makes different design choices and technical trade-offs, is a part of why Python is as widespread as it is today. This flexibility and availability of choice is, however, both a blessing and a curse.” On building a default workflow tool around pip interesting idea There’s tension between “we need a default workflow tool” and “unix philosophy: many focused tools that can work together”. Michael #5: Top PyPI Packages A monthly dump of the 5,000 most-downloaded packages from PyPI. Also, a full copy of PyPI info too: github.com/orf/pypi-data Calvin #6: SQLAlchemy 2.0 Released #57 on the Top PyPI Packages 😸 Will be giving a SQLAlchemy tutorial at Python Web Conf What’s New? 
Significant API change from 1.4. You’ll want to follow the migration guide and see also the what’s new in 2.0 guide. Fully takes advantage of Python 3 features such as dataclasses, enums and inline annotations. Typing support in Core and ORM, but still should be considered beta; all SQLAlchemy stubs packages must be uninstalled for typing to work; the Mypy plugin is considered deprecated now. Major speed increase in the all new fully ORM-integrated bulk INSERTs; sorry if you are on MySQL, they don’t support INSERT RETURNING yet, but MariaDB does support this. All new bulk optimized schema reflection architecture: currently enabled for PostgreSQL and Oracle; 250% perf increase for Postgres, 900% perf increase for Oracle. Native extensions ported to Cython: C extensions have been replaced by Cython; benchmarks as fast or sometimes faster than the previous C extensions; removes some risk of memory or stability issues introduced by C. SQLAlchemy is now PEP 517 enabled and has a pyproject.toml at the root, which means that local source building with pip can auto install the Cython dependency.
Extras: Brian: Nothing to share yet, but I’m building a new alternative Python build backend, which of course will be followed by a new workflow tool that follows “my workflow”. Michael: “Create shortcut: New window” tip: in the dock/task bar; running as an app. Speaking of Proton, started using simplelogin.io What’s all this banning chips about? Great documentary. Talk Python is hiring! Calvin: 5th Annual Python Web Conf 2023
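For the untangle item earlier in this episode ("a little cleaner than ElementTree perhaps"), here is what the standard-library ElementTree version of that lookup might look like, for comparison. The XML string is an assumption: the original sample markup did not survive in these notes, so this is a guess at a document for which obj.root.child['name'] would give 'child1'.

```python
import xml.etree.ElementTree as ET

# Assumed sample document; not the original markup from the show notes.
XML_TEXT = '<root><child name="child1"/></root>'

root = ET.fromstring(XML_TEXT)  # the <root> element
child = root.find("child")      # first <child> under root

print(root.tag)           # root
print(child.get("name"))  # child1
```

With untangle the same access reads as attribute and item lookups on objects; with ElementTree it goes through find() and get().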
#321 A Memorial To Apps Past
Watch on YouTube About the show Sponsored by Microsoft for Startups Founders Hub. Connect with the hosts Michael: @mkennedy@fosstodon.org Brian: @brianokken@fosstodon.org Show: @pythonbytes@fosstodon.org Join us on YouTube at pythonbytes.fm/stream/live to be part of the audience. Usually Tuesdays at 11am PT. Older video versions available there too. Michael #1: git-sim Visually simulate Git operations in your own repos with a single terminal command. Generates an image (default) or video visualization depicting the Git command's behavior. Features Run a one-liner git-sim command in the terminal to generate a custom Git command visualization (.jpg) from your repo Supported commands: log, status, add, restore, commit, stash, branch, tag, reset, revert, merge, rebase, cherry-pick Generate an animated video (.mp4) instead of a static image using the --animate flag (note: significant performance slowdown, it is recommended to use --low-quality to speed up testing and remove when ready to generate presentation-quality video) Choose between dark mode (default) and light mode Animation only: Add custom branded intro/outro sequences if desired Animation only: Speed up or slow down animation speed as desired See images and animations on the github readme. Brian #2: Why I Like Nox Hynek Schlawack I like tox and have wanted to try nox but couldn’t think of good reasons for a switch. Hynek is a fan of both, so it’s nice to read his perspective. The article starts with comparing doing the same thing in both testing with Python 3.10 and 3.11 and adding the ability to pass in pytest arguments. even with this example, I do admit that the nox example is easier to read, but a bit more verbose. A second example of running a specific example combination of library and Python is quite a bit longer in nox, but there’s an interesting commentary: “… this is longer than the tox equivalent. 
But that’s because it’s more explicit and anyone with a passing understanding of Python can deduce what’s happening here – including myself, looking at it in a year. Explicit can be good, actually.” Other benefits: It’s a Python file with Python functions, so you have all of Python at your disposal when developing sessions to run. It’s not “ini format”. Complex ini files get out of hand quickly. nox has Python versions as first-class selectors. Final note: “Again, this article is not a call to abandon tox and move all your projects to Nox – I haven’t done that myself and I don’t plan to. But if my issues resonate with you, there’s an option!” Michael #3: I scanned every package on PyPi and found 57 live AWS keys Scanning every release published to PyPI found 57 valid access keys. Detecting AWS keys is actually fairly simple. A keypair consists of two components: the key ID and the key secret. The key ID can be detected with the regular expression ((?:ASIA|AKIA|AROA|AIDA)([A-Z0-7]{16})). The secret key can be detected with a much more general [a-zA-Z0-9+/]{40}. Static PyPI data: github.com/orf/pypi-data Brian #4: Getting Started With Property-Based Testing in Python With Hypothesis and pytest Rodrigo Girão Serrão Hypothesis and property-based testing can be overwhelming at first. So focused intro posts are quite helpful. This post focuses on a couple of examples, gcd(), greatest common divisor, and my_sort(), a custom list sorter. Good discussion of how property-based testing is different and how to do it successfully, especially the order of development: focus on developing properties of correct answers develop a test that checks those properties use hypothesis strategies to come up with input pick @examples if necessary narrow the range of input if necessary caveat: I would have preferred hypothesis.assume() to limiting input in the first example. 
assume(not (n == m == 0)) see https://hypothesis.readthedocs.io/en/latest/details.html#hypothesis.assume add more testing outside of hypothesis In my experience it’s often easier for me to develop code with non-hypothesis test cases, then follow up with hypothesis. But after works also. The mental gymnastics of thinking of properties for algorithmic code is worthwhile. Extras Michael: First stream from the sweet new mac mini. Ivory released for Mastodon, but others too. Nice memorial https://tapbots.com/tweetbot/ We’ll be doing a live in-person event at PyCon, become a friend of the show to get notified. Joke: Didn't come here to be called out
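A rough sketch of the key detection described in Michael #3, using only the two regular expressions quoted in the notes. The function name find_candidate_keys and the sample text are made up for illustration, and the credentials are synthetic strings, not real keys. This only finds candidates; the notes do not say how the article checked that keys were live, so no validation is attempted here.

```python
import re

# The two patterns quoted in the show notes.
KEY_ID_RE = re.compile(r"((?:ASIA|AKIA|AROA|AIDA)([A-Z0-7]{16}))")
SECRET_RE = re.compile(r"[a-zA-Z0-9+/]{40}")


def find_candidate_keys(text):
    """Return (key_ids, secrets) for strings in text that look like AWS credentials.

    Hypothetical helper: it only pattern-matches and does not check validity.
    """
    key_ids = [match.group(1) for match in KEY_ID_RE.finditer(text)]
    secrets = SECRET_RE.findall(text)
    return key_ids, secrets


# Synthetic sample: a fake key ID (prefix + 16 characters) and a fake 40-character secret.
fake_key_id = "AKIA" + "A" * 16
fake_secret = "b" * 40
sample = (
    'aws_access_key_id = "' + fake_key_id + '"\n'
    'aws_secret_access_key = "' + fake_secret + '"\n'
)

key_ids, secrets = find_candidate_keys(sample)
print(key_ids)
print(secrets)
```

The secret pattern is deliberately loose, so in real files it would likely match many unrelated 40-character strings; pairing each hit with a nearby key ID is one plausible way to cut the noise.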
#320 The Bug Is In The JavaScript
Watch on YouTube About the show Sponsored by us! Support our work through: Our courses at Talk Python Training Test & Code Podcast Patreon Supporters Connect with the hosts Michael: @mkennedy@fosstodon.org Brian: @brianokken@fosstodon.org Show: @pythonbytes@fosstodon.org Join us on YouTube at pythonbytes.fm/stream/live to be part of the audience. Usually Tuesdays at 11am PT. Older video versions available there too. Brian #1: markdown-it-py Yes, another markdown parser. Rich recently switched markdown parsers, from commonmark to markdown-it-py. Let’s look at those a bit. Michael #2: Sketch via Jake Firman Sketch is an AI code-writing assistant for pandas users that understands the context of your data A Natural Language interface that successfully navigates many tasks in the data stack landscape. Data Cataloging: General tagging (e.g. PII identification) Metadata generation (names and descriptions) Data Engineering: Data cleaning and masking (compliance) Derived feature creation and extraction Data Analysis: Data questions Data visualization Watch the video on the GitHub page for a quick intro Brian #3: Fixing Circular Imports in Python with Protocol Built on Subclassing in Python Redux from Hynek We covered this in the summer of 2021, episode 240 However, I re-read it recently due to a typing problem The problem arises when an object passes itself to another module to be called later. This is common in many design patterns, including just normal callback functions. Normally not a problem with Python, due to duck typing. But with type hints, suddenly it seems like both modules need types from the other. So how do you have two modules use types from each other without a circular import? Hynek produces two options Abstract Data Types, aka Interfaces, using the abc module Requires a third interface class Structural subtyping with Protocol This is what I think I’ll use more often and I’m kinda in love with it now that I understand it. 
Still has a third type, but one of the modules doesn’t have to know about it. “Structural Subtyping: Structural subtyping is duck typing for types: if your class fulfills the constraints of a Protocol (https://docs.python.org/3/library/typing.html#typing.Protocol), it’s automatically considered a subtype of it. Therefore, a class can implement many Protocols from all kinds of packages without knowing about them!” The Fixing Circular Imports in Python with Protocol article walks through one example of two classes talking with each other, typing, circular imports, and fixing them with Protocol Michael #4: unrepl via/by Ruud van der Ham We’ve seen the code samples: >>> board = [] >>> for i in range(3): ... row = ['_'] * 3 ... board.append(row) ... >>> board [['_', '_', '_'], ['_', '_', '_'], ['_', '_', '_']] >>> board[2][0] = 'X' >>> board [['_', '_', '_'], ['_', '_', '_'], ['X', '_', '_']] But you cannot really run this code. You can’t paste it into a REPL yourself nor can you put it into a .py file. So you unrepl it: Copy the above code to the clipboard and run unrepl. Paste the result and now you can. Unrepl can be used as a command line tool but also as a module. The REPL functionality of underscore (_) to get access to the last value is also supported. Extras Michael: You'll want to update your git ASAP. Get course releases at Talk Python via RSS Gist for using Turnstile with Python + Pydantic Joke: there's a bug in the js You’ve checked all your database indexes, You’ve tuned all your API hooks, You’re starting to think That you might need a drink, Because there’s only one place left to look: … There must be a bug in the javascript Because everything else was built properly But the frontend’s a pile of crap ;)
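To make the Protocol idea from Brian #3 concrete, here is a minimal sketch of structural subtyping. It is not the example from the article: the names ClickHandler, Button, and Window are made up, and everything sits in one file, whereas the real circular-import problem needs the classes in separate modules.

```python
from typing import Protocol


class ClickHandler(Protocol):
    """The 'third type': anything with a matching on_click method qualifies."""

    def on_click(self, label: str) -> str: ...


class Button:
    # In a multi-module project, this module would import only ClickHandler,
    # never the concrete Window class, so there is no import cycle.
    def __init__(self, label: str, handler: ClickHandler) -> None:
        self.label = label
        self.handler = handler

    def click(self) -> str:
        return self.handler.on_click(self.label)


class Window:
    # Window never mentions ClickHandler. It satisfies the Protocol purely
    # by having an on_click method with a compatible signature.
    def __init__(self) -> None:
        self.button = Button("OK", self)  # the object passes itself along

    def on_click(self, label: str) -> str:
        return f"{label} clicked"


result = Window().button.click()
print(result)
```

At runtime this is plain duck typing; the Protocol only matters to a static type checker, which can verify that Window is an acceptable ClickHandler without the two classes inheriting from anything shared.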
#319 CSS-Style Queries for... JSON?
Watch on YouTube About the show Sponsored by Microsoft for Startups Founders Hub. Connect with the hosts Michael: @mkennedy@fosstodon.org Brian: @brianokken@fosstodon.org Show: @pythonbytes@fosstodon.org Join us on YouTube at pythonbytes.fm/stream/live to be part of the audience. Usually Tuesdays at 11am PT. Older video versions available there too. Michael #1: Secure maintainer workflow by Ned Batchelder We are the magicians, but also the gatekeepers for our users Terminal sessions with implicit access to credentials first is unlikely: a bad guy gets onto my computer and uses the credentials to cause havoc second way is a more serious concern: I could unknowingly run evil or buggy code that uses my credentials in bad ways. Mitigations 1Password: where possible, I store credentials in 1Password, and use tooling to get them into environment variables. Side bar: Do not use lastpass, see end segment I can have the credentials in the environment for just long enough to use them. This works well for things like PyPI credentials, which are used rarely and could cause significant damage. Docker: To really isolate unknown code, I use a Docker container. Brian #2: Tools for parsing HTML and JSON Learned these from A Year of Writing about Web Scraping in Review Parsel - extract and remove data from HTML using XPath and CSS selectors jmespath - “James Path” - declaratively specify how to extract elements from a JSON document Michael #3: git-sizer Compute various size metrics for a Git repository, flagging those that might cause problems. Tip, partial clone: git clone --filter=blob:none URL # Stats for training.talkpython.fm # Full: git clone repo Receiving objects: 100% (118820/118820), 514.31 MiB | 28.83 MiB/s, done. Resolving deltas: 100% (71763/71763), done. Updating files: 100% (10792/10792), done. 1.01 GB on disk # Partial: git clone --filter=blob:none repo Receiving objects: 100% (10120/10120), 220.25 MiB | 24.92 MiB/s, done. Resolving deltas: 100% (1454/1454), done. 
Updating files: 100% (10792/10792), done. 694.4 MB on disk Partial clone is a performance optimization that “allows Git to function without having a complete copy of the repository. The goal of this work is to allow Git better handle extremely large repositories.” When changing branches, Git may download more missing files. Not the same as shallow clones or sparse checkouts Consider shallow clones for CI/CD/deployment Sparse checkouts for a slice of a monorepo Brian #4: Dataclasses without type annotations Probably file this under “don’t try this at home”. Or maybe “try this at home, but not at work”. Or just “that Brian fella is a bad influence”. What! It’s not me. It’s Adrian, the dude that wrote the article. Unless you’re using a type checker, for dataclasses, “… use any type you want. If you're not using a static type checker, no one is going to care what type you use.” @dataclass class Literally: anything: ("can go", "in here") as_long_as: lambda: "it can be evaluated" # Now, I've noticed a tendency for this program to get rather silly. hell: with_("from __future__ import annotations") it_s: not even.evaluated it: just.has(to=be) * syntactically[valid] # Right! Stop that! It's SILLY! Extras Michael: LastPass story just keeps getting worse We will see problems in supply chains because of this too A whole 2 hour discussion diving into what I touched on: twit.tv/shows/security-now Got your new mac mini yet? Or MacBook Pro? Joke: Developer/maker, what’s my purpose?
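A trimmed, runnable take on the point in Brian #4, assuming no static type checker is involved. It keeps only the first two fields from the snippet above and adds a hypothetical instance to show that the dataclass machinery records the annotations but never checks values against them.

```python
from dataclasses import dataclass, fields


@dataclass
class Literally:
    # The annotations are ordinary expressions, evaluated once at class creation.
    anything: ("can go", "in here")
    as_long_as: lambda: "it can be evaluated"


# Values are not validated against the "types" in any way.
item = Literally(anything=1, as_long_as="two")
field_names = [f.name for f in fields(item)]

print(item)
print(field_names)
print(Literally.__annotations__["anything"])
```

The annotation only needs to exist so that dataclass treats the name as a field; what it evaluates to is stored in __annotations__ and otherwise ignored at runtime.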
#318 GIL, How We Will Miss You
Watch on YouTube About the show Sponsored by Microsoft for Startups Founders Hub. Connect with us Michael: @mkennedy@fosstodon.org Brian: @brianokken@fosstodon.org Show: @pythonbytes@fosstodon.org Join us on YouTube at pythonbytes.fm/stream/live to be part of the audience. Usually Tuesdays at 11am PT. Older video versions available there too. Brian #1: PEP 703 - Making the GIL Optional in CPython Author: Sam Gross Sponsor: Łukasz Langa Draft status, but on Standards Track, targeting Python 3.12 Suggested by: Will Shanks “The GIL is a major obstacle to concurrency.” Especially for scientific computing. PEP 703 proposes adding a --without-gil build configuration to CPython to let it run code without the global interpreter lock and with the changes needed to make the interpreter thread-safe. The PEP includes several issues with the GIL and scikit-learn, PyTorch, NumPy, Pillow, and other numerically intensive libraries. Python’s GIL makes it difficult to use modern multi-core CPUs efficiently for many scientific and numeric computing applications. There’s also a section on how the GIL makes many types of parallelism difficult to express. Changes primarily in internals, and not much exposed to public Python and C APIs: Reference counting Memory management Container thread-safety Locking and atomic APIs Includes information on all of these challenges. Distribution C-API extension authors will need access to a --without-gil Python to modify their projects and supply --without-gil versions. Sam is proposing “To mitigate this, the author will work with Anaconda to distribute a --without-gil version of Python together with compatible packages from conda channels. 
This centralizes the challenges of building extensions, and the author believes this will enable more people to use Python without the GIL sooner than they would otherwise be able to.” Michael #2: FerretDB Via Jon Bultmeyer A truly Open Source MongoDB alternative MongoDB abandoned its Open-Source roots, changing the license to SSPL making it unusable for many Open Source and Commercial Projects. The core of our solution is a stateless proxy, which converts MongoDB protocol queries to SQL, and uses PostgreSQL as a database engine. FerretDB will be compatible with MongoDB drivers and will strive to serve as a drop-in replacement for MongoDB 6.0+. First release back in Nov 2022 I still love you MongoDB ;) Brian #3: Four tips for structuring your research group’s Python packages David Aaron Nicholson Not PyPI packages, but, you know, directories with __init__.py in them. Corrections for mistakes I see frequently Give your packages and modules terse, single-word names whenever possible. Import modules internally, instead of importing everything from modules. Make use of sub-packages. Prefer modules with very specific names containing single functions over modules with very general names like utils, helpers, or support that contain many functions. Michael #4: Quibbler Quibbler is a toolset for building highly interactive, yet reproducible, transparent and efficient data analysis pipelines. One import statement and matplotlib becomes interactive. Check out the video on the repo page. Extras Brian: And now for something completely different: turtles talk Michael: More RSS recommendations FreshRSS a self-hosted RSS and Atom feed aggregator. Feedly (for AI) Flym for Android Readwise is very interesting RSS for courses at Talk Python New article: Dev on the Road Joke: Testing the program Joke: Every Cloud Architecture
#317 Most loved and most dreaded dev tools of 2022
Watch on YouTube About the show Sponsored by Microsoft for Startups Founders Hub. Connect with the hosts Michael: @mkennedy@fosstodon.org Brian: @brianokken@fosstodon.org Show: @pythonbytes@fosstodon.org Michael #1: StackOverflow 2022 Developer Survey Last year we saw Git as a fundamental tool to being a developer. This year it appears that Docker is becoming a similar fundamental tool for Professional Developers, increasing from 55% to 69%. Language: Rust is […] the most loved language with 87% of developers saying they want to continue using it. JS Frameworks: Angular.js is in its third year as the most dreaded. Let me Google that for you: 62% of all respondents spend more than 30 minutes a day searching for answers or solutions to problems. 25% spend more than an hour each day. The demise of the full-stack developer is overrated. I do wish there were more women in the field. Databases: Postgres is #1 and MongoDB is still going strong. The “which web framework do you use?” question is a full on train wreck. Why is it so hard for people to write this question? Node.js or Express (built on Node) vs. FastAPI or Flask (but no Python?) Most wanted / loved language is Rust (loved) and Python/Rust tied for most wanted. Worked with vs. want to work with has some interesting graphics. Brian #2: PePy.tech - PyPI download stats with package version breakdown Petru Rares Sincraian We’ve discussed pypistats.org before, which highlights daily downloads downloads per major/minor Python version downloads per OS PePy is a bit more useful for me default shows last few versions and total for this major version “select versions” box is editable. clicking in it shows dropdown with downloads per version already there you can add * for graph of total or other major versions if you want to compare daily/weekly/monthly is nice, to round out some noise and see larger trends Oddity I noticed - daily graph isn’t the same dates as the table. 
off by a day on both sides not a big deal, but I notice these kinds of things. Michael #3: Codon Python Compiler via Jeff Hutchins and Abdulaziz Alqasem A high-performance, zero-overhead, extensible Python compiler using LLVM You can scale performance and produce executables, even when using third party libraries such as matplotlib. It also supports writing and executing GPU kernels, which is an interesting feature. See how it works at exaloop.io BTW, really terrible licensing. Free for non-commercial (great) “Contact us” for commercial use (it’s fine to charge, but give us a price) Brian #4: 8 Levels of Using Type Hints in Python Yang Zhou (yahng cho) A progression of using type hints that seems to track how I’ve picked them up Type Hints for Basic Data Types. x: int Define a Constant Using Final Type DB: Final = 'PostgreSQL' (ok. I haven’t used this one at all yet) Adding multiple type hints to one variable. int | None Using general type hints. def func(nums: Iterable) Also using Optional Type hints for functions def func(name: str) -> str: (I probably would put this at #2) Alias of type hints (not used this yet, but looks cool) PostsType = dict[int, str] new_posts: PostsType = {1: 'Python Type Hints', 2: 'Python Tricks'} Type hints for a class itself, i.e. Self type from typing import Self class ListNode: def __init__(self, prev_node: Self) -> None: pass Provide literals for a variable. (not used this yet, but looks cool) from typing import Literal weekend_day: Literal['Saturday', 'Sunday'] weekend_day = 'Saturday' weekend_day = 'Monday' # will be a type error Extras Brian: I hear a heartbeat for Test & Code, so it must not be dead yet. Michael: New article: Welcome Back RSS From this I learned about Readwise, Kustosz, and Python’s reader. Year progress == 100% PyTorch discloses malicious dependency chain compromise over holidays (of course found over RSS and reeder, see article above) Joke: vim switch
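A small runnable sketch pulling together three of the levels from Brian #4 (Final, a type alias, and Literal). Type hints are not enforced at runtime, so the "type error" on 'Monday' only shows up in a static type checker; the helper is_weekend_day is a made-up addition, not from the article, that shows one way to reuse the Literal values for a runtime check. Assumes Python 3.9+ for dict[int, str].

```python
from typing import Final, Literal, get_args

# Level 2: a constant the type checker will not let you reassign.
DB: Final = "PostgreSQL"

# Level 6: an alias for a longer type hint.
PostsType = dict[int, str]
new_posts: PostsType = {1: "Python Type Hints", 2: "Python Tricks"}

# Level 8: only these literal values are accepted by a type checker.
WeekendDay = Literal["Saturday", "Sunday"]


def is_weekend_day(value: str) -> bool:
    """Hypothetical runtime check that reuses the values listed in the Literal."""
    return value in get_args(WeekendDay)


weekend_day: WeekendDay = "Saturday"

print(DB, new_posts[1])
print(is_weekend_day(weekend_day))
print(is_weekend_day("Monday"))
```

Running this through a checker such as mypy should flag an assignment like weekend_day = "Monday", while plain Python would accept it silently, which is why the runtime helper can be useful at input boundaries.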
#316 Python 3.11 is here and it's fast (crossover)
Watch on YouTube About the show Sponsored by Microsoft for Startups Founders Hub. Connect with the hosts Michael: @mkennedy@fosstodon.org Brian: @brianokken@fosstodon.org Show announcements: @pythonbytes@fosstodon.org Hi folks. For our final episode of 2022 here on Python Bytes, we're crossing the streams with my other show, Talk Python To Me. I present to you one of the more important episodes of the year, the release of Python 3.11 with its new features and 40% performance improvements. Thank you for listening to Python Bytes in 2022, have a great holiday break, and Brian and I will see you next week. Python 3.11 is here! Keeping with the annual release cycle, the Python core devs have released the latest version of Python. And this one is a big one. It has more friendly error messages and is massively faster than 3.10 (between 10 and 60% faster), which is a big deal for a year-over-year release of a 30-year-old platform. On this episode, we have Irit Katriel, Pablo Galindo Salgado, Mark Shannon, and Brandt Bucher, all of whom participated in releasing Python this week, on the show to tell us about that process and some of the highlight features. Guests Irit Katriel @iritkatriel Mark Shannon linkedin.com Pablo Galindo Salgado @pyblogsal Brandt Bucher github.com Resources from the show Michael's Python 3.11 Course talkpython.fm/py311 Python 3.11.0 is now available blog.python.org PEP 101 - Releasing Python peps.python.org PEP 678 – Enriching Exceptions with Notes peps.python.org PEP 654 – Exception Groups and except* peps.python.org PEP 657 – Include Fine Grained Error Locations in Tracebacks peps.python.org Python Buildbot python.org Making Python Faster Talk Python Episode talkpython.fm Specializing, Adaptive Interpreter on Talk Python talkpython.fm Specialist Visualizer github.com "Zero cost" exception handling github.com Pyodide pyodide.org pyscript pyscript.net