Python Bytes is a weekly podcast hosted by Michael Kennedy and Brian Okken. The show is a short discussion on the headlines and noteworthy news in the Python, developer, and data science space.
#275 Airspeed velocity of an unladen astropy
Watch the live stream: Watch on YouTube About the show Sponsored by Microsoft for Startups Founders Hub. Special guest: Emily Morehouse-Valcarcel Michael #1: Async and await with subprocesses by Fredrik Averpil People know I do all sorts of stuff with async Lots of cool async things are not necessarily built into Python, but are instead third-party packages E.g. files via aiofiles But asyncio has asyncio.create_subprocess_exec Fredrik’s article has some nice examples I started using this for mp3 uploads and behind the scenes processing for us Brian #2: Typesplainer Arian Mollik Wasi, @wasi_master Suggested by Will McGugan Now released a vscode extension for that! Available on vscode as typesplainer Emily #3: Ibis Project via Marlene Mhangami “Productivity-centric Python data analysis framework for SQL engines and Hadoop” focused on: Type safety Expressiveness Composability Familiarity Marlene wrote an excellent blog post as an introduction Works with tons of different backends, either directly or via compilation Depending on the backend, it actually uses SQLAlchemy under the hood There’s a ton of options for interacting with a SQL database from Python, but Ibis has some interesting features geared towards performance and analyzing large sets of data. It’s a great tool for simple projects, but an excellent tool for anything data science related since it plays so nicely with things like pandas Michael #4: ASV via Will McGugan AirSpeed Velocity (asv) is a tool for benchmarking Python packages over their lifetime. Runtime, memory consumption and even custom-computed values may be tracked. See quickstart Example of astropy here. Finding a commit that produces a large regression Brian #5: perflint Anthony Shaw pylint extension for performance anti patterns curious why a pylint extension and not a flake8 plugin. I think the normal advice of “beware premature optimization” is good advice. 
But also, having a linter show you some code habits you may have that just slow things down is a nice learning tool. Many of these items are also not going to be the big show stopper performance problems, but they add unnecessary performance hits. To use this, you also have to use pylint, and that can be a bit painful to start up with, as it’s pretty picky. Tried it on a tutorial project today, and it complained about any variable, or parameter under 3 characters. Seems a bit picky to me for tutorials, but probably good advice for production code. These are all configurable though, so you can dial back the strictness if necessary. perflint checks: W8101 : Unnessecary list() on already iterable type W8102: Incorrect iterator method for dictionary W8201: Loop invariant statement (loop-invariant-statement) ←- very cool W8202: Global name usage in a loop (loop-invariant-global-usage) R8203 : Try..except blocks have a significant overhead. Avoid using them inside a loop (loop-try-except-usage). W8204 : Looped slicing of bytes objects is inefficient. Use a memoryview() instead (memoryview-over-bytes) W8205 : Importing the "%s" name directly is more efficient in this loop. (dotted-import-in-loop) Emily #6: PEP 594 Acceptance “Removing dead batteries from the standard library” Written by Christian Heimes and Brett Cannon back in 2019, though the conversation goes back further than that It’s a very thin line for modules that might still be useful to someone versus the development effort needed to maintain them. Recently accepted, targeting Python 3.11 (final release planned for October 2022, development begins in May 2021. See the full release schedule) Deprecations will begin in 3.11 and modules won’t be fully removed until 3.13 (~October 2024) See the full list of deprecated modules Bonus: new PEP site and theme! 
Extras Brian: Michael: Emily: Riff off of one of Brian’s topics from last week: Automate your interactive rebases with fixups and auto-squashing Cool award that The PSF just received PSF Spring Fundraiser Cuttlesoft is hiring! Jokes: Changing (via Ruslan) Please hire me
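For the asyncio.create_subprocess_exec item in this episode, here is a minimal sketch of what awaiting child processes can look like. It is not taken from Fredrik's article; the helper name run_command and the two toy commands are assumptions made for illustration.

```python
import asyncio
import sys


async def run_command(*args: str) -> tuple[int, str]:
    """Run a command without blocking the event loop; return (exit code, stdout)."""
    proc = await asyncio.create_subprocess_exec(
        *args,
        stdout=asyncio.subprocess.PIPE,
        stderr=asyncio.subprocess.PIPE,
    )
    stdout, _stderr = await proc.communicate()
    return proc.returncode, stdout.decode().strip()


async def main() -> list[tuple[int, str]]:
    # Hypothetical workload: two child interpreters that each print one word.
    commands = [
        (sys.executable, "-c", "print('first')"),
        (sys.executable, "-c", "print('second')"),
    ]
    # gather() lets both children run concurrently and keeps the input order.
    return await asyncio.gather(*(run_command(*cmd) for cmd in commands))


results = asyncio.run(main())
```

The same shape should carry over to longer-running tools such as an audio encoder: the event loop stays free for other work while the child process runs.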
#274 12 Questions You Should Be Asking of Your Dependencies
Watch the live stream: Watch on YouTube About the show Sponsored by Microsoft for Startups Founders Hub. Special guest: Anne Barela Brian #1: The Adam Test : 12 Questions for New Dependencies Found through a discussion with Ryan Cheley, who will be on an upcoming episode of Test & Code, talking about Managing Software Teams. The Joel Test dates back to 2000, and some of it is a bit dated. I should probably do a Test & Code episode or pythontest article on my opinions of this at some point. Nice shameless plugs, don’t you think? The Joel Test is 12 questions and is a “highly irresponsible, sloppy test to rate the quality of a software team.” “The Adam Test” is 12 questions “to decide whether a new package we’re considering depending on is well-maintained.” He’s calling it “The Well-Maintained Test”, but I like “The Adam Test” Here’s the test: Is it described as “production ready”? Is there sufficient documentation? Is there a changelog? Is someone responding to bug reports? Are there sufficient tests? Are the tests running with the latest language version? like Python 3.10, of course Are the tests running with the latest integration version? Examples include Django, PostgreSQL, etc. Is there a Continuous Integration (CI) configuration? Is the CI passing? Does it seem relatively well used? Has there been a commit in the last year? Has there been a release in the last year? Article has a short discussion of each. What score is good? That’s up to you to decide. But these questions are good to think about for your dependencies. I also think I’ll use these questions for my own projects. I’ve got a README.md, but do I need more examples in it? Should I have RTD docs for it? Have I updated the test matrix to include the newest versions of Python, etc? Have I hooked up CI? 
Michael #2: Validate emails with email-validator When you think about validating emails, you probably think regex (or just nothing) Regex is fine but so is this email: jane_doe@domain_that_doesnt_exist.com Problem is (at the time of the recording), domain_that_doesnt_exist.com is not a website. What about unicode variations that are technically the same but visually different? If the passed email address is valid, the validate_email() method will return an object containing a normalized form of the passed email address. Anne #3: The Python on Microcontrollers Newsletter One of my main focuses at Adafruit since the pandemic started is as editor of the Python on Microcontrollers Newsletter. With a weekly distribution of almost 9,400 subscribers, it’s the largest newsletter of its kind. It mainly focuses on CircuitPython and MicroPython and also discusses Python on single board computers (SBC) like Raspberry Pi. News about Python with a small computer emphasis Folks may subscribe by going to https://www.adafruitdaily.com/ which is separate from adafruit.com. The information is not sold or used for marketing and it’s easy to unsubscribe (no “do you really want to do this, please reconsider…”) The challenge, like for Python Bytes and other publications, is to find content. I scour the internet, with a bit of a focus on Twitter as I have an active account there. We encourage others to put in issues and Pull Requests on the newsletter GitHub, email information to cpnews@adafruit.com, or use the hashtags CircuitPython or MicroPython on Twitter. Brian #4: Git Organized: A Better Git Flow Annie Sexton Found through Changelog episode 480: Get your reset on A possible and common git workflow Branch off of a main branch to a personal dev branch Commit and Push during development to save your work When ready to merge, make a PR Problems Commits are hard to follow and messy, not ever really intended to separate parts of the workflow or anything. 
Commits are therefore useless in helping someone code review large changes. Annie’s workflow Branch off of a main branch to a personal dev branch Commit and Push during development to save your work. But don’t worry too much about commit messages, “WIP” is sufficient. Or a note to yourself. When ready to merge: git reset origin/main Re-commit all changes in a logical order that makes more sense than the way the work actually happened. These will be several commits, with descriptive messages. Even partial commits, if there are unrelated changes in a file, work with this process Push all the new commits. (Is --force going to be necessary?) Create a PR. Now there is a set of commits that are actually helpful to break up large PRs into small chunks that tell a story. I’m looking to try this soon to see how it goes Michael #5: CPython issues moving to GitHub soon Update by The Python Developer in Residence, Łukasz Langa The Steering Council is working on migrating the data that is currently residing in Roundup at https://bugs.python.org/ (BPO) into the GitHub issues of the CPython repository hosted there. Laid out in PEP 581 -- Using GitHub Issues for CPython The ultimate goal is to move user- and core developer-provided issue-reporting entirely to GitHub. Each issue that currently exists on BPO will include metadata indicating where it was moved on GitHub. New issues will only exist on GitHub. Feedback, please: At the current stage, we’re asking you to take a look at the links and important dates below, and share any feedback you might have. Timeline: Friday, March 11th 2022: GitHub starts transfer of the issues in the temporary repository to github.com/python/cpython/ . The migration is estimated to take anywhere from 3 to 7 days, depending on the load on GitHub.com. Anne #6: MicroPython, CircuitPython and GitHub What are Microcontrollers and Single Board Computers (SBCs)? Why not use CPython on Microcontrollers? 
MicroPython was originally created by the Australian programmer and theoretical physicist Damien George, after a successful Kickstarter backed campaign in 2013. Originally it only ran on a number of boards and was based on Python 3.4. CircuitPython was forked from MicroPython in 2017 by Adafruit Industries. Both MicroPython and CircuitPython are Open Source under MIT Licenses so adoption and modification by anyone is easy. Why fork CircuitPython? 1) Make a requirement that CircuitPython boards can enumerate to computers as a USB thumb drive to add or change code files with any text editor. 2) Aim to make CircuitPython use CPython library syntax whenever possible. 3) Make it easy to use and understand for beginners yet powerful for more advanced users. All CircuitPython code is on GitHub. GitHub Actions is used on repos like the Adafruit Learning System code to automate CI with Pylint, Black, and ensuring code has proper SPDX author and license tags, which is a new addition this year. Currently there are 283 microcontroller boards compatible with CircuitPython and 87 single board computers can use CircuitPython libraries in CPython via the Adafruit Blinka abstraction layer. Code portability between boards requires little if any changes. There are 346 CircuitPython libraries (all on PyPI / pip as well as GitHub) covering a wide range of hardware and real world needs. From blinking LEDs to using ulab (microlab), a subset of numpy, for data crunching. I just counted and there are exactly 1,000 Adafruit Learning System guides referencing CircuitPython, all free and open source/MIT licensed. https://learn.adafruit.com/ Extras Brian: Quick read: The Thirty Minute Rule, by Daniel Roy Greenfield summary: Stuck on a software problem for 30 min? Ask for help. Michael: The CircuitPython Show by Paul Cutler Follow up from my Python 3 == Active Python 3? 
James wrote: In episode #273, you guys were discussing supporting "Python 3" to mean any currently supported version of Python rather than "Python 3.7+" or similar. That's a really bad idea. There are still tons of people using unsupported versions of Python, and they're not all invalid use cases. For example, I am one of the upstream maintainers for cloud-init, and I was only recently able to remove Python 3.5 in order to make 3.6 our minimum supported version (which will continue for the next year). The reason is that our main consumers are downstream distro packagers (ubuntu, red hat, fedora, etc), and it's not uncommon for software released into long-term supported OS releases to be supported for 5-10 years or more. If I fire up an Ubuntu Trusty container, which still receives extended support until 2024, I get Python 3.4. So even though 3.4 is unsupported by Python upstream, it is still absolutely in use and supported by OS manufacturers. Joke: A case of the Mondays
#273 Getting dirty with __eq__(self, other)
Watch the live stream: Watch on YouTube About the show Sponsored by Datadog: pythonbytes.fm/datadog Michael #1: Physics Breakthrough as AI Successfully Controls Plasma in Nuclear Fusion Experiment Interesting breakthrough using AI Is Python at the center of it? With enough digging, the answer is yes, and we love it! Brian #2: PEP 680 -- tomllib: Support for Parsing TOML in the Standard Library Accepted for Python 3.11 This PEP proposes basing the standard library support for reading TOML on the third-party library tomli Michael #3: Thread local threading.local: A class that represents thread-local data. Thread-local data are data whose values are thread specific. Just create an instance of local (or a subclass) and store attributes on it You can even subclass it. Brian #4: What is a generator function? Trey Hunner Super handy, and way easier than you think, if you’ve never written your own. Really, it’s just a function that uses yield instead of return and supplies one element at a time instead of returning a list or dict or tuple or other large structure. Some details generator functions return generator objects generator objects are on pause and use the built-in next() function to get the next item. they raise StopIteration when done. Most generally used from for loops. Generator objects cannot be re-used when exhausted but you can get a new one with the next for loop you use. So, it’s all good. Michael #5: dirty-equals via Will McGugan, by Samuel Colvin Doing dirty (but extremely useful) things with equals. from dirty_equals import IsPositive assert 1 == IsPositive assert -2 == IsPositive # this will fail! user_data = db_conn.fetchrow('select * from users') assert user_data == { 'id': IsPositiveInt, 'username': 'samuelcolvin', 'avatar_file': IsStr(regex=r'/[a-z0-9\-]{10}/example\.png'), 'settings_json': IsJson({'theme': 'dark', 'language': 'en'}), 'created_ts': IsNow(delta=3), } Brian #6: Commitizen from the docs Command-line utility to create commits with your rules. 
Defaults: Conventional commits Display information about your commit rules (commands: schema, example, info) Bump version automatically using semantic versioning based on the commits. Read More Generate a changelog using Keep a changelog considering using for consistent commit message formatting can be used with python-semantic-release for automatic semantic versioning learned about it in 10 Tools I Wish I Knew When I Started Working with Python questions anyone using this or something similar? does this make sense for small to medium sized projects? or overkill? Extras: pytest book 40% off sale continues through March 19 for eBook Amazon lists the book as “shipping in 1-2 days”, as of March 2 Michael: Pronouncing the Python Walrus operator := as “becomes” Via John Sheehan: String methods startswith() and endswith() can take a tuple as its first argument that lets you check for multiple values with one call: >>> x = "abcdefg" >>> x.startswith(("ab", "cd", "ef"), 2) True Joke: CS Background
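The generator behavior described in this episode (paused objects, next(), StopIteration, no re-use once exhausted) can be sketched in a few lines. The countdown function is a made-up example, not one from Trey Hunner's article.

```python
def countdown(start):
    """Generator function: yields one value at a time instead of building a list."""
    while start > 0:
        yield start
        start -= 1


gen = countdown(3)       # calling the function returns a paused generator object
first = next(gen)        # runs the body up to the first yield
rest = list(gen)         # consumes whatever is left
exhausted = list(gen)    # an exhausted generator produces nothing more

stopped = False
try:
    next(gen)            # asking an exhausted generator for more raises StopIteration
except StopIteration:
    stopped = True

fresh = list(countdown(3))  # calling the function again gives a brand-new generator
```

A for loop does the next() calls and the StopIteration handling for you, which is why most code never touches either directly.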
#272 The tools episode
Watch the live stream: Watch on YouTube About the show Sponsor: Brought to you by FusionAuth - check them out at pythonbytes.fm/fusionauth Special guest: Calvin Hendryx-Parker Brian #1: Why your mock still doesn’t work Ned Batchelder Some back story: Why your mock doesn’t work a quick tour of Python name assignment The short version of Python Names and Values talk importing difference between from foo import bar and import foo w.r.t mocking punchline: “Mock it where it’s used” Now, Why your mock still doesn’t work talks about using @patch decorator (also applies to @patch.object decorator) and utilizing mock_thing1, mock_thing2 parameters to test you can change the return value or an attribute or whatever. normal mock stuff. But…. the punchline…. be careful about the order of patches. It needs to be @patch("foo.thing2") @patch("foo.thing1") def test_(mock_thing1, mock_thing2): ... Further reading: https://docs.python.org/3/library/unittest.mock.html#patch https://docs.python.org/3/library/unittest.mock.html#patch-object Michael #2: pls via Chris May Are you a developer who uses the terminal? (likely!) ls/l are not super helpful. There are replacements and alternatives But if you’re a dev, you might want the most relevant info for you, so enter pls See images in Michael’s tweets [1, 2]. You must install nerdfonts and set your terminal’s font to them Calvin #3: Kitty Cross platform GPU accelerated terminal (written in Python) Extended with Kittens written in Python Super fast rendering Has a rich set of plugins available for things like searching the buffer with fzf Brian #4: Futures and easy parallelisation Jaime Buelta Code example for quick scripts to perform slow tasks in parallel. Uses concurrent.futures and ThreadPoolExecutor. Starts with a small toy example, then goes on to a requests example to grab a bunch of pages in parallel. The call to executor.submit() sets up the job. This is done in a list comprehension, generating a list of futures. 
The call to future.result() on each future within the list is where the blocking happens. Since we want to wait for all results, it’s ok to block on the first, second, etc. Nice small snippet for easy parallel work. Example: from concurrent.futures import ThreadPoolExecutor import time import requests from urllib.parse import urljoin NUM_WORKERS = 2 executor = ThreadPoolExecutor(NUM_WORKERS) def retrieve(root_url, path): url = urljoin(root_url, path) print(f'{time.time()} Retrieving {url}') result = requests.get(url) return result arguments = [('https://google.com/', 'search'), ('https://www.facebook.com/', 'login'), ('https://nyt.com/', 'international')] futures_array = [executor.submit(retrieve, *arg) for arg in arguments] result = [future.result() for future in futures_array] print(result) Michael #5: pgMustard So you have a crappy web app that is slow but don’t know why. Is it an N+1 problem with an ORM? Is it a lack of indexes? If you’re using postgres, check out pgMustard: A simple yet powerful tool to help you speed up queries This is a paid product but might be worthwhile if you live deeply in postgres. Calvin #6: bpytop Great way to see what is going on in your system/server Shows nice graphs in the terminal for system performance such as CPU and Network traffic Supports themes and is fast and easy to install with pipx Michael uses Glances which is fun too. Calvin used to be a heavy Glances user until he saw the light 🙂 Extras Brian: pytest book is officially no longer Beta, next is printing, expected paper copy ship date of March 22, although I’m hoping earlier. For a limited time, to celebrate, 40% off the eBook PyCamp Spain is April 15-18: a weekend that includes 4 days and 3 nights with full board (breakfast, lunch and dinner) in Girona, Spain Calvin: Python Web Conference 2022 ← bigger and better than ever! Michael: witch macOS switcher list comprehensions vs. 
loops (video: https://www.youtube.com/watch?v=uVQVn8z8kxo, code: https://gist.github.com/mikeckennedy/2ddb5ad84d6e116e6d14b5c2eef4245a) syncify.run and nesting asyncio Joke: Killer robots
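The patch-ordering punchline from Brian's mock item in this episode can be checked with a small self-contained sketch. The foo namespace below is a hypothetical stand-in for the module "foo" in the notes, and patch.object is used so that nothing needs to be importable by name.

```python
from types import SimpleNamespace
from unittest.mock import patch

# Hypothetical stand-in for the module "foo" mentioned in the notes.
foo = SimpleNamespace(thing1=lambda: "real 1", thing2=lambda: "real 2")


@patch.object(foo, "thing2")  # outermost decorator -> LAST mock parameter
@patch.object(foo, "thing1")  # innermost decorator (closest to def) -> FIRST mock parameter
def run_with_mocks(mock_thing1, mock_thing2):
    mock_thing1.return_value = "fake 1"
    mock_thing2.return_value = "fake 2"
    return foo.thing1(), foo.thing2()


patched = run_with_mocks()
restored = (foo.thing1(), foo.thing2())  # the patches are undone after the call
```

Decorators apply bottom-up, so the mock parameters read in the reverse order of the @patch lines. Swapping the two parameter names would silently configure the wrong mock.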
#271 CPython: Async Task Groups in Python 3.11
Watch the live stream: Watch on YouTube About the show Sponsored by us: Check out the courses over at Talk Python And Brian’s book too! Special guest: Steve Dower Michael #1: fastapi-events Asynchronous event dispatching/handling library for FastAPI and Starlette Features: straightforward API to emit events anywhere in your code events are handled after responses are returned (doesn't affect response time) support event piping to remote queues powerful built-in handlers to handle events locally and remotely coroutine functions (async def) are the first-class citizen write your handlers, never be limited to just what fastapi_events provides Brian #2: Ways I Use Testing as a Data Scientist Peter Baumgartner “In my work, writing tests serves three purposes: making sure things work, documenting my understanding, preventing future errors.” Test The results of some analysis process (using assert) Code that operates on data (using hypothesis) Aspects of the data (using pandera or Great Expectations) Code for others (using pytest) Use asserts liberally even within the code use on as many intermediate calculations and processes as you can embed expressions in f-strings as the last argument to assert to help debug failures check calculations and arithmetic check the obvious Notebooks: “One practice I’ve started is that whenever I visually investigate some aspect of my data by writing some disposable code in a notebook, I convert that validation into an assert statement.” utilize numpy and pandas checks, especially for arrays and floating point values hypothesis can help you think of edge cases that should work, but don’t, like empty Series, and NaN values. Write tests on the data itself pandera useful for lightweight cases, checking schema on datasets. Great Expectations if we’re expecting to repeatedly read new data with the same structure. Use pytest, especially for code you are sharing with other people, like libraries. 
TDD works great for API development Arrange-Act-Assert is a great structure. “Even if we’re not sure what to assert, writing a test that executes the code is still valuable.” At least you’ll catch when you’ve forgotten to implement something. Steve #3: PEP 654 Exception groups and except* A necessary building block for more advanced asyncio helpers Mainly for use by scheduler libraries to handle raising multiple errors “simultaneously” except*: “a single exception group can cause several except* clauses to execute, but each such clause executes at most once (for all matching exceptions from the group)” Necessary for complex scheduling, such as task groups Michael #4: py-overload A Runtime method override decorator. Python lacks method overloading (do_it(7) vs. do_it("7")) Probably due to lack of typing in the early days Go from this: def _func_str(a: str): ... def _func_int(a: int): ... def func(a: Union[str, int]): if isinstance(a, str): _func_str(a) else: _func_int(a) To this: @overload def func(a: str): ... @overload def func(a: int): ... Brian #5: Next-generation seaborn interface Love the background and goals section “This work grew out of long-running efforts to refactor the seaborn internals so that its functions could rely on common code-paths. At a certain point, I decided that I was developing an API that would also be interesting for external users too.” “seaborn was originally conceived as a toolbox of domain-specific statistical graphics to be used alongside matplotlib.” I’ve always wondered about this Some people now reach for, or learn, seaborn first. 
As seaborn has grown, reproducing with raw matplotlib to change something seaborn doesn’t expose is sometimes painful goal : “expose seaborn’s core features — integration with pandas, automatic mapping between data and graphics, statistical transformations — within an interface that is more compositional, extensible, and comprehensive.” I also like interface discussions that have phrases like “This is a clean namespace, and I’m leaning towards recommending from seaborn.objects import * for interactive usecases. But let’s not go so far just yet.” I like clean namespaces, and use some of my own libs like this, but import * always is a red flag for me. The new interface exists as a set of classes that can be accessed through a single namespace import: import seaborn.objects as so Start with so.Plot, add layers, like so.Scatter(), even multiple layers. layers have a Mark object, which defines how to draw the plot, like so.Line or so.Dot There’s a lot more detail in there. The discussion is great. Also a neat understanding that established libraries can change their mind on APIs. This is a good way to discuss it, in the open. Note included at the top: “This is very much a work in progress. It is almost certain that code patterns demonstrated here will change before an official release. I do plan to issue a series of alpha/beta releases so that people can play around with it and give feedback, but it’s not at that point yet.” Steve #6: Compile CPython to Web Assembly Allows fully in-browser use of CPython (demo at https://repl.ethanhs.me/) Currently uses Emscripten as its runtime environment, to fill in gaps that browsers don’t normally offer (like an in-memory file system), or WASI to more carefully add system functionality Still the CPython runtime, and a lot of work to do before you’ll see it as part of client-side web apps, but the possibility is now there. 
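As a point of comparison for the py-overload item in this episode, the standard library's functools.singledispatch also picks an implementation from the runtime type of the first argument. This is a different tool from py-overload, shown only as a sketch of the idea. The function bodies are invented for illustration.

```python
from functools import singledispatch


@singledispatch
def func(a):
    # Fallback for any type without a registered implementation.
    raise TypeError(f"unsupported type: {type(a).__name__}")


@func.register
def _(a: str) -> str:
    # Chosen when the first argument is a str (taken from the annotation).
    return f"str version got {a!r}"


@func.register
def _(a: int) -> str:
    # Chosen when the first argument is an int.
    return f"int version got {a!r}"


from_str = func("7")
from_int = func(7)
```

Unlike the isinstance chain in the "before" version from the notes, each type gets its own function. Adding another type is one more register call.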
Extras Michael: Get minutes, hours, and days from Python timedelta - A Python Short Did you know ohmyzsh is kind of local? Django reformatted code with Black (via PyCoders) Steve: Python 3.11’s latest alpha now has Windows ARM64 installers. These aren’t the dominant devices yet, but they’re out there, and if you’ve got one the CPython team would love to hear about your experience. Steve just released a new version of Deck, which started as a way to help people who misspelled collections.deque, but has grown into a useful building block for traditional 52-card games (or 54 including jokers). Joke: Help is coming
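For the timedelta item in Michael's extras, one common approach is to divide a timedelta by a unit-sized timedelta. This is a sketch of that approach and not necessarily what the video shows. The elapsed value is made up.

```python
from datetime import timedelta

elapsed = timedelta(days=2, hours=3, minutes=30)  # hypothetical duration

# Dividing by a unit-sized timedelta gives a float count of that unit.
total_minutes = elapsed / timedelta(minutes=1)
total_hours = elapsed / timedelta(hours=1)
total_days = elapsed / timedelta(days=1)

# divmod splits the duration into whole hours plus a leftover timedelta.
whole_hours, remainder = divmod(elapsed, timedelta(hours=1))
leftover_minutes = remainder // timedelta(minutes=1)
```

The .seconds attribute only holds the part of the duration below one day, so it is easy to misread as the full length. Division (or total_seconds()) avoids that trap.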
#270 Can errors really be beautiful?
Watch the live stream: Watch on YouTube About the show Sponsored by Datadog: pythonbytes.fm/datadog Special guest: Dean Langsam Brian #1: A Better Pygame Mainloop Glyph Doing some game programming is a great way to work on coding for early devs (and experienced devs). pygame is a popular package for writing games in Python But… the normal example of a main loop, which listens for events and dispatches actions based on events, has some problems: it’s got a while 1: that wastes power, too much busy waiting looks bad, due to “screen tearing” which is writing to a screen while you’re in the middle of drawing it This post discusses the problems, and walks through to an async main loop that creates a better gaming experience. Michael #2: awesome sqlalchemy A few notable ones SQLAlchemy-Continuum: Versioning and auditing extension for SQLAlchemy. SQLAlchemy-Utc: SQLAlchemy type to store aware datetime.datetime values. SQLAlchemy-Utils: Various utility functions, new data types and helpers for SQLAlchemy filedepot: DEPOT is a framework for easily storing and serving files in web applications. SQLAlchemy-ImageAttach: SQLAlchemy-ImageAttach is a SQLAlchemy extension for attaching images to entity objects. SQLAlchemy-Searchable: Full-text searchable models for SQLAlchemy. sqlalchemy_schemadisplay: This module generates images from SQLAlchemy models. Can we also get a shoutout to SQLModel? Dean #3: ThreadPoolExecutor in Python: The Complete Guide Long, but worth it (80-120 minutes). Could be consumed in parts. It’s mostly a collection of other blogposts on superfastpython Many examples LifeCycle Usage patterns map and as_completed vs. sequentially callbacks IO-Bound vs CPU-bound Common Questions Comparison vs. ProcessPoolExecutor vs. threading.Thread vs. 
AsyncIO Brian #4: Chaining comparison operators Rodrigo Girão Serrão I use chained expressions all the time, mostly with ranges: min <= x <= max, which is like (min <= x) and (x <= max) There are lots of chained expressions available, and some not so obvious. a == b == c all are equal, no prob what about a != b != c ? This actually can return True if a == c Lots of other issues with chaining discussed in the article, like non-constant expressions and side effects Michael #5: Create Beautiful Tracebacks with Python’s Exception Hooks def exception_hook(exc_type, exc_value, tb): ... sys.excepthook = exception_hook Libraries rich.traceback better-exceptions Pretty Errors IPython.core.ultratb stackprinter Dean #6: Ways I Use Testing as a Data Scientist The importance of knowing what to test for using assert in code on ad-hoc things. Do it while coding. Use numpy np.isclose to test “almost equal” on entire arrays. also assert_frame_equal (https://pandas.pydata.org/docs/reference/api/pandas.testing.assert_frame_equal.html) Use hypothesis for bombarding the function with smart tests. Pandera and Great Expectations tests are documentation! Work with Arrange-Act-Assert Even if we’re not sure what to assert, writing a test that executes the code is still valuable. Extras Dean: Deprecate urllib out of stdlib? IPython 8 is out It’s less code! 37,500 LOC across 348 files → 36,100 across 294 files “I’m sorry I wrote you such a long letter. I didn’t have time to write you a short one.” – Blaise Pascal This was all done thanks to a developer hired through Small Development Grants coloring exceptions Michael: Python Shorts New videos Beyond the List Comprehension Combining dictionaries, the Python 3.10 way / on pypi.org Joke: Spelling
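The a != b != c surprise from Brian's chaining item in this episode is easy to reproduce. The values and the middle() helper below are made up for illustration and are not taken from the article.

```python
a, b, c = 1, 2, 1  # a and c are equal on purpose

chained = a != b != c              # evaluated as (a != b) and (b != c)
expanded = (a != b) and (b != c)   # the explicit equivalent
all_distinct = len({a, b, c}) == 3 # what "all different" would actually require

# In a chain, the middle operand is evaluated only once, which matters
# when it is a call with side effects.
calls = []


def middle():
    calls.append("called")
    return 5


in_range = 1 <= middle() <= 10
```

The chain never compares a with c, so it cannot express "all three differ". A set-length check or pairwise comparisons are needed for that.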
#269 Get Rich and replace your cat
Watch the live stream: Watch on YouTube About the show Sponsored by Datadog: pythonbytes.fm/datadog Special guest: Luciana and Brett Cannon Brian #1: rich-cli suggested by Lance Reinsmith rich on the command line. why? syntax highlighting rich example.py rich -m README.md use -m for markdown why Will? .md seems clear enough to me. comes with themes. ex: --theme monokai formats json, --json or -j and a bunch of other features I probably won’t use, but you might. alignment, maybe width, yeah, I’ll probably use -w a bunch more In my .zshrc: alias cat='rich --theme monokai' after pipx install rich-cli feel free to tell me that I shouldn’t use cat for looking at file contents. (although, why not?) I’m not, I’m using rich. :) Luciana #2: debugpy - a debugger for Python The debugger we use in the Python extension for VS Code Super helpful features that can save a lot of time and that a lot of folks don’t seem to know about: Conditional breakpoints Helpful when you want the code to break only on a specific condition e.g. # of execution times, or when an expression is true Debug console Helpful for quick testing using the context of the program at the breakpoint Temp edits on variable values, expression evaluation, etc. Jump to Cursor (a.k.a. Set Next Statement) Control on what is the next line the debugger will execute Including previously executed lines Brian #3: Documentation unit tests Simon Willison Post talking about using pytest and tests to check documentation. Simon has test code that introspects the code introspects the docs then makes sure some items are definitely in the docs This is used in Datasette, so you can look at the example in the repo What’s tested: config options are all documented plugin hooks are documented views are all documented Cool use of parametrize to generate test cases based on introspection Nice use of fixtures Very cool idea Luciana #4: PEP 673 — Self Type Heard from Brett Cannon that it has been accepted! 
Interesting for me as I’m learning more about types in Python Adds a way to annotate methods that return an instance of their class Particularly interesting for subclasses, exemple they gave: from __future__ import annotations class Shape: def set_scale(self, scale: float) -> Shape: self.scale = scale return self class Circle(Shape): def set_radius(self, r: float) -> Circle: self.radius = r return self Circle().set_scale(0.5) # *Shape*, not Circle Circle().set_scale(0.5).set_radius(2.7) # => Error: Shape has no attribute set_radius Extras Luciana: Black is no longer in beta! Version 22.1.0 is out 🥳 PyCascades 2022 reminder (remote!) Joke:
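For the PEP 673 item in this episode, here is a minimal sketch of how the same Shape/Circle example might look once the return annotations use Self. It assumes a type checker that understands Self (available from typing on Python 3.11+, or typing_extensions before that); the import is guarded so the snippet still runs on older interpreters, where the annotations are simply never evaluated.

```python
from __future__ import annotations

from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Only type checkers read this import; at runtime the annotations
    # below stay unevaluated thanks to the __future__ import.
    from typing import Self


class Shape:
    def set_scale(self, scale: float) -> Self:
        self.scale = scale
        return self


class Circle(Shape):
    def set_radius(self, r: float) -> Self:
        self.radius = r
        return self


# A type checker now infers Circle for set_scale(), so the chain type-checks.
circle = Circle().set_scale(0.5).set_radius(2.7)
```

The runtime behavior is unchanged from the original example; only what the type checker infers for the chained call differs.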
#268 Wait, you can Google that?
Watch the live stream on YouTube.
About the show: Sponsored by us: check out the courses over at Talk Python, and Brian’s book too! Special guest: Madison @AetherUnbound.
Brian #1: (draft) PEP 679 -- Allow parentheses in assert statements
By Pablo Galindo Salgado. This is in draft, not approved, and not scheduled for any release, but it seems like a really good idea to me. assert(1 == 2, "seems like it should fail") will always pass currently, since the tuple (False, "seems like it should fail") is a non-empty tuple. Current Python will emit a warning:
    >>> assert(1 == 2, "seems like it should fail")
    <stdin>:1: SyntaxWarning: assertion is always true, perhaps remove parentheses?
But really, why not just change the language to allow assert with or without parens? It would also allow multi-line assert statements more easily:
    assert (
        very very long
        expression,
        "very very long "
        "message",
    )
I hope this is a slam dunk and gets in ASAP.
Michael #2: Everything I googled as a dev
By Sophie Koonin. “In an attempt to dispel the idea that if you have to google stuff you’re not a proper engineer, this is a list of nearly everything I googled in a week at work.” Rather than my posting a huge list, check out the day logs on her post. Worth calling out a few:
Expecting a parsed GraphQL document. Perhaps you need to wrap the query string in a "gql" tag? - said React upgrade then started causing some super fun errors.
semantic HTML contact details - wanted to check if a specific HTML tag was relevant here.
editing host file - desperate times (and it didn’t even work).
Madison #3: PyCascades 2022!
Another year of excellent and diverse talks across an array of subjects. Talks from some well-known folks (Thursday Bram, Jay Miller) as well as first-time speakers (Joseph Riddle, Isaac Na). The PSF’s DE&I Panel is doing a meet & greet, and they have a survey they’d like Python community members to fill out. Socials Friday & Saturday night, sprints on Sunday. Tickets are still available!
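For the PEP 679 item earlier in this episode, a small sketch of why the parenthesized assert never fails in current Python, plus one multi-line spelling that already works without the PEP. The variable names are illustrative only.

```python
# The parenthesized form builds a two-item tuple; any non-empty tuple is
# truthy, so asserting on it can never fail.
as_tuple = (1 == 2, "seems like it should fail")
tuple_is_truthy = bool(as_tuple)

# Without the outer parentheses, condition and message are separate parts
# of the assert statement, and the assertion fails as intended.
try:
    assert 1 == 2, "seems like it should fail"
    failure_message = None
except AssertionError as exc:
    failure_message = str(exc)

# A multi-line form that works today: parenthesize each part on its own.
total = 1 + 1
assert (
    total == 2
), (
    "a long message "
    "split over several lines"
)
```

Note that assert statements are skipped entirely when Python runs with -O, so the try/except above only observes a failure in a normal run.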
Brian #4: Strict Python function parameters
By Seth Michael Larson. We have keyword-only parameters:
    def process_data(data, *, encoding="ascii"): ...
Notice the *: encoding has to be a keyword argument and cannot be positional. We have positional-only parameters:
    def process_data(data, /, encoding="ascii"): ...
Notice the /: data has to be positional and cannot be passed in as a keyword argument. Combine the two:
    def process_data(data, /, *, encoding="ascii"): ...
Now data has to be positional, and encoding has to be a keyword, if present. This way a function really can only be called as intended, and all uses of the function will be consistent. This is a good thing. There are many benefits, including empowering library authors to make changes without weird behaviors cropping up in user code. Commentary: the extra syntax may be confusing for some new users. For a lot of library API entry points, I think this makes a lot of sense.
Michael #5: mureq - vendored requests
mureq is a single-file, zero-dependency alternative to python-requests, intended to be vendored in-tree by Linux systems software and other lightweight applications. It doesn’t support connection pooling (but neither does requests.get()). Uses much less memory. Avoids supply chain attack vulnerabilities. Consider my prod branch until PRs #2 and #3 are merged.
Madison #6: Openverse
No, not Metaverse! Previously “CC Search”. A search engine for openly licensed media, for free and public use/remix of content. Currently images & audio; hope to include video, text, and 3D models down the line. Start your search here.
Extras
Michael: We now have playable times in the transcript section (example). Very cool tool for building regexes that I used for the above: regex101.com. Next video is out: Do you even need loops in Python? A Python short by Michael Kennedy. Remember, we have full-text search.
Brian: pip-secure-install - from Brett Cannon. Python Testing with pytest is, when I last checked, the #2 bestseller at Pragmatic. So cool. My Maui trip was also a work trip. It gave me time to completely re-read the book, make notes, and make last-minute changes. Changes went in this week and tonight is my “pencils down” date. This is getting real, folks. Thanks to everyone for buying beta copies and supporting the rewrite.
Madison: spd.watch - new police accountability/information tool for the Seattle area. Shoutout to just (mentioned in Ep 242). ghcr.io - free docker image hosting for open source projects, easy integration with GitHub Actions.
Joke: via Josh Thurston
How did the hacker get away from the police? He just ransomware. That joke makes me WannaCry… Where do you find a hacker? In decrypt.
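For the strict function parameters item in this episode, a short sketch of how the combined `/` and `*` signature behaves when called. The function body (decoding bytes) and the call_fails helper are hypothetical, added only so the calls have something to do; the syntax needs Python 3.8+.

```python
def process_data(data, /, *, encoding="ascii"):
    """data is positional-only; encoding is keyword-only."""
    # Hypothetical body for illustration: decode the bytes.
    return data.decode(encoding)


def call_fails(*args, **kwargs):
    """Return True if calling process_data this way raises TypeError."""
    try:
        process_data(*args, **kwargs)
    except TypeError:
        return True
    return False


# The one intended calling style.
decoded = process_data(b"hello", encoding="utf-8")

# Passing encoding positionally is rejected.
positional_encoding_rejected = call_fails(b"hello", "utf-8")

# Passing data by keyword is rejected too.
keyword_data_rejected = call_fails(data=b"hello")
```

Because callers cannot rely on the name data or on the position of encoding, a library author could later rename or reorder either without breaking existing calls.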
#267 Python on the beach
Watch the live stream on YouTube.
About the show: Sponsored by us: check out the courses over at Talk Python, and Brian’s book too!
Michael #1: Box: Python dictionaries with advanced dot notation access
Want to treat dictionaries like classes? Box.
    small_box = Box({'data': 2, 'count': 5})
    small_box.data == \
    small_box['data'] == \
    getattr(small_box, 'data') == \
    small_box.get('data')
There are over a half dozen ways to customize your Box and make it work for you. Check out the new Box github wiki for more details and examples! Box is a superset of dict. See Types of Boxes as well.
Brian #2: Reading tracebacks in Python
By Trey Hunner. “When Python encounters an error in your code, it will print out a traceback. Let's talk about how to use tracebacks to fix our code.”
Brian’s commentary: Tracebacks can feel like a brick wall of errors telling you “you suck”. But they are really meant to help you, and do, once you know how to read them. This should probably be one of the earliest things we teach people new to coding. Like maybe: hello world, tracebacks, testing.
Anyway, back to Trey: Start at the bottom. Read the last line first. That will have the type of exception and an error message. The two lines above that are the exact filename and line number where the exception occurs, and a copy of the line. Those two lines are a stack frame. Keep going up and you find the other stack frames for the call stack of how you got here. Trey walks through this with an example and shows how to solve an error at a high-level stack frame using the traceback.
Michael #3: Raspberry Pi: These two new devices just went live on the International Space Station
The International Space Station has connected new Raspberry Pi 4 Model B units to run experiments from 500 student programmer teams in Europe. From the education-focused European Astro Pi Challenge. These are new space-hardened Raspberry Pi units, dubbed Astro Pi. The Astro Pi units are part of a project run by the European Space Agency (ESA) for the Earth-focused Mission Zero and Mission Space Lab. The former allows young Python programmers to take humidity readings on board the ISS, while the latter lets students run various scientific experiments on the space station using its sensors.
Brian #4: Make Simple Mocks With SimpleNamespace
By Adam Johnson, who’s crushing it recently, BTW; lots of recent blog posts. SimpleNamespace is in the types standard library module. It works great as a test double, especially as a stub or fake object. “It’s as simple as possible, with no faff around being callable, tracking usage, etc.” Example:
    >>> from types import SimpleNamespace
    >>> obj = SimpleNamespace(x=12, y=17, verbose=True)
    >>> obj
    namespace(x=12, y=17, verbose=True)
    >>> obj.x
    12
    >>> obj.verbose
    True
unittest.mock.Mock also works, but has the annoying feature that, unless you pass in a spec, any attribute will be allowed. With SimpleNamespace, reading a typo or any other attribute that was never set fails. Example:
    >>> obj.vrebose
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    AttributeError: 'types.SimpleNamespace' object has no attribute 'vrebose'. Did you mean: 'verbose'?
Michael #5: Extra, extra, extra
Marak Squires, supply chain issues (NPM), and terrorism? [npm issues]
css outlines!
python 3.10.2
Python Shorts YouTube series: #1 Parsing data with Pydantic, #2 Counting the number of times items appear with collections.Counter
Stream Deck + PyCharm video, github repo
Brian #6: 3 Things You Might Not Know About Numbers in Python
By David Amos. Most understated phrase I’ve read in a long time: “… there's a good chance that you've used a number in one of your programs”. There’s more to numbers than many people realize. The 3 things:
Numbers have methods. Integers have to_bytes(length=1, byteorder="big"), the class method int.from_bytes(b'\x06\xc1', byteorder="big"), bit_length(), and a bunch of others. Floats have is_integer(), as_integer_ratio(), and a bunch more. Use variables or parentheses, though: 5.bit_length() doesn’t work, but n=5; n.bit_length() and (5).bit_length() work.
Numbers have a hierarchy. Every number in Python is an instance of the Number class, so isinstance(value, Number) should work for any number type. Then there are 4 abstract types encompassing other types: Complex has complex, Real has float, Rational has Fraction, and Integral has int and bool. Where’s Decimal? It’s not part of those abstract types; it is registered directly with Number. Also, floats are weird.
Numbers are extensible. You can derive from numeric classes, both abstract and concrete, and create your own. However, to do this effectively, you gotta implement A LOT of dunder methods.
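For the numbers item in this episode, a brief sketch exercising the methods and the abstract hierarchy mentioned above, using only the standard library. The sample value 1729 is an arbitrary choice that happens to round-trip through the two bytes shown in the notes.

```python
from decimal import Decimal
from fractions import Fraction
from numbers import Integral, Number, Rational, Real

# Numbers have methods.
n = 1729
bits = n.bit_length()
raw = n.to_bytes(2, byteorder="big")
round_trip = int.from_bytes(raw, byteorder="big")

# Literals need parentheses: 5.bit_length() is a syntax error because
# "5." is read as the start of a float literal.
literal_bits = (5).bit_length()
ratio = (0.5).as_integer_ratio()
whole = (2.0).is_integer()

# Numbers have a hierarchy.
samples = [1, True, 1.5, 2 + 3j, Fraction(1, 3), Decimal("0.1")]
all_are_numbers = all(isinstance(value, Number) for value in samples)
bool_is_integral = isinstance(True, Integral)
fraction_is_rational = isinstance(Fraction(1, 3), Rational)
float_is_rational = isinstance(1.5, Rational)

# Decimal counts as a Number but sits outside the Real/Rational/Integral types.
decimal_is_real = isinstance(Decimal("0.1"), Real)
```

Checking against the abstract types rather than concrete ones like int or float lets the same isinstance test accept third-party numeric classes that register themselves with the numbers module.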
#266 Python has a glossary?
See the full show notes for this episode on the website at pythonbytes.fm/266.
#265 Get asizeof pympler and muppy
See the full show notes for this episode on the website at pythonbytes.fm/265.
#264 We're just playing games with Jupyter at this point
See the full show notes for this episode on the website at pythonbytes.fm/264.
#263 It’s time to stop using Python 3.6
See the full show notes for this episode on the website at pythonbytes.fm/263.
#262 So many bots up in your documentation
See the full show notes for this episode on the website at pythonbytes.fm/262.
#261 Please re-enable spacebar heating
See the full show notes for this episode on the website at pythonbytes.fm/261.