Python Bytes is a weekly podcast hosted by Michael Kennedy and Brian Okken. The show is a short discussion on the headlines and noteworthy news in the Python, developer, and data science space.
#282 Don't Embarrass Me in Front of The Wizards
May 03, 2022
00:28:32
24.1 MB
Watch the live stream:
Watch on YouTube
About the show
Sponsored by us! Support our work through:
Brian #1: pyscript
- Python in the browser, from Anaconda.
- Announced at PyConUS
- “During a keynote speech at PyCon US 2022, Anaconda’s CEO Peter Wang unveiled quite a surprising project — PyScript. It is a JavaScript framework that allows users to create Python applications in the browser using a mix of Python and standard HTML. The project’s ultimate goal is to allow a much wider audience (for example, front-end developers) to benefit from the power of Python and its various libraries (statistical, ML/DL, etc.).” from a nice article on it, PyScript — unleash the power of Python in your browser
- PyScript is built on Pyodide, which is a port of CPython based on WebAssembly.
- Demos are cool.
- Note included in README: “This is an extremely experimental project, so expect things to break!”
Michael #2: Memray from Bloomberg
- Memray is a memory profiler for Python.
- It can track memory allocations in:
- Python code
- native extension modules
- the Python interpreter itself
- Works both via CLI and focused app calls
- Memray can help with the following problems:
- Analyze allocations in applications to help discover the cause of high memory usage.
- Find memory leaks.
- Find hotspots in code which cause a lot of allocations.
- Notable features:
- 🕵️♀️ Traces every function call so it can accurately represent the call stack, unlike sampling profilers.
- ℭ Also handles native calls in C/C++ libraries so the entire call stack is present in the results.
- 🏎 Blazing fast! Profiling causes minimal slowdown in the application. Tracking native code is somewhat slower, but this can be enabled or disabled on demand.
- 📈 It can generate various reports about the collected memory usage data, like flame graphs.
- 🧵 Works with Python threads.
- 👽🧵 Works with native-threads (e.g. C++ threads in native extensions)
- Has a live view in the terminal.
- Linux only
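The "focused app calls" mode mentioned above can be sketched with Memray's Tracker context manager. This is a minimal illustration, not the show's own example: build_squares, profile_build, and the output file name are hypothetical, and the guarded import is only there so the sketch still runs where Memray is not installed (it is Linux only, per the notes).

```python
import tempfile
from pathlib import Path

try:
    import memray  # third-party profiler; may be missing on non-Linux machines
except ImportError:
    memray = None


def build_squares(n):
    """Hypothetical workload: allocates a list of n integers."""
    return [i * i for i in range(n)]


def profile_build(n, out_dir):
    """Run build_squares, recording allocations with Memray when it is available.

    Returns (result, path_to_capture_file_or_None).
    """
    if memray is None:
        # Assumption for this sketch: fall back to running unprofiled.
        return build_squares(n), None

    out = Path(out_dir) / "memray-output.bin"  # hypothetical file name
    # Only the code inside the with-block is tracked.
    with memray.Tracker(str(out)):
        result = build_squares(n)
    return result, out


if __name__ == "__main__":
    result, capture = profile_build(100_000, tempfile.mkdtemp())
    print(len(result), capture)
```

For the CLI route, a whole script can be profiled with memray run, and a capture file can be turned into a report with memray flamegraph.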
Brian #3: pytest-parallel
- I’ve often sped up tests that can be run in parallel by using -n from pytest-xdist.
- I was recommending this to someone on Twitter, and Bruno Oliveira suggested a couple of alternatives. One was pytest-parallel, so I gave it a try.
- pytest-xdist runs using multiprocessing
- pytest-parallel uses both multiprocessing and multithreading.
- This is especially useful for test suites containing thread-safe tests, which mostly means pure software tests.
- Lots of unit tests are like this. System tests are often not.
- Use the --workers flag for multiple processors. --workers auto works great.
- Use --tests-per-worker for multi-threading. --tests-per-worker auto lets it pick.
- Very cool alternative to xdist.
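As a rough illustration of what "thread-safe tests" means here, the sketch below tests a pure function that touches no globals, files, or other shared state, so the tests can run in any order on any thread. The slugify helper and the test names are hypothetical and only serve the example.

```python
# test_slugify.py (hypothetical file name)
import re


def slugify(text: str) -> str:
    """Pure function: output depends only on the input, with no shared state."""
    # Replace each run of non-alphanumeric characters with a single hyphen,
    # then trim hyphens from both ends.
    return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")


def test_slugify_basic():
    assert slugify("Hello, World!") == "hello-world"


def test_slugify_collapses_separators():
    assert slugify("a  --  b") == "a-b"


def test_slugify_empty():
    assert slugify("") == ""
```

With pytest-parallel installed, a suite like this could be run as pytest --workers auto --tests-per-worker auto, using the two flags described above. Tests that share a database, a temp file, or a hardware device would not be safe to thread this way.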
Michael #4: Pooch: A friend for data files
- via Matthew Feickert
- Just want to download a file without messing with requests and urllib?
- Who is it for? Scientists/researchers/developers looking to simply download a file.
- Pooch makes it easy to download a file (one function call). On top of that, it also comes with some bonus features:
- Download and cache your data files locally (so it’s only downloaded once).
- Make sure everyone running the code has the same version of the data files by verifying cryptographic hashes.
- Supports multiple download protocols (HTTP/FTP/SFTP) and basic authentication.
- Download from Digital Object Identifiers (DOIs) issued by repositories like figshare and Zenodo.
- Built-in utilities to unzip/decompress files upon download
file_path = pooch.retrieve(url, known_hash=None)
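The download-once-and-verify idea from the list above can be sketched as follows. This is a hedged example rather than an official recipe: known_hash_for and fetch_data are hypothetical helper names, the import is guarded so the hash helper works without Pooch installed, and no real URL is assumed.

```python
import hashlib

try:
    import pooch  # third-party; install separately
except ImportError:
    pooch = None


def known_hash_for(data: bytes) -> str:
    """Format a SHA-256 digest as "sha256:<hex>", a form pooch accepts for known_hash."""
    return "sha256:" + hashlib.sha256(data).hexdigest()


def fetch_data(url, known_hash=None):
    """Download url into pooch's local cache and return the path to the file.

    When known_hash is given, pooch checks the downloaded file against it
    and raises if they differ. Passing None skips the check.
    """
    if pooch is None:
        raise RuntimeError("pooch is not installed")
    return pooch.retrieve(url=url, known_hash=known_hash)


if __name__ == "__main__":
    # Shows the hash format only; swap in the bytes of a real data file.
    print(known_hash_for(b"example bytes"))
```

Pinning the hash is what lets everyone running the code confirm they have the same version of the data file. A second call with the same URL should reuse the cached copy instead of downloading again.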
Extras
Michael:
Joke:
- Don’t embarrass me in front of the wizards
- Michael’s crashing GitHub is embarrassing him in front of the wizards!