Python Bytes is a weekly podcast hosted by Michael Kennedy and Brian Okken. The show is a short discussion on the headlines and noteworthy news in the Python, developer, and data science space.
#281 ohmyzsh + ohmyposh + mcfly + pls + nerdfonts = wow
April 28, 2022
00:46:34
39.65 MB
Watch the live stream: Watch on YouTube
About the show
Sponsored by Red Hat: Compiler Podcast
Special guest: Anna Astori
Michael #1: Take Your Github Repository To The Next Level 🚀️
- Step 0. Make Your Project More Discoverable
- Step 1. Choose A Name That Sticks
- Step 2. Display A Beautiful Cover Image
- Step 3. Add Badges To Convey More Information
- Step 4. Write A Convincing Description
- Step 5. Record Visuals To Attract Users 👀
- Step 6. Create A Detailed Installation Guide (if needed)
- Step 7. Create A Practical Usage Guide 🏁
- Step 8. Answer Common Questions
- Step 9. Build A Supportive Community
- Step 10. Create Contribution Guidelines
- Step 11. Choose The Right License
- Step 12. Plan Your Future Roadmap
- Step 13. Create Github Releases (get to know Release Drafter)
- Step 14. Customize Your Social Media Preview
- Step 15. Launch A Website
Brian #2: Fastero
- “Python timeit CLI for the 21st century.”
- Arian Mollik Wasi, @wasi_master
- Colorful and very usable benchmarking/comparison tool
- Time or compare one or more:
- code snippets
- Python files
- mix and match, even
- Allows setup code before snippets run
- Multiple output export formats: markdown, html, csv, json, images, …
- Lots of customization possible
- Takeaway
- especially for comparing two+ options, this is super handy
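For a sense of what a comparison like this measures, here is a minimal sketch using the standard library's timeit module rather than Fastero itself. The two snippets, the setup string, and the `compare` helper are all made up for illustration.

```python
import timeit

# Hypothetical snippets to compare; names and code are illustrative only.
SNIPPETS = {
    "join": "''.join(words)",
    "concat": "s = ''\nfor w in words: s += w",
}
# Setup code runs before timing, mirroring the "setup code" idea above.
SETUP = "words = ['spam'] * 1000"


def compare(snippets: dict[str, str], setup: str, number: int = 200, repeat: int = 5) -> dict[str, float]:
    """Return the best per-run time in seconds for each named snippet."""
    results = {}
    for name, stmt in snippets.items():
        timings = timeit.repeat(stmt, setup=setup, number=number, repeat=repeat)
        # Use the minimum: it is the run least disturbed by other system activity.
        results[name] = min(timings) / number
    return results


if __name__ == "__main__":
    results = compare(SNIPPETS, SETUP)
    fastest = min(results.values())
    for name, seconds in sorted(results.items(), key=lambda item: item[1]):
        print(f"{name:>8}: {seconds * 1e6:8.2f} us  ({seconds / fastest:.2f}x)")
```

A dedicated tool adds the nicer parts on top of this: colored output, export formats, and mixing files with snippets.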
Anna #3: langid vs langdetect
langdetect
- This library is a direct port of Google's language-detection library from Java to Python
- langdetect supports 55 languages out of the box (ISO 639-1 codes)
- Basic usage: detect() and detect_langs()
- works well with noisy data like social media and web blogs
- being statistical, it works better on longer pieces of text than on short posts
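A minimal sketch of the two basic calls mentioned above, assuming langdetect is installed. The `guess_language` wrapper is a hypothetical name, not part of the library.

```python
def guess_language(text: str) -> tuple[str, list[tuple[str, float]]]:
    """Return the top language code plus (code, probability) candidates."""
    # Third-party import kept local so this sketch loads without langdetect installed.
    from langdetect import DetectorFactory, detect, detect_langs

    # langdetect is non-deterministic by default; fixing the seed makes runs repeatable.
    DetectorFactory.seed = 0

    best = detect(text)              # a single language code as a string
    candidates = detect_langs(text)  # Language objects exposing .lang and .prob
    return best, [(c.lang, c.prob) for c in candidates]
```

Since the approach is statistical, passing a full sentence or paragraph should give steadier results than a two-word post.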
langid
- hasn't been updated for a few years
- 97 languages
- can use Python's built-in wsgiref.simple_server (or fapws3 if available) to provide language identification as a web service. To do this, launch python langid.py -s, and access http://localhost:9008/detect . The web service supports GET, POST and PUT.
- the actual calculations are done in log-probability space, but you can also get a "confidence" score between 0 and 1 for the prediction:
> from langid.langid import LanguageIdentifier, model
> identifier = LanguageIdentifier.from_modelstring(model, norm_probs=True)
> identifier.classify("This is a test")
> ('en', 0.9999999909903544)
- minimal dependencies
- relatively fast
- naive Bayes (NB) algorithm, can train on user data.
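To illustrate how log-space scores can turn into a 0-to-1 confidence, here is a toy sketch of normalizing per-language log-probabilities. The scores and the `normalize_log_probs` helper are invented for illustration. This is not langid's internal code, just the usual log-sum-exp style normalization that an option like norm_probs=True suggests.

```python
import math


def normalize_log_probs(log_probs: dict[str, float]) -> dict[str, float]:
    """Convert unnormalized log-probabilities into probabilities that sum to 1."""
    # Subtract the max before exponentiating so very negative scores don't all underflow.
    peak = max(log_probs.values())
    shifted = {lang: math.exp(score - peak) for lang, score in log_probs.items()}
    total = sum(shifted.values())
    return {lang: value / total for lang, value in shifted.items()}


# Made-up naive Bayes log scores for one short input.
scores = {"en": -54.4, "de": -72.9, "fr": -75.1}
probs = normalize_log_probs(scores)
best = max(probs, key=probs.get)
print(best, probs[best])
```

Raw log scores are only comparable within one input, which is why a normalized value is easier to threshold on.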
Michael #4: Watchfiles
- by Samuel Colvin (of Pydantic fame)
- Simple, modern and high performance file watching and code reload in Python.
- Underlying file system notifications are handled by the Notify Rust library.
- Supports sync watching but also async watching
- CLI example
- Running and restarting a command
- Let's say you want to re-run failing tests whenever files change. You could do this with watchfiles by running a command:
watchfiles 'pytest --lf'
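Beyond the CLI, the sync and async watching mentioned above can be sketched from Python using the library's `watch` and `awatch` functions. The wrapper names and the "./src" path are placeholders for illustration.

```python
import asyncio


def watch_sync(path: str) -> None:
    """Block and print each batch of changes under `path` (sync API)."""
    from watchfiles import watch  # third-party; imported lazily

    for changes in watch(path):
        # Each batch is a set of (Change, path) tuples.
        print(changes)


async def watch_async(path: str) -> None:
    """Same idea with the async API, for use inside an event loop."""
    from watchfiles import awatch  # third-party; imported lazily

    async for changes in awatch(path):
        print(changes)


# Example calls (both run until interrupted):
#   watch_sync("./src")
#   asyncio.run(watch_async("./src"))
```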
Brian #5: Slipcover: Near Zero-Overhead Python Code Coverage
- From the coverage.py Twitter account, which I’m pretty sure is Ned Batchelder
- coverage numbers with “3% or less overhead”
- Early stages of the project.
- It does seem pretty zippy though.
- Mixed results when trying it out with a couple different projects
- flask:
- just pytest: 2.70s
- with slipcover: 2.88s
- with coverage.py: 4.36s
- flask with xdist n=4
- pytest: 2.11s
- coverage: 2.60s
- slipcover: doesn’t run (seems to load pytest plugins)
- flask:
- Again, still worth looking at and watching. It’s good to see some innovation in the coverage space aside from Ned’s work.
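To put the timings above next to the "3% or less" claim, here is a quick calculation of overhead relative to plain pytest. The `overhead_pct` helper is just for illustration, and the inputs are the single-run numbers quoted above.

```python
def overhead_pct(baseline_s: float, measured_s: float) -> float:
    """Percent slowdown of a measured run relative to the baseline."""
    return (measured_s - baseline_s) / baseline_s * 100


# Timings quoted above for the flask test suite, in seconds.
plain = {"pytest": 2.70, "slipcover": 2.88, "coverage.py": 4.36}
xdist = {"pytest": 2.11, "coverage.py": 2.60}

for label, runs in (("flask", plain), ("flask, xdist n=4", xdist)):
    base = runs["pytest"]
    for tool, seconds in runs.items():
        if tool != "pytest":
            print(f"{label}: {tool} adds {overhead_pct(base, seconds):.1f}%")
```

On these runs Slipcover comes out at roughly 7% over plain pytest versus roughly 61% for coverage.py, and coverage.py under xdist is about 23%. These are one-off timings, so treat the exact percentages loosely.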
Anna #6: scrapy vs robox
scrapy
- shell to try out things: fetch url, view response object, response.text
- extract using css selectors or xpath
- lets you navigate between levels e.g. the parent of an element with id X
- crawler to crawl websites and spider to extract data
- startproject for project structure and templates like settings and pipelines
- some advanced features, like specifying user agents, for large-scale scraping.
- various options to export and store the data
- nice features like LinkExtractor to determine specific links to extract, already deduped.
- FormRequest class
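The CSS, XPath, and parent-navigation points above can be tried without a full project by using Scrapy's `Selector` on a string. This is a small sketch assuming Scrapy is installed; the sample HTML and the `extract` helper are invented for illustration.

```python
SAMPLE_HTML = """
<div class="episode">
  <h2 id="title">Episode 281</h2>
  <a href="/episodes/281">Show notes</a>
  <a href="/episodes/280">Previous</a>
</div>
"""


def extract(html: str) -> dict:
    """Pull a few values out of `html` with CSS selectors and XPath."""
    from scrapy.selector import Selector  # third-party; imported lazily

    sel = Selector(text=html)
    return {
        # CSS selector using Scrapy's ::attr() extension
        "links": sel.css("a::attr(href)").getall(),
        # The same style of query written as XPath
        "title": sel.xpath('//h2[@id="title"]/text()').get(),
        # Navigate up a level: class of the parent of the element with id "title"
        "parent_class": sel.xpath('//*[@id="title"]/../@class').get(),
    }
```

In the scrapy shell, the same `.css()` and `.xpath()` calls are available directly on the response object.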
robox
- layer on top of httpx and beautifulsoup4
- lets you interact with forms on pages: check, choose, submit
Extras
Michael:
Joke: