The podcast about Python and the people who make it great
Open Source Automated Machine Learning With MindsDB
Summary
Machine learning is growing in popularity and capability, but for a majority of people it is still a black box that we don’t fully understand. The team at MindsDB is working to change this state of affairs by creating an open source tool that is easy to use without a background in data science. By simplifying the training and use of neural networks, and making their logic explainable, they hope to bring AI capabilities to more people and organizations. In this interview George Hosu and Jorge Torres explain how MindsDB is built, how to use it for your own purposes, and how they view the current landscape of AI technologies. This is a great episode for anyone who is interested in experimenting with machine learning and artificial intelligence. Give it a listen and then try MindsDB for yourself.
Announcements
- Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great.
- When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With 200 Gbit/s private networking, scalable shared block storage, node balancers, and a 40 Gbit/s public network, all controlled by a brand new API, you’ve got everything you need to scale up. And for your tasks that need fast computation, such as training machine learning models, they just launched dedicated CPU instances. Go to pythonpodcast.com/linode to get a $20 credit and launch a new server in under a minute. And don’t forget to thank them for their continued support of this show!
- And to keep track of how your team is progressing on building new features and squashing bugs, you need a project management system designed by software engineers, for software engineers. Clubhouse lets you craft a workflow that fits your style, including per-team tasks, cross-project epics, a large suite of pre-built integrations, and a simple API for crafting your own. With such an intuitive tool it’s easy to make sure that everyone in the business is on the same page. Podcast.__init__ listeners get 2 months free on any plan by going to pythonpodcast.com/clubhouse today and signing up for a trial.
- You listen to this show to learn and stay up to date with the ways that Python is being used, including the latest in machine learning and data analysis. For even more opportunities to meet, listen, and learn from your peers you don’t want to miss out on this year’s conference season. We have partnered with organizations such as O’Reilly Media, Dataversity, and the Open Data Science Conference. Coming up this fall are the combined events of Graphorum and the Data Architecture Summit. The agendas have been announced and super early bird registration for up to $300 off is available until July 26th, with early bird pricing for up to $200 off through August 30th. Use the code BNLLC to get an additional 10% off any pass when you register. Go to pythonpodcast.com/conferences to learn more and take advantage of our partner discounts when you register.
- Visit the site to subscribe to the show, sign up for the newsletter, and read the show notes. And if you have any questions, comments, or suggestions I would love to hear them. You can reach me on Twitter at @Podcast__init__ or email hosts@podcastinit.com.
- To help other people find the show, please leave a review on iTunes and tell your friends and co-workers.
- Join the community in the new Zulip chat workspace at pythonpodcast.com/chat
- Your host as usual is Tobias Macey, and today I’m interviewing George Hosu and Jorge Torres about MindsDB, a framework for streamlining the use of neural networks.
Interview
- Introductions
- How did you get introduced to Python?
- Can you start by explaining what MindsDB is and the problem that it is trying to solve?
- What was the motivation for creating the project?
- Who is the target audience for MindsDB?
- Before we go deep into MindsDB can you explain what a neural network is for anyone who isn’t familiar with the term?
- For someone who is using MindsDB can you talk through their workflow?
- What are the types of data that are supported for building predictions using MindsDB?
- How much cleaning and preparation of the data is necessary before using it to generate a model?
- What are the lower and upper bounds for volume and variety of data that can be used to build an effective model in MindsDB?
- One of the interesting and useful features of MindsDB is the built in support for explaining the decisions reached by a model. How do you approach that challenge and what are the most difficult aspects?
- Once a model is generated, what is the output format and can it be used separately from MindsDB for embedding the prediction capabilities into other scripts or services?
- How is MindsDB implemented and how has the design changed since you first began working on it?
- What are some of the assumptions that you made going into this project which have had to be modified or updated as it gained users and features?
- What are the limitations of MindsDB and what are the cases where it is necessary to pass a task on to a data scientist?
- In your experience, what are the common barriers for individuals and organizations adopting machine learning as a tool for addressing their needs?
- What have been the most challenging, complex, or unexpected aspects of designing and building MindsDB?
- What do you have planned for the future of MindsDB?
Keep In Touch
- George
- Blog
- George3d6 on GitHub
- @Cerebralab2 on Twitter
- Jorge
- MindsDB
Picks
- Tobias
- Bose QuietComfort 25 noise cancelling headphones
- George
- Jorge
Links
- MindsDB
- 3Blue1Brown – Neural Networks
- Think Bayes
- Backpropagation
- Reverse Automatic Differentiation
- Ludwig deep learning toolbox
- Lightwood
- Tensorflow
- PyTorch
- Aerospike
- scikit-learn
The intro and outro music is from Requiem for a Fish by The Freak Fandango Orchestra / CC BY-SA