Next Steps#
In this section, we’ve introduced the concept of Scientific Computing along with two incredibly popular packages: NumPy and pandas. These two packages, while incredibly powerful, only get you to the point of working with tabular data in Python. They are great for managing, manipulating, and summarizing data, but they won’t help you analyze your data, and they only let you scratch the surface of visualizing it.
The goal of this last chapter is simply to point you in the direction of where you could learn more about analyzing data in Python.
Data Visualization#
There are many ways to visualize data. We saw that pandas has some built-in plotting capabilities; however, these are limited. To this end, we’ll point you in the direction of three additional packages for plotting and visualizing data in Python:
matplotlib | This package is well-established and used ubiquitously. It allows for generation and customization of data visualizations and publication-quality plots.
seaborn | This package is built on top of matplotlib but has a high-level interface for generating good-looking plots with less code and time than matplotlib. Works well with DataFrames from pandas.
altair | Implements the grammar of graphics in Python, enabling a consistent and simple API. Not quite as popular as the two above, but incredibly powerful!
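To give a feel for what plotting with one of these packages looks like, here is a minimal sketch using matplotlib with a pandas DataFrame. The data and the column names (`hours_studied`, `exam_score`) are invented purely for illustration; they do not come from this book.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical toy data, made up for this example
df = pd.DataFrame({
    "hours_studied": [1, 2, 3, 4, 5],
    "exam_score": [52, 60, 71, 78, 90],
})

# Create a figure with a single set of axes
fig, ax = plt.subplots()

# Draw one point per row of the DataFrame
ax.scatter(df["hours_studied"], df["exam_score"])

# Label the plot so it can be read on its own
ax.set_xlabel("Hours studied")
ax.set_ylabel("Exam score")
ax.set_title("Exam score vs. hours studied")
```

In a notebook the figure should render below the cell; in a plain script you would typically call `plt.show()` or `fig.savefig(...)` afterwards. seaborn and altair offer higher-level ways to produce a similar plot.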
Data Analysis#
statsmodels | This package is the go-to for statistical modeling, enabling users to explore data and carry out statistical tests. Works well with DataFrames from pandas.
SciPy | Includes algorithms for optimization, differential equations, algebraic equations, etc. Builds on top of NumPy and works well with numerical data.
Scikit-learn | Popular package for getting started with machine learning in Python. Built on NumPy, SciPy, and matplotlib. Has really great documentation.
PyTorch | Ecosystem for building production-ready machine learning models; often compared with TensorFlow, another widely used deep learning framework.
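As one small illustration of moving from summarizing data to analyzing it, the sketch below fits a straight line to two columns of a DataFrame using `scipy.stats.linregress`. This is only one of many possible starting points (statsmodels or scikit-learn could be used for the same task), and the data and column names are again hypothetical.

```python
import pandas as pd
from scipy import stats

# Hypothetical toy data, made up for this example
df = pd.DataFrame({
    "hours_studied": [1, 2, 3, 4, 5],
    "exam_score": [52, 60, 71, 78, 90],
})

# Ordinary least-squares fit of exam_score as a linear function of hours_studied
result = stats.linregress(df["hours_studied"], df["exam_score"])

# The result object carries the fitted line and some summary statistics
print("slope:", result.slope)
print("intercept:", result.intercept)
print("correlation (r):", result.rvalue)
print("p-value:", result.pvalue)
```

The slope estimates how much `exam_score` changes per additional hour studied in this toy dataset; for richer model summaries (confidence intervals, multiple predictors), statsmodels is the more natural next stop.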