Archive for the ‘Pandas’ Category

Pandas

Wednesday, August 17th, 2016

Pandas by Reuven M. Lerner.

From the post:

Serious practitioners of data science use the full scientific method, starting with a question and a hypothesis, followed by an exploration of the data to determine whether the hypothesis holds up. But in many cases, such as when you aren’t quite sure what your data contains, it helps to perform some exploratory data analysis—just looking around, trying to see if you can find something.

And, that’s what I’m going to cover here, using tools provided by the amazing Python ecosystem for data science, sometimes known as the SciPy stack. It’s hard to overstate the number of people I’ve met in the past year or two who are learning Python specifically for data science needs. Back when I was analyzing data for my PhD dissertation, just two years ago, I was told that Python wasn’t yet mature enough to do the sorts of things I needed, and that I should use the R language instead. I do have to wonder whether the tables have turned by now; the number of contributors and contributions to the SciPy stack is phenomenal, making it a more compelling platform for data analysis.

In my article “Analyzing Data“, I described how to filter through logfiles, turning them into CSV files containing the information that was of interest. Here, I explain how to import that data into Pandas, which provides an additional layer of flexibility and will let you explore the data in all sorts of ways—including graphically. Although I won’t necessarily reach any amazing conclusions, you’ll at least see how you can import data into Pandas, slice and dice it in various ways, and then produce some basic plots.

Of course, scientific articles are written as though questions drop out of the sky and data is interrogated for the answer.

Aside from being rhetoric to badger others with, does anyone really think that is how science operates in fact?

Whether you have delusions about how science works in fact or not, you will find that Pandas will assist you in exploring data.

Pandas Exercises

Saturday, July 30th, 2016

Pandas Exercises

From the post:

Fed up with a ton of tutorials but no easy way to find exercises I decided to create a repo just with exercises to practice pandas. Don’t get me wrong, tutorials are great resources, but to learn is to do. So unless you practice you won’t learn.

There will be three different types of files:

  1. Exercise instructions
  2. Solutions without code
  3. Solutions with code and comments

My suggestion is that you learn a topic in a tutorial or video and then do exercises. Learn one more topic and do exercises. If you got the answer wrong, don’t go to the solution with code, follow this advice instead.

Suggestions and collaborations are more than welcome. 🙂

I’m sure you will find this useful but when I search for pandas exercise python, I get 298,000 “hits.”

Adding exercises here isn’t going to improve the findability of pandas for particular subject areas or domains.

Perhaps as exercises are added here, links to exercises by subject area can be added as well.

With nearly 300K potential sources, there is no shortage of exercises to go around!