Wednesday, June 3, 2015

R Programming [Johns Hopkins University]

I took the R Programming course (part of the Data Science Specialization on Coursera) last month. The experience has been fulfilling, to say the least. I always wanted to learn this programming language, first because I was (and still am) intrigued by it, and second because I felt it would be a great asset as I work further on my academic project related to mutual fund investments.

Here are some of the key elements I liked about this course:

(1) Course lectures

The lectures are detailed and cover everything from the origins of the language, to the basics such as defining variables and control structures (if statements, for loops and while loops), to more advanced concepts such as subsetting a vector and filtering out missing values (NAs).
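To give a flavour of those last two topics, here is a minimal sketch of subsetting a vector and filtering out NAs. The `returns` vector and its values are made up for illustration; they are not taken from the course material.

```r
# Hypothetical sample data: six monthly returns, two of them missing
returns <- c(1.2, NA, -0.5, 3.1, NA, 0.8)

# Subset by position: the first three elements
first_three <- returns[1:3]

# Subset with a logical condition; comparisons against NA yield NA,
# so the missing entries are carried through into the result
positive <- returns[returns > 0]

# Filter out missing values: is.na() flags them, ! negates the flags
clean <- returns[!is.na(returns)]

# Two equivalent ways to average while ignoring the NAs
mean(clean)
mean(returns, na.rm = TRUE)
```

Many summary functions in base R, such as `mean()` and `sum()`, accept `na.rm = TRUE`, so explicit filtering is mostly useful when you need the cleaned vector itself.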

(2) Reading material

photo credit: Carlos Porto (flickr)
All slides used in the lectures can be viewed or downloaded for later reference. This helped me quite a bit. I find that when you learn something new, you don't always get it the first time. But when you review it a week later, you discover something new or understand the concept a lot better. Here's something that I've come to appreciate:

"Repetition is a form of change"

(3) Not too long, not too short, just right [Goldilocks]

This course is actually in its 14th edition. It pretty much runs on auto-pilot: it starts on the 1st day of each month and runs for exactly 4 weeks. In fact, all courses of the Data Science Specialization are designed this way. So you can choose any course you like, and by the end of the month you are pretty much done and can move on to the next course. Depending on your experience and bandwidth, you can even take more than one course in parallel.

Coming to the course content itself, I felt each week had something new to offer, and the accompanying quizzes and assignments were quite engaging. You probably read the Goldilocks story as a child. People these days refer to the Goldilocks principle to describe a task that is neither too hard nor too easy, but just right: just enough to get someone into the "flow" of things. This course felt exactly like that for me.

(4) Community help (including tips on completing the assignments)

This was one of the reasons why I persisted with the course. The assignments are challenging, as they not only test your understanding of the lectures, but also your ability to search for material outside the lectures, which will be needed to complete the assignment. For example, the first assignment requires that you know how to subset a data frame. Although the lectures covered subsetting in great detail, I still couldn't get myself to complete the assignment without searching for some help online.
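For readers who have not seen it, this is roughly what subsetting a data frame looks like. It is a sketch on a made-up `funds` data frame (the column names and values are my own assumptions), not the assignment's actual data or solution.

```r
# Hypothetical data frame of mutual funds; one return value is missing
funds <- data.frame(
  name = c("A", "B", "C", "D"),
  category = c("equity", "debt", "equity", "debt"),
  return_pct = c(12.5, NA, 8.1, 6.4),
  stringsAsFactors = FALSE
)

# Keep the rows matching a condition; the empty slot after the comma
# means "all columns"
equity <- funds[funds$category == "equity", ]

# Drop the rows where return_pct is missing
complete <- funds[!is.na(funds$return_pct), ]

# Combine row conditions and pick specific columns by name
high <- funds[!is.na(funds$return_pct) & funds$return_pct > 7,
              c("name", "return_pct")]

# subset() expresses the same query and treats NA conditions as FALSE
high2 <- subset(funds, return_pct > 7, select = c(name, return_pct))
```

The bracket form is more verbose, but it makes the NA handling explicit, which is the part that tends to trip people up.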

It turns out that other students did the same thing in previous editions of the course, and a couple of them have collected resources that might help others do the assignments a lot faster. These include tips, guided assignments (with step-by-step instructions), unit tests and a cheat sheet covering the contents of all lectures in a concise way (which is great for revision).

(5) The e-Book

Roger D. Peng (one of the course instructors) has released an e-book using the pay-what-you-want (PWYW) model, which means exactly what it sounds like. If you want to download the book for free, he allows you to do that.

I recommend paying something as a token of gratitude for the hard work the author has put into making this book. I purchased a copy for myself at the end of the course. I plan to read it this month and write a review of it.

(6) Swirl

Learning R in R!


I saved the best for last :) 

This was my most enjoyable experience with this course. Swirl is an amazing R package that runs inside the R console. It offers interactive courses organized by difficulty (beginner, intermediate, advanced). Each course comes with one or more topics, each relating to a specific concept. After you choose a topic, you are given a series of instructions that help you understand it. There's also a neat progress bar at the end of each step.
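If you want to try it, getting started takes only a few lines. The sketch below assumes you have a working R installation and internet access to CRAN; the `interactive()` guard is my own addition so that nothing happens if the code is run from a non-interactive script.

```r
# Run these lines in an interactive R session (R console or RStudio)
if (interactive()) {
  # One-time install from CRAN if the package is not already present
  if (!requireNamespace("swirl", quietly = TRUE)) {
    install.packages("swirl")
  }
  library(swirl)

  # Start the interactive session, then follow the on-screen prompts
  swirl()
}
```

Everything after `swirl()` happens through prompts in the console, so there is nothing more to script.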

I completed the R Programming course module (within Swirl), which covers 15 topics in total. It's a great complement to the content provided with the main course (lectures, quizzes and assignments). I look forward to completing the other interactive courses in the coming months.

Conclusion

This is one of the best courses I've taken on Coursera: it is well designed and has great support from the community. If you are a programmer, a product manager looking to pick up some analytical skills, or just someone interested in learning more about Data Science, I encourage you to take this course.
