Statistics for Data Science. (T-SQL Tuesday 108)

The invitation to this blog party asks for one thing I want to learn that is not SQL Server.  The TL;DR answer is: statistics for data science.

I started working on this earlier this year.  From June to October I took the “Business Intelligence and Data Analytics” certificate program at the University of Victoria. Each class started with a weekend on campus and was followed by a month of assignments to complete off campus. The three classes were:

  1. Business Intelligence and Data Analytics: Basics
  2. Business Intelligence: Dashboard Design
  3. Data Analytics: Model Design

The coursework was all R. I had not used R previously and, gotta say, I didn’t love it, but I learned enough to ace the assignments and to google for what I needed. (Let’s face it, that’s half the battle with any new language.) By the end of the course we were scraping data from multiple sources and formats, shaping it into a data analytics project, and then building descriptive and diagnostic models on that data.

The biggest challenge I had with the course was the statistics. I’ve taken some form of Stats 101 a few times (first when taking commerce at university, and again when taking programming at college), but nothing that would prepare me for data science. I want a much better understanding of the concepts so I can actually DO a data science project.

Last Monday at PASS Summit I took the “Advanced R” pre-con by Dejan Sarka, which was a great follow-up to the course I’d just finished, but he reinforced what I’d been feeling… I need more stats.

The next few months will be busy, so I’ve tried to keep my next steps realistic. They are to:

  1. In the next couple of weeks: Use my commute to listen to an audiobook.  I’ll probably start with “Naked Statistics: Stripping the Dread from the Data”.  If you, dear reader, have a podcast or other audiobook suggestion for me, I’d love to hear it (pun intended).
  2. Following that: Borrow the book: “Practical Statistics for Data Scientists” from work and read up on at least 3 topics (I can’t commit to reading a whole book… I have no time for reading print until next year).
  3. After Christmas: Do a few Kaggle competitions and see where else I need to brush up.

