Statistics for Data Science (T-SQL Tuesday #108)

The invitation to this blog party is here: https://curiousaboutdata.com/2018/10/29/t-sql-tuesday-108-invitation-non-sql-server-technologies/ and it asks for one thing I want to learn that is not SQL Server.  The TL;DR answer is: statistics for data science.

I started working on this earlier this year.  From June to October I took the “Business Intelligence and Data Analytics” certificate program at the University of Victoria. Each class started with a weekend on campus and was followed by a month of assignments to complete off campus. The three classes were:

  1. Business Intelligence and Data Analytics: Basics
  2. Business Intelligence: Dashboard Design
  3. Data Analytics: Model Design

The coursework was all R. I had not used R previously and gotta say I didn’t love it, but I learned enough to ace the assignments and to google for what I needed. (Let’s face it, that’s half the battle of any new language.) By the end of the course we were scraping data from multiple data sources and formats, manipulating the data into a data analytics project and then building descriptive and diagnostic models on that data.

The biggest challenge I had with the course was the statistics. I’ve taken some form of Stats 101 a few times (first when taking commerce at university, and again when taking programming at college), but nothing that would prepare me for data science. I want a much better understanding of the concepts so I can actually DO a data science project.

Last Monday at PASS Summit (http://www.passsummit.com) I took the “Advanced R” pre-con by Dejan Sarka, which was a great follow-up to the course I’d just finished, but he reinforced what I’d been feeling… I need more stats.

The next few months will be busy, so I’ve tried to keep my next steps realistic. They are to:

  1. In the next couple weeks: Use my commute to listen to an audiobook.  I’ll probably start with “Naked Statistics: Stripping the Dread from the Data”.  If you, dear reader, have a podcast or other audiobook suggestion for me, I’d love to hear it (pun intended).
  2. Following that: Borrow the book: “Practical Statistics for Data Scientists” from work and read up on at least 3 topics (I can’t commit to reading a whole book… I have no time for reading print until next year).
  3. After Christmas: Do a few Kaggle competitions and see where else I need to brush up.

Giving Back (T-SQL Tuesday #102)

This seems like as good a time to start a blog as any, as I feel like 2018 is the year I start giving back to this community. I’ve been a member of PASS since the first year I went to Summit in 2013 and I’ve learned a lot from the members of PASS. The pathways to learning have been varied, but I have really benefited from presentations, blog posts, the hallway track, and even Twitter. I wanted to give back, but was not prepared to take on more commitments until recently. My kids are getting older and lately don’t need their mommy quite like they used to.

At last Summit I committed to organizing a SQL Saturday for my area for early March. When I was preparing for it I realized I was too out of the loop with regard to my local tech scene. Over the last couple of years the local user group had dried up and I wasn’t getting out and meeting people. So I became the leader of the defunct local user group, and as soon as SQL Saturday was over I secured a venue and started setting up meetings.

Our first meeting was a meet and greet to gauge interest. A dozen people came and they were enthusiastic and contributed ideas. The next month I presented a session and again the attendees were engaged. I have speakers lined up until summer break and plan to have at least 3 meetings in the fall.

Did you notice in the previous paragraph that *I* presented a session? You may not be aware that that was a significant statement. I had not done that before. The even more significant part is that I did it AGAIN last weekend at SQL Saturday in Edmonton. I plan to work at giving more presentations, by developing another session to present locally in the fall and by submitting to more events.

I’m also an organizer for the Professional Development virtual chapter. We just hosted a presentation and have two more in the pipe.  You can expect life from that group again.