
Data Science Studies // Focus on Statistics First! [2019 Study Plan]

Study Plan for Data Science

One of the biggest eye-openers in my journey has been realizing that data science fundamentally revolves around statistics. Statistics is a long-established college major, and data science isn't entirely new: it combines traditional statistical methods with modern computing, using software such as R and SAS for analysis.

To truly grasp data science, it's essential not only to learn programming languages but also to understand the underlying principles. As such, I've decided to revise my study plan for 2019, shifting from a language/platform-specific focus to a more theory-based approach that emphasizes core concepts and techniques.

Eventual Skillset

I. Domain:

  1. Statistics
  2. DW/BI (Data Warehouse / Business Intelligence)
  3. Math

II. Tools:

  1. SQL
  2. Tableau
  3. R
  4. Python


I. Domain

A. Statistics

  • Statistics and Data Analysis (WMU - Statistics 160 Textbook)
    A solid introduction to key concepts.

  • Basics of Statistics [BoS]
    A great follow-on to reinforce and clarify concepts with alternative definitions.

  • Simple Data Analysis for Biologists
    Useful for examples in hypothesis building.

These readings have been straightforward. Together they offer different perspectives on statistics and show how universal concepts like discrete and continuous data are. My goal is to master these fundamentals and deepen my appreciation for data science while refreshing my math knowledge.

B. DW/BI

  • Kimball - Data Warehouse 3rd Edition
    Essential reading for understanding data warehousing.

  • Guide to Data Modeling - UW 1999
    A foundational text to support my study of Kimball.

C. Math

  • Advanced Calculus Textbook
  • Probability and Mathematical Statistics

II. Tools

I am proficient in SQL and have completed the MCSA - SQL Server certification. Its study material includes Itzik Ben-Gan's excellent book for 70-461 - Querying SQL Server, regarded as the best in that series. This year, my study plan converges with my certification goals:

  • Tableau - Desktop Specialist: $150 USD
  • 70-773 - Analyzing Big Data with Microsoft R [for MCSE - Data Management & Analytics]: $165 USD
  • 98-381 - Introduction to Programming Using Python [for MTA]: $127 USD

III. Goals

My goals for this journey are threefold:

  1. Read All the Books: Roughly 2,000 pages to cover.
  2. Practical Hands-On Experience: Apply the concepts learned.
  3. Get the Certifications: Achieve formal recognition of my skills.

Given the extensive scope of this plan, I expect it to take several years. It took me about two years to earn my MCSA, and given the complexity of data science, two years feels optimistic for this endeavor, even though I already dedicated much of 2018 to studying and completed two textbooks (Art of R, Python).


IV. Updates

Update 11/14/18

Finished WMU Stats Book
I completed all 11 chapters of the Statistics and Data Analysis (WMU - Statistics 160 Textbook). It was a manageable read that helped clarify several basic concepts while introducing new material that I will need to revisit for deeper understanding. I highly recommend this resource for anyone interested in learning statistics. Next, I plan to study Basics of Statistics to solidify my foundational knowledge.


Update 11/18/18

Finished Basics of Statistics [BoS]
I completed this book, which served as an excellent counterpoint to the WMU textbook. It’s almost like a condensed version of WMU, and while I skimmed both texts, I plan to return to specific sections for more in-depth understanding. Concepts like the Central Limit Theorem (CLT), confidence intervals, null hypotheses, and p-values are becoming clearer.
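To make the Central Limit Theorem and confidence intervals concrete, here is a minimal Python sketch. It is not taken from either textbook: the exponential population, the sample sizes, the seed, and the helper names `sample_means` and `confidence_interval_95` are all assumptions chosen for illustration. The idea is that means of repeated samples from a skewed population cluster around the population mean, with a spread close to sigma divided by the square root of the sample size.

```python
import random
import statistics


def sample_means(draw, sample_size, n_samples, rng):
    """Draw n_samples samples of sample_size values each and return their means."""
    return [
        statistics.fmean(draw(rng) for _ in range(sample_size))
        for _ in range(n_samples)
    ]


def confidence_interval_95(sample):
    """Approximate 95% confidence interval for the mean (normal critical value 1.96)."""
    mean = statistics.fmean(sample)
    std_err = statistics.stdev(sample) / len(sample) ** 0.5
    return mean - 1.96 * std_err, mean + 1.96 * std_err


# Hypothetical population: exponential with rate 1 (skewed, mean 1, sd 1).
def draw_exponential(rng):
    return rng.expovariate(1.0)


rng = random.Random(42)  # fixed seed so the run is repeatable

# CLT: the means of many samples of size 50 should center near 1
# with a standard deviation near 1 / sqrt(50), about 0.14.
means = sample_means(draw_exponential, sample_size=50, n_samples=2000, rng=rng)
print("mean of sample means:", round(statistics.fmean(means), 3))
print("sd of sample means:  ", round(statistics.stdev(means), 3))

# Confidence interval from a single sample of 200 observations.
one_sample = [draw_exponential(rng) for _ in range(200)]
low, high = confidence_interval_95(one_sample)
print("approx. 95% CI for the mean:", (round(low, 3), round(high, 3)))
```

The 1.96 multiplier is the normal approximation; for small samples a t critical value would be the more careful choice.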

I also skimmed Simple Data Analysis for Biologists, which helped me understand how to formulate hypotheses. Reading multiple texts on the same subject has been beneficial, revealing different strengths and perspectives.

Update 11/20/18

Kimball Data Warehouse 3rd Edition
I’ve begun studying the DW/BI concepts using Kimball’s 3rd Edition. It’s been an excellent read, and I wish I had explored it sooner! My prior studies in statistics are helping me appreciate how to design data warehouses to support analytical models.

I completed the Guide to Data Modeling - UW 1999 quickly. It provided a solid foundation before diving into Kimball, covering ERDs (Entity-Relationship Diagrams) and essential vocabulary that adds context to my studies.
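As a rough illustration of how a dimensional design supports analysis, here is a small star schema sketched with Python's built-in sqlite3 module. The table names (`dim_date`, `dim_product`, `fact_sales`) and every row of data are hypothetical, made up for this example rather than taken from Kimball; the point is only the shape: one fact table of measurements joined to descriptive dimension tables.

```python
import sqlite3

# In-memory database so the sketch leaves nothing behind.
conn = sqlite3.connect(":memory:")

# Two dimension tables (descriptive context) and one fact table (measurements).
conn.executescript("""
CREATE TABLE dim_date (
    date_key INTEGER PRIMARY KEY,
    year     INTEGER,
    month    INTEGER
);
CREATE TABLE dim_product (
    product_key INTEGER PRIMARY KEY,
    name        TEXT,
    category    TEXT
);
CREATE TABLE fact_sales (
    date_key    INTEGER REFERENCES dim_date(date_key),
    product_key INTEGER REFERENCES dim_product(product_key),
    quantity    INTEGER,
    amount      REAL
);
""")

# Hypothetical sample rows.
conn.executemany(
    "INSERT INTO dim_date VALUES (?, ?, ?)",
    [(20181101, 2018, 11), (20181201, 2018, 12)],
)
conn.executemany(
    "INSERT INTO dim_product VALUES (?, ?, ?)",
    [(1, "Widget", "Hardware"), (2, "Gadget", "Hardware"), (3, "Manual", "Books")],
)
conn.executemany(
    "INSERT INTO fact_sales VALUES (?, ?, ?, ?)",
    [
        (20181101, 1, 2, 20.0),
        (20181101, 3, 1, 15.0),
        (20181201, 2, 1, 30.0),
        (20181201, 1, 1, 10.0),
    ],
)

# A typical analytical query: total sales amount by month and product category.
rows = conn.execute("""
    SELECT d.month, p.category, SUM(f.amount) AS total
    FROM fact_sales AS f
    JOIN dim_date AS d ON d.date_key = f.date_key
    JOIN dim_product AS p ON p.product_key = f.product_key
    GROUP BY d.month, p.category
    ORDER BY d.month, p.category
""").fetchall()

for month, category, total in rows:
    print(month, category, total)
```

Keeping the measurements in one fact table and the descriptive attributes in dimensions means most questions reduce to the same join-and-group pattern shown in the query.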


This retooled plan aligns my learning path with my long-term goals in data science, combining theoretical knowledge with practical skills and certifications.
