mzakaria.github.io

Hello!

My name is Mohammed Zakaria. I am a data scientist. I will use this page to post a few projects that I am working on and some resources that I have found useful on my journey in the field of Data Science.

Influencers in Data Science:

  1. Eamonn Keogh: Master of time series and clustering. No Twitter handle. Check out his website: https://www.cs.ucr.edu/~eamonn/ It is a treasure!
  2. Peyman Milanfar @docmilanfar: Nice posts about statistics.
  3. Sebastian Raschka: He is an excellent communicator for ML: GitHub
  4. Katie Bauer: Good resource on DS management and leadership: website and Twitter (https://twitter.com/imightbemary)

Good papers in Data Science:

  1. https://www.cs.ucr.edu/~eamonn/meaningless.pdf Why clustering time series subsequences extracted with a sliding window is meaningless. Brilliant paper!
  2. https://arxiv.org/pdf/1811.12808.pdf Model Evaluation, Model Selection, and Algorithm Selection in Machine Learning. Nice paper that covers the fundamentals.

Interesting info in Statistics

https://github.com/mzakariaCERN/FunWithStats.git

Data Science systems: Engineering, Project Management, Product Development, MLOps

https://www.chaos-engineering.dev/p/your-data-science-problems-are-engineering

Data Sources

1. Aiddata

Aiddata has an excellent list of geospatial datasets. It’s all available as CSV, so you don’t even need to know GIS to access their data.

link: http://geo.aiddata.org

More resources: https://x.com/yohaniddawela/status/1776203931036647455

Utility Industry

https://www.utilitydive.com/news/electricity-transmission-competition-first-refusal-rofr-ferc-cicio/713955/