Hello!
My name is Mohammed Zakaria. I am a data scientist. I will use this page to post a few projects that I am working on and some resources that I have found useful along my journey in the field of Data Science.
Influencers in Data Science:
- Eamonn Keogh: Master of time series and clustering. No Twitter handle. Check out his website: https://www.cs.ucr.edu/~eamonn/ It is a treasure!
- Peyman Milanfar @docmilanfar: Nice posts about statistics.
- Sebastian Raschka: An excellent communicator for ML: GitHub
- Katie Bauer: Good resource on DS management and leadership: website and [Twitter](https://twitter.com/imightbemary)
Good papers in Data Science:
- https://www.cs.ucr.edu/~eamonn/meaningless.pdf How clustering time series subsequences extracted with a sliding window is meaningless. Brilliant paper!
- https://arxiv.org/pdf/1811.12808.pdf Model Evaluation, Model Selection, and Algorithm Selection in Machine Learning. Nice paper that covers the fundamentals.
Interesting info in Statistics
https://github.com/mzakariaCERN/FunWithStats.git
Data Science systems: Engineering, Project Management, Product Development, MLOps
https://www.chaos-engineering.dev/p/your-data-science-problems-are-engineering
Data Sources
1. AidData
AidData has an excellent list of geospatial datasets. It’s all available as CSV, so you don’t even need to know GIS to access the data.
link: http://geo.aiddata.org
More resources: https://x.com/yohaniddawela/status/1776203931036647455
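To illustrate the point above about not needing GIS tools for CSV-based geospatial data, here is a minimal sketch using only the Python standard library. The sample rows and the column names (`project_id`, `latitude`, `longitude`) are made up for illustration and are not taken from AidData's actual files; a real export will likely use different headers.

```python
import csv
import io

# Hypothetical sample standing in for a downloaded CSV export.
# The column names and values are assumptions, not AidData's real schema.
SAMPLE_CSV = """project_id,latitude,longitude
A1,9.03,38.74
A2,-1.29,36.82
A3,6.52,3.38
"""


def rows_in_bounding_box(csv_text, south, north, west, east):
    """Return the rows whose coordinates fall inside the bounding box."""
    reader = csv.DictReader(io.StringIO(csv_text))
    selected = []
    for row in reader:
        lat = float(row["latitude"])
        lon = float(row["longitude"])
        if south <= lat <= north and west <= lon <= east:
            selected.append(row)
    return selected


# Keep only the points between 5S and 15N, and between 30E and 45E.
selected = rows_in_bounding_box(SAMPLE_CSV, south=-5, north=15, west=30, east=45)
print([row["project_id"] for row in selected])  # prints ['A1', 'A2']
```

For a real file, the same function could be fed the text of the downloaded CSV, after adjusting the column names to match its header row.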
Utility Industry
https://www.utilitydive.com/news/electricity-transmission-competition-first-refusal-rofr-ferc-cicio/713955/