Constituent Ingredients of Data Science
The following is my eccentric opinion on what makes up the cross-disciplinary subject of data science, along with a non-exhaustive list of subconstituents.
- Math
Probability theory
- Statistical Learning/Inference
Maximum Likelihood
Probably Approximately Correct
Hypothesis Testing
- Optimization
Calculus
- Linear Algebra/Sheaf theory
Arrays
Signal Processing
- Software Development
Documentation
- Text Editing
IDE, Vim, etc.
Fluently reading/writing in high-level programming language(s)
Using/creating libraries, APIs, open source software, etc.
- Development practices
Test Driven Development
Agile
Continuous integration/delivery
Version control
Dependency management
Effective communication
Identification/construction of key performance indicators
Product sense
- Subject Matter Expertise
“Know the data”
Note
I used to have decision making as a section. It has been subsumed under software development.
Note
I used to have a section on the experimental method. I’ve subsumed this into the category of statistics, which fits into math. This is questionable. What I mean is that the aspects of the experimental method which are relevant to data science fit into statistics, and therefore math.