What is Data Science?
Unfortunately, there is little consensus on how to define Data Science, so we will approach it as follows:
Some techniques/concepts we will learn include:
Some statistical tools we will learn include:
The class will be taught using:
- Pencil-and-paper activities, to focus on concepts rather than on software or tool sets.
- Excel. Easy to learn, full-featured. We will use it for numerical analysis and simple graphing.
- RStudio (and R). A very powerful statistical software package, widely used by data analysts. We will use it for more sophisticated or demanding analysis and for producing better, more detailed plots.
- Google Fusion Tables. Allows for heat maps of data and network graphs.
MAIN COURSE SEQUENCE
- Start with Unit Sequence for 1st half
  - Mix of pen-and-paper activities and Excel
  - Slowly ramp up Excel knowledge (formulas, pivot tables, etc.)
  - Slowly introduce R
  - Side unit with Fusion Tables
- Project with Excel
- Project with Fusion Tables
  - Heat maps
  - Network graphs
- Project with R
- Students design and carry out their own project for the PBA
  - Find data
  - Clean data
  - EDA
  - Pose questions/analysis ideas
  - Run analysis
  - Visualizations
  - Presentation and write-up
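
To make the project steps concrete, here is a minimal sketch of the clean, explore, analyze, and visualize stages. It is written in Python purely as an illustrative assumption (the class itself uses Excel and R), and the tiny dataset, the column names (`hours_studied`, `score`), and the helper functions `clean` and `summarize` are all hypothetical, invented only for this example.

```python
import statistics

# Hypothetical raw records, invented for illustration only.
raw_rows = [
    {"student": "A", "hours_studied": "2", "score": "65"},
    {"student": "B", "hours_studied": "4", "score": "78"},
    {"student": "C", "hours_studied": "", "score": "70"},    # missing value
    {"student": "D", "hours_studied": "6", "score": "88"},
    {"student": "E", "hours_studied": "8", "score": "n/a"},  # unparseable value
]


def clean(rows):
    """Clean data: keep only rows whose numeric fields parse as numbers."""
    cleaned = []
    for row in rows:
        try:
            cleaned.append({
                "student": row["student"],
                "hours_studied": float(row["hours_studied"]),
                "score": float(row["score"]),
            })
        except ValueError:
            # Drop rows with missing or malformed numbers.
            continue
    return cleaned


def summarize(values):
    """EDA: a few basic summary statistics."""
    return {
        "n": len(values),
        "mean": statistics.mean(values),
        "min": min(values),
        "max": max(values),
    }


rows = clean(raw_rows)
hours = [r["hours_studied"] for r in rows]
scores = [r["score"] for r in rows]
score_summary = summarize(scores)

# Pose a question: do students who study more tend to score higher?
# Run analysis: slope of the least-squares line of score on hours studied.
mean_h = statistics.mean(hours)
mean_s = statistics.mean(scores)
slope = (
    sum((h - mean_h) * (s - mean_s) for h, s in zip(hours, scores))
    / sum((h - mean_h) ** 2 for h in hours)
)

# Visualization: a rough text bar chart, one '#' per 10 points of score.
for r in rows:
    print(f"{r['student']} | {'#' * int(r['score'] // 10)} {r['score']:.0f}")

print("Score summary:", score_summary)
print("Estimated points gained per extra hour studied:", slope)
```

In a real project the data would come from a file or a public source rather than a hard-coded list, and the plots would be produced in Excel, R, or Fusion Tables; the point here is only the order of the steps.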