- Practical Predictive Analytics
- Ralph Winters
- 2025-04-04 19:02:43
Dimension reduction techniques
You will often be examining hundreds or even thousands of variables, and dimension reduction is a technique that you can use to drastically reduce the number of rows or variables that you need to examine. The premise behind dimension reduction is redundancy: many variables end up measuring the same thing. For example, reading, writing, math, and musical aptitude scores may all be important in predicting a college GPA, but it is possible that using only a musical aptitude score in combination with a writing score would achieve the same prediction accuracy as using all four measures. A model with fewer variables might also be easier to explain, which is one more reason to use a dimension reduction technique.
When you want to reduce the actual number of variables, consider principal components. You will end up with the same number of observations but will limit the number of variables you look at (see the Principal Components section).
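To make the idea concrete, here is a minimal sketch of extracting the first principal component using only the Python standard library. The student scores are made up for illustration, the function name `first_principal_component` is hypothetical, and power iteration is just one simple way to find the leading component. In practice you would likely use a library routine instead.

```python
import math

def first_principal_component(rows, iterations=200):
    """Return (loadings, scores, explained) for the first principal component.

    Assumes the variables are not all constant and that the starting
    vector is not orthogonal to the leading component.
    """
    n, p = len(rows), len(rows[0])
    # Center each variable (column) at its mean.
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    centered = [[r[j] - means[j] for j in range(p)] for r in rows]
    # Sample covariance matrix (p x p).
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(p)]
           for i in range(p)]
    # Power iteration: multiply by the covariance matrix and renormalize.
    v = [1.0] * p
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Variance captured by this component, as a share of total variance.
    eigenvalue = sum(v[i] * sum(cov[i][j] * v[j] for j in range(p))
                     for i in range(p))
    total_variance = sum(cov[i][i] for i in range(p))
    # One score per observation: the row count is unchanged.
    scores = [sum(c[j] * v[j] for j in range(p)) for c in centered]
    return v, scores, eigenvalue / total_variance

# Made-up scores: reading, writing, math, musical aptitude (one row per student).
students = [
    [72, 70, 65, 60],
    [85, 88, 80, 75],
    [60, 58, 62, 55],
    [90, 92, 85, 88],
    [78, 75, 70, 72],
    [66, 68, 60, 58],
]

loadings, scores, explained = first_principal_component(students)
print(len(students), "students,", len(students[0]), "variables -> 1 score each")
print("share of variance explained:", round(explained, 3))
```

Each student still has a row, but the four correlated scores are summarized by a single column. That is the sense in which principal components reduce variables rather than observations.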
When you want to reduce the number of rows (observations), use clustering methods.
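As a rough illustration of reducing rows, the sketch below uses k-means, one common clustering method, to replace six observations with two cluster centroids. The data points and the `kmeans` helper are hypothetical, and the choice of k-means and of `k=2` are assumptions made for this example only.

```python
import random

def kmeans(rows, k, iterations=20, seed=0):
    """Plain k-means: return (centroids, labels) for a list of numeric rows."""
    rng = random.Random(seed)
    # Start from k distinct observations chosen at random.
    centroids = [list(r) for r in rng.sample(rows, k)]
    labels = [0] * len(rows)
    for _ in range(iterations):
        # Assignment step: attach each row to its nearest centroid.
        labels = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(r, centroids[c])))
            for r in rows
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [r for r, lab in zip(rows, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, labels

# Made-up (reading, writing) scores with two visibly separate groups.
observations = [
    [60, 58], [62, 61], [59, 63],
    [88, 90], [91, 87], [90, 92],
]

centroids, labels = kmeans(observations, k=2)
print(len(observations), "rows summarized by", len(centroids), "centroids")
print("cluster label per row:", labels)
```

The variables are untouched here; what shrinks is the number of rows you need to look at, since each cluster can be examined through its centroid or a few representative members.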