- Practical Predictive Analytics
- Ralph Winters
- 2025-04-04 19:02:43
Data munging and wrangling
A large part of preparing data for analysis involves bringing disparate information together to produce the final analytics dataset, which is passed directly to the algorithm. This process goes by many names, such as data munging, data wrangling, ETL, or simply data prep. We have already discussed some ways to read data from a single source. You will be very fortunate if you can work with a single data file that has all of the information you need. In fact, if you can use data that is already consolidated, go for it: someone else has already done the work, and there is no need to figure out how to relate the underlying sources yourself. Most of the time, however, you will need to work with at least two different sources and relate them based on some common data elements.
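To make the idea of relating two sources on a common data element concrete, here is a minimal sketch in plain Python. Everything in it is assumed for illustration: the two in-memory CSV sources, the column names (`customer_id`, `name`, `order_id`, `amount`), and the helper functions `read_rows` and `inner_join` are hypothetical and do not come from the text.

```python
import csv
import io

# Two hypothetical sources that share a common key, `customer_id`.
CUSTOMERS_CSV = """customer_id,name
1,Ana
2,Ben
3,Cara
"""

ORDERS_CSV = """order_id,customer_id,amount
100,1,25.00
101,1,10.50
102,3,7.25
103,4,99.99
"""


def read_rows(text):
    """Parse CSV text into a list of dicts, one per row."""
    return list(csv.DictReader(io.StringIO(text)))


def inner_join(left, right, key):
    """Return merged rows for every left/right pair that shares `key`."""
    # Index the right-hand rows by key so each left row is matched quickly.
    index = {}
    for row in right:
        index.setdefault(row[key], []).append(row)

    joined = []
    for lrow in left:
        # Left rows with no match on the right are dropped (inner join).
        for rrow in index.get(lrow[key], []):
            joined.append({**lrow, **rrow})
    return joined


customers = read_rows(CUSTOMERS_CSV)
orders = read_rows(ORDERS_CSV)

# Each order row gains the matching customer's columns.
analytics_rows = inner_join(orders, customers, key="customer_id")
```

In this sketch, the order that refers to a customer missing from the customer source is dropped, and a customer with no orders never appears in the result. Deciding how to treat such unmatched records is a typical part of the data prep work described above. In practice you would more likely rely on a library routine, such as `merge()` in R or `DataFrame.merge` in pandas, than write the join by hand.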