The Prepare recipe has many features for exploring a dataset as you enrich and transform it. A typical progression is to gain an understanding of the columns in your dataset, the distribution of values within columns of interest, and then to explore values and patterns of values within the dataset.
The Columns view is useful when working with many columns. Using filtering and sorting, you can discover columns that are similar and find columns you’re looking for.
You can filter columns in three ways:
You can sort by various criteria, some of which are only appropriate to columns with numeric meaning. It’s generally useful to display the sort criteria in the column under the sort menu.
Any filtering and sorting you apply is cumulative.
The Table view allows you to quickly navigate to a column by typing c and then entering part of the column name. The dropdown selection updates as you type to show the columns whose names contain the typed text. Additionally, you can display a selection of columns.
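The name matching described above can be pictured as a simple substring search over the column names. The following is only a sketch of that idea in plain Python, not the product's implementation: the function name `matching_columns` and the sample column names are hypothetical, and ignoring case is an assumption.

```python
def matching_columns(columns, typed):
    """Return the column names that contain the typed text.

    Assumption: matching ignores case; the actual behavior may differ.
    """
    needle = typed.lower()
    return [name for name in columns if needle in name.lower()]


# Hypothetical column names, used only for illustration.
columns = ["customer_id", "order_date", "ship_date", "total"]
print(matching_columns(columns, "date"))
```

Because the match is on any part of the name, typing a short fragment shared by related columns is usually enough to narrow the list.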
There are two ways to explore the distributions of values in columns:
Using coloring, filtering, and highlighting, you can zero in on values of interest in the Table view.
By default, cells are colored by meaning validity, with red for cells that don’t match the column Meaning, but you can also color by column values.
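To make the idea of validity coloring concrete, the sketch below flags the values in a column that do not match an expected meaning. It is an illustration in plain Python under stated assumptions, not the way the product validates cells: `is_valid_integer` is a hypothetical rule standing in for an integer-like meaning, and the sample values are invented.

```python
def is_valid_integer(value):
    """Hypothetical validity rule: the text must parse as an integer."""
    try:
        int(value.strip())
        return True
    except ValueError:
        return False


def invalid_cells(values, is_valid):
    """Return the positions of the values that fail the validity rule."""
    return [i for i, value in enumerate(values) if not is_valid(value)]


# Invented sample column; positions 2 and 4 do not parse as integers.
ages = ["34", "27", "n/a", "41", "4x"]
print(invalid_cells(ages, is_valid_integer))
```

The positions returned here correspond to the cells that would stand out in red when coloring by validity.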
Using a combination of color shading and column selection, you can visually scan for patterns of values across columns of interest.
Filtering values is performed:
Any coloring, filtering, and sorting you apply is cumulative.
When a value is very long, you can select Show complete value, or use the Shift + v shortcut, to display the full cell contents so that they are easier to copy. Note: triple-clicking on a cell also selects the full cell contents, even if the contents are not entirely displayed.
You can also highlight a row of interest by selecting Toggle row highlight, or by using the Shift + h shortcut.