A name-changing help

Column names polishing

Having repated column names, names with spaces in them, names where spaces are padding the beginning or the end, names with inconsistent formating, etc can certainly become a trouble when trying to reference a certain column during your workflow.

To tackle this problems directly, we have the functions polish_names, polish_names! and polish_names_ROT used as follows:

julia> using Cleaner

julia> ct = CleanTable([Symbol(" Horrible name 1"), Symbol("another_bad Name  ")], [[1], [2]])
┌──────────────────┬────────────────────┐
│  Horrible name 1 │ another_bad Name   │
│            Int64 │              Int64 │
├──────────────────┼────────────────────┤
│                1 │                  2 │
└──────────────────┴────────────────────┘


julia> polish_names(ct)
┌─────────────────┬──────────────────┐
│ horrible_name_1 │ another_bad_name │
│           Int64 │            Int64 │
├─────────────────┼──────────────────┤
│               1 │                2 │
└─────────────────┴──────────────────┘


julia> polish_names(ct; style=:camelCase)
┌───────────────┬────────────────┐
│ horribleName1 │ anotherBadName │
│         Int64 │          Int64 │
├───────────────┼────────────────┤
│             1 │              2 │
└───────────────┴────────────────┘

Note

Currently the only available styles are :snake_case and :camelCase. The default style is :snake_case.

Internally polish_names, polish_names! and polish_names_ROT all call the generate_polished_names function, so if you just need to generate better names for your table, you could call it as follows and manually rename your table.

julia> generate_polished_names(["  _aName with_lotsOfProblems", "  _aName with_lotsOfProblems"])
2-element Vector{Symbol}:
 :a_name_with_lots_of_problems
 :a_name_with_lots_of_problems_1

julia> generate_polished_names(["  _aName with_lotsOfProblems", "  _aName with_lotsOfProblems"]; style=:camelCase)
2-element Vector{Symbol}:
 :aNameWithLotsOfProblems
 :aNameWithLotsOfProblems_1

If all you want is to change the column names to be your desired ones, you can always use the rename, rename! and rename_ROT functions.

julia> rename(ct, [:A, :B])
┌───────┬───────┐
│     A │     B │
│ Int64 │ Int64 │
├───────┼───────┤
│     1 │     2 │
└───────┴───────┘

Making a row be the column names

When working with messy data you might end up having the row names being the second or third row of the table you have loaded. For this cases you can use the row_as_names, row_as_names! and row_as_names_ROT functions.

By default, row_as_names, row_as_names! and row_as_names_ROT will remove all rows above the index passed, but this behavior can be overwritten by passing the optional keyword argument remove=false.

julia> ct = CleanTable([Symbol(" "), Symbol(" ")], [[" ", "A", 1], [" ", "B", 2]])
┌─────┬─────┐
│     │     │
│ Any │ Any │
├─────┼─────┤
│     │     │
│   A │   A │
│   1 │   1 │
└─────┴─────┘


julia> row_as_names(ct, 2)
┌─────┬─────┐
│   A │   B │
│ Any │ Any │
├─────┼─────┤
│   1 │   2 │
└─────┴─────┘


julia> row_as_names(ct, 2; remove=false)
┌─────┬─────┐
│   A │   B │
│ Any │ Any │
├─────┼─────┤
│     │     │
│   A │   B │
│   1 │   2 │
└─────┴─────┘