pd.melt() can be confusing because of its mediocre documentation and unintuitively-named parameters. It’s nevertheless important for getting dataframes into ‘tidy’ format for plotting purposes.
First, we’ll discuss the
value_vars parameters. Pass into
id_vars a list of columns that you DO NOT WANT CHANGED. These columns will remain as they are. Pass into
value_vars a list of columns you want to be combined into a single column.
pd.melt() takes the columns passed into
value_vars and automatically generates a new column containing the names of those old columns. By default this column is called ‘
variable‘. An additional new column ‘
value‘ is generated containing the values from the old columns.
Imagine you have a dataframe with columns
US Dollar‘, ‘
Japanese Yen‘. The values correspond to the price of a cryptoasset like Bitcoin or Ethereum in a quote currency, like USD.
If we run
df.melt(id_vars='asset', value_vars=['US Dollar', 'Euro', 'Japanese Yen']), we’ll get back a dataframe with an ‘asset’ column, a ‘
variable‘ column containing ‘
US Dollar‘, ‘
Japanese Yen‘, and a column ‘
value‘ containing the value of each cryptoasset in each of the denominations.
pd.melt() gives you some control over how the columns are renamed. By default
pd.melt() will assume that any column not passed into
id_vars is to be transformed. So instead of the above code, we could run
pd.melt(id_vars='airline', var_name='asset', value_name='asset_price'), which will produce the same dataframe but with more appropriately named columns.