How to Use the Density Transform in Vega-Lite for Data Visualization

Are you looking to create smooth curves that estimate the probability density of your data? You are in the right place! Vega-Lite's density transform performs one-dimensional kernel density estimation (KDE). In simpler terms, it takes a numeric field from your raw data and produces a smooth curve showing how its values are distributed.

What is the Density Transform?

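The density transform performs one-dimensional kernel density estimation over an input data stream and generates a new data stream of sample points along the estimated density curve. It goes in the transform array of a view specification. Here is a general template: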

// Any View Specification
{
  ...
  "transform": [
    {"density": ...} // Density Transform
    ...
  ],
  ...
}

Density Transform Definition

| Property | Type | Description |
|---|---|---|
| density | String | Required. The data field for which to perform density estimation. |
| groupby | String[] | The data fields to group by. If not specified, a single group containing all data objects will be used. |
| cumulative | Boolean | A boolean flag indicating whether to produce density estimates (false) or cumulative density estimates (true). Default value: false |
| counts | Boolean | A boolean flag indicating if the output values should be probability estimates (false) or smoothed counts (true). Default value: false |
| bandwidth | Number | The bandwidth (standard deviation) of the Gaussian kernel. If unspecified or set to zero, the bandwidth value is automatically estimated from the input data using Scott's rule. |
| extent | Number[] | A [min, max] domain from which to sample the distribution. If unspecified, the extent will be determined by the observed minimum and maximum values of the density value field. |
| minsteps | Number | The minimum number of samples to take along the extent domain for plotting the density. Default value: 25 |
| maxsteps | Number | The maximum number of samples to take along the extent domain for plotting the density. Default value: 200 |
| resolve | String | Indicates how parameters for multiple densities should be resolved. If "independent", each density may have its own domain extent and dynamic number of curve sample steps. If "shared", the KDE transform will ensure that all densities are defined over a shared domain and curve steps, enabling stacking. Default value: "shared" |
| steps | Number | The exact number of samples to take along the extent domain for plotting the density. If specified, overrides both minsteps and maxsteps to set an exact number of uniform samples. Potentially useful in conjunction with a fixed extent to ensure consistent sample points for stacked densities. |
| as | String[] | The output fields for the sample value and corresponding density estimate. Default value: ["value", "density"] |
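
To show how several of these properties combine, here is a minimal sketch that plots a cumulative density instead of a plain density. It reuses the movies dataset and the IMDB Rating field from the examples in this article; the bandwidth of 0.3 and the output field names rating and cumulative_density are illustrative assumptions, not recommended values.

```json
{
  "description": "Sketch only: bandwidth 0.3 and the output field names are illustrative assumptions.",
  "data": {"url": "data/movies.json"},
  "transform": [
    {
      "density": "IMDB Rating",
      "bandwidth": 0.3,
      "cumulative": true,
      "as": ["rating", "cumulative_density"]
    }
  ],
  "mark": "line",
  "encoding": {
    "x": {"field": "rating", "type": "quantitative", "title": "IMDB Rating"},
    "y": {"field": "cumulative_density", "type": "quantitative"}
  }
}
```

Because as renames the two output fields, the encoding must reference the new names rather than the defaults value and density.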

Examples

Basic Use Case

The simplest case passes a single field to the transform and plots the default output fields, value and density, as an area chart:

{
  "data": {
    "url": "data/movies.json"
  },
  "width": 400,
  "height": 100,
  "transform":[{"density": "IMDB Rating"}],
  "mark": "area",
  "encoding": {
    "x": {
      "field": "value",
      "title": "IMDB Rating",
      "type": "quantitative"
    },
    "y": {
      "field": "density",
      "type": "quantitative"
    }
  }
}
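
By default the curve is sampled only between the smallest and largest observed values. If you would rather sample over a fixed range, a sketch along the following lines sets extent explicitly. The [0, 10] range is an assumption chosen for illustration; pick the range that suits your own field.

```json
{
  "description": "Sketch only: the [0, 10] extent is an illustrative assumption.",
  "data": {"url": "data/movies.json"},
  "width": 400,
  "height": 100,
  "transform": [
    {"density": "IMDB Rating", "extent": [0, 10]}
  ],
  "mark": "area",
  "encoding": {
    "x": {"field": "value", "title": "IMDB Rating", "type": "quantitative"},
    "y": {"field": "density", "type": "quantitative"}
  }
}
```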

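Stacked Density Estimates

Grouping by a field and fixing the extent puts every group's curve on the same sample points, so the areas can be stacked: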

{
  "data": {
    "url": "data/penguins.json"
  },
  "mark": "area",
  "transform": [
    {
      "density": "Body Mass (g)",
      "groupby": ["Species"],
      "extent": [2500, 6500]
    }
  ],
  "encoding": {
    "x": {"field": "value", "type": "quantitative", "title": "Body Mass (g)"},
    "y": {"field": "density", "type": "quantitative", "stack": "zero"},
    "color": {"field": "Species", "type": "nominal"}
  }
}
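
A possible variation, sketched here under stated assumptions, sets counts to true so that each group contributes smoothed counts rather than probability estimates, and sets steps to fix the number of sample points. The value 100 for steps and the axis title are illustrative choices only.

```json
{
  "description": "Sketch only: steps 100 and the y-axis title are illustrative assumptions.",
  "data": {"url": "data/penguins.json"},
  "mark": "area",
  "transform": [
    {
      "density": "Body Mass (g)",
      "groupby": ["Species"],
      "extent": [2500, 6500],
      "counts": true,
      "steps": 100
    }
  ],
  "encoding": {
    "x": {"field": "value", "type": "quantitative", "title": "Body Mass (g)"},
    "y": {
      "field": "density",
      "type": "quantitative",
      "stack": "zero",
      "title": "Smoothed count"
    },
    "color": {"field": "Species", "type": "nominal"}
  }
}
```

With counts enabled, groups with more data objects take up proportionally more of the stack, which a plain probability estimate would hide.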

Faceted Density Estimates

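Want to compare density estimates across different categories? Faceted plots can help. In the template below, your-data-source.csv, category, and yourField are placeholders for your own data source and field names: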

// JSON Configuration for Faceted Density Estimates
{
  "data": {"url": "your-data-source.csv"},
  "facet": {"field": "category", "type": "nominal"},
  "spec": {
    "transform": [
      {"density": "yourField"}
    ],
    "mark": "area",
    "encoding": {
      "x": {"field": "value", "type": "quantitative"},
      "y": {"field": "density", "type": "quantitative"}
    }
  }
}
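
As an alternative to the facet operator, the same idea can be sketched with a row encoding channel: the transform stays at the top level, groupby produces one density per category, and row places each one in its own panel. This sketch reuses the penguins dataset and field names from the stacked example; the width and height are illustrative assumptions.

```json
{
  "description": "Sketch only: width and height are illustrative assumptions.",
  "data": {"url": "data/penguins.json"},
  "width": 400,
  "height": 80,
  "transform": [
    {
      "density": "Body Mass (g)",
      "groupby": ["Species"],
      "extent": [2500, 6500]
    }
  ],
  "mark": "area",
  "encoding": {
    "x": {"field": "value", "type": "quantitative", "title": "Body Mass (g)"},
    "y": {"field": "density", "type": "quantitative"},
    "row": {"field": "Species", "type": "nominal"}
  }
}
```

The field used in row must also appear in groupby; otherwise it is not carried through to the transform's output.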

FAQs

1. When should I use the Density Transform?

Use the density transform when you want a smooth view of how a numeric field is distributed, for example as an alternative to a histogram when the overall shape of the distribution matters more than exact bin counts.

2. What is the role of the groupby property in the Density Transform?

The groupby property runs a separate density estimation for each group in your dataset. Use it when a categorical field splits the data and you want one density curve per category. If it is omitted, all data objects are treated as a single group.
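
To make this concrete, here is a minimal sketch that draws one overlapping, unstacked density curve per group. It reuses the penguins dataset and field names from the stacked example; the opacity of 0.5 is an illustrative assumption so that overlapping areas stay visible.

```json
{
  "description": "Sketch only: opacity 0.5 is an illustrative assumption.",
  "data": {"url": "data/penguins.json"},
  "transform": [
    {"density": "Body Mass (g)", "groupby": ["Species"]}
  ],
  "mark": {"type": "area", "opacity": 0.5},
  "encoding": {
    "x": {"field": "value", "type": "quantitative", "title": "Body Mass (g)"},
    "y": {"field": "density", "type": "quantitative", "stack": null},
    "color": {"field": "Species", "type": "nominal"}
  }
}
```

Setting stack to null keeps the curves overlaid instead of stacked, which makes it easier to compare where each group's distribution peaks.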