Let's Add Another Chart

----------------------+-------------+-----------+---------------+-------------+--------- Crucial_CT750MX300SSD1 | 553260 | 183169 | 138315 | 45792 | 1 CT1000P5PSSD8 | 2057752 | 184133 | 514438 | 46033 | 1 CT250MX500SSD4 | 584018 | 58884 | 146004 | 14721 | 1 INTEL SSDSC2BB016T6 | 524797 | 165071 | 131199 | 41268 | 1 INTEL SSDSC2BB800G4 | 527433 | 178580 | 131858 | 44645 | 1 INTEL SSDSC2KB480G7 | 408940 | 184056 | 102235 | 46014 | 1 INTEL SSDSC2KB480G8 | 516621 | 184821 | 129155 | 46205 | 2 Micron_5100_MTFDDAV960TCB | 553489 | 261468 | 138372 | 65367 | 2 SAMSUNG MZ7KM480HMHQ-00005 | 556298 | 189078 | 139075 | 47270 | 1 SAMSUNG MZ7LM960HMJP-00005 | 550248 | 186309 | 137562 | 46577 | 1 Samsung SSD 980 PRO 1TB | 2046890 | 23510 | 511723 | 5877 | 1 (11 rows)


## Our First Chart

To make a start of visualising this data, let's define a simple Bar chart comparing SSD throughput.  We're also going to output the visualisation as a HTML page, which we can save and open in a web browser.

```sql
-- We define our underlying data analysis queries as CTEs, this just makes it's far easier to separate our data
--  analysis from the visualisation definition
WITH write_4k AS (
    SELECT device, buffered_bw, direct_bw, buffered_iops, direct_iops, count
    FROM
    (
        VALUES
        ('Crucial_CT750MX300SSD1', 553260, 183169, 138315, 45792, 1),
        ('CT1000P5PSSD8', 2057752, 184133, 514438, 46033, 1),
        ('CT250MX500SSD4', 584018, 58884, 146004, 14721, 1),
        ('INTEL SSDSC2BB016T6', 524797, 165071, 131199, 41268, 1),
        ('INTEL SSDSC2BB800G4', 527433, 178580, 131858, 44645, 1),
        ('INTEL SSDSC2KB480G7', 408940, 184056, 102235, 46014, 1),
        ('INTEL SSDSC2KB480G8', 516621, 184821, 129155, 46205, 2),
        ('Micron_5100_MTFDDAV960TCB', 553489, 261468, 138372, 65367, 2),
        ('SAMSUNG MZ7KM480HMHQ-00005', 556298, 189078, 139075, 47270, 1),
        ('SAMSUNG MZ7LM960HMJP-00005', 550248, 186309, 137562, 46577, 1),
        ('Samsung SSD 980 PRO 1TB', 2046890, 23510, 511723, 5877, 1)
    ) data(device, buffered_bw, direct_bw, buffered_iops, direct_iops, count)
)
-- Define the visualisation
SELECT vis.html(vis.visualise(
    -- The dashboard title
    'SSD Performance',
    -- Add a bar chart
    vis.bar_chart(
        -- The chart title
        'SSD Write Throughput (KiB/s)',
        -- The dataset for this chart, which needs to be serialised as a JSON array
        vis.dataset( (SELECT jsonb_agg(q.*) FROM write_4k q) ),
        -- Add a data series of buffered IO
        vis.series(
            -- The data series title
            'Buffered IO Throughput',
            -- The column to use for the chart labels (X axis)
            'device', 
            -- The column to use for the value (Y axis)
            'buffered_bw'
        ),
        -- Add a data series of direct IO
        vis.series('Direct Throughput', 'device', 'direct_bw')
    )
));

If we run this query in psql we'll see a blob of HTML in the console, clearly it's a bit difficult copy and paste this blob. So we'll use psql to write the result to a file, first we need to turn off aligned output and column headers (we'll also turn it back on).

-- configure psql to only output the query result
vis=> \pset pager off
vis=> \pset format unaligned
vis=> \pset tuples_only on

-- now save the result to a file
vis=> \g eg1.html

This will save: eg1.html

Note: your probably want to run with \pset pager off \pset format unaligned \pset tuples_only on for the rest of this tutorial.

Let's add in a second chart comparing SSD IOPs. We're going to display the charts side by side, so we'll make use of pgVis grid layout. The grid is 12 columns wide, so a chart (or other tile) can be from 1 to 12 columns wide, tiles are laidout left to right and wrap every 12 columns.

WITH write_4k AS (
    SELECT device, buffered_bw, direct_bw, buffered_iops, direct_iops, count
    FROM
    (
        VALUES
        ('Crucial_CT750MX300SSD1', 553260, 183169, 138315, 45792, 1),
        ('CT1000P5PSSD8', 2057752, 184133, 514438, 46033, 1),
        ('CT250MX500SSD4', 584018, 58884, 146004, 14721, 1),
        ('INTEL SSDSC2BB016T6', 524797, 165071, 131199, 41268, 1),
        ('INTEL SSDSC2BB800G4', 527433, 178580, 131858, 44645, 1),
        ('INTEL SSDSC2KB480G7', 408940, 184056, 102235, 46014, 1),
        ('INTEL SSDSC2KB480G8', 516621, 184821, 129155, 46205, 2),
        ('Micron_5100_MTFDDAV960TCB', 553489, 261468, 138372, 65367, 2),
        ('SAMSUNG MZ7KM480HMHQ-00005', 556298, 189078, 139075, 47270, 1),
        ('SAMSUNG MZ7LM960HMJP-00005', 550248, 186309, 137562, 46577, 1),
        ('Samsung SSD 980 PRO 1TB', 2046890, 23510, 511723, 5877, 1)
    ) data(device, buffered_bw, direct_bw, buffered_iops, direct_iops, count)
)
SELECT vis.html(vis.visualise('SSD Performance',
    -- We're going to use the more advanced form of chart definition
    vis.bar_chart(
        -- An id for this chart, should be unique within the dashboard
        'write-bw', 
        -- Define the sizing of this chart tile, this uses a 12 column layout
        --  so 6 is half width
        6, 
        -- The same as before
        'SSD Write Throughput (KiB/s)',
        vis.dataset( (SELECT jsonb_agg(q.*) FROM write_4k q) ),
        vis.series('Buffered IO Throughput', 'device', 'buffered_bw'),
        vis.series('Direct IO Throughput', 'device', 'direct_bw')
    ),
    -- Now the second chart of SSD IOPs
    vis.bar_chart('write-iops', 6, 'SSD Write IOPs',
        vis.dataset( (SELECT jsonb_agg(q.*) FROM write_4k q) ),
        vis.series('Buffered IO IOPs', 'device', 'buffered_iops'),
        vis.series('Direct IO IOPs', 'device', 'direct_iops')
    )
));

View the result: eg2.html

The eagle eyed might have guessed the problem with the above approach.

Sharing Datasets

So far we've been defining the dataset for each chart, we've even aggregated the data into a JSONB array for each of these charts too. This has the significant downside of serialising the data into the output twice. To prevent this, we'll move the dataset definition to the root of the visualisation definition and then reference this from the charts.

WITH write_4k AS (
    SELECT device, buffered_bw, direct_bw, buffered_iops, direct_iops, count
    FROM
    (
        VALUES
        ('Crucial_CT750MX300SSD1', 553260, 183169, 138315, 45792, 1),
        ('CT1000P5PSSD8', 2057752, 184133, 514438, 46033, 1),
        ('CT250MX500SSD4', 584018, 58884, 146004, 14721, 1),
        ('INTEL SSDSC2BB016T6', 524797, 165071, 131199, 41268, 1),
        ('INTEL SSDSC2BB800G4', 527433, 178580, 131858, 44645, 1),
        ('INTEL SSDSC2KB480G7', 408940, 184056, 102235, 46014, 1),
        ('INTEL SSDSC2KB480G8', 516621, 184821, 129155, 46205, 2),
        ('Micron_5100_MTFDDAV960TCB', 553489, 261468, 138372, 65367, 2),
        ('SAMSUNG MZ7KM480HMHQ-00005', 556298, 189078, 139075, 47270, 1),
        ('SAMSUNG MZ7LM960HMJP-00005', 550248, 186309, 137562, 46577, 1),
        ('Samsung SSD 980 PRO 1TB', 2046890, 23510, 511723, 5877, 1)
    ) data(device, buffered_bw, direct_bw, buffered_iops, direct_iops, count)
)
SELECT vis.html(vis.visualise('SSD Performance',
    -- Defining the dataset here allows us to share it across charts
    vis.dataset(
        -- we have to give shared datasets an id
        'write_4k', 
        -- we also have to give them a display title
        'SSD Write Performance',
        -- and the data as a JSON array as we're used to by now
        (SELECT jsonb_agg(q.*) FROM write_4k q)
    ),
    -- define the Charts referencing the shared dataset
    vis.bar_chart('write-bw', 6, 'SSD Write Throughput (KiB/s)',
        -- Simply reference the shared dataset
        vis.dataset('write_4k'),
        vis.series('Buffered IO Throughput', 'device', 'buffered_bw'),
        vis.series('Direct IO Throughput', 'device', 'direct_bw')
    ),
    vis.bar_chart('write-iops', 6, 'SSD Write IOPs',
        -- Simply reference the shared dataset
        vis.dataset('write_4k'),
        vis.series('Buffered IO IOPs', 'device', 'buffered_iops'),
        vis.series('Direct IO IOPs', 'device', 'direct_iops')
    )
));

View the result: eg3.html

The only difference is the invisible ~50% file size saving.

Adding Other Chart Types

Let's add in some other charts of different types: line and scatter, this is pretty much repeating what we've already done except with new chart type functions.

WITH write_4k AS (
    SELECT device, buffered_bw, direct_bw, buffered_iops, direct_iops, count
    FROM
    (
        VALUES
        ('Crucial_CT750MX300SSD1', 553260, 183169, 138315, 45792, 1),
        ('CT1000P5PSSD8', 2057752, 184133, 514438, 46033, 1),
        ('CT250MX500SSD4', 584018, 58884, 146004, 14721, 1),
        ('INTEL SSDSC2BB016T6', 524797, 165071, 131199, 41268, 1),
        ('INTEL SSDSC2BB800G4', 527433, 178580, 131858, 44645, 1),
        ('INTEL SSDSC2KB480G7', 408940, 184056, 102235, 46014, 1),
        ('INTEL SSDSC2KB480G8', 516621, 184821, 129155, 46205, 2),
        ('Micron_5100_MTFDDAV960TCB', 553489, 261468, 138372, 65367, 2),
        ('SAMSUNG MZ7KM480HMHQ-00005', 556298, 189078, 139075, 47270, 1),
        ('SAMSUNG MZ7LM960HMJP-00005', 550248, 186309, 137562, 46577, 1),
        ('Samsung SSD 980 PRO 1TB', 2046890, 23510, 511723, 5877, 1)
    ) data(device, buffered_bw, direct_bw, buffered_iops, direct_iops, count)
)
SELECT vis.html(vis.visualise('SSD Performance',
    vis.dataset('write_4k', 'SSD Write Performance',(SELECT jsonb_agg(q.*) FROM write_4k q)),
    vis.bar_chart('write-bw', 6, 'SSD Write Throughput (KiB/s)',
        vis.dataset('write_4k'),
        vis.series('Buffered IO Throughput', 'device', 'buffered_bw'),
        vis.series('Direct IO Throughput', 'device', 'direct_bw')
    ),
    vis.bar_chart('write-iops', 6, 'SSD Write IOPs',
        vis.dataset('write_4k'),
        vis.series('Buffered IO IOPs', 'device', 'buffered_iops'),
        vis.series('Direct IO IOPs', 'device', 'direct_iops')
    ),
    -- Define a line chart
    vis.line_chart('write-bw-line', 12, 'SSD Write Throughput (KiB/s)',
        vis.dataset('write_4k'),
        vis.series('Buffered IO Throughput', 'device', 'buffered_bw'),
        vis.series('Direct IO Throughput', 'device', 'direct_bw')
    ),
    -- Define an X/Y scatter chart
    vis.scatter_chart('write-bw-v-iops', 12, 'SSD Throughput vs IOPs',
        vis.dataset('write_4k'),
        vis.series('Buffered IO', 'buffered_iops', 'buffered_bw'),
        vis.series('Direct IO', 'direct_iops', 'direct_bw')
    )
));

View the result: eg4.html

Multiple Datasets

A visualisation dashboard is not just limited to working with a single dataset, you can define as many as you need. As a general structure place you data analysis queries as CTEs and then define the dataset you need for the dashboard within the visualisation definition.

WITH write_4k AS (
    SELECT device, buffered_bw, direct_bw, buffered_iops, direct_iops, count
    FROM
    (
        VALUES
        ('Crucial_CT750MX300SSD1', 553260, 183169, 138315, 45792, 1),
        ('CT1000P5PSSD8', 2057752, 184133, 514438, 46033, 1),
        ('CT250MX500SSD4', 584018, 58884, 146004, 14721, 1),
        ('INTEL SSDSC2BB016T6', 524797, 165071, 131199, 41268, 1),
        ('INTEL SSDSC2BB800G4', 527433, 178580, 131858, 44645, 1),
        ('INTEL SSDSC2KB480G7', 408940, 184056, 102235, 46014, 1),
        ('INTEL SSDSC2KB480G8', 516621, 184821, 129155, 46205, 2),
        ('Micron_5100_MTFDDAV960TCB', 553489, 261468, 138372, 65367, 2),
        ('SAMSUNG MZ7KM480HMHQ-00005', 556298, 189078, 139075, 47270, 1),
        ('SAMSUNG MZ7LM960HMJP-00005', 550248, 186309, 137562, 46577, 1),
        ('Samsung SSD 980 PRO 1TB', 2046890, 23510, 511723, 5877, 1)
    ) data(device, buffered_bw, direct_bw, buffered_iops, direct_iops, count)
), 
-- For our second dataset we're just going to define another CTE which does a little extra 
--  processing on our main dataset.  But this query could go and do anything.
device_pop AS (
    SELECT device, count, ((count::real / (SELECT sum(count) FROM write_4k)::real) * 100) AS percent
    FROM write_4k
)
SELECT vis.html(vis.visualise('SSD Performance',
    vis.dataset('write_4k', 'SSD Write Performance',(SELECT jsonb_agg(q.*) FROM write_4k q)),
    vis.bar_chart('write-bw', 6, 'SSD Write Throughput (KiB/s)',
        vis.dataset('write_4k'),
        vis.series('Buffered IO Throughput', 'device', 'buffered_bw'),
        vis.series('Direct IO Throughput', 'device', 'direct_bw')
    ),
    vis.bar_chart('write-iops', 6, 'SSD Write IOPs',
        vis.dataset('write_4k'),
        vis.series('Buffered IO IOPs', 'device', 'buffered_iops'),
        vis.series('Direct IO IOPs', 'device', 'direct_iops')
    ),
    vis.line_chart('write-bw-line', 12, 'SSD Write Throughput (KiB/s)',
        vis.dataset('write_4k'),
        vis.series('Buffered IO Throughput', 'device', 'buffered_bw'),
        vis.series('Direct IO Throughput', 'device', 'direct_bw')
    ),
    vis.scatter_chart('write-bw-v-iops', 12, 'SSD Throughput vs IOPs',
        vis.dataset('write_4k'),
        vis.series('Buffered IO', 'buffered_iops', 'buffered_bw'),
        vis.series('Direct IO', 'direct_iops', 'direct_bw')
    ),
    -- Let's define our second shared dataset, we can define shared datasets in any order
    vis.dataset('device_pop', 'SSD Population', (SELECT jsonb_agg(q.*) FROM device_pop q)),
    -- Now lets used this extra dataset for more charts
    vis.polar_area_chart('ssd-pop-pa', 4, 'SSD Population',
        vis.dataset('device_pop'),
        vis.series('Devices', 'device', 'count'),
        -- We're also going to pass an option to the chart to render in the same aspect ratio 
        --  as the users screen
        vis.option('maintainAspectRatio', true)
    ),
    vis.pie_chart('ssd-pop-pc', 4, 'SSD Population (%)',
        vis.dataset('device_pop'),
        vis.series('Devices (%)', 'device', 'percent'),
        vis.option('maintainAspectRatio', true)
    )
));

View the result: eg5.html

More Than Just Charts

pgVis supports more than just charts, we can also add arbitary data tables and text content. It's pretty easy to support additional tile types, so future versions will add more.

WITH write_4k AS (
    SELECT device, buffered_bw, direct_bw, buffered_iops, direct_iops, count
    FROM
    (
        VALUES
        ('Crucial_CT750MX300SSD1', 553260, 183169, 138315, 45792, 1),
        ('CT1000P5PSSD8', 2057752, 184133, 514438, 46033, 1),
        ('CT250MX500SSD4', 584018, 58884, 146004, 14721, 1),
        ('INTEL SSDSC2BB016T6', 524797, 165071, 131199, 41268, 1),
        ('INTEL SSDSC2BB800G4', 527433, 178580, 131858, 44645, 1),
        ('INTEL SSDSC2KB480G7', 408940, 184056, 102235, 46014, 1),
        ('INTEL SSDSC2KB480G8', 516621, 184821, 129155, 46205, 2),
        ('Micron_5100_MTFDDAV960TCB', 553489, 261468, 138372, 65367, 2),
        ('SAMSUNG MZ7KM480HMHQ-00005', 556298, 189078, 139075, 47270, 1),
        ('SAMSUNG MZ7LM960HMJP-00005', 550248, 186309, 137562, 46577, 1),
        ('Samsung SSD 980 PRO 1TB', 2046890, 23510, 511723, 5877, 1)
    ) data(device, buffered_bw, direct_bw, buffered_iops, direct_iops, count)
), device_pop AS (
    SELECT device, count, ((count::real / (SELECT sum(count) FROM write_4k)::real) * 100) AS percent
    FROM write_4k
)
SELECT vis.html(vis.visualise('SSD Performance',
    vis.dataset('write_4k', 'SSD Write Performance',(SELECT jsonb_agg(q.*) FROM write_4k q)),
    vis.dataset('device_pop', 'SSD Population', (SELECT jsonb_agg(q.*) FROM device_pop q)),
    vis.bar_chart('write-bw', 6, 'SSD Write Throughput (KiB/s)',
        vis.dataset('write_4k'),
        vis.series('Buffered IO Throughput', 'device', 'buffered_bw'),
        vis.series('Direct IO Throughput', 'device', 'direct_bw')
    ),
    vis.bar_chart('write-iops', 6, 'SSD Write IOPs',
        vis.dataset('write_4k'),
        vis.series('Buffered IO IOPs', 'device', 'buffered_iops'),
        vis.series('Direct IO IOPs', 'device', 'direct_iops')
    ),
    vis.line_chart('write-bw-line', 12, 'SSD Write Throughput (KiB/s)',
        vis.dataset('write_4k'),
        vis.series('Buffered IO Throughput', 'device', 'buffered_bw'),
        vis.series('Direct IO Throughput', 'device', 'direct_bw')
    ),
    vis.scatter_chart('write-bw-v-iops', 12, 'SSD Throughput vs IOPs',
        vis.dataset('write_4k'),
        vis.series('Buffered IO', 'buffered_iops', 'buffered_bw'),
        vis.series('Direct IO', 'direct_iops', 'direct_bw')
    ),
    vis.polar_area_chart('ssd-pop-pa', 4, 'SSD Population',
        vis.dataset('device_pop'),
        vis.series('Devices', 'device', 'count'),
        vis.option('maintainAspectRatio', true)
    ),
    vis.pie_chart('ssd-pop-pc', 4, 'SSD Population (%)',
        vis.dataset('device_pop'),
        vis.series('Devices (%)', 'device', 'percent'),
        vis.option('maintainAspectRatio', true)
    ),
    -- Let's add a non-chart tile, a simple tile which will display a table of data
    vis.data_table(
        -- Tile id, size and title
        'ssds-tested', 4, 'SSDs Tested',
        -- The dataset to use
        vis.dataset('device_pop'),
        -- Two columns in the data table
        vis.column('Device Model', 'device'),
        vis.column('Samples', 'count')
    )
));

View the result: eg6.html

Where Next

This was designed to be a quick introduction to what can easily be achieved with pgVis and the approach it takes to defining visualisations. As pgVis is just a set of store functions, it let's you define visualisation dashboards how you wish, the web based renderer just wants some correctly structured JSON.

Maybe now take a look at: