Wednesday 02/21/2024 by phishnet

BACKWARDS DOWN THE NUMBER LINE: PHISH DATA VISUALIZATION PROJECT

[We would like to thank user Jasn1001 (Jason Carlson) for his work and this post! -Ed.]

One of the best things about Phish.net and Phish.in is the free API access to the relational database. It allows any member to request a key and query the database of songs, shows, setlists, and more.

As a longtime Phish fan who recently gained some data manipulation skills, I started a personal project to build a visualization dashboard: plots that I always wanted to see, made to help summarize all of the data that the volunteers at The Mockingbird Foundation, Phish.net, and Phish.in have collected over the years.

I had already been working on the dashboard when I saw the recent post about show ratings being disabled because a surge in activity was skewing the ratings of historical shows. This motivated me to finally finish the dashboard and share it. The project is about increasing access to the data and helping individuals find a reason to rate shows on Phish.net, whether a show is rated highly, average, or low, and whether the listener was there that night, streamed it from home, or listened to it on Phish.in a few years later. I think the dashboard can help you navigate the data around the year, tour, or show and help justify a rating of what your ears just heard. It is a law-of-large-numbers type of idea: the larger the sample of ratings becomes, the closer we get to the true rating of the show. Here is the link: https://www.philletofphish.com/. It is best viewed on an iPad or larger.

Here is a quick rundown on the site.

First, select the show date from the searchable dropdown box, which by default is set at the most recent show.

Second, choose a range of years from the slider. The default is 2015 to present.

Feel free to wind it all the way back if you want to. The visualizations populate starting with violin plots of the songs in the selected setlist, using the duration data available over the selected date range. Note that these plots include every performance in the date range for which duration data exists. If a performance is missing, the show is not currently listed on Phish.in.

The setlist on the right side can be used to select/de-select songs. There are also a few zoom tools available in the top right-hand corner.

Each plot shows every time the song was played in the date range, with the darkest point representing the show date selected from the dropdown. You can easily compare song lengths across all the selected years while still seeing the selected show, even if it falls outside the date range. Hovering over the points shows the show date, individual times, mean, and median.
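To make the hover values concrete, here is a minimal sketch of how a mean, median, and selected-show time could be computed for one song. The dates and durations are made up for illustration, and the function names are hypothetical; this is not necessarily how the dashboard does it.

```python
from statistics import mean, median

# Hypothetical (show date, duration in seconds) pairs for one song; not real data.
performances = [
    ("2021-08-01", 600),
    ("2022-07-20", 900),
    ("2023-07-15", 1200),
    ("2023-12-31", 780),
]

def format_mmss(seconds):
    """Render a duration in seconds as M:SS."""
    minutes, secs = divmod(round(seconds), 60)
    return f"{minutes}:{secs:02d}"

def hover_summary(performances, selected_date):
    """Collect the values a hover label might show for one song."""
    durations = [seconds for _, seconds in performances]
    # None if the song has no timed performance on the selected date.
    selected = dict(performances).get(selected_date)
    return {
        "mean": format_mmss(mean(durations)),
        "median": format_mmss(median(durations)),
        "selected": format_mmss(selected) if selected is not None else None,
    }

summary = hover_summary(performances, "2023-07-15")
print(summary)
```

With these sample numbers the mean is 14:30, the median is 14:00, and the selected show's version runs 20:00.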

Whereas the violin plots show the individual durations for each song, the sunburst plot shows the cumulative duration by year, by tour, and by song for the years selected. Once again, the duration data includes only shows available on Phish.in. The wheel is clickable.

The text gets a little small, but each slice shows hover data when your pointer is over it. As an example, the 2023 Summer Tour song times of 2 days and 16 hours represent 50% of the song times available in 2023. The individual songs show the same kind of data: the 1 hr and 7 min of "Down with Disease" audio represents 2% of the 2023 Summer Tour and 0.13% of the total for the selected range of years.
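The percentages in the hover data are each slice's time divided by its parent's time. As a rough check of the "Down with Disease" figure, here is a small sketch using only the numbers quoted above; the helper names are hypothetical.

```python
def to_minutes(days=0, hours=0, minutes=0):
    """Convert a days/hours/minutes duration to total minutes."""
    return days * 24 * 60 + hours * 60 + minutes

def share_percent(part, whole):
    """Percentage of the parent slice that a child slice represents."""
    return 100 * part / whole

tour_minutes = to_minutes(days=2, hours=16)     # 2023 Summer Tour: 3840 minutes
song_minutes = to_minutes(hours=1, minutes=7)   # "Down with Disease": 67 minutes

song_share = share_percent(song_minutes, tour_minutes)
print(f"{song_share:.2f}% of the tour")  # about 1.74%, displayed as 2%
```

The same division, with the year or the full selected range as the denominator, would give the other percentages shown on the wheel.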

Horizontal box plots are used to visualize the gaps between a song’s performances. The data comes from Phish.net, but it excludes two things. First, debuts are not included, partly because a debut’s gap is counted from 1983, which makes it a huge outlier that, in my opinion, is not needed. You will see the gaps to the second and subsequent performances. Second, gaps of 0 are not included, so if the band plays "Hold Your Head Up" twice in the same show, only the first gap is included, since these are show gaps, not song gaps.
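The two exclusions can be sketched in a few lines. This is an illustration only: it assumes each performance is tagged with a sequential show number and that a gap is the difference between consecutive show numbers, which may not match Phish.net's exact counting convention. The show numbers are made up.

```python
def show_gaps(performance_show_numbers):
    """Return show gaps between consecutive performances of one song.

    performance_show_numbers lists the (hypothetical) sequential show number
    of every performance in order, repeating a number when the song is
    played twice in the same show.
    """
    gaps = []
    # Pairing each performance with the previous one skips the debut,
    # because the debut has no earlier performance to measure from.
    for previous, current in zip(performance_show_numbers,
                                 performance_show_numbers[1:]):
        gap = current - previous
        if gap > 0:  # drop gaps of 0 (a repeat within the same show)
            gaps.append(gap)
    return gaps

# Debut at show 10, played at show 12 twice, then again at show 40.
gaps = show_gaps([10, 12, 12, 40])
print(gaps)  # [2, 28]
```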

The third grouping of charts uses Phish.net setlist data to illustrate the days of the week on which songs are played and which set they are played in. They are colored by setlist position. Over the selected date range, we can see that Everything’s Right is played with decreasing frequency from Friday through Sunday, and mostly in set two, position 13. The position list on the right side of each plot can be used to select/deselect.
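The counts behind charts like these come down to grouping performances by weekday and by set. Here is a minimal sketch with made-up rows; the tuple layout and set labels are assumptions, not the Phish.net data format.

```python
from collections import Counter
from datetime import date

DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]

# Hypothetical (show date, set label) rows for one song; not real data.
rows = [
    ("2023-07-14", "Set 2"),
    ("2023-07-21", "Set 2"),
    ("2023-07-15", "Set 1"),
    ("2023-07-16", "Set 2"),
]

def weekday_name(iso_date):
    """Map an ISO date string to its weekday name (Monday is index 0)."""
    return DAY_NAMES[date.fromisoformat(iso_date).weekday()]

by_day = Counter(weekday_name(show_date) for show_date, _ in rows)
by_set = Counter(set_label for _, set_label in rows)

print(by_day)
print(by_set)
```

In this sample the song lands on Friday twice and on Saturday and Sunday once each, with three of the four performances in the second set.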

I have always been impressed with the feature that records how a song is transitioned into, i.e., “>”, “->”, “,”, set openers, and closers. Trying to make sense of this visually led me to this parallel category plot. The visualization shows how the setlist songs are transitioned into and out of over the time period.

A warning: adding too many years can make the plot almost unreadable, but that is up to you. I initially included the songs played before and after, but over enough time just about every song leads into every other song, which in my opinion doesn’t provide anything usable. This plot is really about getting a sense of which songs are used to end a set or show, how often that happens over the time period, and whether it is three continuous face melters or there is a pause, etc.
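One way to think about the data behind a plot like this is as an "into" and "out of" label for each song in a set. The sketch below assumes a set is stored as a list of (song, mark after the song) pairs with placeholder song names; that layout is hypothetical and not the Phish.net format.

```python
from collections import Counter

def transition_pairs(set_songs):
    """Label how each song in one set is entered and exited.

    set_songs is a list of (song, mark_after) pairs, where mark_after is
    ",", ">" or "->", and None for the last song of the set.
    """
    labeled = []
    for i, (song, mark_after) in enumerate(set_songs):
        # The mark written after the previous song is how this one is entered.
        into = "set opener" if i == 0 else set_songs[i - 1][1]
        out_of = "set closer" if i == len(set_songs) - 1 else mark_after
        labeled.append((song, into, out_of))
    return labeled

# Placeholder set: Song A > Song B -> Song C, Song D
example_set = [("Song A", ">"), ("Song B", "->"), ("Song C", ","), ("Song D", None)]
pairs = transition_pairs(example_set)

# Count how often each (into, out of) combination occurs.
combo_counts = Counter((into, out_of) for _, into, out_of in pairs)
print(pairs)
```

Here "Song D" is entered after a "," and closes the set, which is the kind of pattern the plot is meant to surface.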

This plot does have a few boxes to select or deselect if you wish to remove any of the options.