Mapping the Museum Universe

Discovering a new dataset always makes for an exciting time at the Clark Library and provides an excellent opportunity to experiment. We recently learned about the newest version of the Institute of Museum and Library Services (IMLS) Museum Universe Data File. The dataset consists of information describing over 35,000 museums throughout the United States. The dataset describes location, type of museum, rural/urban status and tax information (e.g. revenue, income) where available. There is much more that could be done with this data, but we initially just wanted to get a sense of what the data look like on a map.

Small multiples of geographic distributions of museums

A full-sized image is also available.

We decided to make a series of static small multiples of the different categories of museum provided in the data in order to see the national distribution. Seeing the maps side by side makes it easy to compare the distributions visually. We created our initial maps in ArcGIS for Desktop, producing a single small map for each of the nine categories of museum. We exported the maps to a (rather unwieldy) Adobe Illustrator file. Illustrator was able to handle the 35,000 points, but just barely. We then changed the colors to a palette from ColorBrewer. Despite working with these tools and methodologies on a regular basis, we were (as we often are!) surprised at how long it took to go from raw data to concept to final visualization.

We were initially interested in some of the tax data, but found it too messy to work with in a reasonable way. Museums that are part of larger institutions most likely had their revenue and income data drawn from the parent institution's tax information; the Museum of Art at Duke, for example, is listed with an income of $12.5 billion. Stanford University was listed as a museum, with no delineation from the rest of the university, and had the highest income in the dataset at $17.6 billion. We often find ourselves trying things that don't ultimately work, and this is part of the process of building an understanding of a dataset's possibilities and limitations.
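A check along these lines can surface such records before any mapping is attempted. This is only a minimal sketch under stated assumptions: the column names `name` and `income`, the $1 billion cutoff, and the sample rows are all hypothetical, and the actual field names in the data file may differ.

```python
import csv
import io

# Assumed cutoff: an income above this is treated as suspect for a museum.
SUSPECT_INCOME = 1_000_000_000


def flag_suspect_income(rows, threshold=SUSPECT_INCOME):
    """Return (name, income) pairs whose income exceeds the threshold.

    Rows with a blank or non-numeric income are skipped, since tax
    information is only present "where available".
    """
    flagged = []
    for row in rows:
        raw = (row.get("income") or "").strip()
        try:
            income = float(raw)
        except ValueError:
            continue  # no usable income value for this record
        if income > threshold:
            flagged.append((row["name"], income))
    # Largest values first, so the most implausible records lead the list.
    return sorted(flagged, key=lambda pair: pair[1], reverse=True)


# Made-up sample rows; the column names are hypothetical.
SAMPLE = """name,income
Example University,17600000000
Small Town Historical Society,48000
Unknown Museum,
"""

sample_rows = list(csv.DictReader(io.StringIO(SAMPLE)))
suspects = flag_suspect_income(sample_rows)
```

A fixed threshold is crude, but it is enough to produce a short list of records to inspect by hand, which is usually where a parent institution's figures become obvious.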

As we worked on creating the small multiples pictured above, we discussed next steps for exploring and visualizing this dataset. In the coming weeks we plan to produce a similar but interactive visualization using Leaflet or other open-source web map technologies. The underlying data will be the same CSV file with 35,000 records, but the choices we make with these tools will produce a very different visualization.
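A likely first step toward a web map is converting the CSV into GeoJSON, a format that Leaflet can load through its `L.geoJSON` layer. The following is a minimal sketch rather than a finished pipeline: the column names `latitude`, `longitude`, `name`, and `category` and the sample rows are assumptions for illustration, not the actual fields in the data file.

```python
import csv
import io
import json


def csv_to_geojson(rows, lat_field="latitude", lon_field="longitude"):
    """Build a GeoJSON FeatureCollection of points from CSV-style rows.

    Rows with missing or non-numeric coordinates are skipped. All other
    columns are carried along as feature properties.
    """
    features = []
    for row in rows:
        try:
            lat = float(row[lat_field])
            lon = float(row[lon_field])
        except (KeyError, ValueError, TypeError):
            continue  # this record cannot be placed on a map
        properties = {
            key: value
            for key, value in row.items()
            if key not in (lat_field, lon_field)
        }
        features.append({
            "type": "Feature",
            # GeoJSON orders coordinates as [longitude, latitude].
            "geometry": {"type": "Point", "coordinates": [lon, lat]},
            "properties": properties,
        })
    return {"type": "FeatureCollection", "features": features}


# Made-up sample rows; the column names are hypothetical.
SAMPLE = """name,category,latitude,longitude
Example Art Museum,ART,42.28,-83.74
No Coordinates Museum,HST,,
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE)))
collection = csv_to_geojson(rows)
geojson_text = json.dumps(collection)
```

Keeping the category as a property means the web map can filter or color the points by museum type, which is roughly what the static small multiples do with separate panels.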

Justin Joque and Nicole Scholtz

1 Comment

on March 12, 12:28pm

Hi Justin and Nicole,

I’m really glad that you found the Museum Universe Data File (MUDF) of interest. Your point about the tax ID variable in the file is well taken. The messiness of the data reflects the complexity of the museum sector itself. While most people think of museums as independent nonprofit entities, there is actually a wide range of governance structures in the museum space. In addition to those registered with the IRS as 501(c)(3)s, there are many that are sub-units of larger nonprofit entities, like universities; others that are public entities owned and operated by local municipalities or state governments; and still others that operate as independent, for-profit institutions. For example, the Spy Museum in DC is a for-profit entity (which should not be confused with the International Spy Museum, a nonprofit museum in Beachwood, OH).

Despite the data challenges, we are really excited about the file, and we have set an aggressive data review and release schedule that reflects our interest in continually improving the quality of this file for social research and application development. We also released a new open data catalog, which gives people easier access to our data and provides some basic statistical analysis and mapping functionality. The data catalog also provides APIs for each and every data file.

For our next release of the MUDF we focused particular attention on identifying university- and college-affiliated museums and galleries. Getting a better handle on these institutions will really help analysts single out problems like the Stanford museum issue you identified in your blog. To get this data we reviewed over 4,000 college and university websites. For every college- or university-affiliated museum we identified, we made a point of appending its unique IPEDS ID. This unique ID is the key to linking the MUDF university-affiliated museum records to a wealth of data available through the Integrated Postsecondary Education Data System at the National Center for Education Statistics. Look for the release in July of 2015.

Thanks again for your blog.

-Carlos (LSA 90)
Carlos A Manjarrez, Director
Office of Planning, Research and Evaluation
Institute of Museum and Library Services
Washington, DC 20036
