Mastering the art and science of formative evaluation in art museums

Meghan Stockdale, Cleveland Museum of Art, USA; Elizabeth Bolander, Cleveland Museum of Art, USA


In a world of evolving visitor needs and expectations, it is increasingly important for art museum professionals to be agile in the design process. Initiating a dialogue with audiences early on can help maximize the emotional impact and educational effectiveness of nearly any museum project, including high- and low-touch interactives, labels, websites, and mobile apps. In this “how-to” session, participants learn about ways to integrate formative evaluation techniques into their development process as a way to link audience needs with the goals of museum initiatives. Presenters share how the Cleveland Museum of Art uses formative evaluation to inform a multitude of museum-wide projects. They explore the various components of this type of evaluation and guide participants through potential applications. Through hands-on activities, attendees have a chance to work in small groups to execute a formative research project from issue identification through presenting results. Additionally, participants learn how to conduct formative evaluation at a rapid pace and in a low-cost manner, allowing for the utmost level of customization and flexibility. Evaluators and non-evaluators alike leave ready to add formative evaluation to their development toolkit.

Keywords: Formative evaluation, Art museums, Prototyping, Interactives, Mobile, Websites

1. Introduction

In a world in which visitor needs and expectations are constantly changing, it is increasingly important for art museum professionals to be agile in the process of developing new projects. Initiating a dialogue with audiences early on can help maximize the emotional impact and educational effectiveness of nearly any museum initiative, including high- and low-touch interactives, exhibit labels, websites, and mobile apps.

Formative evaluation is sometimes underutilized in art museums but can offer many benefits and cost savings. The Cleveland Museum of Art (CMA) uses formative and remedial evaluation to inform a bevy of institution-wide projects. The aim of this paper is to discuss the strategy of implementing formative evaluation, focusing on the needs and processes specific to art museums.

2. Why evaluate at art museums? The Cleveland Museum of Art case study

Over the past few years, the CMA has made a commitment to including audience research and evaluation in many institutional projects. The museum’s Office of Research and Evaluation has now conducted studies for nearly every department, ranging from exhibition, program, and technology evaluations to market research studies on members, donors, visitors, and non-visitors. For the most part, the museum has embraced the transformative power of data-driven decision making. Most recently, the Office of Research and Evaluation has devoted a significant amount of time and energy to conducting formative, remedial, and summative evaluation of the CMA’s Gallery One and its mobile app, ArtLens. This comprehensive longitudinal study has helped the museum further embrace evaluation at various phases in the life cycle of projects and integrate evaluative thinking into the overall development process.

3. Starting at the beginning with user experiences: Anticipated and actual

The Science Museum of London’s Audience Research and Advocacy Group suggests focusing on three core areas as they relate to the target user of a prototype (Davis, 2007):

  1. Motivations: Do visitors want to use it? Do they enjoy using it?
  2. Usability: Can visitors work out how to use it? Do they know what to do with it?
  3. Content: Do visitors understand what the exhibit is about? Do they recognize aspects of what it is trying to show them?

Davis (2007) discusses the process of prototype testing as a recurring litmus test, seeking out the potential emotional, physical, and cognitive barriers between actual user experiences and intended uses. In addition, Pekarik and Schreiber (2012) found, through extensive research conducted at Smithsonian museums, that visitors’ anticipated experiences of exhibits closely mirror their exit narratives in terms of overall satisfaction. In their article “The Power of Expectation,” the authors explain that “the experiences that visitors were looking forward to on entrance tended to have a distribution similar to that of the experiences they found satisfying on exit” (487). In other words, it is essential to consider users’ expectations for an experience, whether that experience is visiting an exhibition, using an interactive, or visiting the entire museum.

One of the most insightful outcomes when testing early and late alpha versions of a line-drawing interactive in Studio Play, an area within the Gallery One space, was understanding what users thought would happen before using the device and how respondents used the interactive on their very first try. The line-drawing interactive allows visitors to draw a simple line which is then mirrored and matched to an artwork in the museum’s collection. Testing revealed that users initially wanted to remove their finger to draw multiple or more complex lines before the corresponding image from the collection was revealed. Although the design of the interactive was never modified specifically to address this impulse, the pace of the image generation quickened, which reinforced the overall concept of drawing simple lines rather than complex pictures on the interactive.

4. Formative evaluation as a strategic tool in the planning process

Formative evaluation is invaluable because it helps establish visitor expectations related to a particular product. Uncovering these expectations before the product is finalized leads to a stronger outcome, especially if there are established or generally accepted usability conventions that can be employed. It can also encourage the setting and testing of goals and objectives for each project earlier in the process. Additionally, formative evaluation can help the project team define its target audience by showing which audience most successfully interacts with the prototype. For example, the CMA tests early advertising concepts or marketing messaging with potential visitors to determine what might resonate with particular audience segments.

When developing ArtLens, staff members wanted to understand which proposed elements within the app might resonate the most with visitors. To quickly determine the hierarchy of needs, the CMA used a simple card sort activity. Visitors were asked to sort printed cards, each of which carried a short, one-sentence description of a proposed function. Evaluators asked probing questions about the highest- and lowest-rated items to ascertain why some were rated higher than others. Interestingly, the findings from this study were corroborated by the summative research study of ArtLens, which revealed similar content and functionality preferences.
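The tallying step of a card sort like this can be handled with very little tooling. The following is a minimal sketch in Python, assuming each visitor's sort is recorded as an ordered list with the most appealing card first. The card names and the sample data are invented placeholders, not the actual ArtLens cards or CMA results, and averaging ranks is only one reasonable way to summarize a sort.

```python
from collections import defaultdict


def mean_ranks(sorts):
    """Average the position each card received across all visitors' sorts.

    `sorts` is a list of lists. Each inner list names the cards in the order
    one visitor ranked them, most appealing first. A lower mean rank means
    the card was placed nearer the top on average.
    """
    positions = defaultdict(list)
    for ordering in sorts:
        for rank, card in enumerate(ordering, start=1):
            positions[card].append(rank)
    averages = {card: sum(ranks) / len(ranks) for card, ranks in positions.items()}
    # Sort from highest priority (lowest mean rank) to lowest priority.
    return sorted(averages.items(), key=lambda pair: pair[1])


# Hypothetical card names and sorts, for illustration only.
sorts = [
    ["map", "tours", "favorites", "scan"],
    ["tours", "map", "scan", "favorites"],
    ["map", "scan", "tours", "favorites"],
]
ranking = mean_ranks(sorts)
for card, average in ranking:
    print(f"{card}: {average:.2f}")
```

The cards at the top and bottom of the resulting list are natural candidates for the follow-up "why" questions described above.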

The beauty of formative evaluation lies in its flexibility and repeatability. It is important to refer back to the goals of the project constantly and to reassess the prototype in each iteration, always checking whether the changes made affect the objectives of the interactive, positively or negatively. Studies should be low cost and use readily available materials. For example, sample labels or app content for A/B testing can be mocked up on paper or in Microsoft Word. For some early interactive content in Gallery One, labels written in different tones and/or styles were prepared and matched with a picture of the object being interpreted. Visitors were then asked to rate their preference of the labels based on content or tone. This testing was completed relatively quickly, and new versions of the labels were introduced throughout the process to maximize resources.
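Results from a two-version label test can be summarized just as quickly. The sketch below, in Python, assumes each visitor simply names the label version they prefer. The counts are invented for illustration and are not CMA data. The exact sign test is an addition of ours, offered as one simple way to judge whether a split could plausibly be chance. It is not a method described in the studies above.

```python
from collections import Counter
from math import comb


def preference_summary(choices):
    """Return {label: (count, share)} for a list of preferred-label choices."""
    counts = Counter(choices)
    total = len(choices)
    return {label: (count, count / total) for label, count in counts.items()}


def sign_test_p(k, n):
    """Two-sided exact binomial p-value for k of n visitors preferring one
    label, under the assumption that both labels are equally appealing."""
    extreme = max(k, n - k)
    tail = sum(comb(n, i) for i in range(extreme, n + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Hypothetical responses: 14 visitors preferred label A, 6 preferred label B.
choices = ["A"] * 14 + ["B"] * 6
summary = preference_summary(choices)
p_value = sign_test_p(summary["A"][0], len(choices))
print(summary, round(p_value, 3))
```

With samples this small, even a 14-to-6 split is suggestive rather than conclusive, which is one reason to pair the tally with visitors' stated reasons for their choice.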

5. Getting on the same page

Before beginning any testing, it is important for the project team and the evaluator (or the “evaluative voice” on the team) to collaboratively define the study’s parameters. Clarifying the scope of the formative testing, including logistical elements such as timeline and budget constraints, is almost as critical as defining goals and objectives at this early stage. In their Museums and the Web paper on prototyping, Silvers et al. (2014) explain: “If a problem is not defined clearly and succinctly, in a user-centered way, the rest of the process will suffer.”

At the CMA, Research and Evaluation staff members have found creating a set of mutually agreed-upon guiding principles or ground rules helpful: specifically, working to establish exactly what the project team “needs” to know versus what it feels would be “interesting” to know. By delineating these parameters, the project team can feel confident the prototype testing will provide actionable insights while avoiding fruitless “fishing” expeditions. Although exploratory studies can be highly informative, if the purpose of formative evaluation is to conduct “quick and dirty” rapid-fire prototype testing, more open-ended exploratory research will not be helpful. It is also imperative to create a common document that can be used by evaluation and project team staff to ensure everyone is on the same page.

Whenever possible, the evaluator should be a part of the project team, rather than a separate entity superimposed on the group. This will help increase trust and ensure the results will be used. In turn, the evaluator will also be more equipped to design a study that is quickly implemented and immediately actionable—a key aspect of formative research. Project team members can and should be involved as observers or even note-takers during prototype testing. This level of intimacy with the usability testing will help turn even the most stubborn team member into an evaluation advocate.

6. Deciding between cued and uncued participants

There has long been debate about the impact of cueing visitors for evaluation. A significant concern is that a cued visitor will be extremely conscientious, spending more time in an exhibition, reading or even studying labels, or using an interactive prototype for longer than they would naturally. Sometimes cueing visitors is necessary, especially if the prototype testing takes them away from their planned museum experience, for example by requiring an interview after observations, or if participants are expected to use a think-out-loud approach to explain why they are using the device in a certain manner. When testing prototypes with cued participants, evaluators should encourage respondents to use the interactive as they would if they were not participating in testing, and should bear these limitations in mind when interpreting results. Think-out-loud user testing can be particularly insightful when conducting formative evaluation for website redesigns, where users have a large number of choices to make as they move through their experience with the prototype.

The formative evaluation of an alpha matching-game interactive in Gallery One took place over several weeks with recruited family visitors, who were asked to play the game while staff observed and then to participate in a short interview. The beauty of this particular evaluation is that changes were made while the game was being tested. Some changes were technical, while others addressed the content of the interactive after testing showed that some of the concepts visitors were asked to match were too complicated for the target audience. One example was expecting the average child under seven or eight years old to identify multiple artworks that represented complex emotions like remorse.

Uncued observations will always be important for determining stay time and other more “naturalistic” behaviors. This works especially well if your prototype is in the beta phase of development and more closely aligned with the final experience. At the CMA, both cued and uncued formative testing are used, sometimes in tandem to allow for triangulation of the results. In Gallery One, cued usability testing was particularly important with certain alpha interactives that were only physically in the building for the span of a few days.

7. Incorporate other findings and techniques

When possible, conduct a literature review before formative research, as identifying similar studies with actionable results can save precious time. Validated scales can also be powerful assets within formative testing, because they can provide comparable data. Recently, the CMA used John Brooke’s (1996) System Usability Scale (SUS) for the museum’s newest app related to the special exhibition Senufo: Art and Identity in West Africa. The SUS was chosen to measure the app’s usability because it is a validated scale that has been used successfully by many brands, organizations, and institutions. While it is currently being used to examine the finished Senufo app, this app is a prototype for future special exhibition apps, and the findings from the SUS survey will help the project team understand what changes need to be made if the app is used again for future exhibitions.
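For readers unfamiliar with how the SUS is scored, the following Python sketch applies Brooke's standard scoring rule: the ten items are answered on a scale of 1 to 5, odd-numbered items contribute their answer minus 1, even-numbered items contribute 5 minus their answer, and the sum is multiplied by 2.5 to give a score from 0 to 100. The sample responses are invented for illustration and are not results from the Senufo app study.

```python
def sus_score(responses):
    """Return the 0-100 SUS score for one respondent's ten answers (each 1-5)."""
    if len(responses) != 10:
        raise ValueError("the SUS has exactly ten items")
    if any(answer not in (1, 2, 3, 4, 5) for answer in responses):
        raise ValueError("each answer must be an integer from 1 to 5")
    total = 0
    for item_number, answer in enumerate(responses, start=1):
        if item_number % 2 == 1:
            total += answer - 1   # odd items are positively worded
        else:
            total += 5 - answer   # even items are negatively worded
    return total * 2.5


# Hypothetical answers from three visitors, for illustration only.
visitors = [
    [4, 2, 5, 1, 4, 2, 4, 2, 5, 1],
    [3, 3, 3, 3, 3, 3, 3, 3, 3, 3],
    [5, 1, 5, 1, 5, 1, 5, 1, 5, 1],
]
scores = [sus_score(answers) for answers in visitors]
mean_score = sum(scores) / len(scores)
print(scores, round(mean_score, 1))
```

Because the scoring is fixed, scores from one exhibition app can be compared directly with scores from the next, which is the benefit of a validated scale noted above.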

8. A final note: Open minds open doors to improvement

Formative evaluation is a process of discovery. Choosing your attitude about evaluation at the outset is fundamental to the overall success of the study. Willingness to make improvements shows commitment to and care for your user. If nothing in the project plan is adjustable, then there is no point in conducting any type of evaluation. Every project member should ask themselves a few key questions before implementing formative evaluation. First, are you willing to make changes to the project? If yes, are you still willing to make changes if this means altering the objectives of the project?

As evaluation becomes more embedded in the project development life cycle within art museums, it is important that formative testing remains a key part of the mix. Taking that extra afternoon or day to gather feedback from potential users—even if only through a rough conceptual prototype—can help avoid mistakes that are more expensive to correct down the road.


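References

Brooke, J. (1996). “SUS – A Quick and Dirty Usability Scale.” In P. W. Jordan, B. Thomas, B. A. Weerdmeester, & I. L. McClelland (eds.). Usability Evaluation in Industry. London: Taylor & Francis, 189–194.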

Brown, T. (2008). “Design Thinking.” Harvard Business Review. June. Consulted January 2015. Available

Dancu, T., J. P. Gutwill, & N. Hido. (2011). “Using Iterative Design and Evaluation to Develop Playful Learning Experiences.” Children, Youth and Environments 21(2), 338-359.

Davis, S. (2007). Exhibit Prototype Testing Lessons from the Launchpad Project. London: Science Museum of London, Audience Research and Advocacy Group.

Diamond, J. (2009). Practical Evaluation Guide: Tools for Museums and Other Informal Educational Settings, second edition. Lanham, MD: AltaMira Press.

Herman, J. L., L. Lyons Morris, & C. Taylor Fitz-Gibbon. (eds.). (1987). Evaluator’s Handbook. Thousand Oaks, CA: SAGE Publications, Inc.

Patton, M. Q. (2002). Qualitative Research & Evaluation Methods, third edition. Thousand Oaks, CA: SAGE Publications Inc.

Pekarik, A. J., & J. B. Schreiber. (2012). “The Power of Expectation.” Curator: The Museum Journal 55(4), 487–496.

Silvers, D. Mitroff, E. Lytle-Painter, A. Lee, J. Ludden, B. Hamley, & Y. Trinh. (2014). “From Post-its to Processes: Using Prototypes to Find Solutions.” In N. Proctor & R. Cherry (eds.). Museums and the Web 2014, Silver Spring, MD: Museums and the Web. Published February 1, 2014. Consulted January 20, 2015. Available

Cite as:
Stockdale, M., & E. Bolander. "Mastering the art and science of formative evaluation in art museums." MW2015: Museums and the Web 2015. Published January 31, 2015. Consulted .