A smartphone headset for augmented-reality experiences in museums

Daniela De Angeli, University of Bath, UK, Eamonn O'Neill, University of Bath, UK


Museums are increasingly researching new ways to tell stories, engage, entertain, and educate a heterogeneous group of visitors. In particular, the rapid rise of mobile devices suggests new possibilities for museums, but it also raises new issues. On one hand, mobile devices are personal, portable, and often available since the majority of visitors carry their own devices within the museum. On the other hand, mobile devices can disrupt the visit, limiting both social and physical interaction with other visitors and artifacts. Above all, visitors need to hold the devices in their hands, which can be tiring and limit the possibilities for other hands-on activities. Wearable devices such as headsets can mitigate this problem. Indeed, they have such potential that they have already made their way into museums: they are personal, portable, follow the visitor through the museum experience, and don’t need to be held. This paper proposes a wearable augmented-reality (AR) headset aimed at enriching the museum experience. The headset consists of a lightweight transparent frame that holds the smartphone on the top instead of in front of the user’s eyes. Since visitors will use their own mobile devices, the purchase and maintenance costs for the museum will be limited. A visitor will simply need to download an application and put her smartphone on the top of the headset frame. Content from the device’s screen will be projected on a transparent surface in front of the visitor’s eyes. This paper will describe the development of the wearable device, as well as its potential applications in supporting and augmenting a museum visit.

Keywords: headset, wearable, Augmented Reality, AR, smartphone, mobile

1. Introduction

Museums are increasingly researching new ways to tell stories, engage, entertain, and educate a diverse range of visitors. In particular, the rapid rise of mobile devices suggests new possibilities for museums, but it also raises new issues. Mobile devices are personal, portable, and often available since the majority of visitors carry their own devices within the museum. However, visitors commonly need to hold the devices in their hands, which can be tiring and can limit the possibilities for other hands-on activities. Wearable devices such as headsets can mitigate this problem. They enable tasks that are impossible with static facilities (e.g. kiosks) or other mobile devices: they follow the visitor through the museum experience and don’t need to be held, becoming almost an extension of the user’s body.

In the domain of virtual reality (VR), headsets have been around for many years. They have improved substantially in recent years by utilizing technology developed for mobile phones to provide relatively lightweight hands-free devices with considerable computing power and high-resolution displays (e.g., Oculus Rift). However, such devices are usually expensive to buy and maintain.

The Google Cardboard project suggested a practical and accessible solution, transforming a user’s smartphone into a basic VR headset. Users can put their own devices in a cardboard frame and watch a virtual world on the mobile screen. However, VR does not lend itself well to a museum visit in which people expect to experience the real objects rather than their digital reproduction through the smartphone display. While VR replaces the real space around the user with a virtual world, augmented reality (AR) supplements that real space with virtual information. AR headsets, such as Google Glass and Epson Moverio, have recently been developed that allow the user simultaneously to view both the surrounding real world and “augmenting” virtual experiences. However, similar to VR headsets, these devices are usually expensive to buy and to maintain.

A possible solution might be to develop an AR application for Google Cardboard by using the phone’s camera to capture the real space and rendering the resulting images in real time on the display with whatever augmentations are desired. However, seeing only this virtual rendering of the real world could reduce the quality of the experience, and virtual environments are known to cause issues with spatial orientation and perception (Čopič Pucihar, Coulton, & Alexander, 2013; Grasset et al., 2011), especially as the visitor is not static but moving around the gallery.

In an alternative approach, we have been developing a unique, inexpensive AR headset that utilizes the user’s own mobile phone to provide an AR experience by overlaying images from the phone’s screen on the user’s direct view of her surroundings. The headset holds the smartphone above the user’s eyeline to allow an unimpeded view of the environment, while images from the mobile screen are projected via a transparent surface in front of the user’s eyes. Since users have their own mobile phones to provide the computing power, connectivity, display, etc., the purchase and maintenance costs of the AR headset are modest. Given these low costs, the headset can be treated as effectively disposable and has multiple potential applications. We are investigating in particular its use in museums, where it can be handed out much as electronic mobile audio guides currently are, but at much lower cost and with less need for intervention and control by museum staff. Our wearable system can engage visitors in two main ways: (1) providing additional virtual content to support the visitor’s experience with the exhibits in the gallery; and (2) guiding the visitor’s attention through heterogeneous exhibitions. We are collaborating with the UK’s National Trust to investigate the use of the headset to enhance the experience of visitors to the Trust’s many historical sites.

2. A cheap AR headset: What’s the trick?

The headset consists of a lightweight frame that holds a smartphone above, rather than in front of, the user’s eyeline. Users simply need to install an application on the phone and put it in the headset frame. We developed a prototype application using Unity 3D and the Vuforia SDK. Other AR SDKs such as Metaio also publish extensions for Unity, and the headset can also support them. The Unity scene includes two main elements from the Vuforia SDK: an AR camera and an Image Target. The AR camera serves as the mobile device camera in Unity. The Image Target is an image that can be detected and tracked through the mobile camera. Developers can create and manage their own image target through the Target Manager, a free online tool offered by Vuforia, where it is possible to upload pictures and export them as databases for Unity.

We added a set of virtual elements as children of the Image Target (i.e., they were connected as sub-items of the target). When the AR camera tracks a real-world object, the virtual items that are attached to the target appear on the mobile device’s display. Within Unity, a rectangular plane with a black texture was also attached as a child of the AR camera so that it follows the movements of the mobile device’s camera and constantly shows a black background on the mobile display. The black background hides the camera view, so that only the relevant content from the mobile device’s screen is projected in front of the user’s eyes by means of the Pepper’s Ghost illusion. The virtual items overlay the user’s view of the real world, while the black background is not apparent in the image projected by the headset.
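
Why the black background disappears can be sketched numerically. A semi-reflective sheet adds a fraction of the screen’s light to the light arriving from the real scene, so a black pixel, which emits almost no light, leaves the scene essentially unchanged. The following Python sketch is illustrative only and is not the prototype’s code; the function name `combine`, the 0-to-1 intensity scale, and the reflectance and transmittance values are all assumptions.

```python
def combine(scene, screen, reflectance=0.3, transmittance=0.9):
    """Approximate the light reaching the eye through a semi-reflective sheet.

    scene and screen are (r, g, b) tuples of linear intensities in [0, 1].
    reflectance and transmittance are hypothetical values, not measurements.
    """
    return tuple(
        min(1.0, transmittance * s + reflectance * p)
        for s, p in zip(scene, screen)
    )


wall = (0.5, 0.5, 0.5)          # real-world view behind the sheet
black_pixel = (0.0, 0.0, 0.0)   # background plane attached to the AR camera
green_pixel = (0.0, 1.0, 0.0)   # part of a virtual item

seen_through_black = combine(wall, black_pixel)  # scene only, slightly dimmed
seen_through_green = combine(wall, green_pixel)  # scene plus a green overlay
```

Under these assumptions, the black pixel contributes nothing to any channel, while the green pixel raises only the green channel of what the user sees.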

Figure 1: Pepper’s Ghost trick

The Pepper’s Ghost illusion is a technique sometimes used by magicians that allows objects to appear and disappear inside a room; there are actually two rooms: one lighter main room and one hidden darker room (figure 1). A sheet of glass, plexiglass, or similar transparent, semi-reflective film is placed between the two rooms, usually at a 45-degree angle, so that it can reflect the view of the darker room into the lighter room. When the lighting level in the darker room is increased, objects from the hidden room can be seen to appear in the main room. In our case, the main room is the space right in front of the user’s eyes, while the hidden room is the phone’s screen at the top of the headset. A sheet of plexiglass is fixed inside the headset at an angle of 45 degrees between the smartphone screen and the user’s view. When the application running on the smartphone displays an object on the phone’s screen, the phone’s screen becomes brighter and the object appears projected in the user’s view via the plexiglass.
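
The role of the 45-degree angle can be checked with the standard mirror-reflection formula, r = d − 2(d·n)n. The sketch below is a geometric illustration rather than part of the headset software; the axis convention (y up, z pointing forward, away from the user) and the variable names are assumptions made for the example.

```python
import math


def reflect(direction, normal):
    """Reflect a 3D direction vector about a unit surface normal: d - 2(d.n)n."""
    dot = sum(d * n for d, n in zip(direction, normal))
    return tuple(d - 2.0 * dot * n for d, n in zip(direction, normal))


# Hypothetical axes: y points up, z points forward (away from the user).
screen_light = (0.0, -1.0, 0.0)  # light leaving the downward-facing screen

# Unit normal of a sheet tilted 45 degrees between the screen and the eye.
angle = math.radians(45.0)
sheet_normal = (0.0, math.sin(angle), -math.cos(angle))

# Direction of the reflected light; approximately (0, 0, -1), i.e. horizontal
# and travelling back towards the user's eyes.
toward_eye = reflect(screen_light, sheet_normal)
```

With the sheet at 45 degrees, light travelling straight down from the screen is turned through 90 degrees, so the reflected image appears to sit in the user’s forward line of sight.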

3. Effortless engagement

Our AR headset is based on image recognition, which makes it possible to scan real-world items. AR driven by automated image recognition facilitates natural and effortless engagement for visitors, who seem to prefer it to other tracking systems such as QR codes (Wein, 2014). The fact that this technology is markerless and does not require an internet connection or communication with other devices also makes it easy for museums to deploy. Unlike technologies such as QR codes, NFC, and iBeacons, image recognition requires neither the museum to install coded markers or tags in the gallery nor the visitors to scan such markers or tags manually with an appropriate device. It also doesn’t rely on a stable signal from GPS or another geolocation system, and it can be accurate indoors.

Figure 2: periscope system

Since the AR headset transforms the smartphone into a wearable headset, visitors do not need to scan images or objects manually, but are able to interact with the exhibits simply by walking around. Moreover, the content from the mobile screen is projected directly in front of the visitor’s eyes, integrating the visit with virtual content without distracting from the objects on display. In order to allow such effortless engagement, we had to solve a problem: the phone is placed on the top of the headset with the screen facing down, so the phone’s camera faces the ceiling and cannot track the view in front of the user’s eyes. We developed a simple and inexpensive solution, installing a small mirror on the top of the headset, rotated 45 degrees over the mobile camera, to reflect the view of the room into the camera. This works in a similar way to a periscope (see figure 2), but since we use just one mirror, the image viewed by the camera is reversed. We therefore flip the image target and reimport it into Unity. Finally, we mirror the location of all the items to be projected according to the new image target. As the user walks around, the mirror reflects the view of the surroundings on to the phone’s camera. In this way, the application is able to track the environment in front of the user and display objects and text in the appropriate place and orientation in the user’s view (figure 3).
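
The compensation for the single mirror can be sketched as two small operations: reversing the target image, and mirroring the position of each attached item about the target’s centre line. This Python sketch only illustrates the idea; in the prototype the flipped image is reimported into Unity. The helper names, the row-major list representation, and the assumption that the reversal is left-to-right (which depends on how the mirror is mounted) are all hypothetical.

```python
def flip_horizontal(image):
    """Return a left-right mirrored copy of a row-major 2D image."""
    return [list(reversed(row)) for row in image]


def mirror_x(position, width):
    """Mirror an (x, y) position across the vertical centre line of the target."""
    x, y = position
    return (width - x, y)


# A tiny stand-in for the target picture: two rows of three pixels.
target = [
    [1, 2, 3],
    [4, 5, 6],
]
flipped_target = flip_horizontal(target)

# A virtual item placed 0.5 units from the left edge of a 3-unit-wide target
# ends up 0.5 units from the right edge after mirroring.
item_position = (0.5, 1.0)
mirrored_position = mirror_x(item_position, width=3.0)
```

Applying the same flip twice returns the original image, which is why a two-mirror periscope would not need this correction.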

Figure 3: the headset

4. Providing virtual content

In recent years, AR has attracted much attention from museums and other cultural institutions, mainly for its capacity to provide additional content and support the visitor’s experience of real objects (Dalsgaard & Halskov, 2011; Damala, Marchal, & Houlier, 2007). For instance, the Smithsonian has launched four AR mobile apps: Leafsnap, which identifies trees by image recognition of leaf shapes; Skin and Bones, which uses animal skeletons as triggers for AR animations of the live animals; View NMAHC, which provides an AR view of the newest Smithsonian museum at its location on the National Mall in DC as it is being constructed; and the Smithsonian app, whose “Smithsonian that Way” feature overlays the smartphone’s camera view with AR signposts to Smithsonian museums around the user. The British Museum also recently released “A Gift for Athena,” an app that supports children’s engagement with the museum’s Parthenon gallery.

In line with this trend, the headset will first engage visitors by providing additional virtual content, such as text and three-dimensional (3D) models, that augments the exhibits in the gallery around them. Therefore, we ran a study to test the capacity of the Pepper’s Ghost illusion to produce a sufficiently clear visual display of a range of textual and graphical images, with a prototype of our headset with a common mobile phone, using the phone’s standard screen brightness settings under a range of indoor lighting conditions.

Figure 4: user-study setting with table, prototype headset, and participant

User study

The study involved twelve participants (ten males and two females) aged between eighteen and forty-five years old. Each participant sat at a table in front of a white wall. The participant placed the headset in front of his face and looked through the plexiglass towards the wall in front of him (figure 4). For this pilot study, the headset was made of cardboard and contained a 15- by 15-centimeter sheet of plexiglass inclined at an angle of 45 degrees. We placed an LG Nexus 4 mobile phone in the top of the headset. The bottom of the headset was open to allow the projection of images from the phone’s screen on to the plexiglass. We also developed a prototype mobile application using Unity 3D and installed it on the phone. The Unity scene included a black skybox, a directional light, and a set of elements to be displayed. In each trial in our study, the items appeared one after another against the black background, each for twenty seconds. They were displayed as “slides” in the following order:

  • Five words written in Arial 25: samurai, museum, bicycle, garden, room
  • Five words written in Arial 50: object, treasure, label, collection, cellar
  • Five circular blurs, each of a different colour: blue, red, green, yellow, white
  • A 3D column with a dark texture
  • A 3D pharaoh’s head with a lighter texture
  • A set of coloured shapes: a green square, a yellow circle, a red triangle, a blue hexagon, a white rectangle
  • A black-and-white photo of Charles Wade (collector and late owner of the National Trust’s Snowshill Manor; see below)
  • A colour photo of a clock (see figure 5)

Once the headset has developed from a work in progress into a sufficiently robust and reliable device, we will investigate its use by visitors to National Trust properties. The Trust’s Snowshill Manor is a useful test site, as it combines dimly lit interiors with a large and eclectic collection displayed throughout its rooms. Hence, the words on the first two slides were selected from the content of the Snowshill Manor website. The text remained the same but changed colour for each trial. We could not test every possible colour, but each text, shape, and blur had a different hue: we used the four fundamental hues (blue, red, green, and yellow) and pure white. The RGB colour system includes all the colours from the combinations of red, green, and blue, and uses integer values from 0 to 255 to determine the intensity of each colour. For each colour, we used the maximum brightness value in the RGB scale to maximise the effectiveness of the Pepper’s Ghost illusion: 255-0-0 for red, 0-255-0 for green, 0-0-255 for blue, 255-255-0 for yellow, and 255-255-255 for white.

Figure 5: screenshots of the mobile application

As the trials proceeded, the participant was asked to tell the researcher if he saw anything appearing in front of him. If so, he was asked to describe what he saw and whether he saw it clearly or poorly. Each session was audio recorded, and the researcher also took notes. For each slide, the researcher recorded on a form whether the participant saw nothing (0), saw the item poorly (1), or saw it well (2). If the participant saw just a faint, unrecognisable image, the researcher recorded 0.5. If the image was almost clear but missing some details, the researcher recorded 1.5. At the end of each trial of eight slides, the researcher changed the ambient illumination of the room, checking the level with a digital light meter. Then the participant ran the next trial of eight slides. Each participant ran a total of six trials: four with different room illumination and two with a different phone-screen brightness setting. Each trial lasted about three minutes. After each trial, the application paused for forty seconds while the researcher changed the ambient lighting level in the room, before restarting from the beginning of the next trial (i.e., from the first slide of five words). Hence, the whole evaluation lasted about twenty minutes for each participant.
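
The scoring scale and the session length described above can be summarised in a short worked calculation. The figures come from the text (eight slides of twenty seconds, six trials, forty-second pauses); the constant names are illustrative, and the assumption that a pause occurs only between consecutive trials, not after the last one, is ours.

```python
SLIDES_PER_TRIAL = 8
SECONDS_PER_SLIDE = 20
TRIALS = 6
PAUSE_SECONDS = 40

# Visibility scale recorded by the researcher for each slide.
SCORES = {
    0.0: "saw nothing",
    0.5: "faint, unrecognisable image",
    1.0: "saw poorly",
    1.5: "almost clear, missing some details",
    2.0: "saw well",
}

trial_seconds = SLIDES_PER_TRIAL * SECONDS_PER_SLIDE  # 160 s per trial

# Assumption: one pause between consecutive trials, none after the last.
pauses = TRIALS - 1
session_seconds = TRIALS * trial_seconds + pauses * PAUSE_SECONDS  # 1160 s
session_minutes = session_seconds / 60
```

Each trial therefore takes 160 seconds, a little under three minutes, and the whole session comes to 1,160 seconds, or just over nineteen minutes, consistent with the reported duration of about twenty minutes.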

Ambient and mobile display illumination

The level of environmental illumination is potentially a limiting factor for the effective use of this headset, because a room that is too bright can limit the efficacy of the Pepper’s Ghost illusion. In this study, we were interested in investigating the effect of different indoor lighting conditions on the headset’s use, given standard brightness settings for a typical smartphone screen. Thus, we considered two factors: the brightness of the phone’s screen (using the phone’s default medium and maximum brightness settings), and the intensity of the ambient light. We measured (in lux) how much ambient light illuminated the area right in front of the headset. The following four ambient illumination levels, based on the Illuminating Engineering Society guidelines (Menter, 2014), were tested with the phone’s default medium screen-brightness setting:

  • From 50 to 80 lux: spaces rarely used for visual tasks, such as night-time sidewalks and parking lots
  • From 100 to 160 lux: interiors with minimal demand for visual acuity, such as corridors and changing rooms
  • From 200 to 300 lux: interiors with low/some demand for visual acuity, such as foyers and entrances, dining rooms, libraries, and teaching spaces
  • From 400 to 500 lux: interiors with moderate demand for visual acuity, such as general offices, retail shops, and kitchens

In addition, we tested the two extreme levels of ambient illumination (50 to 80 lux and 400 to 500 lux) with the phone’s screen set to maximum brightness.
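
The resulting test matrix, four ambient bands at medium screen brightness plus the two extreme bands at maximum brightness, can be written out explicitly. The sketch below is a bookkeeping aid rather than part of the study software; the names `BANDS`, `CONDITIONS`, and `band_for` are hypothetical.

```python
# Ambient illumination bands used in the study, in lux (inclusive bounds).
BANDS = {
    "50-80 lux": (50, 80),
    "100-160 lux": (100, 160),
    "200-300 lux": (200, 300),
    "400-500 lux": (400, 500),
}

# Every band was tested at medium brightness; only the extremes at maximum.
CONDITIONS = [(band, "medium") for band in BANDS] + [
    ("50-80 lux", "maximum"),
    ("400-500 lux", "maximum"),
]


def band_for(lux):
    """Return the tested band containing a light-meter reading, or None."""
    for name, (low, high) in BANDS.items():
        if low <= lux <= high:
            return name
    return None
```

The list contains six conditions, matching the six trials run by each participant. A reading that falls between bands (for example, 90 lux) returns `None`, which reflects the gaps between the tested ranges.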

Key findings

The blurred circles and shapes (see figure 5) were more visible than any other content, both at medium and maximum screen brightness. At medium brightness, they were clear up to 200 to 300 lux; at maximum brightness, they were fully visible in both the tested ambient illumination conditions (i.e., 450 lux and 50 lux). The black-and-white photo was harder to see than the colour photo, and its details were recognizable only at maximum screen brightness and 50-lux illumination. The 3D column was fully visible only at 50 lux and maximum screen brightness, in which conditions participants were able to describe details of the column. At both 450 lux with maximum screen brightness and 50 lux with medium screen brightness, participants were able to see only the base of the column, which was lighter than the rest of the 3D model, and the general shape of the column. The pharaoh’s head, which had a lighter texture, was more visible than the column: users were able to see the head with more ambient illumination and less screen brightness, while the column was almost invisible in any ambient light over 50 lux with medium screen brightness.

With all forms of content (i.e., text, shapes, blurs, photos, and 3D models), blue and red colours were the hardest to see. The blue and red text was invisible to most participants when the environmental illumination was above 50 lux with medium screen brightness, while it was completely visible at both 50 lux and 450 lux with maximum screen brightness. The bigger text was usually clearer than the smaller text, but at 50 lux with maximum screen brightness both sizes were perfectly clear and therefore equally visible.
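
The study did not measure the luminance of the displayed colours, but one possible contributing factor to this result is that fully saturated blue and red emit less perceived light than green, yellow, or white at the same RGB intensity. The sketch below ranks the five test colours by relative luminance using the standard Rec. 709 weights (0.2126, 0.7152, 0.0722). It is offered as a hedged illustration, not as an analysis performed by the authors. Because every channel here is either 0 or 255, gamma correction does not change the result and is omitted.

```python
# The five test colours, as specified in the study.
COLOURS = {
    "red": (255, 0, 0),
    "green": (0, 255, 0),
    "blue": (0, 0, 255),
    "yellow": (255, 255, 0),
    "white": (255, 255, 255),
}


def relative_luminance(rgb):
    """Rec. 709 relative luminance; valid here because channels are 0 or 255."""
    r, g, b = (channel / 255 for channel in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


# Colour names ordered from the lowest to the highest relative luminance.
ranking = sorted(COLOURS, key=lambda name: relative_luminance(COLOURS[name]))
```

Under this measure, blue and red come out lowest, followed by green, yellow, and white.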

Even with the brightest ambient illumination of 450 lux, maximum screen brightness gave good results, with similar visibility scores to those for medium screen brightness and 50-lux ambient illumination. In both those conditions (i.e., 450 lux with maximum screen brightness and 50 lux with medium screen brightness), only the small text, the 3D models, and the black-and-white photo were poorly visible. Unsurprisingly, maximum screen brightness with 50-lux ambient illumination gave the best results, with maximum visibility of almost every displayed object. The exceptions were the column and the black-and-white photo, which were scored 1.5 for two participants because they were still not completely clear to them (figure 6).

Figure 6: AR content visibility scores according to levels of ambient illumination (450, 250, 150, 50 lux) at medium (Med sB) and maximum (Max sB) screen-brightness settings

In conclusion, our AR headset can effectively project virtual content, such as 3D models, shapes, and text, right in front of the user’s eyes, overlaid directly on his view of the real world. Our pilot study has shown that a range of virtual content in different colours is very visible using the headset in conditions similar to the rooms and galleries in many historical properties, such as those of the National Trust, which maintain quite dimly lit interiors in order to help preserve the heritage artifacts within them.

5. Conclusion: Guiding visitors’ attention

Augmented-reality (AR) mobile applications are a growing platform for museums because of their potential to enhance visitors’ experience with additional content, as well as to guide visitors through the museum content. Indeed, they can be particularly successful in supporting users’ navigation in real space (Lin et al., 2013; Paucher & Turk, 2010), for example by guiding visitors towards relevant exhibitions (Lee & Park, 2007) and reducing the shift of attention between the virtual and real world (Liu et al., 2012). However, earlier studies of visitors’ attention in museums suggested that attention is inversely related to object satiation (Melton, 1935; Robinson, 1928); that is, the more objects that are on display, the less attention visitors will pay to each item. Therefore, any new object that is introduced to the museum environment is a potential distraction from the exhibits (Melton, 1935; Robinson, 1928). Furthermore, too much variety (e.g., more than four different types of objects on display) may introduce perceptual distraction and fatigue (Melton, 1935; Robinson, 1928). The potential for creating these effects is strengthened with the introduction of virtual objects, for instance via a mobile device. Therefore, it is increasingly important not just to provide additional information but also to personalize the visit by highlighting specific items and guiding visitors through the collections, creating a narrative. This is particularly important for museums with an extensive variety of objects on display, such as the British Museum, which has more than eighty rooms with collections from different ages and geographical areas, or Snowshill Manor, a much smaller National Trust museum that exhibits an incredibly heterogeneous collection.

Our headset can help visitors to orient themselves amongst a variety of virtual and real content, mitigating what Gilman calls “museum fatigue” (Gilman, 1916), reflecting the observation that “visitor interest towards exhibits decreased as visits progressed” (Davey, 2005). The tracking system described above allows the headset to recognize what visitors are looking towards. Consequently, it can guide visitors’ attention towards a specific area or object. This approach can provide support for interaction with both real and virtual space through visual attention (Zhang et al., 2014). In particular, visual saliency may be useful for highlighting specific areas with AR without distracting from real-world tasks (Veas et al., 2011). Not only is salient information easier to find and focus on (McCay-Peet, Lalmas, & Navalpakkam, 2012), but dynamic saliency maps can help to direct attention in a specific direction and order (McNamara, Booth, & Sridharan, 2012).

Our user study demonstrated that the headset can successfully project different shapes and colours. Therefore, we could highlight areas or objects in the gallery by overlaying them with highlighting shapes, or use virtual arrows to indicate a route through the exhibition. For example, Snowshill Manor has walls with a huge variety of objects on display. With the headset, we can highlight with a white square specific areas to guide and focus visitors’ attention, and we can also project items in a particular order to tell a story. Visitors can choose to follow these indications or not by simply walking or looking in a different direction. In this case, the headset will lose track of the previous view and recalculate the new perspective, looking for other targets to recognize. This system is technically feasible; however, we do not know how visitors in general would react to our headset during a museum visit. Therefore, we plan to test the potential of the headset with a more robust, wearable frame at Snowshill Manor (figure 7) in summer 2015.

Figure 7: uncover treasures and surprises in every corner of Snowshill Manor. © Victoria Swinglehurst (www.nationaltrust.org.uk/snowshill-manor/)


Čopič Pucihar, K., P. Coulton, & J. Alexander. (2013). “Evaluating dual-view perceptual issues in handheld augmented reality.” In Proceedings of the 15th ACM on International conference on multimodal interaction – ICMI ’13, 381–388. New York, New York, USA: ACM Press. doi:10.1145/2522848.2522885

Dalsgaard, P., & K. Halskov. (2011). “3d projection on physical objects.” In Proceedings of the 2011 annual conference on Human factors in computing systems – CHI ’11, 1041. New York, New York, USA: ACM Press. doi:10.1145/1978942.1979097

Damala, A., I. Marchal, & P. Houlier. (2007). “Merging Augmented Reality Based Features in Mobile Multimedia Museum Guides.” Anticipating the Future of the Cultural Past, CIPA Conference 2007, 1-6 October 2007. October 1. Available https://halshs.archives-ouvertes.fr/halshs-00530903

Davey, G. (2005). “What is museum fatigue?” Visitor Studies Today 8(3), 17–21. Available http://www.academia.edu/1093648/What_is_museum_fatigue

Gilman, B. (1916). “Museum fatigue.” Scientific Monthly 12, 67–74.

Grasset, R., A. Mulloni, M. Billinghurst, & D. Schmalstieg. (2011). “Navigation Techniques in Augmented and Mixed Reality: Crossing the Virtuality Continuum.” In Handbook of Augmented Reality, 379–407. Available http://www.hitlabnz.org/index.php/research/augmented-reality?view=publication&task=show&id=1419

Lee, D.-H., & J. Park. (2007). “Augmented Reality Based Museum Guidance System for Selective Viewings.” In Second Workshop on Digital Media and its Application in Museum & Heritages (DMAMH 2007), 379–382. IEEE. doi:10.1109/DMAMH.2007.57

Lin, P., S. Chen, Y. Li, M. Wu, & S. Chen. (2013). “An Implementation of Augmented Reality and Location Awareness Services in Mobile Devices.” In Proceedings of MUSIC (pp. 509–514). Available http://link.springer.com/chapter/10.1007/978-3-642-40675-1_76#page-1

Liu, T. C., Y. C. Lin, M. J. Tsai, & F. Paas. (2012). “Split-attention and redundancy effects on mobile learning in physical environments.” Computers and Education 58, 172–180. doi:10.1016/j.compedu.2011.08.007

McCay-Peet, L., M. Lalmas, & V. Navalpakkam. (2012). “On saliency, affect and focused attention.” In Proceedings of the 2012 ACM annual conference on Human Factors in Computing Systems – CHI ’12, 541. New York, New York, USA: ACM Press. doi:10.1145/2207676.2207751

McNamara, A., T. Booth, & S. Sridharan. (2012). “Directing gaze in narrative art.” In Proceedings of the ACM Symposium on Applied Perception – SAP ’12, 63. New York, New York, USA: ACM Press. doi:10.1145/2338676.2338689

Melton, A. (1935). “Problems of installation in museums of art.” American Association of Museums. New Series No. 14.

Menter, A. (2014). Measuring Light Levels – Autodesk Sustainability Workshop. Consulted November 28, 2014. Available http://sustainabilityworkshop.autodesk.com/buildings/measuring-light-levels

Paucher, R., & M. Turk. (2010). “Location-based augmented reality on mobile phones.” In 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition – Workshops (pp. 9–16). IEEE. doi:10.1109/CVPRW.2010.5543249

Robinson, E. (1928). “The behaviour of the museum visitor.” American Association of Museums. New Issue No. 5.

Veas, E. E., E. Mendez, S. K. Feiner, & D. Schmalstieg. (2011). “Directing attention and influencing memory with visual saliency modulation.” In Proceedings of the 2011 annual conference on Human factors in computing systems – CHI ’11, 1471. New York, New York, USA: ACM Press. doi:10.1145/1978942.1979158

Wein, L. (2014). “Visual recognition in museum guide apps.” In Proceedings of the 32nd annual ACM conference on Human factors in computing systems – CHI ’14, 635–638. New York, New York, USA: ACM Press. doi:10.1145/2556288.2557270

Zhang, L., X.-Y. Li, W. Huang, K. Liu, S. Zong, X. Jian, et al. (2014). “It starts with iGaze.” In Proceedings of the 20th annual international conference on Mobile computing and networking – MobiCom ’14, 91–102. New York, New York, USA: ACM Press. doi:10.1145/2639108.2639119

Cite as:
De Angeli, D., & E. O’Neill. "A smartphone headset for augmented-reality experiences in museums." MW2015: Museums and the Web 2015. Published January 15, 2015.