Hardware for illusion - what makes AR tick?

As you have probably figured out by now, Augmented Reality depends a great deal on the hardware. In fact you may even say it’s vital to its existence. In this chapter we will introduce you to the tools of the trade, the machinery which together will shape the way we will perceive our world in the future.


Hardware and sensors are very critical in the implementation of Augmented Reality. Find out how...

The basic kit

Although augmented reality is used in many different fields, all its applications share a few traits that cannot be ignored. They all need some input to augment, and most of the time this input comes from a camera. Next, the system must decide what graphical information to show on screen; the amount of annotation varies with the application. In every case, however, sensing devices are required to gather information about what needs to be displayed. Depending on how the AR application interacts with the user, it may need other input devices as well (e.g. a tourism AR browser might offer a touch-to-learn feature). Then comes the heart of AR: a processor, which combines the visual data fed into the system with the augmentation that needs to be displayed. Finally, since the augmentation has to be shown, AR also needs a display in most cases.

One size does not fit all

Depending on the application, an AR system might need more than the common hardware mentioned above. For instance, AR in tourism would need inputs from a GPS unit, an accelerometer and even a digital compass. AR in the military, on the other hand, might require additional tagging and real-world path highlighting. It might need information from custom devices, the local map and a few other sources to assist the soldier. A virtual shopping trial room would work well enough with a webcam. In this chapter we take a look at the hardware involved in AR, starting with the display.


There are numerous ways to show the augmentation in AR systems, and the choice depends on the scope of the particular AR application. The display differs with each use case. Let’s find out how.

Head-mounted display (HMD)

Head-mounted displays are displays that can be mounted (worn) on a person’s head. They are sometimes also called ‘helmet-mounted displays’. HMDs come in two types, one of which cannot be used for AR.

1 HMDs that display only CGI - Most HMDs on the consumer market (though you will not find them easily) are of this type and display only computer generated images (CGI). These are the HMDs which cannot be used for AR. The main requirement of AR is to augment reality. Since these HMDs offer no view of the real world in the first place, there is nothing for them to augment. They are usually used in virtual reality setups.

2 HMDs that allow superimposition of CGI on the real world view - These displays can show real world imagery with additional data superimposed on it. They are again of two types:

a Video see through: These HMDs capture the real view with a camera and mix the CGI information onto it.

b Optical see through: These HMDs let the user view the real world through a semi-reflective mirror and use a projector to project the CGI data onto that mirror. Because the mirror is only half reflective, the user sees the real world and the projected data superimposed. Another kind of optical see through HMD uses OLED displays which act as both the viewing glass and the display.

HMDs are among the most useful types of display, although they are not the most common. They have many uses.

One of the various types of HMDs

1 Military: HMDs are built into fighter pilots’ helmets to display important information about the flight and the target. They often have useful features such as night vision, and the visors may even be protective. HMDs are also used by the army and police, where the displays show vital information about the surroundings, e.g. maps and thermal imaging data which can help detect a target in very low light. These HMDs also feature communication gear for coordination among soldiers as well as for receiving commands from base. Most military HMDs use an OLED display instead of a semi-reflective mirror.

2 Sports: Sports such as Formula One racing can use AR with HMDs to show the driver important data about the race.

HMDs feature head movement detection, which provides an intuitive and immersive experience by giving the user six degrees of freedom.
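
To make the video see through idea from earlier in this section a little more concrete, here is a minimal sketch of how a CGI layer can be mixed onto a camera image. It assumes simple per-pixel alpha blending on tiny frames stored as Python lists; the function names `blend_pixel` and `composite` are made up for this illustration, and a real HMD would do this work on dedicated graphics hardware.

```python
def blend_pixel(camera, overlay, alpha):
    """Mix one CGI (overlay) pixel onto one camera pixel.

    camera, overlay: (r, g, b) tuples with values from 0 to 255.
    alpha: overlay opacity, 0.0 (invisible) to 1.0 (fully opaque).
    """
    return tuple(
        round(alpha * o + (1.0 - alpha) * c)
        for c, o in zip(camera, overlay)
    )


def composite(camera_frame, overlay_frame, alpha_mask):
    """Blend a whole frame. All three inputs are lists of rows of equal size."""
    return [
        [blend_pixel(c, o, a) for c, o, a in zip(cam_row, ov_row, a_row)]
        for cam_row, ov_row, a_row in zip(camera_frame, overlay_frame, alpha_mask)
    ]


# A 1x2 "camera image": the left pixel is left untouched (alpha 0.0),
# the right pixel gets a half-transparent red annotation (alpha 0.5).
camera_frame = [[(100, 100, 100), (100, 100, 100)]]
overlay_frame = [[(0, 0, 0), (200, 0, 0)]]
alpha_mask = [[0.0, 0.5]]
augmented = composite(camera_frame, overlay_frame, alpha_mask)
print(augmented)  # [[(100, 100, 100), (150, 50, 50)]]
```

Wherever the alpha value is zero the user sees only the camera image, so the annotation can cover just the part of the view that needs it.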

Eye Glasses

Eye glasses are simpler. In most cases, eye glasses used for AR look just like normal glasses with an added camera. These devices let the user see the world as they would through normal glasses and add the augmentation onto the view. They feature OLED displays which allow the extra layer to be displayed directly on the glass. Some of them use the semi-reflective mirror approach as well.

AR eye glasses look like the normal ones, though with a camera

Contact Lenses

As strange as it might sound, contact lenses are already in development which would cover the eye with hundreds of LEDs to augment reality for you. These contact lenses are essentially tiny displays controlled by built-in circuitry. As technology progresses, that tiny amount of real estate will even hold integrated systems for wireless communication. Research in this area is still going on and the technology is only a little way past the drawing board. One line of ongoing research is looking at using solar energy to power all the electronics in the lens.

Contact lenses capable of AR are going to be available in the future

Handheld displays

If you are a smartphone user, chances are you have already used AR applications. Even if you are not, we believe this is one AR display which is not going to send any shockwaves of awe. Handheld displays are the most common AR displays and a lot of people already have them in the form of smartphones. From the AR viewpoint, these displays possess nothing special in the hardware department.

It is the other hardware components of the phone which bring in the specialities. The components which aid AR on these displays mostly consist of a GPS module, an accelerometer, a digital compass and the components used for data and communication.

The disadvantage of handheld displays is that one needs to keep holding the device in front of oneself all the time.

Spatial displays

Technically speaking, these are not the normal screens found in other devices. Instead, the image is produced by a projector, which projects the augmented view directly onto the object being augmented. That being said, these setups (we call them ‘setups’ rather than ‘displays’ to save you some confusion) are meant not for individual users but for a group of users.

Spatial displays have the advantage that they need not be worn or carried around (well, literally you cannot as they are preset for a given environment only) and since all users of the SAR (Spatial Augmented Reality) setup can see each other, they can easily collaborate.

Though we have mentioned quite a number of displays that are used for AR, there are others which are not as popular. Let us now go to devices and methods used for inputting data into the AR systems.

Input hardware for Interaction

“We think basically you watch television to turn your brain off, and you work on your computer when you want to turn your brain on.” - Steve Jobs

While opinions may vary greatly when it comes to Steve Jobs, it is pretty difficult to dispute the logic of the sentence quoted above. Interaction is the main difference between two things that look just alike: a television and a PC. Both have a display system, but the former does not let you play with it as much as the latter does. While a TV lets you change channels and turn the volume up or down, a PC lets you do much more (even when it comes to entertainment alone).

AR is one of the most active areas of research in computing but, strangely, AR does not require much interaction. In fact, much of the interaction within different types of AR implementations is automated to the point that you do not have to command the system once it has started working. A lot of research is directed towards minimizing whatever interaction you do have with the system. For instance, if you were wearing an HMD, you would not want to keep pressing buttons to make it display what is on your right. Instead, you would turn right and the HMD would augment the environment to your right. Such an experience does not require you to constantly command the system. The system detects your movements and changes the view based on a prediction of what you want.

But not all AR implementations are used, or are intended to be used, without any interaction. Take for example the tourism AR applications you could install on your phone. Some of them do require you to tell them what information you want augmented into the real view. In the famous AR browser application ‘Layar’, you first have to select the ‘layer’ of information which should provide the augmentation. Other methods of manual input include verbal commands given to the AR system. Let us again take the example of an HMD. In most cases an HMD covers the entire viewing area in front of your eyes. In such a case it becomes very cumbersome to command the system using controls such as buttons.

Hence advanced HMDs must feature a system which allows the user to command them by voice. But of course voice may not work well in every AR system. For example, if an architect is using an HMD to view a 2D drawing of a building in 3D, commanding the HMD by voice to show minor changes in the design will be a much more frustrating experience than using a wand or stylus to alter the view.

Another way to send commands to an AR system is a glove which detects the wearer’s hand gestures and sends them to the processing unit, which then takes an appropriate action based on the gesture. Yet another way to send manual input to an AR system is touch. For example, an AR browser could be designed to show extra information about a building in front of you only if you touch its image on your mobile phone’s display.

Input hardware for tracking changes in environment

Before going further, we want to make it clear that all interaction hardware is input hardware in its own right. It allows the user to input commands to the system. That is not what this part is about. In this part we talk about the hardware which allows AR implementations to function automatically (and intuitively). Let’s start with the implementation of AR which we can encounter first hand: AR browser applications built for mobile phones.

Using AR on your mobile phone to assist you during a journey is one of the simplest uses of AR systems. Most AR browsers draw on several information sources built into your phone to add information to your view of the places around you. As we said earlier, AR browsers do not actually understand what they see. They show information from the app’s server, which depends on your phone sending its location and orientation.

To learn the location, AR browsers use the GPS module built into your phone. Without GPS information, the server cannot know your exact location, and AR would either not work at all or work inaccurately. The second piece of information needed is the direction you are facing, which is supplied by the phone’s built-in compass. The third is the angle at which you (or rather your phone) are looking relative to the ground. For example, if you are trekking and point your phone towards the base camp, the AR application needs to know that you are looking down, or else it might mark the base camp as the hill’s peak. The angle of view is provided by the accelerometer in the device (the same component that auto-rotates pictures and videos as you turn your phone). Depending on the application, either the server or the phone calculates what the camera is seeing and then superimposes the placemark information onto it.

HMDs, one of the prominent devices used for AR, can perform quite a number of functions depending on what they are built for.
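
For the phone-based AR browser described above, the three sensor readings (GPS position, compass heading and accelerometer tilt) can be combined roughly as follows to decide whether a placemark should be drawn. This is only a sketch under stated assumptions: the function names, the 60 by 45 degree field of view and the accelerometer axis convention (y along the phone’s long edge, z out of the screen, camera on the back) are all chosen for illustration and do not describe how any particular AR browser works.

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean Earth radius


def bearing_deg(lat1, lon1, lat2, lon2):
    """Compass bearing (0 = north, 90 = east) from point 1 towards point 2."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dlon = math.radians(lon2 - lon1)
    y = math.sin(dlon) * math.cos(phi2)
    x = (math.cos(phi1) * math.sin(phi2)
         - math.sin(phi1) * math.cos(phi2) * math.cos(dlon))
    return math.degrees(math.atan2(y, x)) % 360.0


def distance_m(lat1, lon1, lat2, lon2):
    """Great-circle (haversine) distance between two GPS fixes, in metres."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlon = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlon / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def angle_diff_deg(a, b):
    """Signed smallest difference a - b between two angles, in -180..180."""
    return (a - b + 180.0) % 360.0 - 180.0


def pitch_from_accel_deg(ay, az):
    """Camera pitch from gravity readings (assumed axis convention, see text).

    0 = phone upright and camera level, negative = camera pointing down.
    """
    return math.degrees(math.atan2(-az, ay))


def placemark_in_view(device, poi, heading_deg, pitch_deg,
                      hfov_deg=60.0, vfov_deg=45.0):
    """device and poi are (latitude, longitude, altitude_m) tuples."""
    lat1, lon1, alt1 = device
    lat2, lon2, alt2 = poi
    # Horizontal check: is the placemark's bearing close to the compass heading?
    horizontal_off = angle_diff_deg(bearing_deg(lat1, lon1, lat2, lon2),
                                    heading_deg)
    # Vertical check: is its elevation angle close to the camera's pitch?
    elevation = math.degrees(
        math.atan2(alt2 - alt1, distance_m(lat1, lon1, lat2, lon2)))
    vertical_off = elevation - pitch_deg
    return (abs(horizontal_off) <= hfov_deg / 2
            and abs(vertical_off) <= vfov_deg / 2)


# The trekking example: base camp lies about 1 km north and 500 m below.
trekker = (0.0, 0.0, 1000.0)
base_camp = (0.009, 0.0, 500.0)
print(placemark_in_view(trekker, base_camp, heading_deg=0.0, pitch_deg=-25.0))  # True
print(placemark_in_view(trekker, base_camp, heading_deg=0.0, pitch_deg=25.0))   # False
```

With the phone tilted down the base camp falls inside the view, while the same heading with the phone tilted up does not, which is exactly the distinction the accelerometer provides.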

Infrared inputs built into AR systems for cars can help prevent accidents

HMDs used by armed forces usually contain all the components that an AR browser for mobile would, i.e. GPS, accelerometer and compass. The compass and accelerometer in the HMD can be used to track head movements to provide a more immersive experience. Eye tracking is another possible addition. If the computer knows where the user is looking, it can improve the interface by adding more detail to certain types of objects in that direction, or help the user focus on that area alone by fading out other portions of the screen. In some cases, however, the HMD does not need a GPS, such as when it is used in a closed environment where the position of the HMD is already known to other components of the AR setup. An engineer’s workspace could be an example.

Additional input can be provided to the HMD through hand movements, using special gloves connected to the HMD by wires or by short range wireless methods (e.g. Bluetooth). This enables gesture-based input for commanding the HMD without the use of voice. Gesture-based input is often used for entertainment, where hand tracking lets the user play and interact with virtual objects. The day may not be far off when you wear a complete AR suit that adjusts what you see based on your blood pressure, skin humidity, muscle activity and other factors.

Infrared sensors can be used to aid night vision and, through clever use of colors, to reveal objects normally hidden from eyesight. They can show the presence of an object which might be perilous and so save a life. For example, infrared sensors in the windscreen of a car can make night driving safer by showing hot objects from a distance. These can include humans crossing the road.
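
As a rough illustration of how an infrared feed could be turned into a warning on the windscreen, the sketch below marks the cells of a small thermal image that are warmer than a threshold and computes a box around them to highlight. The 30 degree Celsius threshold, the grid of temperatures and the function names are assumptions made for this example; a real system would need far more careful detection.

```python
def hot_mask(thermal_frame, threshold_c=30.0):
    """Return a grid of booleans: True where a cell is at or above threshold."""
    return [[temp >= threshold_c for temp in row] for row in thermal_frame]


def bounding_box(mask):
    """Smallest (top, left, bottom, right) box containing every True cell.

    Returns None when nothing in the frame is hot.
    """
    hot_cells = [
        (r, c)
        for r, row in enumerate(mask)
        for c, is_hot in enumerate(row)
        if is_hot
    ]
    if not hot_cells:
        return None
    rows = [r for r, _ in hot_cells]
    cols = [c for _, c in hot_cells]
    return (min(rows), min(cols), max(rows), max(cols))


# A made-up 3x3 thermal image (degrees Celsius) of a cool road at night
# with something warm, such as a pedestrian, in the middle row.
frame = [
    [12.0, 13.0, 12.0],
    [12.0, 36.0, 35.0],
    [11.0, 12.0, 12.0],
]
box = bounding_box(hot_mask(frame))
print(box)  # (1, 1, 1, 2)
```

The resulting box is what the display would outline or color so that the driver notices the warm object early.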

Despite so many input devices already present for AR, there are a number of methods being devised further to make the augmentation so intuitive that it blurs the lines between reality and augmented reality.


It is the processing part which actually does the work in any computing system. No matter which computing system you pick as a case study, processing is at the center of it all. If the processing system is not working well, everything else (all the input devices, tracking hardware and display systems) becomes useless. When it comes to AR, processing is more substantial and critical than the other parts. Obviously the processing units and methodologies are not simple at all. In fact, they are sophisticated enough that an in-depth discussion of them is beyond the scope of this FastTrack. No matter, we will simplify the process for you in the next chapter and pique your curiosity a little more.