‘Point, Click & Learn’
Visual search and augmented reality experiences seem poised to evolve as early adopter platforms for learning based on images, objects and places that exist in the physical world.
Google, Nokia, Ricoh, Intel, and Microsoft have all demonstrated or released beta and 1.0 version services that layer digital information over images and video captured by the camera holder or person looking at the screen.
The vision (pun intended) for visual and augmented reality platforms is to use cameras, screens and projection systems for uncovering and layering digital information about objects (including text) and places. So you can learn about a particular flower or building while standing in front of it, rather than later at home in front of your computer. The hope is to move beyond photo/video capture and bring new functionality to the lens as a learning device. No keyboard or mouse needed: just point, click and learn.
Camera + Web-based Software = Augmented Visual Learning
We can already see demonstrations of first generation personal learning experiences based on visual augmented reality (digital layers over real world images) and software services that tap the power of scalable cloud computing architectures:
- A student learning biology is able to point, click & learn about a tree leaf, an insect or a bird whether the object exists in real life or as an image inside a book (e.g. Bobcat tracking app; IdentityTree)
- A tourist uses their mobile camera to identify the name and history of a landmark building; or to help them learn about the local mass transit options (e.g. ‘Nearest Subway’ app; BART)
- A museum visitor sees an art piece and wants to learn more about the artist (e.g. museum app)
- An architecture student wants to see a time-lapse replay of a building’s construction, or an ‘x-ray’ layer image of the structural beams below the exterior skin
- An aspiring wine connoisseur wants to learn more about a vineyard or ideal food pairing by snapping an image of the bottle while inside the retail store (e.g. Tesco Wine app YouTube video; demo)
- Someone reading a newspaper sees a compelling image – points, clicks and learns more about the topic (e.g. Ricoh iCandy app1; demo2)
- A star gazer visiting the Southern hemisphere looks up at an unfamiliar sky – points, clicks and learns via an augmented layer explaining the night sky (e.g. Google Sky demo)
This is quite an impressive list for 2010! And yet these are only examples based on first generation software, hardware and a tiny catalog of images. The most exciting learning applications of visual search are ahead of us!
Visual Search 2011-2020
It is important not to confuse today’s beta and 1.0 version visual search and augmented reality apps with those likely to emerge in the next decade. Both platforms are likely to evolve alongside other applications based on 2D-3D modeling, location-based services, robotic vision, tagging, visual mashups, personal assistants (e.g. Siri) and personal learning systems.
But in order to have a ‘real-time’ experience in which we capture an image and have it immediately identified (from a catalog) and layered with relevant digital background information, we must think beyond the phone or camera itself and see the potential of software-as-a-service models.
Visual search catalogs and services will ‘live in the cloud’ and not on our devices. In other words, we will not have to rely on the memory or processing power inside of our phones. The phone will access image catalogs stored on the internet (or ‘in the cloud’).
This software-as-a-service architecture of cloud computing (e.g. networked & virtualized) offers users tremendous storage and processing power. It is a low-cost, scalable platform for individuals and companies to store, access and collectively learn about physical objects captured by camera lenses. This will allow us to access billions of images, tags and related content by tapping this massive cloud catalog of object shapes and textures.
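For readers who like to see ideas in code, here is a minimal sketch of the split described above: a thin client that only captures the image, and a catalog that does the identifying on the server side. Everything in it is an assumption made for illustration. The names `CloudCatalog`, `CatalogEntry` and `point_click_learn` are hypothetical, the catalog is just an in-memory dictionary, and an exact SHA-256 hash stands in for real visual matching, which would need to tolerate changes in angle, lighting and scale.

```python
import hashlib
from dataclasses import dataclass


@dataclass
class CatalogEntry:
    label: str
    info: str


class CloudCatalog:
    """Hypothetical stand-in for an image catalog hosted in the cloud."""

    def __init__(self):
        self._entries = {}

    @staticmethod
    def fingerprint(image_bytes: bytes) -> str:
        # Assumption: an exact hash replaces real visual feature matching.
        return hashlib.sha256(image_bytes).hexdigest()

    def register(self, image_bytes: bytes, label: str, info: str) -> None:
        # Store the entry under the image's fingerprint.
        self._entries[self.fingerprint(image_bytes)] = CatalogEntry(label, info)

    def identify(self, image_bytes: bytes):
        # Return the matching entry, or None if the image is unknown.
        return self._entries.get(self.fingerprint(image_bytes))


def point_click_learn(catalog: CloudCatalog, image_bytes: bytes) -> str:
    """Thin client: hand the captured image to the catalog, return overlay text."""
    entry = catalog.identify(image_bytes)
    if entry is None:
        return "No match found"
    return f"{entry.label}: {entry.info}"


catalog = CloudCatalog()
catalog.register(b"fake-leaf-pixels", "Oak leaf", "Leaf of an oak tree (genus Quercus)")

print(point_click_learn(catalog, b"fake-leaf-pixels"))
print(point_click_learn(catalog, b"unknown-pixels"))
```

The point of the sketch is the division of labor: the client function holds no catalog data and does no matching, so the catalog can grow to billions of entries without the phone needing more memory or processing power.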
My wish list for advanced visual search and learning by 2020?
Making the invisible, visible
I am most interested in real-time augmented reality experiences that allow users to test alternative assumptions and scenarios with real-world systems. I’d like to see visual interfaces that reveal layers about the molecular structure of our natural and synthetic worlds. And if all goes well, it might be micro-projectors which layer images directly onto objects and surfaces that really change the game by the end of the decade.
Imagine an engineering student standing on a highway overpass to study traffic flow patterns and then changing the parameters of vehicle speed and driver behavior to test alternative results. Or imagine a 5th grade student zooming in on any material to see the nanostructured reality that defines the material’s properties.
Alas, that is my vision of the next decade! For now, I am comforted and enthusiastic about the Beta and Version 1.0 experiences already on the marketplace!
I’ve included videos from Google and Nokia below:
Nokia’s Point & Find application, which uses a video camera to recognize real-world objects (e.g. solar panels, buildings, products)
Google has released Google Goggles as its own platform for camera based search
Here we see Goggles being used to translate menu text (in German) from a captured image into English
Google Goggles Demo
Another Goggles 1.0 real world demo
Using Google Goggles to identify photos taken in Europe
Additional clips
Original Point and Find demo from Nokia’s Beta Labs
Tesco wine visual search
Nearest Subway Search
Ricoh iCandy Apps
Additional Resources
- Microsoft Bing Visual Search (see this as a database of the future)
- HTML5 (<canvas>) demo at Google I/O event (YouTube)
- Augmented Reality companies (e.g. Metaio)
- [For those readers who are more technically oriented: I believe visual search (pictures and images) will be greatly enhanced through HTML5-based applications (e.g. <canvas>), 3D simulation environments, and NoSQL-based personal learning management systems.]
Image Source:
Creative Commons Attribution License

4 comments
AWESOME POST, GARRY. Very near to my heart and by far the most comprehensive I’ve seen on this topic to date. The videos in sequence are a treasure trove.
I agree with Alvis, vehemently!! Thank you Garry
Wonder how privacy issues will impact the adoption of these very cool concepts?
Thanks, and this went straight to my favorites. It will form the basis of many & (mostly) inane predictions I will make
Cheers,
Prince
Totally agree with these apps running in the cloud. I actually think all mobile apps will run in the cloud within the next 5 years. It makes far more sense (given the bandwidth) to have all the CPU and memory intensive processes churned away by server farms on the other side of the world. The phone should just handle the visual input and feedback. And of course the same thing is beginning to happen with desktops, and it will be interesting to see how the ChromeOS fares when it’s released later this year.
But one of the biggest reasons I’d like to see mobile apps run in the cloud, is that we can then develop a group of standard programming languages that can work with any smartphone. The fact that mobile app developers now have to separately program for the iPhone, Android, Palm, Blackberry, Nokia etc etc is a bit stupid. If we really want the mobile web to take-off and provide the potential so many of us have dreamed about, then the system needs to be like the web is now… open, accessible, and compatible across devices.
I think within 5 years the idea of a “mobile app” will begin to erode. Instead we need mobile websites with the same or greater functionality as their native, locally installed and processed predecessors.
Nathan
Thanks for the comment- and agree with all of your points on the need / opportunity to move beyond device (or store) centric approach! Spot on insights — and glad to have Hive45 on my radar! Good luck w/ business ventures!
Best- G