Artificial Intelligence has long been the stuff of dystopian visions, from Blade Runner to Terminator, so even opening this article with the phrase ‘AI has begun to enter mainstream consciousness’ carries loaded connotations. Yet the technology has emerged from theory into the sunlight, impacting everything from robotic bank tellers to self-driving cars. Media is no exception.
Data, specifically metadata, has been the currency of media organisations for some time; in essence, AI takes this to another level. Recent extraordinary advances have been possible thanks to technical and intellectual breakthroughs that allow the development of very large artificial neural networks (ANNs – computational models inspired by how the brain works), coupled with the availability of huge quantities of data to train them.
To illustrate the scale of the progress, the performance of object recognition algorithms on the benchmark ImageNet database went from an error rate of 28% in 2010 to less than 3% in 2016, lower than the human error rate on the same data.
Equity funding of AI-focused start-ups reached a record high in the second quarter of 2016 of more than $1 billion, according to researcher CB Insights.
Most R&D is being driven by computing and web giants Microsoft, Google, Facebook and Amazon, which are best positioned to hoover up consumer data on everything from buying habits to exercise regimens. Banks of their machines can be fed vast amounts of multimedia for processing and organising by algorithms for object, voice and facial recognition, emotion detection, speech to text, or any programme we want to throw at them.
Google CEO Sundar Pichai has said the company’s shift to AI is as fundamental “as the invention of the web or the smartphone.” He went so far as to suggest that we are evolving from a mobile-first to an AI-first world.
In reality, most AI apps are productised machine learning (ML) applications, for which the term AI is misleading. ML can be understood as ‘learning machines’, as distinguished from AI, or ‘machines that think’. AI is a branch of computer science attempting to build machines capable of intelligent behaviour, while Stanford University defines ML as “the science of getting computers to act without being explicitly programmed”.
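That Stanford definition can be made concrete with a toy sketch. Rather than hand-writing rules to classify viewing sessions, a minimal nearest-centroid model infers them from examples; all feature names and data below are invented for illustration.

```python
# Minimal nearest-centroid classifier: the labelling rule is learned
# from example data, never written out explicitly.

def centroid(points):
    """Mean of a list of equal-length feature vectors."""
    n = len(points)
    return [sum(p[i] for p in points) / n for i in range(len(points[0]))]

def train(examples):
    """examples: dict mapping label -> list of feature vectors."""
    return {label: centroid(vecs) for label, vecs in examples.items()}

def predict(model, vec):
    """Return the label whose centroid is closest (squared distance)."""
    def dist(c):
        return sum((a - b) ** 2 for a, b in zip(vec, c))
    return min(model, key=lambda label: dist(model[label]))

# Illustrative features: [minutes watched, pause count]
model = train({
    "sport": [[90, 1], [85, 2], [95, 0]],
    "news":  [[10, 5], [12, 6], [8, 4]],
})
print(predict(model, [88, 1]))  # a long, rarely paused session
```

Nothing in `predict` mentions sport or news explicitly: the decision boundary comes entirely from the training data, which is the distinction the Stanford definition draws.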
You need robust ML before you get to AI, of course, and currently there are few true mainstream AI applications outside of autonomous cars.
IBM prefers to talk about augmented intelligence. “It’s an approach which asks how AI supports decision making and demands a societal change in how we look at technology,” says Carrie Lomas, IBM’s cognitive solutions and IoT executive. “Through personal devices like tablets to all manner of items with sensors, the industry as a whole is taking in lots of data and combining it with different types of information to enable a genuinely new understanding of the world.”
IBM’s cognitive computer system Watson is a set of APIs or building blocks which can be combined for different software applications by third parties.
For example, IBM has combined its AlchemyLanguage APIs with a speech-to-text platform to create a tool for video owners to analyse video – forming IBM Cloud Video. It is able to scan news and social media in real time to understand how people are talking about a company, which topics matter and how people feel about them.
Some 75% of Netflix’s usage is driven by recommended content, which was itself developed with data – reducing the risk of producing content that people won’t watch and proposing content that consumers are eager for. This ground-breaking use of big data and basic cognitive science in the content industry has shown others its potential.
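Netflix’s actual algorithms are proprietary, but the underlying idea can be sketched as a toy item-based collaborative filter: recommend the unseen titles whose rating pattern across users most resembles a viewer’s favourite. The titles and ratings below are invented.

```python
from math import sqrt

# Toy user -> {title: rating} matrix (invented data).
ratings = {
    "ann":  {"Drama A": 5, "Drama B": 4, "Comedy A": 1},
    "bob":  {"Drama A": 4, "Drama B": 5},
    "cara": {"Comedy A": 5, "Comedy B": 4, "Drama A": 1},
    "dan":  {"Comedy A": 5},
}

def item_vector(title):
    """A title's ratings indexed by user (0 when unrated)."""
    return [ratings[u].get(title, 0) for u in sorted(ratings)]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na, nb = sqrt(sum(x * x for x in a)), sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def recommend(user, top_n=1):
    """Rank unseen titles by similarity to the user's favourite title."""
    seen = ratings[user]
    favourite = max(seen, key=seen.get)
    titles = {t for r in ratings.values() for t in r}
    scored = sorted(titles - set(seen),
                    key=lambda t: cosine(item_vector(favourite),
                                         item_vector(t)),
                    reverse=True)
    return scored[:top_n]
```

With only four users the similarities are crude, but `recommend("dan")` already surfaces the other comedy, because comedies are rated by the same people; at catalogue scale the same pattern is what steers viewers toward content they are likely to watch.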
“The world’s biggest content owners are going direct to consumers,” says Nagra senior director product marketing Simon Trudelle. “With a growing stock of videos available, just relying on manually managed catalogues or curated lists to create TV or SVOD services has already started reaching its limits.”
The use of AI relies heavily on massive volumes of unstructured data – and a lot more has become available now that video-enabled consumer devices are connected. Capturing and managing TV/video platform data so it can be exploited by advanced predictive algorithms is becoming a key focus for the media industry.
Voice assistants such as Amazon’s Echo and Google Home record user voices in order to function, a logical extension of which is to have cameras on smart TVs and STBs relay information back to the operator about who is watching to improve individual profiling, content serving, ad targeting and automated product insertion.
This may appear more intrusive than the way in which Google or Amazon appropriate data from web searches, for example, and it opens up a debate about how much data consumers are willing to part with for perceived benefits or service discounts.
According to Bloomberg, Amazon, Google and Microsoft are aggregating voice queries from each system’s user base to educate their respective AIs about dialects and natural speech patterns.
As if to circumvent criticism, Amazon, Facebook, Google, IBM and Microsoft have formed the non-profit Partnership on AI to advance public understanding of the subject and conduct research on ethics and best practices.
“The advent of cloud-based apps and APIs means 2017 will be about personalisation,” says IBM’s Lomas. “It’s not just about knowing age and gender but knowing a consumer’s emotional response to products marketed to them. Cognitive computing enables media and brands to personalise their approach in a frictionless way.”
End-users will benefit from the increasing role of AI, in particular in interacting with the media. According to Pietro Berkes, principal data scientist, Kudelski Group, “Virtual assistants will understand their preferences and respond to vocal command, facilitating content discovery from multiple sources. As traditional media becomes increasingly connected, AI will enable content providers to interact with end-users. AI assistants will help consumers select personalised camera angles for sport events and they will deliver automatic summaries of latest news and missed TV shows.”
Adoption is bound to grow as all media experiences become fully connected and new products are developed to provide more convenience, relevance and satisfaction to the user experience.
At Kudelski, ML algorithms are being used to assist human decisions in all its core businesses including helping operators understand the behaviour of subscribers, predict churn and optimise their catalogue.
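Kudelski’s models are not public, but churn prediction of the kind described is commonly framed as scoring each subscriber’s risk from behavioural features. A minimal sketch, with invented features and hand-set weights standing in for what a real system would learn from historical data:

```python
from math import exp

# Illustrative churn scoring: a logistic model over invented subscriber
# features. A real deployment would learn these weights from history.
WEIGHTS = {
    "days_since_last_view": 0.08,   # long absence raises churn risk
    "support_tickets":      0.40,
    "titles_watched_month": -0.05,  # engagement lowers risk
}
BIAS = -1.5

def churn_probability(subscriber):
    """Map a weighted feature sum through the logistic function."""
    z = BIAS + sum(WEIGHTS[k] * subscriber.get(k, 0) for k in WEIGHTS)
    return 1.0 / (1.0 + exp(-z))

engaged = {"days_since_last_view": 1, "support_tickets": 0,
           "titles_watched_month": 20}
lapsed = {"days_since_last_view": 30, "support_tickets": 3,
          "titles_watched_month": 0}
```

An operator would rank subscribers by this probability and target the highest-risk ones with retention offers – the “assist human decisions” role the article describes, rather than a fully automated one.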
Its security division uses ML methods for “privacy-preserving user behaviour modelling and intrusion detection.” ML is also applied to help infrastructure operators better manage peak traffic situations and to detect and prevent fraud in deployed systems.
New ANN techniques are being developed to beat traditional encoding and decoding algorithms for video. “They will allow the transmission of high-quality media content even in regions with low internet and mobile bandwidth,” says Berkes. “ANNs are being used not only to build better compression methods but also to artificially clean up and increase the resolution of transmitted images (known as ‘super-resolution’).” Magic Pony Technology, acquired by Twitter last June for $150m, is able to reconstruct HD video from a low-definition, compressed stream, for example.
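The essence of super-resolution is learning, from example content, to predict the detail that downsampling discarded. Real systems do this with deep ANNs over image patches; the toy one-dimensional sketch below (not any company’s method) fits a single weight for reconstructing dropped samples and shows the train-then-upscale shape of the pipeline.

```python
# Toy 1-D 'super-resolution': learn from hi-res examples how to predict
# the samples that downsampling removed, then use that to upscale.

def downsample(signal):
    """Keep every other sample."""
    return signal[::2]

def train_weight(hi_res_signals):
    """Least-squares fit of one weight w so each dropped midpoint is
    approximated by w * (left neighbour + right neighbour) / 2."""
    num = den = 0.0
    for hi in hi_res_signals:
        lo = downsample(hi)
        for i in range(len(lo) - 1):
            avg = (lo[i] + lo[i + 1]) / 2
            mid = hi[2 * i + 1]          # the sample downsampling removed
            num += mid * avg
            den += avg * avg
    return num / den

def upscale(lo, w):
    """Rebuild a hi-res signal by inserting predicted midpoints."""
    out = []
    for i in range(len(lo) - 1):
        out += [lo[i], w * (lo[i] + lo[i + 1]) / 2]
    out.append(lo[-1])
    return out

w = train_weight([[0, 1, 2, 3, 4, 5, 6, 7, 8]])
```

A neural super-resolver replaces the single weight with millions of learned parameters and operates on 2-D patches, but the contract is the same: low-resolution input in, plausibly detailed high-resolution output back.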
Associated Press uses an automated algorithm to cover earnings reports for thousands of companies, while Yahoo Sports creates personalised articles for Fantasy Football fans. Both use the services of Automated Insights.
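Automated Insights’ platform is proprietary, but at its core this kind of coverage is data-to-text generation: structured figures slotted into templates whose wording varies with the numbers. A minimal sketch, with an invented company and field names:

```python
# Template-driven report generation, the core idea behind automated
# earnings coverage (a simplified sketch; all field names are invented).

def earnings_sentence(company, eps, eps_forecast, revenue_m):
    """One generated sentence; the verb choice depends on the data."""
    verb = "beat" if eps > eps_forecast else "missed"
    return (f"{company} {verb} analyst expectations, reporting earnings "
            f"of ${eps:.2f} per share against a forecast of "
            f"${eps_forecast:.2f}, on revenue of ${revenue_m:.0f}m.")

print(earnings_sentence("ExampleCorp", 1.12, 1.05, 830))
```

Scaled up with many templates, synonym pools and conditional story structures, the same mechanism turns thousands of quarterly filings – or fantasy-league box scores – into readable copy with no writer in the loop.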
What else might AI do? As an aid to accessibility AI can automate description of photos and movie scenes for the blind. Facebook, Microsoft and Google have this in the works. Automatic subtitles for the hearing-impaired can be derived from speech recognition and lip reading.
Video from observational documentary shoots regularly achieves shooting ratios of 100:1, swamping editorial. Auto-assembly and even auto-edit packages like Antix and Magisto are available today to package and polish GoPro and mobile phone footage, though instances of their use in professional content creation are rare.
A documentary assembled by the Lumberjack AI system is hoped to be presented to the SMPTE-backed Hollywood Professional Association (HPA) by 2018; the system has already helped create Danish channel STV’s 69 ten-minute episodes of the semi-scripted kids series Klassen.
Other examples of Watson being used to inspire human creativity include Grammy-winning music producer Alex da Kid, who used Watson to inspire his break-out song, ‘Not Easy’.
The trailer for the Fox film Morgan was assembled using Watson.
A number of recent developments in ML research will allow picture and movie content editing with Photoshop-like tools that edit conceptual elements of an image instead of individual pixels. One such example is a Neural Photo Editing tool, designed to directly edit facial features.
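The mechanism behind such tools is editing in a model’s latent space: an image is represented as a compact vector, and moving that vector along a learned semantic direction changes a concept (say, a smile) rather than individual pixels. The sketch below is purely illustrative – the vectors and the ‘smile’ direction are invented, where a real neural photo editor would obtain them from a trained generative model.

```python
# Conceptual editing in a latent space (illustrative only: a real tool
# would encode the photo into the latent vector with a trained network
# and decode the shifted vector back into pixels).

def move_along(latent, direction, strength):
    """Shift a latent code along a semantic direction."""
    return [z + strength * d for z, d in zip(latent, direction)]

# Invented 4-dimensional latent code for a face, and a direction a
# model might have learned to correspond to 'more smile'.
face = [0.2, -1.1, 0.7, 0.3]
smile_direction = [0.0, 0.9, -0.4, 0.1]

more_smile = move_along(face, smile_direction, 1.5)
less_smile = move_along(face, smile_direction, -1.5)
```

Because a single direction moves many pixels coherently, the user edits “how much smile” with one slider instead of retouching the mouth by hand – which is what distinguishes these tools from pixel-level Photoshop work.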
While automating previously manual processes will inevitably lead to the loss of some human roles, other roles will open up. Since ML systems need very large amounts of high-quality data to achieve optimal performance, data collection and curation require substantial organisational effort. “The global shortage of ML experts represents one of the most important difficulties for companies wanting to enter the AI market,” reckons Berkes.
Realistically, it may still take several years before new AI APIs become widely available and adopted by the traditional content creation and distribution value chain. “It’s really a new mindset that players need to have,” suggests Trudelle. “It’s one which asks ‘What if there were a cloud AI API doing this?’”