Digital Nirvana, a provider of leading-edge media monitoring and metadata generation services, has announced an upgrade to MetadataIQ, its SaaS-based tool that automatically generates speech-to-text and video intelligence metadata, increasing the efficiency of production, preproduction, and live content creation services for Avid PAM/MAM users. The new version makes beta-tested video intelligence capabilities commercially available and integrates directly with Avid MediaCentral.
MetadataIQ 4.0 relies on advanced machine learning and high-performance AI capabilities in the cloud (speech to text, facial recognition, object identification, content classification, etc.) to create highly accurate metadata more quickly and less expensively than traditional methods. Crucially, MetadataIQ not only automatically generates speech-to-text transcripts on incoming feeds (or on stored content) in real time, but then takes the transcript, parses it by time, and indexes it back to the media in the Avid environment.
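The parse-by-time-and-index step can be illustrated with a small sketch. Assuming, hypothetically, that the STT engine returns word-level timestamps, the function below groups words into fixed-length, time-coded segments that could then be indexed back against the source media. This is a minimal illustration of the general technique, not Digital Nirvana's actual implementation; all field names are invented.

```python
from collections import defaultdict

def segment_transcript(words, window=5.0):
    """Group word-level STT output (hypothetical schema: text/start/end,
    times in seconds) into time-coded segments of roughly `window` seconds,
    suitable for indexing back against the source media's timeline."""
    buckets = defaultdict(list)
    for w in words:
        # Bucket each word by which time window its start time falls into.
        buckets[int(w["start"] // window)].append(w)
    segments = []
    for key in sorted(buckets):
        group = buckets[key]
        segments.append({
            "start": group[0]["start"],          # segment in-point
            "end": group[-1]["end"],             # segment out-point
            "text": " ".join(w["text"] for w in group),
        })
    return segments

# Example: two utterances roughly six seconds apart land in two segments,
# each carrying the timecodes an editor would need to jump to the media.
words = [
    {"text": "breaking", "start": 0.2, "end": 0.6},
    {"text": "news", "start": 0.7, "end": 1.0},
    {"text": "election", "start": 6.1, "end": 6.8},
]
segments = segment_transcript(words)
```

Searching such segments for a term then yields in- and out-points rather than just a match, which is what lets an editor land directly on the relevant moment in the clip.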
Since Digital Nirvana introduced MetadataIQ about a year ago, the primary use case has been generating speech-to-text transcripts in real time as massive volumes of live streams are ingested, then sending those time-indexed transcripts into the Avid Interplay PAM system. Two major news organisations — one in the United States and another in the Middle East — have been testing these capabilities in their live news workflows, and Digital Nirvana says the results from real-time transcript metadata alone have transformed operations. The application’s ability to marry real-time transcript generation with real-time indexing in Avid means producers and editors can quickly find relevant media assets for their news stories, accelerating the entire production process.
In addition to sending metadata to on-premises Avid Interplay implementations, MetadataIQ 4.0 will integrate with Avid’s cloud-based MediaCentral hub, where editors access multiple Avid applications to do their work. Thanks to this cloud integration, editors will no longer be limited to searching one type of metadata at a time, as they have been in Avid Interplay; instead, they will be able to combine searches in MediaCentral across multiple forms of metadata. For example, if MetadataIQ generates metadata using OCR, facial recognition, and speech to text, MediaCentral will search all three types simultaneously when an editor enters search terms. This means editors will get more precise results even faster.
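Combined search of this kind can be sketched as an intersection over per-type metadata records. The record layout and field names below are hypothetical stand-ins, not MediaCentral's actual data model; the sketch only shows how a single query against several metadata tracks (STT, OCR, facial recognition) narrows results compared with searching one track alone.

```python
# Hypothetical metadata records: each maps an asset ID to terms
# detected by one analysis type ("stt", "ocr", or "face").
METADATA = [
    {"asset": "clip_001", "type": "stt",  "terms": {"election", "results"}},
    {"asset": "clip_001", "type": "ocr",  "terms": {"election", "2024"}},
    {"asset": "clip_001", "type": "face", "terms": {"anchor_a"}},
    {"asset": "clip_002", "type": "stt",  "terms": {"election", "debate"}},
    {"asset": "clip_002", "type": "face", "terms": {"anchor_b"}},
]

def combined_search(records, **criteria):
    """Return asset IDs matching every (metadata type, term) criterion,
    e.g. stt='election' AND face='anchor_a'."""
    hits = None
    for mtype, term in criteria.items():
        # Assets where this one metadata track contains the term.
        matching = {r["asset"] for r in records
                    if r["type"] == mtype and term in r["terms"]}
        # Intersect across criteria so every track must match.
        hits = matching if hits is None else hits & matching
    return sorted(hits or set())

# A speech-to-text search alone matches both clips; adding a facial
# recognition criterion narrows the result to a single clip.
stt_only = combined_search(METADATA, stt="election")
narrowed = combined_search(METADATA, stt="election", face="anchor_a")
```

The narrowing effect is the point: each additional metadata track acts as a filter, which is why a combined query returns a handful of precise results where a transcript-only search returns dozens.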
According to Russell Wise, senior vice president of sales and marketing at Digital Nirvana, “Combined search makes the entire video machine-readable, not just the words. An STT search might yield 50 results, which still makes for significant time savings when you’ve got hundreds of hours of video to search through. But with combined search, you could narrow it down to perhaps only four or five results. These new developments will allow producers and editors to pinpoint the right clips and create content even faster, which is especially crucial when it comes to news, sports, and other time-sensitive broadcast applications.”