Multi-Source Intelligence

Challenge

The world is awash in data, from text documents such as news reports to images, videos, spreadsheets and databases. Somewhere in all of that data is the information needed to make critical decisions; the challenge is to identify that information and present it concisely enough to enhance understanding. Analyzing data in this way requires the ability to automatically identify key subjects, attributes and keywords and to understand the inter-subject, inter-data and subject-data relationships. To date, such analysis has been performed by individuals. While human analysts can identify subject and data relationships, the amount of data that must be processed is overwhelming. Achieving a complete understanding of multi-source intelligence information will require analytic software capable of processing the raw data to identify the underlying semantic content and relationships, with analysts' use of the results feeding back to improve inference.

Photo Courtesy of U.S. Army

Benefits of SIG’s Approach

SIG has developed an advanced statistical framework for the analysis of multi-source intelligence that captures the underlying subject, attribute and keyword information from a variety of structured and unstructured sources, can easily discern topics of importance in a large data corpus, and is capable of inferring relationships amongst identified subjects and the original source materials. Furthermore, in situations where subjects have known relationships, SIG’s techniques can provide a probabilistic assessment of the underlying attributes of those subjects. By organizing the material based on its interrelationships, SIG can provide key insights into the data, allowing analysts to go beyond simple keyword queries and achieve a true understanding of the available corpus of materials.

Products/Solution

SIG’s multi-source intelligence analysis suite of tools offers a combination of state-of-the-art natural language processing, search, and network modeling techniques that can be tailored to any application space. The suite includes a variety of statistical models in a flexible and modular architecture that can be optimized for any search or knowledge-understanding problem, from the analysis of text reports to the understanding of cyber-security threats.

Technology Summary

SIG’s multi-source intelligence analysis algorithms are rooted in proven natural language processing techniques and Bayesian statistical principles. By incorporating knowledge representations tailored to a specific application into the natural language processing (NLP) components, SIG’s technologies can identify the information that is important within a particular domain. By combining the NLP results with advanced statistical techniques, SIG’s technologies organize the materials based on their interconnections. This interconnected dataset can then be analyzed using techniques drawn from social network analysis in order to identify key relationships and interactions within the data. All model products are tied back to the original source materials, which allows semi-supervised machine learning techniques to improve performance with minimal overhead to the user.
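
As a rough illustration of the network-analysis step described above, the following Python sketch links subjects that appear in the same source document and then ranks them by degree centrality, a standard social network analysis measure. It is not SIG's implementation: the document IDs, subject names, and the helper functions build_cooccurrence_graph and degree_centrality are hypothetical, and the subject sets stand in for the output of an upstream NLP extraction step. Each link keeps the IDs of the documents that support it, mirroring the idea of tying results back to the original source materials.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical output of an upstream NLP step: document ID -> subjects found in it.
doc_subjects = {
    "report_1": {"Alice", "Bob"},
    "report_2": {"Bob", "Carol"},
    "report_3": {"Alice", "Bob", "Dave"},
}


def build_cooccurrence_graph(doc_subjects):
    """Map each pair of subjects to the set of documents that mention both."""
    edges = defaultdict(set)
    for doc_id, subjects in doc_subjects.items():
        # Sort so that each pair is always stored in the same order.
        for a, b in combinations(sorted(subjects), 2):
            edges[(a, b)].add(doc_id)
    return dict(edges)


def degree_centrality(edges):
    """Fraction of the other subjects that each subject is directly linked to."""
    neighbors = defaultdict(set)
    for a, b in edges:
        neighbors[a].add(b)
        neighbors[b].add(a)
    n = len(neighbors)
    if n < 2:
        return {subject: 0.0 for subject in neighbors}
    return {subject: len(linked) / (n - 1) for subject, linked in neighbors.items()}


edges = build_cooccurrence_graph(doc_subjects)
centrality = degree_centrality(edges)

# List subjects from most to least connected.
for subject, score in sorted(centrality.items(), key=lambda kv: (-kv[1], kv[0])):
    print(f"{subject}: {score:.2f}")

# Every link can be traced back to the documents that support it.
print(sorted(edges[("Alice", "Bob")]))
```

In this toy data, Bob is linked to every other subject and therefore ranks first. A production system would presumably weight links by statistical evidence rather than simple co-occurrence; the sketch only shows the shape of the computation.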