
Processing Data In Motion using StreamInsight

In today's world, information is key. In an IT-enabled world we collect large amounts of data from many sources, but using that data effectively is always a challenge. Domains such as finance, sales, security, retail and energy generate and collect large amounts of data from various sources for their day-to-day IT operations and management. For example, in our everyday use of e-services we perform many activities and transactions; e-banking is one such case. Do we ever realize how much data is collected during an online transaction to ensure its safety? When every online activity is monitored, extracting meaning from billions of activities is a big challenge. When billions of transactions happen per day, how do we find out which of them is suspicious? As the world becomes more IT-enabled, IT security becomes extremely important in everyone's life. The challenge that IT-enabled businesses face today is not collecting data but processing huge volumes of data and transactions in near real time and picking out what we need to know, and this becomes even more critical when it relates to security. Here, I am going to talk about how a key technology in the BI space called "Complex Event Processing" can be leveraged to address some of the challenges in the IT security space. The key to success is to process the data in motion, as it flows, and generate meaning out of it...

Challenges in Security Domain
  • Growing Information Technology and Information Security spaces have
    • Critical applications and systems
    • Simultaneous and complex operations
    • Sensors everywhere and real-time collection of security intelligence
    • Massive data volume
    • Need to ensure reliability and quicker turnaround
    • New paradigm is the move to data-driven decisions
  • High security event data rates, continuous queries, and millisecond latency requirements that make it impractical to persist the data in a relational database for processing.

Major challenges are -

How to manage all this data and monitor critical applications to rapidly diagnose problems and maximize uptime?

How to analyze data in motion to gain insight and lower risk? 

How to ensure that the right data is available in the right place at the right time enabling the best decisions?

How to manage large, simultaneous streams of data and analyze them in real time to reduce decision time and increase operational efficiency?

Background
  • Millions/Billions of Security Events are generated every day
  • In the Security Intelligence space, data needs to be processed in near real time to make sense of the huge volume
  • Faster processing to improve business decisions
  • Processing data in motion, rather than after storage, to deliver real-time analytics
Available Technology
  • Many BI vendors offer technology based on the Complex Event Processing concept; one of them is Microsoft StreamInsight
  • Microsoft StreamInsight is a powerful, cost-effective Complex Event Processing platform
  • StreamInsight™ provides a high-throughput stream processing architecture that processes thousands of events per second
  • StreamInsight™ speeds the transformation of data to actionable business information.
  • It accomplishes this by reducing the latency associated with the traditional BI approach of acquisition, filtering/compression, indexing, storing and then analyzing.
My Solution
  • StreamInsight™ based solution for huge data-in-motion processing in Security Intelligence space
  • Unified solution for data-in-motion as well as post-storage to deliver real-time-analytics
  • Processing huge data-in-motion regardless of data source, relationship and destination
  • Open and independent development platform based upon an event-driven architecture where data from multiple, heterogeneous sources can be intelligently analyzed in real time. 

Impact
  • Continuous and incremental processing of never-ending sequences of events
  • Lightweight streaming architecture that supports highly parallel execution of continuous queries over high-speed data.
  • In-memory caches and incremental result computation provide excellent performance with high throughput and low latency.
    • Latency is low because events are processed without costly data storage in the critical processing path.
  • All processing is automatically triggered by incoming events
  • Historical data can be accessed and included in the low-latency analysis
  • Extensive set of Input and Output Adapters
  • Unified Data Processing architecture for both data-in-motion and data-in-storage
    • Data-in-motion Processing Model for
      • Real-time operational dashboard and analytics
      • Predictive analysis of the future security posture based on current events
      • Generating and delivering Real-time Alerts and Notifications
      • Push based data delivery using StreamInsight Output Adapters
    • Data-in-storage Processing Model
      • Based on OLAP model and has different cubes for current and historical data analysis
      • Analytics Dashboard that doesn’t need real time data
      • For what-if-analysis and data mining on historical data
      • For Static Reports
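
To make "continuous and incremental processing" concrete, here is a minimal sketch in plain Python. It is not StreamInsight's API; it only illustrates the idea of keeping a small in-memory sliding window per source and updating the result as each event arrives, with no storage step in the processing path. The event fields, the "login_failed" kind, the 60-second window and the threshold of 3 are all assumptions made up for this illustration, and events are assumed to arrive in timestamp order.

```python
from collections import defaultdict, deque
from dataclasses import dataclass
from typing import Deque, Dict, Iterable, Iterator, Tuple


@dataclass(frozen=True)
class SecurityEvent:
    timestamp: float  # seconds, assumed to be non-decreasing across the stream
    source: str       # hypothetical field: an account name or IP address
    kind: str         # hypothetical field: e.g. "login_failed"


def detect_bursts(
    events: Iterable[SecurityEvent],
    window_seconds: float = 60.0,
    threshold: int = 3,
    kind: str = "login_failed",
) -> Iterator[Tuple[float, str, int]]:
    """Yield (timestamp, source, count) each time a source has at least
    `threshold` matching events inside the sliding window."""
    # One small in-memory queue of timestamps per source; nothing is persisted.
    recent: Dict[str, Deque[float]] = defaultdict(deque)
    for event in events:
        if event.kind != kind:
            continue  # filter step: ignore events we are not watching
        window = recent[event.source]
        window.append(event.timestamp)
        # Incremental update: drop only the timestamps that just expired.
        while window and window[0] <= event.timestamp - window_seconds:
            window.popleft()
        if len(window) >= threshold:
            yield (event.timestamp, event.source, len(window))
```

Because `detect_bursts` is a generator, an alert is produced as soon as the triggering event is read, which mirrors the "processing is triggered by incoming events" point above. A real deployment would express the same filter and window as a continuous query in the CEP engine rather than hand-written loops.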
Benefits
  • In the Information Security space, there is a need to process and analyze huge volumes of security events for quicker business decisions. The proposed solution provides a unified architecture to process both data-in-motion and data-in-storage.
  • The proposed solution will help to
    • Process large volumes of security events across multiple data streams while the data is in motion
    • Gain insights from critical information in near real time by monitoring, analyzing, and acting on data in motion
    • Reduce storage cost
    • Keep deployment and development costs low by utilizing existing technology and skill sets.
    • Gain continuous insight through historical data mining.
    • Make quicker business decisions
    • Improve time-to-market
    • Embedded Options and Regional Hubs
      • Embedded-options to pre-process on the edge (for example, sensors and other devices).
      • Regional hubs that provide local processing of event streams from embedded engines for aggregation and correlation.
    • Run complex analytics and mine insights with centralized processing.
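
The edge and regional hub split can be sketched roughly as follows. This is an illustration in plain Python under my own assumptions, not StreamInsight code: the `(source, kind)` event shape, the function names and the "seen by at least two edge nodes" rule are all hypothetical. The point is only that the edge filters and summarizes locally, so the hub correlates small summaries instead of raw events.

```python
from collections import Counter
from typing import Dict, FrozenSet, Iterable, List, Tuple

Event = Tuple[str, str]  # (source, kind): a hypothetical event shape


def edge_summarize(
    events: Iterable[Event],
    interesting: FrozenSet[str] = frozenset({"login_failed"}),
) -> Counter:
    """Edge step: drop uninteresting events and count the rest per source."""
    return Counter(source for source, kind in events if kind in interesting)


def hub_correlate(summaries: Dict[str, Counter], min_edges: int = 2) -> List[str]:
    """Hub step: return the sources reported by at least `min_edges` edge nodes."""
    seen_by: Counter = Counter()
    for summary in summaries.values():
        # Each edge node counts once per source, however many events it saw.
        seen_by.update(summary.keys())
    return sorted(source for source, n in seen_by.items() if n >= min_edges)
```

Each edge node would send only its `Counter` to the hub, and the hub's output could in turn feed the centralized analytics.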

