Data, Data Everywhere: How Machine Learning Tools Capture the Most Valuable Insights from Your Massive Data Collection

Posted on Thu, Sep 28, 2017 @ 1:19 pm

Big Data. Data Lakes. Analytics. Machine Learning.  Are these tools only meant for tech giants like Google or Facebook? Or, can these tools meaningfully assist any business and help it respond to the ever-changing needs of the market? As the software industry continues to move toward consuming and delivering cloud-based solutions, many companies now realize they sit on a gold mine of data. Although many understand the value of data, few understand how to unlock that value.

Every time one of your customers clicks on a link, completes a transaction, or views a page, that activity can be logged and tracked. Some companies realize the value of this data and set up a data lake to store it. To put it simply, a data lake is a repository that holds huge amounts of raw transactional data. Data lakes often get confused with data warehouses because they are similar, but data lakes are much more versatile and can grow much larger than data warehouses. With SaaS or connected on-premises solutions, data can be exported on a regular basis from the solution to a data lake, where the data is normalized. This enables you to apply analytics at a later date. If you need help choosing the right data lake, we suggest Hortonworks Data Platform. (Full disclosure: OFS partners with Hortonworks.)

Data lakes collect endless amounts of raw transactional data, but it is just that: raw data. Without applying analytics to that data, obtaining any useful information or action items from it is very difficult.  Everyone knows what analytics is, but most only scratch the surface of what analytics can do for their business. For example, how many people look at a data lake and ask, “How many people logged in?” or “How many people clicked on a link after we deployed this new product/feature?” or “How many widgets were purchased after this marketing campaign?” It’s good to have this information when answering one-off tactical questions, but it does not tell a story about how your business is doing. It does not provide insights into your customers’ and users’ activities and trends.
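
To make the difference concrete, here is a minimal Python sketch that contrasts a one-off count with a simple week-over-week view of the same events. The login records and field layout are invented for illustration; a real data lake would be queried through its own tooling.

```python
from collections import Counter
from datetime import date

# Hypothetical login events pulled from a data lake: (user_id, login_date).
logins = [
    ("u1", date(2017, 9, 4)), ("u2", date(2017, 9, 5)), ("u3", date(2017, 9, 6)),
    ("u1", date(2017, 9, 12)), ("u2", date(2017, 9, 13)),
    ("u1", date(2017, 9, 19)),
]

# One-off tactical answer: "How many people logged in?"
total_logins = len(logins)

# A first step toward a story: logins grouped by ISO week number,
# then the change from each week to the next.
weekly = Counter(day.isocalendar()[1] for _, day in logins)
weeks = sorted(weekly)
changes = [weekly[later] - weekly[earlier] for earlier, later in zip(weeks, weeks[1:])]

print("Total logins:", total_logins)
print("Logins per week:", dict(sorted(weekly.items())))
print("Week-over-week change:", changes)
```

The total alone looks healthy, while the weekly breakdown shows logins falling every week, which is the kind of trend a single count hides.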

What makes things more difficult is analyzing data from disparate systems. Think about a typical e-commerce shop. It has multiple, different backend systems all connected to create a complete solution:

  • Customer-facing web portals
  • Shopping carts
  • Billing systems
  • ERP/OMS/WMS/other order and fulfillment systems
  • Shipping systems
  • Sales/contact management systems

Each one of these systems comes from a different provider and has its own database, methods, naming conventions, and APIs. The data lake ingests all data from each system, then uses the data for various analytics.
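
As a rough illustration of why differing naming conventions matter, the Python sketch below maps records from two of these systems onto one shared schema. The system names, field names, and sample values are all hypothetical; real systems would need their own mappings and far more validation.

```python
# Hypothetical field mappings: each source system names the same concepts differently.
FIELD_MAPS = {
    "shopping_cart": {"cust_id": "customer_id", "ts": "timestamp", "total": "amount"},
    "billing": {"CustomerNumber": "customer_id", "InvoiceDate": "timestamp", "AmountDue": "amount"},
}


def normalize(source, record):
    """Rename a raw record's fields to the shared schema and tag where it came from."""
    mapping = FIELD_MAPS[source]
    normalized = {mapping[key]: value for key, value in record.items() if key in mapping}
    normalized["source"] = source
    return normalized


# Invented sample records as they might arrive from each system.
raw_events = [
    ("shopping_cart", {"cust_id": "C-17", "ts": "2017-09-01T10:00:00", "total": 42.5}),
    ("billing", {"CustomerNumber": "C-17", "InvoiceDate": "2017-09-02T08:30:00", "AmountDue": 42.5}),
]

normalized_events = [normalize(source, record) for source, record in raw_events]
for event in normalized_events:
    print(event)
```

Once both records share the same field names, the same customer can be followed from the shopping cart to the invoice, which is what cross-system analytics depends on.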

All too often, the analytics applied to a data lake do not realize the true potential of the data, or they provide information that is just plain inaccurate. How many have seen this scenario? Data scientists look at pre-collected data chosen arbitrarily by an engineer, and then they show that data with charts or graphs on a web page. How many dashboards contain the same data: new accounts this month, number of logins, and number of widgets purchased? Typically, this type of data is used by a single person or team to prove or disprove an existing theory. These solutions provide no conclusions, next steps, or trends to watch out for, and most importantly, they reveal little to no insight into overall business trends.

The industry is moving in another direction and putting a new layer on top of an analytics engine: machine learning (ML). Data scientists now use artificial intelligence engines that meticulously sift through each raw transaction in a data lake to look for crucial trends that lie either at the surface or deep within the data. These engines are capable of gathering, comparing, and analyzing data from multiple, completely different sources, such as the ones mentioned above. These machine learning engines consume that data and ultimately deliver suggestions, theories, and trends that provide answers to significant questions you didn’t even think of asking. Using artificial intelligence (AI) algorithms, an ML engine can even alter its analysis and conclusions based on the changing trends it sees within a data set.
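
The idea of letting software surface what nobody asked about can be hinted at with a very small example. The Python sketch below uses a plain z-score check, which is a simple statistical stand-in and not a machine learning engine, to flag days that deviate sharply from the rest. The order counts and the threshold of 2.0 are assumptions chosen for illustration.

```python
from statistics import mean, pstdev

# Invented daily order counts; in practice these would come from the data lake.
daily_orders = [100, 102, 98, 101, 99, 103, 160, 100]

# Assumed cutoff: flag any day more than 2 standard deviations from the mean.
THRESHOLD = 2.0

average = mean(daily_orders)
spread = pstdev(daily_orders)

# Collect the positions of days whose z-score exceeds the cutoff.
anomalies = [
    index
    for index, count in enumerate(daily_orders)
    if spread > 0 and abs(count - average) / spread > THRESHOLD
]

for index in anomalies:
    print(f"Day {index}: {daily_orders[index]} orders looks unusual")
```

Nobody had to ask "what happened on day 6?" for the spike to surface. An ML engine extends the same principle across many signals and sources at once and adjusts as the data changes.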

We all think about sales and marketing organizations applying ML to their data to ensure they pitch the right products at the right time to the right consumer. However, many industries now use machine learning and AI to solve very specific problems. The financial services industry applies ML to prevent fraud and reduce expenses related to it. Investment and stock brokerage firms also use AI to suggest stock trends and offer insights to traders about when to enter and exit certain holdings. The healthcare industry has seen an explosion of data collection from wearable devices and is using ML to provide accurate, more targeted healthcare services to specific individuals. Even the oil and gas industry is applying ML and AI to data sets collected from mineral analysis to predict refinery failures or service degradations before they happen.

Ultimately, each industry and business strives to accomplish two things: Identify profitable opportunities for growth, and reduce or avoid risk. Humans ask big data teams to display data they “feel” is important to make educated business decisions.  However, this leads to missing significant trends and insights living deep within the data, and even glaring trends staring data analysts right in the face.  By implementing machine learning on top of data lakes and existing analytics engines, you can gain insights from transactional data to help your business grow.

Interested in seeing machine learning and data analytics technologies in action? Check out ObjectFrontier’s iHealth application demo from our Analytics Innovation Lab. iHealth is powered by data analytics and machine learning to provide real-time insights that create better opportunities to treat patients experiencing abnormal heart rates during physical activity.

About the Author

Bob Kramich leads all our US sales efforts and is responsible for directing all business development activities for OFS globally. Bob has more than 25 years of experience in creating and delivering high-value software engineering relationships with US and international companies. Prior to joining OFS, Bob served as vice president of Business Development, Life Sciences, for EPAM Systems, Inc. (NYSE:EPAM), a leading global provider of software product development services. Bob joined EPAM through EPAM’s acquisition of GGA Software Services, the world’s preeminent provider of scientific informatics services to global biotech and pharmaceutical companies. Bob served as GGA’s chief business development officer. Bob holds a Bachelor of Arts degree from Tufts University and a Master of Business Administration degree from Boston College’s Carroll School of Management.
