Data Analytics
Unlock the value of your data

With data analytics you gain important insights - for example, how to support your customers better, reduce costs, or optimize processes and production.

There are plenty of opportunities, from improving your reporting with simple analyses to building complex simulations of future scenarios. This also means that data analytics doesn't need to be costly or complex, nor is it necessarily tied to big data.

We support you with retrospective analyses of the past as well as with the modelling of forward-looking predictive or prescriptive approaches.

 

We support you through all the steps of data analytics

Data integration and preparation, visualisation, testing and support: get to know the important elements of data analytics and learn how we support you in each step.

Suppliers and business partners, social media, weather data, customer questionnaires: combining your internal data with external data on a central data analytics platform has many advantages. You can identify correlations, create scenarios and come to new conclusions. What you need: smart data integration - especially for real-time analytics.

Together with you, we identify the suitable solution - which is not necessarily complex and expensive. Often we can already achieve good results by analysing only your databases with simple means.

Our tools will be aligned to your infrastructure. We work with tools like Oracle Data Integrator, Microsoft SQL Server Integration Services (SSIS), integration tools of the Cloudera Hadoop ecosystem like Spark Streaming, and the vendor-independent integration tool Talend.
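
To give an idea of what such an integration step can look like in code - a minimal sketch in Python with pandas rather than one of the tools above, and with hypothetical file and column names - internal orders could be joined with external weather data like this:

    import pandas as pd

    # Internal order data and external weather data (hypothetical file names).
    orders = pd.read_csv("orders.csv", parse_dates=["order_date"])
    weather = pd.read_csv("weather.csv", parse_dates=["date"])

    # Join both sources on the date to correlate sales with weather conditions.
    combined = orders.merge(weather, left_on="order_date", right_on="date", how="left")

    # A first, simple insight: average order value per weather condition.
    print(combined.groupby("condition")["order_value"].mean())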

The next important step after data integration is data preparation and data quality management. As with all systems, it is true for data analytics too: the better the quality of the underlying data, the better the results and insights of an analysis.

During data preparation, metadata is also added to ensure transparency - which is important to comply with data privacy regulations and to validate data integrity.
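
A minimal sketch of such a preparation step in Python; the input file, columns and metadata fields are hypothetical:

    import pandas as pd

    df = pd.read_csv("customers.csv")  # hypothetical input

    # Typical preparation: remove duplicates, normalise formats, fill gaps.
    df = df.drop_duplicates(subset="customer_id")
    df["email"] = df["email"].str.strip().str.lower()
    df["age"] = df["age"].fillna(df["age"].median())

    # Add metadata for transparency and traceability.
    df["source_system"] = "crm_export"
    df["prepared_at"] = pd.Timestamp.now()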

Together, we will evaluate what data preparation is suitable for your use case. 

 

Together, we define queries and analyses. We develop and test the query logic and the algorithms iteratively. At this step the input of your experts is important: business context, corporate experience and special characteristics of your industry are evaluated together. The contribution of your experts also ensures the acceptance of the data analytics solution in operation later on.

Visualising the results of your data analysis is often as important as the analysis algorithm itself. Cockpits or dashboards are often used to visualise and interpret the data. Some analyses even rely on the visualisation of the data.

We support you in identifying the best-fitting standard solution for the visualisation of your data and analyses - e.g. Tableau or Microsoft Power BI.

Or we develop a customised cockpit tailored to your needs. There is a wide variety of open libraries and frameworks for languages like R, Java or Python - Shiny for R, for example. We design the dashboard based on your requirements, also as a mobile app.
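
As a small illustration of a custom cockpit - not one of the products named above, just a Python sketch with matplotlib and hypothetical sales data:

    import matplotlib.pyplot as plt
    import pandas as pd

    sales = pd.read_csv("sales.csv", parse_dates=["month"])  # hypothetical data

    # A simple two-panel cockpit: revenue trend and revenue per region.
    fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))
    ax1.plot(sales["month"], sales["revenue"])
    ax1.set_title("Revenue trend")
    sales.groupby("region")["revenue"].sum().plot.bar(ax=ax2)
    ax2.set_title("Revenue per region")
    fig.tight_layout()
    plt.show()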

The testing and trial phase is the central part of every data analytics project. Usually a concept goes through several iterations of the phases "test", "measure" and "improve".

Your assumptions, algorithms and scenarios are evaluated and tested, and the results repeatedly verified to adjust and optimize the models.
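
A minimal sketch of this "test, measure, improve" loop with scikit-learn and generated stand-in data: several model variants are measured via cross-validation and the best validated configuration is kept.

    from sklearn.datasets import make_classification
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import cross_val_score

    # Generated stand-in data; in a real project this is your prepared data set.
    X, y = make_classification(n_samples=1000, random_state=42)

    # "Test, measure, improve": evaluate model variants and keep the best one.
    best = None
    for depth in (2, 5, 10):
        model = RandomForestClassifier(max_depth=depth, random_state=42)
        score = cross_val_score(model, X, y, cv=5).mean()
        print(f"max_depth={depth}: mean accuracy {score:.3f}")
        if best is None or score > best[0]:
            best = (score, depth)

    print("best configuration:", best)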

Only after achieving very good results does the solution go live.

Once the quality is assured and the results are verified, the data analytics solution can go live in production. Good preparation and a structured handover to IT operations are as important as transparent documentation.

Our recommendation for data analytics projects: "Think big, start small and learn fast."

Start with simple use cases and expand quickly through agile development cycles. Too much complexity at the beginning brings the risk of delays and missing out on quick results.

Conditions and the environment are changing constantly. This means your data analytics solution must be evaluated regularly and adjusted on demand. For example, continuous improvement of your analyses through learning algorithms and regular adjustment is very important.

Therefore, we support you not only in the development phase but also during operation. We can maintain your data analytics solution and verify and optimize it continuously. This ensures that your data analytics solution delivers value sustainably - especially in complex use cases.

Especially with big data, an extended full-text search can already be considered a form of data analytics. A great candidate is, for example, the Cloudera Search tool: it is based on Apache Solr and enables scalable and reliable queries even on big data.

Further options are open source products like Elasticsearch or the tools of vendors like Oracle or Microsoft.
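
To give an idea of what querying such a platform looks like from code, here is a minimal sketch using the Elasticsearch Python client (version 8 style API); the cluster address, index name and field are assumptions:

    from elasticsearch import Elasticsearch  # pip install elasticsearch

    es = Elasticsearch("http://localhost:9200")  # assumed local cluster

    # Full-text search; the "documents" index and "content" field are hypothetical.
    result = es.search(index="documents", query={"match": {"content": "invoice"}})
    for hit in result["hits"]["hits"]:
        print(hit["_score"], hit["_source"])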

We support you in designing a performant and reliable platform to search and analyse your data in real time.

Our Approach: CRISP-DM

For data analytics projects we often use the Cross Industry Standard Process for Data Mining (CRISP-DM). It starts with the business understanding and the understanding of the data. It is a very practical and pragmatic approach, and with its iterative character it supports agile processes - for quick results.

 

We work with numerous technologies

To identify the technology which fits best to your infrastructure and your use case, we work with numerous solutions - from open source programming with Python or R to Oracle's data analytics solutions and our technology partners Talend, Cloudera and Amazon Web Services.

R is an open source programming language and environment for statistical data analysis and data visualisation. Due to its flexibility, vendor independence, active community and simple usability, R is a very common solution for data analytics projects. However, R reaches its limits with very high data volumes and very complex analyses.

The programming language Python is easy to understand and simple to learn. Nevertheless, even big and complex tasks and requirements can be solved with it in a structured way. Python is very popular within the data science community. That's why a lot of useful libraries and frameworks for data analytics have been created and are continuously enhanced.

From business analytics to advanced analytics: Oracle offers several data analytics solutions for its different products - including Oracle R Enterprise and Oracle Data Mining.

Oracle R Enterprise uses the open source language R for data analysis. The solution is optimized for performance, as it is targeted at corporate customers.

Oracle Data Mining provides a broad selection of analytics algorithms which run as SQL on the database. You can build several predictive analytics models, evaluate them and use them in production.

 

Especially in cases without complex algorithms, good SQL queries often reach the goal. The prerequisite: broad knowledge of SQL (Structured Query Language) and data science.

SQL is used to communicate with relational databases - e.g. to request data, join them or execute simple analyses. Some of the big data frameworks like Spark and Hive support SQL as well. Therefore, SQL plays a role in almost every data project - whether big data or data analytics.
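
A minimal sketch of such a simple SQL analysis, run here from Python against a hypothetical SQLite database with an orders table:

    import sqlite3

    con = sqlite3.connect("sales.db")  # hypothetical database

    # A simple analysis in plain SQL: the ten customers with the highest revenue.
    query = """
        SELECT customer_id, SUM(amount) AS revenue
        FROM orders
        GROUP BY customer_id
        ORDER BY revenue DESC
        LIMIT 10;
    """
    for customer_id, revenue in con.execute(query):
        print(customer_id, revenue)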

That is why we don't only program new SQL queries in data analytics projects, but also review and optimize existing ones.

Several solutions for data analytics use cases are available in the Hadoop ecosystem:

The Cloudera Data Science Workbench offers extensive opportunities to use diverse data analytics tools - with clear governance and security standards.

It is not itself a data analytics platform with algorithms and analyses, but it enables the use of different open source solutions in the Hadoop ecosystem, e.g. Spark, Impala, Kudu, Spark MLlib, Mahout, Pig and Solr.

Spark MLlib, an open source machine learning library for Spark and Hadoop, offers many algorithms with high performance. Impala is a highly performant solution to execute BI queries and data analytics within Hadoop. As Impala supports SQL, it is easy to implement and compatible with many other solutions.

Data science platforms from outside the Hadoop ecosystem like TensorFlow, R or Python can also be integrated easily. The advantage: you can use the data directly within Hadoop without copying it to a data scientist's notebook or into the cloud. This reduces the challenge of data synchronisation and risks related to IT security.
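
As an illustration of both points - training with Spark MLlib directly on data that stays in the cluster - here is a minimal PySpark sketch; the input path and column names are hypothetical, and the label column is assumed to be numeric:

    from pyspark.sql import SparkSession
    from pyspark.ml.classification import LogisticRegression
    from pyspark.ml.feature import VectorAssembler

    spark = SparkSession.builder.appName("mllib-sketch").getOrCreate()

    # Read the data where it lives - no copy to a local notebook.
    df = spark.read.parquet("hdfs:///data/customers")  # hypothetical path

    # MLlib expects the features combined into a single vector column.
    assembler = VectorAssembler(inputCols=["age", "revenue"], outputCol="features")
    train = assembler.transform(df)

    # Distributed logistic regression; "churned" is an assumed 0/1 label column.
    model = LogisticRegression(labelCol="churned").fit(train)
    print(model.coefficients)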

Talend Open Studio and the Talend Data Platform make it easy to create automated jobs for the use and analysis of data - whether for data integration or data analysis. The solutions of the Hadoop ecosystem are easily usable: with Hadoop Hive or Pig you can execute data analytics tasks without programming skills, as Talend creates the code in the background.

Open source data science and machine learning libraries and frameworks constantly evolve. Existing ones are extended and further developed while new ones are added. Besides the solutions mentioned above, we monitor the market and review new trends like H2O (www.h2o.ai) or TensorFlow.

Use Cases

Through useful use cases you can leverage data analytics projects to increase transparency and efficiency, optimize decisions and create value for your company - throughout your whole value chain: from marketing to your customer support and sales strategy, from procurement to administration, maintenance and support.

We have summarized some interesting use cases for you. We will be happy to support you in identifying adequate use cases, evaluating their value and developing an implementation strategy with you.

 

Customer churn refers to customers who stop using your product or your service - in other words, lost customers. In some industries customers change their supplier regularly - e.g. in telecommunications and energy. This customer fluctuation is expensive: when an existing customer quits, it creates administrative effort, and in parallel you have to invest in marketing and sales to win new customers.

Therefore, it is much cheaper to invest in high customer retention.

With data analytics solutions you can reduce this customer fluctuation. After identifying the relevant indicators, you can detect customers who are likely to leave early on and start countermeasures in time - for better customer retention and lower costs.
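
A minimal sketch of such a churn model with scikit-learn; the indicator columns and the file name are hypothetical examples:

    import pandas as pd
    from sklearn.ensemble import GradientBoostingClassifier
    from sklearn.model_selection import train_test_split

    # Hypothetical indicators: contract age, support tickets, usage trend.
    data = pd.read_csv("customers.csv")
    X = data[["contract_months", "support_tickets", "usage_trend"]]
    y = data["churned"]

    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)
    model = GradientBoostingClassifier().fit(X_train, y_train)

    # Score all customers and hand the most endangered ones to sales.
    data["churn_risk"] = model.predict_proba(X)[:, 1]
    print(data.sort_values("churn_risk", ascending=False).head(10))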

In many companies there is no holistic picture of their customers, even though the data would be available. The problem: often the data is not consolidated in one system, analysed and used. It is distributed across different systems in different departments and business areas - e.g. sales applications, support applications or customer portals.

By systematically consolidating, analysing and utilizing your data you can understand your customers better. Based on this better understanding it is much easier to identify the adequate offer for your customers. Customer transparency helps you increase your revenue while reducing your sales efforts at the same time.

Image damage or high penalties as a consequence of a compliance scandal, or high financial losses due to a major scam - unlawful actions and fraud can get very expensive for your company. The problem: especially in businesses with a high number of small transactions, it is almost impossible to audit each action manually, and high test coverage with extensive random samples is often costly.

With data analytics you can analyse a high number of transactions automatically and without much effort. You can detect fraud almost in real time and often stop it. This reduces the risks for your company significantly, avoids damage and reduces costs.
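
One possible approach - sketched here with scikit-learn's Isolation Forest and hypothetical transaction features - is to flag statistical outliers without needing labelled fraud cases:

    import pandas as pd
    from sklearn.ensemble import IsolationForest

    transactions = pd.read_csv("transactions.csv")  # hypothetical data
    features = transactions[["amount", "hour_of_day", "merchant_risk_score"]]

    # Isolation Forest isolates outliers; contamination is an assumed outlier rate.
    detector = IsolationForest(contamination=0.01, random_state=42).fit(features)
    transactions["suspicious"] = detector.predict(features) == -1

    # Route suspicious transactions to a manual review queue.
    print(transactions[transactions["suspicious"]])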

A strong network of partners with trusting cooperation is becoming more and more important in a digital economy. Procurement strategies of the past can put the functioning of such digital ecosystems at risk. Today, saving smartly is what counts.

With a smart combination of external predictions or data like market prices, stocks or weather data with internal data like production planning, inventory levels, storage costs etc., you can create a valuable data set. Based on this data set you can often achieve considerable savings through rather simple analyses.

The more transparently you see your processes, the better you can evaluate how efficient they really are. In every company, there is a lot of data available regarding its processes - probably more than the company is aware of. Many processes are supported and documented by software: applications like CRM, SCM or ERP usually create log files. Often this information is only used by IT to analyse errors.

By systematically analysing this data you can evaluate whether your processes are executed in the correct sequence and how long the processing time really is - a great basis for identifying major optimization potential.
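
A minimal sketch of such a process analysis with pandas, assuming a hypothetical event log with case ID, activity and timestamp:

    import pandas as pd

    # Hypothetical event log, e.g. exported from an ERP system.
    log = pd.read_csv("event_log.csv", parse_dates=["timestamp"])
    log = log.sort_values(["case_id", "timestamp"])

    # Throughput time per process instance: last event minus first event.
    durations = log.groupby("case_id")["timestamp"].agg(lambda t: t.max() - t.min())
    print(durations.describe())

    # Check the actual activity sequence against the expected one.
    expected = ["received", "approved", "shipped", "invoiced"]
    actual = log.groupby("case_id")["activity"].apply(list)
    deviating = actual.apply(lambda seq: seq != expected)
    print(f"{deviating.mean():.0%} of cases deviate from the expected sequence")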

Especially in large production environments, technical downtime of a major machine or component leads to an interruption of the overall production. This results in high costs.

The analysis of machine and sensor data can help to utilize maintenance windows better and avoid interruptions. With the data you can detect patterns and monitor thresholds to calculate the probability of a malfunction. Components which have reached a high probability of malfunction can be proactively replaced, optimized or maintained in the next maintenance window.
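
A minimal sketch of such threshold monitoring with pandas; the sensor file, window size and threshold are assumptions:

    import pandas as pd

    sensors = pd.read_csv("vibration.csv", parse_dates=["timestamp"])  # hypothetical

    # Smooth the raw signal and watch for a creeping rise towards a threshold.
    sensors["rolling_mean"] = sensors["vibration"].rolling(window=60).mean()
    THRESHOLD = 4.5  # assumed limit from the machine specification

    alerts = sensors[sensors["rolling_mean"] > THRESHOLD]
    if not alerts.empty:
        print("Schedule component for the next maintenance window; first exceedance:",
              alerts["timestamp"].iloc[0])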

Algorithms

The media usually only report on complex predictive algorithms. In reality, value can often already be created with quite simple algorithms - if you analyse the right data.

Below you will find an overview of some analysis goals - with exemplary use cases.

With classification, objects are allocated to relevant classes or categories based on their characteristics or their behavior. A simple example is the distinction between "creditworthy" and "not creditworthy" based on properties like age, income or permanent employment.
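
A minimal sketch of exactly this example, with a scikit-learn decision tree and made-up training data:

    import pandas as pd
    from sklearn.tree import DecisionTreeClassifier, export_text

    # Made-up applicants: age, income, permanently employed (0/1).
    applicants = pd.DataFrame({
        "age": [23, 45, 37, 52, 29],
        "income": [1800, 4200, 3100, 5000, 2200],
        "permanent": [0, 1, 1, 1, 0],
    })
    creditworthy = [0, 1, 1, 1, 0]  # made-up labels

    tree = DecisionTreeClassifier(max_depth=2).fit(applicants, creditworthy)
    print(export_text(tree, feature_names=list(applicants.columns)))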

With segmentation you cluster objects with similar attributes into object groups. These groups don't have to be known or defined beforehand; they can emerge through the analysis.

Segmentation is often used in marketing to divide target groups and approach them individually.
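
A minimal sketch of such a segmentation with scikit-learn's k-means; the customer features and file name are hypothetical:

    import pandas as pd
    from sklearn.cluster import KMeans
    from sklearn.preprocessing import StandardScaler

    customers = pd.read_csv("customers.csv")  # hypothetical data
    features = StandardScaler().fit_transform(
        customers[["orders_per_year", "avg_basket_value"]]
    )

    # The clusters are not defined beforehand; they emerge from the data.
    customers["segment"] = KMeans(n_clusters=4, random_state=42).fit_predict(features)
    print(customers.groupby("segment").mean(numeric_only=True))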

Dependency analysis is about analysing data to identify relations and dependencies between objects or characteristics. This can be used, for example, for shopping basket analysis: if you detect products which depend on each other, you can base your sales strategy on that knowledge.
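
A minimal sketch of a shopping basket analysis in plain Python, counting how often product pairs occur together in made-up baskets:

    from collections import Counter
    from itertools import combinations

    # Made-up shopping baskets.
    baskets = [
        {"bread", "butter", "milk"},
        {"bread", "butter"},
        {"milk", "coffee"},
        {"bread", "butter", "coffee"},
    ]

    # Count how often each product pair is bought together.
    pairs = Counter()
    for basket in baskets:
        pairs.update(combinations(sorted(basket), 2))

    print(pairs.most_common(3))  # ('bread', 'butter') appears in 3 baskets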

The goal of deviation analysis is to detect significant deviations - statistical outliers whose characteristics deviate strongly from those of the other objects. This type of analysis is often used for quality control in production. Once you have identified deviations, you can evaluate and solve the root cause, like material defects from a certain supplier.
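
A minimal sketch of such an outlier check with NumPy, using a simple three-sigma rule on hypothetical production measurements:

    import numpy as np

    measurements = np.loadtxt("weights.csv")  # hypothetical production data

    # Flag measurements more than three standard deviations from the mean -
    # classic statistical outliers.
    z_scores = (measurements - measurements.mean()) / measurements.std()
    outliers = measurements[np.abs(z_scores) > 3]
    print(f"{len(outliers)} suspicious parts out of {len(measurements)}")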

For a prediction or a forecast you use different calculation models with one or more variables for a point in the future.

In some cases a simple extrapolation is enough, where you estimate future values based on values of the past. In other cases rather complex modelling and simulation is required - especially if the framework conditions are not constant or a simple extrapolation leads to poor results. Often you start with simple models, evaluate the results and improve the model iteratively until the prediction reaches the expected quality level.

A common use case is the prediction of sales as a basis for production planning.
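
A minimal sketch of a simple extrapolation with NumPy, fitting a linear trend to made-up monthly sales and projecting it forward:

    import numpy as np

    # Made-up monthly sales figures for the past year.
    months = np.arange(12)
    sales = np.array([100, 104, 110, 108, 115, 121, 125, 130, 128, 135, 141, 144])

    # Simple extrapolation: fit a linear trend and project the next six months.
    slope, intercept = np.polyfit(months, sales, deg=1)
    forecast = slope * np.arange(12, 18) + intercept
    print("Forecast for the next six months:", forecast.round(1))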

Also for topics like text or image recognition and language processing there are dedicated analysis methodologies. With them, you can identify patterns or characteristics and match them with the data or objects. They combine several of the above-mentioned methodologies. Depending on the requirements, different analysis and modelling methods are used, often in combination - for example discriminant analysis, nearest neighbour methods, clustering, decision trees or artificial neural networks.