Data Analytics
Utilize the value of your data
With data analytics you gain important insights - for example, how to support your customers better, reduce costs or optimize processes and production alike.
There are plenty of opportunities, from improving your reporting with simple analyses to establishing complex simulations of future scenarios. This also means that data analytics doesn't need to be costly or complex, nor is it necessarily connected to big data.
We support you with queries for analysing past data as well as with modelling future-oriented predictive or prescriptive approaches.
We support you through all the steps of data analytics
Data Integration and preparation, Visualization, Testing and Support. Get to know the important elements of data analytics and learn how we support you in each step.
Data Integration
Suppliers and business partners, social media, weather data, customer questionnaires: Combining your internal data with external data in a central data analytics platform has many advantages. You can identify correlations, create scenarios and come to new conclusions. What you need: smart data integration - especially for real-time analytics.
Together with you, we identify a suitable solution, which is not necessarily complex or expensive. Often we can already achieve good results by analysing only your databases with simple tasks.
Our tools will be aligned to your infrastructure. We work with tools like Oracle Data Integration, Microsoft SQL Server Integration Services (SSIS), integration tools of the Cloudera Hadoop ecosystem like Spark Streaming and the supplier-independent integration tool Talend.
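As a minimal sketch of what combining internal and external data can look like, the following Python example joins daily sales figures with weather readings on a shared date. The data, the field names and the integrate helper are purely illustrative assumptions; in a real project this step would typically run in one of the integration tools mentioned above.

```python
# Hypothetical sample data: internal daily sales and external weather readings.
internal_sales = [
    {"date": "2024-06-01", "units_sold": 120},
    {"date": "2024-06-02", "units_sold": 95},
    {"date": "2024-06-03", "units_sold": 180},
]
external_weather = [
    {"date": "2024-06-01", "max_temp_c": 21.5},
    {"date": "2024-06-03", "max_temp_c": 29.0},
]


def integrate(internal, external, key="date"):
    """Left-join external records onto internal records by a shared key."""
    external_by_key = {row[key]: row for row in external}
    combined = []
    for row in internal:
        match = external_by_key.get(row[key], {})
        # Internal rows are always kept; a missing external value becomes None.
        combined.append({**row, "max_temp_c": match.get("max_temp_c")})
    return combined


combined = integrate(internal_sales, external_weather)
for row in combined:
    print(row)
```

Keeping every internal record, even without a matching external value, makes gaps in the external source visible instead of silently dropping rows.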
Data Preparation
The next important step after data integration is data preparation and data quality management. As with all systems, the same holds for data analytics: the better the quality of the data used, the better the results and insights of the analysis.
During data preparation, metadata are also added to ensure transparency, which is important for complying with data privacy regulations and for validating data integrity.
Together, we will evaluate what data preparation is suitable for your use case.
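To illustrate the idea, here is a small Python sketch of a data preparation step that applies a few quality rules and records metadata about the run. The rules, the field names and the source name "crm_export" are assumptions chosen for the example, not a fixed rule set.

```python
from datetime import datetime, timezone

# Hypothetical raw customer records with typical quality problems.
raw_records = [
    {"customer_id": "C1", "email": "a@example.com", "age": 34},
    {"customer_id": "C2", "email": "", "age": 51},                # missing email
    {"customer_id": "C1", "email": "a@example.com", "age": 34},   # duplicate
    {"customer_id": "C3", "email": "c@example.com", "age": -5},   # implausible age
]


def prepare(records, source_name):
    """Split records into clean and rejected ones and describe the run in metadata."""
    seen_ids = set()
    clean, rejected = [], []
    for record in records:
        problems = []
        if record["customer_id"] in seen_ids:
            problems.append("duplicate customer_id")
        if not record["email"]:
            problems.append("missing email")
        if not 0 <= record["age"] <= 120:
            problems.append("implausible age")
        if problems:
            rejected.append({"record": record, "problems": problems})
            continue
        seen_ids.add(record["customer_id"])
        clean.append(record)
    # Metadata documents where the data came from and what happened to it.
    metadata = {
        "source": source_name,
        "prepared_at": datetime.now(timezone.utc).isoformat(),
        "rows_in": len(records),
        "rows_clean": len(clean),
        "rows_rejected": len(rejected),
    }
    return clean, rejected, metadata


clean, rejected, metadata = prepare(raw_records, source_name="crm_export")
print(metadata)
```

Rejected records are kept together with the reason for rejection, so that data owners can correct them at the source.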
Data Analysis
Together, we define queries and analyses. We develop and test the query logic and the algorithms iteratively. At this step the input of your experts is important. Business context, corporate experience and special characteristics of your industry will be evaluated together. The contribution of your experts will also ensure the acceptance of the data analytics solution in operation later on.
Data Visualisation
Visualising the results of your data analysis is often as important as the analysis algorithm itself. Cockpits or dashboards are often used to visualise and interpret the data. Some analyses even rely on the visualisation of the data.
We help you identify the best-fitting standard solution for visualising your data and analyses - e.g. Tableau or Microsoft BI.
Or we develop a customised cockpit tailored to your needs. There is a wide variety of open libraries and frameworks for R, Java or Python, such as Shiny. We design the dashboard based on your requirements, also as a mobile app.
Testing and trial phase
The testing and trial phase is the central part of every data analytics project. Usually a concept goes through several iterations of the phases "test", "measure" and "improve".
Your assumptions, algorithms and scenarios will be evaluated and tested. The results are verified repeatedly in order to adjust and optimize the models.
The solution goes live only after it has achieved very good results.
Go live
Once the quality is assured and the results are verified, the data analytics solution can go live in production. Good preparation and a structured handover to IT operations are as important as transparent documentation.
Our recommendation for data analytics projects: "Think big, start small and learn fast."
Start with simple use cases and extend fast through agile development cycles. Too much complexity in the beginning brings the risk of delays and not achieving quick results.
Support
Conditions and the environment are changing constantly. This means your data analytics solution must be evaluated regularly and adjusted on demand. For example, continuous improvement of your analyses through learning algorithms and regular adjustment is very important.
Therefore, we support you not only in the development phase but also during operation. We can maintain your data analytics solution, verify and optimize it constantly. Hence, we can ensure that your data analytics solution is sustainably delivering value - especially in complex use cases.
Search – Find the right data
Especially with big data, an extended search can be considered data analytics. A great candidate, for example, is the "Cloudera Search" tool. It is based on Apache Solr and enables scalable and reliable queries even on big data.
Further options could be open source products like Elasticsearch or the tools of suppliers like Oracle or Microsoft.
We support you in designing a performant and reliable platform to search and analyse your data in real time.
Our Approach: CRISP-DM
For data analytics projects we often use the Cross Industry Standard Process for Data Mining, CRISP-DM. It starts with the business understanding and the understanding of the data. It is a very practical and pragmatic approach, and its iterative character supports agile processes and quick results.
Step 1:
Goals & Requirements
At the beginning of the project we want to understand your company as well as possible. In our workshops we gain an understanding of your use cases and identify the goals and the added value.
Step 2:
Analysing the data
Together with you, we evaluate what data are needed to achieve the goals and what data are already available in which format. If relevant data are missing, we will contact you.
Step 3:
Data Preparation and Modelling
Based on Step 2 we start with the data preparation and create first models - depending on the requirements in several iterations. In many cases only the modelling makes clear what further data preparation is required.
Step 4:
Verification of the results
After creating an analysis model we verify and evaluate the results (evaluation). This step helps us understand your use case and your business needs even better. If necessary we optimize the data preparation and the modelling based on the new insights we have obtained.
Step 5:
Handing over to operations
Only after sufficient testing and verification do we hand over the solution to operations (deployment).
We work with numerous technologies
To identify the technology that best fits your infrastructure and your use case, we work with numerous solutions - from open source programming with Python or R to Oracle Data Analytics and our technology partners Talend, Cloudera and Amazon Web Services.
R
R is an open source programming language and environment for statistical data analysis and data visualization. Due to its flexibility, supplier independence, large community and ease of use, R is a very common solution for data analytics projects. However, R reaches its limits with very high data volumes and very complex analyses.
Python
The programming language Python is easy to understand and simple to learn. Nevertheless, big and complex tasks and requirements can also be solved in a structured way. Python is very popular within the data science community. That's why a lot of useful libraries and frameworks for data analytics have been created and are continuously enhanced.
Oracle Analytics
From Business Analytics up to Advanced Analytics: Oracle offers several data analytics solutions for its different products - including Oracle R Enterprise and Oracle Data Mining.
Oracle R Enterprise uses the open source language R to create data analyses. This solution is optimized for performance, as the overall solution targets corporate customers.
Oracle Data Mining provides a broad selection of analytics algorithms which run as SQL on the database. You can build several predictive analytics methodologies, evaluate them and use them in production.
Simple Queries - SQL
Especially in cases without complex algorithms good SQL queries often reach the goal. Prerequisite: broad knowledge of SQL (Structured Query Language) and data science.
SQL is used to communicate with relational databases - e.g. to request data, join them or execute simple analyses. Some of the big data frameworks like Spark and Hive also support SQL. Therefore, SQL plays a certain role in almost every data project - whether big data or data analytics.
That is why we don't only program new SQL queries in data analytics projects, but also review and optimize existing SQL queries.
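A simple analysis of this kind can be sketched with Python's built-in sqlite3 module and an in-memory database. The tables customers and orders and their contents are hypothetical; the query joins both tables and aggregates revenue per region.

```python
import sqlite3

# In-memory database with two hypothetical tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, region TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
    INSERT INTO customers VALUES (1, 'North'), (2, 'South'), (3, 'North');
    INSERT INTO orders VALUES (1, 1, 100.0), (2, 1, 50.0), (3, 2, 80.0), (4, 3, 20.0);
""")

# Join customers with their orders and aggregate per region.
query = """
    SELECT c.region, COUNT(o.id) AS order_count, SUM(o.amount) AS revenue
    FROM customers AS c
    JOIN orders AS o ON o.customer_id = c.id
    GROUP BY c.region
    ORDER BY revenue DESC
"""
revenue_by_region = conn.execute(query).fetchall()
conn.close()

for region, order_count, revenue in revenue_by_region:
    print(region, order_count, revenue)
```

The same query pattern (join, group, aggregate) carries over to most relational databases and to SQL-capable big data frameworks, although dialect details may differ.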
Hadoop-Ecosystem via Cloudera
Several solutions for data analytics use cases are available in the Hadoop ecosystem:
The Cloudera Data Science Workbench offers extensive opportunities to use diverse data analytics tools – with clear governance and security standards.
It is not itself a data analytics platform with algorithms and analyses. It enables the use of different open source solutions in the Hadoop ecosystem, e.g. Spark, Impala, Kudu, Spark MLlib, Mahout, Pig, Solr.
Spark MLlib, an open source machine learning library for Spark and Hadoop, offers many algorithms with high performance. Impala is a high-performance solution for executing BI requests and data analytics within Hadoop. As Impala supports SQL, it is easy to implement and compatible with many other solutions.
Data science platforms from outside the Hadoop ecosystem like TensorFlow, R or Python can also be integrated easily. The advantage: You can use the data directly within Hadoop without copying them to the data scientist's laptop or to the cloud. This reduces the challenge of data synchronisation and the risks related to IT security.
Talend
Talend Open Studio and the Talend Data Platform make it easy to create automated jobs for the use and analysis of data, whether for data integration or data analysis. The solutions of the Hadoop ecosystem are easy to use. With Hadoop Hive or Pig you can execute data analytics tasks without programming skills, as Talend creates the code in the background.
Other technologies
Open source data science and machine learning libraries and frameworks constantly evolve. Existing ones are extended and further developed while new ones are added. Besides the solutions mentioned above, we monitor the market and review new trends like H2O (www.h2o.ai) or TensorFlow.
Use Cases
With useful use cases, data analytics projects can increase transparency and efficiency, optimize decisions and create value for your company - throughout your whole value chain. From marketing to your customer support and sales strategy. From procurement to administration, maintenance and support.
We have summarized some interesting use cases for you. We will be happy to support you in identifying suitable use cases, evaluating their value and developing an implementation strategy with you.
Customer Churn – avoid customer fluctuation
Customer churn describes customers who stop using your product or your service, in other words lost customers. In some industries customers change suppliers regularly - e.g. in telecommunications and energy. This customer fluctuation is expensive. When an existing customer leaves, it creates administrative effort. In parallel you have to invest in marketing and sales to win new customers.
Therefore, it is much cheaper to invest in a high customer retention.
With data analytics solutions you can reduce customer fluctuation. After identifying the relevant indicators you can detect customers who are likely to switch at an early stage and start countermeasures in time. For better customer retention and lower costs.
Customer 360 – increase customer transparency
In many companies there is no holistic picture of their customers, even though the data would be available. The problem: Often the data are not consolidated in one system and are not analysed and used. They are distributed in different systems within different departments and business areas - e.g. sales applications, support applications or customer portals.
By systematically consolidating, analysing and utilizing your data you can understand your customers better. Based on this better understanding it is much easier to identify the right offer for your customers. Customer transparency helps you increase your revenue and reduces your sales effort at the same time.
Fraud Prevention – Detect and avoid fraud
Reputational damage or high penalties as a consequence of a compliance scandal, or high financial losses due to a major scam - unlawful actions and fraud can get very expensive for your company. The problem: Especially in businesses with a high number of small transactions, it is almost impossible to audit each action manually. High test coverage with extensive random samples is often costly.
With data analytics you can analyse a high number of transactions automatically and with little effort. You can detect fraud almost in real time and often stop it. This significantly reduces the risks for your company, avoids damage and reduces costs.
Optimizing procurement - smart savings
A strong network of partners with trusting cooperation is becoming more and more important in a digital economy. Past procurement strategies can put the functioning of such digital ecosystems at risk. Smart savings are important today.
With a smart combination of external predictions or data like market prices, stocks or weather data with internal data like production planning, inventory levels, storage costs etc. you can create a valuable data set. Based on this data set you can often achieve reasonable cost savings through rather simple analyses.
Optimizing your processes - organize better with process mining
The more transparently you have your processes in front of you, the better you can evaluate how efficient they really are. In every company, there is much data available regarding its processes – probably more than the company is aware of. Many processes are supported and documented by software. Applications like CRM, SCM or ERP usually create log files. Often this information is only used by IT, for example to analyse errors.
By systematically analysing these data you can evaluate if your processes are executed in the correct sequence and how high the processing time really is. A great basis to identify major potential for optimization.
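As a rough illustration, the following Python sketch reads a small event log, checks for each case whether the activities occurred in the expected sequence and calculates the processing time. The log entries, the activity names and the expected sequence are hypothetical; real log files usually need to be parsed and cleaned first.

```python
from datetime import datetime

# Hypothetical target process: the order in which activities should occur.
EXPECTED_SEQUENCE = ["order_received", "order_approved", "goods_shipped"]

# Hypothetical event log entries: (case id, activity, timestamp).
event_log = [
    ("case-1", "order_received", "2024-03-01T09:00"),
    ("case-1", "order_approved", "2024-03-01T11:00"),
    ("case-1", "goods_shipped", "2024-03-02T09:00"),
    ("case-2", "order_received", "2024-03-01T10:00"),
    ("case-2", "goods_shipped", "2024-03-01T12:00"),
    ("case-2", "order_approved", "2024-03-01T15:00"),
]


def analyse_cases(log):
    """Return sequence conformance and processing time in hours per case."""
    cases = {}
    for case_id, activity, timestamp in log:
        cases.setdefault(case_id, []).append((datetime.fromisoformat(timestamp), activity))
    report = {}
    for case_id, events in cases.items():
        events.sort()  # chronological order
        activities = [activity for _, activity in events]
        duration = events[-1][0] - events[0][0]
        report[case_id] = {
            "in_expected_order": activities == EXPECTED_SEQUENCE,
            "processing_hours": duration.total_seconds() / 3600,
        }
    return report


report = analyse_cases(event_log)
for case_id, result in report.items():
    print(case_id, result)
```

In this sample, the second case ships goods before the order is approved, which is exactly the kind of deviation such an analysis is meant to surface.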
Predictive Maintenance – avoid interruptions
Especially in large production facilities, technical downtime of a major machine or component leads to an interruption of the overall production. This results in high costs.
Analysing machine data and relevant sensor data can help to make better use of maintenance windows and avoid interruptions. With the data you can detect patterns and monitor thresholds to calculate the probability of malfunction. Components which have reached a high probability of malfunction can be proactively replaced, or optimized and maintained in the next maintenance window.
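Threshold monitoring can be sketched in a few lines of Python. The example below flags machines whose average vibration over the most recent readings exceeds a limit. The sensor values, the limit of 7.0 and the window of three readings are assumptions for illustration; it is a simple threshold check, not a calculated malfunction probability.

```python
from statistics import mean

VIBRATION_LIMIT = 7.0  # hypothetical threshold, to be agreed with your engineers
WINDOW = 3             # number of most recent readings to average

# Hypothetical vibration readings per machine, oldest first.
readings = {
    "pump-A": [4.1, 4.3, 4.0, 4.4, 4.2],
    "pump-B": [5.0, 5.9, 6.8, 7.4, 8.1],
}


def needs_maintenance(values, limit=VIBRATION_LIMIT, window=WINDOW):
    """Flag a machine when the average of its latest readings exceeds the limit."""
    recent = values[-window:]
    return mean(recent) > limit


flagged = [machine for machine, values in readings.items() if needs_maintenance(values)]
print(flagged)
```

Averaging over a short window rather than reacting to a single reading reduces false alarms from one-off sensor spikes.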
Algorithms
The media usually report only on complex predictive algorithms. In reality, value can often be created with quite simple algorithms as well, if you analyse the right data.
Below you will find an overview of some analysis goals - with exemplary use cases.
Classification
With classification, objects are assigned to relevant classes or categories based on their characteristics or their behavior. A simple example is the distinction between "creditworthy" and "not creditworthy" based on properties like age, income or permanent employment.
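One possible way to implement such a classification is the nearest neighbor method. The Python sketch below assigns a new applicant the majority label of the three most similar known cases. The training data, the two features (monthly income in thousand euros, permanent employment as 1 or 0) and the choice of k = 3 are hypothetical; a real credit model would need far more data and properly scaled features.

```python
from collections import Counter
from math import dist

# Hypothetical training data:
# (monthly income in thousand euros, permanent employment 1/0) -> label
training_data = [
    ((1.2, 0), "not creditworthy"),
    ((1.8, 0), "not creditworthy"),
    ((1.5, 1), "not creditworthy"),
    ((3.5, 1), "creditworthy"),
    ((4.2, 1), "creditworthy"),
    ((5.0, 0), "creditworthy"),
]


def classify(applicant, examples=training_data, k=3):
    """Assign the majority label of the k closest training examples."""
    nearest = sorted(examples, key=lambda example: dist(applicant, example[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


print(classify((3.8, 1)))
print(classify((1.6, 0)))
```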
Segmentation
With segmentation you cluster objects with similar attributes into groups. These groups don't have to be known or defined beforehand. They can emerge from the analysis.
Segmentation is often used in marketing to divide target groups and approach them individually.
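A common technique for this is k-means clustering. The following Python sketch is a minimal version with fixed starting centroids so that the result is reproducible. The customer data points and the choice of two segments are assumptions; production implementations usually pick the starting centroids randomly and run several times.

```python
from math import dist


def k_means(points, centroids, max_iterations=20):
    """Minimal k-means: assign points to the nearest centroid, then update centroids."""
    labels = []
    for _ in range(max_iterations):
        # Assignment step: index of the closest centroid for every point.
        labels = [
            min(range(len(centroids)), key=lambda i: dist(point, centroids[i]))
            for point in points
        ]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for i, old_centroid in enumerate(centroids):
            members = [p for p, label in zip(points, labels) if label == i]
            if not members:
                new_centroids.append(old_centroid)  # keep empty clusters in place
                continue
            new_centroids.append(tuple(sum(axis) / len(members) for axis in zip(*members)))
        if new_centroids == centroids:
            break  # converged
        centroids = new_centroids
    return labels, centroids


# Hypothetical customers: (orders per year, average basket value in 10 euros).
points = [(1, 1), (1.5, 2), (1, 1.5), (8, 8), (9, 9), (8.5, 9.5)]
labels, centroids = k_means(points, centroids=[points[0], points[3]])
print(labels)
print(centroids)
```

The groups are not defined in advance; only the number of segments is given, and the algorithm finds which customers belong together.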
Dependency Analysis
Dependency analysis is about analysing data to identify relations and dependencies between several objects or characteristics. This can be used, for example, for shopping basket analysis. If you detect products that depend on each other, you can base your sales strategy on that knowledge.
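Two basic measures of a shopping basket analysis are support (how often products are bought together) and confidence (how often a purchase of one product also includes the other). The Python sketch below computes both for a handful of hypothetical baskets; the helper names support and confidence are chosen for this example.

```python
# Hypothetical shopping baskets.
baskets = [
    {"bread", "butter", "jam"},
    {"bread", "butter"},
    {"bread", "milk"},
    {"butter", "milk"},
    {"bread", "butter", "milk"},
]


def count_containing(items, all_baskets):
    """Number of baskets that contain every item in the given set."""
    return sum(1 for basket in all_baskets if items <= basket)


def support(items, all_baskets):
    """Share of all baskets that contain the item set."""
    return count_containing(items, all_baskets) / len(all_baskets)


def confidence(antecedent, consequent, all_baskets):
    """Share of baskets with the antecedent that also contain the consequent."""
    both = count_containing(antecedent | consequent, all_baskets)
    return both / count_containing(antecedent, all_baskets)


print(support({"bread", "butter"}, baskets))
print(confidence({"bread"}, {"butter"}, baskets))
```

In this sample, bread and butter appear together in three of five baskets, and three of the four baskets with bread also contain butter.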
Deviation Analysis
The goal of deviation analysis is to detect significant deviations - statistical outliers that deviate strongly from the characteristics of the other objects. This type of analysis is often used for quality control in production. Once you have identified deviations, you can start to evaluate and solve the root cause, such as material defects from a certain supplier.
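A simple way to find such outliers is the z-score, which measures how many standard deviations a value lies from the mean. The Python sketch below uses hypothetical measurements and an assumed cut-off of 2.0; the right cut-off depends on the process and the amount of data.

```python
from statistics import mean, pstdev

# Hypothetical measurements of a component dimension in millimetres.
measurements = [10.1, 9.9, 10.0, 10.2, 9.8, 10.0, 13.5, 10.1]
Z_LIMIT = 2.0  # assumed cut-off for this small sample


def find_outliers(values, z_limit=Z_LIMIT):
    """Return (index, value) pairs whose z-score exceeds the limit."""
    average = mean(values)
    spread = pstdev(values)
    if spread == 0:
        return []  # all values identical, nothing deviates
    return [
        (index, value)
        for index, value in enumerate(values)
        if abs(value - average) / spread > z_limit
    ]


outliers = find_outliers(measurements)
print(outliers)
```

Note that a single extreme value also inflates the mean and the standard deviation, so for small samples a lower cut-off or a median-based measure may be more suitable.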
Prediction / Forecasting
For a prediction or a forecast you use different calculation models with one or more variables to estimate values for a point in the future.
In some cases it is enough to use a simple extrapolation where you estimate future values based on values of the past. In other cases rather complex modelling and simulation is required - especially if the framework conditions are not constant or a simple extrapolation leads to poor results. Often you start with simple models, evaluate the results and improve the model iteratively till the prediction reaches the expected quality level.
A common use case is the prediction of the sales as basis for the production planning.
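A simple extrapolation can be sketched as a straight line fitted to past values and extended into the future. The Python example below uses a least-squares fit; the monthly sales figures and the function name linear_forecast are hypothetical, and the approach only makes sense when the trend is roughly linear and the conditions stay stable.

```python
def linear_forecast(history, periods_ahead=1):
    """Fit a straight line to past values (least squares) and extend it forward."""
    n = len(history)  # needs at least two past values
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(history) / n
    slope = (
        sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, history))
        / sum((x - mean_x) ** 2 for x in xs)
    )
    intercept = mean_y - slope * mean_x
    return intercept + slope * (n - 1 + periods_ahead)


# Hypothetical monthly sales figures, oldest first.
monthly_sales = [100, 110, 120, 130]
next_month = linear_forecast(monthly_sales)
print(next_month)
```

Comparing such a baseline with actual values is a practical first iteration before moving to more complex models.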
Other methods
There are also dedicated analysis methodologies for topics like text or image recognition and language processing. With them, you can identify patterns or characteristics and match them with the data or objects. They combine several of the above-mentioned methodologies. Depending on the requirements, different analysis and modelling methods are used, often in combination. For example discriminant analysis, nearest neighbor method, clustering, decision trees or artificial neural networks.