Combining DevOps and Big Data




Those who work in DevOps may think that Big Data doesn't have much to do with them, and vice versa. But the boundary between the two fields is blurring, and many businesses are realising that it makes sense to pair Big Data with DevOps. Here's why.


What is DevOps?


People who work in data may only be vaguely familiar with the concept of DevOps, so before continuing, it's important to clarify terms. According to Wikipedia, DevOps is defined as:


“DevOps is a software engineering practice that aims at unifying software development (Dev) and software operation (Ops). The main characteristic of the DevOps movement is to strongly advocate automation and monitoring at all steps of software construction, from integration, testing, releasing to deployment and infrastructure management.”


Many who work in DevOps are quick to point out that it is a philosophy or mindset rather than an exact process. DevOps emphasises constant communication across a range of different departments; the thinking goes that breaking down organisational barriers helps to streamline the process of software production.


The ‘continuous delivery’ of software is another important concept closely related to DevOps. Under the continuous delivery model, code is designed, written, tested, and pushed into production environments on a constant basis.


DevOps makes continuous delivery possible because it facilitates constant collaboration between all the teams responsible for pushing software down the delivery pipeline. More traditional approaches to software production often create long delays as code is passed from one team to the next: the work is done in sequence, with no ability to work in parallel.


Big Data and DevOps


Data wasn’t mentioned in either of the descriptions above and it’s true that in the conventional sense, DevOps is not closely linked to the field of data analytics. So where does data fit in? Considering that the ultimate aim of DevOps is to make software production and delivery more efficient, including data specialists within the continuous delivery process can be a huge help when it comes to optimising and refining ongoing operations and processes. There are valuable contributions data analysts can make at a variety of stages throughout the software delivery pipeline. Here are a few of the benefits companies can expect to gain by combining Big Data and DevOps.  


More effective planning of software updates


Most software comes into contact with data at one point or another. Before updating or redesigning an app, it helps to have a highly accurate understanding of the types of data sources the app will be working with. The sooner this information is delivered to the developers, the better. By getting together with data experts before sitting down to write code, developers can plan updates in a more effective way.
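For instance, a quick profiling pass over a sample of the incoming data can tell developers which fields exist and which types they actually carry before any code is written. The snippet below is a minimal sketch using only the Python standard library; the sample records and field names are hypothetical.

```python
from collections import Counter

def profile_records(records):
    """Summarise which fields appear across a sample of records
    and which Python types each field carries.

    `records` is assumed to be a list of dicts, e.g. parsed from
    a JSON export of the data sources the app will consume.
    """
    field_types = {}
    for record in records:
        for field, value in record.items():
            field_types.setdefault(field, Counter())[type(value).__name__] += 1
    return field_types

# Hypothetical sample pulled from an upstream data source.
sample = [
    {"user_id": 1, "signup": "2020-01-03"},
    {"user_id": "2", "signup": None},  # inconsistent types surface here
]
print(profile_records(sample))
```

A report like this, produced by the data team before the planning meeting, tells developers up front that `user_id` arrives as both `int` and `str` and that `signup` can be missing.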


Lower error rates


As software is written and tested, problems surrounding data handling commonly cause errors. As the complexity of the app and the data it works with increases, so does the error rate. Being able to identify such errors in the early stages of the delivery pipeline can save a huge amount of time and effort. Close collaboration between data experts and the rest of the DevOps team makes life much easier when it comes to finding and fixing data-related errors in an application.
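One practical way to catch such errors early is to turn known-bad inputs into tests at the start of the pipeline. The sketch below is hypothetical: `parse_amount` stands in for real data-handling code, and the malformed inputs are the sort data experts can supply from experience with the actual feeds.

```python
def parse_amount(raw):
    """Parse a currency string such as '1,234.50' into a float.

    Hypothetical helper standing in for real data-handling code.
    """
    if raw is None or not raw.strip():
        raise ValueError("empty amount")
    return float(raw.replace(",", ""))

def test_thousands_separator():
    # A format the data team has actually seen in production feeds.
    assert parse_amount("1,234.50") == 1234.5

def test_blank_input_is_rejected():
    # Blank values occur in real data and must fail loudly, not silently.
    try:
        parse_amount("   ")
    except ValueError:
        return
    raise AssertionError("blank amount should be rejected")

test_thousands_separator()
test_blank_input_is_rejected()
```

The point is not the parsing itself but the collaboration: the test cases encode knowledge that only the data experts have, and they fail in development rather than in production.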


Consistency between development and production environments


One of the central tenets of the DevOps philosophy is to strive to create development environments that mimic real-world production environments as closely as possible. But when it comes to apps that work with Big Data, this is not easy for developers to achieve alone. This is because the types and diversity of data found in the real world vary widely, and the quality of the data being fed into the app is also subject to change.

By making their voice heard early on in the delivery process, data experts can flag up to the development team the types of challenges their software will likely run into when it goes into production. Steps can then be taken to bring the development environment more in line with the demands of the real world.
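One way to act on that advice is to deliberately inject production-style messiness into the clean fixtures used in development. The function below is a rough sketch; the noise patterns (dropped fields, stray whitespace, numbers arriving as strings) are illustrative assumptions, not a catalogue of real production faults, which in practice the data experts would supply.

```python
import random

def degrade(records, seed=0):
    """Return a copy of clean fixture records with production-style noise.

    The noise patterns here are purely illustrative; in practice they
    would be chosen by data experts who know the real production feeds.
    """
    rng = random.Random(seed)  # seeded so test runs stay reproducible
    noisy = []
    for rec in records:
        rec = dict(rec)  # leave the original fixtures untouched
        roll = rng.random()
        if roll < 0.2 and rec:
            rec.pop(rng.choice(sorted(rec)))   # drop a field entirely
        elif roll < 0.4 and rec:
            key = rng.choice(sorted(rec))
            if isinstance(rec[key], str):
                rec[key] = f"  {rec[key]} "    # stray whitespace
            elif isinstance(rec[key], int):
                rec[key] = str(rec[key])       # number arriving as text
        noisy.append(rec)
    return noisy

clean = [{"id": i, "name": "user"} for i in range(10)]
print(degrade(clean))
```

Running the test suite against `degrade(clean)` as well as `clean` gives the development environment some of the unpredictability the app will face in production.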


More accurate feedback from production


In the final stage of the continuous delivery process, data is collected from the production environment after the app has been released. This data can then be used to understand where the strengths and weaknesses of the software lie, which allows the next update to be planned accordingly. This process depends in part on the work of system administrators, who help monitor and maintain software once it goes into production.


However, no one is better qualified to analyse production-related data than the data experts themselves. The relevant data includes things like app health statistics (CPU time, memory usage, etc.), the number and location of users, and much more. This is arguably where Big Data works best with DevOps. By bringing their skill in data analytics to the DevOps feedback process, data experts can help ensure the company has a solid understanding of what is and isn't working in the delivery pipeline.


The bottom line here is that Big Data teams and DevOps teams can benefit from working together. Making room for data experts within your DevOps workflow can help to make the delivery of software as efficient as possible. Even though Big Data and DevOps are traditionally seen as two completely different entities, the two departments should not be walled off from one another.