Our story of how the monq AIOps platform emerged from a huge project for monitoring business applications

Copyright Roskosmos

In this article I would like to tell the story of how we developed our product, an AIOps platform for IT and business monitoring, and share some lessons we learned along the way. It should interest both those who are building a product and those who handle IT monitoring in a large organization, since our platform is built for automation, all-in-one end-to-end monitoring, synthetic monitoring and predictive analytics.

In 2014, there lived a team of techies who worked in a system service provider company and dreamed of making their own…

or why they won’t fire a good sysadmin

IT staff stare at screens showing the performance metrics of their IT assets around the clock: a typical work shift in any SOC (Security Operations Center) or NOC (Network Operations Center). Your talented engineers' time can be used much more efficiently by implementing an intelligent system that handles these tasks on its own. Engineers can then be assigned to work where human intellect is needed most, for example development and control tasks, while routine work is done by “robots”. …

Availability of IT services as a key business indicator, and what does a watermelon have to do with it?

The use of metrics in management is a progressive and modern practice, especially in an environment as digitalized as the IT business. And what business has not become an IT business in the last decade? Everywhere, from the flower trade to the automotive industry, IT is a key factor for success. Metrics allow us to make competent management and engineering decisions, allocate the budget correctly, increase transparency, and achieve fairness and objectivity.

But this noble undertaking has a serious obstacle (we will leave the philosophical problem “Can everything be reduced to a number” for another, specialized platform and dystopias like “Black Mirror”). In…

Application performance monitoring and health metrics without APM

I have already written about AIOps and machine learning methods in working with IT incidents, about hybrid umbrella monitoring and various approaches to service management. Now I would like to share a very specific algorithm: how to quickly get information about the operating state of business applications using synthetic monitoring, and how to build a health metric for business services on that basis at no special cost. The story is based on a real case of implementing the algorithm in the IT system of an airline.

Currently there are many APM systems, such as AppDynamics, Dynatrace, and others…

Root cause analysis of IT incidents based on correlations between time series of IT infrastructure metrics


One of the tasks of IT monitoring systems is the collection, storage and analysis of metrics that characterize both the state of individual elements of the IT infrastructure (CPU load, free RAM, free disk space, etc.) and the state of business processes. To apply the extensive mathematical apparatus of statistical analysis, it is often more convenient to present this data as ordered time series of the corresponding variables. A good tool for time series processing in Python is a combination of three modules: pandas, scipy and statsmodels (pandas.pydata.org, scipy.stats, statsmodels.org) which provide a wide…
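To make the idea of correlating metric time series a little more concrete, here is a minimal sketch that uses only pandas and numpy. It is not the method from the article, which is cut off in this excerpt. The metric names, the synthetic data and the helper `rank_correlated_metrics` are all assumptions made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical, synthetic metrics sampled once a minute.
# The column names and values are illustrative assumptions, not real data.
index = pd.date_range("2024-01-01", periods=120, freq="min")
t = np.arange(120)
cpu_load = 50 + 10 * np.sin(t / 10.0)

metrics = pd.DataFrame(
    {
        "cpu_load": cpu_load,
        # Response time is constructed to follow CPU load, plus a small ripple.
        "response_time_ms": 200 + 3 * cpu_load + np.cos(1.7 * t),
        # Free disk space shrinks slowly and independently of the load.
        "free_disk_gb": 500 - 0.1 * t,
    },
    index=index,
)


def rank_correlated_metrics(frame: pd.DataFrame, target: str) -> pd.Series:
    """Rank all other columns by the absolute Pearson correlation with `target`."""
    corr = frame.corr(method="pearson")[target].drop(target)
    order = corr.abs().sort_values(ascending=False).index
    return corr.reindex(order)


ranking = rank_correlated_metrics(metrics, "response_time_ms")
print(ranking)
```

In this toy data set `cpu_load` comes out on top because the response time was built from it. With real metrics, a high correlation only points to a candidate cause worth checking and does not prove causation.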

Nikolay Ganyushkin

CEO and founder of monqlab, an AIOps data platform for log analysis, monitoring and automation. MS in Nuclear Physics. MBA, Skolkovo.
