Self-driving – Modern Data Warehouses



Self-driving

A self-driving database requires three high-level steps. The first step is to enable monitoring of the administrative activities of the database. The second step is to collect data and monitor past events. Third is to create predictive models; e.g., a forecasting framework predicts the query arrival rates using time-series models combined with behavior modeling, which constructs machine learning models that predict the run-time behavior of database management systems. And, in the last, what action needs to be taken to enable autonomous database management systems that proactively plan for optimization actions? It automatically applies the actions and provides explanations of its decisions. Some examples of algorithms are Monte Carlo tree search and receding horizon control.

A self-driving DBMS enables automatic system management, Sys-DevOps, and removes any impediments. First, workload forecasting predicts the query, workload patterns, storage, and processing requirements using machine learning, time-series learning model, behavior modeling and AI.

Self-tuning andConfiguration

Modern data warehouses utilize machine learning and deep learning algorithms to analyze current workloads to forecast future workloads. At the same time, they control and fine-tune the number of CPUs required and customize temporary storage and query performance. Modern data warehouses have the ability to use hybrid storage to reduce the latency of OLAP/OLTP workloads.

Self-automatic backup is included as part of the business continuity and disaster recovery service, and this includes restoration to a point in time within a preconfigured retention period. The options for full backup of data and logs can be set by the customer to their desired frequency, recovery point, and time objective. It offers self-automatic multi-regional support for backup and self-recovery from a failover point.

Self-partitioning divides a big portion into smaller partitions. This is generally done to improve the performance and security by querying a small set of data parallelly. For example, with multiple parallel processing, partitioning advisors work closely with an MPP (multiple parallel processing) engine to partition with three objectives: combining the best of directed and stochastic search algorithms; balancing and exploiting the best solutions and exploring the search space; and maintaining a pool of potential solutions that consider alternative algorithms.

Self-scaling via horizontal and vertical scaling offers easy database node resizing so you can scale your database resources up or down as needed. Even horizontal scaling helps by adding capacity via additional servers, portioning the servers, and loading them. However, vertical scaling adds capacity to a server by increasing memory, processors, and so forth, making the same server bigger and better. The use cases are different for horizontal and vertical scaling. However the best-case scenario is having the capability of hybrid scaling where, depending on the use case, one or both capacities can be chosen.

Hybrid: As we know, OLTP and OLAP systems work with different objectives in mind. One is writing optimized, which OLAP is a read-optimized engine. A major development would be to transform DBMS to support both OLAP and OLTP workloads at the same time. The modern DW is able to adjust its configuration to accommodate both types of workloads at run time, depending on technical or business requirements.

Leave a Reply

Your email address will not be published. Required fields are marked *