In-Memory Data Fabric: Apache Ignite 2.12 introduces change data capture


The Apache Software Foundation (ASF) has announced the release of version 2.12 of the in-memory data fabric Apache Ignite. In addition to numerous improvements and bug fixes, the release provides several new features, including an Index Query API and a framework for testing distributed environments based on Ducktape, the test library used for Kafka. Apache Ignite 2.12 also introduces the Change Data Capture (CDC) data processing pattern.

As an in-memory computing platform and distributed database, Ignite is designed to ensure scalability and performance for real-time applications even with large amounts of data. To do this, the system keeps the application data completely in memory and uses MPP (Massively Parallel Processing) to distribute the computing load across a large number of server nodes in the cluster. To receive changes to entries on the local node asynchronously and then trigger necessary actions, such as an update of the search index, Ignite now builds on the CDC pattern.

CDC is implemented via ignite-cdc.sh and the Java API. In addition to updating the search index, the pattern opens up further use cases for Ignite users, such as calculating statistics for streaming queries, asynchronous interaction with external systems, or audit logging. CDC is still considered an experimental feature for the time being. More information about the pattern can be found in the documentation.
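The idea behind the pattern can be illustrated with a minimal plain-Java sketch. It is not Ignite's CDC API: the names `CdcSketch`, `ChangeEvent`, `ChangeConsumer`, `Store` and `SearchIndex` are hypothetical and chosen only for illustration. The sketch assumes a store that appends every change to a log, and a consumer that later receives the captured changes in a batch and keeps a derived search index in sync.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

/** Conceptual CDC sketch in plain Java; none of these types are Ignite classes. */
public class CdcSketch {
    /** Hypothetical change record: a null value marks a removal. */
    static final class ChangeEvent {
        final String key;
        final String value;

        ChangeEvent(String key, String value) {
            this.key = key;
            this.value = value;
        }
    }

    /** Hypothetical consumer callback, invoked with a batch of captured changes. */
    interface ChangeConsumer {
        void onEvents(Iterator<ChangeEvent> events);
    }

    /** Primary store that records every change in a log instead of calling consumers inline. */
    static class Store {
        private final Map<String, String> data = new HashMap<>();
        private final List<ChangeEvent> changeLog = new ArrayList<>();
        private int consumedUpTo = 0;

        void put(String key, String value) {
            data.put(key, value);
            changeLog.add(new ChangeEvent(key, value));
        }

        void remove(String key) {
            data.remove(key);
            changeLog.add(new ChangeEvent(key, null));
        }

        /** Hands all not-yet-delivered changes to the consumer, decoupled from the writes. */
        void drainTo(ChangeConsumer consumer) {
            List<ChangeEvent> batch =
                new ArrayList<>(changeLog.subList(consumedUpTo, changeLog.size()));
            consumedUpTo = changeLog.size();
            consumer.onEvents(batch.iterator());
        }
    }

    /** Example consumer: keeps a value-to-key "search index" in sync with the store. */
    static class SearchIndex implements ChangeConsumer {
        final Map<String, String> keyByValue = new HashMap<>();
        private final Map<String, String> lastValueByKey = new HashMap<>();

        @Override
        public void onEvents(Iterator<ChangeEvent> events) {
            while (events.hasNext()) {
                ChangeEvent e = events.next();
                // Drop the index entry for the previous value of this key, if any.
                String old = lastValueByKey.remove(e.key);
                if (old != null) keyByValue.remove(old);
                // Index the new value unless the event is a removal.
                if (e.value != null) {
                    lastValueByKey.put(e.key, e.value);
                    keyByValue.put(e.value, e.key);
                }
            }
        }
    }

    public static void main(String[] args) {
        Store store = new Store();
        SearchIndex index = new SearchIndex();
        store.put("1", "alice");
        store.put("2", "bob");
        store.put("1", "carol"); // overwrite
        store.remove("2");       // removal
        store.drainTo(index);    // consumer catches up after the writes
        System.out.println(index.keyByValue); // prints {carol=1}
    }
}
```

The writes never wait for the consumer; the index only catches up when the captured changes are delivered, which is what makes the same mechanism usable for statistics, external systems, or audit logs.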

While Ignite uses only RAM by default, the system can also be extended to the disk storage available in the cluster if required. This only requires a configuration adjustment:

import org.apache.ignite.configuration.DataStorageConfiguration;
import org.apache.ignite.configuration.IgniteConfiguration;

IgniteConfiguration cfg = new IgniteConfiguration();
DataStorageConfiguration storageCfg = new DataStorageConfiguration();

// Enable Ignite persistence for the default data region
storageCfg.getDefaultDataRegionConfiguration().setPersistenceEnabled(true);

// Apply the new storage configuration
cfg.setDataStorageConfiguration(storageCfg);

A new Index Query API enables queries over distributed indexes and retrieves the cache entries that match the specified query. According to the Ignite development team, this can be useful, among other things, in cases where the application's design doesn't allow SQL. In terms of performance, an index scan is also preferable to a ScanQuery.
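Why an index scan tends to beat a full scan can be shown with a small plain-Java sketch. It does not use Ignite's API (there the corresponding class is `IndexQuery`); the class `IndexScanSketch` and its methods are hypothetical. The sketch assumes a cache mapping keys to an age value, plus a sorted secondary index over that value.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Plain-Java illustration of index scan vs. full scan; not Ignite's Index Query API. */
public class IndexScanSketch {
    /** Hypothetical cache content: key -> age. */
    private final Map<Integer, Integer> ageByKey = new HashMap<>();
    /** Sorted secondary index: age -> keys with that age. */
    private final TreeMap<Integer, List<Integer>> keysByAge = new TreeMap<>();

    void put(int key, int age) {
        Integer old = ageByKey.put(key, age);
        if (old != null) {
            // Remove the stale index entry left by the previous value.
            List<Integer> keys = keysByAge.get(old);
            keys.remove(Integer.valueOf(key));
            if (keys.isEmpty()) keysByAge.remove(old);
        }
        keysByAge.computeIfAbsent(age, a -> new ArrayList<>()).add(key);
    }

    /** Full scan: every entry is visited and tested against the predicate. */
    List<Integer> scanYoungerThan(int limit) {
        List<Integer> result = new ArrayList<>();
        for (Map.Entry<Integer, Integer> e : ageByKey.entrySet())
            if (e.getValue() < limit) result.add(e.getKey());
        Collections.sort(result);
        return result;
    }

    /** Index scan: only the matching range of the sorted index is visited. */
    List<Integer> indexYoungerThan(int limit) {
        List<Integer> result = new ArrayList<>();
        for (List<Integer> keys : keysByAge.headMap(limit, false).values())
            result.addAll(keys);
        Collections.sort(result);
        return result;
    }

    public static void main(String[] args) {
        IndexScanSketch cache = new IndexScanSketch();
        cache.put(1, 25);
        cache.put(2, 41);
        cache.put(3, 19);
        cache.put(4, 30);
        System.out.println(cache.scanYoungerThan(30));  // prints [1, 3]
        System.out.println(cache.indexYoungerThan(30)); // prints [1, 3]
    }
}
```

Both methods return the same keys, but the full scan touches all entries while the index lookup only walks the part of the sorted structure that satisfies the criterion.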

To test the Apache Ignite code base in real-world environments, the team decided to develop a test framework based on Confluent's Ducktape. The freely available Python framework is primarily used for testing Apache Kafka and was apparently the obvious choice because of its simple structure and its straightforward management of test clusters.

Codenamed Ducktest, the basic functionality and tests have been implemented in a separate branch of the main Ignite repository. Developers can now start and stop Ignite nodes in any Docker and cluster configuration. Using the test API, clusters can be administered via control.sh. Running other applications such as Spark or ZooKeeper is now also possible thanks to Ducktest.

More information on the In-Memory Data Fabric, originally developed by GridGain Systems, can be found on the Apache Ignite project website. The ASF blog post summarizes the most important innovations, and the release notes provide a complete overview.


(map)
