Platforms
Solutions
Products
Services
Resources
Company
About Us
Clientele
Events
Careers
Disclosures
Media Kit
Contact Us
SELECT LANGUAGE
Contact Us
Resource/ Blogs

When building a recommendation system, the quality of your data pipeline determines the quality of your recommendations. After evaluating Cassandra’s Batch Processing Platform (BPP), we decided to build our Universal Recommendation System’s data ingestion pipeline on Stream Processing Platform (SPP) with Apache Flink. Six months and millions of processed messages later, here is why it was the right decision and what we learned along the way.
Stream Processing Platform (SPP) is an enterprise streaming platform that provides event-driven processing, automatic deployment, monitoring, and native integration with Event Bus (Kafka). Apache Flink is the streaming engine powering SPP, providing exactly-once semantics, stateful processing, and horizontal scalability.
For personalisation use cases, stale data means missed opportunities — a recommendation system is only as good as the data feeding it. Our pipeline addresses this directly by keeping recommendation data in near-real-time sync with source systems. This means newly published content is available for recommendation within seconds, incoming events are dynamically enriched before reaching the recommender, and delivery remains reliable at scale — so users always receive contextually relevant, up-to-date recommendations rather than yesterday’s data.
SPP provides event-driven streaming with sub-second latency (vs. batch delays), horizontal auto-scaling based on message volume, native Kafka integration and tooling, automatic checkpointing and state management, and SDK support with local development tools. These capabilities directly support our goals of near-real-time sync, scalability, and faster iteration for personalisation.
The architecture follows an event-driven pattern: Kafka Source → JSON Parsing → Dynamic Pipeline Orchestration → Recommender Output. Failed messages are managed via fail-fast and filtering (logged and dropped so healthy traffic continues).
Apache Flink serves as the streaming engine, providing exactly-once semantics, stateful processing, and horizontal scalability. SPP adds enterprise-grade features including automatic deployment, monitoring, and integration with the ecosystem.
High-Level Architecture Diagram
Explanation: Event Source: Two Kafka-backed flows: OICMS (content ingestion topic) and Campaign Manager (CM events topic).
Stream Processing — SPP on Apache Flink: shared ingest and JSON parsing; flow-specific validation (details in diagram for both OICMS and CM); flow-specific enrichment (OICMS: fetch external content; CM: recommender API + CM placement).
Destination: Both flows write to the Recommender System.
Traditional batch processing systems collect data and deliver results hours or days later. For modern recommendation systems, this delay is unacceptable. Real-time processing enables immediate content availability, dynamic enrichment, and reliable delivery at scale. Understanding the right architecture for stream processing keeps engineering teams ahead of the scalability curve.
We started with a proof-of-concept using Cassandra’s Batch Processing Platform. While BPP is powerful for scheduled batch jobs, we quickly realised our recommendation system needed real-time, event-driven processing.
We designed a multi-stage processing pipeline built on Apache Flink that ingests content from Kafka, processes it through configurable stages, and delivers enriched data to the recommendation system in real time.
Several deliberate design choices shaped the reliability, maintainability, and performance of our pipeline. Here is what made the most difference.
One of our most impactful decisions was making the entire pipeline configuration-driven. Instead of hardcoding processing logic, we externalised all processing steps, validation rules, and service endpoints into HOCON configuration files. This approach delivers significant benefits: zero-code deployments for pipeline modifications, environment-specific configurations (local, QAL, production), easy experimentation with different processing strategies, and deployment time reduced from hours to minutes.
We designed the system expecting failures in distributed environments. Key resilience patterns include: Retry Mechanisms with exponential backoff for external API calls and configurable retry attempts per service; fail-fast and filtering so failed messages are logged and dropped without blocking healthy traffic; Circuit Breakers to prevent cascading failures when external services become unavailable; and Monitoring & Observability with structured logging and Splunk integration for real-time visibility into pipeline health, performance metrics, and error tracking.
We integrated MapStruct for compile-time code generation of data mappings between DTOs and domain models. This approach catches mapping errors during compilation rather than at runtime, eliminating an entire class of production bugs and making refactoring safer.
Running a full Flink cluster with Kafka locally is complex. We built a Local Producer tool that executes the complete pipeline locally without infrastructure dependencies. This reduced the developer feedback loop from hours to seconds, dramatically improving productivity and code quality.
Since deploying to production, our pipeline has delivered:
Choosing Stream Processing Platform over Batch Processing Platform gives us the low latency, scalability, and reliability needed for modern recommendation systems. Our configuration-driven architecture on Apache Flink with SPP’s enterprise features created a maintainable foundation for near-real-time data synchronisation to support personalisation use cases. Validating technology choices through POCs, prioritising configuration over code, and designing for failure and observability from the start were key to our success.
Want to build smarter pipelines with AI-first architecture? At Covalense Digital, we specialise in building intelligent, AI-first platforms that power real-time decisioning, personalised customer experiences, and scalable data pipelines — for telecoms, enterprises, and beyond. Whether you’re modernising a legacy BSS stack or engineering a next-generation data architecture, our expertise in AI/ML, iPaaS, and digital transformation can help you move faster and scale confidently.
Let's connect at reachus@covalensedigital.com or fill in a quick contact form.
Author
Chaitanya Madamanchi, Senior Software Engineer
With over 6 years of IT experience, Chaitanya specialises in Java-based technologies, including Spring Boot, Hibernate, and Microservices. He has a proven track record of developing backend systems and RESTful integrations for enterprise-scale solutions in Agile environments.