Project Overview

This project implements a credit card transaction processing system using a Lambda Architecture with both stream and batch processing components. The system tracks card balances, approves or declines transactions based on validation rules, and manages data flow between stream and batch layers.

  1. Stream Layer: Processes transactions in real-time, validating them and tracking pending balances.
  2. Batch Layer: Periodically processes accumulated data, updating the master dataset with finalized values.
  3. Serving Layer: Provides access to the processed data through a relational database.
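
The interaction between the stream and batch layers can be sketched roughly as follows. This is only an illustration under stated assumptions, not the required design: the `Card` field names, the `process_stream` and `run_batch` helpers, and the single validation rule (decline when the finalized balance plus the pending balance plus the new amount would exceed the credit limit) are all hypothetical. The actual rules come from the requirements.

```python
from dataclasses import dataclass


@dataclass
class Card:
    # All field names here are hypothetical, chosen for illustration only.
    card_id: str
    credit_limit: float
    current_balance: float        # finalized balance in the master dataset
    pending_balance: float = 0.0  # stream-layer running total, not yet finalized


def process_stream(card, amount):
    """Stream layer: approve or decline one transaction as it arrives.

    Assumed rule: decline when finalized + pending + new amount would
    exceed the card's credit limit.
    """
    if card.current_balance + card.pending_balance + amount > card.credit_limit:
        return "declined"
    card.pending_balance += amount
    return "approved"


def run_batch(cards):
    """Batch layer: fold pending amounts into the finalized balance."""
    for card in cards:
        card.current_balance += card.pending_balance
        card.pending_balance = 0.0


card = Card(card_id="C1", credit_limit=1000.0, current_balance=200.0)
results = [process_stream(card, amt) for amt in (300.0, 600.0, 100.0)]
run_batch([card])
```

In this sketch the stream layer only touches the pending balance, and the batch run is the only step that changes the finalized balance. A serving layer would then read the finalized values from the database.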

Dataset

Each student works on a different dataset. The link to the datasets is provided here, and your dataset ID remains the same as in Projects 2 and 3. Submissions that use the wrong dataset will not be accepted.

  1. Customers.csv records all important information about the credit card users

  2. Cards.csv records each card's identifying information

  3. Credit_card_types.csv records the different types of credit cards we offer

  4. Transactions.csv records the real-time transaction history from April 1 to April 4, 2025

Requirements and Grading Rubric (100 points)

Note: The pandas package is not allowed for processing data.
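
One possible way to process the CSV files without pandas is Python's standard `csv` module. The sketch below is an assumption-laden example, not a required approach: the column names (`transaction_id`, `card_id`, `amount`) and the `total_by_card` helper are hypothetical, and an in-memory string stands in for a real file such as Transactions.csv.

```python
import csv
import io
from collections import defaultdict

# Stand-in for a real file; the column names are assumptions for illustration.
sample = io.StringIO(
    "transaction_id,card_id,amount\n"
    "1,C1,25.50\n"
    "2,C2,10.00\n"
    "3,C1,4.50\n"
)


def total_by_card(handle):
    """Sum transaction amounts per card using only the standard library."""
    totals = defaultdict(float)
    for row in csv.DictReader(handle):
        totals[row["card_id"]] += float(row["amount"])
    return dict(totals)


totals = total_by_card(sample)
```

With a real file, the same function could be called on a handle opened with `open(path, newline="")`.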

Commonly asked questions:

  1. Do we need to do daily processing between stream and batch layers?