Background
Our client, an American national off-price department store retailer with over 1,000 stores in 40 states and Puerto Rico, sought our expertise in optimizing their IT infrastructure and reducing costs. Specifically, they needed assistance with their legacy data processing system, elasticity and scalability challenges, and slow database operations.
Current Challenges
Legacy System for Data Processing: The client's existing data processing application, built on Informatica, had been in use for 12 years and struggled to support tables with over 4 billion records as the business grew.
Elasticity & Scalability: Because the system was hosted in a data center, the client lacked the flexibility to adjust resource usage based on demand, which hindered cost optimization.
DBMS Operations: The source data was constantly changing, leading to sluggish delete and upsert operations within the database.
Solution
We proposed and implemented three approaches to address the client's challenges, then compared their performance and cost.
Approach 1: Snowflake
Data from various sources was loaded into Snowflake, including the large table with 4 billion records.
Data processing (ETL / ELT) was performed within Snowflake, leveraging different layers.
Snowflake warehouses were utilized to recreate scenarios and measure results.
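The delete and upsert operations described under Current Challenges can, in Snowflake, be expressed as a single MERGE statement. The following is a minimal sketch of how such a statement might be generated, not the statement used in this engagement. The function name build_merge_sql, the table and column names, and the is_deleted flag column on the staging table are all hypothetical.

```python
def build_merge_sql(target, staging, key_cols, value_cols, delete_flag="is_deleted"):
    """Build a Snowflake MERGE that applies deletes, updates, and inserts in one pass.

    Assumes (hypothetically) that the staging table holds the changed rows and
    a boolean column, named by delete_flag, marking rows removed at the source.
    """
    on_clause = " AND ".join(f"t.{c} = s.{c}" for c in key_cols)
    set_clause = ", ".join(f"t.{c} = s.{c}" for c in value_cols)
    all_cols = list(key_cols) + list(value_cols)
    insert_cols = ", ".join(all_cols)
    insert_vals = ", ".join(f"s.{c}" for c in all_cols)
    return (
        f"MERGE INTO {target} t\n"
        f"USING {staging} s\n"
        f"ON {on_clause}\n"
        # Rows flagged as deleted at the source are removed from the target.
        f"WHEN MATCHED AND s.{delete_flag} THEN DELETE\n"
        # Remaining matched rows are updated in place.
        f"WHEN MATCHED THEN UPDATE SET {set_clause}\n"
        # Unmatched, non-deleted rows are inserted as new records.
        f"WHEN NOT MATCHED AND NOT s.{delete_flag} THEN "
        f"INSERT ({insert_cols}) VALUES ({insert_vals})"
    )


if __name__ == "__main__":
    # Hypothetical table and column names, for illustration only.
    print(build_merge_sql("sales", "sales_changes", ["sale_id"], ["amount", "status"]))
```

Folding the three operations into one statement means the target table is scanned once per batch of changes rather than once per operation type, which matters for a table with billions of rows.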
Approach 2: Databricks + Snowflake
Databricks was used to load the data into target tables in Snowflake.
Azure Data Lake Storage (ADLS) supported the Databricks environment.
Approach 3: Informatica + Snowflake
Data processing was carried out using Informatica.
The consumption layer resided within Snowflake.
The Results
After a thorough comparison of performance metrics for all three approaches, with proper utilization of Snowflake warehouses, the combination of Informatica + Snowflake emerged as the preferred option. The client's existing familiarity with Informatica also contributed to this decision, since it minimized the learning curve.
By selecting Informatica + Snowflake, our client experienced the following outcomes:
Improved Performance: The optimized tech stack resulted in enhanced performance metrics, enabling efficient data processing and faster DBMS operations.
Scalability and Cost Optimization: The shift to Snowflake in a cloud-based environment provided the needed elasticity and scalability, allowing the client to adjust resource usage based on demand and optimize costs accordingly.
Seamless Integration: Leveraging Informatica within the existing ecosystem ensured a smoother transition and minimized the impact of change on the organization.
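The cost side of the elasticity outcome above can be illustrated with Snowflake's standard warehouse billing. Credits per hour double with each warehouse size, and usage is billed per second with a 60-second minimum each time a warehouse starts. The sketch below estimates credits for a single run. The function name estimate_credits is hypothetical, and the price per credit is left out because it depends on the edition and region.

```python
# Credits per hour for standard Snowflake warehouses; each size doubles the last.
CREDITS_PER_HOUR = {
    "XSMALL": 1,
    "SMALL": 2,
    "MEDIUM": 4,
    "LARGE": 8,
    "XLARGE": 16,
}


def estimate_credits(size, run_seconds):
    """Estimate credits for one continuous run of a warehouse of the given size.

    Applies per-second billing with a 60-second minimum per start.
    """
    if run_seconds < 0:
        raise ValueError("run_seconds must be non-negative")
    billed_seconds = max(run_seconds, 60)
    return CREDITS_PER_HOUR[size] * billed_seconds / 3600
```

For example, a MEDIUM warehouse running for 30 minutes and a LARGE warehouse running for 15 minutes both consume 2 credits. If a workload scales close to linearly (an assumption that has to be measured per workload), a larger warehouse can finish sooner at a similar cost, and suspending the warehouse between runs avoids paying for idle capacity.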
In conclusion, our technology advisory and tech stack selection expertise helped our client overcome their challenges, leading to improved performance, scalability, and cost optimization in their data warehousing operations.