
Ramakrishna

Data Engineer / Data Analyst

Jurisdiction

Canada

Experience

Seniority Level: Entry level
Years of Experience: 0-5 years
Current Status: Active
Data Engineer, Nokia
2025 - 2025

• Engineered end-to-end data pipelines using Azure Data Factory and Databricks, enabling seamless ingestion, transformation, and loading (ETL/ELT) of large-scale structured and semi-structured data.
• Architected and deployed a modern Lakehouse solution using Microsoft Fabric, integrating data ingestion, modeling, and real-time reporting across departments.
• Spearheaded real-time PySpark pipelines in Databricks, converting raw feeds into refined Delta Live Tables to boost analytical efficiency.
• Integrated SharePoint, Power Automate, and Azure Blob Storage to automate document processing and route data directly into SQL Server via ADF pipelines.
• Managed hybrid data infrastructure across on-prem SQL Server and cloud platforms (Azure) to support scalable and secure data solutions.
• Designed robust dimensional models (star and snowflake schemas) using dbt Cloud, applying the Medallion architecture (Bronze, Silver, Gold) for data transformation and reusability.
• Developed advanced T-SQL and PL/SQL stored procedures, views, and functions to implement business logic, automate workflows, and standardize reporting metrics.
• Implemented Azure Backup and Site Recovery solutions for mission-critical workloads, enabling RPO/RTO compliance and minimizing downtime during failover scenarios.
• Applied Unity Catalog in Databricks to enforce fine-grained data access policies, centralize metadata management, and maintain audit-ready data governance.
• Configured real-time failure alerts using Logic Apps, ADF Web Activity, and Azure Monitor, enabling faster issue resolution and improved system uptime.
• Implemented version control and CI/CD processes using Azure DevOps, facilitating safe deployments and environment consistency.
• Automated data ingestion from SharePoint to Azure Blob Storage with PowerShell scripts, reducing manual intervention and accelerating data availability.
• Collaborated with DevOps teams to containerize data applications using Docker and orchestrate deployments through Kubernetes, ensuring scalability and resilience.
• Built and deployed predictive ML pipelines using Azure Machine Learning Studio, training models on curated datasets and automating score ingestion into SQL Server with ADF.
• Structured and managed Azure ML Datastores for training, testing, and monitoring model performance efficiently across multiple environments.
• Leveraged scikit-learn and pandas in Jupyter notebooks for feature engineering, cross-validation, and explainability of machine learning models.
• Embedded ML predictions into Power BI dashboards for real-time decision-making and operational intelligence.
• Developed impactful Power BI dashboards to visualize insights from Azure SQL, Databricks, and Delta Lake, enabling executives and analysts to track KPIs in real time.
• Created complex DAX measures and custom calculations to support executive-level dashboards with row-level security and dynamic filtering.
• Ensured seamless auto-refresh and data integrity in Power BI Service through linked datasets and scheduled updates.
• Automated Azure infrastructure deployments using ARM templates, Bicep, and PowerShell, ensuring consistent and repeatable provisioning of resources.
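The Bronze-to-Silver refinement step mentioned above (raw feeds into refined Delta tables) could be sketched roughly as follows. This is a minimal illustration only: plain Python stands in for PySpark/Delta Live Tables, and every field name and record is invented, not taken from the Nokia pipelines.

```python
import json

# Hypothetical raw "Bronze" feed: newline-delimited JSON strings.
# Field names (device_id, reading, ts) are illustrative assumptions.
bronze = [
    '{"device_id": "A1", "reading": "42.5", "ts": "2025-01-01T00:00:00"}',
    '{"device_id": "A1", "reading": null, "ts": "2025-01-01T00:05:00"}',
    '{"device_id": "B2", "reading": "17.0", "ts": "2025-01-01T00:00:00"}',
    'not-json',  # malformed line: dropped during refinement
]

def refine(raw_lines):
    """Bronze -> Silver: parse, drop malformed or null rows, cast types."""
    silver = []
    for line in raw_lines:
        try:
            rec = json.loads(line)
        except json.JSONDecodeError:
            continue  # a real pipeline would quarantine these records
        if rec.get("reading") is None:
            continue  # drop rows with missing measurements
        silver.append({
            "device_id": rec["device_id"],
            "reading": float(rec["reading"]),  # enforce schema types
            "ts": rec["ts"],
        })
    return silver

silver = refine(bronze)
print(silver)  # two clean, typed records survive
```

In a Delta Live Tables pipeline the same cleansing logic would typically live in a table-defining function with data-quality expectations, rather than a hand-rolled loop.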

Data Analyst / Data Engineer, EDURUN GROUP INC
2022 - 2023

• Designed and implemented scalable ETL pipelines using Azure Data Factory (ADF) to extract data from on-premises SQL Server and flat file systems into Azure SQL Database, enhancing accessibility for downstream analytics and reporting.
• Configured modular and parameterized ADF pipelines leveraging activities such as Lookup, For Each, Copy, Get Metadata, and Stored Procedure, ensuring pipeline reusability, maintainability, and performance.
• Designed, developed, and deployed ETL workflows using SSIS to extract, transform, and load data from multiple sources (SQL Server, Excel, flat files, and APIs) into staging and data warehouse environments.
• Developed robust T-SQL stored procedures, user-defined functions, and views to encapsulate complex business logic and automate recurring data processing tasks.
• Created parameterized SSIS packages for dynamic data loading and configuration-driven deployments across development, QA, and production environments.
• Automated daily, weekly, and monthly data workflows by configuring SQL Server Agent Jobs for ETL tasks, backups, and scheduled report generation, minimizing manual dependencies.
• Designed relational data models and normalized schemas for greenfield data projects, ensuring referential integrity and consistent data structures across systems.
• Authored data validation scripts and reconciliation queries to compare source vs. target datasets post-ingestion, ensuring completeness and accuracy of data loads.
• Partnered with business analysts and domain experts to translate reporting needs into optimized SQL queries and deliver actionable, metric-rich ad hoc and scheduled reports.
• Utilized Common Table Expressions (CTEs), window functions, and temporary tables to efficiently compute key performance indicators (KPIs) and address complex analytical use cases.
• Migrated legacy SQL jobs and scripts into modular components using modern best practices such as consistent naming conventions, error handling, and reusability patterns.
• Developed Mapping Data Flows in ADF for data cleansing, schema standardization, and type transformations, improving the quality and readiness of data for analytics.
• Orchestrated Azure Databricks notebooks within ADF pipelines to transform raw structured and semi-structured data using PySpark, processing formats such as CSV and JSON.
• Authored PySpark scripts to perform joins, filters, and aggregations on large datasets and export transformed outputs to Parquet format in Azure Data Lake Storage Gen2 (ADLS).
• Supported the introduction of CI/CD workflows by connecting ADF to Git repositories, ensuring proper version control, collaboration, and safe deployments across development environments.
• Conducted data profiling and quality assessments using T-SQL and Spark SQL, identifying inconsistencies, null patterns, and schema mismatches.
• Created and maintained basic Power BI dashboards pulling from Azure SQL and ADLS, enabling real-time monitoring of ETL pipeline status, data freshness, and ingestion errors for operational teams.
• Leveraged Microsoft Excel for advanced data validation, profiling, and reconciliation using Power Pivot, VLOOKUP, INDEX-MATCH, and dynamic charts to support ETL testing and stakeholder reports.
• Used Excel to audit ETL outputs, streamline reconciliation between source and target datasets, and support business analysts with dynamic KPI exploration.
• Led efforts in troubleshooting and resolving data pipeline bottlenecks by analyzing activity run logs and leveraging Azure Monitor alerts.
• Implemented real-time streaming ingestion pipelines using Azure Event Hub, Kafka, and Spark Structured Streaming, supporting dynamic Power BI visualizations.
• Crafted interactive Power BI dashboards to visualize ETL health, leveraging ADF and Logic App logs for real-time system monitoring.
• Documented end-to-end data flow architecture, aiding in faster root cause analysis and onboarding of new developers.
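The CTE-plus-window-function KPI pattern named in the bullets above can be sketched as a small, self-contained example. SQLite (via Python's stdlib) stands in for SQL Server here, and the table, columns, and rows are all invented for illustration; the query shape (a CTE feeding a partitioned running total) is the point.

```python
import sqlite3

# In-memory database with a hypothetical sales table.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sales (region TEXT, sale_date TEXT, amount REAL);
INSERT INTO sales VALUES
  ('East', '2023-01-01', 100.0),
  ('East', '2023-01-02', 150.0),
  ('West', '2023-01-01', 200.0),
  ('West', '2023-01-02', 50.0);
""")

# CTE aggregates to daily totals; the window function then computes
# a per-region running total, a typical KPI-style calculation.
kpi_sql = """
WITH daily AS (
    SELECT region, sale_date, SUM(amount) AS day_total
    FROM sales
    GROUP BY region, sale_date
)
SELECT region,
       sale_date,
       day_total,
       SUM(day_total) OVER (
           PARTITION BY region ORDER BY sale_date
       ) AS running_total
FROM daily
ORDER BY region, sale_date;
"""
rows = conn.execute(kpi_sql).fetchall()
for row in rows:
    print(row)
```

The same WITH / OVER (PARTITION BY ... ORDER BY ...) syntax carries over to T-SQL with minor dialect differences.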

Education

Master of Computer Science, UNIVERSITY OF WINDSOR
2024 - 2025

Skills


Languages

English