I am an experienced Data Scientist with a strong foundation in data analysis, data warehousing, and data visualization. My academic background in Mathematics and Statistics complements my technical skills in Python, SQL, Power BI, and Tableau. I am passionate about turning complex data into actionable insights, and I continuously seek opportunities to apply my expertise to real-world challenges.
- Programming Languages: Python, SQL (MSSQL, PostgreSQL, MySQL)
- Data Visualization: Power BI, Tableau
- Data Manipulation & Analysis: Pandas, NumPy, Advanced Excel, Power Query
- Data Warehousing & ETL: Data Pipelines, Data Cleaning, Data Modeling
- Messaging & Task Queues: RabbitMQ, Celery
- Others: DAX, Seaborn, Matplotlib
Data Scientist | Centocode (September 2024 - Present)
Working on cutting-edge AI and data solutions, focusing on building and optimizing LLM-based applications.
- LLM Development: Developing and fine-tuning large language models to solve real-world problems and enhance user experience.
- RAG Pipelines: Implementing Retrieval-Augmented Generation (RAG) systems using FastAPI and vector databases for efficient data retrieval and contextual responses.
- Database Management: Designing and managing MongoDB and vector databases to support high-performance data storage and querying.
- API Development: Creating scalable APIs with FastAPI to integrate AI models and data pipelines into production environments.
Data Scientist | 1ViewApps.com
Led the development of robust ETL microservices for 1ViewETL, a web application that extracts and processes data to support data-driven insights.
- Data Extraction: Designed and implemented a highly efficient data extraction system that integrates with various APIs, enabling high-throughput data retrieval and processing.
- Asynchronous Processing: Employed Celery as a distributed task queue for asynchronous ETL processing, leveraging RabbitMQ as the message broker to handle large volumes of data efficiently.
- Product Management: Ensured a smooth user journey and effective communication across the teams involved in developing the product.
Associate Data Scientist | KDataScience Solutions Pvt. Ltd.
- Developed and optimized data pipelines using Python and MSSQL for Power BI reports, ensuring efficient data flow and accurate reporting.
- Managed and validated data from various sources, including APIs and portals, using the Python library Pydantic to maintain high data quality.
- Designed and enhanced Power BI dashboards with advanced DAX functions and integrated features, facilitating data-driven decision-making for stakeholders.
- Mentored trainees in data science workflows, including data warehousing and data analysis.
- Bachelor of Mathematics (Hons) - Ambedkar University Delhi (2015-2020)
- LinkedIn: LinkedIn Profile
- Tableau Public: Tableau Profile
- Maven Analytics: Maven Analytics Profile
Thank you for visiting my GitHub profile. Feel free to explore my projects and repositories. If you have any questions or want to collaborate, please reach out!