Data Scientist

Job Description

Job Title: Data Scientist – Machine Learning, Big Data, GenAI (8–10 Years Experience)
Location: Remote
Employment Type: Contract

About the Role

We are seeking a highly experienced Data Scientist with 8–10 years of expertise delivering production-grade AI/ML solutions at scale. This role requires deep technical proficiency in Machine Learning, Big Data, Generative AI, Large Language Models (LLMs), and Retrieval-Augmented Generation (RAG), combined with hands-on cloud experience (AWS, Azure, or GCP) and migration expertise for modernizing data and AI platforms.

The ideal candidate can lead projects end-to-end, from architecture design to deployment, while mentoring teams, optimizing for performance and cost, and ensuring alignment with business objectives.

Key Responsibilities

  • Develop and deliver end-to-end ML/AI solutions in cloud-native environments, from design through deployment and monitoring.
  • Architect and implement Generative AI solutions leveraging LLMs (e.g., GPT, LLaMA, Claude, Mistral) and RAG pipelines with vector search.
  • Build and optimize Big Data pipelines using Apache Spark, PySpark, and Delta Lake integrated with cloud storage (AWS S3, Azure Data Lake, GCP Cloud Storage).
  • Design and maintain data lakehouse architectures with Databricks, Snowflake, or Delta Lake.
  • Deploy scalable MLOps pipelines using MLflow, SageMaker, Azure ML, or Vertex AI with Docker, Kubernetes (EKS, AKS, GKE), and CI/CD.
  • Implement and manage vector databases (Pinecone, FAISS, Milvus, Weaviate, ChromaDB) for RAG applications.
  • Oversee ETL/ELT workflows and pipeline orchestration using Airflow, dbt, or Azure Data Factory.
  • Lead migration projects (on-prem to cloud, cross-cloud, or legacy platform upgrades such as Hadoop to Databricks or Hive to Delta Lake), ensuring data integrity and minimal downtime.
  • Integrate streaming data solutions using Apache Kafka and real-time analytics frameworks.
  • Conduct feature engineering, hyperparameter tuning, and model optimization for performance and scalability.
  • Mentor junior data scientists and guide best practices for AI/ML development and deployment.
  • Collaborate with product, engineering, and executive teams to align AI solutions with business KPIs and compliance requirements.

Required Skills & Experience

  • 8–10 years of experience in data science, machine learning, and AI/ML solution delivery.
  • Strong hands-on expertise in at least one major cloud platform (AWS, Azure, or GCP) with proven production deployments.
  • Proficiency in Python, PySpark, and SQL.
  • Proven experience with Apache Spark, Hadoop ecosystem, and Big Data processing.
  • Hands-on experience with Generative AI, Hugging Face Transformers, LangChain, or LlamaIndex.
  • Expertise in RAG architectures and vector databases (Pinecone, FAISS, Milvus, Weaviate, ChromaDB).
  • Experience with MLOps workflows using MLflow, Docker, Kubernetes, and CI/CD tools (Jenkins, GitHub Actions, GitLab CI).
  • Migration experience involving AI/ML workloads, big data pipelines, and data platforms to modern cloud-based architectures.
  • Knowledge of cloud data services (AWS S3 and Redshift, Azure Synapse, GCP BigQuery) and infrastructure-as-code tools (Terraform, CloudFormation, ARM templates).
  • Familiarity with streaming technologies (Kafka) and query engines (Hive, Presto, Trino).
  • Strong foundation in statistics, probability, and ML algorithms.

Preferred Qualifications

  • Experience with knowledge graphs and semantic search.
  • Background in NLP, transformer architectures, and deep learning frameworks (TensorFlow, PyTorch).
  • Exposure to BI tools (Power BI, Tableau, Looker).
  • Domain expertise in finance, healthcare, or e-commerce.

 

Boston, MA
Full time

Published on 08/20/2025
