🐍
Python Engineer building data pipelines & automation systems
Organizations

@Randolph-Lab

audreymhoughton/README.md

👋 Hi, I'm Audrey Houghton

🚀 Data • Infrastructure • ML Systems Engineer
⚙️ HPC | Analytics Engineering | Automation | Research Computing

Resume LinkedIn Google Scholar


I build scalable data and compute systems that transform complex datasets into reliable, reproducible, and production-ready workflows.

My work spans GPU cloud infrastructure, high-performance computing, machine learning pipelines, and large-scale analytics systems across both research and production environments.


🔭 Currently Working On

  • 🧠 GIS-linked Electronic Health Record (EHR) analytics pipelines
  • 🤖 Automation systems integrating APIs, structured logging, and alerts
  • ⚡ Reproducible SLURM & HPC data processing workflows
  • 📊 Applied machine learning and causal inference analysis
  • 🏗️ Production-style research infrastructure & validation systems

⭐ Featured Projects

🤖 Lead Automation & Sales Intelligence Pipeline

Enterprise-style automation platform integrating compliant web scraping, territory mapping, structured logging, and workflow alerting.

Tech: Python • APIs • ETL • Automation • Data Validation


⚑ SLURM Wrappers

Reusable Python/Bash wrappers enabling scalable execution of large data-processing workloads across HPC environments.

Tech: Python • Bash • SLURM • HPC
🔗 https://github.com/DCAN-Labs/SLURM_wrappers
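
As a rough illustration of what a Python wrapper around SLURM job submission can look like, here is a minimal sketch. It is not code from the SLURM_wrappers repository: the function names (`build_sbatch_command`, `parse_job_id`, `submit`) and the default resource values are assumptions made for this example. It relies only on standard `sbatch` options (`--parsable`, `--job-name`, `--time`, `--mem`, `--cpus-per-task`, `--partition`, `--wrap`).

```python
import subprocess
from typing import List, Optional


def build_sbatch_command(
    command: str,
    job_name: str,
    time_limit: str = "01:00:00",  # hypothetical default, adjust per cluster
    mem: str = "4G",               # hypothetical default
    cpus: int = 1,
    partition: Optional[str] = None,
) -> List[str]:
    """Build an sbatch argument list that wraps a single shell command."""
    args = [
        "sbatch",
        "--parsable",  # print only the job ID (optionally "id;cluster")
        f"--job-name={job_name}",
        f"--time={time_limit}",
        f"--mem={mem}",
        f"--cpus-per-task={cpus}",
    ]
    if partition is not None:
        args.append(f"--partition={partition}")
    # --wrap lets sbatch generate a small batch script around the command.
    args.append(f"--wrap={command}")
    return args


def parse_job_id(sbatch_output: str) -> str:
    """Extract the job ID from `sbatch --parsable` output."""
    return sbatch_output.strip().split(";")[0]


def submit(command: str, job_name: str, **options) -> str:
    """Submit the command and return its job ID (requires sbatch on PATH)."""
    result = subprocess.run(
        build_sbatch_command(command, job_name, **options),
        check=True,
        capture_output=True,
        text=True,
    )
    return parse_job_id(result.stdout)


if __name__ == "__main__":
    # Only prints the command; nothing is submitted here.
    print(build_sbatch_command("python process.py --subject sub-01", "demo"))
```

Keeping command construction separate from submission makes the wrapper testable on machines without SLURM installed, since only `submit` needs a live scheduler.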


📦 CABINET: Container Linking Framework

FAIR-compliant container-linking system enabling reproducible scientific pipelines across heterogeneous compute environments.

Tech: Python • Docker • Singularity
🔗 https://github.com/DCAN-Labs/CABINET
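
To make the idea of container linking concrete, the sketch below turns a JSON description of ordered stages into `singularity run` commands. The JSON schema (`stages`, `image`, `binds`, `args`), the image paths, and the function name `build_stage_commands` are all hypothetical and invented for this example; this is not CABINET's actual configuration format or API.

```python
import json
from typing import Dict, List

# Hypothetical config: each stage names a container image, its bind mounts
# (host path -> container path), and the arguments passed to the container.
EXAMPLE_CONFIG = json.loads("""
{
  "stages": [
    {
      "name": "preprocess",
      "image": "/containers/preprocess.sif",
      "binds": {"/data/raw": "/input", "/data/work": "/output"},
      "args": ["/input", "/output"]
    },
    {
      "name": "segment",
      "image": "/containers/segment.sif",
      "binds": {"/data/work": "/input", "/data/out": "/output"},
      "args": ["/input", "/output"]
    }
  ]
}
""")


def build_stage_commands(config: Dict) -> List[List[str]]:
    """Return one `singularity run` command per stage, in config order."""
    commands = []
    for stage in config["stages"]:
        cmd = ["singularity", "run"]
        for host_path, container_path in stage.get("binds", {}).items():
            cmd += ["--bind", f"{host_path}:{container_path}"]
        cmd.append(stage["image"])
        cmd += stage.get("args", [])
        commands.append(cmd)
    return commands


if __name__ == "__main__":
    # Dry run: print the commands instead of executing them.
    for command in build_stage_commands(EXAMPLE_CONFIG):
        print(" ".join(command))
```

Because the whole chain lives in one JSON file, sharing that file is enough for someone else to rebuild the same sequence of commands, provided the same images and paths are available.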


🧠 BIBSNet: Infant MRI Segmentation

Containerized deep learning application improving performance and accuracy for large-scale neuroimaging pipelines.

Tech: Python • Deep Learning • Containers • HPC
🔗 https://github.com/DCAN-Labs/BIBSnet


📈 Google Sheets Automation Utilities

Google Apps Script tools automating workflow tracking, dynamic link generation, and real-time updates.

Tech: JavaScript • Google Apps Script • Node.js


🧰 Technical Stack

💻 Languages

Python • SQL • Bash • JavaScript • R • MATLAB

📊 Data & Machine Learning

Pandas • NumPy • Statistical Modeling
Predictive Modeling • Causal Inference
Benchmarking • Data Validation

☁️ Infrastructure & HPC

Kubernetes • SLURM • Docker • Singularity
AWS S3 • Ceph • Distributed Compute Systems

📡 Observability & Operations

Grafana • Prometheus • Linux/Unix Systems

🔄 Automation & Engineering

ETL Pipelines • API Integrations
Workflow Automation • CI/CD Concepts
GitHub Actions


📌 Selected Impact

✅ Provisioned and maintained thousands of NVIDIA GPU nodes in production Kubernetes clusters
✅ Built pipelines processing petabyte-scale datasets
✅ Achieved 600× performance improvements in imaging workflows
✅ Reduced runtimes 10–12× through HPC optimization
✅ Designed automation systems reducing manual operational overhead
✅ Led documentation & reproducibility initiatives across multi-team environments


🧭 Areas of Interest

  • Data Engineering
  • Machine Learning Infrastructure
  • Research Engineering
  • Analytics Engineering
  • HPC & Distributed Systems
  • Python Backend & Automation

📫 Connect With Me

Pinned

  1. DCAN-Labs/CABINET Public

    CABINET is a utility that standardizes subsequent container execution. Emphasizing reproducibility, users can share JSONs to exactly replicate how containerized processes follow each other.

    Python 1 1

  2. DCAN-Labs/cdni-brain Public

    Read the Docs site for CDNI collaborators and members

    3 2

  3. DCAN-Labs/SLURM_wrappers Public

    Bash and Python wrappers for SLURM that streamline HPC job submission, AWS S3 data transfers, and disk-based workflows; adaptable to any SLURM-based pipeline.

    Shell 1

  4. DCAN-Labs/BIBSnet Public

    This BIDS App creates an nnU-Net anatomical MRI segmentation and mask using a model trained on infant brains. It can easily be included in other processing pipelines and for circ…

    Python 9 10

  5. job-tracker-apps Public

    Google Apps Script automations for Google Sheets that streamline application tracking: automatically summarize statuses with counts, percentages, and charts, enforce a 6-month rule on outdated entr…

    JavaScript

  6. lead-lab Public

    Research-only lead generation toolkit: manage, enrich, and export qualified sponsorship leads securely to Google Sheets or local CSV, with no outreach or client data exposed.

    Python