M.S. Data Science student @ UT Austin, aiming to deepen my understanding of machine learning methodologies.
I recently graduated from UCLA with a B.S. in Computational Biology (Data Science specialization). Over the past few years, I've developed a passion for applying data science and machine learning to real-world challenges.
Currently, I'm a Master's student in Data Science at UT Austin, where I'm learning to build language models by working with transformers and architectures like GPT-3. I enjoy uncovering patterns in data, and I'm particularly interested in applying machine learning to healthcare to make care more accessible and impactful.
In my spare time, I enjoy cooking, recreating cafe drinks, and playing tennis. Over the past 7+ years, I've developed various coding projects, with a focus on Python and full-stack development. I'm currently working on a cost equity modeler that helps users determine fair pricing for medical services by comparing insurance and non-insurance rates, with synthetic data generation built in.
Python
PyTorch
TensorFlow
Pandas
SQL
scikit-learn
Tableau
Git
React
Built a custom Transformer model that achieved 99.67% accuracy on character frequency classification, then extended it into a full language model with a low perplexity of 6.13 on next-character prediction.
Developed a pipeline using unsupervised machine learning to identify six astrocyte subtypes through spatial clustering of spatial transcriptomic data from the Allen Mouse Brain Atlas, and discovered 104 astrocyte-specific genes through differential expression analysis with Bonferroni correction across 10+ million cells.
I'm currently looking for full-time data science roles and internships for summer 2025. I'd be happy to discuss my experience with you further. Simply shoot me an email or fill out the form below.
jamesfup@gmail.com
+1 (925) 875-8886
CA, United States