Professional Portfolio: Mary (Miet) Loubele
Research and Academic Achievements
My research has focused on addressing critical challenges in model evaluation for advanced machine learning and algorithm development. During my PhD, I developed a comprehensive protocol for constructing effective test sets and ground truth, balancing precision levels with associated costs. This work has garnered significant recognition:
2,814 citations across publications
h-index of 10
i10-index of 10
Initially applied to dental scanners, my protocol optimized bone model quality while minimizing radiation dose. I identified the i-CAT scanner as the most efficient option, balancing image quality and radiation exposure. This success led to broader applications in machine learning and algorithm development.
Industry Experience and Project Highlights
My industry experience spans various projects applying advanced machine learning and algorithm development to real-world problems:
Evaluated algorithm porting accuracy between Oracle SQL and Microsoft SQL Server
Developed multi-level test sentences for personal assistant ML algorithm evaluation
Employed Monte Carlo simulations to assess big data solution performance
Expanded the training data for an LSTM-based (neural net) search solution from 2 to 300 shopping malls
Evaluated ML algorithms at Facebook and Instagram, for both the video feed and classification, to enhance user experience
Built training corpora for use in RAG and fine-tuning of Large Language Models (LLMs)
Performed data science analysis of 25 years of vulnerabilities, presented as "Protection against threats in the real world" at Cisco Offensive Summit 2016
Complex System Upgrades
I have a strong track record in upgrading legacy systems and improving operational efficiency:
Led data engineering framework upgrades for privacy and security compliance
Was part of a team that resolved the "cold start" problem for video distribution for a new product at Facebook during a war room
Optimized a training data cleanup tool, reducing processing time from days to minutes
Facilitated smooth transitions for ML solution upgrades affecting hundreds of developers
Natural Language Processing (NLP) Expertise
With 3 years of specialized NLP experience, I have:
Applied Support Vector Machines for primary classification, increasing precision by 5% using confusion matrix analysis
Led system localization, translating a complete NLP system from English to French
Specialized in active learning techniques for 2 years
Developed rule-based classification systems and extended training data
Leveraged crowdsourcing for data collection
Machine Learning and Classification Systems
I have successfully bootstrapped secondary classification systems and built zero-to-one products:
Developed a contact normalizer based on occupation data
Built a horizontal classification system for creator categorization
Normalized course titles at D2L
Created a system to predict sales call success
Developed classification systems for blog post effectiveness and global feed rating
Built a system to classify SQL queries for interview preparation using generative AI
Cost Optimization and Efficiency Improvements
Throughout my career, I have consistently optimized costs and improved efficiency:
Managed cloud budgets, avoiding excessive costs
Achieved a 75% reduction in my team's data warehouse costs
Led a project to successfully address escalating costs for the Instagram data warehouse
Improved tool efficiency, reducing processing time from days to minutes
Automated manual processes for ML solution precision evaluation
Streamlined retrieval of Salesforce data for data scientists
Generative AI and Content Creation
Developed a corpus of blog posts for a Retrieval Augmented Generation (RAG) system
Built proof-of-concept content creation engines for efficient blog post distribution
Implemented a classification system to assess media feed health, resulting in a 22% increase in followers and over 150k impressions in six months
Conference Talks and Community Leadership
I have actively contributed to the tech community through organizing meetups, giving talks, and engaging in various events:
Meetup Organization
Lead Organizer @ Intersections K∩W Meetup (August 2015 - March 2020)
Grew membership from 120 to over 960 members
Organized more than 40 meetups focused on mathematics, computer science, and data science
Negotiated sponsorships from regional tech companies
Mentored aspiring data scientists in the KW area
Organizer @ Waterloo Data Science and Data Engineering Meetup (November 2017 - March 2020)
Increased membership from 441 to 503 members in two months
Collaborated on organizing monthly meetups
Secured venues and sponsorships
Conference Talks and Presentations
DEML Summit 2024: "Evolution of data engineering interviews over the last 14 years"
Cisco Offensive Summit 2016: "Protection against threats in the real world"
Communitech Panel Discussion: "Practical Applications of Artificial Intelligence"
Annual Women in Data Science Conference 2018: "How do data teams operate in lean startups"
Google Meetup 2018: "Data Pipelines in AI for SAAS applications"
Ryerson University Invited Lecture: "Understanding non-technical skills in data teams in industry"
Super Data Science Podcast: "The Amazing world of Data Science Meetups"
Shopify event (lightning talk and panel discussion): "How to use online data to grow your local community"
Toronto Machine Learning Micro Summit: "Iterative strategies for a neural-net based search solution"
Startup Analytics Podcast: "A Data Engineer Career Path | Startups to FAANG | Academia to Industry"