Probabilistic NLP Models
Large language models trained at scale show emergent intelligent behavior, such as coherent and grammatical structures, cultural knowledge, and abstract reasoning capabilities.
- @ Microsoft Research, I used GPT-3 and Turing-NLG to simulate distributions of responses to psychology experiments.
Adversarial attacks may manipulate the behavior of AI systems to serve a malicious end goal.
- @ The MITRE Corporation, I prototyped a Docker-containerized adversarial attack testing platform and populated a public information resource.
- @ MITRE NLP Lab, I supported research into practical attacks on machine translation using paraphrase.
Cross-Source Information Extraction
Extracting information from documents requires the ability to link events, entities and associated relations across multiple sources.
- @ Indico Data R&D, I worked on deep learning NLP and CV approaches to PDF information extraction.
- @ Olin Satellite Lab, I consolidated multiple possibly contradictory data sources when scraping the FCC's international filings database.
Event Sequence Modeling
With language, voice, and time-series data, each data item depends on the items before or after it.
- @ Olin Microbiology Lab, I characterized time-series data from perturbed and recovering microbial communities using methods from compositional data analysis.
- @ Fidelity R&D, I analyzed distributions of cryptocurrency technical trading indicators over time.
Undergraduate Researcher @ Olin Bioinformatics & Microbiology Lab
January 2021 - January 2023, part-time
Advised by Professor Jean Huang
- Led research on analyzing composition shifts in time-series of cultured, perturbed microbiomes.
- Conducted literature review to find, apply, and assess the limitations of Random Matrix Theory, Compositional Data Analysis, and network analysis approaches.
- Presented poster at Northeastern Microbiologists: Physiology, Ecology, and Taxonomy (NEMPET).
- Led project applying 2D Fourier analysis to bacterial surface images, cleaning and interpreting the results to identify the pattern and shape of surface proteins.
Undergraduate Researcher @ Olin Satellite + Spectrum Technology & Policy Group
September 2021 - Present, part-time
Advised by Professor Whitney Lohmeyer
- Worked on undergraduate and industry research team to identify factors driving value in 5G spectrum auctions.
- Applied statistical tests (ANOVA) and analyzed auction context and mechanics.
- Presented session at Research Conference on Communications, Information and Internet Policy (TPRC).
- Led project to automate web-scraping, PDF-text extraction, and data cleaning of satellite filings from the FCC's International Bureau Filing System, creating data sets to support two new analysis projects and generate yearly review of satellite filings.
Senior Engineering Capstone @ Fidelity Center for Applied Technology
September 2022 - Present, part-time
- Worked on undergraduate and industry research team to build a robust cryptocurrency algorithmic trading analysis and backtesting library.
- Led research into technical trading strategies, indicators, and evaluation methods.
- Designed visualizations to compare distributions of strategy performance.
- Supported 100% test coverage of backtesting library.
Undergrad Research Intern @ Microsoft Research
May - August 2022
Advised by Dr. Adam Kalai (Microsoft Research) and Professor Rosa I. Arriaga (Georgia Institute of Technology)
- Led research on using large language models (GPT-3, Turing-NLG) to simulate demographically-aligned distributions of human behavior on behavioral economics, psycholinguistics, and social psychology experiments, resulting in paper (preprint on arXiv).
- Designed zero-shot prompt methodology and compared predicted distributions to the literature on the Ultimatum Game, Garden Path sentence comprehension, and the Milgram Shock Experiment.
- Designed and ran novel alternatively worded prompts to mitigate risk of models regurgitating training data.
- Working with cognitive psychology and behavioral language model researchers Professor Rosa I. Arriaga (Georgia Institute of Technology) and Professor Michael Kearns (University of Pennsylvania) to evaluate limitations of the method before submitting to a top-tier journal.
SWE Intern @ Indico Data Solutions
May - August 2021
Full-stack software engineering intern advised by Madison May (Co-founder and Machine Learning Architect)
- Improved deep-learning document extraction capabilities.
- As full-stack software engineer supporting a team of four UXD and product interns, prototyped and user-tested a high-fidelity novel React.js GUI for predicting and correcting groups of text extractions.
- In independent research project with R&D team, adapted object detection Faster R-CNN model to classify handwriting on business documents and incorporated methods for alternate pre-training, multi-label tasks, and small object detection.
- Both features were highlighted as two of the top five features of Indico's fifth major release.
SWE Intern @ The MITRE Corporation
September - December 2020
Software engineering intern working on Practical Attacks on Machine Translation using Paraphrase, advised by Dr. John Henderson (Principal Investigator)
- Researched new methods to exploit vulnerabilities in natural language machine learning systems.
- Revived and adapted an academic lab's research code for generating a paraphrase database using the bilingual pivoting technique: debugged it in a new environment, incorporated newer software packages, ran timing experiments to determine hardware needs and Hadoop configuration for resource-intensive computation with 4x more data, and modified the code to run faster for our specific use case.
- Augmented dataset with 900 million segments of parallel text, performed word-alignment, and implemented methods for scrubbing poorly aligned segments.
- Used Logistic Regression for classification on an imbalanced dataset, and improved performance with feature engineering ablation studies.
SWE Intern @ Cumulus Digital Systems
May - August 2020
Software engineering intern on the backend team
- Developed externally facing REST API to let Cumulus's clients interface with Cumulus's system directly.
- Created an Amazon Web Services SNS, Lambda, and DynamoDB webhooks system to handle real-time events.
- Implemented secure API endpoints using Serverless microservice, REST API, Swagger, and AWS CloudFront.
- Implemented request and response validation, automated documentation, and API documentation page.
SWE Intern @ The MITRE Corporation
June - August 2019
Software engineering intern on the Secure Assured Intelligent Learning Systems (SAILS) federally funded research project
- Worked on initial proof-of-concept platform to systematically benchmark machine learning models' security vulnerabilities against adversarial attacks.
- Built gRPC interface for communication between Docker-containerized attacks and models.
- Improved speed of sending batches of images from 17 minutes to 17 seconds.
- Conducted literature review and consolidated information on adversarial attacks and defenses to populate a public education resource.
- Prototype work secured future funding: project now popularly known as Adversarial Threat Landscape for Artificial-Intelligence Systems (MITRE ATLAS).
SWE Intern @ Boston University
July - August 2018
High school researcher in the Software & Application Innovation Lab, advised by Lucy Qin (PhD candidate), Kinan Dak Albab (PhD candidate), and Dr. Andrei Lapets (Principal Investigator)
- Wrote demos with different algorithm implementations for data-oblivious sorting algorithms and set-intersection tasks, calculated throughput and latency, and implemented benchmark tests.
- Contributed to the development of new functionalities that performed 92-95% faster than the existing JIFF functions on common use-cases.
- Presented poster at the Greater Boston Research Opportunities for Women (GROW) research conference.
Using Large Language Models to Simulate Multiple Humans
G. Aher, R. I. Arriaga, and A. T. Kalai.
In preparation for submission to Proceedings of the National Academy of Sciences. ArXiv preprint.
Evaluating the FCC's $10 Billion Gamble: Successfully Accelerating Access to Spectrum in Auction 107
P. Boyalakuntla, G. Aher, P. Post, G. Miner, L. Heinrich, Y. Mao, J. A. Musey, W. Lohmeyer.
Submitted to Journal of Information Policy (JIP). SSRN preprint.
An Analysis and Review of Geostationary Satellite Applications Submitted to the Federal Communication Commission (FCC) From 2000 to
P. Post, K. Fleming, K. Canavan, S. Cho, G. Aher, W. Lohmeyer.
Submitted to Telecommunications Policy.
Effects and Recovery of Carbon Perturbation on Composition of Phototrophic Microbial Community Enriched from a Freshwater Pond
G. Aher, J. Huang.
Demystifying Cryptocurrency Investment Strategies
G. Aher*, N. Faber*, E. Ito-Fisher*, A. Mascillaro*, L. Stein.
Talks & Slides
Research Conference on Communications, Information and Internet Policy (TPRC) (Sep. 2022): Auction 107 (C-Band): Policy and Closing Bid Price Analysis
Microsoft Research, ML Ideas Seminar (Sep. 2022): Using Large Language Models to Simulate Multiple Humans
Technical University of Łódź, PL MathUp Conference (Apr. 2021): Principal Component Analysis Case Study: Facial Recognition
Northeastern Microbiologists: Physiology, Ecology, and Taxonomy (NEMPET) (Jun. 2021): What Factors Affect Microbial Community Composition?
Massachusetts Computer Using Educators (MassCUE) (Sep. 2018): SOARing with Drones in Education
Boston University, Greater Boston Research Opportunities for Women (GROW) (Jul. 2018): Refining Private Set Intersection Under Secure Multi-Party Computation
International Society for Technology in Education (ISTE) (Jun. 2018): Artificial Intelligence, Chatbots, and Amazon Web Services
React.js + DigitalOcean + SQLite + Auth0 Football Pick-Em' site used by 40+ active weekly users
Fullstack React Development
Algorithms and implementations for small-world (local clustering) and scale-free (hubs) graphs
Generating Realistic Graphs
Deep learning object-detection trials on pre-training, multi-label, small & imbalanced targets
Faster R-CNN for Handwriting Detection
Constant time querying, compressing huge index numbers, and bypassing the curse of global updates
Data Structures for Large Scale Information Retrieval
Characterizing repeating protein patterns in bacterial images with a 2D Fourier transform
Fourier Transform Detective Story
Teaching, Leadership, and Academic Service
ENGR3599A-SL Olin College (Instructor Student-Led Course, Spring 2023): Advanced Algorithms
MTH2110 Olin College (Teaching Assistant Head Grader, Fall 2022): Discrete Mathematics
GirlsWhoCode Olin College (Branch Leader, Fall 2022)
Data Science and ML Lunch-and-Learn Olin College (Organizer & Presenter, Fall 2021)
ENGR2510 Olin College (Teaching Assistant, Fall 2022)
Einstein's Workshop Coding & STEM Classes (Teaching Assistant, 2017 - 2019)
Shishu Bharati Indian Language K-8 (Teaching Assistant, 2015 - 2019)
FIRST Lego League Robotics (Mentor, Fall 2018)
Some recent art...
My hobbies include drawing, dance, long-distance running, and playing four instruments :)