Welcome to P-ai!
Apply to Spring 2024 Projects:
Read through summary information to get an introduction to each project
Click on the Full Proposal buttons to view more detailed outlines
Applications are hosted through the button above and will close on Sunday, 2/11 at 11:59pm PST
SWE DIVISION
Industry Project
p-tm41m
Partnered with a Canadian Startup - tm41m: The Metrics For One Matter
Software Engineering and NLP For Automated News
tm41m (The Metrics For One Matter) is a data-driven journalism startup dedicated to using key data measurements to supplement traditional editorial reporting, with the aim of countering the growing distrust of mainstream news and subjective writing. The project will build an automated application that runs scheduled SQL queries against a database of real-world metrics, filters for promising insights, generates written news reports using NLP, and publishes the results to social media for a public audience.
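As a rough illustration of the query, filter, and report steps described above, here is a minimal sketch using only Python's standard library. The `metrics` table, its columns, the 10% threshold, and the sample rows are all assumptions made for this example, and a plain string template stands in for the NLP generation step; the actual project may be structured quite differently.

```python
import sqlite3


def find_insights(conn, min_change_pct=10.0):
    """Return metrics whose latest value moved by at least min_change_pct.

    The table and column names are hypothetical, chosen for illustration.
    """
    rows = conn.execute(
        """
        SELECT name, previous_value, latest_value
        FROM metrics
        WHERE previous_value != 0
        """
    ).fetchall()
    insights = []
    for name, prev, latest in rows:
        change_pct = (latest - prev) / abs(prev) * 100.0
        if abs(change_pct) >= min_change_pct:
            insights.append((name, prev, latest, change_pct))
    return insights


def write_report(insight):
    """Render one insight as a sentence; a template stands in for the NLP step."""
    name, prev, latest, change_pct = insight
    direction = "rose" if change_pct > 0 else "fell"
    return f"{name} {direction} {abs(change_pct):.1f}% (from {prev:g} to {latest:g})."


def demo():
    """Build an in-memory database with made-up rows and produce reports."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE metrics (name TEXT, previous_value REAL, latest_value REAL)"
    )
    conn.executemany(
        "INSERT INTO metrics VALUES (?, ?, ?)",
        [("Unemployment rate", 5.0, 6.0), ("Median rent", 2000.0, 2010.0)],
    )
    return [write_report(i) for i in find_insights(conn)]
```

Calling `demo()` keeps only the metric that changed by more than the threshold. In a scheduled setup, a job runner would call something like `find_insights` on a timer and hand the results to the report and publishing steps.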
JavaScript, Python, Scala
SQL, Pandas, Spark
Postgres, CircleCI, dbt, Scrapy, Flask, ChatGPT, Jekyll, Git
NLP Techniques and LLMs (Large-Language Models)
Member Requirements:
Commit 6+ hours/week and meet with the company liaison during business hours (9-5)
Proficiency in version control, e.g. Git
Knowledge of a scripting language, e.g. JavaScript, Python, or Scala
Knowledge of a querying language or tool, e.g. SQL, Pandas, or Spark
Knowledge of test-driven design and continuous integration/continuous delivery workflows
Knowledge of prompt chaining with NLP and LLM frameworks
p-NXTPlay
Revolutionizing Football Strategy Analysis with AI - Building Computer Vision
Imagine transforming the way we analyze American football by leveraging the power of artificial intelligence. Our project aims to harness advanced AI techniques, particularly computer vision and machine learning, to decode and analyze game footage with unprecedented precision. We're not just watching games; we're dissecting every move, strategy, and formation. Our goal is to build a platform that not only tracks players and ball movement but also understands formations and play strategies in real time. By achieving this, we aim to provide coaches and teams with insights that go beyond human observation, offering a statistical edge that could redefine game preparation and strategy. Join us in this exciting venture to blend AI with America's favorite sport and change the way the game is played and understood.
SWE Subteam
AWS EC2, PostgreSQL, OAuth
Backend: Rust, Actix Web
Frontend: React, Figma
AI Subteams (General AI & Football AI)
PyTorch, NumPy, pandas, YOLOv8, OpenCV, Seaborn
Member Requirements:
Intro CS or Equivalent
Commit 4-5 hours/week
AI Teams
Python comfort + project/class work in AI preferred
No football knowledge needed for General AI; football knowledge recommended for Football AI
SWE Team
Have experience in at least one of the following:
Design or web design
React or other JS framework
Rust
No Football knowledge needed!
p-mLog
Developing a Full-Stack Mobile Application in Flutter:
Supporting University Research in Social, Language and Emotional Development in Preschool Classrooms
In this project, we leverage Flutter's rapid development cycle to create an app that facilitates observational data collection for psychology researchers at the University of Miami. The goal of this project is to have a significant impact (2x) on the quality and quantity of data collected in preschool classrooms. The collected data is used to train machine learning models that aid research projects on topics including preschool interactions, autism assessment, face-to-face interaction, and attachment.
An emphasis of this project is learning and developing skills in industry-standard tech stacks like Flutter and Dart!
Flutter (Frontend development kit)
Dart (Backend typed programming language)
Google Firebase
Figma, GitHub
Member Requirements:
Minimum course requirement: CS62
Mobile app development experience a plus, but not required
Commit 4-5 hours/week
p-Dormlife
Building a Website to Centralize All the Housing Needs of 5C Students
Tired of the housing system? Picture this: room descriptions, maps, virtual room tours, and filters to find your dream space all in one place. Share your dorm stories on our reviews board and report issues effortlessly. Let's make housing decisions less bothersome!
We're looking for people excited to build websites on both the frontend and backend!
In this project we will work with React and JavaScript to make an interactive website for students.
We will use graphics libraries to create a new interactive map.
Member Requirements:
Intro CS or Equivalent
Experience building/developing a website
Front end/React experience
Commit 4-5 hours/week
p-5cEvents
AI-Powered Event Planner for 5Cs Students
p-5cEvents is a web app designed to organize on-campus events and student calendars, offering comprehensive event scheduling assistance including maps and Google Calendar integration. We will use imaplib to scrape Outlook/Gmail email content from various servers across campus to populate the web app, and users can also submit event details directly on the website. Beyond scheduling, p-5cEvents enhances the user experience with an AI-driven recommendation system that employs natural language processing to tailor event suggestions to individual preferences and historical attendance.
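To make the email-to-event idea concrete, here is a minimal sketch of the parsing half, using Python's standard `email` package. It assumes the raw message bytes have already been retrieved (for instance via an `imaplib` fetch of the `RFC822` part); the `email_to_event` helper, the output field names, and the sample message are hypothetical and only for illustration.

```python
from email import message_from_bytes
from email.policy import default
from email.utils import parsedate_to_datetime


def email_to_event(raw_bytes):
    """Turn one raw RFC 822 message into a minimal event record.

    Field names in the returned dict are assumptions for this sketch.
    """
    msg = message_from_bytes(raw_bytes, policy=default)
    # Prefer the plain-text part; returns None if the message has none.
    body = msg.get_body(preferencelist=("plain",))
    return {
        "title": str(msg["Subject"]),
        "organizer": str(msg["From"]),
        "sent_at": parsedate_to_datetime(str(msg["Date"])),
        "description": body.get_content().strip() if body else "",
    }


# Made-up sample message, standing in for bytes fetched from a mail server.
SAMPLE = (
    b"From: events@example.edu\r\n"
    b"Subject: Board Game Night\r\n"
    b"Date: Fri, 09 Feb 2024 18:00:00 -0800\r\n"
    b"Content-Type: text/plain; charset=utf-8\r\n"
    b"\r\n"
    b"Join us in the lounge at 7pm.\r\n"
)

event = email_to_event(SAMPLE)
```

Extracting the actual event time and location from the message body would need further parsing or an NLP step, which this sketch does not attempt.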
Backend: Python & Flask for API endpoints
Frontend: JavaScript, CSS & HTML, React.js/Node.js/Next.js frameworks
Machine learning: PyTorch, TensorFlow, Natural language processing, Supervised vs unsupervised learning
Containerization: Docker, Kubernetes
Databases: PostgreSQL
Scaling & Deployment: AWS
Version Control: Git
Unit Testing: Pytest
Member Requirements:
Some coding experience with at least one: Python, Java, JavaScript, SQL, HTML & CSS
A passion for learning new technologies and frameworks
Self-initiative, ability to commit around 4-5 hrs a week. For applicants with minimal experience, the initiative to take time out to self-learn the tech stack will be essential
p-Newspaper
Delivering News Users Are Interested in Using ChatGPT API and Web Scraping
p-Newspaper aims to supply users with a personalized stream of news related to topics of their interest. We'll source news from all the biggest free news sources and deliver the best articles straight to the user, eliminating the need for them to spend time searching for news themselves.
Learn to use ChatGPT 4 API! (very useful and cool)
Learn web scraping!
Learn web development!
Build software engineering skills!
Tech stack: Python, ChatGPT 4 API, JavaScript, Newspaper3k, React, Flask
Prior experience is great, but only intro CS or equivalent experience is required
Member Requirements:
Intro CS or Equivalent
Willingness to learn 🙂😀🎉
Commit 4-5 hours/week
p-Link
AI-Powered Platform for 5Cs Student-Alumni Connection
In this project, we aim to bridge the gap between 5C students and alumni through an AI-powered outreach platform. By leveraging natural language processing, this platform enables users to effortlessly search and connect with alumni sharing similar passions, school involvements, and career trajectories, using GPT-4 and filtered vector searches. Our system overcomes the common barriers of cold emailing and LinkedIn messaging, offering instant, relevant search results using natural language queries that can be refined for precision. Our platform will generate AI-written profile summaries and personalized outreach messages, and we aspire to significantly enhance networking opportunities and foster a deeper sense of community within our consortium’s network.
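For readers unfamiliar with the term, a filtered vector search first narrows candidates by metadata and then ranks the remainder by embedding similarity. The sketch below shows the idea in plain Python; the profile records, the `field` metadata key, and the three-dimensional toy vectors are all made up for illustration, and a real system would rely on a vector database and model-generated embeddings rather than this in-memory loop.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def filtered_search(query_vec, profiles, filters, top_k=3):
    """Keep profiles matching every metadata filter, then rank by similarity."""
    candidates = [
        p for p in profiles
        if all(p["meta"].get(key) == value for key, value in filters.items())
    ]
    ranked = sorted(
        candidates,
        key=lambda p: cosine(query_vec, p["vector"]),
        reverse=True,
    )
    return [p["name"] for p in ranked[:top_k]]


# Toy data: hypothetical alumni with hand-written 3-dimensional "embeddings".
PROFILES = [
    {"name": "Alum A", "vector": [0.9, 0.1, 0.0], "meta": {"field": "software"}},
    {"name": "Alum B", "vector": [0.1, 0.9, 0.0], "meta": {"field": "software"}},
    {"name": "Alum C", "vector": [0.9, 0.1, 0.0], "meta": {"field": "finance"}},
]

results = filtered_search([1.0, 0.0, 0.0], PROFILES, {"field": "software"})
```

Here "Alum C" is excluded by the metadata filter even though its vector matches the query as well as "Alum A" does, which is the point of filtering before ranking.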
Prototyping: Figma
Scraping: Python, Selenium
Authentication and DB: Supabase
Frontend: Next.js, TypeScript
Backend: Flask, Python
Services: OpenAI, Pinecone
Hosting: Heroku
Member Requirements:
Preferably, students comfortable working with APIs and DBs
Students need to have taken at least one CS intro class!
Python/TypeScript/JavaScript are nice-to-haves
Web Scraping experience is a plus
Commit 4-5 hours/week
Lots of curiosity for learning new tech + positive energy! :)
DS/AI DIVISION
p-ollution
Quantifying the Health Impacts of Air Pollution
Climate change meets tech meets public health. Air pollution is a serious problem in many parts of the world whose damage is worsened by climate change. P-ollution draws from publicly available data to quantify the damages caused by air pollutants in Ontario, CA. This project will involve data wrangling, exploratory data analysis, time series analysis, and a data visualization dashboard. The aim is to analyze the relationship between air pollution and human health and to communicate its impact in a digestible manner.
Python (NumPy, pandas), R, matplotlib, ARIMA, LSTM
GeoPandas, PowerBI, Folium, Streamlit
Member Requirements:
Intro Python/R or equivalent
Data science/Machine learning skills are a huge plus but not strictly required
Skills/experience in Web design/Tableau/Power BI is a plus
Experience with environmental policy and/or advocacy, environmental justice, web design, visual/written communication, etc. is a plus. Any other skill or experience is a plus if the candidate can demonstrate during the interview how it can contribute to p-ollution
Commit 4-5 hours/week
p-Stars
Unsupervised Clustering of Variable Stars
The cosmos is alive with change: supernovas, gamma ray bursts, pulsars. We aim to unlock the secrets of the universe by developing novel data processing and machine learning methods that take advantage of the large amounts of data astronomers have collected on variable stars, whose brightness varies noticeably over time. We have previously built a supervised algorithm with 85% accuracy. This new model will be unsupervised, in that the algorithm must learn to discern the unique types of stars without explicit output classes it’s trying to match. Successful deployment on a well-studied dataset will give us the opportunity to study contact binary systems, which are an active area of astronomy research.
Construct an unsupervised clustering algorithm to classify types and sub-types of variable stars.
We will begin with the labeled dataset that we trained our supervised model on, since this gives us a positive control; if successful, we will study novel datasets of great use to astronomers, including contact binary systems. Thus there is a real, possibly publishable, research opportunity here.
Two primary approaches: (1) statistical and other astronomical indices to process time-series brightness data for stars (lightcurves), which requires developing a new metric for this space; (2) a tree model inspired by large language models, using slices of lightcurves as tokens.
A lot of the code to process the lightcurves will be custom written (the data structures), but we will take advantage of the code we have already written for processing, as well as the package 'sklearn' and others, to deploy a variety of existing clustering algorithms (k-medoids, DBSCAN, OPTICS) once we can represent our data in a meaningful way.
Coding to be done in Python.
No prior coding or astrophysics experience required; main requirement is a love for learning and a willingness to try.
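To give a feel for the first approach listed above, here is a small, self-contained sketch: each lightcurve is reduced to two simple statistical indices, and the resulting points are grouped with a basic k-medoids loop. This is an assumption-laden toy, not the project's method: the two features, the deterministic initialization, the hand-written `k_medoids` function, and the four made-up lightcurves are all chosen for illustration, and in practice a library implementation would likely replace the loop.

```python
import statistics


def features(lightcurve):
    """Summarize a brightness series as (amplitude, standard deviation)."""
    return (max(lightcurve) - min(lightcurve), statistics.pstdev(lightcurve))


def distance(a, b):
    """Euclidean distance between two feature tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def k_medoids(points, k, max_iter=20):
    """Simple alternating k-medoids: assign points, then re-pick each medoid."""
    medoids = list(range(k))  # deterministic start: the first k points
    labels = []
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest medoid.
        labels = [
            min(range(k), key=lambda c: distance(p, points[medoids[c]]))
            for p in points
        ]
        # Update step: in each cluster, pick the member with the smallest
        # total distance to the other members.
        new_medoids = []
        for c in range(k):
            members = [i for i, lab in enumerate(labels) if lab == c]
            if not members:  # keep the old medoid if a cluster empties out
                new_medoids.append(medoids[c])
                continue
            new_medoids.append(
                min(
                    members,
                    key=lambda i: sum(
                        distance(points[i], points[j]) for j in members
                    ),
                )
            )
        if new_medoids == medoids:
            break
        medoids = new_medoids
    return labels


# Made-up brightness series: two nearly constant stars, two strongly varying.
LIGHTCURVES = [
    [10.0, 10.1, 10.0, 10.1],
    [10.0, 12.0, 10.0, 12.0],
    [12.0, 12.1, 12.0, 12.1],
    [11.0, 13.0, 11.0, 13.0],
]

labels = k_medoids([features(lc) for lc in LIGHTCURVES], k=2)
```

With these toy inputs, the quiet stars and the strongly varying stars end up in separate clusters. Real lightcurves are irregularly sampled and noisy, so the feature design is where most of the work would lie.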