The Data Standard Presents: 25 Under 25
Meet The 2020 Class of Top Data Science Contributors and Thought-Leaders
At The Data Standard, we are continually compiling data on talent and technologies. We want to highlight and acknowledge some of the fine work done by undergraduates and recent college graduates in the data science field.
We looked at hundreds of data scientists in the tech, retail, finance, and transportation sectors, with backgrounds in statistics, physics, computer science, and other fields. Our list is based on the quality of work these young minds have accomplished before the age of 25.
The Data Standard proudly announces 2020’s 25 Under 25 List and looks forward to seeing what these brilliant minds will do next in their careers. We know that they will continue to build great solutions that will have a meaningful impact on our lives.
Listed alphabetically, here are 25 rising innovators in the data science community:
Advitya just finished up his internship at HP, working as a data scientist on their Global Pricing Analytics team. He developed a tool that automates pre-processing checks on HP’s pricing and sales datasets and flags bad feature combinations for the pricing models that are refreshed quarterly for all of HP’s products globally. This involved formulating a ranking system, with metrics that identify the worst feature combinations for pricing models, and providing insights, driven by declarative configurations, in the form of warning reports, heatmaps, and time-series plots for individual feature drivers. His advice for job searching is to embrace rejections as part of the learning process, never give up on your efforts, and always seek to learn more!
Ani is currently a Quality Engineering Intern at ServiceNow. He is working within the Machine Learning team to make sure that they continue to ship quality products. His focus is on automating tests for upgrades between versions of the product, making sure that the relevant functionality works in both the previous version and the newer version after the upgrade process. This is important because it reduces the need for manual tests while increasing productivity within the team.
This past summer Ayush worked virtually as a Software Engineering intern at Cisco. He worked with the Blade Storage team on Redfish API testing and automation scripts to efficiently interact with Cisco’s data center infrastructure. He enjoyed the work he did at Cisco because he got to experience industry practices and culture, which gave him insight into the applications of data science in software engineering. In addition to his work at Cisco, he has worked on several side projects through the Qualcomm Learning Academy, using ML to identify key predictors of contraceptive usage among Indian women.
Catherine Tao is an intern and executive producer at The Data Standard, where she develops insightful summary articles from The Data Standard Audio Experience Podcast episodes. She is passionate about networking and connecting with others to help foster the data science ecosystem. She just finished organizing a virtual event on August 25th, where major data scientists from NASA, Microsoft, Samsung, and more came together for a workshop and community networking event. To learn more about upcoming events and hear from inspiring data scientists, check out the website and podcast.
Darren is currently working for NASA as a Cybersecurity Intern on JPL’s cybersecurity team. JPL space missions require ground support with complex technologies to ensure successful missions, and these systems are always at risk of cyberattacks. To proactively prevent cyberattacks, JPL created software that uses graph-based strategies to predict potential attack paths through the systems and discover potential vulnerabilities that can be patched. Darren’s job is to explore graph database options that scale much better and could ultimately replace the current database. This process involves researching popular graph databases, down-selecting, creating arbitrary graphs of up to 10 million nodes and 50 million edges, benchmarking basic graph algorithms between the two databases, and implementing more specific algorithms currently used in the JPL software for further comparisons.
Ding is currently a Data Science Master’s student at the University of San Francisco with a Computer Science undergraduate degree from the University of Southern California. She just finished up a Data Science internship with Mozilla, where she worked on business models and predictive ad-click models to analyze Firefox user data and increase ad revenue. She also delivered reports to management teams to support data-driven business decisions.
Enrique is currently working as a Transportation Modeler at SANDAG. He is involved in researching the Activity-Based Model developed by SANDAG to help model transportation and predict its impact on the entire San Diego region. He also works on developments for future releases of the model that will automate the assessment of the quality of the model’s inputs and the summarization of its various outputs.
Gregory is currently a Data Science intern at Tesla. Using their database system, he runs analyses on large datasets and builds dashboards that present clients with the most relevant and important information. His tip for job searching is that networking is key: you never know whom you are going to run into.
Harshi is a Software Engineering Intern at General Atomics-ASI. Her job is to add functionality for the camera and make updates to ensure that data is sent properly from the ground to the plane and vice versa. Additionally, she works to eliminate possible data loss by fixing parts of the code and improving code readability.
Imran is a Computer Science major at the University of California San Diego (UCSD) and is about to finish his Software Engineering internship at Apple. His main project focused on cloud security and on using data collected over time to analyze and improve system processes and performance. This also included mining data logs for information on users and cloud vulnerability. Imran advises aspiring data scientists to make the most of the opportunities they have, because everything is a learning experience.
Ismail is a 2nd-year Machine Learning & Data Science Master’s student at UCSD. He has also been working as a teaching assistant for the Data Science Department at UCSD. Additionally, he worked as a data scientist in the Supply-Chain industry for 2 years where he focused on demand forecasting and inventory optimization. Currently, he works as a Data Science Intern with the Qualcomm IT team to develop algorithms for the detection of User Behavior Anomalies in data storage systems.
Jared is a Data Science Intern at IBM, where he works on a publish/subscribe messaging service. The service was created using Apache Kafka hosted inside a Kubernetes cluster on IBM’s cloud. The goal was to allow downstream consumers to subscribe to specific data streams (e.g., data for a specific transaction type above a specific monetary value) and push the data to the end user in real time.
Julie just finished up her Software Engineering Internship at Samsung in South Korea. She is a rising junior at Columbia majoring in Computer Science. During her 8-week internship, she conducted research on state-of-the-art algorithms in natural language, vision, and machine learning. Her team was tasked with developing a personalized gallery search algorithm for an Android app that conducts ranked image retrieval via spoken commands using bottom-up attention and stacked cross attention. Her advice to young people who want to “break into data science” is to not stress about nailing the perfect job. There are plenty of productive ways to make up for limited job experience, such as campus research, side projects, and certifications.
For the past 6 months, Jordan has been working as a Data Science Intern at Intel, uncovering insights about how people use their PCs and building foundational knowledge to influence better thermal management solutions. He also tutors for the UCSD Engineering Department, helping instruct a hands-on engineering class (ECE 196) that introduces engineering students to machine learning and signal processing through a project that incorporates both hardware and software. As a side hobby over the last couple of years, he has also been building a machine learning sports model to predict the performance of NBA and NFL players. For future work, he is interested in using the power of data and data science to form insights that improve people’s lives.
Kaushik works at the San Diego Supercomputer Center (SDSC) at UCSD as a Bioinformatics and Spatial Data Science Researcher. He works on a team with fellow undergraduate researchers Johnny (Jiaxi) Lei and Eric Yu as well as Prof. Ilya Zaslavsky (Director of Spatial Analytics Lab) and Peter Rose (Director of Structural Bioinformatics Lab, SDSC). They are developing agent-based modeling systems for transmission and exposure levels of COVID-19 in small-scale environments in San Diego County.
He is also a co-author, with Eric Yu and Dr. Ilya Zaslavsky (Director, Spatial Information Systems Lab, San Diego Supercomputer Center), of course textbooks for the Halıcıoğlu Data Science Institute, UC San Diego (DSC-170: Spatial Data Science and Applications).
Last week marked the end of Leena’s 7-week internship at Bloomberg LP. As a Global Data intern on the Content Acquisition and Business Management team, she was able to develop her technical and product management skills as well as her knowledge of financial markets. She shares that although the internship was virtual, it far exceeded her expectations, and she encourages others to look into remote work.
Marielle currently works as a Machine Learning Engineer at Nike. She focuses mostly on time-series analysis and classification with ML on embedded systems. She has also worked on analytics for the Nike Sports Research Lab in the Running category, with datasets they collect in the labs and in the Nike Run Club app.
Nabi was a Software Engineer Intern at Seismic last year. He worked on building a platform for monitoring the health of containerized applications on the cloud. The platform was deployed on Azure Kubernetes using Jenkins, as a two-container Docker app, to monitor over 400 containerized apps. He also collaborated on a .NET library for unit and integration tests in pre-deployment pipeline testing, published as a NuGet package, which saved hours of work for over 150 engineers across many teams. He has returned this fall for a full-time position at Seismic as a Software Engineer.
Paavas just finished up his 12-week internship at Facebook as a Software Engineer. He worked with the Metrics Platform team to create an internal portal for the metrics team called “Metrics Reporting Explorer”. His main takeaways were understanding the workflow, code integration, and getting real-world experience with efficiently deploying tools.
Rujuta just finished up his Ads Engineering Internship at Netflix. He worked on building a backend application that automatically creates advertising videos based on configuration. He also integrated with a number of other services to fetch various ad components and used technologies like gRPC and protobufs for cross-service communication. Additionally, he used AWS cloud infrastructure to develop and enable continuous integration to build, test, and deploy automatically. This allowed him to conduct integration and unit testing to ensure defect-free delivery of product functionality.
Sid Jain is a recent UCSD graduate in Computer Science, specifically interested in software development and working with big data. This past year he worked as an Analytics and Machine Intelligence Intern at Raytheon, where he explored state-of-the-art deep neural networks to perform language translation. He also worked on cross-lingual information retrieval and summarization at Raytheon BBN Technologies, as part of IARPA’s MATERIAL program.
Sharmi is a Software Engineering Intern at American Express, where she designed, modeled, developed, and deployed to production a machine learning solution to an estimation problem that is used by consumers of the Amex data platform. She particularly focused on using terabytes of internal data to apply machine learning solutions to colleague-related issues and forecasting to simplify complex decision making. She shared that the machine learning process is a grueling one, with many levels of data preparation and lots of testing, but to her, it was incredibly satisfying to build something that benefits her colleagues at American Express.
Sravya is an undergraduate at the University of California San Diego studying Data Science. She is currently working as a Data Engineer Intern at Facebook. Additionally, Sravya has been an active board member of the Data Science Student Society at UCSD and will be President this upcoming year.
Srilekha just started working as a Software Engineer at Dell. She also has experience with machine learning and deep learning through other internships at Redpine Signals, Meeam Technologies, PayPal IDC, and HyperVerge. At Dell, she is currently working as a full-stack developer using React and ASP.NET. She shared that her internship experience helped her apply the theoretical knowledge from her coursework during her bachelor's degree in technology from SRM University and eventually inspired her thesis on natural language processing.
Sruthi is an intern on the SAS Model Manager team. She works on building models for assessing SAS’s model manager tools from the analytical lifecycle standpoint. She looks at a variety of factors, including data size limits (can they use higher-dimensional data?) and model type: which models work best for the platform and which are efficient when re-training.
As always, if you have any questions or comments, feel free to leave your feedback below, or you can always reach Stephanie on LinkedIn or at www.datastandard.io. Till then, see you in the next post!
About the Author:
Stephanie Moore is a Data Scientist at The Data Standard, the premier user community for Data Science, AI, ML, and Cyber Security thought-leaders. Stephanie’s background is in applied statistics, data wrangling, and data visualization. Prior to working at The Data Standard, Stephanie graduated from the University of California San Diego as a scholar-athlete and Data Science major. She is passionate about turning complex data into actionable insights through analysis and storytelling, with the hope of using her skills to create innovative solutions with a worldwide impact.