Meeting of Bay Area Women in Machine Learning and Data Science

The group, Bay Area Women in Machine Learning and Data Science, met on Wednesday night (1/28/15) on campus to connect with graduate students who are interested in data science. After a brief introduction to the group, we got to hear from three women who took different paths in data science.

Dr. Laura Waller got her PhD in electrical engineering at MIT. She is now an assistant professor of electrical engineering and computer science (EECS) here at Berkeley.

Dr. Marian Farah got her PhD in statistics at UC Santa Cruz, and did a post-doc in biostatistics at Cambridge. She now works as a quantitative researcher at Climate Corp.

Katrina Glaeser, a UC Berkeley alumna, got her MA in mathematics at UC Davis. After working at Yammer for a year, she now works as a product analyst at Pandora.

All the speakers clearly love their jobs, which was very encouraging. One thing I also learned from hearing about their jobs is that data science is a very large field, and can refer to quite different kinds of work. All three of them work in data science, and yet they work on vastly different problems.

After they each introduced themselves and told us a bit about their background, we got to ask them questions about how each of them came to do what they do. I did my best to write down answers:

Q: What are the positives and negatives of your job?

-Katrina said that having weekends off is nice, and working late only happens by choice.
-Marian said she loves her work, and the data she gets to work with. But a negative is that she doesn’t get to publish it.
-Laura (who stayed in academia) said that a positive is that she gets to work on whatever she wants, but two negatives are the amount of bureaucracy, and that she doesn’t get weekends off.

Q: What kinds of machine learning techniques do you use?

-Katrina said in her job at Yammer, they tried using random forest models for a particular problem, but that it didn’t work out so well. She said the most important thing to know about machine learning is “when not to use it.”
-Marian said she uses a bit of everything, and that she does a lot of model validation. She also gave the advice that when you’re trying to decide on what method to use for some problem, it’s worth mining the literature for what has been done before. She pointed out that it’s rare to be dealing with a problem that has never been looked at before. And of course, as PhDs we know very well how to mine the literature!

Q: Is it true that there is more variety in projects in industry?

-Laura said that in academia she works on a lot of different projects.
-Marian agreed, saying that when she worked in academia she worked on a lot of things at a time. She said that she now works on one project for about three months and then switches to a new one. She also added that there are fewer hard deadlines in data science than in, say, engineering, because a data scientist is not usually working directly on the product. No hard deadlines sounds quite a bit like academia.
-Katrina said that she works on a lot of short-term projects. She said she might be working on three different projects in the space of an hour.

Q: What is the interviewing process like?

-Laura said she only interviewed for academic positions, so they were all academic job interviews.
-Katrina said for her first job, she cast a wide net, and it was all a whirlwind. She said the interview for her first job at Yammer was very different from her interview for her current job at Pandora. She added that the technical part of the interview was a mix of hard math problems and software questions.
-Marian said that the interview varied depending on the company. One company was like an academic interview. At another company, she gave a talk, and then had an interview during which she had to answer a lot of technical questions; for example, she had to explain the meaning of various statistical concepts, devise statistical models, and then there were also some behavioral questions like “Where do you want to be in 5 years?”

Q: Do you have any tips for negotiating?

Everyone said more or less the same thing: always ask for more money. They added that it helps you feel confident in doing so if you’ve done research on what the salary for the position should be. It’s easier to ask if you have justification. They recommended a site for getting salary information, which also posts people’s interview experiences. However, be forewarned that there are a lot of grumpy interviewees who post their experiences, while the contented ones usually don’t.

Q: What led to your choice, when choosing between academia and industry?

-Laura said she loved academia, so she never really considered leaving. She wanted to be able to pursue topics just because they are interesting, and not necessarily because they’re useful.
-Katrina said that she knew it was time to leave when she realized that the motivation to continue wasn’t enough. She knew she didn’t want to be a professor.
-Marian said that she considered staying in academia, but decided to leave for lifestyle reasons. The salary in industry is higher, and she wanted to live in San Francisco, near her family.

Q: What are some important skills for a data scientist to have?

-Katrina: SQL and Python are good, but know at least one programming language
-Marian: R and Python
-Laura: Besides technical skills, since you’re always working with people, you need to be able to communicate

Katrina also gave a couple of good job-searching tips that I liked. She said “People writing job descriptions don’t always know what the job is. Most of the jobs posted are a lie” – in other words, the job description may seem like it is for something specific, but it isn’t necessarily, and also the job may or may not exist. Sounds confusing, but at least we don’t have to feel so bad if we get rejected? Another thing I liked: she said that her motto when job searching was “Try to get rejected at least once a day.”