Story # 1
Jasmin is a data scientist who works at a tech company in a product-oriented team. She likes to work on designing metrics and urges everyone in her team to reason about what they optimize for from a product perspective to link it to the models she builds. Everyone says that she asks good questions and she is tenacious.
One day, a manager (with no data science background) came to her to ask for some charts showing the “improvement” in the model/application performance, along with relevant metrics, to use in a presentation at a company-wide event. Jasmin explained that there were issues with the current measurement methods and clarified the caveats in communicating this. She gave several proposals on what to communicate and how. The manager said that he’d just like to show any “good trend”! She argued that they should make sure they shared something accurate because it would become a benchmark to which future results would be compared. The manager seemed to have a different agenda, which he revealed by saying, “I don’t think people will remember these numbers months from now to compare to, so I am not worried about this point!” At this point, an engineering lead who was close to the manager volunteered to provide some numbers (inaccurate for sure). Jasmin wasn’t invited to check the final draft of the presentation, nor was her input taken into consideration. She just watched it as a member of the audience and noticed the irrelevant numbers, the line charts with upward trends and the wrong legends. The presentation ended with applause from the audience, who mostly didn’t pay attention to the numbers and just wanted to move on to the next segment.
Jasmin was disturbed by the whole show and was even more frustrated with how the manager’s agenda was explicitly communicated in words and actions, as if he had said, “I don’t want to communicate the truth, I want to show off with numbers that people won’t remember”!
Story # 2
Sara is a machine learning engineer. When she joined an early-stage data science team at a mid-size tech company, she was supposed to inspect and enhance their simple recommendation/scoring systems. The products were in the area of education, training, jobs, etc., which meant humans were involved, or to be precise, their profiles and skills were to be scored.
In one of the discussions, the CPO said that they should consider age as one of the features for their rule-based algorithm or any future advanced ones. Sara argued that she didn’t see age as relevant to the problem and tried to question the logic behind this proposal. The CPO explained that he had worked in the field long enough to know what the clients wanted and which features impacted their decisions, so he wanted to automate this. Sara clarified the issues with this logic and highlighted the possibility of causing harm by automating such biases. The CPO insisted that he knew the rules to start with.
Sara realized at this point that it would be hard to get along or do meaningful work there under these conditions. But she was curious to know, given that the CPO was so confident about how things worked, how he would evaluate whether things worked fine or not. What was the metric? She asked him and he said, “I simply know it will work”! It didn’t take long before Sara moved somewhere else, with a deep sense of disappointment and more caution about picking a new job.
Story # 3
Ed was a solution architect/consultant who worked for several large consultancy firms. Most of the time he was working on projects for governmental entities. On many occasions he started to suspect that his work was enabling surveillance or violating privacy. It was hard to talk about this, but he had this recurring feeling of unease. It took him years to be sure that he didn’t want to be part of this, and he left the whole industry. Years later he wrote about this journey, and he mentioned these times of doubt regretfully, saying, “I have to say that at the time I tamped down my unease”.
These are real stories I witnessed, heard or read about. I am sure there are hundreds of similar stories lived by data practitioners, whatever their titles are; stories that are hidden behind the shiny image of the data science world and the exaggeration about the magical aspects of it. I know that we read articles all the time about privacy, surveillance and fairness, and discuss what can be done about them. I personally talk about this at conferences, at meetups and with my colleagues. But what I rarely read is the experience of those who are involved in this world: their real struggles, their doubts, their success or failure in steering things in the right direction, the power dynamics that impact them, and how they navigate all of this!
Why do I rarely read such stories?
I wondered why I don’t frequently read such stories with more details about the practitioners involved rather than the application, threat and impact. I think there are several reasons, some of which could be that:
- many data practitioners (and others) are bound by non-disclosure agreements (NDAs) or by a personal commitment to keep work-related details confidential. It’s hard to tell the whole story without exposing more details about the workplace or specific individuals.
- some might have lived traumatizing or disappointing experiences that they don’t want to talk about again.
- others fear that if they talk about the mess in the product development cycle, privacy issues, manipulation of metrics, etc., someone might accuse them of being inexperienced or lacking business skills because they couldn’t push to improve things.
- some might just want to keep up a fake positive image and talk about the excitement, the great products and the great adventures they went through, because that’s how they would appeal to the next employer. Employers like to see positive candidates who talk with a lot of enthusiasm about challenges, opportunities and the use of state-of-the-art methods (even if they are not needed).
- I might just be less exposed to communities or groups that share these stories and that’s why I don’t see them around.
What type of stories am I missing?
Someone could argue that we do read stories in this context from time to time, when people announce that they have stopped working on a certain research area. One example is Joe Redmon, the creator of YOLO, who stopped doing computer vision research and explained the reasoning behind this decision, which was more of a personal experience and reflection.
I stopped doing CV research because I saw the impact my work was having. I loved the work but the military applications and privacy concerns eventually became impossible to ignore. https://t.co/DMa6evaQZr
- Joe Redmon (@pjreddie) February 20, 2020
Also sometimes practitioners reveal privacy or fairness issues with crucial applications and start to raise awareness about these aspects. Others announce their decision to not build something they could build.
Overheard in lab meeting: “we can’t build that, I don’t wanna end up on black mirror”
- 🎃KYLE BUT SPOOKY👻 (@KyleMorgenstein) September 15, 2020
But I am more interested in the daily struggles of data practitioners, the gray areas, the little things that build up, the scenes before the final decision. For instance:
How often do they face a situation like Jasmin’s? Do they adapt? Do they try to change things several times before quitting? Do they convince themselves that it is out of their control and move on? Do they seek support from other peers? What if everyone said it wasn’t a big deal? How do they cope and ignore the fact that a manager exists who might do this again and present their work in the wrong way for his own purposes?
What about Sara’s position? How often does someone argue against using a certain feature or bring the bias topic up? What are the common reactions from others who are involved in the product development cycle? Do they mock the person who raised it, or do they engage in the discussion? What if the others insist and ignore the concern? Does the practitioner surrender? Keep trying to convince them of the potential harm? Simply quit and move on?
What about the case when a data practitioner faces several negative experiences? How does this impact their motivation and ability to work on new products or trust new teams?
All these are little bits of stories that anyone might be part of, regardless of experience level or title. My expectation is that these situations are far more common than we imagine, but we don’t know about them. So when data practitioners face a tough situation, they feel alienated and confused. They might think they are the only ones having such a hard time because everyone else is sharing the good, exciting stories. They might also feel disturbed by the gap between the image others have of them (doing rigorous work) and the reality (submitting to the power dynamics and being part of something they are not satisfied with). All this can be demoralizing to the extent that one might turn into someone who doesn’t care, or leave the whole domain to find a less disturbing job.
Why does sharing more stories matter?
I believe it is easy to say what people can and should do when they find themselves in a situation where things seem shady or heading in the wrong direction. But in practice, things are more complicated and take more time to evolve and become clearer. People are highly likely to go through moral confusion, self-doubt and stress while thinking about the best action to take. Being unable to talk about all this with others or get inspired by similar stories adds an extra layer of stress. Sometimes it feels like a scriptless situation: something with no precedent in one’s mind, so one doesn’t know how to deal with it or where to seek support without fearing the consequences.
Stories show us similar experiences of people with whom we have a lot in common. They can be a source of inspiration, consolation or guidance. They also help start conversations about the humans behind the products. So instead of talking all the time about AI screwing up stuff, we can see how the designers, product owners, data practitioners in the background are the ones who screw up or fix stuff. And how in similar situations, one can contribute to fixing stuff, walk away, or take other actions!
Sometimes I wonder whether all those people who say on LinkedIn/Twitter how thrilled, excited and delighted they are to take data science roles really experience these feelings in all of their jobs! Does everything go smoothly? It could only seem that way because those who are worried, anxious and cautious do not write about it. So I hope to read more true stories reflecting real experiences, with all the joy and sorrow one can go through in the data science world!