Automating Prejudice: Bias in Artificial Intelligence and Tech

Grace Madden
Social Policy

Artificial intelligence, or AI for short, has allowed data to be processed and utilized like never before. An up-and-coming computer science discipline, AI leverages data and machine learning to mimic human decisions. The technology's potential is massive, promising to revolutionize the way we live, from creating safer roads with driverless cars to offering faster, more reliable medical diagnoses. In the meantime, AI already exists in our Google search bars, hiring practices, and even our criminal justice system. While AI developers paint pictures of an efficient future, researchers and industry experts urge caution and thoughtful attention to data bias, for fear AI may do more harm than good by perpetuating the existing prejudices of the systems it aims to improve.

“The most important aspect of data is not what you do with the data, it’s what data you use.” - Andrew Mercer, senior methodologist at the Pew Research Center

There is a common misconception that data science is an objective practice. While it is nice to believe that the data we gather and the algorithms and technology we create reside in an objective realm free of human bias, this is not the case. One contributing factor is data set selection. Data sets are collections of separate pieces of information analyzed as a whole by a machine learning or AI algorithm. The human modelers who compile these sets tend to focus on efficiency and accuracy before bothering with bias. As a result, data bias runs rampant throughout many AI algorithms. Data bias takes many forms, including confirmation bias, the tendency to process and interpret information in a way that supports already-held beliefs; correlation bias, where connections are inferred by conflating or compounding ultimately unrelated variables; and sample bias, when the data gathered does not include certain groups. Stereotype bias is a craftier form of sample bias in which the algorithm begins distinguishing secondary factors as significant features of a category. This bias can be incidental or shaped by a pre-existing bias that the model has no way to identify or counteract. For example, a machine learning algorithm created to predict a presidential candidate's success based on prior presidents would likely overlook women or people of color, given their exclusion from the position.
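The presidential example can be made concrete with a toy model. In this hypothetical sketch (the data and scoring rule are invented for illustration), a naive classifier records which attribute values past "winners" shared and scores new candidates by overlap, so anyone who differs from the historical pool is penalized regardless of qualifications:

```python
# Toy illustration of stereotype bias: a naive model trained only on
# historical winners treats an incidental attribute (here, gender) as a
# predictive feature, because the training data contains no variation in it.

def train(past_winners):
    """Record every attribute value seen among past winners."""
    seen = {}
    for person in past_winners:
        for attr, value in person.items():
            seen.setdefault(attr, set()).add(value)
    return seen

def score(model, candidate):
    """Fraction of the candidate's attributes that match some past winner."""
    matches = sum(1 for attr, value in candidate.items()
                  if value in model.get(attr, set()))
    return matches / len(candidate)

# Invented historical data: every prior "winner" is a man.
history = [{"gender": "man", "held_office": True},
           {"gender": "man", "held_office": False}]
model = train(history)

# Two equally experienced candidates; only gender differs.
print(score(model, {"gender": "man", "held_office": True}))    # 1.0
print(score(model, {"gender": "woman", "held_office": True}))  # 0.5
```

The model never asks whether gender is relevant to the job; it simply notices that no past winner was a woman and scores accordingly, which is exactly how historical exclusion becomes a learned "feature."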

Although this example is hypothetical, real instances of these biases exist and have disastrous effects. We find a clear case of stereotype bias in attempts to automate the justice system through recidivism risk instruments. Recidivism prediction instruments, or RPIs, provide decision-makers with a prediction of the likelihood that a criminal defendant will re-offend in the future. While RPIs are only growing in popularity, reports and investigations expose discriminatory bias. ProPublica performed one such study on the Correctional Offender Management Profiling for Alternative Sanctions, or ‘COMPAS,’ an RPI designed by Northpointe Incorporated. Their research confirmed suspicions that the formula was prejudiced: the RPI falsely flagged black defendants as future re-offenders at almost twice the rate of their white counterparts. Black defendants were 77 percent more likely to be pegged as high-risk for committing a future violent crime and 45 percent more likely to be predicted to commit a future crime of any kind. These predictions were based on a survey that asked defendants questions like, “Was one of your parents ever sent to jail or prison?” and “How often did you get into fights while at school?” The issue with these questions is obvious: the U.S. has the highest incarceration rate and incarcerated population in the world, a disproportionate number of whom are black. Given the racist history and reality of the United States’ criminal justice system, creating an algorithm based on previous decisions will undoubtedly perpetuate the biases that exist, and continue to exist, within the system. This is not to say that there is no potential for AI within the criminal justice system. If computers could accurately predict which defendants were likely to commit new crimes, the system could be fairer and more selective about who is incarcerated and for how long. The trick, of course, is to make sure the computer gets it right, and identifying and correcting data bias is an important step toward doing so.
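The core of an audit like ProPublica's is comparing false positive rates across groups: among people who did not go on to re-offend, how often did the tool flag them as high-risk anyway? A minimal sketch of that calculation, using invented records rather than real COMPAS data, might look like this:

```python
# Sketch of a disparate-impact audit: compare how often a risk tool
# falsely flags non-re-offenders ("false positive rate") in each group.
# These records are invented for illustration, not real COMPAS data.

records = [
    # (group, flagged_high_risk, actually_reoffended)
    ("A", True,  False), ("A", True, False), ("A", True,  True), ("A", False, False),
    ("B", True,  False), ("B", False, False), ("B", False, False), ("B", True, True),
]

def false_positive_rate(records, group):
    """Among members of `group` who did NOT re-offend, the share flagged high-risk."""
    non_reoffenders = [flagged for g, flagged, reoffended in records
                       if g == group and not reoffended]
    return sum(non_reoffenders) / len(non_reoffenders)

# With this toy data, group A's non-re-offenders are flagged twice as often.
print(false_positive_rate(records, "A"))  # 2 of 3 flagged despite not re-offending
print(false_positive_rate(records, "B"))  # 1 of 3
```

A tool can look "accurate" overall while distributing its mistakes unevenly, which is why auditing error rates by group, not just overall accuracy, matters.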

While recidivism prediction instruments are one example of bias within AI, data bias permeates nearly all aspects of society, including hiring and healthcare. Employers have increasingly used AI in the hopes of increasing efficiency while hiring quality candidates. For instance, when an applicant submits a resume, it is run through automated analysis and scored on the applicant's quality and fitness for the job, based on keywords assigned value by a machine-learning algorithm. The algorithm is calibrated using data from the resumes of current employees in conjunction with data on their job performance. On the surface, these tools may appear completely evidence-based, but like RPIs, that does not mean the technology is free from human bias. Unsurprisingly, there is mounting evidence that such tools can reproduce and even exacerbate hiring prejudice, depending on the data chosen. Algorithms, by nature, do not question the human judgments underlying a dataset. Instead, they faithfully attempt to reproduce past decisions, leading them to reflect the very human bias they intend to remedy. This is especially detrimental in systems like criminal justice and hiring, which have historically been discriminatory.
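The keyword-scoring pipeline described above can be sketched in a few lines. Everything here is a simplified assumption, the weighting scheme and resumes are invented, but it shows the mechanism: weights learned from past hires reward whatever words a historically homogeneous workforce happened to share.

```python
# Hypothetical sketch of a keyword-based resume screener. Keyword weights
# come from the resumes of past hires, so words common among a homogeneous
# past workforce are rewarded and unfamiliar backgrounds score near zero.

from collections import Counter

def learn_weights(past_hire_resumes):
    """Weight each word by its frequency across past hires' resumes."""
    counts = Counter(word for resume in past_hire_resumes
                     for word in resume.lower().split())
    total = sum(counts.values())
    return {word: n / total for word, n in counts.items()}

def score_resume(weights, resume):
    """Sum the learned weights of the words in a new resume."""
    return sum(weights.get(word, 0.0) for word in resume.lower().split())

# Invented training data: past hires all mention the same club and school.
past_hires = ["chess club captain state university",
              "chess club member state university"]
weights = learn_weights(past_hires)

# A resume echoing past hires outscores an equally relevant one that doesn't.
print(score_resume(weights, "chess club state university"))
print(score_resume(weights, "debate team city college"))
```

Nothing in the code asks whether "chess club" predicts job performance; the screener simply ratifies whatever the last generation of hiring managers rewarded.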

Ultimately, most algorithms are created using data that is biased, either inherently, through the fraught history of the systems they analyze, or through any of the many forms of data bias. There are, however, steps we can take to remedy bias within both the tech industry and public policy, beginning with diversifying the industry itself. The lack of diversity in both race and gender within AI fields is extreme and embarrassing, with women representing only 18 percent of authors at leading AI conferences and men making up more than 80 percent of AI professors. Women comprise a measly 15 percent of AI research staff at Facebook and 10 percent at Google. Racial diversity amongst tech employees is even more grim, with Black employees representing 2.5 percent of Google's workforce and only 4 percent of Facebook's and Microsoft's. Given the concern and investments taken to redress this imbalance, the current state of diversity in AI and the overall tech field is alarming. There is no doubt that including more types of people in creating technology will allow that technology to better serve more people.

Creating a more diverse workforce is only one piece of the puzzle of crafting unbiased tech. There also needs to be a more expansive emphasis on tech ethics within computer science education and the industry itself. Tech ethics is often seen as a specialty or subset of the computer science field rather than as a permeating practice. This mindset has no doubt contributed to disparate impacts in AI.

The relationship between public policy and AI is a fairly new one; the first policy guidance related specifically to AI was issued during the Obama administration, which chartered a National Science and Technology Council (NSTC) Subcommittee on Machine Learning and Artificial Intelligence in 2016. This relationship has been further explored in the years since, through the creation of the American AI Initiative and the passage of the National AI Initiative Act during the last administration, and has only continued to develop during the Biden administration. While promising, these initiatives have given little to no attention to social consequences and, based on recent publications such as the American AI Initiative Annual Report, seem mainly concerned with encouraging innovation within AI fields through increased funding and removing barriers to development. While encouraging innovation is not inherently negative, it is concerning given the limited attention paid to the possible social implications and disparate impacts of the technology, or to any attempt to remedy them. The number of times the word "ethical" or "fair" was mentioned in the report could be counted on one hand. Since the report's publication, the Biden administration has yet to demonstrate any significant changes. However, I am somewhat more hopeful that this will change with the establishment of a National AI Advisory Committee, which seems to be slightly more attuned to the social implications of AI. That being said, there remains an overall lack of attention to data bias and to creating ethical technology, not only in government policy but within computer science education and the industry itself.

Rather than government regulation, these initiatives seem to support more partnerships between government, industry, and universities. This, again, is not inherently a bad approach and has proven useful in areas like law and healthcare, which also serve vulnerable populations and within which there is an expectation that the practitioner will comply with a set of ethics. However, within those industries, ethical expectations are not merely expectations but requirements, regulated through somewhat independent entities like medical boards or the bar association. The issue is that no such entity exists for technology: there is no independent enforcer of tech ethics, only the flawed expectation that the consumer's best interest and the company's best interest are aligned, an expectation that has been proven false time and time again. To address the ethical shortcomings of AI, we need to establish ethical accountability within computer science education, the industry itself, and public policy. While AI created responsibly has the potential to mitigate bias in decision-making, without a careful hand and watchful eye, societal prejudice will likely slip through the cracks and be baked into systems perceived as objective.

Grace Madden

Hi! My name's Grace Madden and I'm an 18 year old from NYC. I'm a recent high school graduate and an incoming freshman at Bowdoin College. I've always been passionate about politics and social justice and am excited to be involved with YIP!