Beginning thoughts on algorithmic discrimination.
This blog is often me thinking by writing, and I’m having some thoughts on ethics, AI and discrimination. They’re pretty much bullet points for now, and I hope to flesh them out through two events: the DATA471 paper I’m taking this year at the University of Canterbury (‘the ethical data scientist’), and leading a discussion on research data ethics at the upcoming eResearch 2020 conference.
In short, I just need to do a brain dump.
- Algorithmic Discrimination (AD) is not intentional. By this I mean that there is no point in assuming that the team building the training data for machine learning, or developing the code itself, is intentionally trying to create discriminatory results.
- AD is not rational. This is trickier. Discriminatory results are not logically perceptible. I’m not sure if this is true. For example, if a system spits out ‘mother is to nurse as father is to doctor’, I can see why it did that. From the past, and therefore the training data, it can come to that conclusion easily (there’s a rough sketch of how embeddings land on exactly that analogy after this list). However, for reasons of future equity and fairness, it’s not a helpful result, as logical as it is. Therefore, it’s not a rational process, and it requires some kind of computational Angel to nudge the logic towards results that are more helpful.
- “The philosophers have only interpreted the world, in various ways. The point, however, is to change it.” Theses on Feuerbach #11, by my old buddy Karl. Results of AI are a good way to study the world as it was. So, how do we make it a tool that can be used to make the world better? Without a good understanding of what it is we want to change, how do we build our guiding Angel?
- Training data is only as good as it is. To avoid accusations of direct personal discrimination, we need to be able to look at AI predictions and trace AD results back to the training data. We need to save that data, and set it in preservative archivist’s amber, otherwise the results will look like the researcher’s personal prejudices (there’s a small sketch of what that amber record might look like after this list).
- Amber is transparent, but only just. The metaphor of amber is that you can see the outline of the data, but not the personal details and confidential information. To create trust, you have to be able to see the data publicly, without exposing the people who gave you the data in the first place. This Is Hard. (There’s a very rough sketch of the ‘outline only’ idea after this list too.)
- Follow the money. AD could be artificially used to the benefit of corporations against people. I mean, if it’s explicit, then that’s OK, and it needs to be dealt with by the forces that shave off the worst excesses of capitalism, like regulation. If it’s opaque, then it is evil.
- Māori research methods have a built-in framework that involves kōrero, deep hanging out, iterative improvement, respect, mana, and many other things that could be a way forward.
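To make the ‘mother is to nurse as father is to doctor’ point a bit more concrete, here’s a minimal sketch of how an analogy query falls out of word embeddings. The vectors below are made up for illustration, not taken from any real model, but the arithmetic is the standard trick: the answer is whichever word sits closest to nurse − mother + father.

```python
# A toy sketch of how word embeddings answer analogy queries.
# These vectors are invented for illustration; real embeddings
# (word2vec, GloVe, etc.) are learned from large text corpora,
# which is exactly where the historical bias comes from.
import numpy as np

embeddings = {
    "mother":  np.array([0.9, 0.1,  0.7]),
    "father":  np.array([0.9, 0.1, -0.7]),
    "nurse":   np.array([0.2, 0.8,  0.6]),
    "doctor":  np.array([0.2, 0.8, -0.6]),
    "teacher": np.array([0.3, 0.7,  0.1]),
}

def cosine(a, b):
    """Cosine similarity between two vectors."""
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

# "mother is to nurse as father is to ?" becomes vector arithmetic:
# nurse - mother + father, then find the nearest remaining word.
query = embeddings["nurse"] - embeddings["mother"] + embeddings["father"]

best = max(
    (w for w in embeddings if w not in {"mother", "nurse", "father"}),
    key=lambda w: cosine(query, embeddings[w]),
)
print(best)  # 'doctor' -- faithful to the training data, not to fairness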
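And here’s roughly what I mean by setting training data in archivist’s amber: at minimum, a fingerprint of the exact dataset a model was trained on, plus a bit of provenance metadata, kept somewhere safe. This is only a sketch; the file name and metadata fields are invented for the example.

```python
# A minimal sketch of 'archivist's amber' for a training set: record a
# cryptographic fingerprint and some provenance metadata so a model's
# outputs can later be traced back to exactly the data that trained it.
# The file path and metadata fields here are hypothetical.
import hashlib
import json
from datetime import datetime, timezone

def fingerprint(path, chunk_size=1 << 20):
    """Return the SHA-256 hash of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

def amber_record(path, description):
    """Build a small provenance record to store alongside the model."""
    return {
        "file": path,
        "sha256": fingerprint(path),
        "description": description,
        "archived_at": datetime.now(timezone.utc).isoformat(),
    }

record = amber_record("training_data.csv", "toy example of a training snapshot")
print(json.dumps(record, indent=2))
```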
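Finally, a very simplistic sketch of ‘transparent, but only just’: publish the outline of the data (aggregate counts, with small cells suppressed) rather than the rows themselves. The records, grouping and threshold are all made up, and real disclosure control is much harder than this, but it gives the flavour.

```python
# A very simplistic sketch of showing the outline of a dataset without
# the personal details: publish aggregate counts and suppress any cell
# small enough that it could point at an individual. The records,
# grouping column and threshold are invented for illustration.
from collections import Counter

records = (
    [{"region": "Canterbury", "outcome": "approved"}] * 4
    + [{"region": "Canterbury", "outcome": "declined"}] * 1
    + [{"region": "Otago", "outcome": "approved"}] * 3
)

SUPPRESSION_THRESHOLD = 3  # hide any cell smaller than this

counts = Counter((r["region"], r["outcome"]) for r in records)
public_view = {
    f"{region}/{outcome}": (n if n >= SUPPRESSION_THRESHOLD else "suppressed")
    for (region, outcome), n in counts.items()
}
print(public_view)
# {'Canterbury/approved': 4, 'Canterbury/declined': 'suppressed', 'Otago/approved': 3}
```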
There will be more thoughts, and I’ll try to expand on them as I work through the next six months.