Ethical & legal aspects

What is ethics?

"Ethics is a study of what are good and bad ends to pursue in life and what is right and wrong to do in the conduct of life.
It is therefore, above all, a practical discipline.
Its primary aim is to determine how one ought to live and what actions one ought to do in the conduct of one's life."

An Introduction to Ethics, John Deigh

Defining what is good or right is hard.

Trolley problem

https://en.wikipedia.org/wiki/File:Trolley_Problem.svg

Chick classifier

Chick image $\Rightarrow$ classifier $\Rightarrow$ Rooster or Hen (images from Wikipedia)

  • Ethics is guided by the principles and values of people and society.
  • Usually no easy answers. Many gray areas.
  • Ethics change over time with the values and beliefs of people.
  • Legal $\neq$ Ethical.
  • We have agency to decide what kind of technology we want to build.

    Train a classifier to predict a person's IQ from photos and texts.

  • Should we do this?
  • Who could benefit from this technology?
    • Employers
    • Schools
    • Immigration offices
  • How could use of the classifier cause harm?
    • IQ might be a bad proxy for future performance
    • People with low IQ but with other skills do not get a chance

We very often use proxy values as labels.

What we want                                  Proxy label
Performance at job                            IQ
Probability that someone commits a crime      Probability that someone is convicted
Interest of a person                          Click on a link
Next correct word in a sentence               Next word used by someone in a sentence
  • Often the true label is unavailable / very complex.
  • The proxy might only be correlated with the true target.
  • The proxy can be biased (see the sketch after this list).
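A toy simulation of the last two points (all numbers are invented for illustration; scikit-learn is assumed): the true outcome depends only on a score, but the proxy label is biased against one group, and a model trained on the proxy inherits that bias.

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(0)
    n = 20_000

    # Invented setup: a score that actually drives the true outcome,
    # and a group attribute that does not.
    score = rng.normal(size=n)
    group = rng.integers(0, 2, size=n)
    true_label = (score + rng.normal(scale=0.5, size=n) > 0).astype(int)

    # Biased proxy: for group 1, negatives are wrongly recorded as positives
    # 30% of the time (think "convicted" instead of "committed a crime").
    flip = (group == 1) & (true_label == 0) & (rng.random(n) < 0.3)
    proxy_label = np.where(flip, 1, true_label)

    X = np.column_stack([score, group])
    model = LogisticRegression().fit(X, proxy_label)
    pred = model.predict(X)

    for g in (0, 1):
        m = group == g
        print(f"group {g}: true positive rate {true_label[m].mean():.2f}, "
              f"predicted positive rate {pred[m].mean():.2f}")

Both groups have the same true positive rate, but the model flags group 1 noticeably more often, because the proxy does.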

    Train a classifier to predict a person's IQ from photos and texts.

  • Who could benefit from this technology?
  • How could use of the classifier cause harm?
  • What kind of harm can be caused by the classifier?
    • People might not get a job or an education, or might get deported
  • What is the error of the classifier?
    • Is accuracy good for all subgroups? (See the sketch after this list.)
    • Tradeoff between recall and specificity.
    • What is the cost of misclassification?
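A minimal sketch of how such checks look in practice, with purely made-up labels, predictions, and a hypothetical subgroup attribute:

    import numpy as np

    # Made-up data: true labels, model predictions, and a hypothetical
    # subgroup attribute (e.g. two demographic groups "A" and "B").
    y_true = np.array([1, 0, 1, 1, 0, 0, 1, 0, 1, 0])
    y_pred = np.array([1, 0, 0, 1, 0, 1, 1, 0, 0, 0])
    group  = np.array(["A", "A", "A", "A", "A", "B", "B", "B", "B", "B"])

    def rates(y_true, y_pred):
        tp = np.sum((y_true == 1) & (y_pred == 1))
        tn = np.sum((y_true == 0) & (y_pred == 0))
        fp = np.sum((y_true == 0) & (y_pred == 1))
        fn = np.sum((y_true == 1) & (y_pred == 0))
        accuracy = (tp + tn) / len(y_true)
        recall = tp / (tp + fn)        # sensitivity
        specificity = tn / (tn + fp)
        return accuracy, recall, specificity

    # Overall numbers can hide large differences between subgroups.
    print("overall:", rates(y_true, y_pred))
    for g in np.unique(group):
        mask = group == g
        print(f"group {g}:", rates(y_true[mask], y_pred[mask]))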

    Train a classifier to predict a person's IQ from photos and texts.

  • Who could benefit from this technology?
  • How could use of the classifier cause harm?
  • What kind of harm can be caused by the classifier?
  • What is the error of the classifier?
  • Who is responsible?
    • Researcher/developer? Manager? Lawmaker? Society?

The "AI Gaydar" study

Goal: Predict sexual orientation from facial images.

"Deep neural networks are more accurate than humans at detecting sexual orientation from facial images", Wang & Kosinski

  • Training data from a popular American dating website
    • 35,326 pictures of 14,776 people. All white. Gay, straight, male and female represented evenly.
  • Deep learning model to extract facial and grooming features and classifier to predict orientation.
  • Accuracy: 81% for men, 74% for women

Humans have long tried to predict hidden characteristics from external features

https://commons.wikimedia.org/wiki/File:Franz_Joseph_Gall_measuring_the_head_of_a_bald_Wellcome_L0004149.jpg

    Is the research question ethical?

  • In many countries gay people can be prosecuted, and in some places homosexuality carries the death penalty.
  • Can affect a person's employment, family relationships, educational or health care opportunities, ...
  • Properties like gender, race, sexual orientation, and religion should be intimate and private. But they are often used as grounds for discrimination.

Researchers claim they wanted to show the dangers the technology poses.

Is this a good justification?

No, the dangers are apparent without building it.

Huge potential harms vs. questionable value.

Wider class of such applications (startup Faception)

https://www.faception.com

    The data

  • Training data from a popular American dating website
    • 35,326 pictures of 14,776 people. All white. Gay, straight, male and female represented evenly.

Was it ethical to use this data?

    Was it ethical to use this data?

  • People did not intend their pictures to be used in other ways
  • People did not agree to participate in the study
  • Legal $\neq$ Ethical
  • Public $\neq$ Publicized

Biased data

35,326 pictures of 14,776 people. All white. Gay, straight, male and female represented evenly.

  • Only white people.
  • Self-disclosed orientation
  • Certain social groups
  • Certain age groups
  • The photos were selected by the users to be attractive to their target audience
  • Balanced data does not represent the true distribution

Training and test data with lots of bias.

$\Rightarrow$ Classifier will likely not work well outside of this specific data set.
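A rough back-of-the-envelope calculation of why this matters (all numbers below are assumptions for illustration, not results from the paper): suppose the reported 81% accuracy on the balanced data corresponds to 81% sensitivity and 81% specificity, and suppose the real-world base rate is around 7%.

    # Illustrative assumptions, not measured values.
    sensitivity = 0.81
    specificity = 0.81
    base_rate = 0.07

    # Bayes' rule: how many positive predictions are actually correct?
    p_positive_pred = sensitivity * base_rate + (1 - specificity) * (1 - base_rate)
    precision = sensitivity * base_rate / p_positive_pred

    print(f"accuracy on balanced data: {(sensitivity + specificity) / 2:.0%}")
    print(f"precision at a {base_rate:.0%} base rate: {precision:.0%}")

Under these assumptions only about a quarter of the people flagged by the classifier would actually be gay, which is one concrete way in which the balanced evaluation overstates real-world performance.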

    Assessing AI systems

  • Ethics of the research question
  • Value and potential harm
  • Privacy of data
  • Bias in the data
  • Bias of the model
  • What error do we make?

Legal aspects of automated systems

  • Common misconception: AI (the internet / computers) is such a new technology that there are no laws to govern it.
  • Usually existing laws cover many new cases.
    • They might be a bad fit.
    • There are often few concrete precedents.
    • Gray areas can be hard to judge.

Example

    We have bought a smart voice assistant which has the option to buy products from an online shop.

  • Case 1: We accidentally order something we do not want.
  • Case 2: Our kid orders something we do not want.
  • Case 3: A news anchor on television orders something we do not want.

Who is at fault (pays the shipping fees for the return)?

Details depend on jurisdiction.

  • In Germany a contract is based on a "Willenserklärung" (declaration of intent).
    • In this case one party obviously did not agree to the purchase.
  • To facilitate trade there are shortcuts to get to a valid contract.
    • Raising your hand to greet someone at an auction can count as a valid bid.
    • You can delegate authority to someone else - even an algorithm.

Example based on "KI & Recht kompakt", Matthias Hartmann.

When we delegate product ordering to the voice assistant, the shop may assume that we are using it responsibly.

  • In case 1 we are likely at fault.
  • In case 3 we are likely not at fault.
  • Case 2 is more difficult - probably we are at fault.

Highly dependent on what can be expected of the voice assistant's owner.

While the news anchor is in some sense responsible, it is very unlikely that they would be held liable.

https://chat.openai.com

Attacking machine learning systems

Idea: Use the gradient to compute a small perturbation of the input that pushes it towards a different class.

"Explaining and Harnessing Adversarial Examples", Goodfellow, Shlens & Szegedy

Adversarial attacks also work in the physical world with small perturbations, for example adversarial accessories such as printed glasses frames.

"Accessorize to a Crime: Real and Stealthy Attacks on State-of-the-Art Face Recognition", Sharif, Bhagavatula, Bauer & Reiter.

"Accessorize to a Crime: Real and Stealthy Attacks on State-of-the-Art Face Recognition", Sharif, Bhagavatula, Bauer & Reiter.

If users can modify your training data, your model is especially vulnerable.
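A toy illustration of that vulnerability (synthetic data, scikit-learn assumed): training points injected with deliberately wrong labels take over the predictions in their neighbourhood.

    import numpy as np
    from sklearn.neighbors import KNeighborsClassifier

    rng = np.random.default_rng(0)

    # Clean training data: a 1-D feature, label is simply "x > 0".
    x_clean = rng.normal(size=(200, 1))
    y_clean = (x_clean[:, 0] > 0).astype(int)

    # Points an attacker slips into the training set: inputs around x = 2
    # deliberately given the wrong label.
    x_poison = rng.normal(loc=2.0, scale=0.05, size=(50, 1))
    y_poison = np.zeros(50, dtype=int)

    clean_model = KNeighborsClassifier(n_neighbors=5).fit(x_clean, y_clean)
    poisoned_model = KNeighborsClassifier(n_neighbors=5).fit(
        np.vstack([x_clean, x_poison]),
        np.concatenate([y_clean, y_poison]),
    )

    test_point = np.array([[2.0]])
    print("clean model   :", clean_model.predict(test_point))    # predicts 1
    print("poisoned model:", poisoned_model.predict(test_point))  # predicts 0

Tay was an extreme case of the same pattern: the system kept learning from whatever users fed it.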

    Microsoft Twitter Bot "Tay"

  • Released 2016
  • Interacting with users
  • Learned from past conversations

Screenshots of Tay's tweets (Twitter.com)

  • Machine learning systems have vulnerabilities in addition to those of regular software.
  • Who can modify the training data?
  • How is new data added?
  • Where do the input features come from?
  • What kind of damage is possible?

Not all safety issues are specific to software

https://xkcd.com/1958/