Does POPIA impact AI software or ML systems?


Venolan Naidoo | Senior Associate | Fasken


This article discusses the impact of the Protection of Personal Information Act (POPIA) on Artificial Intelligence (AI) or Machine Learning (ML) systems used in the context of the workplace.

Although this article focuses on a particular type of workplace or employer (a technology-centric, fourth-industrial-revolution business) and on the degree to which artificial intelligence systems process personal information there, and may therefore appear prescient, it is reasonable to conclude that such employers will be many in the not-too-distant future as more technologies are infused into the workplace.

Some of these systems or technologies should already be familiar to us.

Take for example:

  • Chatbots that attend to general queries;
  • systems that train or coach individuals on best sales/marketing techniques;
  • biometric technology;
  • automation of administratively repetitive tasks;
  • cyber security or fraud detection;
  • robotic process automation in the healthcare, manufacturing, and logistics industries.

These are already part of the changes as workplaces move further forward into the 4th industrial revolution.

From a general perspective, the article will discuss the provisions or themes of POPI viewed as most relevant to artificial intelligence or machine learning systems. However, the provisions of POPI discussed are not exhaustive, and other areas not covered here may also be relevant depending on the specific context of each case.

Background

In the age of data harnessing and analytics, be it from Google searches or the Internet of Things, the collection of data has become an innate part of our connected world given its considerable value. This, of course, has also transformed the workplace. With the advent of cloud-based computing we have seen the scalability and flexibility of virtual workplaces.

The COVID-19 pandemic has undoubtedly accelerated the cloud-based, remote 'work from home' regime, with many businesses having shifted immediately to this new work order amid the pandemic and others, needing to get up to technological speed, being left with no option but to follow.



In 2021, this has to a certain extent become the norm (and no longer the new normal). Businesses are now focusing on cloud-based technology solutions for better efficiency, productivity (and the list goes on) as we keep up with global trends in a changing digitised world.

What has been remarkable in recent times is the reliance by local businesses on particular technologies which, in varying degrees, rely on AI or ML systems (these terms will be used interchangeably for purposes of this article series, although AI has a much wider meaning and includes, amongst others, machine learning).

As the AI examples set out in the preface show, we should already be familiar with some of these technologies.

Without delving into much technical detail, AI or machine learning systems ordinarily rely on mass data that includes datasets which, given their purpose, may comprise personal information. In some instances, the technology is not privy to information that can identify a particular person: the data is anonymised and merely aggregated (based on various factors), and the system simply performs what it has been designed or programmed to do in line with its underlying logic. On the other hand, AI may in fact rely on personal information (given its specific purpose), or on a combination of anonymised and identifiable personal information.

As we gear further into digitisation, employers will invariably need to use their employees' personal information for a range of business-related reasons, be it security, performance, or better workplace efficiency. This will include information relating to one's views or preferences and other personal information. Businesses in the global north are already far ahead on this trajectory, and it is only a matter of time until the practice is fairly extensive in South Africa.

Whilst there has been a proliferation of technologies in the workplace, particularly in regard to data or information processing, the conversation about data privacy has in relatively recent times hit centre stage across technological platforms, be it WhatsApp, Google search profiles, Facebook or other social media. Individuals have become conscious of wanting to keep their data private and of not becoming 'the product themselves'.

Striking a balance between technologies requiring as much ‘real world’ data to work efficiently (and in view of technological development) and the rights to data privacy seems to be the new conundrum.

The Protection of Personal Information Act 4 of 2013 (POPI)

In November 2013, South Africa enacted the Protection of Personal Information Act 4 of 2013 (also known by the acronym 'POPI' or 'POPIA'). POPI is aimed at protecting individuals' personal information.

This, of course, gives effect to the constitutional right to privacy, but POPI has also been enacted, amongst others, to:

  • protect important interests, including the free flow of information within South Africa and across international borders; and
  • regulate how personal information may be processed, by establishing conditions, in harmony with international standards, that prescribe the minimum threshold requirements for the lawful processing of personal information.

Although POPI was signed into law several years ago, only certain provisions have incrementally come into effect over the past few years. As published in the Government Gazette on 22 June 2020, the remaining transitional provisions relating to compliance take effect on 1 July 2021.

POPI impacts all responsible parties holding or processing a person's personal information. Most apparent are employers, especially medium to large ones (though not exclusively), that hold a multitude of personal information about their data subject employees.

It is worth emphasising that personal information is widely defined under section 1 of POPI. It includes information relating to an identifiable natural person and, where applicable, an identifiable existing juristic person. It further defines personal information (non-exhaustively) as information relating to the education, medical, financial, criminal or employment history of a person. It also includes information relating to their race, gender, sex, ethnic and social origin, location information, online identifier or other particular assignment to the person, etc. Rather importantly, it includes information relating to the personal views, opinions or preferences of the person and the views or opinions of another individual about the person.



Processing is also widely defined (section 1 of POPI) and means – any operation or activity or any set of operations, whether or not by automatic means, concerning personal information, including –

  • the collection, receipt, recording, organisation, collation, storage, updating or modification, retrieval, alteration, consultation or use;
  • dissemination by means of transmission, distribution or making available in any other form; or
  • merging, linking, as well as restriction, degradation, erasure or destruction of information.

Given that we are now in a shift in requiring the processing of information, including personal information, to better workplace security, productivity, efficiency etc. through emerging technologies, the question that is usually posed is what could possibly be the impact when POPI takes full effect on 1 July 2021.

To be compliant with POPI, a responsible party must, amongst other requirements (and subject to certain exceptions), obtain consent from data subjects. POPI further gives data subjects, or employees in the context of the workplace ('data subject employees'), the right to be made aware when their information is being collected, and the information should be collected directly from them. The purpose of collection must be explicitly defined and lawful, and the processing must be lawful, conducted in a reasonable manner, and pertain to a function or activity of the employer ('responsible party employer').

POPI further imposes requirements on the processing of special personal information. As a default position, the processing of special personal information is prohibited unless the general authorisation provisions apply. Under section 26 of POPI, special personal information includes information relating to the religious or philosophical beliefs, race or ethnic origin, trade union membership, political persuasion or biometric information of a data subject, amongst others.

Personal versus de-identified information: the likely relationship between POPI and AI systems

Scope of POPI

Based on what POPI covers and on machine learning systems requiring a wealth of data that may include personal information, it is evident that the provisions of POPI will apply to these systems insofar as personal information is concerned. Conversely, if data or information is properly de-identified (anonymised), POPI would not apply.

The scope of POPI therefore extends to AI or machine learning systems that require personal information (including special personal information, ordinarily processed with duly informed consent), or information which at face value is anonymised but can, through a reasonably foreseeable method, be associated with a particular individual, thereby bringing it within the scope of POPI. More on this aspect is discussed below.

Personal versus de-identified information

Under its definition in POPI, to 'de-identify' personal information (the term anonymised will be used interchangeably with de-identified) in essence means to delete any information that identifies the data subject.

Importantly, it also means deleting information that can be used or manipulated by a reasonably foreseeable method to identify the data subject, or that can be linked by a reasonably foreseeable method to other information that identifies the data subject.

In other words, if the face value ‘anonymised’ personal information can still identify a person through a ‘reasonably foreseeable method’, it will not constitute de-identified personal information.

In this regard, the information will then fall under the definition of 're-identify', which means to resurrect any information that has been de-identified and that –

  • identifies the data subject;
  • can be used or manipulated by a reasonably foreseeable method to identify the data subject; or
  • can be linked by a reasonably foreseeable method to other information that identifies the data subject, and ‘re-identified’ has a corresponding meaning.

While these definitions may seem straightforward, it will be interesting to see in due course how the test of a 'reasonably foreseeable method' will be measured in practice (even against an 'objective standard') in the context of AI systems. This is especially so when taking into account the varying intricacies of a machine learning system, which becomes more intuitive as its 'training' progresses, and whether such a standard will be suitable in each given case.
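To make the distinction concrete, the difference between deleting direct identifiers and guarding against re-identification by a 'reasonably foreseeable method' can be sketched in code. This is a purely illustrative example, not a compliance tool: the field names, the split between 'direct' and 'quasi' identifiers, and the records are all hypothetical, and what counts as reasonably foreseeable linkage would remain a legal question in each case.

```python
# Illustrative sketch: stripping identifying fields from an employee record
# before it reaches an ML system. Field names are hypothetical.

DIRECT_IDENTIFIERS = {"name", "id_number", "email"}
# Quasi-identifiers may still permit re-identification by a "reasonably
# foreseeable method" when linked to other information (e.g. a small
# department with one regional manager).
QUASI_IDENTIFIERS = {"job_title", "department", "birth_date"}

def de_identify(record: dict, drop_quasi: bool = False) -> dict:
    """Return a copy of the record without identifying fields."""
    blocked = DIRECT_IDENTIFIERS | (QUASI_IDENTIFIERS if drop_quasi else set())
    return {k: v for k, v in record.items() if k not in blocked}

record = {
    "name": "A. Person",
    "id_number": "0000000000000",
    "job_title": "Regional Manager",
    "department": "Sales",
    "sales_score": 0.87,
}

print(de_identify(record))                   # direct identifiers removed
print(de_identify(record, drop_quasi=True))  # safer against linkage attacks
```

The point of the sketch is that the first call still leaks quasi-identifiers that could be linked back to a person, which is precisely the situation the 're-identification' definition is aimed at.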

To the extent that POPI does apply, its impact on machine learning systems:

Automated decision making

One of the provisions of POPI that impacts on AI systems is section 71 of POPI which specifically deals with automated decision making (as related to personal information).

Section 71(1) of POPI provides that a data subject 'may not be subject to a decision which results in legal consequences for him, her or it, or which affects him, her or it to a substantial degree, which is based solely on the basis of the automated processing of personal information intended to provide a profile of such person including his or her performance at work, or his, her or its credit worthiness, reliability, location, health, personal preferences or conduct'.

It is understood that this provision would certainly apply to an AI or machine learning system due to it being automated. Section 71(1) of POPI, as a result, imposes a preclusion if the personal information processed by a machine learning system culminates in a decision that has legal consequences, or substantially affects, the data subject employee including in their work performance, credit worthiness, personal preferences or conduct, and so forth.



This provision will therefore have consequences on responsible party employers who would need to process this information, through an AI or automated system, in regard to various legitimate business reasons, be it for enhanced work performance or better workplace efficiency.

However, sections 71(2) and 71(3) of POPI provide exceptions to the above preclusion, including where such a decision has been taken in connection with the conclusion or execution of a contract and: the request of the data subject in terms of the contract has been met; appropriate measures have been taken to protect the data subject's legitimate interests; or the decision is governed by a law or code of conduct in which appropriate measures are specified for protecting the legitimate interests of data subjects.

Regarding the appropriate measures, section 71(3) of POPI states that they must –

(a) provide an opportunity for a data subject to make representations about a decision; and

(b) require a responsible party to provide a data subject with sufficient information about the underlying logic of the automated processing of the information relating to him or her to enable him or her to make representations in terms of paragraph (a).

Accordingly, if a responsible party employer meets these requirements as set out above it ought to be able to process the personal information.
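In practical terms, a responsible party employer could build these two measures into its decision pipeline as a gate: a substantially-affecting automated decision is held back until the data subject has been told the underlying logic and given a chance to make representations. The sketch below is a minimal, hypothetical illustration of that workflow; the class and field names are the author's (well, this sketch's) own inventions and do not come from POPI itself.

```python
# Illustrative sketch of the s71(3) "appropriate measures": before a
# substantially-affecting automated decision is finalised, the data subject
# gets (a) an opportunity to make representations and (b) a plain-language
# account of the underlying logic. All names here are hypothetical.

from dataclasses import dataclass, field

@dataclass
class AutomatedDecision:
    subject: str
    outcome: str                   # e.g. "flagged for performance review"
    substantial_effect: bool       # legal consequence or substantial impact?
    logic_summary: str             # plain-language underlying logic
    representations: list = field(default_factory=list)
    final: bool = False

def notify_subject(decision: AutomatedDecision) -> str:
    """s71(3)(b): disclose the underlying logic so the subject can respond."""
    return (f"Dear {decision.subject}: proposed outcome '{decision.outcome}'. "
            f"Basis: {decision.logic_summary}. You may make representations.")

def finalise(decision: AutomatedDecision) -> AutomatedDecision:
    """s71(3)(a): a substantially-affecting decision is only finalised once
    the subject has had the opportunity to make representations."""
    if decision.substantial_effect and not decision.representations:
        raise RuntimeError("Await representations before finalising")
    decision.final = True
    return decision
```

A decision with no substantial effect passes straight through, while one that would substantially affect the employee is blocked until a representation (even an empty "no comment") has been recorded, mirroring the structure of the statutory exception.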

Conclusion

Some of the aspects discussed above naturally raise challenging questions about the relationship between POPI and the AI software systems that may require this personal information as part of their datasets to function effectively and achieve their desired results.

It accordingly remains to be seen how this likely conundrum between AI or machine learning systems and data privacy under POPI will unfold in time to come, and whether the two are mutually exclusive or can adapt to function symbiotically.

It also remains to be seen whether the Regulator will issue a code of conduct in respect of this particular area.

For now, given the transitional deadline of 1 July 2021, it would seem that creating well-defined and clear provisions, including contractually and in policy documents, governing the processing of personal information by AI or machine learning systems in line with data privacy regulation under POPI will certainly be a foundational start.

Moreover, ensuring compliance from the outset when setting up AI or machine learning systems would result in any system processing personal information doing so in tandem with POPI.


 


