Blak Focus July 2025 Edition – AI bias and Indigenous Knowledges
As AI and generative AI become embedded in our everyday digital activities, there are ethical concerns about inherent bias. Digital technology and, by extension, machine learning and AI systems are not neutral (Suchman, 1985; Crawford, 2021; O’Neil, 2016; Noble, 2018; Hare, 2022; Wang, Chen, Huang, Redwing & Tsai, 2024; Worrell & Johns, 2024; Khurana, 2025): they hold the values and biases of those who build and train them. Indigenous knowledges, which are relational and place-based, have historically been marginalised and misrepresented. Generative AI further compounds this misrepresentation and raises questions about the ‘sacredness of knowledge’ (Khurana, 2025, p. 1), the ethical use of AI, and the incorporation of Indigenous knowledges into these algorithms and systems.
The design, training and implementation of AI models rely on big data and data systems that elevate Western knowledge systems and reflect existing power dynamics (Peng & Zhao, 2024). In Fairness and Bias in Artificial Intelligence: A Brief Survey, Ferrara (2024, p. 4) describes bias in AI design and usage as falling into the following categories: sampling, algorithmic, representation, confirmation, measurement, interaction and generative bias.
Bias refers to the systematic errors that occur in decision-making processes, leading to unfair outcomes. In the context of AI, bias can arise from various sources, including data collection, algorithm design, and human interpretation. Machine learning models, which are a type of AI system, can learn and replicate patterns of bias present in the data used to train them, resulting in unfair or discriminatory outcomes (Ferrara, 2024, p. 2).
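To make this mechanism concrete, here is a minimal sketch of how a model trained on skewed historical data reproduces that skew. The scenario, variable names and numbers are hypothetical illustrations, not drawn from any of the works cited:

```python
# Minimal sketch: a classifier trained on biased historical decisions
# reproduces the bias. All data here is synthetic and hypothetical.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 10_000

# Synthetic history: group B (20% of records) was systematically
# penalised in past approval decisions at equal qualification levels.
group = rng.choice([0, 1], size=n, p=[0.8, 0.2])       # 0 = A, 1 = B
qualification = rng.normal(0, 1, size=n)
penalty = np.where(group == 1, -1.0, 0.0)              # historical bias
approved = qualification + penalty + rng.normal(0, 0.5, size=n) > 0

# The model has no notion of fairness: it simply learns the penalty.
model = LogisticRegression().fit(np.column_stack([group, qualification]), approved)

# Two equally qualified applicants, one from each group.
equal_applicants = np.array([[0, 0.0], [1, 0.0]])
print(model.predict_proba(equal_applicants)[:, 1])     # group B scores far lower
```

The mitigation strategies Ferrara surveys (discussed below) intervene at different points in this pipeline: before training, at model selection, or after prediction.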
Data colonialism
In Weapons of Math Destruction, O’Neil critiques how big data can increase inequality and threaten democracy, highlighting the problems of the complex tapestry of mathematical models and probabilities that comprise algorithms (2016, pp. 19-20). Large language models and AI systems are generally trained on internet-scale datasets – such as LAION-5B, Common Crawl or YouTube-8M – which include audio, video, image, text and multimodal data (Vijay, 2024). Training on these datasets often produces large-scale algorithmic bias. O’Neil explains that these models frequently lack specific data or information, so they tend to ‘substitute stand-in data or proxies’ and form discriminatory correlations (2016, p. 21). This has significant implications for Indigenous knowledge representation, especially since large datasets often reflect dominant cultural narratives and underrepresent Indigenous voices (Wang et al., 2024).
In Algorithms of Oppression (2018), Noble discusses how the tools used in decision-making processes, including big data and algorithms, reproduce societal bias and structural racism and do little to promote equality. In discussing On Our Backs, a black feminist publication, Noble describes the challenges faced by information workers:
…from the digitization of indigenous knowledge from all corners of the earth that are not intended for mass public consumption, to individual representations that move beyond the control of the subject. We cannot ignore the long-term consequences of what it means to have everything subject to public scrutiny, out of context, out of control (2018, p. 132).
Landini (2025) argues that AI systems draw Indigenous data from sources that distort narratives and don’t preserve underlying relational and ethical dimensions. In doing so, AI systems fail to understand the connection between Indigenous knowledges, culture, intangible cultural heritage and intellectual property rights, which leaves Indigenous knowledges open to misuse and misappropriation (Landini, 2025, pp. 505-509).
Mitigating bias in AI
Addressing bias in AI is a pressing concern for researchers, developers, and users. Ferrara, in addressing sources of bias, emphasises technical mitigation strategies, including:
- Pre-Processing Data: resampling and re-weighting data to ensure it reflects diverse global communities, including marginalised groups (a minimal sketch follows below).
- Model Selection: choosing algorithms that account for multiple group fairness criteria during evaluation.
- Post-Processing Decisions: using explainability and transparency tools to assess model outputs in real time for bias and skew (2024, pp. 5-6).
Ferrara does, however, caution that these strategies are not without their challenges: they can be time-consuming, what fairness equates to differs across groups of people, and more complex data may be needed (2024, pp. 6-12).
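As an illustration of the first of these strategies, the sketch below re-weights training records so that group membership and outcome become statistically independent before a model is fitted. This follows the widely used ‘reweighing’ pre-processing idea rather than a specific procedure from Ferrara; the function and the sample data are hypothetical:

```python
# Minimal pre-processing sketch: re-weight records so that group and
# outcome are statistically independent in the training data.
import numpy as np

def reweight(group: np.ndarray, label: np.ndarray) -> np.ndarray:
    """Return one weight per record: P(group) * P(label) / P(group, label)."""
    weights = np.ones(len(group), dtype=float)
    for g in np.unique(group):
        for y in np.unique(label):
            mask = (group == g) & (label == y)
            observed = mask.mean()                                   # P(group, label)
            expected = (group == g).mean() * (label == y).mean()     # independence
            if observed > 0:
                weights[mask] = expected / observed
    return weights

# Hypothetical usage: positives from the under-approved group get weight > 1.
# The weights can be passed to any estimator accepting sample_weight, e.g.
# LogisticRegression().fit(X, y, sample_weight=reweight(group, y)).
group = np.array([0, 0, 0, 1, 1, 1])
label = np.array([1, 1, 0, 0, 0, 1])
print(reweight(group, label))
```

Re-weighting only addresses imbalance in the data itself; it does not, by itself, resolve the deeper representational and sovereignty concerns raised below.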
In contrast to purely technical solutions, Worrell (2024) critiques the wider cultural impacts of AI systems like ChatGPT or, as she has coined it, ‘Uncle Chatty Gee’. She argues that these generative tools undermine traditional Indigenous knowledge transmission and misappropriate Indigenous knowledges in ways that don’t reflect Indigenous cultural values.
Worrell and Johns (2024) advocate adopting culturally grounded protocols and accountability mechanisms that align with Indigenous data sovereignty principles: ethical principles and relational accountability must govern AI’s interaction with Indigenous knowledges. As Kukutai and Taylor explain:
The multifaceted nature of indigenous data sovereignty gives rise to a wide-ranging set of issues, from legal and ethical dimensions around data storage, ownership, access and consent, to intellectual property rights and practical considerations about how data are used in the context of research, policy and practice (Kukutai & Taylor, 2016, p. 2).
Respecting protocols around storytelling, custodianship, and Country is essential to ensure that digitisation does not become another form of dispossession. Indigenous data sovereignty offers a framework to counteract AI’s extractive approach. Wang et al. (2024) argue that a shift from generative AI as a colonial replica toward a collaborative, sovereign-aware technology requires tribal-centred knowledge creation. This also requires governance solutions for oral traditions, Indigenous-created documents and collaboration with tribal governments (Wang et al., 2024, pp. 641-642).
Wang et al. (2024), in their study Tribal Knowledge Cocreation in Generative AI Systems, explore how generative AI models deployed in US public sector contexts misrepresent Indigenous knowledges. They identify key issues and biases in AI relating to over-reliance on AI in government decision-making processes, unfair treatment as a result of that over-reliance, responses based on Western-centred information, and ‘the challenge of intersectoral and cross-sovereignty data governance’ (Wang et al., 2024, pp. 638-639). To mitigate these biases, they propose three strategies around tribal-centred knowledge creation:
- Tribal Digital Equity: prioritising equitable access and authentic representation of tribal culture and history in datasets.
- Tribal Sovereignty: prioritising the right of tribal nations to control their data and make decisions related to technology and AI.
- Knowledge Cocreation: using Indigenous perspectives to cocreate knowledge for AI systems (2024, pp. 640-641).
In their work Abundant intelligences: placing AI within Indigenous knowledge frameworks, Lewis, Whaanga and Yolgörmez (2024) highlight the importance of placing AI within Indigenous knowledge systems and culturally informed data governance. They identify five areas where AI research would benefit from Indigenous knowledge frameworks:
- Language: expanding AI’s understanding of Indigenous languages in Natural Language Processing (NLP) and supporting low-resource languages through hybrid deep learning.
- Storytelling: increasing agency in narrative experiences and developing systems that better encode and decode narrative information.
- Environmental stewardship: drawing on Indigenous Knowledges to inform climate, ecological and environmental AI research, such as forecasting.
- Multi-agent systems: incorporating Indigenous perspectives into AI frameworks and ensuring that “humans are kept in the loop”, aiming to “develop systems that are driven by consensus-based goals and natural observation of others’ behaviors.”
- Socio-neuro AI: drawing on human experience to better understand socio-cultural contexts (2024, pp. 2149-2150).
Similarly, the Indigenous Protocol and Artificial Intelligence Position Paper (Lewis, 2020) expands these conversations globally, offering guidelines for Indigenous-centred AI design. These include principles such as locality, relationality and reciprocity, and responsibility, relevance and accountability, alongside recommendations to develop governance guidelines from Indigenous protocols, recognise the cultural nature of all computational technology, apply ethical design to the extended stack, and respect and support data sovereignty (Lewis, 2020, pp. 21-22).
If AI is to serve all humanity, it must be capable of recognising and respecting the multiplicity of ways we understand the world. This begins with confronting the colonial foundations of digital knowledge systems and building futures where Indigenous knowledges are not filtered through AI, but where Indigenous knowledges transform AI.
Next month’s Blak Focus will explore Indigenous innovation and transformative uses of AI. In that edition, we will look at what a group of Indigenous scholars (Lewis, Arista, Pechawis & Kite, 2018) have coined ‘making kin with the machines’, a ‘circle of relationships’ that includes non-human kin.
Blak Focus is a monthly edition of Indigenous-focused content, created by Deakin Library’s Indigenous Programs team. Blak Focus is intended to share Indigenous ways of being and knowing to help facilitate the transition to embedding Indigenous knowledges into academic practice. For enquiries about Blak Focus, or to request topics for future editions, please reach out to lib-indigenous@deakin.edu.au.
References
Crawford, K. (2021). The Atlas of AI: Power, Politics, and the Planetary Costs of Artificial Intelligence. Yale University Press. https://research.ebsco.com/linkprocessor/plink?id=a147eb24-5a31-356c-99b9-f1f9092f06fa
Ferrara, E. (2024). Fairness and Bias in Artificial Intelligence: A Brief Survey of Sources, Impacts, and Mitigation Strategies. Sci, 6(1), 3. https://doi.org/10.3390/sci6010003
Hare, S. (2022). Technology Is Not Neutral: A Short Guide to Technology Ethics. London Publishing Partnership.
Khurana, S. (2025). Decolonizing Artificial Intelligence: Indigenous Knowledge Systems, Epistemic Pluralism, and the Ethics of Technology. Journal of Computer Allied Intelligence, 3(3), 1-10. https://doi.org/10.69996/jcai.2025013
Kukutai, T., & Taylor, J. (2016). Indigenous data sovereignty: Toward an agenda. ANU Press. https://research.ebsco.com/linkprocessor/plink?id=543c04c3-4a00-381e-9694-7b2aaa32fc61
Landini, G. G. (2025). Traditional knowledge, environmental challenges and artificial intelligence: Ethical generative AI use and sustainable approaches. In The Routledge Handbook of Artificial Intelligence and International Relations (1st ed.). Routledge.
Lewis, J. E. (Ed.). (2020). Indigenous Protocol and Artificial Intelligence Position Paper. The Initiative for Indigenous Futures and the Canadian Institute for Advanced Research (CIFAR). https://spectrum.library.concordia.ca/986506
Lewis, J. E., Arista, N., Pechawis, A., & Kite, S. (2018). Making Kin with the Machines. Journal of Design and Science. https://doi.org/10.21428/bfafd97b
Lewis, J. E., Whaanga, H., & Yolgörmez, C. (2024). Abundant intelligences: placing AI within Indigenous knowledge frameworks. AI & Society, 40(7), 2141–2157. https://doi.org/10.1007/s00146-024-02099-4
Noble, S. U. (2018). Algorithms of oppression: How search engines reinforce racism. NYU Press. https://research.ebsco.com/linkprocessor/plink?id=f266c056-acab-313a-835e-101de115a28c
O’Neil, C. (2016). Weapons of math destruction: How big data increases inequality and threatens democracy. Crown.
Peng, L., & Zhao, B. (2024). Navigating the ethical landscape behind ChatGPT. Big Data & Society, 11(1). https://doi.org/10.1177/20539517241237488
Suchman, L. A. (1985). Plans and situated actions: The problem of human-machine communication (ISL-6). Xerox Palo Alto Research Center.
Vijay, K. (2024, November 5). What datasets are used to train generative AI models? AiOps Redefined. https://www.theaiops.com/what-datasets-are-used-to-train-generative-ai-models/
Wang, Y.F., Chen, Y.C., Huang, Y.C., Redwing, C., & Tsai, C.H. (2024). Tribal Knowledge Cocreation in Generative Artificial Intelligence Systems. In Proceedings of the 25th Annual International Conference on Digital Government Research (dg.o ’24) (pp. 637–644). Association for Computing Machinery. https://par.nsf.gov/biblio/10528101-tribal-knowledge-cocreation-generative-artificial-intelligence-systems
Worrell, T. (2024). Uncle Chatty Gee: Harms of generative AI on Indigenous knowledges and sovereignty. Australian Association for Research in Education. https://www.aare.edu.au/publications/aare-conference-papers/show/15077/uncle-chatty-gee-harms-of-generative-ai-on-indigenous-knowledges-and-sovereignty
Worrell, T., & Johns, D. (2024). Indigenous considerations of the potential harms of generative AI. Agora, 59(2), 33–36. https://search.informit.org/doi/10.3316/informit.T2024070500013200755488162