Measurement modeling could further the government's understanding of AI policymaking tools.
Governments are increasingly using artificial intelligence (AI) systems to support policymaking, deliver public services, and manage internal people and processes. AI systems in public-facing services range from predictive machine-learning systems used in fraud and benefit determinations to chatbots used to communicate with the public about their rights and obligations across a range of settings.
The integration of AI into agency decision-making processes that affect the public's rights poses unique challenges for agencies.
System design decisions about training data, model design, thresholds, and interface design can set policy, thereby affecting the public's rights. Yet today many agencies acquire AI systems through a procurement process that lacks opportunities for public input on system design choices that embed policy, limits agencies' access to information necessary for meaningful assessment, and lacks validation and other processes for rooting out biases that may unfairly, and at times illegally, affect the public.
Even where agencies develop AI systems in-house, it is unclear, given the lack of publicly available documentation, whether policy-relevant design choices are identified and subjected to rigorous internal scrutiny, and there are only a few examples of such policy-relevant design choices being subject to public vetting.
AI systems can be opaque, making it difficult to fully understand the logic and processes underlying an output and, in turn, to meet obligations that attach to individual decisions. Furthermore, automation bias and the interfaces and policies that shape agency use of AI tools can turn systems intended as decision support into decision displacement.
Some governments have begun to grapple with the use of AI systems in public service delivery, providing guidance to agencies about how to approach the embedded policy choices within AI.
Canada, for example, adopted new regulations to ensure agency use of AI in service delivery is compatible with core administrative law principles including transparency, rationality, accountability, and procedural fairness. In April 2021, the European Commission unveiled a proposed Artificial Intelligence Act, which is currently wending its way through the complex EU trilogue process. If adopted, the European law will, among other things, set standards and impose an assessment process on AI systems used by governments to allocate public benefits or affect fundamental rights.
These efforts are important. Nevertheless, building the capacity of administrative agencies to identify technical choices that are policy, and that therefore ought to be subject to the technocratic and democratic requirements of administrative law regardless of whether AI systems are built or bought, requires tools and guidance to assist with assessments of data suitability, model design choices, validation and monitoring techniques, and additional agency expertise.
There is a growing set of tools and methods for AI system documentation. Used at appropriate times in the development or procurement of an AI system, these tools can support collaborative interrogation of AI systems by domain experts and system designers.
One such method is measurement modeling. Part of routine practice in the quantitative social sciences, measurement modeling is the process of developing a statistical model that links unobservable theoretical constructs (what we would like to model) to data about the world (what we are left with). We have argued elsewhere that measurement modeling provides a useful framework for understanding theoretical constructs such as fairness in computational systems, including AI systems.
Here, we explain how measurement modeling, which requires clarifying the theoretical constructs to be measured and their operationalization, can help agencies understand the implications of AI systems, design models that reflect domain-specific knowledge, and identify discrete design choices that should be subject to public scrutiny.
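A minimal code sketch, using invented names and weights rather than any real agency system, may make the idea concrete: a measurement model ties an unobservable construct to a handful of observable attributes, and every choice of attribute and weight is an assumption that someone made.

```python
# A toy measurement model: an unobservable construct (here a hypothetical
# "school engagement") is operationalized as a weighted combination of
# observable indicators. The indicator choices and weights ARE the assumptions.
from dataclasses import dataclass


@dataclass
class MeasurementModel:
    construct: str                 # what we would like to model
    indicators: dict[str, float]   # observable attribute -> assumed weight

    def score(self, observed: dict[str, float]) -> float:
        # The construct is never observed directly; we only combine the
        # observables we chose to stand in for it.
        return sum(w * observed[name] for name, w in self.indicators.items())


# Hypothetical operationalization: each line below is a contestable policy choice.
engagement = MeasurementModel(
    construct="school engagement",
    indicators={"attendance_rate": 0.5, "assignments_submitted": 0.3, "portal_logins": 0.2},
)

print(engagement.score({"attendance_rate": 0.9, "assignments_submitted": 0.7, "portal_logins": 0.4}))
```

Nothing in the output reveals which indicators were chosen or how they were weighted; that is precisely the information measurement modeling forces into the open.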
The measurement modeling process makes the assumptions that are baked into models explicit. Too often, the assumptions behind models are not clearly stated, making it difficult to identify how and why systems do not work as intended.
But these assumptions describe what is being measured by the system: what the domain-specific understanding of the concept is, versus what is actually being implemented. This approach provides a key opportunity for domain experts to inform technical experts about the reasonableness of assumptions, both assumptions about which intended domain-specific understanding of a concept should be used and assumptions about how that concept is being implemented.
Careful attention to the operationalization of the selected concept offers an additional opportunity to surface mismatches between technical and domain experts' assumptions about the meaning of observable attributes used by the model.
The specific tools used to test measurement modeling assumptions are reliability and construct validity. Broadly, this entails asking questions such as: What does an assumption mean? Does the assumption make sense? Does it work, and in the way we expect?
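Some of these questions can be asked quantitatively. The sketch below, with invented scores for ten entities, illustrates two common checks: test-retest reliability (does the measure agree with itself over time?) and convergent validity (does it agree with an independent, accepted measure of the same construct?).

```python
import numpy as np

# Invented scores for the same ten entities in two consecutive years,
# plus an independent expert rating of the same construct.
year_1 = np.array([62, 71, 55, 90, 48, 77, 66, 84, 59, 73])
year_2 = np.array([60, 69, 58, 88, 50, 75, 70, 80, 61, 74])
expert = np.array([3.1, 3.6, 2.8, 4.5, 2.4, 3.9, 3.3, 4.2, 3.0, 3.7])

# Test-retest reliability: a measure of a slow-moving construct should
# correlate strongly with itself across adjacent years.
reliability = np.corrcoef(year_1, year_2)[0, 1]

# Convergent validity: the measure should also track an agreed-upon
# independent measure of the same construct.
convergent = np.corrcoef(year_1, expert)[0, 1]

print(f"test-retest reliability r = {reliability:.2f}")
print(f"convergent validity     r = {convergent:.2f}")
```

Low values on either check are a signal to domain and technical experts that the operationalization, the data, or both deserve a closer look.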
An easily overlooked yet crucial aspect of validity is consequential validity, which captures the understanding that defining a measure changes its meaning. This phenomenon includes Goodhart's Law, which holds that once a measure becomes a target, it ceases to be a good measure. In other words, does putting forward a measurement change how we understand the system?
As Ken Alder has written, measures are more than a creation of society; they create society. This means that any evaluation of a measurement model cannot occur in isolation. As with policymaking more broadly, effectiveness must be considered in the context of how a model will then be used.
AI systems used to allocate benefits and services assign scores for purposes such as predicting a teacher's or school's quality, ranking the best nursing homes for clinical care, and determining eligibility for social support programs. Those assigned scores can be used as inputs into a broader decision-making process, such as to allocate resources or decide which teachers to fire.
Consider SAS's Education Value-Added Assessment System (EVAAS), a standardized tool that claims to measure teacher quality and school district quality. Measurement modeling can help break down what EVAAS is doing: what policies are being enforced, what values are being encoded, and what harms may come to pass as a result.
The EVAAS tool operationalizes the construct of teacher quality from a range of abstract ideals into a specific idea, a latent force that can be measured from differences in student test scores across years. To ensure that a measurement model is capturing what is intended, the designers of specific EVAAS tools need to consider the validity of the design choices involved.
For instance, does the operationalization of teacher quality fully capture the ideal (content validity) or match other agreed-upon measures (convergent validity)? Cathy O'Neil has described examples where EVAAS scores were misaligned with teachers receiving teaching awards and support from the community.
We can further ask: Are the EVAAS teacher scores reliable across years? Again, O'Neil has pointed to examples where a teacher could go from scoring six out of 100 to 96 out of 100 within one year. Teacher scores can further penalize teachers whose students score near the lower thresholds. Under-resourced school districts systematically receive lower teacher quality scores, which are much more likely a reflection of other social phenomena than of the teachers themselves (discriminant validity).
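A stylized simulation, not the actual EVAAS methodology (which is proprietary), suggests why such swings are plausible: if teacher quality is operationalized as the average year-over-year test-score gain of one small class, test measurement noise alone can dwarf any true teacher effect.

```python
import numpy as np

rng = np.random.default_rng(0)


def value_added_score(true_effect: float, class_size: int, noise_sd: float) -> float:
    # Stylized operationalization: teacher score = mean year-over-year gain of
    # the teacher's students. Each gain mixes a small true teacher effect with
    # large test measurement noise.
    gains = true_effect + rng.normal(0.0, noise_sd, size=class_size)
    return gains.mean()


# Same hypothetical teacher, same true effect, two consecutive years.
true_effect, class_size, noise_sd = 2.0, 25, 15.0
scores = [value_added_score(true_effect, class_size, noise_sd) for _ in range(2)]
print(f"year 1 score: {scores[0]:+.1f}, year 2 score: {scores[1]:+.1f}")
# With noise this large relative to the effect, the two years can easily
# disagree in sign and magnitude -- the reliability problem O'Neil describes.
```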
In addition, EVAAS tools literally encourage teaching to the test, that is, pedagogy that emphasizes test performance, at the expense of other educational priorities.
But even AI tools used for discovery implicitly assign scores, which are used to allocate agency attention, yet another decision.
Consider a federal government-wide comment analysis tool that surfaces relevant regulatory comments, identifies novel information, and suppresses duplicate comments. What are those tools doing? Sorting comments by relevance, but that requires finding an implicit ranking based on some understanding and measurement of what relevance means.
A measurement of relevance depends on defining, or operationalizing, relevance. So any system that sorts by relevance depends on these measurements. And these measurements are used to guide users' actions about which comments should be followed up on or safely ignored, with what urgency, and so on.
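A small hypothetical sketch, with made-up comments and two deliberately crude scoring rules, shows how much rides on that choice: two defensible operationalizations of relevance rank the same comments differently, and whichever one ships in the tool quietly decides which comments get attention first.

```python
# Hypothetical comments on a hypothetical proposed rule.
rule_text = "emissions standards for heavy duty trucks"
comments = [
    "My asthma worsens every summer because of truck traffic near the school.",
    "The proposed emissions standards for heavy duty trucks exceed statutory authority.",
    "Attached is new air-quality monitoring data from our county health department.",
]

rule_terms = set(rule_text.lower().split())


def relevance_by_overlap(comment: str) -> int:
    # Operationalization A: relevance = overlap with the rule's own vocabulary.
    return len(rule_terms & set(comment.lower().split()))


def relevance_by_novelty(comment: str) -> int:
    # Operationalization B: relevance = amount of vocabulary NOT already in the
    # rule text (a crude proxy for "novel information").
    return len(set(comment.lower().split()) - rule_terms)


print(max(comments, key=relevance_by_overlap))   # surfaces the legal objection
print(max(comments, key=relevance_by_novelty))   # surfaces the lived-experience comment
# The two rankings put different comments on top; choosing between them is a
# policy choice, made in code.
```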
All this means that the definition and operationalization of relevance, or any other concept, is governance. Even though one person's understanding of what is relevant might differ from another's, there is now one understanding of relevance embedded in the AI model, out of sight and upstream. Human decisions that once informed policy are now tasks defined through design in upstream processes, possibly by third-party vendors rather than expert agency staff.
Previously visible and contestable decisions are now masked, and administrators have given this decision-making away. Unless, of course, they have tools that help them retain it. That is where measurement modeling comes in.
Although even skilled experts cannot fully understand complex AI systems through code review, measurement modeling provides a way to clarify design goals, concepts to be measured, and their operationalization. Measurement models can facilitate the collaboration between technical and domain experts necessary for AI systems that reflect agency knowledge and policy.
The rigor imposed by measurement modeling is essential given that important social and political values that must guide agency action, such as fairness, are often ambiguous and contested and therefore exceedingly complex to operationalize. Moreover, the data that systems train and run on is imbued with historical biases, which makes choices about mappings between concepts and observable facts about the world fraught with possibilities for entrenching undesirable aspects of the past.
When the measurement modeling process surfaces the need to formalize concepts that are under-specified in law, it alerts agencies to latent policy choices that must be subject not only to appropriate expert judgment but to the political visibility that is necessary for the legitimate adoption of algorithmic systems.
Whether an agency is developing the AI system or procuring it, there are a range of methods for bringing the knowledge of outside experts and the general public into the deliberation about system design. These include notice-and-comment processes, more consultative processes, staged processes of expert review and public feedback, and co-design exercises. Measurement modeling can be used within them all.
Issues warranting public participation can include decisions about the specific definition of a concept to be modeled as well as its operationalization. For example, fairness has multiple context-dependent, and sometimes even conflicting, theoretical definitions, and each definition is capable of different operationalizations.
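A toy illustration, using invented outcomes rather than any real benefit system, shows the conflict concretely: the same hypothetical eligibility model can look acceptable under one common operationalization of fairness and troubling under another, so the choice between them is itself a policy decision.

```python
# Toy outcomes for a hypothetical eligibility model, by group.
# Each record: (group, truly_eligible, model_approved)
records = [
    ("A", True, True), ("A", True, True), ("A", False, True), ("A", False, False),
    ("B", True, True), ("B", True, False), ("B", False, False), ("B", False, False),
]


def approval_rate(group: str) -> float:
    # Demographic parity compares overall approval rates across groups.
    rows = [r for r in records if r[0] == group]
    return sum(r[2] for r in rows) / len(rows)


def true_positive_rate(group: str) -> float:
    # Equal opportunity compares approval rates among the truly eligible.
    rows = [r for r in records if r[0] == group and r[1]]
    return sum(r[2] for r in rows) / len(rows)


for g in ("A", "B"):
    print(g, f"approval rate = {approval_rate(g):.2f}", f"TPR = {true_positive_rate(g):.2f}")
# Group A: approval 0.75, TPR 1.00; group B: approval 0.25, TPR 0.50.
# Which gap matters, and how much disparity is tolerable, is exactly the kind
# of contested definitional choice that warrants public input.
```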
Existing jurisprudence on the setting of formulas and numerical cutoffs, and on the choices underlying methodologies, provides useful guidance for identifying aspects of AI systems that warrant public input. Agency decisions that translate ambiguous concepts, such as what counts as appropriate, into a fixed number, or that establish preferences between false negatives and false positives, are clear candidates.
The introduction of AI systems into processes that affect the rights of members of the public demands urgent attention. Agencies need new ways to ensure that policy choices embedded in AI systems are developed through processes that satisfy administrative law's technocratic demands that policy decisions be the product of reasoned justifications informed by expertise.
Agencies also need guidance about how to adhere to transparency, reason-giving, and nondiscrimination requirements when individual determinations are informed by AI-driven systems. They also need new experts and new tools to validate and monitor AI systems to protect against poor or even illegal outcomes produced by forces ranging from automation bias and model drift to strategic human behavior.
Without new approaches, the introduction of AI systems will inappropriately deny and award benefits and services to members of the public, diminish confidence in government's ability to use technical tools appropriately, and ultimately undermine the legitimacy of agencies and the market for AI tools more broadly.
Measurement modeling offers agencies and the public an opportunity to collectively shape AI tools before they shape society. It can help agencies clarify and justify the assumptions behind the models they choose, expose and vet those assumptions with the public, and ensure that they are appropriately validated.
Abigail Z. Jacobs is an assistant professor of information and of complex systems at the University of Michigan.
This essay is part of a nine-part series entitled Artificial Intelligence and Procurement.