Essential Takeaways From Stanford’s AI Index 2022 Report
Stanford’s Institute for Human-Centered Artificial Intelligence (HAI) aims to research AI with a human-centric approach, inspired by the depth and diversity of human minds and attentive at each step to its ethical impact on society. Its mission is to use AI to augment human capabilities rather than replace them.
Researched and written by the institution’s leading experts in academia and business, the fifth annual edition of the AI Index report showcases the data behind the latest trends in the field of AI over the past year. The report is 230 pages long. In this article, we outline the essential takeaways from each chapter of the report.
Read on to learn which nations have made the biggest strides in AI, the top locations around the world for AI jobs, the state of AI regulation, and more.
Chapter 1: Research and Development
Chapter Summary
AI is a hot commodity, so a wide range of research and development has happened over the last year. This chapter tries to quantify research and development across academic institutions, government organizations, and industry.
The report draws its data from publicly available sources such as conference papers, journal articles, research papers, and patents. It also looks at GitHub stars for open-source repositories and even conference attendance. Together, these give a bird’s-eye view of how AI R&D is advancing across the world.
Key Highlights
USA and China are AI Comrades
When it comes to international cooperation in AI, the USA and China produced the greatest number of joint AI publications from 2010 to 2021. Their collaborations have increased fivefold since 2010 despite the political differences between the two nations. The second-largest partnership is that of the UK and China.
China is Leading the Way in Publications
China currently leads AI research activity, with the highest number of journal, conference, and repository publications.
Published Journals Are at An All-Time High
In 2021, 51.5% of all AI documents published were journal articles. As a general trend, AI journal and repository publications are at an all-time high. However, the number of conference publications has been declining since 2018.
Fastest-growing fields of study
Publications in pattern recognition and machine learning have more than doubled since 2015. In 2021, around 52,000 research papers were published in pattern recognition and around 40,000 in machine learning.
Cross-Sector Collaborations
Collaborations between nonprofits and educational institutions produced the highest number of cross-sector AI publications from 2010 to 2021. Next came collaborations between the private sector and educational institutions, followed by those between government and educational institutions.
AI Patents are at an All-time High
As developments in AI accelerate, more researchers and scientists have rushed to protect their work, with patent filings compounding at an annual growth rate of 76.9%.
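To make the compounding concrete, here is a minimal Python sketch of how an annual growth rate translates into an overall growth factor, and back again. The six-year window is an assumption made purely for illustration, since the paragraph above states only the 76.9% rate, and the function names are hypothetical.

```python
def growth_factor(annual_rate: float, years: int) -> float:
    """Overall multiplier after compounding `annual_rate` for `years` years."""
    return (1 + annual_rate) ** years


def compound_annual_growth_rate(factor: float, years: int) -> float:
    """Annual rate that compounds to the overall multiplier `factor`."""
    return factor ** (1 / years) - 1


# Assumption: a six-year window; only the 76.9% rate comes from the article.
YEARS = 6
factor = growth_factor(0.769, YEARS)
print(f"Growth over {YEARS} years: {factor:.1f}x")
print(f"Recovered annual rate: {compound_annual_growth_rate(factor, YEARS):.1%}")
```

Under that assumed window, a 76.9% annual rate would multiply filings roughly thirtyfold, which shows how quickly such a rate compounds.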
AI repositories: The new way of sharing AI research
Publishing pre-peer-reviewed research on arXiv and SSRN has become a popular method among AI researchers to share their findings. Over the past 12 years, the number of AI repository publications has grown by a factor of almost 30.
Popular open-source AI libraries
The report compared the total number of GitHub stars for popular open-source AI libraries over the last seven years. TensorFlow has the most GitHub stars, followed by OpenCV, Keras, and PyTorch. Other popular repositories with fewer than 40k stars include faceswap, 100-Days-Of-ML-Code, and AiLearning.
Chapter 2: Technical Performance
Chapter Summary
This chapter tracks the progress of various models in computer vision, natural language processing, speech, recommendation, reinforcement learning, hardware, and robotics. It quantifies model performance over a 10-year period using commonly used benchmarks and professional surveys. Some of the benchmarks used in the report are listed below:
Computer Vision: Image & Video

| Machine Learning Problem | Benchmark Used |
| --- | --- |
| Image Classification | ImageNet: Top-1 & Top-5 Accuracy |
| Deep Fake Detection | FaceForensics++: Accuracy |
| Semantic Segmentation | Cityscapes Challenge: Mean Intersection-over-Union (IoU) |
| Activity Recognition | Kinetics-400, Kinetics-600, Kinetics-700: Top-1 Accuracy |
| Object Detection | Common Objects in Context (COCO) test-dev: Mean Average Precision |
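Two of the metrics named for the computer vision benchmarks, top-k accuracy and mean intersection-over-union, can be sketched in a few lines of Python. This is an illustrative sketch on made-up toy data, not the evaluation code used by the benchmarks or the report; the function names and inputs are hypothetical.

```python
from typing import Sequence


def top_k_accuracy(scores: Sequence[Sequence[float]],
                   labels: Sequence[int], k: int) -> float:
    """Fraction of samples whose true label is among the k highest scores."""
    hits = 0
    for row, label in zip(scores, labels):
        # Class indices sorted by descending score, truncated to the top k.
        top_k = sorted(range(len(row)), key=lambda i: row[i], reverse=True)[:k]
        hits += label in top_k
    return hits / len(labels)


def mean_iou(pred: Sequence[int], target: Sequence[int],
             num_classes: int) -> float:
    """Mean intersection-over-union across classes for flat pixel labels."""
    ious = []
    for c in range(num_classes):
        inter = sum(p == c and t == c for p, t in zip(pred, target))
        union = sum(p == c or t == c for p, t in zip(pred, target))
        if union:  # skip classes absent from both prediction and target
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else 0.0


# Toy data (made up): three samples, three classes.
scores = [[0.1, 0.7, 0.2], [0.5, 0.3, 0.2], [0.2, 0.3, 0.5]]
labels = [1, 1, 0]
print(top_k_accuracy(scores, labels, k=1))
print(top_k_accuracy(scores, labels, k=2))

# Toy segmentation (made up): four pixels, two classes.
print(mean_iou(pred=[0, 0, 1, 1], target=[0, 1, 1, 1], num_classes=2))
```

Top-5 accuracy on ImageNet is the same calculation with k=5: a prediction counts as correct if the true class appears anywhere among the model's five highest-scoring guesses.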
Language

| Machine Learning Problem | Benchmark Used |
| --- | --- |
| English Language Understanding | SuperGLUE: Score; SQuAD 1.1 and SQuAD 2.0: F1 Score; Reading Comprehension Dataset Requiring Logical Reasoning (ReClor): Accuracy |
| Text Summarization | arXiv: ROUGE-1; PubMed: ROUGE-1 |
| Natural Language Inference | Stanford Natural Language Inference (SNLI); Abductive Natural Language Inference (aNLI) |
Recommendation

| Machine Learning Problem | Benchmark Used |
| --- | --- |
| Commercial Recommendation | MovieLens 20M: Normalized Discounted Cumulative Gain@100 |
| Click-Through Rate Prediction | Criteo: Area Under Curve Score (AUC) |
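For readers unfamiliar with normalized discounted cumulative gain, the metric listed for MovieLens 20M, the following Python sketch shows the common textbook formulation on made-up data. It is an assumption that the benchmark uses exactly this variant; the function names are hypothetical, and the ideal ordering here is computed only from the items in the given list.

```python
import math
from typing import Sequence


def dcg_at_k(relevances: Sequence[float], k: int) -> float:
    """Discounted cumulative gain of the first k items, in ranked order."""
    return sum(rel / math.log2(rank + 1)
               for rank, rel in enumerate(relevances[:k], start=1))


def ndcg_at_k(relevances: Sequence[float], k: int) -> float:
    """DCG divided by the DCG of the ideal (descending) ordering."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0


# Toy data (made up): 1 = relevant, 0 = not relevant, in recommended order.
ranked_relevance = [1, 0, 1]
print(f"NDCG@3 = {ndcg_at_k(ranked_relevance, k=3):.3f}")
```

A score of 1.0 means the recommender placed the most relevant items first; relevant items that appear lower in the list contribute less because of the logarithmic discount.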
Reinforcement Learning

| Machine Learning Problem | Benchmark Used |
| --- | --- |
| Arcade Games | Atari-57: Mean Human-Normalized Score |
| Human Games: Chess | Chess Software Engines: Elo Score |
Key Highlights
Big Data is the Key to Success
Large training datasets are key to building a successful, highly accurate model. Almost all state-of-the-art models, with their millions of parameters, are trained on large datasets to achieve great results. Here, big tech companies have an edge because they hold a lot of data.
Affordable AI
All these years of innovation have paid off, as the average Joe can now easily afford to build large-scale models. Since 2018, the cost of training an image classifier has decreased by 63.6%. Moreover, the time it takes to train models has improved by 95%. The improvement in training times is one reason for the reduced cost. Other factors such as dedicated cloud services, efficient open-source packages, and availability of talent also add up to lower costs.
Cheap robotic arms
Based on a survey developed by the AI Index team, there is a notable downward trend in the pricing of robotic arms. The median price of a robotic arm was $42,000 in 2017; by 2021 it had fallen by about 46% to $22,600.
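The size of that price drop can be checked with a short percent-change calculation. The sketch below uses only the two median prices quoted above; the function name is hypothetical.

```python
def percent_change(old: float, new: float) -> float:
    """Relative change from `old` to `new`, expressed as a percentage."""
    return (new - old) / old * 100


# Median robotic arm prices quoted in the article (USD).
price_2017 = 42_000
price_2021 = 22_600
print(f"{percent_change(price_2017, price_2021):.1f}%")  # prints "-46.2%"
```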
Focus on medical imaging
Computer vision research is moving towards more use-case-oriented applications such as medical imaging, according to the report. From their dataset, the AI Index found a significant increase in research using the Kvasir-SEG medical imaging dataset and CVC-ClinicDB. In 2020, only three papers used this data, whereas 2021 saw 25 relevant publications.
AI Still Lacks Language Skills
Even though models have already surpassed human baselines on benchmarks such as SuperGLUE and SQuAD, which cover relatively simple tasks like reading comprehension, AI models are still far from human-level proficiency on more complex linguistic tasks such as abductive natural language inference (aNLI).
Improvement in General Reinforcement Learning
In the past 10 years, AI has become proficient at narrow reinforcement learning tasks such as playing chess. In 2021, however, there was a significant improvement on more general reinforcement learning tasks such as Procgen. Released by OpenAI in 2019, Procgen is a reinforcement learning environment that tests an AI’s ability to learn generalizable skills.
Chapter 3: Technical AI Ethics
Chapter Summary
Undoubtedly, AI has generated massive value for businesses over the past few years. However, as machine learning models are put into production, their use has exposed bias as one of the most significant shortcomings of AI. In many cases, AI models are trained on real-world data that contains social biases. These biases are further amplified when a machine learning model is built around the flawed data.
This chapter tries to quantify progress in eliminating bias by providing in-depth benchmarks for domains such as natural language and computer vision. Some of the benchmarks used are listed below:
Natural Language

| Ethical Issue | Benchmark |
| --- | --- |
| Toxicity in Natural Language | |
| Stereotype Bias | StereoSet Score by Model Size |
| Gender Bias | |
Key Highlights
AI Ethics Goes Mainstream
Research on the fairness of AI models has increased many-fold: since 2014, the number of publications has grown by 71% year-on-year. Biased outcomes have caused various commercial models to discriminate unfairly against specific subgroups in real-world applications such as credit card scoring. This has prompted significant interest among academic researchers and commercial companies in the fairness of machine learning models.
Large Language Models are Biased
Gigantic state-of-the-art language models with hundreds of billions of parameters, such as GPT-3, have become highly successful at language-based tasks. These models are built on the transformer architecture, which relies on attention. However, new data shows that these large models are also more prone to absorbing biases from their extensive training data.
Multimodal models are inclined to be more biased
Similar to large language models, multimodal language-vision models like DALL·E 2 demonstrate a diverse set of capabilities thanks to large-scale training on both text and images. The result is not only top-of-the-line photorealistic images generated from a text prompt but also outputs that reflect societal stereotypes and biases.
Chapter 4: The Economy and Education
Chapter Summary
This chapter examines the impact of AI on both the economy and education. The report has gathered publicly available data from LinkedIn, the Computing Research Association, McKinsey, NetBase Quid, and Emsi Burning Glass. It also looks at AI’s effect on jobs, including hiring, labor demand, and skill penetration.
Key Highlights
AI is high on investments
In 2021, AI startups raised $93.5 billion, double the amount raised in 2020. However, the number of newly funded AI companies fell from 1,051 in 2019 to 746 in 2021. In 2021, there were 15 funding rounds of $500 million or more.
The USA is betting on AI innovation
The United States leads in both total private investment in AI and the number of newly funded AI companies, roughly three and two times higher, respectively, than runner-up China.
Cloud is Prioritized on Investments
Companies in the cloud data management and processing sector, such as Databricks, received the most private funding. The runners-up are healthcare and fintech.
Top Regions for AI jobs
New Zealand, Hong Kong, Ireland, Luxembourg, and Sweden had the highest growth in AI job openings from 2016 to 2021.
AI Job Scope Within the US
The state of California, home to Silicon Valley, has the highest number of AI job postings, with 2.3 times as many as runner-up Texas. However, Washington, DC, had the highest share of AI job postings relative to its overall number of job postings.
AI is the Most Popular Specialty Among Computer Science PhDs
The most popular specialty among Computer Science PhDs over the past decade has been machine learning/artificial intelligence. One in every five CS students who graduated with a PhD in 2020 specialized in AI.
Chapter 5: AI Policy and Governance
Chapter Summary
This chapter breaks down legislation and regulation affecting artificial intelligence worldwide. It looks at how different countries and regions are working to accommodate AI technology while keeping the well-being of their people as the central focus. Researchers looked at policymaking worldwide and counted the bills proposed and passed that mention artificial intelligence.
Key Highlights
More and more countries are regulating the use of AI
The report analyzed the number of AI-related bills (containing Artificial Intelligence as a keyword) being passed across 25 countries from 2016 to 2021. The report shows a sharp increase in the number of bills passed related to AI in the last two years. Spain, the United Kingdom, and the United States had the highest number of AI-related bills passed into law, with 3 each.
The US leads in proposing AI-related bills
US legislative records show a sharp increase in the number of proposed bills to regulate AI. In 2021, 130 AI-related bills were proposed in the US. However, only 2% of those were passed into law.
State-level AI legislation in the United States
In the US, legislation around AI has been considered nationwide, with 41 out of 50 states having proposed at least one AI-related bill between 2012 and 2021. Top three states with the most AI bills proposed:
- Massachusetts (40)
- Hawaii (35)
- New Jersey (32)
Democrats vs. Republicans on AI
In the US, state-level data on AI lawmaking can be further broken down by sponsorship from the parties, and the data reveals that Democrats support lawmaking around AI more than Republicans. This gap has further increased over the years. In 2021, Democrats sponsored 39 more bills than Republicans.
AI Mentions in Global Legislative Hearings
The AI Index report analyzed records of legislative hearings in 25 countries for the keyword “artificial intelligence” from 2016 to 2021. Mentions of AI grew by a factor of 7.7 over those six years, reaching 1,323 in 2021.
Keeping up with AI
The field of AI is continuously evolving, and millions of experts worldwide are working around the clock to keep it that way.