A sustainable machine-learning infrastructure

Written by Rupak Chakraborty | Jul 27, 2023 9:21:00 AM

Acknowledgments: This article was made possible by contributions and collaboration from the entire Machine Learning team at eyeo. Thank you for running the benchmarks, crunching the numbers and reviewing the content.

Artificial intelligence (AI), an undeniable force in the modern world, can lead to positive developments or be used for more sinister motives. At eyeo, we are committed to using AI and machine learning (ML) models ethically and sustainably as we pioneer the automation of commercial ad-filtering to protect user choice while providing solutions for publishers to monetize their content and advertisers to connect with consumers on mutually agreed terms. Our series “eyeo and Ethical AI” focuses on how we integrate AI and machine learning with our engineering work, adhering to fundamental values like individual rights, privacy and non-manipulation, while minimizing the negative impact this technology can have on the environment.

Sustainable AI is defined by two branches: “AI for sustainability” and “sustainability of AI”. (1) AI for sustainability means using AI to help promote environmental sustainability, e.g. using AI to decarbonize media by optimizing the supply chain of ads (supply-path optimization) in the advertising industry. (2) The sustainability of AI means developing AI with minimal negative costs to the environment.

While much of the ML community is moving towards bigger datasets and multi-billion parameter models, we are moving in the opposite direction: making our models as lean and as energy efficient as practically possible. This post illustrates how we keep our entire machine-learning infrastructure as effective as any other while aligning it with our purpose of transforming the web into a safe, sustainable and accessible place.

Industry-wide emission standards

If we’re talking about using AI ethically, the environmental impact of the technology cannot be ignored. The energy consumption of popular ML models is staggering and growing exponentially. Here are some figures relating to the training and inference of machine learning models, drawn largely from a 2019 study by the University of Massachusetts Amherst:

In the researchers' own case study, developing a single publication-ready NLP model took six months. 4,789 different versions of the model were trained, requiring 9,998 total days’ worth of GPU time (more than 27 years). Taking all these runs into account, the researchers estimated that building this model generated over 78,000 total pounds of CO₂ emissions.
Training and maintaining an ML model comparable to large language models (LLMs) can emit more than 626,000 pounds of carbon dioxide equivalent (CO₂e), nearly five times the lifetime emissions of the average American car.
BERT, which forms the de facto base of many LLMs, has a carbon footprint of roughly 1,400 pounds of CO₂ for one training run, equivalent to a round-trip trans-American flight for one person.
The inference stage consumes even more energy than training. According to this 2019 Forbes article, Nvidia estimates that 80 to 90 percent of the resources consumed in a neural network happen when deploying a trained model to make predictions (in inference) rather than in training.

eyeo’s approach: Lean and green

To protect user privacy, it is imperative that our models run in the browser, which has limited computing power and memory and therefore requires leaner models. To keep our carbon footprint close to zero without compromising effectiveness, we have consciously adopted the following best practices to optimize our ML models and supporting infrastructure.

Sparse model architectures: reduce the computation complexity by 3x-10x while upholding the ML quality and prediction performance. We carefully selected the feature set and trained the graph convolutional neural networks (GCNs) to handle sparse graph structures efficiently.
Model weight quantization: selecting 32-bit floating-point numbers over 64-bit floats for our weights reduces the model binary size to ~700 KB (the BERT base model weighs 440 MB in comparison).
Limiting the number of layers reduces the total trainable parameters to ~145K (BERT base has 110M parameters).
Cloud-based computation rather than on-premise computation reduces energy usage and therefore emissions by 1.4x-2x. Cloud-based data centers are new, custom-designed warehouses built to run 50,000 servers energy-efficiently, resulting in a good (low) power usage effectiveness (PUE).
We chose Google Cloud Platform as our cloud service provider since it runs 57 percent of its services on renewable energy and specifically selected regions in the EU that support renewable energy, further reducing the gross carbon footprint by 5x–10x.
Using processors (GPU/TPU) and systems optimized for ML training, versus general-purpose processors, can improve performance and energy efficiency by 2x–5x.
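
The size figures above can be sanity-checked with simple arithmetic: raw weight storage is the number of parameters times the bytes per parameter. The sketch below is only an illustration under that assumption; the helper name is hypothetical, and it ignores file-format overhead, which is presumably why the ~700 KB binary is somewhat larger than the raw weights.

```python
# Rough weight-storage estimate: parameters x bytes per parameter.
# Parameter counts are taken from the article; the helper name is hypothetical
# and file-format overhead (metadata, graph structure) is deliberately ignored.

BYTES_PER_DTYPE = {"float64": 8, "float32": 4}


def weight_size_bytes(num_params: int, dtype: str) -> int:
    """Return the raw size of the weights in bytes for the given dtype."""
    return num_params * BYTES_PER_DTYPE[dtype]


lean_params = 145_000            # ~145K trainable parameters
bert_base_params = 110_000_000   # BERT base, 110M parameters

lean_fp32 = weight_size_bytes(lean_params, "float32")
lean_fp64 = weight_size_bytes(lean_params, "float64")
bert_fp32 = weight_size_bytes(bert_base_params, "float32")

print(f"lean model, float32: {lean_fp32 / 1e3:.0f} KB")   # 580 KB
print(f"lean model, float64: {lean_fp64 / 1e3:.0f} KB")   # 1160 KB
print(f"BERT base, float32:  {bert_fp32 / 1e6:.0f} MB")   # 440 MB
```

Under these assumptions, choosing 32-bit over 64-bit floats halves the raw weight storage, and the 110M-parameter estimate lands on the 440 MB quoted for BERT base.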

By the numbers

In theory, all those best practices sound helpful, but are they really effective? We plugged our training setup into the Machine Learning Emissions Calculator to estimate our carbon footprint. Here are the findings:

Training emissions

Training experiments for our general-purpose ad-filtering model were conducted with the following setup:

Using: Google Cloud Platform (GCP)
Region: europe-west2 (carbon efficiency of 0.62 kg CO₂/kWh)
Cumulative computation: 50 hours
Hardware type: AMD EPYC 7763 (TDP of 280 W)

Total emissions are estimated to be 8.68 kg CO₂, of which 100 percent were directly offset by the cloud provider (GCP in our case).

*Estimations were conducted using the Machine Learning Emissions Calculator.
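
The training estimate can be reproduced by hand from the setup listed above. The sketch below is a minimal illustration, not the calculator's actual implementation: it assumes the processor draws its full TDP for the entire run, and the function name is hypothetical.

```python
# Energy = power x time; emissions = energy x carbon intensity of the region.
# Simplifying assumption: the processor runs at its full TDP for the whole job.

def training_emissions_kg(tdp_watts: float, hours: float, kg_co2_per_kwh: float) -> float:
    """Estimate training emissions in kg CO2 from hardware TDP and runtime."""
    energy_kwh = tdp_watts / 1000 * hours
    return energy_kwh * kg_co2_per_kwh


# Values from the training setup: AMD EPYC 7763, 50 hours, europe-west2.
emissions = training_emissions_kg(tdp_watts=280, hours=50, kg_co2_per_kwh=0.62)
print(f"{emissions:.2f} kg CO2")  # 8.68 kg CO2
```

With 280 W over 50 hours, the run consumes 14 kWh, which at 0.62 kg CO₂/kWh gives the 8.68 kg quoted above.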

Training energy consumption

Training our model uses more than 1,000 times less energy than training a standard large language model.

Inference energy consumption

On average, our energy consumption is ~0.1 kWh per user session, much less than the energy consumed by a fluorescent light bulb or an incandescent lamp.

Inference emissions

The EPA estimates that 1 kilowatt-hour of energy consumption generates 0.43 kg of carbon emissions. Our model, which consumes only 0.1 kilowatt-hour of energy per user session, therefore emits 0.043 kg of carbon dioxide per session.
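
The per-session figure is a single multiplication, and the same arithmetic scales to any number of sessions. In the sketch below the two constants come from the text above, while the function name and the session count of 1,000 are hypothetical and used only for illustration.

```python
KG_CO2_PER_KWH = 0.43    # EPA figure quoted above
KWH_PER_SESSION = 0.1    # average energy per user session quoted above


def inference_emissions_kg(sessions: int) -> float:
    """Estimate inference emissions in kg CO2 for a number of user sessions."""
    return sessions * KWH_PER_SESSION * KG_CO2_PER_KWH


per_session = inference_emissions_kg(1)
per_thousand = inference_emissions_kg(1_000)  # hypothetical session count

print(f"1 session:      {per_session:.3f} kg CO2")
print(f"1,000 sessions: {per_thousand:.1f} kg CO2")
```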

Optimizing the sustainability of AI

If we are going to create something to promote sustainability, we need to make sure we build it in a sustainable way too.

eyeo Director of Engineering, Dr. Humera Noor Minhas, states:

“Through our commitment to sustainability and ethical AI, we harness the power of machine learning responsibly, ensuring our technological advancements align with a greener future.” It is a clear sign of our commitment to sustainability that every stage of the process is approached with an eye toward optimization.

We encourage others to look at their own models to see where they can reduce their carbon emissions. In this challenge, you may just find an opportunity.

Come back for the next blog in the series “eyeo and Ethical AI” where we discuss using AI to protect user choice.

––

Machine Learning and AI will be one of the focus topics of the upcoming 2023 Ad-Filtering Dev Summit in Amsterdam and online from 4-5 October 2023. Registration is now open, save your spot here.
