This article is part of a special issue of VB. Read the full series here: Intelligent Sustainability.
Everything counts in large quantities. You don’t have to be Google or build big AI models to benefit from writing efficient code. But how do you measure that?
It’s complicated, but that’s what Abhishek Gupta and the Green Software Foundation (GSF) are working on relentlessly. The GSF is a non-profit organization formed by the Linux Foundation, with 32 organizations and nearly 700 individuals participating in various projects to further its mission.
Its mission is to build a trusted ecosystem of people, standards, tooling and best practices for creating and building green software, which it defines as “software responsible for emitting fewer greenhouse gases”.
Accenture, BCG, GitHub, Intel, and Microsoft participate in the GSF, and its efforts are organized into four working groups: standards, policy, open source, and community.
Gupta chairs the Standards working group at the GSF, in addition to his roles as senior responsible AI leader and expert at BCG and founder and principal investigator of the Montreal AI Ethics Institute. He shared the group’s current work and roadmap on measuring the impact of software on sustainability.
The first step to greener code is to measure its impact
The first thing Gupta notes about the GSF is that it focuses on reduction, not neutralization. This means that things like renewable energy credits or power purchase agreements aimed at compensation and neutralization are not part of the GSF’s mission. The focus, Gupta said, is on actually reducing emissions through the way you design, develop and deploy software systems. This is a work in progress and a very complex exercise.
But businesses of any scale can benefit from more efficient code. Think about what happens to your phone or laptop when you run apps that require more or less processing, such as playing videos versus editing text. The difference in battery consumption is significant. The bigger the scale, the bigger the stakes: when building large language models, for example, being more efficient can yield significant savings.
The first step to improvement is to measure, as the famous saying goes. The focal point of Gupta’s work with the GSF Standards working group is something called the Software Carbon Intensity (SCI) specification. The SCI specification defines a method of calculating the rate of carbon emissions for a software system.
The GSF has adopted the idea of carbon efficiency as a way of thinking about the carbon impacts of software systems. This, Gupta explained, is broken down into three parts: energy efficiency, hardware efficiency and carbon awareness.
Energy efficiency is trying to use as little electricity as possible. Electricity is the primary way software consumes energy, and in most parts of the world it is primarily generated by burning fossil fuels. This is where the CO2 impact comes from.
Hardware efficiency means using as little embodied carbon as possible. Embodied carbon, Gupta noted, aims to capture the carbon impact of everything that goes into hardware such as servers, chips and smartphones.
Carbon awareness focuses on trying to do more work when the electricity is “clean” and less when the electricity is “dirty,” Gupta said. He also referred to the concept of energy proportionality. The idea there is that higher utilization of a piece of hardware means more of the electricity it draws is converted into useful work rather than spent while it sits idle. However, when it comes to actually measuring impact, things get messy.
“Some people look at FLOPs. Some look directly at the energy consumption of the systems, and there are different approaches that lead to very different results. That’s one of the challenges we face in the field,” Gupta said.
The goal, Gupta said, is to have energy efficiency, hardware efficiency and carbon awareness feature very explicitly in the calculation. Ultimately, the SCI aims to become an official standard that promotes comparability.
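As a rough illustration of how those three factors can appear in one calculation, the published SCI specification expresses the score as operational emissions (energy used times the carbon intensity of the electricity) plus embodied emissions, divided by a unit of work. The sketch below follows that shape; the function name, parameter names and all figures are assumptions made up for the example, not measurements from the article.

```python
def sci_score(energy_kwh, grid_intensity_g_per_kwh, embodied_g, functional_units):
    """Carbon emitted per unit of work, in gCO2eq: ((E * I) + M) / R.

    energy_kwh               -- E, electricity consumed by the software
    grid_intensity_g_per_kwh -- I, carbon intensity of that electricity
    embodied_g               -- M, share of hardware embodied emissions
    functional_units         -- R, units of work done (e.g. API calls)
    """
    if functional_units <= 0:
        raise ValueError("functional_units must be positive")
    operational_g = energy_kwh * grid_intensity_g_per_kwh
    return (operational_g + embodied_g) / functional_units


# Hypothetical figures, for illustration only.
score = sci_score(
    energy_kwh=12.0,
    grid_intensity_g_per_kwh=400.0,
    embodied_g=1200.0,
    functional_units=100_000,
)
print(f"{score:.3f} gCO2eq per unit of work")
```

Each input maps to one lever: using less energy lowers E, running on cleaner electricity lowers I, and using hardware more efficiently lowers M.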
Granularity and transparency are essential for a complex business
One of the key points Gupta made is that “software and hardware are inextricably linked.” The GSF prioritizes reducing CO2 emissions from software, but the choice and use of hardware is a very important part of that.
Today, the cloud is where most software is developed and deployed. When we talk about software systems deployed in the cloud, one question Gupta said people often ask is about fractional usage. If only a fraction of a certain piece of hardware is used, and only for a certain amount of time, how should that be accounted for? This is where time sharing and resource sharing come into play.
These are ways of calculating what proportion of a hardware system’s embodied emissions should be taken into account when calculating the carbon intensity score for software. Scale is also taken into account, via a parameter Gupta called the functional unit. This can be, for example, the number of minutes a user spends working with the software, or the number of API calls that are made.
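To make the fractional-usage idea concrete, here is a minimal sketch that scales a machine's total embodied emissions by a time share (hours reserved over the hardware's expected lifespan) and a resource share (resources reserved over the total available), which is how the published SCI specification apportions embodied carbon. The function name and every figure below are hypothetical.

```python
def embodied_share_g(total_embodied_g, hours_reserved, lifespan_hours,
                     resources_reserved, total_resources):
    """Portion of a machine's embodied emissions attributed to one workload."""
    if lifespan_hours <= 0 or total_resources <= 0:
        raise ValueError("lifespan_hours and total_resources must be positive")
    time_share = hours_reserved / lifespan_hours
    resource_share = resources_reserved / total_resources
    return total_embodied_g * time_share * resource_share


# Hypothetical server: 1,000 kg CO2eq embodied, four-year lifespan, 16 vCPUs.
# The workload reserves 2 vCPUs for 24 hours.
lifespan_hours = 4 * 365 * 24
share = embodied_share_g(
    total_embodied_g=1_000_000.0,
    hours_reserved=24,
    lifespan_hours=lifespan_hours,
    resources_reserved=2,
    total_resources=16,
)
print(f"{share:.1f} gCO2eq attributed to this workload")
```

Dividing the result by the functional unit (minutes of use, API calls) then gives the embodied contribution per unit of work.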
For hardware, essentially, the full life cycle assessment must be considered in order to calculate embodied emissions. That’s really complex, so the GSF started an initiative to create open datasets with which people can calculate embodied emissions.
“If you reserve a particular instance with a cloud provider, they give you some information about the performance of that node and its parameters. But what are the specifics of that piece of hardware on which your software actually runs?” said Gupta. “Getting transparency, getting data about it, is often also important. And that’s why we’re investing in creating some open data so you can make those calculations easier.”
Granularity is key, as Gupta emphasized, otherwise it all becomes rather abstract and vague. This inevitably also leads to complexity and questions about boundaries, that is, what should be included in software CO2 emission calculations.
“You can think of memory, storage and compute, but also things that we often forget. What is the logging infrastructure? Do you have any kind of monitoring? Do you have idle machines on standby for redundancy? Do you have any kind of build and deployment pipelines?” he said. “Then, when we are talking about machine learning models, you can have an inventory of models that are used. You can have shadow deployments, canary deployments. You have all these things, backups that are in place, which are ultimately part of that boundary as well.”
The other important principle that Gupta emphasized is transparency. Transparency about what is included in calculations, but also about how these calculations are made. For example, if direct observability isn’t possible, the GSF promotes what Gupta called “a lab-based or model-based approach.”
“If we’re talking about consuming modules, APIs and third-party libraries where you don’t have direct visibility, taking a lab-based or model-based approach, where you can get an approximation and some directional information about what the carbon impacts are, is still useful. You can use that in your SCI score calculation, with the requirement that you are transparent and [state] that’s what you did,” Gupta said.
From measuring to acting
Ultimately, the SCI, with all its intricacies and complexities, is a means to an end, and the aim is to make it accessible to everyone. The goal, notes the GSF, is to help users and developers make informed choices about which tools, approaches, architectures and services to use in the future. It’s a score rather than a total; lower numbers are better than higher numbers, and reaching zero is impossible.
It is possible to calculate an SCI score for any software application, from a large distributed cloud system to a small monolithic open source library, any on-premise application or even a serverless function. The product or service can run in any environment, be it a PC, a private data center, or a hyperscale cloud.
As Gupta pointed out, there is an arsenal of related tools out there: the Allen Institute for AI’s Beaker, RAPL, GreenFrame, CodeCarbon and PowDroid, to name a few. The GSF offers an extensive list.
These tools can help companies gain a better understanding of their applications’ power consumption, but because each tool measures a little differently, the results are often different, Gupta said. That is why the GSF promotes the introduction of the SCI.
An important aspect, regardless of the choice of a specific tool, is actionable feedback. That is, the tool should not only measure the CO2 impact of the software, but also offer suggestions for improvement. Some of these tools provide targeted recommendations on which parts of the code consume more energy and where to optimize. But that’s not all that matters — recommendations about processes and choices are also important, Gupta said.
For AI systems, Gupta explained that people also need to think about things like system design, training methodology, and model architectures. Quantizing weights, using distilled networks and adopting TinyML approaches can all be very helpful in reducing the carbon impacts of systems. Because there’s “huge pressure” to make AI models work on resource-constrained devices, that also has the byproduct of reducing carbon footprint.
Making the right hardware choices can also help, according to Gupta. Using appropriate hardware, such as application-specific integrated circuits (ASICs) or AI chips such as TPUs, can help reduce the amount of energy used to train AI models. The same goes for deploying AI models: there are systems developed specifically for that purpose, Gupta noted. Making tactical choices about where and when models are trained can also be beneficial.
Currently, sustainability reporting on software is in an embryonic stage. It’s rarely done, it’s voluntary and it’s not standardized. One example that comes to mind is Google Cloud Model Cards, used to report on AI models. Gupta believes that sustainability should become a first-class citizen everywhere, alongside business and functional considerations.
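As one illustration of the weight-related techniques mentioned above, the sketch below applies simple symmetric 8-bit quantization to a handful of made-up weights, storing each in one byte instead of the four a 32-bit float would need. This is a generic textbook scheme chosen for brevity, not a method attributed to Gupta or the GSF, and real frameworks use more elaborate variants.

```python
from array import array


def quantize_int8(weights):
    """Map floats to signed 8-bit integers using one shared scale factor."""
    max_abs = max((abs(w) for w in weights), default=0.0)
    scale = max_abs / 127 if max_abs else 1.0
    quantized = array("b", (round(w / scale) for w in weights))
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate float values from the 8-bit representation."""
    return [q * scale for q in quantized]


# Hypothetical weights, for illustration only.
weights = [0.5, -1.27, 0.003, 1.0]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)

float32_bytes = len(weights) * array("f").itemsize
int8_bytes = len(q) * q.itemsize
print(f"{float32_bytes} bytes as float32, {int8_bytes} bytes as int8")
print("largest rounding error:", max(abs(a - b) for a, b in zip(weights, restored)))
```

The trade-off is a small rounding error per weight in exchange for a smaller model, which generally means less memory traffic on constrained devices.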
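A choice of where and when to run a training job can be sketched as picking the region and time slot with the lowest forecast grid carbon intensity. The example below assumes such a forecast is already available; the `Slot` type, region names and intensity values are all hypothetical, and a real system would pull them from a grid-data provider.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Slot:
    region: str
    start_hour: int      # hour of day, 0-23
    intensity: float     # forecast grid carbon intensity, gCO2eq per kWh


def greenest_slot(forecast):
    """Return the slot with the lowest forecast carbon intensity."""
    if not forecast:
        raise ValueError("forecast must not be empty")
    return min(forecast, key=lambda slot: slot.intensity)


def job_emissions_g(energy_kwh, slot):
    """Operational emissions of a job that draws energy_kwh in the given slot."""
    return energy_kwh * slot.intensity


# Hypothetical forecast, for illustration only.
forecast = [
    Slot("region-a", 2, 410.0),
    Slot("region-a", 14, 290.0),
    Slot("region-b", 2, 120.0),
    Slot("region-b", 14, 95.0),
]
best = greenest_slot(forecast)
worst = max(forecast, key=lambda slot: slot.intensity)
saved = job_emissions_g(500.0, worst) - job_emissions_g(500.0, best)
print(f"run in {best.region} at {best.start_hour}:00, avoiding {saved:.0f} gCO2eq")
```

The same energy is consumed either way; only the carbon intensity of the electricity changes, which is why this lever is separate from energy efficiency.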
“When you have a product that has to go out, the things that are optional fall away first. If we start including these as mandatory requirements, I think people will pay more attention to them,” he said.
At the same time, Gupta added that if consumers get smarter, looking at environmental impact scores and making choices based on that, that will also make a difference. If users are only willing to pay for software that is green, it will impact profits and force organizations to change the way they work.
Currently, the GSF is working on releasing the first official version of SCI, which Gupta says will be “a huge milestone.” It is expected to be unveiled at the UN Climate Change Conference in 2022. As Gupta shared, organizations that are part of the GSF are considering incorporating SCI into their measurement methods and the software systems they build.
The GSF also works on awareness raising, including holding summits around the world.
“We are embarking on this mission to raise awareness. It’s not something people really think about these days. So we’re making people aware that, ‘Hey, green software is a thing, and this is why you should care,’” concluded Gupta.