An academic conference produces greenhouse-gas (GHG) emissions in various ways. Emissions from flying are typically large (measured in tonnes of carbon or CO2) relative to emissions from other sources (measured in kilograms). The main environmental challenge, therefore, is to reduce emissions from flying.
An economy seat on a typical intercontinental return flight corresponds to roughly one tonne of carbon equivalent or 3.7 tonnes of CO2 (co2.myclimate.org). Expressed per person-kilometer, these emissions may be some four times higher for flying than for bus travel, and twenty times higher than for electric train (eea.europa.eu/transport). By comparison, air conditioning at a conference in a hot, humid location might use some tens of kW for some tens of hours, or roughly 1000 kWh. If the electricity is from fossil fuels, generating one kWh produces about one kg of CO2 by burning 400 g carbon. So a few hundred kg of carbon will be needed for air conditioning, which is less than the flying emissions of one participant. The CO2 emitted during production of beef served at the conference is even smaller; for every 10 kg beef served, roughly 100 kg carbon is burned. Emissions from a kg of plastic packaging are roughly 2 kg carbon.
Consider now the emissions from internet-based audiovisual (AV) communication. YouTube (currently the most popular platform) is emitting approximately 10 million tonnes of CO2 equivalent per year, while being watched for 1 billion hours per day (Preist et al., 2019). If on one day 27,000 tonnes CO2 equivalent are emitted while videos are watched for 1 billion hours, watching a video produces 27 g CO2 per hour, or about 7 g carbon. At a semi-virtual multi-hub conference (described below), 500 participants might watch YouTube for 2 hours each. If we add virtual presentations within the conference program and viewing by individuals after the conference, the total might be 3000 hours or 20 kg carbon, which is negligible compared to emissions from flying.
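The back-of-envelope estimates in the two preceding paragraphs can be reproduced with a few lines of Python. This is only a sketch that plugs in the rounded figures quoted in the text; the variable and function names are ours, and the conversion from CO2 to carbon uses the standard mass ratio 44/12 (about 3.7), which is an assumption not spelled out in the text.

```python
# Rough emission estimates using the rounded figures quoted in the text.
# All names are illustrative; the results are order-of-magnitude only.

CO2_PER_CARBON = 44.0 / 12.0  # mass of CO2 per unit mass of carbon (about 3.7)


def co2_to_carbon(co2_mass):
    """Convert a mass of CO2 to the mass of carbon it contains."""
    return co2_mass / CO2_PER_CARBON


# One intercontinental return flight in economy: about 3.7 t CO2.
flight_carbon_kg = co2_to_carbon(3.7) * 1000

# Air conditioning: roughly 1000 kWh at 400 g carbon burned per kWh (text's figure).
aircon_carbon_kg = 1000 * 0.4

# YouTube: 10 million tonnes CO2e per year, watched 1 billion hours per day.
youtube_co2_t_per_day = 10e6 / 365
co2_g_per_hour = youtube_co2_t_per_day * 1e6 / 1e9  # tonnes -> grams, per viewing hour
carbon_g_per_hour = co2_to_carbon(co2_g_per_hour)

# A semi-virtual conference with about 3000 viewing hours in total.
conference_carbon_kg = 3000 * carbon_g_per_hour / 1000

print(f"Flight, one participant: {flight_carbon_kg:.0f} kg carbon")
print(f"Air conditioning:        {aircon_carbon_kg:.0f} kg carbon")
print(f"YouTube viewing:         {co2_g_per_hour:.1f} g CO2 per hour")
print(f"Conference streaming:    {conference_carbon_kg:.0f} kg carbon")
```

With unrounded intermediate values, the streaming total comes out slightly above the 20 kg quoted in the text, which does not change the conclusion.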
Greenhouse-gas (GHG) emissions from aviation currently account for about 3% of global CO2 emissions. Due to other gases (e.g., NOx and CH4) and particles emitted, and their complex interactions with the atmosphere, the contribution of aviation to anthropogenic global warming is very roughly twice that of the CO2 alone. Global emissions from aviation are increasing by some 4% per year (5% more person-km less 1% gain in efficiency) with no end in sight. Alternatives such as electric motors and biofuels cannot scale up without causing other serious environmental problems. Flying may represent half the footprint of those relatively few people who currently fly (ICAO, 2016, 2019; Freeman et al., 2018; Graver et al., 2019; Owen et al., 2010).
In recent years, technological developments in internet-based AV communication have made it possible to significantly reduce GHG emissions produced by travel to and from academic conferences by incorporating virtual interactions. This strategy can have positive spinoffs for academic communication, collaboration, inclusion, cultural diversity, and dissemination. The available technology for high-quality AV transmission is reliable, inexpensive, and easy to use, provided organizers are well informed in advance and have qualified technical support (e.g. local students in audio engineering or related fields).
This practice paper outlines promising alternatives to the conventional conference format that involve live streaming, and provides guidance for successful implementation. Live streaming can be used in different ways to improve the accessibility and cultural diversity of an academic conference while at the same time reducing emissions. Since YouTube made live streams freely available in 2013, recorded videos have become an important means of academic communication and documentation; impressive examples within YouTube include “Cambridge University Press – Academic” and “Oxford Academic (Oxford University Press)”. Conferences can be split across several global locations, allowing face-to-face social interaction within hubs and virtual social interaction among hubs (“semi-virtual”).
Live streaming is not the only way to reduce emissions. Another option is to encourage colleagues to use surface transport wherever possible. For those who fly, emissions can be saved by avoiding more than one take-off (combining flying with train or bus) and informing travel agents of this constraint (Astudillo & AzariJafari, 2018).
A further option is to avoid flying altogether. A “Nearly Carbon Neutral (NCN) Conference Model” has been developed and implemented by Ken Hiltner, professor of environmental humanities at the University of California at Santa Barbara. A consideration of zero-location formats is beyond our present scope.
In this contribution, we aim to help colleagues in different disciplines and countries to significantly reduce the carbon footprint of their conferences by explaining how they might take advantage of appropriate internet-based communication technologies. We discuss how an approach of this kind can help academic conferences (and perhaps conferences in general) reduce their dependence on air travel while at the same time improving inclusiveness.
The recommendations in this paper are based on our experiences while organizing an innovative multi-location semi-virtual conference. The 15th International Conference on Music Perception and Cognition (ICMPC15), combined with the 10th triennial conference of the European Society for the Cognitive Sciences of Music (ESCOM10), happened simultaneously in four countries (Argentina, Canada, Austria, and Australia) and lasted from 23 to 28 July 2018. The research area of the conference was music cognition, but the semi-virtual idea could be realized in any academic discipline in sciences, engineering, humanities, or arts, whether that discipline be specialist, interdisciplinary, pure research, or practically oriented.
We aimed to halve emissions per participant. We achieved an even greater reduction, while at the same time making other improvements. By flexibly incorporating new communication technologies, we were able to increase both the number of participants and their cultural diversity. For many participants, the total cost of participation was significantly reduced. The cost reduction was greatest for colleagues from non-rich countries, for whom all three costs (registration, accommodation, and travel) were significantly reduced by comparison to flying to a conference in a rich country. At our Argentinian hub, South American participants paid less than half the registration fee charged in Canada, Australia, or Austria. They also paid less for accommodation, given the lower cost of living in Argentina, and traveled a shorter distance than to a single-location conference in Graz, Austria. It would have been relatively easy within our new conference format to magnify this effect by establishing additional hubs in low-GDP countries such as South Africa or India.
All talks at all hubs were live-streamed. At each hub, local and virtual programs ran in parallel in adjacent rooms; participants could easily change rooms after each talk, and in that way experience a mixture of live and remote presentations. The global program ran around the clock, whereas local programs were confined to usual working hours.
The potential of live streams for academic conferences is only starting to be realized. Every conference, large and small, in every discipline and in every country, can benefit. Live streams enable any talk to be shared with a larger audience. The information becomes more openly accessible. The geographic outreach and cultural diversity of presenters and audiences are increased (cf. Neustaedter et al., 2018). A talk can be given almost anywhere, opening up new possibilities for global academic exchange. Ultimately, increasing the diversity of academic documentation formats helps both academic and general audiences to understand the content.
The incorporation of live streams changes the conference experience. The added variety of content, presentation format, and interaction style makes the conference more interesting. Many academic colleagues will have experienced poor live streams at conferences, but that was often due to solvable technical problems. The AV quality of YouTube live streams is consistently and reliably high. Technical problems can be avoided by careful advance rehearsal.
While it is undeniably more fun and often more productive to communicate with people in person, face to face, it is also true that a regular live conference can be improved by adding electronic communication with remote participants. Colleagues can be included who could not have flown to a central conference location; reasons for not flying can be financial, family-related (caring commitments), physical (disability), and political (passport, visa). Talks can be viewed at any time, which is both an advantage (allowing any participant to watch any talk, even if the live talks happen simultaneously) and a disadvantage (breaking up the communal experience of watching and discussing a talk together).
In future, total emissions from flying to semi-virtual conferences could approach zero if conference hubs were located such that few or no colleagues needed to fly. Such conferences would also be more accessible, especially for colleagues in non-rich countries, who would no longer be faced with impossibly high costs for a long flight, registration, and accommodation in a rich country. Participants would be treated equally if hub sizes were about equal, all talks were given live at hubs, and participants were neither implicitly rewarded for flying nor penalized for not flying. In the following, we outline various technological/logistic format options.
Including some remote presentations is the simplest option for many conference organizers, requiring no additional equipment (hardware or software). Any conventional conference can be adapted to include remote presentations, which reduces the number of local participants, while increasing the total number. The event loses some of its elitist jet-set character and becomes more open: any researcher in the world can participate. Environmentally aware participants are no longer under pressure to fly in exchange for academic career benefits. The program becomes more interesting and diverse.
Remote presenters can set up one- or two-way connections using facilities and support at their local institutions. Setting up a one-way stream (e.g., YouTube) is more difficult for the remote presenter than for the conference organizers. Organizers need no equipment beyond that for regular teaching: a regular computer with cable internet connection and internet browser, and a data projector. Below, we will consider an alternative two-way option (e.g., Zoom) that conversely means more effort for the organizers but less for the presenters.
At the time of writing, YouTube live streams are the most promising option for one-way streaming. The platform is reliable and the cost to the user is zero. AV quality is consistently high; transmission quality is almost independent of internet stability. However, the high AV quality comes at the cost of a buffering delay of some 20–60 seconds. Other conference participants can watch these remote talks either in real time or later, which can partially compensate for the social disadvantage of remote presentation. Personal communication with remote presenters can happen at a different time in a quiet room (“global foyer”) or during special sessions (which we called “virtual socializing” at the Conference on Interdisciplinary Musicology, Graz, Austria, 26–28 September 2019).
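YouTube offers three access options: (1) public (anyone can find and watch the stream), (2) unlisted (anyone who has the URL can watch), and (3) private (only users invited by the owner can watch).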
Many academics who present papers at conferences do not want their talk to be published on the internet. They may be worried about making a mistake that cannot be corrected later, or publishing sensitive data. For that reason, options 2 and 3 may currently be preferable for academic conferences. Option 3 is realistic only for smaller conferences, or if participants have no individual access and the remote stream is shown only in the auditorium, so we focus here on option 2. Participants can be asked to sign an agreement to keep the links confidential (at our conference we indicated that this was voluntary, but all participants signed). Remote presenters can set up their stream (including starting time and URL) in advance. Organizers can then distribute the URLs to all participants in an email just before the conference.
Anyone with a mobile phone can send a YouTube live stream from a remote location. To ensure a good presentation, conference organizers can insist that speakers use a dedicated room at a university or similar institution with technical support (a person with whom your technician can communicate). Organizers can also insist on the presence of a small local expert audience (e.g. some PhD students) at the remote location. This creates a more comfortable situation for the speaker due to the natural visual and auditory feedback from the local audience during the presentation. If speakers cannot comply with these conditions, organizers can negotiate with them.
Questions and discussion following talks are important, and there are various options. One is to switch from one-way to two-way communication, as explained below. Another is to remain within one-way communication. For talks that are transmitted as YouTube live streams, audience members can type a question into a laptop or mobile phone. The remote speaker can then answer the question aloud on the live stream. An advantage for both presenter and audience is that all questions and comments are documented. Examples of public YouTube live streams that are currently running can be seen by visiting youtube.com/live. Next to the moving image is a chat stream. In an academic conference, audience members with a Google account can use the chat stream to comment or ask questions at any time. Another option is for the speaker to provide a mobile phone number. Audience members can send their questions by SMS, WhatsApp, Signal, or another messaging service and the speaker can answer aloud. The chair can even make a regular phone call to the speaker at the end of the talk and pass the phone around the auditorium as audience members ask their questions. The speaker answers on the live stream. This last option does not work well with a YouTube live stream due to the time delay. A further advantage of this approach is that conference organizers do not need to set up wireless microphones. This method can also be used as a backup in case microphones do not work or the sound quality on Skype or Zoom is poor.
Adding remote talks to a conference changes the conference budget. The total number of talks may go up at the same time as the number of local participants falls. This reduces both registration income and local costs, but not in the same proportion. Consider the following strategy:
While the option of remote presentation should normally be explained in the call for papers, it is also possible to introduce it for the first time when papers are accepted. Organizers can then inform participants that they can present remotely, give them a technical guideline, and ask for their decision by a deadline. On the assumption that all remote talks will be sent from dedicated rooms at other universities, the main target readers of the technical guidelines will be the technical support staff at the presenter’s university. The guideline might explain how to install Open Broadcaster Software (OBS) and use that to stream to YouTube. OBS juxtaposes the talking head of the speaker with the Powerpoint image (see appendix).
To avoid legal problems, presenters should be asked to avoid including sound files or images from the internet. YouTube can automatically find potential copyright infringements and block the stream. To our knowledge and in our experience, that is the only time YouTube is likely to fail.
More ambitious conference organizers may decide to live-stream every talk. For this, additional equipment is necessary. Conference rooms need regular teaching equipment plus wireless microphones, a sound mixer, and webcams for speaker and audience. All talks can then be live-streamed to YouTube as unlisted videos, so anyone in the world who knows the URL of a given stream can watch it either in real time or later.
URLs can be provided to participants in different ways. One option is a password-protected system (e.g. Moodle), with a separate page for each talk. These pages can contain abstracts and any other materials that presenters wish to upload such as proceedings, sound examples or other videos. They can also include discussion forums where participants could ask questions and speakers could answer them.
For a smaller conference, a password-protected system is not necessary. Instead, organizers can set up the YouTube streams in advance (to start automatically at set times) and send the URLs to all participants by email. As further protection against unauthorized use, it is possible to download all internet videos on a given date following the conference, delete them from the internet, and keep them in a private archive.
Remote presentations can be improved by making them (or the discussions that follow them) two-way. A two-way presentation is one in which people at two locations can talk back and forth with little or no perceptible time delay.
For this to work, both presenter and remote audience must be able to rely on a fast and stable internet connection. To check whether the connection is sufficient, the average internet speed per country can serve as a first indicator. According to the Wikipedia “List of countries by internet speed” (accessed on 4 March 2019), which cites Akamai Technologies (2017), Austria’s average speed is 14.1 Mbit/s, whereas Australia’s, for example, is 11.1 Mbit/s. Note that internet speeds vary considerably within countries and that available speeds are constantly increasing. Academic institutions in particular may be well above average. During ICMPC15/ESCOM10 in 2018, the speed at the University of Graz was typically 60 (download) to 90 (upload) Mbit/s. In mid-2019, one Gbit/s was available in most rooms. Online speed tests (e.g. speedtest.net) can be used to assess this parameter. Another important parameter for the viewing experience is the packet-loss rate (Hestnes et al., 2003). For these reasons, we recommend conducting test sessions in the exact rooms used for streaming.
Having tested various two-way AV communication options, such as WebEx, Google Hangout, Jitsi, Blue Jeans and Skype, we chose Zoom. We found Zoom relatively high in AV quality, easy to use, and flexible in our specific case, in which one person gives a talk using Powerpoint or another presentation program, and both the talking head and the image of the remote audience can be seen next to the slide. It is also possible to share the presenter’s computer sound directly. To increase flexibility, we purchased an inexpensive upgrade to Zoom Pro to enable talks from speakers at more than one location simultaneously. In a business context, Zoom was recommended by Fasciani et al. (2018).
For conference organizers, the process begins by sending an email with a Zoom link to the presenter to organize a short rehearsal. To conduct a comfortable question session with the local audience, use a webcam or inbuilt camera (so the remote presenter can see your audience), some wireless microphones, and a small audio interface or sound mixer with a built-in interface.
During the talk, the presentation will be shown in Zoom via screen-share. The camera picture will appear in the corner automatically. On the receiving end, the voice can be electronically amplified. After the talk, questions can be asked by the audience using a wireless microphone, which is connected to the computer via the audio interface. Consider the following points:
A multi-location semi-virtual conference incorporates all the above features. In addition to remote talks by individual presenters, the conference itself is split across several global locations, connected by one- and two-way streaming. ICMPC15/ESCOM10 took place simultaneously on four continents (Australia, South America, North America, and Europe) and the hub locations were Sydney, La Plata (Argentina), Montréal (Canada), and Graz (Austria). Many more hubs would have been possible using the same technology. On the basis of our practical experience organizing this conference, we recommend the following for future semi-virtual conferences.
To ensure a common academic standard, we carried out a thorough peer review procedure for submitted abstracts at one of the hubs (Graz) using ConfTool (for which we paid a small fee). Other hub organizers did not have to bother with abstract review, which made it easier for us to recruit hub organizers. Although Graz carried out the review procedure, Graz did not have special status on the conference program; to treat all participants equally, all hubs were nominally equal.
Consider a multi-hub conference, in which each hub is nominally equal, independently offers its own local program from accepted abstracts, and independently decides what talks or sessions from other hubs to include in its virtual program. In that case, there could be ten or more hubs at different global locations. The greater the number of hubs, the greater is the accessibility and cultural diversity of the conference, the smaller are the GHG emissions per participant, and the easier it is for individuals to organize hubs (because they are smaller). A large number of smaller hubs is possible and practical if each hub works relatively independently according to a central guideline and makes its own independent programming decisions. The organizer of each hub should have a relevant PhD (for an academic conference) and take advantage of the equipment available in regular local teaching rooms.
Each hub needs two or more presentation rooms: one or more sending rooms (also called live rooms or streaming-out rooms) for regular talks and one or more receiving rooms (virtual rooms, streaming-in rooms) for virtual talks. These terms refer primarily to one-way communication (YouTube); during two-way discussions (Zoom), sending rooms are also receiving, and receiving rooms are also sending. Since many institutions are now charging large amounts of rent for lecture theatres, organizers can save money by using regular teaching rooms instead, which becomes more feasible if the conference is split across a larger number of smaller hubs.
No matter where a hub is located in the world, it can communicate in real time with almost any other global location if the local daily program is divided into two half-days of four hours, separated by a relatively long lunchbreak or siesta. In the morning at each hub, conference participants will be communicating internationally toward the East; in the afternoon or evening, toward the West. For information about changing time differences due to daylight saving (summer time) see timeanddate.com.
Figure 1 is a sketch of one day of ICMPC15/ESCOM10. The top row (UTC or GMT) is time relative to UK time in winter (our conference was in the Northern summer, for which clocks had been put forward one hour). The red block in the middle of the figure represents a period of four hours during which three of the four hubs worked together. In local time, this block started in Montreal at 9 am, in La Plata at 10 am, and in Graz at 3 pm (15).
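The local start times of the shared block can be checked with a short Python sketch. The UTC offsets below are our assumptions for late July 2018 (daylight saving time in effect in Montreal and Graz, none in La Plata or Sydney), the 13:00 UTC start is inferred from the local times given above, and the date is simply one day within the conference. Sydney, which did not take part in this block, is included to show why.

```python
from datetime import datetime, timedelta, timezone

# Assumed UTC offsets (in hours) for the four hubs in late July 2018.
HUB_UTC_OFFSETS = {
    "Sydney": 10,    # AEST, no daylight saving in July
    "Graz": 2,       # CEST, daylight saving in effect
    "La Plata": -3,  # Argentina, no daylight saving
    "Montreal": -4,  # EDT, daylight saving in effect
}


def local_times(utc_moment):
    """Return the local wall-clock time at each hub for a given UTC moment."""
    return {
        hub: utc_moment.astimezone(timezone(timedelta(hours=offset)))
        for hub, offset in HUB_UTC_OFFSETS.items()
    }


# Start of the four-hour shared block (inferred: 9 am in Montreal = 13:00 UTC).
shared_block_start = datetime(2018, 7, 24, 13, 0, tzinfo=timezone.utc)

for hub, moment in local_times(shared_block_start).items():
    print(f"{hub:10s} {moment:%H:%M}")
```

Under these assumptions the block starts at 23:00 in Sydney, which is outside usual working hours there.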
Each hub of our conference had a regular local program that included both live and virtual talks and was printed on paper. As at a conventional conference, we delivered this document to a printer a week before the conference. All participants also had access to an electronic 24-hour global program with times in UTC (GMT). The global program gave an overview of all regular talks at all hubs. Each talk could be seen remotely at one or more other hubs (central organization was necessary to ensure that). During parallel sessions, participants could choose between local and virtual talks (“semi-virtual”). We recommended that participants switch back and forth to balance local and global content and experience a varied and dynamic program. Each hub offered one keynote, during which nothing else was scheduled at any hub.
For the future, we recommend that all events start on the hour or the half hour, around the clock. Every two hours, there should be a globally coordinated half-hour break (which can overlap with a poster session or pre-conference activity, see below). Breaks might start at 00:00, 02:00, 04:00 UTC and so on. The 90-minute blocks between these breaks would then begin at 00:30, 02:30, 04:30 UTC and follow standard patterns such as:
The daily program at any location would then be divided into two 3.5-hour blocks (90 minutes work + 30 minutes break + 90 minutes work), separated by a long lunch break (siesta). Depending on location, the local-time plan might be one of the following:
A shorter lunch break is possible, depending on where the hubs are located. Organizers can explore what is gained or lost at each hub when the break at a given hub is made longer or shorter.
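The recommended global grid (a half-hour break every two hours, followed by a 90-minute block) and the resulting 3.5-hour half-day can be generated mechanically. The following Python sketch is one way to do so; the function names and the example hub offset of UTC+2 are illustrative assumptions, not part of the original conference tooling.

```python
BREAK_MIN = 30                       # globally coordinated break
BLOCK_MIN = 90                       # work block between breaks
CYCLE_MIN = BREAK_MIN + BLOCK_MIN    # one two-hour cycle


def fmt(minutes):
    """Format minutes since midnight as HH:MM, wrapping around 24 hours."""
    minutes %= 24 * 60
    return f"{minutes // 60:02d}:{minutes % 60:02d}"


def global_grid():
    """Return (kind, start, end) tuples in UTC minutes for one 24-hour day."""
    grid = []
    for cycle_start in range(0, 24 * 60, CYCLE_MIN):
        grid.append(("break", cycle_start, cycle_start + BREAK_MIN))
        grid.append(("block", cycle_start + BREAK_MIN, cycle_start + CYCLE_MIN))
    return grid


def half_day(first_block_start):
    """Return one half-day: 90 min work + 30 min break + 90 min work."""
    if (first_block_start - BREAK_MIN) % CYCLE_MIN != 0:
        raise ValueError("a half-day must start at the beginning of a block")
    s = first_block_start
    return [
        ("block", s, s + BLOCK_MIN),
        ("break", s + BLOCK_MIN, s + CYCLE_MIN),
        ("block", s + CYCLE_MIN, s + CYCLE_MIN + BLOCK_MIN),
    ]


def to_local(utc_minutes, utc_offset_hours):
    """Shift a UTC time in minutes to local time for a given UTC offset."""
    return utc_minutes + utc_offset_hours * 60


# Example: a morning half-day starting at 06:30 UTC, seen from a hub at UTC+2.
for kind, start, end in half_day(6 * 60 + 30):
    print(kind, fmt(to_local(start, 2)), "to", fmt(to_local(end, 2)))
```

Organizers can vary the starting block and the hub offset to explore which half-days overlap with which other hubs.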
At ICMPC15/ESCOM10, we divided talks into long (30-minute slot) and short (20 minutes) based on reviewers’ grades. In retrospect that made the task of creating a thematically coherent program, with sessions focusing on given issues or areas, too difficult. We now recommend choosing a single basic time unit in advance. A 20-minute basic unit would mean breaks of only 20 minutes every two hours, which would be more stressful for participants. Shorter breaks also increase the chance of delays or technical problems (technicians have to check a list of points before the start of each session; see appendix). Therefore, 30-minute units are preferable.
The 30-minute “break” just before the start and after the end of each half-day can be used in many ways: concert, warm-up activity, demonstration, installation, discussion. Participants who attend such activities will then be seated in time for the start of the following 90-minute work period. That matters because delays must be avoided at this kind of conference (see timekeeping).
In the following, we explain in more detail the technological solutions that we adopted at ICMPC15/ESCOM10. They consist of the same components suggested for streaming and adding remote presentations above. Colleagues in different disciplines and countries can imitate us or adapt our approach for their purposes. We do not wish to specify an exact solution, because different conference traditions have different priorities. Moreover, the technology is constantly improving, so parts of this guideline will quickly become obsolete. The technology that we used was not exactly the same at each hub, depending on differences in available hardware. We will mention some of the differences in this document, but for simplicity we will focus on the similarities.
A 30-minute program slot at an academic conference is often divided into three parts:
For each of these, we used different software solutions, combining one-way and two-way streaming.
The easiest hardware solution for streaming out is a regular laptop with internet connection (preferably wired or over a hidden network) and a built-in camera and microphone. The laptop can be placed on the presenter’s podium (lectern, pulpit), as shown in Figure 2. Install OBS software on the laptop and use it to mix the talking head of the presenter with the Powerpoint screen. After that, send the mixed signal to YouTube or Zoom. Another way to encode the video signal before sending to YouTube is to use a hardware encoder. At our conference, that option was available at two of the hubs (Graz, Sydney).
The entire procedure can happen on a single computer, but in practice it works better on two, so the presenter and technician have separate screens, as shown in Figure 2. The presenter’s computer runs Powerpoint and the technician’s computer takes care of the streaming. By “presenter’s computer” we mean a laptop that organizers make available to presenters; presenters should not use their own laptops. The output of the presenter’s computer is connected to a hardware device (HDMI video grabber, cost: roughly $100), which appears as a webcam on the technician’s computer, as shown in Figure 2. A functional diagram of these connections is shown in Figure 3.
The two-way communication software Zoom can run either on the tech computer or on a third computer (cf. Audience Computer in Figure 2). Switching between talk and discussion involves changing the input to the stream and to the local projector. Two-way communication can be improved by including a video picture of the audience (giving a feeling of presence) as well as the picture of the presenter’s head and the Powerpoint slides. This solution can be realized either with the speaker’s computer also joining the Zoom meeting or with a camera operated by an assistant.
An external camera and microphone can be used for the presenter’s face and voice. The camera should be at the height of the presenter’s head. The microphone can be on a stand positioned closer and lower than the camera. Ideally, one may use a lavalier or headworn microphone. The camera and microphone should be placed such that the presenter is not distracted by them, so the situation feels as natural as possible. It is important to test this by having a colleague give a regular research presentation with a regular audience and talking to them about it later. Note that lighting is important to ensure that the presenter’s face is clearly visible. Test the lighting both during the day and in the evening.
A single external microphone can be used for audience members asking questions. Things will move faster in the question session if there are 2–3 wireless microphones: one for the chair and 1–2 for the audience (the speaker already having one). For that, an audio mixer is needed, and cheap ones are available. It is crucial that questions and answers can be heard at remote hubs. We asked all participants to hold the microphone close to the mouth, explaining that amplification levels need to be kept low to avoid acoustic feedback when two-way communication is involved.
Technical problems can delay the start of a talk. At ICMPC15/ESCOM10, we avoided delays by backing up every channel of communication in real time: Zoom acted as a backup for YouTube and vice-versa. In other words, we ran YouTube and Zoom simultaneously in all talks. The technician at the front of the room switched from Zoom to YouTube at the start of each talk and back from YouTube to Zoom at the start of the question period, without turning off either stream. We started the YouTube stream during the break before a session and maintained it throughout the session. Afterwards, each YouTube file contained 2–4 talks, which could later be separated by editing.
We also backed up videos for later viewing. If the encoder (hardware or software) permits a local recording during the stream, this can be used to make a backup. If two-way communication fails, Jitsi can be used as a backup. The advantage of Jitsi is that a meeting can be started immediately without logging in and with a customizable URL. If this URL is communicated, the other party can join immediately, again without logging in. But the AV quality of Jitsi is not always satisfactory.
In addition, a regular cellphone can be connected to a long cable. If the audio quality of two-way communication software is insufficient, participants can use this phone to ask questions. At our conference, this never became necessary, but it was reassuring to know that it was available. A telephone connection is very reliable, but the audio quality is worse than that of online communication services.
Presenters may be asked to do the following:
If the conference program includes parallel talks at different locations, it is important to ensure that they start exactly on time. We asked session chairs to start each talk within ten seconds of the advertised time. We informed presenters in advance of the importance of exact timing and clarified that in an internationally coordinated conference it is not possible to continue speaking after the programmed stopping time.
To achieve this, we created a timing tool. The TACT website was developed by Hannes Karlbauer in collaboration with the first author. TACT stands for Tonal Academic Conference Timekeeper. The abbreviation reminds us that it is necessary to tactfully remind presenters when their time is up.
TACT is an internet page that shows the time in UTC as well as the local time at each conference hub. Toward the end of each conference timeslot, it plays music to ensure that the discussion following the talk stops on time.
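The core of such a timekeeper can be illustrated with a minimal Python sketch. It is not the TACT implementation: the function names, the fixed UTC offsets (which hold for the four hubs in July and are hard-coded only to keep the example dependency-free), and the 60-second music lead time are all assumptions made for illustration.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical fixed UTC offsets (in hours) for the four hubs as they apply
# in July. A real tool would use a time zone database so that daylight
# saving changes are handled automatically.
HUB_UTC_OFFSETS = {
    "Graz": 2,       # CEST
    "Montreal": -4,  # EDT
    "La Plata": -3,  # ART
    "Sydney": 10,    # AEST
}

# Assumed lead time: how long before the end of a timeslot the music starts.
MUSIC_LEAD = timedelta(seconds=60)


def hub_times(utc_now):
    """Map each hub name to its local time for the given UTC instant."""
    return {
        hub: utc_now.astimezone(timezone(timedelta(hours=offset)))
        for hub, offset in HUB_UTC_OFFSETS.items()
    }


def music_should_play(utc_now, slot_end):
    """True during the final MUSIC_LEAD of a timeslot, False otherwise."""
    return slot_end - MUSIC_LEAD <= utc_now < slot_end


def render_clock(utc_now):
    """Build the text a clock page might show: UTC first, then each hub."""
    lines = [f"{'UTC':<9} {utc_now:%H:%M:%S}"]
    for hub, local in hub_times(utc_now).items():
        lines.append(f"{hub:<9} {local:%H:%M:%S}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(render_clock(datetime.now(timezone.utc)))
```

Because every hub derives its display from the same UTC instant, all rooms see the same end-of-slot signal regardless of local clock settings.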
Throughout the conference, TACT ran on a separate computer in each room with an external loudspeaker. For this, we had no additional equipment costs because old laptops and loudspeakers (cheap PC monitors) were available from our IT department or privately.
TACT did not inform the presenter about the number of minutes to go before the end of the talk and the start of the discussion. Instead, student assistants held up signs with “5 minutes to go”, “3…”, “1…”, and “Time’s up!”
We had one technical assistant and one non-technical assistant (two each for plenary keynotes) in each sending room, in addition to the chair and the presenter. The technical assistants were studying audio engineering or similar and were coordinated by a head technician (the second author). Remuneration was by contract, course credits, or both.
The technical assistants trained for about two days before the conference began. Training included getting to know the setup and procedures, rehearsing communications with conference presenters, and showing non-technical assistants the basics. They were given access rights (passwords) and technical guidelines.
During the talks, the technical assistant sat at the front of the room next to the presenter and chair. One or more non-technical assistants were in the audience and passed around the microphone(s) during the discussion. See the appendix for a checklist for setting up and monitoring a talk/discussion.
Technicians at different locations need to be able to communicate easily, independently of the conference streaming system. They need to say things like: “Everything fine from your end?”, “Please turn up your microphone!”, “Are you ready for a question from your hub?” or “We can’t hear you!” Speaking quietly on the phone can be quicker than writing when critical situations occur.
One option is to use an instant messenger on the technicians’ private phones. While this might seem obvious and easy, one has to make sure beforehand that everyone knows which number to write to for which room at which time. Also, everyone needs to agree on one application. This can be difficult since some find WhatsApp problematic for security reasons (or cannot use it due to older operating systems) and in some countries open-source apps like Signal are not supported. If there is one phone reserved for every room, technicians know which number to contact. At our conference, a lot of technical communication was done in the chat window of the two-way software, but this is not ideal, since during the discussion it may be visible to participants. We gave a lot of thought to setting up these various channels of communication in advance. In retrospect, that was one of the main reasons we managed to avoid technical delays.
Participants at all four hubs of ICMPC15/ESCOM10 could electronically meet and virtually socialize with colleagues from other hubs, either spontaneously or at planned meetings. Breaks were timed to make this possible at different locations. Each hub had a quiet room called “global foyer” near the coffee area. It gave remote presenters a feeling of participation and local participants the opportunity to communicate easily and informally with them.
Each global foyer had a number of computers, each with a (built-in) webcam, a headphone amplifier, and a USB microphone. To avoid background noise, there were acoustically absorbent walls between the computers. Often, up to three people could sit at each computer and talk to up to three people at the remote location. People spoke into one central microphone but wore separate headsets.
A small 4-channel headphone amp and a USB microphone cost less than $60. The microphone had a cardioid pickup pattern. Cheap headphones were provided and participants could also use their own. In terms of software, one can use Skype or similar services. We especially recommend solutions such as Jitsi, which run in the browser and make use of the WebRTC API. Here, no user accounts are required. Every computer was constantly connected to a computer at another hub, so anyone could walk up to one of them, sit down, and start talking to someone, as in a typical conference coffee break.
At a single-location conference with remote presentations, the global foyer is primarily for remote presenters, whereas at a multi-location conference, all presenters may take advantage of it. We recommend setting up a 15- or 30-minute private communication timeslot for each presenter in advance. Discussion timeslots can be scheduled in a separate program, enabling speakers to communicate privately with interested audience members, either alone or in groups. A student assistant might get the task of organizing advisory meetings between senior and junior remote and local participants. The global foyer also gives local participants the chance to communicate with anyone anywhere at any time, e.g. during breaks.
Given the resources and time needed to set up livestreaming, and the constant risk of technical problems that could delay the conference program, it is interesting to ask whether the overall costs of such a project are offset by the overall benefits. To answer this question, we first need to evaluate the benefits, which in a first-order estimate can be done in US dollars. That includes benefits to future generations (due to CO2 emissions reductions), new conference participants (who would not otherwise have been able to participate), regular participants (who get access to a broader range of colleagues with whom to interact and research projects from which to learn), and participants with disabilities or caring commitments (who, like all participants, gain access to the entire conference program).
However calculated, these benefits far exceed the costs of purchasing, setting up, testing, and running the necessary equipment. For ICMPC15/ESCOM10, we invested about €5800 in wages for a head technician (6 months, 10 hours per week) and a few hundred Euros each for four additional technical assistants at the main hub in Graz (Master’s students who also received course credits; €1 ≈ $1.1). We also spent some €200 on a Zoom upgrade, €1100 on wireless microphone rental, and €400 on other electronic equipment. Adding these together, the total technical costs were about €9,000 (8% of our total budget of €110,000). Other hubs spent less: their head technicians worked for only a few weeks and they needed only a few hundred Euros for equipment hire. The head technician was necessary in Graz to test different technical options during the months preceding the conference; for future conferences based on our model, this cost will be reduced. The other hubs did not need to carry out such tests, but instead followed our guidelines; our head technician conducted tests with each of them separately, communicating mainly by Zoom, Skype, and WhatsApp. Expenses that were the same as for a conventional conference included wages for a conference co-organizer (the third author, who was responsible for the peer-review procedure and many other organizational tasks) and €1200 for conference organization software (ConfTool).
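The arithmetic behind these figures can be checked with a short Python sketch. The itemized amounts, the reported total, the budget, and the exchange rate are taken from the paragraph above; the per-assistant figure is not reported in the text and is only inferred here as the remainder of the reported total, so it should be read as an approximation. The function and variable names are illustrative.

```python
# Figures in euros as reported in the text above.
ITEMIZED_COSTS = {
    "head technician wages": 5800,
    "Zoom upgrade": 200,
    "wireless microphone rental": 1100,
    "other electronic equipment": 400,
}
REPORTED_TECHNICAL_TOTAL = 9000  # "about" 9000 in the text
TOTAL_BUDGET = 110_000
NUM_ASSISTANTS = 4
EUR_TO_USD = 1.1  # approximate rate given in the text


def cost_summary():
    """Cross-check the reported total against the itemized amounts."""
    itemized = sum(ITEMIZED_COSTS.values())
    # Assumption: whatever is not itemized went to the four assistants.
    implied_assistant_wages = REPORTED_TECHNICAL_TOTAL - itemized
    return {
        "itemized_eur": itemized,
        "implied_assistant_wages_eur": implied_assistant_wages,
        "implied_per_assistant_eur": implied_assistant_wages / NUM_ASSISTANTS,
        "share_of_budget_pct": round(
            100 * REPORTED_TECHNICAL_TOTAL / TOTAL_BUDGET, 1
        ),
        "technical_total_usd": round(REPORTED_TECHNICAL_TOTAL * EUR_TO_USD),
    }


if __name__ == "__main__":
    for key, value in cost_summary().items():
        print(f"{key}: {value}")
```

Under these assumptions the remainder works out to a few hundred euros per assistant, which is consistent with the wording of the text.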
Academics are experts in the art of evaluation (peer review, teaching evaluation). It is understandable to want to evaluate a new conference format thoroughly in advance of implementation. At ICMPC15/ESCOM10, we realized that a new conference format cannot be evaluated without experiencing it and getting used to it first. We noticed that participants were changing their minds about our approach during the event, and some continued to change their minds about it in the following weeks and months.
We also observed that opinions about such an issue can be very diverse. Our impression is that student assistants (hospitality and technology) had a more positive opinion, on average, than regular conference participants. In addition, younger participants had a more positive opinion than older participants, perhaps because they were more open to the technology (similar to social media) or because climate change will affect them more than it affects older people. For these reasons, it may not be possible to speak of an “average response”.
On the second-last day of the conference, we asked all participants at all hubs to access an internet-based evaluation form. Of about 600 participants, 199 took part. We presented summary results in the closing session.
The survey asked for each participant’s active or passive role, physical location, and overall rating of the conference experience. Of the 199 survey participants, 84% were active (59% speakers; 25% poster presenters). The breakdown by physical location was 55% in Graz, 29% Montreal, 7% La Plata, 6% Sydney, and 3% no hub (remote participants).
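These proportions can be turned into a response rate and approximate head counts with a few lines of Python. The percentages and totals are those reported above; the head counts are estimates only, because the reported percentages are rounded and the total of 600 participants is approximate. The names used are illustrative.

```python
RESPONDENTS = 199
APPROX_PARTICIPANTS = 600  # "about 600" in the text

# Percentages of survey respondents as reported in the text.
LOCATION_PCT = {
    "Graz": 55,
    "Montreal": 29,
    "La Plata": 7,
    "Sydney": 6,
    "no hub": 3,
}
ROLE_PCT = {"speakers": 59, "poster presenters": 25}


def response_rate_pct():
    """Survey response rate as a whole-number percentage."""
    return round(100 * RESPONDENTS / APPROX_PARTICIPANTS)


def approx_counts(percentages, n=RESPONDENTS):
    """Estimate head counts from rounded percentages (nearest whole person)."""
    return {label: round(pct * n / 100) for label, pct in percentages.items()}


if __name__ == "__main__":
    print("response rate (%):", response_rate_pct())
    print("by location:", approx_counts(LOCATION_PCT))
    print("by role:", approx_counts(ROLE_PCT))
```

About one in three participants responded, which is worth keeping in mind when interpreting the ratings that follow.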
We then asked participants to rate the semi-virtual format on an 11-point scale from very bad to very good. Of those who avoided the middle point of the scale, 61% responded positively. Satisfaction was highest in La Plata, where most participants could not have afforded to travel to a conventional conference in Graz, followed by Sydney and Graz. Satisfaction was lowest in Montreal, where many participants would have preferred to fly to Graz. The number of participants in Graz was about twice that in Montreal, but the two hubs otherwise had equal status on the conference program. Differences in satisfaction across hubs might be avoided in future by making hubs more equal in size. For this purpose, we could have created an additional hub elsewhere in Europe, such as in the UK. The hubs in Austria, the UK, and Canada would then have been more similar in size.
No statistics were recorded about the typical size of audiences for live versus virtual talks. We recommend collecting such data in future. Anecdotal evidence suggests that audiences were bigger for live talks, but only at larger hubs. Two factors may explain this. First, live talks work better for the audience than virtual talks (but with improvements in technology and increasing familiarity with virtual communication, this difference will gradually disappear). Second, at the larger hubs there were relatively many live talks of high academic quality to choose among. At smaller hubs, the difference was smaller due to the higher academic quality of remote talks relative to live talks.
At ICMPC15/ESCOM10, every talk was live-streamed and seen at one other hub, either in real time or after a delay due to international time differences. The discussion following each talk always involved two hubs. Real-time discussions were acoustic, and delayed discussions were written. In addition, all talks were available to all participants to watch and comment on using their laptops, tablets, and mobile phones. Because we carefully rehearsed procedures with colleagues at all hubs, no talk was canceled or delayed for technical reasons.
We consider the benefits of the semi-virtual approach to outweigh the disadvantages by a considerable margin. The main disadvantage is lack of face-to-face contact with colleagues from distant countries. This is more than counterbalanced by allowing new colleagues to participate who would not otherwise have been able to afford it or to travel to the conference (increasing accessibility, equity, and cultural diversity) and reducing climate-damaging GHG emissions.
A semi-virtual, multiple-location conference, with hubs on different continents around the globe, can for the first time reasonably be called “global”. The new format makes it possible to aspire to and approach a global balance among representative geographic areas. An equivalent one-location conference typically attracts many more participants from the continent where it is located than from other continents.
A semi-virtual conference format can be used to promote socioeconomic and cultural diversity among the participants by including one or more hubs in lower-GDP or culturally contrasting countries. This strategy ultimately impacts positively on the relevance and quality of the academic content. Colleagues from non-rich countries will initially have less experience with such events, and younger colleagues may not enjoy the same level of academic supervision from older colleagues. But if the conference is repeated periodically, academic levels in the non-rich countries will increase, positively impacting the academic quality of the entire event, which in turn will positively impact the discipline. More generally, this strategy will make a new positive contribution to global development.
At our conference, GHG emissions per participant were reduced by 60–70% relative to an equivalent single-location conference. We estimated this by asking participants at registration how they traveled to the conference. For practical reasons at the different hubs, these data were not always collected; where data were missing, we estimated the carbon footprint of each participant by making reasonable assumptions about typical travel patterns. In future, emissions and international time-difference problems could be further reduced by adding more hubs.
The semi-virtual format assumes a network hub structure (not a hierarchy) and places no limit on the number of hubs, just as the internet places no limit on the number of servers in the world. At a semi-virtual conference with many hubs (say, 10–20), each hub would propose its own live program to all the others, after which each hub would choose its virtual program from the offerings of the other hubs. All participants would still have the chance to see any talk virtually, either in real time or later, which is not possible at a conventional conference with parallel sessions.
Colleagues considering a low-GHG conference of this kind may be wary of changing an existing, successful tradition. We were similarly cautious. While preparing for ICMPC15/ESCOM10, we tried out several different technological and logistic solutions. In advance of the conference, we were unsure how our technological solutions would be received by participants. For either of these reasons, we might have given up and returned to a conventional format. But doing so would have delayed a long-overdue reform.
During and after the conference, we noticed that skeptical colleagues became less so. That would be consistent with the psychological finding that acceptance increases with familiarity (cf. Kang & Gretzel, 2012). Like preference for music, preference for an electronic conference format may depend primarily on a combination of complexity (optimum complexity being preferred, neither too simple nor too complex) and familiarity (North & Hargreaves, 1995). “Individuals who have greater familiarity with technology in general, those with higher educational levels, and those who have greater prior experiences are likely to have more positive beliefs about new technologies” (Agarwal & Prasad, 1999, p. 385).
One- and two-way AV communication is not the only way to reduce the CO2 emissions of conferences and improve accessibility for distant, disabled, financially disadvantaged, or otherwise less mobile participants. Another promising approach involves telepresence robots (Neustaedter et al., 2018). This idea was beyond the scope of our 2018 conference due to the additional cost and technological complexity in hardware and software. It is nonetheless a promising avenue to explore. Our multi-location format may in the future be combined with telepresence robots or other emerging technologies to more closely achieve our academic, personal, and environmental goals.
Our experience with ICMPC15/ESCOM10 allows us to make the following predictions for the coming decades. First, low-GHG conferences will become the norm rather than the exception. As the global climate crisis escalates, academics will increasingly reject environmentally damaging, elitist, single-location academic conferences. Instead, they will take the opportunity offered by modern internet communication technology to open up their research traditions to colleagues from non-rich countries and in that way contribute to international development efforts such as the Sustainable Development Goals of the United Nations. Second, live streams and videos will increasingly be regarded as normal forms of academic dissemination, alongside more traditional conference proceedings, peer-reviewed journal articles, book chapters, monographs, and popular media reports. Each kind of dissemination will be seen as having its own special uses and functions. Sometimes it is easier to watch a good video than to read an academic paper. Individual colleagues can only benefit from this additional possibility, which they can either use or ignore as they see fit.
Other conference organizers with similar ambitions can copy our approach or adapt it for their purposes. We will be glad to participate in discussions about adapted or alternative formats.
The following guideline was written in 2018. There will be many detailed changes in coming years, but they are likely to be self-explanatory. The basic principles should remain stable.
Note that YouTube alone will not create a stream that can show both the speaker and the speaker’s screen. The user needs to encode the data and send it to the YouTube server—either with hardware or software such as OBS (see below).
A live stream must be created in OBS before starting the YouTube stream.
Here is what the technician in each live room did before the start of each session at our conference at the Graz hub, with a hardware encoder. Every conference will have a different setup and a different list.
After the talk – before the discussion:
We thank the hub organizers and their assistants for making ICMPC15/ESCOM10 possible: Christine Beckett and Eldad Tsabary (Concordia University, Montréal, Canada), Isabel Cecilia Martínez (Universidad Nacional de La Plata, Argentina), and Emery Schubert (University of New South Wales, Sydney, Australia). The conference was supported financially or institutionally by SEMPRE (Society for Education, Music, and Psychology Research), Land Steiermark (Province of Styria, Austria), University of Graz, UNLP (Universidad Nacional de La Plata), ESCOM (European Society for the Cognitive Sciences of Music), SMPC (Society for Music Perception and Cognition), AMPS (Australian Music Psychology Society), and Österreichische Forschungsgemeinschaft (Austrian Research Association). For the assessment of the conference’s carbon footprint, we thank Jakob Mayer, Wegener Centre for Climate and Global Change, Graz.
The authors have no competing interests to declare.
Agarwal, R and Prasad, J. 1999. Are individual differences germane to the acceptance of new information technologies? Decision Sciences 30(2): 361–391. DOI: 10.1111/j.1540-5915.1999.tb01614.x
Astudillo, MF and AzariJafari, H. 2018. Estimating the global warming emissions of the LCAXVII conference: Connecting flights matter. International Journal of Life Cycle Assessment 23(7): 1512–1516. DOI: 10.1007/s11367-018-1479-z
Fasciani, M, Eagle, T and Preset, A. 2018. Magic quadrant for meeting solutions. gartner.com
Freeman, S, Lee, DS, Lim, LL, Skowron, A and De León, RR. 2018. Trading off aircraft fuel burn and NOx emissions for optimal climate policy. Environmental Science & Technology 52(5): 2498–2505. DOI: 10.1021/acs.est.7b05719
Graver, B, Zhang, K and Rutherford, D. 2019. CO2 emissions from commercial aviation, 2018. Working paper 2019–16, International Council on Clean Transportation. www.theicct.org.
Hestnes, B, Brooks, P, Heiestad, S, Ulseth, T and Aaby, C. 2003. Quality of experience in real-time person-person communication–user based QoS expressed in technical network QoS terms. Proceedings of the 19th International Symposium on Human Factors in Telecommunication, 3–10. hft.org/HFT_03.htm.
ICAO. 2016. On board: A sustainable future. ICAO environmental report 2016: Aviation and climate change. Montreal, Canada: International Civil Aviation Organization (ICAO). icao.int.
ICAO. 2019. ICAO global environmental trends – Present and future aircraft noise and emissions. Assembly – 40th Session; Executive Committee; Agenda Item 15: Environmental protection – General provisions, aircraft noise and local air quality – policy and standardization. https://www.icao.int/Meetings/A40/Documents/WP/wp_054en.pdf.
Interagency Working Group on Social Cost of Greenhouse Gases. 2016. Technical Support Document: Technical Update of the Social Cost of Carbon for Regulatory Impact Analysis Under Executive Order 12866. United States Government. epa.gov.
Kang, M and Gretzel, U. 2012. Perceptions of museum podcast tours: Effects of consumer innovativeness, Internet familiarity and podcasting affinity on performance expectancies. Tourism Management Perspectives 4: 155–163. DOI: 10.1016/j.tmp.2012.08.007
Neustaedter, C, Singhal, S, Pan, R, Heshmat, Y, Forghani, A and Tang, J. 2018. From being there to watching: Shared and dedicated telepresence robot usage at academic conferences. ACM Transactions on Computer-Human Interaction (TOCHI) 25(6): 33. DOI: 10.1145/3243213
North, AC and Hargreaves, DJ. 1995. Subjective complexity, familiarity, and liking for popular music. Psychomusicology 14(1–2): 77. DOI: 10.1037/h0094090
Owen, B, Lee, DS and Lim, L. 2010. Flying into the future: Aviation emissions scenarios to 2050. Environmental Science & Technology 44: 2255–2260. DOI: 10.1021/es902530z
Preist, C, Schien, D and Shabajee, P. 2019. Evaluating sustainable interaction design of digital services: The case of YouTube. Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems, 397. chi2019.acm.org/for-attendees/proceedings/. DOI: 10.1145/3290605.3300627
Ricke, K, Drouet, L, Caldeira, K and Tavoni, M. 2018. Country-level social cost of carbon. Nature Climate Change 8(10): 895. DOI: 10.1038/s41558-018-0282-y