How DWP managed a surge in demand for Universal Credit during COVID-19
Derek du Preez, Fri, 11/13/2020 - 01:08
Summary:
The reformed working-age benefits system, Universal Credit, has been rolling out for a number of years - but it faced an unprecedented test when COVID-19 hit.
(Image sourced via DWP)
Universal Credit (UC) was designed as a means-tested benefits system to replace six existing benefits schemes for those of working age. The idea behind the Department for Work and Pensions (DWP) policy - which has been controversial - was to simplify support for citizens and encourage those on benefits to get back to work.
Back in 2010-2012 Universal Credit was consistently in the news for its poor approach to technology development and implementation. The programme was initially developed by a handful of suppliers - including Accenture, IBM, HP and BT - but was later ‘reset’ after it was found that the technology was not in tune with the policy requirements.
DWP then started developing a digital version in-house, known as the ‘full service’, led by a new ‘DWP Digital’ team that aimed to focus on user-centred design, agile methodologies and multi-disciplinary development. DWP Digital has since written 50 million lines of code, works on 10,000 changes to IT systems per year, maintains 1,000 applications and exchanges 10 million data records per day. The team has been leading the rollout of Universal Credit nationally for the past few years, and once at full scale it will be serving 7 million people and paying out £67 billion annually.
We got the chance to speak with Tom Padgham, Deputy Director of Engineering at DWP, and Patrick Downey, Head of Platforms at DWP, this week as part of MongoDB’s .Live Northern European event. DWP Digital is using MongoDB as its core database for Universal Credit. The comments from Padgham and Downey in this story are taken both from their presentation at the event and from our conversation together.
As noted above, DWP Digital started on the Universal Credit project back in 2013 and the team was determined to do things differently. It adopted agile ways of working in collaboration with the Government Digital Service, took a user-centred approach to design, focused on building minimum viable products, and sought to iterate quickly. By 2015 there was a controlled rollout of the benefits system to job centres and in 2017, during the larger national rollout, the team migrated from a government approved datacenter to AWS. By 2018 the national rollout was complete, albeit just for new applicants or those on previous benefits who had had a change in circumstance and were then migrated to Universal Credit. There are still 6 million people on previous benefits to be migrated.
Universal Credit is based on a variety of “mature” technologies, according to Padgham, including Java microservices, MongoDB, Kafka, Jenkins, Git, Bazel, Serenity, Gatling, AWS, Terraform, Vault, Ansible and Puppet.
Unexpected demand
However, by mid to late March of this year, DWP was facing an unexpected and unprecedented test of its systems and infrastructure, when the realities of the COVID-19 pandemic hit the UK. Padgham said:
Were DWP expecting a pandemic? Well, no, like most other people we weren’t. We do have good predictions about future traffic from our business under normal circumstances. This helps us plan ahead and make sure our service is ready for that traffic. A few years ago we were asked to test how Universal Credit would handle a massive spike due to a news story. We didn’t take this too seriously because of the type of service Universal Credit is. We do however do proactive performance testing based on those business predictions and we test six months ahead of ourselves. This gives us time to resolve any performance issues that might arise. This type of testing and the way we have built our platform has stood us in good stead for the events of 2020.
Downey added that the infrastructure itself was designed to help cope with demand, particularly through the use of AWS, microservices and MongoDB. He said:
The very fact that we have got the number of microservices that we have, gives us a lot of levers about where we can increase resources as necessary. We don’t need to make everything bigger, we can fairly gradually target which pieces of the architecture need a bit more help.
Downey said that Universal Credit currently uses 8 MongoDB clusters, most of which consist of 5 nodes, spread across 3 AWS availability zones. The busiest of these clusters currently handles around 15,000 requests per second. DWP Digital stores 8.5 billion unique data objects in each of these clusters, with 110 TB of uncompressed MongoDB data - which gives you an idea of the scale of the operation.
When COVID-19 hit, particularly after the Prime Minister’s lockdown announcement and the subsequent financial support packages that were unveiled, the Universal Credit system and the DWP team faced a nail-biting test of whether they could hold up against the huge spikes in demand from claimants. Padgham said:
We had rolled out really carefully in the early days to make sure that we landed the service well. And then from late 2018 through to 2020 we were on a fairly standard rollout up to around 2 million people. This was all looking pretty normal in early March 2020 and our six month performance testing was looking pretty good. At that time we were experiencing around 4% average month on month growth and we would typically have around 100,000 successful claims to UC every fortnight. COVID-19 really took hold in the UK in the middle of March 2020. That kind of massive increase in traffic and claims was totally unexpected and within a few months it took us to over 5 million active claims. We were experiencing a 40% increase in people on Universal Credit up to April 2020. A tenfold increase. And in one particular fortnight we have around 950,000 claims.
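To put Padgham's figures side by side, the arithmetic can be sketched in a few lines of Python. This is only an illustration: the four input numbers come from his quote, while reading the 40% increase as directly comparable with the usual 4% monthly growth (which is what the "tenfold" remark suggests) and the compound-growth comparison at the end are assumptions, not anything DWP stated.

```python
import math

# Figures quoted by Padgham in the article.
NORMAL_CLAIMS_PER_FORTNIGHT = 100_000
PEAK_CLAIMS_PER_FORTNIGHT = 950_000
NORMAL_MONTHLY_GROWTH_PCT = 4
SURGE_GROWTH_PCT = 40

# Peak fortnight relative to a typical fortnight.
claims_multiplier = PEAK_CLAIMS_PER_FORTNIGHT / NORMAL_CLAIMS_PER_FORTNIGHT

# Surge growth relative to the usual monthly growth (the "tenfold" remark).
growth_multiplier = SURGE_GROWTH_PCT / NORMAL_MONTHLY_GROWTH_PCT

# Assumption: how many months of steady, compounding 4% growth
# would be needed to add 40% to the caseload.
months_of_normal_growth = math.log(1 + SURGE_GROWTH_PCT / 100) / math.log(
    1 + NORMAL_MONTHLY_GROWTH_PCT / 100
)

print(f"Peak fortnight: {claims_multiplier:.1f}x a typical fortnight")
print(f"Growth rate: {growth_multiplier:.0f}x the usual monthly rate")
print(f"Equivalent to about {months_of_normal_growth:.1f} months of normal growth")
```

Under those assumptions, both the claim volume and the growth rate land at roughly ten times normal, which is why six-month-ahead performance testing alone could not guarantee the headroom.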
Downey explained that at certain peaks there were 2.5 times the normal number of claims per second and that this level of traffic didn’t die down until the beginning of May. He added:
We were sitting on the edge of our seat about how the platform will perform and whether it will stand up to demand. We were actually in a pretty good state because of the six month ahead performance testing that we do. So we knew that our site could withstand much more traffic than it currently receives. But we weren’t so sure that it would handle this much extra traffic. How did we respond from an operations perspective? We increased our application capacity and added more web servers and more application servers. We increased our database capacity - in the first and second week we increased our compute and memory for two of our Mongo clusters. And over the coming months we would move one of our large Mongo databases into a cluster of its own in order to provide us with a bit more capacity. We also increased the oplog size for the busiest cluster so that we would consistently have a day or two of oplog, so that responding to a recovery event wouldn’t be so bad. We benefited from using AWS and being able to expand our capacity on demand. We also benefited from a lot of the government announcements happening late in the day when our usual daytime traffic had fallen away. This actually meant that more capacity was available to deal with access requests. Fortunately we never had to taper or control the amount of traffic coming in, so that was good for us.
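Downey's remark about keeping "a day or two of oplog" rests on a simple relationship: MongoDB's oplog is a fixed-size capped collection, so the window of history it retains is roughly its size divided by the rate at which writes fill it. The sketch below illustrates that relationship in Python. The oplog size and write rate are hypothetical (the article gives neither), and applying the 2.5 times traffic peak to the oplog write rate is an assumption made purely for illustration.

```python
def oplog_window_hours(oplog_size_gb: float, oplog_gb_per_hour: float) -> float:
    """Approximate hours of history a fixed-size oplog retains."""
    if oplog_gb_per_hour <= 0:
        raise ValueError("oplog_gb_per_hour must be positive")
    return oplog_size_gb / oplog_gb_per_hour


def required_oplog_gb(target_window_hours: float, oplog_gb_per_hour: float) -> float:
    """Oplog size needed to retain the target window at a given write rate."""
    if target_window_hours <= 0 or oplog_gb_per_hour <= 0:
        raise ValueError("inputs must be positive")
    return target_window_hours * oplog_gb_per_hour


# Hypothetical figures, not DWP's real numbers.
OPLOG_SIZE_GB = 50.0
NORMAL_RATE_GB_PER_HOUR = 1.0
SURGE_MULTIPLIER = 2.5  # assumption: writes scale with the quoted traffic peak

normal_window = oplog_window_hours(OPLOG_SIZE_GB, NORMAL_RATE_GB_PER_HOUR)
surge_window = oplog_window_hours(
    OPLOG_SIZE_GB, NORMAL_RATE_GB_PER_HOUR * SURGE_MULTIPLIER
)
size_for_two_days = required_oplog_gb(
    48, NORMAL_RATE_GB_PER_HOUR * SURGE_MULTIPLIER
)

print(f"Window at normal load: {normal_window:.0f} hours")
print(f"Window during surge: {surge_window:.0f} hours")
print(f"Oplog size for a 48-hour window during surge: {size_for_two_days:.0f} GB")
```

With these made-up numbers, a surge shrinks the window from about two days to less than one, so the oplog has to grow in proportion to the write rate to keep the same recovery margin. MongoDB provides the `replSetResizeOplog` admin command for resizing the oplog on a running replica set member, although the article does not say which method DWP used.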
For an idea of how dramatic the spikes in demand for Universal Credit were during this period, take a look at the below graph:
(Image sourced via DWP Digital)
Typically DWP Digital does major releases every fortnight and minor releases in the weeks in between - with the ability to release urgent changes on demand with zero downtime. To give you a further idea of the impact of COVID-19, in January and February the team released 6 urgent changes. In March and April it released 76 urgent changes. Downey said that this is testament to the truly agile nature of Universal Credit, in that it could adapt and evolve as the situation changed.
Future plans
Padgham and Downey state that the experience of COVID-19 and the fact that the Universal Credit systems held up under pressure is evidence that agile delivery and the approach the team has taken are effective. They are now looking to make further changes and are considering the use of a Database-as-a-Service - such as MongoDB Atlas. Downey explained:
The value of that is that it cuts across a number of different areas. We don’t need the ownership cost of running a cluster and making sure it’s up to date and backing it up and testing restorations. From an operational perspective it’s a bit nicer and will reduce some support effort. It makes running that part of the service someone else’s problem. We’re here to provide benefits to the nation, we’re not here to run database clusters. The other important aspect is giving more control to the delivery teams and allowing them to scale a bit. At the moment there’s a bunch of shared resources and so different aspects of the system will compete for resource. With the addition of something like DBaaS that means creating single clusters or database instances should become a whole bunch easier, which means that we can better decouple our system.
The significance of the work DWP Digital is doing with Universal Credit is not lost on Padgham and Downey either, as they look to the months ahead and what pressures may be placed on the system down the line. Padgham said:
Technology is one thing, but more importantly, Universal Credit is a system whose failure affects people with real life consequences. If people don’t get paid their families go hungry, they could miss their rent, or not afford their heating. And Britain has entered the deepest recession since records began, so this will continue to have an impact for months, if not years to come. Our focus remains on making sure people get paid and have access to the support they need, when they need it.
Author: Derek du Preez
Date: 2020-11-13
URL: https://diginomica.com/how-dwp-managed-surge-demand-universal-credit-during-covid-19
diginomica.com