Identifying long COVID using coded data and free text entries from primary care health records in Scotland

15 May 2023
Background: Long COVID is a debilitating multisystem condition. Accurate estimates of its prevalence are vital for policy makers and healthcare planning. To estimate prevalence, we analysed routinely collected data from almost the entire adult population of Scotland.

Methods: A cohort of adults (≥18 years) resident in Scotland between 1 March 2020 and 20 October 2022 was created from primary and secondary care, laboratory testing and prescribing data. Long COVID was identified using four outcome measures: clinical codes, free text in primary care records, free text on sick notes, and a novel operational definition. We looked for differences in the prevalence of long COVID by patient characteristics including age, sex, body mass index, deprivation, severity of disease, vaccination status and the following pre-existing respiratory diseases: asthma, chronic obstructive pulmonary disease (COPD), respiratory cancer, pulmonary embolism, cystic fibrosis, bronchiectasis or alveolitis.

Results: Of 5,104,198 participants, 90,712 (1.8%) were identified as having long COVID by one or more outcome measures. Clinical codes were recorded infrequently (n=1,092, 0.02%). More people were identified using free text (n=8,368, 0.2%), sick notes (n=14,471, 0.3%) and the operational definition (n=73,767, 1.4%). Compared with the general population, a higher proportion of people with long COVID were female, middle-aged, overweight/obese, had at least two comorbidities, were immunosuppressed, shielding, or hospitalised within 28 days of testing positive, and had tested positive before the Omicron variant became dominant. Of the respiratory diseases investigated, only asthma was found to be more prevalent among cases of long COVID identified by clinical codes, free text, or sick notes, relative to the general population.

Discussion: The prevalence of long COVID presenting to general practice in Scotland ranged from 0.02% to 1.8%, depending on the measure used. Free text in health records or sick notes identified substantially more cases than clinical codes.
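
The percentages in the Results can be reproduced from the reported counts. The following is a minimal sketch of that arithmetic only, not the authors' analysis code. The counts and cohort size come from the abstract; the names `COHORT_SIZE`, `CASES_BY_MEASURE`, `prevalence_percent` and `PREVALENCE` are hypothetical and chosen for illustration.

```python
# Counts as reported in the abstract; all identifiers here are illustrative.
COHORT_SIZE = 5_104_198

CASES_BY_MEASURE = {
    "clinical codes": 1_092,
    "free text": 8_368,
    "sick notes": 14_471,
    "operational definition": 73_767,
    "any measure": 90_712,
}


def prevalence_percent(cases: int, population: int) -> float:
    """Return cases as a percentage of the population."""
    if population <= 0:
        raise ValueError("population must be positive")
    return 100.0 * cases / population


# Unrounded prevalence (%) for each outcome measure.
PREVALENCE = {
    measure: prevalence_percent(cases, COHORT_SIZE)
    for measure, cases in CASES_BY_MEASURE.items()
}

if __name__ == "__main__":
    for measure, pct in PREVALENCE.items():
        print(f"{measure:<24}{CASES_BY_MEASURE[measure]:>8,}  {pct:.2f}%")
```

The four per-measure counts sum to 97,698, which exceeds the 90,712 people identified by any measure. This is consistent with some people being identified by more than one measure, although the abstract does not report the overlap.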

Resource information

Respiratory conditions
  • COVID-19
Respiratory topics
  • Diagnosis
Type of resource
Munich 2023
Luke Daines1, Karen Jeffrey1, Lana Woolford1, Rishma Maini2, Siddharth Basetti3, Ashleigh Batchelor4, David Weatherill4, Chris White4, Vicky Hammersley1, Tristan Millington1, Calum Macdonald1, Jenni Quint5, Steven Kerr1, Syed Ahmar Shah1, Adeniyi Francis Fagbamigbe8, Colin Simpson9, Srinivasa Vital Katikireddi10, Chris Robertson11, Lewis Ritchie12, Aziz Sheikh1

1Usher Institute, University of Edinburgh, Edinburgh, UK
2Public Health Scotland, Glasgow and Edinburgh, UK
3NHS Highland, Inverness, UK
4Patient and Public Contributors, affiliated to Usher Institute, Edinburgh, UK
5National Heart and Lung Institute, Imperial College London, London, UK
6NHS Borders, Melrose, UK
7NHS Dumfries & Galloway, Dumfries, UK
8Institute of Applied Health Sciences, Aberdeen, UK
9School of Health, Wellington Faculty of Health, Victoria University of Wellington, Wellington, New Zealand
10MRC/CSO Social & Public Health Sciences Unit, University of Glasgow, Glasgow, UK
11Department of Mathematics and Statistics, University of Strathclyde, Glasgow, UK
12Academic Primary Care, University of Aberdeen, Aberdeen, UK