Phase 2 – Collect Data

Patient Pathways Guide: Phase 2 – Collect Data


Collect Data

The second phase of the PPA is to research and collect the data sources needed for each of the core metrics. At least one care seeking data source and one data source on health facilities will be required, and it is likely that additional data sources will be needed.

In general, TB-specific data should be prioritized over general data. For example, where both TB-related care seeking data from a TB prevalence survey and general illness care seeking data from a DHS survey are available, the data from the TB prevalence survey should be prioritized.

Further, where sub-national data are available, they should be prioritized over national-level data. In some cases, care seeking data are available for TB only at the national level, but general care seeking data are available at the sub-national level. In these situations, the scope of the analysis decided on in Phase 1 should help determine which data to use. For example, if the scope of the PPA is the national level, TB-specific data at the national level should be prioritized. However, if the scope of the PPA is sub-national, care seeking for general illness can often be a useful proxy for TB care seeking. This relationship between TB and general care seeking can be tested by comparing the two data sources at the national level.
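One way to make the national-level comparison concrete is to place the two care seeking distributions side by side and summarize how far apart they are. The sketch below uses total variation distance, which is a choice made here for illustration and not a measure prescribed by this guide. The category labels and shares are hypothetical.

```python
def total_variation_distance(p, q):
    """Half the sum of absolute differences between two share distributions."""
    categories = set(p) | set(q)
    return 0.5 * sum(abs(p.get(c, 0.0) - q.get(c, 0.0)) for c in categories)


# Hypothetical national-level shares of initial care seeking (illustrative only).
tb_shares = {
    "Public L0": 0.10, "Public L1": 0.40,
    "Private L0": 0.30, "Private L1": 0.20,
}
general_shares = {
    "Public L0": 0.15, "Public L1": 0.35,
    "Private L0": 0.35, "Private L1": 0.15,
}

distance = total_variation_distance(tb_shares, general_shares)
print(f"Total variation distance: {distance:.2f}")
```

A value near 0 suggests the general illness pattern tracks the TB pattern closely at the national level; what counts as close enough is a judgment for the team to make.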

When reviewing data sources, the team should put a limit on the age of the data sources. It is often useful if this time limit is programmatically informed. For example, if major changes to the TB program were implemented six years prior to the time of conducting the PPA, it might be useful to limit the data sources to those from within the last six years.
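As a small illustration of applying such an age limit, the sketch below filters a list of candidate sources by publication year. The source names, years, and the field name "year" are hypothetical.

```python
def filter_by_age(sources, reference_year, max_age_years):
    """Keep sources published within max_age_years of reference_year."""
    return [
        s for s in sources
        if 0 <= reference_year - s["year"] <= max_age_years
    ]


# Hypothetical candidate sources (illustrative only).
candidate_sources = [
    {"name": "Survey A", "year": 2012},
    {"name": "Survey B", "year": 2014},
    {"name": "Facility list C", "year": 2018},
]

# A PPA conducted in 2020 with a six-year limit.
recent = filter_by_age(candidate_sources, reference_year=2020, max_age_years=6)
print([s["name"] for s in recent])
```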

As mentioned earlier, collecting data for the PPA will require between one and three weeks of effort. As soon as the team has identified available data sources, they should determine whether or not the data need to be requested from other partners. If so, those requests should be made immediately, as it can take some time for access to be granted.

The rest of this section will describe the most common data sources used for each of the core metrics.

The goal of this section is to develop a list of all the data sources that will be used in the PPA, the types of data files that are available (e.g., a .csv file, an online database, or a table in a report), and whether or not the team will need to request the data from a third party in order to access them.
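The list described above could be kept in any format. As one possible sketch, the structure below records the three attributes named in the paragraph (the source, its file type, and whether a request is needed) alongside the metric it supports. The class name, field names, and example entries are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class DataSource:
    metric: str          # core metric the source supports
    name: str            # name of the data source
    file_type: str       # e.g. ".csv", "online database", "table in a report"
    needs_request: bool  # True if access must be requested from a third party


# Hypothetical inventory entries (illustrative only).
inventory = [
    DataSource("Number of Health Facilities", "Health Facility Master List", ".csv", False),
    DataSource("Place of Initial Care Seeking", "TB Prevalence Survey", ".csv", True),
    DataSource("TB Services Coverage", "National TB Lab Database", "Excel", True),
]


def pending_requests(sources):
    """Names of the sources that still need an access request."""
    return [s.name for s in sources if s.needs_request]


print(pending_requests(inventory))
```

Listing the sources that need a request first makes it easier to send those requests early, since access can take time.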


Data Sources for Core Metrics

Number of Health Facilities

A census or survey of all health facilities in the country is essential to conducting a PPA. This includes facilities with TB services and facilities without TB services. This list of facilities serves as the denominator that enables calculation of TB service coverage.

Health Facility Master List (HFML)
Data Source Description
The best source of data on number of health facilities is a national health facility master list or registry. These databases often provide a list of all registered health facilities in a country, along with important details about each facility, which might include facility name, ownership, location, or type of facility (e.g. hospital, clinic).
Where to Find Data
These databases are typically publicly accessible and are available to download from a Ministry of Health website. In some cases, they will not be accessible publicly and will need to be requested from a MoH counterpart.
Typical File Type
Most websites have an option to download the entire list rather than just view it on the website. If possible, it is best to access an Excel or .csv file that lists all of the facilities in the database.
Examples

Place of Initial Care Seeking

The first core PPA metric is the proportion of patients that initiate their care seeking journey at each sector and level of the health system (e.g., Public Level 1 or Private Level 0). This section describes some of the most common data sources that provide information on care seeking patterns, listed in order of priority. As described in the introduction for this section, if multiple data sources are available, it’s best to prioritize TB-specific data wherever possible. After that, using data on general illness (rather than specific non-TB illness or child illness) should be prioritized.
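To illustrate the metric itself, the sketch below turns a list of survey responses into the share of respondents who started care seeking at each sector and level. The responses are hypothetical, and the calculation is unweighted; real survey data would normally require the survey's sample weights, which are not handled here.

```python
from collections import Counter


def initial_care_shares(responses):
    """Share of respondents per (sector, level) of initial care seeking."""
    counts = Counter(responses)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {place: n / total for place, n in counts.items()}


# Hypothetical responses as (sector, level) pairs (illustrative only).
responses = [
    ("Public", 1),
    ("Private", 0),
    ("Private", 0),
    ("Public", 1),
]

shares = initial_care_shares(responses)
for (sector, level), share in sorted(shares.items()):
    print(f"{sector} Level {level}: {share:.0%}")
```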

Service Availability and Readiness Assessment (SARA)
Data Source Description
If the SARA is a census, it may be used for Number of Health Facilities, TB Services Coverage, or both. For example, a team may use a Health Facility Master List for the Number of Facilities metric and a census SARA for coverage of one or more TB services. Or, in the absence of a Health Facility Master List, a PPA team could use a census SARA for the Number of Facilities metric and NTP records or other data sources for the TB Services Coverage metrics. However, if the SARA is a survey, it must be used for both the Number of Facilities metric and all TB Services Coverage metrics so that the proportions calculated for TB Services Coverage are accurate. More information about SARAs can be found in the TB Services Coverage section below.
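The reason a sample-based SARA must supply both the numerator and the denominator can be seen in a short worked example. All numbers below are hypothetical and chosen only to show the size of the distortion.

```python
# Hypothetical numbers for illustration only.
sampled_facilities = 200        # facilities visited in a sample-based SARA
sampled_with_microscopy = 50    # of those, facilities offering microscopy
master_list_facilities = 2000   # facilities on a separate master list

# Numerator and denominator from the same survey: a consistent proportion.
consistent_coverage = sampled_with_microscopy / sampled_facilities

# Survey numerator over a master-list denominator: coverage is understated,
# because the numerator only counts facilities that happened to be sampled.
mixed_coverage = sampled_with_microscopy / master_list_facilities

print(f"Consistent: {consistent_coverage:.1%}")
print(f"Mixed:      {mixed_coverage:.1%}")
```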
Demographic and Health Survey (DHS)
Data Source Description
The Demographic and Health Surveys are large, nationally representative surveys that collect a wide variety of information on health, population, and—in some cases—care seeking. They are conducted for many countries at roughly five-year intervals. In each iteration of the DHS, there is typically a survey given to a household, as well as to males and females separately. The data from the female survey are referred to as the individual recode and provide the widest variety of survey questions on care seeking patterns. There are four questions about care seeking that are most commonly included in an individual recode data set. A fifth question is less common, but can be useful for the PPA. These care seeking questions (and their associated variable codes) are:
  1. h44a Place of care seeking for a child with diarrhea
  2. h46a Place of care seeking for a child with fever
  3. v829 Place of care seeking for an HIV test
  4. v842 Place of care seeking for an antenatal HIV test
  5. s1113 Place of care seeking for any illness or treatment in the last 30 days, among adults
If question 5 is available, this is the best DHS metric to use in a PPA as it focuses on adult care seeking patterns (rather than children). However, if it is not available, question 2 can be a reasonable proxy for TB-related care seeking.
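The preference described above can be expressed as a simple lookup over the variables present in an individual recode file. This is a minimal sketch: the function name is hypothetical, and the list of column names is assumed to come from whatever tool is used to open the .dta file (for example, the columns of a table loaded with pandas' read_stata).

```python
# Order of preference taken from the text: adult care seeking first,
# then care seeking for a child with fever as a proxy.
PREFERRED_VARIABLES = ["s1113", "h46a"]


def pick_care_seeking_variable(columns):
    """Return the first preferred DHS variable found in columns, or None."""
    available = {c.lower() for c in columns}
    for variable in PREFERRED_VARIABLES:
        if variable in available:
            return variable
    return None


# Hypothetical column lists from two different surveys (illustrative only).
print(pick_care_seeking_variable(["caseid", "h44a", "h46a", "s1113"]))
print(pick_care_seeking_variable(["caseid", "h44a", "h46a", "v829"]))
```

A result of None means neither variable is present, in which case the team would need to look at other data sources.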
Where to Find Data
To access the DHS data, you need to download the full survey dataset from the DHS website:
  1. The first step in the process of accessing DHS data is to register on the DHS website. A link to the registration page is here.
  2. After successfully registering, you should navigate to the login page and create a username and password to log in.
  3. Once you have logged in, you will need to create a new project before you can request any data sets. Projects require information about the title of the project, the researchers who should be included on the project, and a short description of the project. The project description should describe the patient pathway analysis and any additional potential ways you might use the care seeking information.
  4. Once you’ve created your project, you can then request data sources to add to that project. You will be notified that access was granted via email. Access is typically granted within one to two days.
Typical File Type
DHS survey results are available in several different file types. However, to be used in the PPA Wizard, the survey data should be downloaded as a .dta file. The zip file containing the .dta file will also include several other files that will include useful documentation, including variable codes and value codes.
Examples
TB Prevalence Survey
Data Source Description
WHO describes prevalence surveys as ‘cross-sectional and population-based surveys of a random sample of the population in which the number of people with TB disease in the survey sample is measured.’ In addition to measuring the prevalence of TB in the population, many prevalence surveys ask questions about where participants sought care when they first recognized symptoms of illness related to TB. The responses to this question provide the most useful care seeking data for a TB PPA, because they are specific to TB and are often representative of the population at the national level.

In some cases, prevalence surveys include a variable on the region or other sub-national area in which survey respondents reside. However, sub-national data should be reviewed with caution, based on the statistical power of the survey. Requests need to be made to the MoH or NTP to access the raw data from a prevalence survey. If the raw data are available, the team may need to refer to a data dictionary to explain some of the variables or responses included in the dataset.
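A rough way to see why sub-national breakdowns call for caution is to compare the precision of the same proportion at national and regional sample sizes. The sketch below uses the normal approximation for a simple random sample, which is an assumption made here for illustration; prevalence surveys typically use cluster designs, so real intervals would usually be wider. The proportion and sample sizes are hypothetical.

```python
import math


def approx_half_width(p, n, z=1.96):
    """Approximate 95% margin of error for a proportion p from n respondents."""
    return z * math.sqrt(p * (1 - p) / n)


# Hypothetical: 40% report a given place of first care seeking.
national = approx_half_width(0.40, 1000)  # national sample of respondents
regional = approx_half_width(0.40, 40)    # respondents from a single region

print(f"National margin: +/- {national:.1%}")
print(f"Regional margin: +/- {regional:.1%}")
```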
Where to Find Data
Raw data are typically available from the National TB Program, Ministry of Health, or Ministry of Statistics. If they are not available on a public website, they could also be available upon request from one of these Ministries. The WHO TB team may also be able to provide data from prevalence surveys, upon request.
Typical File Type
The raw data from a prevalence survey will likely be accessible in an Excel or .csv file.
Examples
Health Expenditure and Utilization Survey (HEUS)
Data Source Description
Health Expenditure and Utilization Surveys provide data on patient use of and expenditure in the health care system of a country. Occasionally, these surveys include TB-specific expenditure and utilization data. Often, they include information about where patients seek care for general illness or outpatient care. The latter can be used as a proxy for TB care seeking. In some cases, these surveys are powered at the sub-national level, and if the scope of the PPA exercise is sub-national, the HEUS is a good data source to use for care seeking patterns.
Where to Find Data
Raw data are typically available from the Ministry of Health or Ministry of Statistics. If they are not available on a public website, they could also be available upon request from one of these Ministries.
Typical File Type
The raw data from an HEUS will likely be accessible in an Excel or .csv file.
Examples

TB Services Coverage

The next metric of the PPA is the percent of health facilities in each health sector and level that have TB services available. Typically, this metric represents the availability of the tool or service within a facility, but does not capture whether there is capacity to use the tool (e.g., diagnose) or the quality of the tool or service. Several common data sources can be used to calculate TB services coverage.
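As an illustration of the metric, the sketch below computes the percent of facilities in each sector and level that offer a given service. The facility records and the field names ("sector", "level", "services") are hypothetical, and the records are assumed to come from a complete facility list.

```python
from collections import defaultdict


def coverage_by_group(facilities, service):
    """Percent of facilities per (sector, level) that offer the service."""
    totals = defaultdict(int)
    with_service = defaultdict(int)
    for facility in facilities:
        key = (facility["sector"], facility["level"])
        totals[key] += 1
        if service in facility["services"]:
            with_service[key] += 1
    return {key: 100.0 * with_service[key] / totals[key] for key in totals}


# Hypothetical facility records (illustrative only).
facilities = [
    {"sector": "Public", "level": 1, "services": {"microscopy"}},
    {"sector": "Public", "level": 1, "services": set()},
    {"sector": "Private", "level": 0, "services": set()},
    {"sector": "Private", "level": 0, "services": {"microscopy", "Xpert"}},
    {"sector": "Private", "level": 0, "services": set()},
    {"sector": "Private", "level": 0, "services": set()},
]

microscopy_coverage = coverage_by_group(facilities, "microscopy")
for (sector, level), percent in sorted(microscopy_coverage.items()):
    print(f"{sector} Level {level}: {percent:.0f}% with microscopy")
```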

National TB Lab Database
Data Source Description
A national TB lab database will provide a list of all the known TB labs in the country, with information about the location, sector, and, typically, the health facility name. These lists include the types of diagnostic tools available, such as microscopy, Xpert, or drug susceptibility testing (DST). This information can be used in conjunction with the National Health Facility Master List (HFML) to calculate TB services coverage. The lab lists will often include information about whether treatment initiation or TB drugs are also available in this facility.
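Using the lab list together with the master list usually means matching lab records to facilities. The sketch below matches on facility name after normalizing case and spacing. This is a simplifying assumption: real lists may need facility codes or manual review to resolve spelling differences. All names and field names are hypothetical.

```python
def normalize(name):
    """Lowercase a facility name and collapse repeated whitespace."""
    return " ".join(name.lower().split())


def flag_facilities_with_labs(master_list, lab_list):
    """Copy each master-list facility, adding a has_tb_lab flag."""
    lab_names = {normalize(lab["facility_name"]) for lab in lab_list}
    return [
        {**facility, "has_tb_lab": normalize(facility["name"]) in lab_names}
        for facility in master_list
    ]


def unmatched_labs(master_list, lab_list):
    """Lab records whose facility name is not found on the master list."""
    known = {normalize(facility["name"]) for facility in master_list}
    return [
        lab["facility_name"] for lab in lab_list
        if normalize(lab["facility_name"]) not in known
    ]


# Hypothetical records (illustrative only).
master_list = [
    {"name": "Central  Hospital", "sector": "Public"},
    {"name": "Riverside Clinic", "sector": "Private"},
]
lab_list = [
    {"facility_name": "central hospital", "tools": ["microscopy", "Xpert"]},
    {"facility_name": "Hilltop Lab", "tools": ["microscopy"]},
]

flagged = flag_facilities_with_labs(master_list, lab_list)
print([(f["name"], f["has_tb_lab"]) for f in flagged])
print("Needs review:", unmatched_labs(master_list, lab_list))
```

Reviewing the unmatched labs matters because a lab that fails to match would otherwise be left out of the coverage numerator.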
Where to Find Data
Most often, lab databases will need to be accessed through a staff member of the NTP, but in some cases, these databases are publicly available.
Typical File Type
Most websites have an option to download the entire list rather than just view it on the website. If possible, it is best to access an Excel or .csv file that lists all of the facilities in the database.
Examples
Service Availability and Readiness Assessment (SARA) or Service Provision Assessments (SPA)
Data Source Description
The SARA and SPA are both assessment tools that capture the availability of health services among health facilities across a country. TB services, both diagnosis and treatment, are commonly included in both types of surveys. SARAs can be conducted as either a census or a representative survey, and SPAs are conducted as a representative survey of health facilities in a country. Final reports from both tools often include information that is helpful in guiding a PPA, including TB service coverage at the sub-national level. A PPA team will need to request the raw data for the SARA or SPA to complete a PPA using the PPA Wizard.
Where to Find Data
SPA data sets are available from the DHS program that also produces the DHS surveys. The steps described in the section above about DHS data should be followed to access SPA data as well. The SARAs are stored by WHO.
Typical File Type
Similar to DHS datasets, SPA datasets are provided in a variety of file types. For use in the PPA Wizard, SPA datasets should be downloaded as a .dta file. SARA datasets may be provided in a variety of file types, but either a .csv or .dta file should be used in the PPA Wizard.
Examples
