Automated sleep staging based on multi-module neural network using simpler signal: respiratory signal
Yinqing Que, Pengyi Jiang, Tianyi Zhang, Yunzhang Cheng
Proceedings Volume 12779, Seventh International Conference on Mechatronics and Intelligent Robotics (ICMIR 2023); 127791T (11 September 2023); https://doi.org/10.1117/12.2688854
Event: Seventh International Conference on Mechatronics and Intelligent Robotics (ICMIR 2023), 2023, Kunming, China
Abstract
Sleep is a vital biological state that helps maintain homeostasis in virtually all living organisms. A full night of sleep can be divided into repeating stages: the Rapid Eye Movement (REM) stage and Non-Rapid Eye Movement (NREM) stages one to four. An effective sleep staging system can help patients improve their sleep quality. Traditionally, patients are required to wear a polysomnography (PSG) device for a whole night to collect signals such as the electroencephalogram (EEG), electrooculogram (EOG), and electromyogram (EMG) for diagnosis, and traditional sleep staging systems use one or more of these signals to predict the sleep stage. In this paper, we introduce a new sleep staging algorithm based on machine learning. Our model has two main inputs: the patient's respiratory signal and their physical data, such as age, gender, and weight. The strategy is to use two CNNs to extract features from the raw respiratory signal in the time domain and the frequency domain, several Word2vec layers to extract features from the patient's metadata, and a transformer encoder to combine all the features. Using the MIT-BIH Polysomnographic Database, our model achieves an accuracy of 81.96%. This shows that it is entirely feasible to classify a patient's sleep stage from the respiratory signal and meta information.

1. INTRODUCTION

Sleep is an important biological activity for most creatures. With sufficient sleep, living organisms have a chance to recover from the day's activities and maintain a normal physiological balance. During sleep, several physiological changes occur: sensory awareness and metabolic rate decrease, and muscle activity is inhibited. As a result, the organism produces less metabolic waste than it can eliminate1. Sleep is therefore an important way for organisms to restore normal function to their various systems, allowing better performance during waking activities.

Effective, healthy sleep is closely related to human health. Some people suffer from poor sleep quality, and accurate classification of sleep state together with a detailed analysis of sleep quality has become a vital prerequisite for doctors who assist in improving sleep. According to research on the human sleep process, human sleep activity is an irregular biological state2. Sleep can be divided into a series of repeated stages; the cycle varies from person to person, lasting between 60 and 90 minutes. These stages comprise wakefulness, NREM sleep, and REM sleep, and NREM sleep can be further subdivided into four stages: I, II, III, and IV. This is the R&K standard, which is widely used around the globe to classify sleep stages3.

One of the authoritative references on sleep scoring and staging is the AASM Manual for the Scoring of Sleep and Associated Events: Rules, Terminology and Technical Specifications. According to this manual, patients are required to wear PSG equipment, fitted with the help of an experienced doctor, when their sleep stages are assessed. Typically, there are three to six electrodes for collecting EEG signals, three electrodes for EMG signals, and another two electrodes for EOG signals, in addition to pulse oximeters and breathing detectors4. After the data are collected, an experienced doctor completes the final sleep stage classification based on the EEG curve characteristics, combined with other supporting features such as the patient's physical behavior, movement events, and respiratory behavior. Although PSG is currently the reference standard for sleep staging, the PSG device is large and complicated to use. Patients sleep with dozens of sensors attached to the body, so their sleep quality is itself affected by the equipment, which introduces additional variables beyond the patients themselves. The raw data must then undergo long manual monitoring and analysis by doctors, which can introduce errors and deviations from the true staging.

The surge in the occurrence of sleep problems has spurred extensive research in sleep science. Sleep staging is a widely researched, multidisciplinary application that spans medicine, sensor technology, and computer science. The traditional method of sleep staging involves manual scoring by experienced doctors, which is a time-consuming and expensive process. Consequently, considerable research has been devoted to automating it. A variety of automatic sleep staging systems have been developed, including systems based on EEG5-7, systems based on all PSG signals8-10, and systems based on features pre-extracted from EEG and EOG signals11.

In this paper, we build an automatic sleep staging model using deep learning. We take only the patient's respiratory signal collected via PSG and basic physical data as inputs. First, a low-pass filter is applied to reduce high-frequency noise in the original signal. Next, we extract features from the respiratory signal in the time domain and the frequency domain simultaneously using the short-time Fourier transform, the continuous wavelet transform, and convolutional neural networks. Finally, a transformer encoder combines the features extracted above with additional patient metadata, including gender, age, and weight. The model is thus a combined application of convolutional neural networks, a transformer encoder, and Word2vec to unveil hidden features within the signal data.

2. METHODOLOGY

Data

We use the respiratory signal collected via PSG from the public MIT-BIH Polysomnographic Database12 for training and evaluation. The signal is sampled at 250 Hz. After slicing the raw records into 30-second segments, we obtain 15,575 slices, 3,893,750 data points in total. Before training, because the original data distribution is imbalanced, we perform data expansion and augmentation. After preprocessing, 85% of the dataset is selected as the training set, and the rest is used as the validation and test set.
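For readers who want to reproduce the data pipeline, a minimal sketch of loading one record with the open-source wfdb package is shown below. The record name ("slp01a"), the PhysioNet directory ("slpdb"), and the way the respiration channel is located are illustrative assumptions, not details reported in the paper.

# Hedged sketch: load one MIT-BIH Polysomnographic record and its sleep-stage
# annotations from PhysioNet. Record name and channel selection are assumptions.
import wfdb

record = wfdb.rdrecord("slp01a", pn_dir="slpdb")     # signals + header metadata
stages = wfdb.rdann("slp01a", "st", pn_dir="slpdb")  # 30 s sleep-stage annotations

print(record.fs)          # sampling frequency (250 Hz)
print(record.sig_name)    # channel names; choose the respiration channel from here
resp_idx = [i for i, n in enumerate(record.sig_name) if "Resp" in n][0]
resp = record.p_signal[:, resp_idx]                  # raw respiratory signal
print(stages.aux_note[:5])  # annotation strings such as "W", "1", "2", "R"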

Preprocessing

Owing to limitations in the raw data collection, the signal contains substantial high-frequency noise. We therefore designed a Butterworth low-pass filter to remove this noise; the comparison is shown in figure 1.

Figure 1. Signal before (top) and after (bottom) filtering.

After the signal is filtered, we divide it into 30-second segments and save each segment with the label annotated by the experienced doctors, as provided by the database.
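A minimal sketch of this preprocessing step with SciPy is given below. The cutoff frequency and filter order are illustrative assumptions, since the paper does not report the exact Butterworth parameters.

# Minimal preprocessing sketch: Butterworth low-pass filtering and 30-second
# segmentation of a 250 Hz respiratory signal. Cutoff (1 Hz) and order (4)
# are assumed values for illustration, not the paper's reported parameters.
import numpy as np
from scipy.signal import butter, filtfilt

FS = 250          # sampling frequency (Hz)
EPOCH_SEC = 30    # length of one sleep-staging epoch (s)

def lowpass(signal, cutoff_hz=1.0, order=4, fs=FS):
    """Zero-phase Butterworth low-pass filter to suppress high-frequency noise."""
    b, a = butter(order, cutoff_hz, btype="low", fs=fs)
    return filtfilt(b, a, signal)

def segment(signal, fs=FS, epoch_sec=EPOCH_SEC):
    """Split a filtered record into non-overlapping 30-second epochs."""
    n = fs * epoch_sec
    n_epochs = len(signal) // n
    return signal[: n_epochs * n].reshape(n_epochs, n)

# Usage with a synthetic record (replace with a real MIT-BIH respiratory trace):
raw = np.random.randn(FS * 3600)       # one hour of placeholder data
epochs = segment(lowpass(raw))         # shape: (120, 7500)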

Feature extraction

The short-time Fourier transform (STFT) is a method of extracting local frequency features from a signal. First, it uses a fixed window function to divide a long, continuous time-domain signal into several shorter segments of equal length. Then a Fourier transform is performed on each segment separately. Finally, the transformed, overlapping segments are assembled into a spectrogram of the entire signal.

Given a time-domain signal x[n], a window w[n], and a hop size H, the calculation is as follows:

$$X(m, \omega) = \sum_{n=-\infty}^{\infty} x[n]\, w[n - mH]\, e^{-j\omega n}$$

$$\mathrm{spectrogram}\{x\}(m, \omega) = \left| X(m, \omega) \right|^{2}$$

The continuous wavelet transform (CWT) is another method of extracting and compressing information from the original time-domain signal. It convolves the original signal with a family of wavelet functions, decomposing the complex information carried by the signal into several basic waves so that the signal can be analyzed in the time domain and the frequency domain simultaneously.

Given a mother wavelet ψ(t), a function that is continuous in both the time domain and the frequency domain, child wavelets are generated for the subsequent calculation:

$$\psi_{a,b}(t) = \frac{1}{\sqrt{a}}\, \psi\!\left(\frac{t - b}{a}\right), \qquad a > 0,\ b \in \mathbb{R}$$

By scaling and translating the mother wavelet and convolving it with the original signal, we obtain:

$$W_x(a, b) = \frac{1}{\sqrt{a}} \int_{-\infty}^{\infty} x(t)\, \psi^{*}\!\left(\frac{t - b}{a}\right) \mathrm{d}t$$

We use the STFT and the CWT to extract frequency-domain features from the original signal and save them as 224x224 color maps, as shown in figure 2.

Figure 2. Signal features (left: STFT, right: CWT).
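The following sketch shows one way to render such 224x224 feature maps with SciPy, PyWavelets, and Matplotlib. The window length, overlap, wavelet ("morl"), scale range, and colormap are assumptions chosen for illustration rather than the paper's settings.

# Hedged sketch: render the STFT magnitude and CWT scalogram of one 30-second
# epoch as 224x224 color images. Transform parameters are illustrative only.
import numpy as np
import pywt
from scipy.signal import stft
import matplotlib.pyplot as plt

FS = 250  # sampling frequency (Hz)

def stft_map(epoch, out_path):
    """Save the STFT magnitude of one epoch as a 224x224 color image."""
    f, t, Z = stft(epoch, fs=FS, nperseg=256, noverlap=192)
    _save_map(np.abs(Z), out_path)

def cwt_map(epoch, out_path, scales=np.arange(1, 65)):
    """Save the CWT scalogram of one epoch as a 224x224 color image."""
    coef, _ = pywt.cwt(epoch, scales, "morl", sampling_period=1.0 / FS)
    _save_map(np.abs(coef), out_path)

def _save_map(mag, out_path):
    # Fixed-size figure (2.24 in x 2.24 in at 100 dpi = 224x224 px), no axes.
    fig = plt.figure(figsize=(2.24, 2.24), dpi=100)
    ax = fig.add_axes([0, 0, 1, 1])
    ax.axis("off")
    ax.imshow(mag, aspect="auto", origin="lower", cmap="jet")
    fig.savefig(out_path, dpi=100)
    plt.close(fig)

# Usage: stft_map(epochs[0], "epoch0_stft.png"); cwt_map(epochs[0], "epoch0_cwt.png")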

Model

In general, a person's respiratory signal is related to their physical status and basic information such as age, weight, and gender. Some traditional deep learning models take only one feature into account, for example the time-domain signal series, while ignoring other details. Inspired by the model introduced by Le et al.13, our model accepts the time-domain signal, frequency-domain features, and other metadata simultaneously, and an encoder module mines the potential relationships between the features extracted separately by the different branches. The model design is shown in figure 3:

Figure 3. Model structure.

Feature extraction module This module extracts feature vectors from the inputs separately. We use a 1D ResNet-34 to extract features from the time-domain signal and two independent 2D ResNet-34 networks to process the STFT feature map and the CWT feature map. For the metadata, a Word2vec-based encoder is used to extract features. These four feature extraction components each generate a 64-dimensional feature vector.

Transformer encoder module This module generates encodings of the information contained within and shared between the inputs. For each part of the input, the multi-head attention module weighs the relevance of the four feature vectors and extracts information from them. The feed-forward module performs additional processing on the outputs of the multi-head attention module, such as normalization and residual connections.

Model output module This module integrates the feature vectors produced by the transformer encoder module and outputs a prediction. It consists of several fully connected layers and dropout layers, with ReLU and softmax as activation functions, to generate the result.
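To make the three modules concrete, the PyTorch sketch below assembles a simplified version of the architecture. It is a hedged approximation, not the authors' exact implementation: a small 1D CNN stands in for the 1D ResNet-34, torchvision's resnet34 is reused for the two 2D branches, nn.Embedding layers stand in for the Word2vec metadata encoder, and all layer sizes are illustrative (recent PyTorch/torchvision assumed).

# Simplified multi-module sleep staging network (illustrative sketch).
import torch
import torch.nn as nn
from torchvision.models import resnet34

class SleepStagingNet(nn.Module):
    def __init__(self, n_classes=6, d_model=64):
        super().__init__()
        # Time-domain branch (placeholder for the paper's 1D ResNet-34).
        self.time_branch = nn.Sequential(
            nn.Conv1d(1, 32, kernel_size=7, stride=2, padding=3), nn.ReLU(),
            nn.Conv1d(32, 64, kernel_size=7, stride=2, padding=3), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1), nn.Flatten(), nn.Linear(64, d_model))
        # Two independent 2D branches for the STFT and CWT feature maps.
        self.stft_branch = self._image_branch(d_model)
        self.cwt_branch = self._image_branch(d_model)
        # Metadata branch: embeddings for gender / age bucket / weight bucket
        # (stand-in for the Word2vec encoder described in the paper).
        self.meta_embed = nn.ModuleList([
            nn.Embedding(2, d_model),    # gender
            nn.Embedding(12, d_model),   # age bucket
            nn.Embedding(12, d_model)])  # weight bucket
        self.meta_proj = nn.Linear(3 * d_model, d_model)
        # Transformer encoder over the four 64-dimensional feature tokens.
        layer = nn.TransformerEncoderLayer(d_model=d_model, nhead=4,
                                           dim_feedforward=128, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        # Output module: fully connected layers with ReLU and dropout.
        self.head = nn.Sequential(
            nn.Flatten(), nn.Linear(4 * d_model, 128), nn.ReLU(),
            nn.Dropout(0.3), nn.Linear(128, n_classes))

    @staticmethod
    def _image_branch(d_model):
        m = resnet34(weights=None)
        m.fc = nn.Linear(m.fc.in_features, d_model)
        return m

    def forward(self, x_time, x_stft, x_cwt, meta):
        # x_time: (B, 1, 7500); x_stft / x_cwt: (B, 3, 224, 224); meta: (B, 3) int
        f_time = self.time_branch(x_time)
        f_stft = self.stft_branch(x_stft)
        f_cwt = self.cwt_branch(x_cwt)
        f_meta = self.meta_proj(torch.cat(
            [emb(meta[:, i]) for i, emb in enumerate(self.meta_embed)], dim=-1))
        tokens = torch.stack([f_time, f_stft, f_cwt, f_meta], dim=1)  # (B, 4, 64)
        encoded = self.encoder(tokens)
        return self.head(encoded)  # class logits; softmax applied in the loss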

3. RESULTS

Datasets The MIT-BIH Polysomnographic dataset consists of 130 hours of respiratory signal recordings from 18 patients, digitized at a sampling frequency of 250 Hz. The database annotations contain 16 different labels in total; in our experiments we select the 6 labels related to sleep staging. The labels and their meanings are shown in table 1.

Table 1. Labels and their meanings

Label | Meaning
W     | Wake
R     | REM sleep
1     | NREM sleep stage 1
2     | NREM sleep stage 2
3     | NREM sleep stage 3
4     | NREM sleep stage 4

Before training, we found a class imbalance in the original database. We use the following strategies to augment the data and obtain better performance:

  • 1: For two adjacent segments with the same label, combine them into a 1-minute segment and take the central 30 seconds as a new segment. The label remains unchanged.

  • 2: For classes with fewer samples, shift the window forward and backward by 5 seconds to resample new segments. The label is consistent with the source sample.

After removing partially overlapping signal records, the data are divided into a training set and a test set at a ratio of 85:15.
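A hedged sketch of the two augmentation strategies on a 1D NumPy signal is given below. The epoch length, labels, and 5-second shift come from the text above; the array layout and helper names are illustrative assumptions.

# Illustrative implementation of the two augmentation strategies.
import numpy as np

FS, EPOCH = 250, 30
N = FS * EPOCH                 # samples per 30-second epoch
SHIFT = 5 * FS                 # 5-second shift for resampling

def merge_adjacent(signal, start_a, start_b, label):
    """Strategy 1: two adjacent epochs with the same label -> central 30 s."""
    assert start_b == start_a + N, "epochs must be adjacent"
    minute = signal[start_a:start_b + N]       # 1-minute window
    centre = minute[N // 2: N // 2 + N]        # central 30 seconds
    return centre, label

def shifted_copies(signal, start, label):
    """Strategy 2: resample an under-represented epoch shifted by +/- 5 s."""
    out = []
    for s in (start - SHIFT, start + SHIFT):
        if 0 <= s and s + N <= len(signal):
            out.append((signal[s:s + N], label))
    return out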

Precision and recall measure how well the model performs, where TP, FP, and FN denote true positives, false positives, and false negatives:

$$\mathrm{Precision} = \frac{TP}{TP + FP}$$

$$\mathrm{Recall} = \frac{TP}{TP + FN}$$

F1-score represents the harmonic mean of the precision score and recall score.

$$F_1 = \frac{2 \cdot \mathrm{Precision} \cdot \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}}$$
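As a hedged example, these metrics can be computed per class with scikit-learn; y_true and y_pred below are hypothetical placeholders for the test labels and model predictions, not the paper's data.

# Example metric computation (placeholder labels for illustration only).
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

y_true = [0, 1, 2, 2, 3, 4, 5, 1]   # hypothetical ground-truth stage labels
y_pred = [0, 1, 2, 3, 3, 4, 5, 0]   # hypothetical model predictions

print("accuracy :", accuracy_score(y_true, y_pred))
print("precision:", precision_score(y_true, y_pred, average="macro", zero_division=0))
print("recall   :", recall_score(y_true, y_pred, average="macro", zero_division=0))
print("f1-score :", f1_score(y_true, y_pred, average="macro", zero_division=0))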

The experimental results are shown in figure 4.

Figure 4. Model performance (6 classes).

To compare with other sleep staging models in table 2, we combine labels 3 and 4 into a single label 3. The prediction results are shown in figure 5.

Figure 5. Model performance (5 classes).

Table 2. Performance comparison between our model and other sleep staging models

Model   | Inputs             | Classes | Acc.
Ref. 8  | 6-channel PSG      | 5       | 87.50%
Ref. 5  | F4-EOG (Left)      | 5       | 86.20%
Ref. 9  | All PSG channels   | 5       | 85.20%
Ref. 6  | FP2-EOG (Left)     | 5       | 83.35%
Ref. 5  | FPz-Cz             | 5       | 82.00%
Ours    | Respiratory signal | 5       | 81.96%
Ref. 8  | 2-channel PSG      | 5       | 81.90%
Ours    | Respiratory signal | 6       | 81.56%
Ref. 5  | Pz-Oz              | 5       | 79.80%
Ref. 10 | EEG & EOG          | 5       | 79.00%
Ref. 14 | FPz-Cz             | 5       | 74.80%

4. DISCUSSION AND CONCLUSION

In this paper, we assume that a person's sleep depth can be inferred from the respiratory signal. To test this idea, we built a multi-module neural network and used the patients' respiratory signals collected via PSG during sleep from the MIT-BIH Polysomnographic dataset. Our model achieved an accuracy of 81.96%. By deploying our automated sleep staging system, the characteristics of the respiratory signal can be fully exploited, achieving an effect similar to models that use EEG, EOG, ECG, and the many other signals collected by PSG.

Our experiments demonstrate the feasibility of automatic sleep staging using only a patient's respiratory signal. This implies that, in some non-emergency medical scenarios, sleep staging can be completed without PSG, eliminating the influence of the complicated monitoring equipment. Our automated sleep staging system shows great potential for future scenarios such as home-based medical care. The model can be deployed on IoT devices, enabling people to monitor their sleep at home without the help of doctors. Moving forward, we believe that our automated sleep staging system will provide people with more accessible, cost-effective, and personalized sleep analytics.

REFERENCES

[1] Xie, L., Kang, H., Xu, Q., Chen, M. J., Liao, Y., Thiyagarajan, M., et al., "Sleep drives metabolite clearance from the adult brain," Science, 342(6156), 373-377 (2013). https://doi.org/10.1126/science.1241224

[2] Agarwal, R. and Gotman, J., "Computer-assisted sleep staging," IEEE Transactions on Biomedical Engineering, 48(12), 1412-1423 (2001). https://doi.org/10.1109/10.966600

[3] Rechtschaffen, A., "A manual of standardized terminology, techniques and scoring system for sleep stages of human subjects," Archives of General Psychiatry, 1-55 (1968).

[4] Berry, R. B., Brooks, R., Gamaldo, C. E., Harding, S. M., Marcus, C. and Vaughn, B. V., "The AASM Manual for the Scoring of Sleep and Associated Events: Rules, Terminology and Technical Specifications," 176, American Academy of Sleep Medicine, Darien, Illinois (2012).

[5] Supratak, A., Dong, H., Wu, C. and Guo, Y., "DeepSleepNet: A model for automatic sleep stage scoring based on raw single-channel EEG," IEEE Transactions on Neural Systems and Rehabilitation Engineering, 25(11), 1998-2008 (2017).

[6] Dong, H., Supratak, A., Pan, W., Wu, C., Matthews, P. M. and Guo, Y., "Mixed neural network approach for temporal sleep stage classification," IEEE Transactions on Neural Systems and Rehabilitation Engineering, 26(2), 324-333 (2017).

[7] Mousavi, S., Afghah, F. and Acharya, U. R., "SleepEEGNet: Automated sleep stage scoring with sequence to sequence deep learning approach," PLoS ONE, 14(5), e0216456 (2019). https://doi.org/10.1371/journal.pone.0216456

[8] Biswal, S., Sun, H., Goparaju, B., Westover, M. B., Sun, J. and Bianchi, M. T., "Expert-level sleep scoring with deep neural networks," Journal of the American Medical Informatics Association, 25(15), 1643-1650 (2018). https://doi.org/10.1093/jamia/ocy131

[9] Guillot, A., Sauvet, F., During, E. H. and Thorey, V., "Dreem open datasets: Multi-scored sleep datasets to compare human and automated sleep staging," IEEE Transactions on Neural Systems and Rehabilitation Engineering, 28(9), 1955-1965 (2020).

[10] Perslev, M., Darkner, S., Kempfner, L., Nikolic, M., Jennum, P. J. and Igel, C., "U-Sleep: resilient high-frequency sleep staging," npj Digital Medicine, 4(1), 72 (2021). https://doi.org/10.1038/s41746-021-00440-5

[11] Kuo, C. E. and Chen, G. T., "Automatic sleep staging based on a hybrid stacked LSTM neural network: verification using large-scale dataset," IEEE Access, 8, 111837-111849 (2020).

[12] Ichimaru, Y. and Moody, G., "Development of the polysomnographic database on CD-ROM," Psychiatry and Clinical Neurosciences, 53(2), 175-177 (1999). https://doi.org/10.1046/j.1440-1819.1999.00527.x

[13] Le, M. D., Rathour, V. S., Truong, Q. S., Mai, Q., Brijesh, P. and Le, N., "Multi-module recurrent convolutional neural network with transformer encoder for ECG arrhythmia classification," 2021 IEEE EMBS International Conference on Biomedical and Health Informatics (BHI), 1-5 (2021).

[14] Tsinalis, O., Matthews, P. M., Guo, Y. and Zafeiriou, S., "Automatic sleep stage scoring with single-channel EEG using convolutional neural networks," (2016).