I have a question concerning the sleep stages in the profusion and the HRV data from the MESA dataset.
The sleep stages, extracted in 30-second epochs, have different lengths in the two files, possibly because the recordings end at different times.
E.g. for mesaid = 2: in mesa-sleep-0002-profusion.xml the sleep stages have a length of 1319,
while for HRV in mesa-sleep-0002-rpoint.csv the sleep stages have a length of 1060.
Just cutting the longer one off seems logical for many mesaids, as the two sequences are otherwise identical.
But the problem is that for some specific mesaids the sequences themselves differ.
E.g. for mesaid = 2: HRV and Profusion agree up to index 177; after that the sequence is shifted and the two differ.
HRV: [2, 2, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 2, 2, 1, 2, 1, 2, 2, 2, 2, 2, 2, 0, 1, 2, 2, 2, 2, 2, 3, 2, 2, 2]
profusion: [2, 2, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 2, 2, 1, 2, 1, 2, 2, 2, 2, 2, 2, 0, 1, 2, 2, 2, 2, 2, 3, 2, 2]
Therefore, I am not sure whether, or how, the two sleep stage sequences can be combined. Maybe I am missing something, e.g. that some specific intervals are deleted in the HRV file, which I could then map onto the Profusion sequence?
Please let me know if someone has an idea.
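The comparison described above (truncate to the shorter length, then look for the first epoch where the two sequences disagree) can be sketched as follows. This is only a minimal illustration: `first_mismatch` is a hypothetical helper, and the stage lists are toy values, not data from the MESA files.

```python
def first_mismatch(a, b):
    """Return the first index at which a and b disagree over their common
    length, or None if the shorter sequence is a prefix of the longer one."""
    for i, (x, y) in enumerate(zip(a, b)):  # zip stops at the shorter sequence
        if x != y:
            return i
    return None


# Toy stage sequences (illustrative only): the second has one extra wake epoch (0).
hrv_stages = [2, 2, 0, 0, 1, 2, 2]
profusion_stages = [2, 2, 0, 0, 0, 1, 2, 2, 2]

print(first_mismatch(hrv_stages, profusion_stages))  # 4
```

Note that when an epoch is dropped inside a run of identical stages, the first visible disagreement only appears at the end of that run, so the returned index is an upper bound on where the missing epoch actually is.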
Thanks for raising the issue. I will look into this. Can you post two IDs that don't have any mismatches (assuming you cut off the longer sequence) and another ID (like mesaid = 2) where there are mismatches?
Is the sequence you posted starting at position/epoch 153 for mesaid = 2? Just wanted to confirm I'm looking at the same snippet.
At a glance I see that the HRV CSV file does not contain any Rpoints from the 157th epoch (wake according to Profusion) for mesaid = 2. Would this account for your sequences "missing" a 0? There are 22 zeros in the HRV array and 23 in the Profusion array you posted. I was looking at the "epoch" column in the HRV CSV file. There would appear to be a 38.84 second "jump" in time (seconds column) between Rpoints.
My guess would be that this area was scrubbed/censored during the HRV Rpoint identification. Maybe there was artifact in the signal. I could have a PSG scorer here look it up to confirm. Let me know if this sort of missingness would account for the sequence shifts you are seeing across subjects.
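A rough way to check for this kind of missingness across subjects is sketched below. It assumes the per-beat values of the "epoch" and "seconds" columns of the HRV CSV have already been loaded into plain Python lists. The function names and the 30-second threshold (one epoch) are hypothetical choices for illustration, and the numbers are toy values.

```python
def missing_epochs(beat_epochs):
    """Return epoch numbers between the first and last observed epoch
    that contain no R-points at all."""
    present = set(beat_epochs)
    return [e for e in range(min(present), max(present) + 1) if e not in present]


def time_gaps(beat_seconds, threshold=30.0):
    """Return (row_index, gap_in_seconds) for each R-point that follows
    the previous one by more than `threshold` seconds."""
    pairs = zip(beat_seconds, beat_seconds[1:])
    return [(i, b - a) for i, (a, b) in enumerate(pairs, start=1) if b - a > threshold]


# Toy data: epoch 157 has no beats, and there is a long pause in the beat times.
toy_epochs = [155, 155, 156, 156, 158, 158, 159]
toy_seconds = [4650.0, 4651.0, 4680.0, 4680.9, 4719.74, 4720.6, 4750.2]

print(missing_epochs(toy_epochs))  # [157]
print(time_gaps(toy_seconds))
```

Epochs at the start or end of the night that have no R-points are not reported by `missing_epochs`, since it only looks between the first and last observed epoch.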
Thank you for the fast reply. Yes, I think this could account for the mismatch.
To answer your previous questions: e.g. mesaid = 6 has the same length and the same sequence in both files, while mesaid = 14 has the same sequence but a different length.
And yes, I was looking at epoch 153 onwards.
It would be very helpful to know why certain epochs are missing for different mesaids. Could you confirm that this is because of artifacts, as you mentioned?
The scorer reviewed mesaid = 2 and noted that there is artifact on the ECG channel that starts in epoch 156 and ends in epoch 158. The HRV reviewing/editing program (Somte) presumably censored the Rpoint data from those segments. I think it's reasonable to assume this (i.e. excluded ECG artifact) will be the common thread across other missing segments/areas of the files. Note that the ECG beats were only reviewed/analyzed in the sleep period between sleep onset and lights on.
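If the missing segments are indeed censored epochs, one way to make the two stage sequences comparable is to index the Profusion stages by epoch number and keep only the epochs that have at least one R-point in the HRV file. The sketch below assumes that the HRV "epoch" column and the Profusion stage list refer to the same epoch numbering, with the first epoch numbered 1. That is an assumption to verify against the files, `align_by_epoch` is a hypothetical helper, and the data are toy values.

```python
def align_by_epoch(profusion_stages, beat_epochs, first_epoch=1):
    """Return (epoch, stage) pairs for every epoch that has at least one
    R-point, taking the stage from the Profusion sequence.

    `first_epoch` is the epoch number of profusion_stages[0] (assumed 1)."""
    aligned = []
    for epoch in sorted(set(beat_epochs)):
        idx = epoch - first_epoch
        if 0 <= idx < len(profusion_stages):  # skip epochs outside the annotation
            aligned.append((epoch, profusion_stages[idx]))
    return aligned


# Toy data: epochs 1, 2 and 5 have no R-points (e.g. before sleep onset or censored).
toy_profusion = [0, 0, 2, 2, 3, 0, 2]
toy_beat_epochs = [3, 3, 4, 6, 7, 7]

print(align_by_epoch(toy_profusion, toy_beat_epochs))
# [(3, 2), (4, 2), (6, 0), (7, 2)]
```

Matching on the epoch number rather than on list position avoids the shift that appears when censored epochs are simply absent from the HRV sequence, and it also handles the HRV data covering only the period between sleep onset and lights on.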
Hope this helps in your work!