
Extract hypnograms from .xml: v2

kaaremikkelsen +0 points · 10 months ago

Having now downloaded the shhs data set, I am again trying to just extract the hypnograms for further analysis.

Inspecting the xml files seems to work fine:

./luna --xml ./annotations-events-nsrr/shhs1/shhs1-200001-nsrr.xml

With an output like:

. . EpochLength 30

0 - 840 (840 secs) Stages|Stages Wake|0

0 - 32520 (32520 secs) Recording Start Time

304 - 307 (3 secs) Respiratory|Respiratory SpO2 artifact|SpO2 artifact

840 - 870 (30 secs) Stages|Stages Stage 1 sleep|1

870 - 960 (90 secs) Stages|Stages Stage 2 sleep|2

However, when I try to use the 'stage' command:

./luna -s "STAGE" ./annotations-events-nsrr/shhs1/shhs1-200001-nsrr.xml

all I get is:


+++ luna | v0.23, 15-Jan-2020 | starting 19-May-2020 14:32:21 +++


What am I doing wrong?

shaunpurcell +1 point · 10 months ago


Please check out the Luna documentation, specifically this page and the tutorials.

In general, you need to specify the EDF and any associated annotation files together (i.e. even if the analysis only happens to use data from the annotation file). A handful of commands, such as "--xml", are special cases.

Assuming you're using the most recent version of Luna:

1) Create a 'sample list', e.g. assuming I've downloaded the data in /data/nsrr/datasets/

luna --build /data/nsrr/datasets/shhs/polysomnography/edfs/ /data/nsrr/datasets/shhs/polysomnography/annotations-events-nsrr/ -ext=-nsrr.xml > s.lst

Each row will contain three tab-delimited columns (note: lines may wrap in this browser view), matching up each EDF with its XML. There will be ~5000 rows in this file, the first of which will look like this:

shhs1-200001 /data/nsrr/datasets/shhs/polysomnography/edfs/shhs1/shhs1-200001.edf /data/nsrr/datasets/shhs/polysomnography/annotations-events-nsrr/shhs1/shhs1-200001-nsrr.xml
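A quick way to sanity-check the resulting sample list (a generic shell one-liner, not a Luna feature) is to confirm that every row really has exactly three tab-delimited fields:

```shell
# Count tab-delimited fields per row of the sample list;
# flag any row that does not have exactly three
awk -F'\t' 'NF != 3 { bad++ ; print "bad row " NR ": " $0 }
            END     { print NR " rows, " bad+0 " bad" }' s.lst
```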

2) Run the STAGE command using that sample list

luna s.lst 1 5 -t o1 -s STAGE

e.g. here just for the first five people, sending output to a folder "o1"

3) Confirm output:

$ ls o1/

shhs1-200001 shhs1-200002 shhs1-200003 shhs1-200004 shhs1-200005

$ head o1/shhs1-200001/STAGE-E.txt


shhs1-200001 1 22.00.00 0 wake 1

shhs1-200001 2 22.00.30 0.5 wake 1

shhs1-200001 3 22.01.00 1 wake 1


4) To run for all people, remove the "1 5" from the command line. To dump to a database instead of text files, use "-o", etc., as described in the Luna docs. The output files are tab-delimited, so you can easily extract the fifth (stage) column and concatenate across samples as desired.
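For instance, one way to pull the stage column across all individuals (a sketch that assumes the tab-delimited STAGE-E.txt layout shown above, with the ID in column 1 and the stage label in column 5):

```shell
# Collect ID + stage label from every per-individual STAGE output
# into a single concatenated file, one row per epoch
for f in o1/*/STAGE-E.txt ; do
  cut -f1,5 "$f"
done > all-stages.txt
```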

kaaremikkelsen +0 points · 9 months ago

thanks a lot :)

shaunpurcell +0 points · 9 months ago

BTW, we plan in the future to distribute annotation data in simpler text-based formats, i.e. for staging, just one row/item per epoch.

In the meantime, if you're only interested in the hypnogram/stage distribution, there's no need to download the EDFs at all. You could use something like the following lazy *nix/Mac command-line hack, which takes advantage of the fact that (i.e. assumes that) intervals in the XML always fall on 30-second epoch boundaries, even though the XML may specify, e.g., a single 90-second block of REM covering 3 consecutive REM epochs.

All on one line:

$ luna --xml /data/nsrr/datasets/shhs/polysomnography/annotations-events-nsrr/shhs1/shhs1-200001-nsrr.xml | grep Stage | awk -F"\t" ' { print $2 ,$4 } ' OFS="\t" | tr -d '(' | tr -d ')' | sed 's/secs//g' | awk -F"\t" ' { n=$1/30 ; for(i=0;i<n;i++) print $2 } ' > s.txt
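The same one-liner, split across lines with comments (this assumes luna's --xml listing is tab-delimited, with the duration in field 2 and the stage label in field 4, as in the output shown earlier in the thread):

```shell
luna --xml /data/nsrr/datasets/shhs/polysomnography/annotations-events-nsrr/shhs1/shhs1-200001-nsrr.xml |
  grep Stage |                                # keep only the stage annotations
  awk -F"\t" '{ print $2, $4 }' OFS="\t" |    # duration field + stage label
  tr -d '()' |                                # "(840 secs)" -> "840 secs"
  sed 's/secs//g' |                           # "840 secs"   -> "840 "
  awk -F"\t" '{ n = $1/30                     # number of 30-second epochs
                for (i = 0; i < n; i++) print $2 }' > s.txt
```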

i.e. this generates a simple text file, one row per epoch:

$ head s.txt

Wake|0
Wake|0
Wake|0
Wake|0
Wake|0
Wake|0
Wake|0
Wake|0
Wake|0
Wake|0

To summarize the stages for this study:

$ sort s.txt | uniq -c

102 REM sleep|5

47 Stage 1 sleep|1

457 Stage 2 sleep|2

145 Stage 3 sleep|3

333 Wake|0
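If minutes are more convenient than epoch counts, the counts can be rescaled in the same spirit (a small awk sketch; it assumes 30-second epochs, as above):

```shell
# Turn per-stage epoch counts into minutes (one 30-s epoch = 0.5 min)
sort s.txt | uniq -c |
  awk '{ n = $1 ; $1 = "" ; printf "%s\t%.1f min\n", substr($0, 2), n * 0.5 }'
```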

Cheers, --S
