feat[jungfrau][correct]: use new correct data source and link to old data source
- Karim Ahmed authored
Doing `create_metadata()` after we've set up the sources, it shouldn't be necessary to pass the source/channel names here, because the file already knows about them.
Aha!
Can you also remind me why we prefer creating metadata at the beginning?
For efficiency... we assume that GPFS will likely serve the first few bytes quicker than the rest when opening a file.
changed this line in version 11 of the diff
Thank you both!
And that's not entirely an assumption: train IDs used to be spread throughout the file, and I think we saw a substantial improvement in the speed of reading them when we put them all together at the beginning.
I think GPFS works with chunks of large files. So ideally all the metadata we need is in one chunk at the beginning - but if it's one chunk at the beginning and one at the end, that's probably not too bad.
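To make the chunk argument concrete, here is a toy sketch in plain Python, with no HDF5 or GPFS involved. It only counts how many fixed-size chunks a reader would have to touch for three layouts of the same small index data. The 4 MiB chunk size, the 1 GiB file size and the 8 KiB pieces are made-up numbers for illustration, not measured GPFS settings, and `chunks_touched` is a hypothetical helper, not part of any library.

```python
def chunks_touched(byte_ranges, chunk_size):
    """Return the set of chunk indices overlapped by (offset, length) ranges."""
    touched = set()
    for offset, length in byte_ranges:
        if length <= 0:
            continue
        first = offset // chunk_size
        last = (offset + length - 1) // chunk_size
        touched.update(range(first, last + 1))
    return touched


# Assumed numbers, for illustration only.
CHUNK = 4 * 1024 * 1024          # hypothetical 4 MiB chunk size
FILE_SIZE = 1024 * 1024 * 1024   # hypothetical 1 GiB file
PIECE = 8192                     # 16 index pieces of 8 KiB each
N_PIECES = 16

# Layout A: all index pieces packed together at the start of the file.
packed = [(i * PIECE, PIECE) for i in range(N_PIECES)]

# Layout B: the same pieces spread evenly through the file.
scattered = [(i * (FILE_SIZE // N_PIECES), PIECE) for i in range(N_PIECES)]

# Layout C: half of the index at the start, half at the very end.
half = (N_PIECES // 2) * PIECE
split = [(0, half), (FILE_SIZE - half, half)]

n_packed = len(chunks_touched(packed, CHUNK))
n_scattered = len(chunks_touched(scattered, CHUNK))
n_split = len(chunks_touched(split, CHUNK))

print("packed at start:", n_packed)
print("scattered:", n_scattered)
print("start + end:", n_split)
```

With these assumed sizes, the packed layout fits in a single chunk, the start-plus-end layout needs two, and the scattered layout needs one chunk per piece, which matches the intuition above.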
I suspect that HDF can jump directly to wherever `METADATA` is located and only attempt reading those bits? Or does it need to traverse some tree structure potentially scattered throughout the file?
There is a tree, but as `METADATA` is in the root group and that's quite small, it's hopefully just one or two layers of indirection to follow.