Non-invasive neural recordings precisely time-locked to naturalistic language production are rare, yet they are crucial for training and evaluating brain-to-text decoders. This dataset provides multi-hour MEG and EEG recordings sampled at 1 kHz, collected while participants memorized and typed short Spanish sentences, along with keystroke-, word-, and sentence-level timing and behavioral logs ready for alignment in decoding pipelines.
What Sets It Apart
- Simultaneous high-density MEG (306 channels, Elekta Neuromag) and 64-channel EEG (BrainVision), both sampled at 1 kHz, enabling comparisons between modalities for decoding tasks.
- Task design: read → wait (1.5 s fixation) → type from memory on a custom non-ferromagnetic QWERTY keyboard, producing tightly time-locked motor and language signals without on-screen feedback.
- Data and formats: raw MEG (.fif) and EEG (BrainVision .eeg/.vhdr/.vmrk), MATLAB behavioral logs (.mat), and a standardized event dataframe compatible with the Brain2Qwerty/neuralset tooling. Total size is about 262 GB (≈21.5 h MEG, ≈17.7 h EEG).
- Reproducibility & privacy: only de-identified recordings are included (structural MRI, head videos, and eye-tracking data are excluded), released under CC BY-NC 4.0 for non-commercial research use.
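As a rough illustration of working with the two raw formats listed above, the sketch below picks an MNE-Python reader from the file extension. MNE-Python is an assumption here (the release itself points to the Brain2Qwerty/neuralset tooling), and the helper names `pick_reader` and `load_raw` are hypothetical; `mne.io.read_raw_fif` and `mne.io.read_raw_brainvision` are existing MNE functions.

```python
from pathlib import Path

# Suffix -> name of the MNE-Python reader. The mapping follows the formats
# listed above; any file names passed in are up to the caller.
READERS = {
    ".fif": "read_raw_fif",           # MEG (Elekta Neuromag)
    ".vhdr": "read_raw_brainvision",  # EEG (BrainVision header file)
}


def pick_reader(path):
    """Return the name of the mne.io reader suited to a recording file."""
    suffix = Path(path).suffix.lower()
    if suffix not in READERS:
        raise ValueError(f"no reader registered for {suffix!r}")
    return READERS[suffix]


def load_raw(path, preload=False):
    """Open a recording without loading samples; requires MNE-Python."""
    import mne  # imported lazily so pick_reader works without MNE installed

    reader = getattr(mne.io, pick_reader(path))
    return reader(path, preload=preload)
```

Keeping `preload=False` defers reading the samples into memory, which may help given the size of the release. For BrainVision data, the `.vhdr` header is the file to pass; the `.eeg` and `.vmrk` files are located through it.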
Who It's For and Trade-offs
Great fit if you are a computational neuroscientist, BCI researcher, or ML practitioner who builds or benchmarks non-invasive brain-to-text decoders and needs time-resolved MEG/EEG aligned to keystrokes and words. The dataset is structured to work with the existing Brain2Qwerty preprocessing and event-building code, which reduces integration overhead.
Look elsewhere if you need a large EEG-only cohort with clinical populations, an unrestricted commercial license, or a smaller dataset that can be processed on a laptop. This release is large (≈262 GB) and specific to its modalities and hardware (Elekta Neuromag, BrainVision), and it requires EEG/MEG preprocessing expertise and sufficient compute for model training.
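To give a sense of the preprocessing involved, here is a minimal sketch of cutting fixed windows around keystroke onsets at the 1 kHz sampling rate. It is not the Brain2Qwerty pipeline: the function name `epoch_around_events`, the ±0.5 s window, and the assumption that onsets arrive as seconds from the start of the recording are all illustrative choices.

```python
import numpy as np

SFREQ = 1000.0  # Hz, the sampling rate stated for both modalities


def epoch_around_events(data, onsets_s, tmin=-0.5, tmax=0.5, sfreq=SFREQ):
    """Cut fixed-length windows around event onsets.

    data: array of shape (n_channels, n_samples)
    onsets_s: event times in seconds from the start of the recording
    Returns (epochs, kept), where epochs has shape
    (n_kept, n_channels, n_window) and kept is a boolean mask over
    onsets_s marking the events whose full window fits in the data.
    """
    data = np.asarray(data)
    # Convert onset times to the nearest sample index.
    onsets = np.rint(np.asarray(onsets_s, dtype=float) * sfreq).astype(int)
    start_off = int(round(tmin * sfreq))
    stop_off = int(round(tmax * sfreq))
    starts = onsets + start_off
    stops = onsets + stop_off
    # Drop events whose window would run past either end of the recording.
    kept = (starts >= 0) & (stops <= data.shape[1])
    n_window = stop_off - start_off
    epochs = np.empty((int(kept.sum()), data.shape[0], n_window), dtype=data.dtype)
    for i, (a, b) in enumerate(zip(starts[kept], stops[kept])):
        epochs[i] = data[:, a:b]
    return epochs, kept
```

The returned mask makes it possible to drop the matching rows from the event table, so that labels (for example, the typed character) stay aligned with the retained epochs.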
