# Notes on 1st project

The given data at [iui-datalrelease1-sose2021-readonly/\*](/opt/iui-datarelease1-sose2021) represents sensor data from a pen.

| Data   | MinVal | MaxVal | Description                                     |
| ------ |:------:|:------:| ----------------------------------------------- |
| Millis |   -    |   -    | Timestamp from tablet (Unix time)               |
| Acc1 X |   -    | 32768  | Front/Tip accelerometer (Direction: Left/Right) |
| Acc1 Y |   -    | 32768  | Front/Tip accelerometer (Direction: Up/Down)    |
| Acc1 Z |   -    | 32768  | Front/Tip accelerometer (Direction: Back/Front) |
| Acc2 X |   -    |  8192  | Back accelerometer (Direction: Left/Right)      |
| Acc2 Y |   -    |  8192  | Back accelerometer (Direction: Up/Down)         |
| Acc2 Z |   -    |  8192  | Back accelerometer (Direction: Back/Front)      |
| Gyro X |   -    | 32768  | Gyroscope sensor                                |
| Gyro Y |   -    | 32768  | Gyroscope sensor                                |
| Gyro Z |   -    | 32768  | Gyroscope sensor                                |
| Mag X  |   -    |  8192  | Magnetometer                                    |
| Mag Y  |   -    |  8192  | Magnetometer                                    |
| Mag Z  |   -    |  8192  | Magnetometer                                    |
| Force  |   -    |  4096  | Force applied                                   |
| Time   |   -    |   -    | Time from start of "recording"                  |

There were 100 participants. The folder structure is as follows:\
`/opt/iui-datarelease1-sose2021/{P}/{N}{A}.csv`

| Variable | Description                           |
| -------- | ------------------------------------- |
| P        | The ID of the participant             |
| N        | The N-th letter the participant wrote |
| A        | The letter that was written           |

Each participant's folder contains a `calibration.txt`, which holds the calibration data of the pen for that participant.

Sensor data was recorded at 100 Hz (100 recordings/s => 1 recording every 10 ms).

## Preprocessing

### General

Since the channels have different scales (e.g. Acc1: [-32768; 32768] and Acc2: [-8192; 8192]), the raw values cannot be compared directly. Each channel has to be weighted according to its importance, not according to its numeric range.
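One possible way to put the channels on a common scale is to divide each value by the MaxVal listed in the table above, which maps every channel to roughly [-1; 1]. The sketch below is only an illustration: the helper `scale_row` and the `FULL_SCALE` dictionary are hypothetical names, and it assumes that the CSV column names match the "Data" column of the table, which has not been verified against the files.

```python
# Full-scale value per channel, taken from the MaxVal column of the table.
# Assumption: CSV headers are spelled exactly like the table's "Data" column.
FULL_SCALE = {}
for sensor, full_scale in (("Acc1", 32768), ("Acc2", 8192),
                           ("Gyro", 32768), ("Mag", 8192)):
    for axis in "XYZ":
        FULL_SCALE[f"{sensor} {axis}"] = full_scale
FULL_SCALE["Force"] = 4096


def scale_row(row):
    """Return a copy of `row` with each known sensor channel divided by
    its full-scale value. Other columns (Millis, Time) are left untouched."""
    scaled = dict(row)
    for channel, full_scale in FULL_SCALE.items():
        if channel in scaled:
            scaled[channel] = float(scaled[channel]) / full_scale
    return scaled


if __name__ == "__main__":
    # Made-up sample values, only to show the effect of the scaling.
    sample = {"Millis": 5, "Acc1 X": 16384, "Acc2 X": -4096, "Force": 1024}
    print(scale_row(sample))
```

After this step the channels are comparable in magnitude, so any importance weighting can be applied as an explicit factor per channel.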
### Millis

- Could be used for identifying each data entry -> needs to be normalized to the first entry of the data set to see the complete timeline of the data

### Acc1

todo

### Acc2

todo

### Gyro

todo

### Mag

todo

### Force

- Sometimes sensor data was recorded even when there is no action -> we need to determine the area of interest
    - maybe a sliding window, where the window average has to reach a certain threshold
    - general threshold approach (filter out data below the threshold)
    - more ideas welcome
- Data could be normalized by each user's relative strength or data entry

### Time

- Time is negative for some data, need to find out why

## Model selection

todo