Notes on 1st project
The given data at iui-datalrelease1-sose2021-readonly/* represents sensor data from a pen.
Data | MinVal | MaxVal | Description |
---|---|---|---|
Millis | - | - | Timestamp from tablet (Unix time) |
Acc1 X | - | 32768 | Front/Tip accelerometer (Direction: Left/Right) |
Acc1 Y | - | 32768 | Front/Tip accelerometer (Direction: Up/Down) |
Acc1 Z | - | 32768 | Front/Tip accelerometer (Direction: Back/Front) |
Acc2 X | - | 8192 | Back accelerometer (Direction: Left/Right) |
Acc2 Y | - | 8192 | Back accelerometer (Direction: Up/Down) |
Acc2 Z | - | 8192 | Back accelerometer (Direction: Back/Front) |
Gyro X | - | 32768 | Gyroscope sensor |
Gyro Y | - | 32768 | Gyroscope sensor |
Gyro Z | - | 32768 | Gyroscope sensor |
Mag X | - | 8192 | Magnetometer |
Mag Y | - | 8192 | Magnetometer |
Mag Z | - | 8192 | Magnetometer |
Force | - | 4096 | Force applied |
Time | - | - | Time from start of "recording" |
There were 100 participants.
The folder-structure is as follows:
/opt/iui-datarelease1-sose2021/{P}/{N}{A}.csv
Variable | Description |
---|---|
P | The ID of the participant |
N | The N-th letter the participant wrote |
A | The letter that was written |
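The path layout above can be split back into its three variables. The following is a minimal sketch: it assumes that N is written as one or more digits and A as a single character, which the notes do not state explicitly, and `parse_recording_path` is a hypothetical helper name.

```python
import re
from pathlib import PurePosixPath

# Assumption: N is one or more digits, A is exactly one character.
_FILE_RE = re.compile(r"^(?P<n>\d+)(?P<letter>.)\.csv$")


def parse_recording_path(path):
    """Split '.../{P}/{N}{A}.csv' into (participant, n, letter)."""
    p = PurePosixPath(path)
    match = _FILE_RE.match(p.name)
    if match is None:
        # e.g. calibration.txt or any file that does not follow the scheme
        raise ValueError(f"unexpected file name: {p.name!r}")
    # The participant ID is the name of the folder that holds the file.
    return p.parent.name, int(match.group("n")), match.group("letter")
```

Files that do not match the scheme, such as a participant's calibration.txt, raise a `ValueError` so they can be skipped explicitly.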
Each participant's folder contains a calibration.txt, which contains the calibration data of the pen for that participant.
Sensor data was recorded at 100 Hz (100 recordings/s => 1 recording every 10 ms).
Preprocessing
General
Since the channels have different scales (e.g. Acc1: [-32768; 32768] and Acc2: [-8192; 8192]), they have to be rescaled or weighted according to their importance, so that channels with a larger range do not dominate.
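One possible first step is to bring every channel to a common range by dividing by its full-scale value. This is only a sketch: the full-scale values are taken from the MaxVal column of the table above, while `scale_channel` and the optional `weight` parameter are hypothetical and not a decided part of the preprocessing.

```python
# Full-scale values per sensor, copied from the MaxVal column of the data table.
FULL_SCALE = {
    "Acc1": 32768,
    "Acc2": 8192,
    "Gyro": 32768,
    "Mag": 8192,
    "Force": 4096,
}


def scale_channel(values, sensor, weight=1.0):
    """Divide raw readings by the sensor's full-scale value.

    `weight` is an assumed knob for valuing a sensor more or less
    than the others; 1.0 treats all sensors equally.
    """
    full_scale = FULL_SCALE[sensor]
    return [weight * v / full_scale for v in values]
```

With `weight=1.0`, a full-scale Acc1 reading and a full-scale Acc2 reading both map to 1.0, so any difference in importance has to be introduced deliberately.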
Millis
- Could be used to identify each data entry -> needs to be normalized to the first entry of the data set to see the complete timeline of the data
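Normalizing to the first entry could look like the sketch below. It assumes the Millis column has already been read into a list of integers; `normalize_millis` is a hypothetical helper name.

```python
def normalize_millis(millis):
    """Shift timestamps so that the first entry of the recording is 0."""
    if not millis:
        return []
    start = millis[0]
    return [m - start for m in millis]
```

At the stated 100 Hz the normalized values would be expected to advance in steps of roughly 10 ms, which also gives a quick check for dropped samples.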
Acc1
todo
Acc2
todo
Gyro
todo
Mag
todo
Force
- Sometimes sensor data was recorded even when there is no action -> we need to determine the area of interest
  - maybe a sliding window, where the window average has to reach a certain threshold
  - general threshold approach (filter out data below the threshold)
  - more ideas welcome
- Data could be normalized by each user's relative strength or by data entry
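The sliding-window idea for the Force channel can be sketched as follows. The window size and threshold are placeholder values that would have to be tuned on the real recordings, and `active_region` is a hypothetical helper name.

```python
def active_region(force, window=5, threshold=100.0):
    """Return (start, end) of the area of interest, end exclusive.

    The span runs from the first to the last window whose average
    force is at least `threshold`. Returns None if no window qualifies.
    """
    if window <= 0:
        raise ValueError("window must be positive")
    n = len(force)
    if n < window:
        return None

    window_sum = sum(force[:window])
    first = last = None
    for start in range(n - window + 1):
        if start > 0:
            # Slide by one sample: add the new value, drop the oldest one.
            window_sum += force[start + window - 1] - force[start - 1]
        if window_sum / window >= threshold:
            if first is None:
                first = start
            last = start + window
    if first is None:
        return None
    return first, last
```

Because the result spans from the first to the last active window, short pauses in between (for example when the pen is lifted within a letter) stay inside the area of interest. The returned indices can then be used to slice all other channels of the same recording.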
Time
- Time is negative for some data, gotta find out why
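A small diagnostic may help narrow down where the negative values occur. This sketch assumes the Time column has been read into a list of numbers; `negative_time_rows` is a hypothetical helper name.

```python
def negative_time_rows(times):
    """Return the row indices whose Time value is below zero."""
    return [i for i, t in enumerate(times) if t < 0]
```

Running this per file would show whether the negative values sit only at the start of a recording or are spread throughout it.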
Model selection
todo