Assignment #3
Does anyone recall why the lab file is coded with ** in the LabTest column to indicate a new patient? It seems like this information can be completely discarded...
Additionally, can we throw away rows such as row 286 (33,"","")? This also seems to be a human error. I'm mainly asking to confirm that these are indeed human data entry errors and not some feature of the dataset I do not understand.
Thanks!
I think it can be discarded. That's what I'm doing anyway. What approach are you taking to create the extra columns? I'm struggling to get the cumulative hour tally mentioned in class to work.
I don't think that you actually need to add any columns. One could just change the 0d coding into a 0, for all intents and purposes.
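For what it's worth, here is a rough sketch of the cleanup discussed so far, written in Python just for illustration. Only the LabTest column name comes from the file as described in this thread. The "id" and "Value" column names, the sample rows, and the idea that the 0d code sits in a value column are all my assumptions, so adjust them to the real layout.

```python
import csv
import io

# Hypothetical sample data. Only "LabTest" is a real column name from the thread;
# "id" and "Value" are made-up names and every row below is invented.
SAMPLE = '''id,LabTest,Value
33,**,
33,INR,2.1
33,"",""
34,**,
34,INR,0d
'''

def clean_rows(text):
    """Drop '**' marker rows and blank-LabTest rows, and recode '0d' as '0'."""
    cleaned = []
    for row in csv.DictReader(io.StringIO(text)):
        lab = row["LabTest"].strip()
        # Skip the new-patient marker rows and rows like 33,"",""
        if lab == "**" or lab == "":
            continue
        # Assumption: the 0d code lives in the value column
        if row["Value"].strip() == "0d":
            row["Value"] = "0"
        cleaned.append(row)
    return cleaned

rows = clean_rows(SAMPLE)
for row in rows:
    print(row["id"], row["LabTest"], row["Value"])
```

It's worth counting how many rows get dropped before trusting this on the full file, in case the blanks turn out to mean something.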
I think the easiest way to do the tally is to convert the datetime into POSIX format and then prototype the cumulative function on a subset of the data (for one id). Basically, you just want to create a vector of first differences of the datetime, multiply these by the dosage, and then accumulate the products. Let me know if that makes sense.
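Here is a minimal sketch of that idea for a single id, again in Python for illustration (in R the analogous pieces would be as.POSIXct, diff, and cumsum). The timestamps and dosages are invented, and I'm assuming each dosage applies over the interval that starts at its own timestamp. If the assignment defines it the other way around, shift the dosage vector by one.

```python
from datetime import datetime
from itertools import accumulate

# Hypothetical records for one patient id: (timestamp, dosage). All values invented.
records = [
    ("2010-01-01 08:00:00", 2.0),
    ("2010-01-01 12:00:00", 0.0),
    ("2010-01-01 18:00:00", 1.5),
    ("2010-01-02 00:00:00", 1.5),
]

# Parse the timestamp strings into datetime objects
times = [datetime.strptime(t, "%Y-%m-%d %H:%M:%S") for t, _ in records]
doses = [d for _, d in records]

# First differences of the timestamps, in hours (one fewer element than times)
hours = [(b - a).total_seconds() / 3600 for a, b in zip(times, times[1:])]

# Assumption: the dosage at the start of each interval applies over that interval
dose_hours = [h * d for h, d in zip(hours, doses)]

# Running total of dose-hours
cumulative = list(accumulate(dose_hours))

print(hours)
print(cumulative)
```

Once this looks right for one id, the same steps can be applied per id across the whole file.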